In this blog post, we will cover how we perform stacking using Carbon Black Response and how we can use this methodology to find anomalies in your environment. In reality, an awesome threat hunter would like to have the following data at their disposal:
For this blog post, we will focus on Real Time (RT) process executions within Carbon Black Response. The concept of stacking is simple: we start by collecting data of the same type and choose specific fields on which we want to perform frequency analysis. Basically, we’re cherry-picking specific processes we know attackers will use and abuse, and storing the results in a pivot table to view the data in various ways not possible through the average interface. We can save the query results for each process to an Excel file or a database, depending on your preference. For this post, we use .csv as the default file extension with a pipe delimiter.
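To make the frequency-analysis idea concrete, here is a minimal Python sketch of stacking on a single field. The `stack` function, the field names and the sample records are all hypothetical and only stand in for real query results:

```python
from collections import Counter

def stack(records, field):
    """Count how often each value of `field` occurs, rarest first."""
    counts = Counter(rec.get(field, "") for rec in records)
    # Sort ascending by count so the outliers surface at the top of the stack.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))

# Hypothetical records; real ones would come from a Carbon Black Response query.
sample = [
    {"hostname": "ws-01", "cmdline": "powershell.exe -File backup.ps1"},
    {"hostname": "ws-02", "cmdline": "powershell.exe -File backup.ps1"},
    {"hostname": "ws-03", "cmdline": "powershell.exe -nop -w hidden -enc AAAA"},
]

for value, count in stack(sample, "cmdline"):
    print(count, value)
```

The rare values at the top of the list are usually the ones worth a closer look, which is the same view the pivot table gives you later in this post.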
For real time process executions, let’s roll with command line arguments to begin. We then pick a window of time we want to stack and then specify our Carbon Black Response query. In this example, I’ll use PowerShell, cscript, wscript and a few other queries to get us going. Let’s first inspect the config.json file:
To begin using this script, you must first add your Carbon Black Response URL and API token (found under your user profile). Next, set the year, month, and day you wish to begin stacking. Just like the CBR: Intel Tester script, this tool will run a daily query starting with the specified date until it reaches the current date. The results will be saved to an output file named after each query you specify under the queries object. In this example, you will get six output files, each containing the results of its query with search results dating back to 2018/9/1, assuming your query syntax is correct and you have data dating back to this date. After your config.json is configured, you can run the script. Your standard output will show which query is currently running and which day it is searching, as outlined below.
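The day-by-day walk described above can be sketched roughly as follows. This is not the script's actual code; `daily_windows` is a hypothetical helper, and the end date is fixed here only to keep the example repeatable (in practice it would be `date.today()`):

```python
from datetime import date, timedelta

def daily_windows(start, end):
    """Yield (day, next_day) pairs from `start` up to and including `end`."""
    day = start
    while day <= end:
        yield day, day + timedelta(days=1)
        day += timedelta(days=1)

# Each window would bound one daily query against Carbon Black Response.
for day, next_day in daily_windows(date(2018, 9, 1), date(2018, 9, 3)):
    print(f"searching {day.isoformat()} up to {next_day.isoformat()}")
```

Querying one day at a time keeps each result set smaller than a single query over the whole range would be.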
Once the query completes for your given date range, the worker will move on to the next item in the queue until the queue is empty.
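For readers curious how a worker and queue arrangement like this can look, below is a small sketch using Python's standard `queue` and `threading` modules. The names `run_queries` and `run_one` are hypothetical, and the real script may organise its workers differently:

```python
import queue
import threading

def run_queries(queries, run_one, workers=2):
    """Run every (name, query) pair through `run_one` using a small worker pool."""
    pending = queue.Queue()
    for name, query in queries.items():
        pending.put((name, query))

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                name, query = pending.get_nowait()
            except queue.Empty:
                return  # queue is drained, so this worker exits
            outcome = run_one(name, query)
            with lock:
                results[name] = outcome

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results

# Stand-in for the real search: it only reports the query length.
demo = run_queries({"powershell": "process_name:powershell.exe"},
                   lambda name, query: len(query))
print(sorted(demo.items()))
```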
Some queries are very intense and take a lot of time to search. You should test each query with the intel tester before stacking them to ensure you’re not paging down 500,000 records (not that you can’t, it’s just not ideal). You may be able to tune this script to remove noise from commonly used applications or scripts within your environment to help yield better results for your stack.
After each query completes, you will have an output file yielding the results. Let’s take a look at our PowerShell query in Microsoft Excel below:
Since this data is delimited by a pipe, we need to split it into its proper columns. We can do this in Excel by selecting the first column in the spreadsheet, then clicking the Data > Text to Columns button in the toolbar. We then select the Delimited option, click Next, and enter | (pipe) under the Other delimiter option. Once the delimiter is set, click Finish. Your output file should look like the following image:
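If you would rather skip Excel, the same split can be done with Python's built-in `csv` module. This is a sketch only: the column names and the sample row are made up, and how quotes inside command lines are handled depends on how the output file was written:

```python
import csv
import io

def read_stack_file(handle):
    """Parse pipe-delimited rows into dictionaries keyed by the header row."""
    return list(csv.DictReader(handle, delimiter="|"))

# Hypothetical sample standing in for one of the script's output files.
sample = io.StringIO(
    "hostname|username|cmdline\n"
    "ws-01|CORP\\alice|powershell.exe -File backup.ps1\n"
)

rows = read_stack_file(sample)
print(rows[0]["hostname"], rows[0]["cmdline"])
```

For a file on disk, pass `open(path, newline="", encoding="utf-8")` instead of the in-memory sample.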
Now that the data is formatted properly, let’s get into smashing the stack. We start by creating a pivot table with all the columns. You do this by selecting all the data, selecting Insert > PivotTable, and clicking OK to accept the default data range. Once your pivot table is created, you will see a menu on the right-hand side called PivotTable Fields. We will select cmdline from these fields and drag/drop this field into both the Rows and Values panes. Your output should look like the following:
We can now use the power of frequency analysis to identify anomalies in the PowerShell stack, as we are showing the frequency of occurrence for the cmdline field. Since the PowerShell items I collected are all evil, this isn’t the best example. Let’s take a look at the example query jp_cert_spread_of_infection.
We can see after formatting and sorting inside the pivot table, we have some interesting things inside the stack. Granted, my example dataset isn’t very large, but you can quickly see a few malicious items such as:
reg add "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Run" /v UpdateSvc /t REG_SZ /d "C:\TMP\p.exe -s \\ 'net user' > C:\TMP\o2.txt" /f
REG ADD HKCU\Environment /f /v UserInitMprLogonScript /t REG_MULTI_SZ /d "C:\TMP\mim.exe sekurlsa::LogonPasswords > C:\TMP\o.txt"
As you use this script for larger production environments, you will be able to learn about your environment. Over time, you should be able to understand what’s normal: who runs specific scripts/applications, what times the applications are usually run, from what path, on what system, with what arguments, etc. Let’s tweak the data into two levels, by username and hostname:
With the additional fields added, we can see what command line arguments are run by both user and hostname. We can also see the evil reg add commands were run on the hostname jack-pc with the local account of Jack-PC\Jack. I’m only showing a fraction of how stacking the real time data in Carbon Black Response can be used for proactive threat hunting and learning/baselining your environment. Other stacking ideas may include:
- Stacking by parent name and process name to identify the most common and uncommon parents/children for processes
- Stacking by process path to identify unusual execution locations of known utilities or system binaries
- Stacking server groups and reviewing processes run in the last 30 days by username, path and command line arguments
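Ideas like these all boil down to stacking on a combination of fields rather than a single one. Here is a minimal sketch, assuming each record is a dictionary; the `stack_by` helper, the column names `parent_name` and `process_name`, and the sample records are hypothetical:

```python
from collections import Counter

def stack_by(records, *fields):
    """Count each distinct combination of `fields`, rarest combination first."""
    counts = Counter(tuple(rec.get(f, "") for f in fields) for rec in records)
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))

# Hypothetical parent/child pairs for illustration only.
records = [
    {"parent_name": "explorer.exe", "process_name": "powershell.exe"},
    {"parent_name": "explorer.exe", "process_name": "powershell.exe"},
    {"parent_name": "winword.exe", "process_name": "powershell.exe"},
]

for combo, count in stack_by(records, "parent_name", "process_name"):
    print(count, " -> ".join(combo))
```

Swapping in other field names (for example username and hostname) gives the same two-level view as the pivot table.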
Enter sub stacking
If you thought stacking was cool, you’re going to enjoy sub stacking even more. While the concept of sub stacking is the same as above, we’re going to dig deeper into the process metadata to stack on the process attributes, not just the process summary information as we did above. The six attributes (at least what I call them) of a process in Carbon Black are:
- RegMods
- FileMods
- NetConns
- ModLoads
- CrossProcs
- ChildProcs
You can read up on the various attributes at the following link:
By default, the stacking script only queries the summary API api/v1/process. This API only returns the process summary data; it doesn’t include each process’s metadata/details unless you actually go to the process link itself using the id and segment_id returned in the summary results. Querying each process’s details will give you additional data such as file modifications, network connections, module loads, registry modifications, etc.
When performing sub stacking, it’s important to remember that for each process returned from the summary API, the script will open that process’s details and extract the attribute you selected. For example, if you ask for all netconns for mstsc (process_name:mstsc.exe AND netconn_count:[1 TO *]) and the summary API returns 50 matching results, the script will inspect each process (ALL 50) and extract ALL the netconns for EACH process. Some processes (chrome, firefox, internet explorer, etc.) generate a lot of network connections, file mods, reg mods and other attribute events. Don’t be surprised if the output file is huge or the script takes longer than usual to complete. Ideally, any query returning more than 5k-10k results on the summary view should be tuned based on the attribute you’re filtering on.
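The fan-out described above can be sketched as follows. To keep it runnable without a server, the detail lookup is passed in as a function. `sub_stack`, `fake_fetch` and the event field names are hypothetical, and the real script's response handling will differ:

```python
def sub_stack(summaries, fetch_details, attribute):
    """Yield one merged record per attribute event of every summarised process."""
    for proc in summaries:
        # One extra lookup per process, keyed by the id and segment_id
        # that the summary results returned.
        details = fetch_details(proc["id"], proc["segment_id"])
        for event in details.get(attribute, []):
            yield {**proc, **event}

def fake_fetch(process_id, segment_id):
    # Stand-in for the per-process details call; returns two made-up netconns.
    return {"netconn": [
        {"remote_ip": "203.0.113.10", "remote_port": 3389},
        {"remote_ip": "203.0.113.11", "remote_port": 3389},
    ]}

summaries = [
    {"id": "p1", "segment_id": 1, "process_name": "mstsc.exe"},
    {"id": "p2", "segment_id": 1, "process_name": "mstsc.exe"},
]

rows = list(sub_stack(summaries, fake_fetch, "netconn"))
print(len(rows), "rows from", len(summaries), "processes")
```

The row count is the sum of events across all processes, which is why a broad summary query can turn into a very large output file.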
In order to invoke sub stacking, you need to add the attribute property to your config.json for each query. Currently, the supported attribute values are:
- netconn
- modload
- regmod
- filemod
- crossproc
- childproc
If you do not specify an attribute property for a query, the script will perform normal stacking on the query and will not query the details of the process itself. An example of sub stacking for network connections made by powershell.exe is as follows:
Notice we added “AND netconn_count:[1 TO *]” to the query. This extra term filters the search down to only PowerShell processes that have network connections, rather than returning all PowerShell processes, including those without network connections. Setting the attribute property to netconn in this query block tells the script you want to invoke sub stacking on the netconn attribute. Each record will contain the following base fields:
- Hostname
- ProcessStart
- ProcessName
- ProcessPath
- Cmdline
- ProcessMD5
- Username
- ParentName
- ParentMD5
- Id
- SegmentId
- CBR_Link
- QueryName
- QueryTimestamp
In addition, each netconn would have the following fields appended to the base fields:
- Timestamp
- LocalIP
- RemoteIP
- LocalPort
- RemotePort
- Protocol
- Direction
- Domain
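To illustrate the resulting layout, here is a sketch that writes pipe-delimited rows with the base fields followed by the netconn fields. The column names come from the lists above, but `write_netconn_rows` and the sample values are hypothetical, and the real script may write its files differently:

```python
import csv
import io

BASE_FIELDS = [
    "Hostname", "ProcessStart", "ProcessName", "ProcessPath", "Cmdline",
    "ProcessMD5", "Username", "ParentName", "ParentMD5", "Id", "SegmentId",
    "CBR_Link", "QueryName", "QueryTimestamp",
]
NETCONN_FIELDS = [
    "Timestamp", "LocalIP", "RemoteIP", "LocalPort", "RemotePort",
    "Protocol", "Direction", "Domain",
]

def write_netconn_rows(handle, rows):
    """Write a header plus one pipe-delimited line per netconn record."""
    writer = csv.DictWriter(
        handle,
        fieldnames=BASE_FIELDS + NETCONN_FIELDS,
        delimiter="|",
        restval="",             # leave missing fields empty
        extrasaction="ignore",  # drop keys that are not known columns
    )
    writer.writeheader()
    writer.writerows(rows)

# Made-up record; unspecified columns are written as empty strings.
buffer = io.StringIO()
write_netconn_rows(buffer, [
    {"Hostname": "ws-01", "ProcessName": "powershell.exe",
     "RemoteIP": "203.0.113.10", "RemotePort": 443, "Protocol": "TCP"},
])
print(buffer.getvalue())
```

With this layout you can stack on fields such as RemoteIP or Domain the same way cmdline was stacked earlier.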
If you wish to perform this style of threat hunting at a greater scale for all processes, I would encourage you to review the event forwarder guide found here: and use an event processing pipeline to handle the forwarded events.
I hope this script comes in handy for those using Carbon Black Response. Happy Hunting!
Special thanks to Mike Scutt (@OMGAPT), Jeff Chan, Jason Garman and the CB team for all the help.