Sometimes I have to analyze a piece of malware that is really annoying: fake calls, fake APIs and lots of opaque constructs. Following the code in IDA is a pain, so I wanted a quick way to extract some information and make my job easier.
My solution was to do some tracing with DynamoRIO and then import the collected data into IDA. It's kind of overkill to use it just to extract the little information I need, but I also wanted to play with it ;)
I was interested in a few things:
- a list of basic blocks that were executed
- called APIs
- dump of new code
Here is a trace sample:
THREAD: D1C
EntryPoint: 00401100
BB@00401100
AB@7C90E195 NtQuerySystemEnvironmentValueEx(106)!ntdll.dll
BB@00401133
BB@0040113A
BB@00401161
AB@7C90E156 NtQuerySemaphore(103)!ntdll.dll
BB@00401176
...
BB@00401357
EB@00971530 - DUMPED (00971000 > 00AA0000)
EB@00971290
EB@00971552
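If you want to work with a log like this outside of IDA, it can be split into per-thread records with a few regular expressions. The following is only a minimal sketch based on the sample above, not the actual code from this post: the names `parse_trace`, `ThreadTrace` and `Block` are made up for illustration, the real log may contain entries the sample does not show, and the number in parentheses after an API name is kept as-is because its meaning is not stated here.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# One alternative per entry shape seen in the sample trace (an assumption:
# the real log may contain other kinds of entries).
TOKEN_RE = re.compile(
    r"THREAD:\s*(?P<tid>[0-9A-Fa-f]+)"
    r"|EntryPoint:\s*(?P<entry>[0-9A-Fa-f]+)"
    r"|(?P<kind>BB|AB|EB)@(?P<addr>[0-9A-Fa-f]+)"
    r"|(?P<api>\w+)\((?P<num>\d+)\)!(?P<module>\S+)"
    r"|DUMPED\s*\((?P<start>[0-9A-Fa-f]+)\s*>\s*(?P<end>[0-9A-Fa-f]+)\)"
)

@dataclass
class Block:
    kind: str                                 # "BB", "AB" or "EB"
    addr: int
    api: Optional[str] = None                 # set for AB entries
    api_number: Optional[int] = None          # value in parentheses, meaning not stated
    module: Optional[str] = None
    dumped: Optional[Tuple[int, int]] = None  # (start, end) of a dumped region

@dataclass
class ThreadTrace:
    tid: int
    entry: Optional[int] = None
    blocks: List[Block] = field(default_factory=list)

def parse_trace(text: str) -> List[ThreadTrace]:
    """Split a trace log into one ThreadTrace per THREAD header."""
    threads: List[ThreadTrace] = []
    current: Optional[ThreadTrace] = None
    for m in TOKEN_RE.finditer(text):
        if m.group("tid") is not None:
            current = ThreadTrace(tid=int(m.group("tid"), 16))
            threads.append(current)
        elif current is None:
            continue  # ignore anything before the first THREAD header
        elif m.group("entry") is not None:
            current.entry = int(m.group("entry"), 16)
        elif m.group("kind") is not None:
            current.blocks.append(Block(m.group("kind"), int(m.group("addr"), 16)))
        elif m.group("api") is not None and current.blocks:
            # The API name annotates the block logged just before it.
            last = current.blocks[-1]
            last.api = m.group("api")
            last.api_number = int(m.group("num"))
            last.module = m.group("module")
        elif m.group("start") is not None and current.blocks:
            current.blocks[-1].dumped = (int(m.group("start"), 16),
                                         int(m.group("end"), 16))
    return threads

SAMPLE = ("THREAD: D1C EntryPoint: 00401100 BB@00401100 "
          "AB@7C90E195 NtQuerySystemEnvironmentValueEx(106)!ntdll.dll "
          "BB@00401133 EB@00971530 - DUMPED (00971000 > 00AA0000) EB@00971290")

if __name__ == "__main__":
    for thread in parse_trace(SAMPLE):
        print("thread %X: %d blocks" % (thread.tid, len(thread.blocks)))
```

Because the matching is done on tokens rather than on lines, the same function should work whether the log has one entry per line or everything on a single line.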
As you have probably guessed, each thread has its own trace, and each basic block is marked differently depending on where it is located:
- BB (Basic Block): program code
- AB (API Block): DLL code, specifically API code
- EB (External Block): code that is in neither the program image nor the DLLs, usually allocated memory
Also, when an EB is encountered, the corresponding memory region is dumped to disk for later analysis.
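The marking and dumping logic can be pictured roughly like this. It is a simplified model in Python, not the DynamoRIO client itself: the module ranges, the `region_of` lookup and the `dump` callback are hypothetical stand-ins for whatever the tracer really uses. Dumping each region only once is an assumption taken from the sample, where only the first EB of the region carries a DUMPED note.

```python
from typing import Callable, List, Set, Tuple

Range = Tuple[int, int]  # half-open address range: start <= addr < end

def classify(addr: int, image: Range, dlls: List[Range]) -> str:
    """Return the marker for a block starting at addr."""
    if image[0] <= addr < image[1]:
        return "BB"                      # inside the program image
    if any(start <= addr < end for start, end in dlls):
        return "AB"                      # inside a loaded DLL
    return "EB"                          # anywhere else, e.g. allocated memory

class Tracer:
    """Toy model of the logger: one log line per executed block."""

    def __init__(self, image: Range, dlls: List[Range],
                 region_of: Callable[[int], Range],
                 dump: Callable[[int, int], None]):
        self.image = image
        self.dlls = dlls
        self.region_of = region_of       # hypothetical: addr -> enclosing region
        self.dump = dump                 # hypothetical: writes a region to disk
        self.dumped: Set[Range] = set()  # regions already written out

    def on_block(self, addr: int) -> str:
        kind = classify(addr, self.image, self.dlls)
        line = "%s@%08X" % (kind, addr)
        if kind == "EB":
            region = self.region_of(addr)
            if region not in self.dumped:
                # First time code runs from this region: dump it once.
                self.dumped.add(region)
                self.dump(*region)
                line += " - DUMPED (%08X > %08X)" % region
        return line

if __name__ == "__main__":
    # All ranges below are invented for the demo.
    tracer = Tracer(image=(0x00400000, 0x00410000),
                    dlls=[(0x7C900000, 0x7C9B0000)],
                    region_of=lambda addr: (0x00971000, 0x00AA0000),
                    dump=lambda start, end: None)
    for address in (0x00401100, 0x7C90E195, 0x00971530, 0x00971290):
        print(tracer.on_block(address))
```

Note that this sketch marks every DLL block as AB, while the list above says AB is specifically about API code, so the real tool is probably pickier about which DLL blocks it logs.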
All this information is made available in IDA thanks to a little script that colors the first instruction of each executed basic block (note that DynamoRIO basic blocks can differ from IDA's) and loads the dumped memory ranges.
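To give an idea of the coloring step, here is a small Python sketch. It is not the script from this post: the function name `color_executed`, the colors and the file name in the comments are invented, and loading the dumped ranges is left out. The coloring call is passed in as a parameter so the logic can be tried outside IDA. The IDAPython names in the comments are the ones I believe recent IDA 7.x releases use, so treat them as an assumption and check them against your version.

```python
import re

BLOCK_RE = re.compile(r"\b(BB|AB|EB)@([0-9A-Fa-f]+)")

# IDA colors are 0xBBGGRR. AB entries are skipped on purpose in this sketch,
# since API code lives in DLLs that are normally not part of the database.
COLORS = {
    "BB": 0xCCFFCC,  # light green: executed program code
    "EB": 0xCCCCFF,  # light red: executed code from a dumped region
}

def color_executed(trace_text, set_color, is_mapped=lambda ea: True):
    """Color the first instruction of every executed block found in the trace.

    set_color(ea, color) does the actual coloring; is_mapped(ea) tells
    whether the address exists in the database. Returns the colored addresses.
    """
    colored = set()
    for kind, addr in BLOCK_RE.findall(trace_text):
        ea = int(addr, 16)
        if kind not in COLORS or ea in colored or not is_mapped(ea):
            continue
        set_color(ea, COLORS[kind])
        colored.add(ea)
    return colored

# Inside IDA the wiring could look like this (assumed IDAPython 7.x names,
# "trace.log" is a placeholder file name):
#
#   import idc, ida_bytes
#   with open("trace.log") as f:
#       color_executed(f.read(),
#                      lambda ea, c: idc.set_color(ea, idc.CIC_ITEM, c),
#                      ida_bytes.is_mapped)

if __name__ == "__main__":
    demo = "BB@00401100 AB@7C90E195 BB@00401100 EB@00971530"
    color_executed(demo, lambda ea, c: print("color %08X with %06X" % (ea, c)))
```

Only the block's start address is colored, which sidesteps the mismatch between DynamoRIO and IDA basic block boundaries.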
That’s what it looks like once you run the script in IDA:
Here is all the code you need to start playing. [Updated 11-2011 for a small fix in the log output]