Sunday, January 3, 2016

Post #18 (Or... "Ogres Have Layers, Obfuscation Has Layers...")

So, let's talk about obfuscation.

I am by no means an expert on static malware analysis, but I recently came across a sample that employed a method of obfuscation I had not seen before. The sample in question was JS/Kryptik.AYR, and the method of obfuscation was a combination of random variable names, string concatenation, hexadecimal values, and ternary operators.

To start, let's take a block of text out of the malware sample:

Selected text from the original malware sample.

Wow, right? There are a few things going on here. Obviously, the variable names themselves are random to hide their purpose, but beyond that, we see weird operations going on. Those are ternary (conditional) operations, and they work as follows.

First, we set up a condition, such as a comparison between two numbers. Then comes the "?" operator, which begins the ternary expression. The two values that follow, separated by ":", are the possible values for the variable we declared. If the condition evaluates to true, the first value is assigned, and if it evaluates to false, the second value is assigned. So, to use an example in JavaScript: var comparison=(1!=2?"true":"false"); will result in the variable "comparison" being set to the string "true", but if we change it to var comparison=(1==2?"true":"false");, it will instead be set to "false". This, in essence, is how the ternary operator works.
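To make that concrete, here is a minimal sketch of the same two comparisons, plus the if/else they are shorthand for. The names comparison2 and expanded are mine, added only for illustration.

```javascript
// Conditional (ternary) operator: condition ? valueIfTrue : valueIfFalse
var comparison = (1 != 2 ? "true" : "false");   // 1 != 2 is true, so the first value is chosen
var comparison2 = (1 == 2 ? "true" : "false");  // 1 == 2 is false, so the second value is chosen

// The first line written out as a plain if/else (illustrative name: expanded)
var expanded;
if (1 != 2) {
  expanded = "true";
} else {
  expanded = "false";
}

console.log(comparison, comparison2, expanded);
```

Note that the values here are strings, not booleans; the operator simply picks one of the two operands.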

Bearing that in mind, the conclusion follows: since each condition always evaluates the same way, only one branch of each ternary is ever used, so roughly half of the strings in the code are garbage, throw-away values that are only there to serve as clutter. Add to this the fact that a large portion of the string values are written in hexadecimal, and you have quite an effective form of obfuscation.
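Here is a rough sketch of how those pieces fit together. To be clear, the variable names and strings below are ones I made up for illustration; they are not taken from the actual sample. Each line hides a fragment behind a constant comparison, with a junk string in the branch that never runs, and everything is written as hexadecimal escapes.

```javascript
// Hypothetical names and values, invented for illustration (not from the real sample).
// Each ternary has a constant condition, so one branch is dead clutter.
var qzv = (7 > 3 ? "\x64\x6f\x63" : "\x78\x71\x7a");   // live: "doc", junk: "xqz"
var pwk = (2 == 5 ? "\x6a\x6b" : "\x75\x6d");           // junk: "jk",  live: "um"
var rtb = (4 != 4 ? "\x7a\x7a\x7a" : "\x65\x6e\x74");   // junk: "zzz", live: "ent"

// String concatenation stitches the live fragments back together.
var decoded = qzv + pwk + rtb;

console.log(decoded);
```

Reading this by eye means evaluating every comparison and decoding every escape, which is exactly the tedium the obfuscation is counting on.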

Again, I am not an expert, but if you want to see what this code is doing, you have a couple of options. You can decode it manually, which would be massively time-consuming, or you can perform dynamic analysis, which will likely give you vague results without showing you the mechanics of the malware itself. Myself, I took a middle road: I debugged the sample with Firebug as it ran in the browser, cleaned up the code, and inserted lots of document.write(variable_name) calls. That looked a little like this:
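If you would rather not write into the page, the same idea can be sketched with a small logging helper. This is not the code I used; dump, trace, and the strings below are hypothetical, and console.log stands in for document.write.

```javascript
// Hypothetical helper (not part of the sample): record each value, then pass it through
// unchanged so it can be wrapped around any expression in the cleaned-up code.
var trace = [];
function dump(name, value) {
  trace.push(name + " = " + JSON.stringify(value));
  return value;
}

// Made-up fragments standing in for the sample's obfuscated variables.
var partA = dump("partA", (1 != 2 ? "\x68\x74" : "\x71\x71"));
var partB = dump("partB", (3 < 1 ? "\x7a\x7a" : "\x74\x70"));
var full = dump("full", partA + partB);

// One line per variable, with the hex escapes already resolved by the interpreter.
console.log(trace.join("\n"));
```

Because dump returns its argument, the instrumented code behaves the same as before, which matters when later variables depend on earlier ones. As always, only run a live sample in an isolated environment.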

A section of the code that I cleaned up. I chose to group the variables together to make it easier to read.

Once I cleaned up the code, I simply built a small HTML page and loaded it up. Below, you can see the fully concatenated ASCII value of the variable shown in the last image:

And here we can see what had been hidden behind all that obfuscation. Well... we can see at least one variable, that is.

Well, that is basically all there was to it. Still, the code that had been inserted into the hacked website I reviewed was over 500 lines long, nearly half of the page's total HTML length once the malicious code is counted.

Let this serve as a reminder that attackers will go to painstaking lengths to protect their assets and prevent analysis/detection of their tools.

Cheers, and stay safe out there!