SeasideBishop: A C port of b33f’s UrbanBishop shellcode injector

Introduction

This post covers a recent C port I wrote of b33f’s neat C# shellcode loader UrbanBishop. The prolific Rastamouse also did a veriation of UrbanBishop, using D/Invoke, called RuralBishop. This injection method has some quirks I hadn’t seen done before, so I thought it would be interesting to port it to C.

Credit of course goes to b33f, Rastamouse as well, and special thanks to AsaurusRex and Adamant for their help in getting it working.

The code for this post is available here.

The Code

First, a quick outline of the injection method, and then I will break it down API by API. SeasideBishop creates a section and maps a view of it locally, opens a handle to a remote process, maps a view of that same section into the process, and copies shellcode into the local view. As as view of the same section is also mapped in the remote process, the shellcode has now been allocated across processes. Next a remote thread is created and an APC is queued on it. The thread is alerted and the shellcode runs.

Opening The Remote Process

get-pid

Above we see the use of the native API NtOpenProcess to acquire a handle to the remote process. Native APIs calls are used throughout SeasideBishop as they tend to be a bit more stealthy than Win32 APIs, though they are still vulnerable to userland hooking.

Sections

A neat feature of this technique is the way that the shellcode is allocated in the remote process. Instead of using a more common and suspicious API like WriteProcessMemory, which is well known to AV/EDR products, SeasideBishop takes advantage of memory mapped files. This is a way of copying some or all of a file into memory and operating on it there, rather than manipulating it directly on disk. Another way of using it, which we will do here, is as an inter-process communication (IPC) mechanism. The memory mapped file does not actually need to be an ordinary file on disk. It can be simply a region of memory backed by the system page file. This way two processes can map the same region in their own address space, and any changes are immediately accessible to the other.

The way a region of memory is mapped is by calling the native API NtCreateSection. As the name indicates, a section, or section object, is the term for the memory mapped region.

create-section

Above is the call to NtCreateSection within the local process. We create a section with a size of 0x1000, or 4096 bytes. This is enough to hold our demo shellcode, but might need to be increased to accommodate a larger payload. Note that the allocation will be rounded up to the nearest page size, which is normally 4k.

The next step is to create a view of the section. The section object is not directly manipulated, as it represents the file-backed region of memory. We create a view of the section and make changes to that view. The remote process can also map a view using the same section handle, thereby accessing the same section. This is what allows IPC to happen.

local-map-section

Here we see the call to NtMapViewOfSection to create the view in the local process. Notice the use of RW and not RWX permissions, as we simply need to write the shellcode to the view.

memcpy

Next a simple memcpy writes our shellcode to the view.

remote-map-section

Finally we map a view of the same section in the remote process. Note that this time we use RX permissions so that the shellcode is executable. Now we have our shellcode present in the remote process’s memory, without using APIs like WriteProcessMemory. Now let’s work on executing it.

Starting From The End

In order to execute our shellcode in the remote process, we need a thread. In order to create one, we need to give the thread a function or address to begin executing from. Though we are not using Win32 APIs, the documentation for CreateRemoteThreadEx still applies. We need a “pointer to [an] application-defined function of type LPTHREAD_START_ROUTINE to be executed by the thread and [serve as] the starting address of the thread in the remote process. The function must exist in the remote process.” The function we will use is RtlExitUserThread. This is not a very well documented function, but debugging indicates that this function is part of the thread termination process. So if we tell our thread to begin executing at this function, we are guaranteed that the thread will exit gracefully. That’s always a good thing when injecting into remote processes.

So now that we know the thread will exit, how do we get it to execute our code? We’ll get there soon, but first we need to get the address of RtlExitUserThread so that we can use it as the start address of our new remote thread.

function-address

There’s a lot going on here, but it’s really pretty simple. RtlExitUserThread is exported by ntdll.dll, so we need the DLL base address first before we can access its exports. We create the Unicode string needed by the LdrGetDllHandle native API call and then call it to get the address of ntdll.dll. With that done, we need to create the ANSI string required by LdrGetProcedureAddress to get the address of the RtlExitUserThread function. Again, notice no suspicious calls to LoadLibrary or GetProcAddress here.

Creating The Thread

Now that we have our thread start address, we can create it in the remote process.

create-remote-thread

Here we have the call to NtCreateThreadEx that creates the thread in the target process. Note the use of the pRemoteFunction variable, which contains the start address of RtlExitUserThread. Note also that the true argument above is a Boolean value for the CreateSuspended parameter, which means that the thread will be created in a suspended state and will not immediately begin executing. This will give us time to tell it about the shellcode we’d like it to run.

Execution

We’re in the home stretch now. The shellcode is in the remote process and we have a thread ready to execute it. We just need to connect the two together. To do that, we will queue an Asynchronous Procedure Call (APC) on the remote thread. APCs are a way of asynchronously letting a thread know that we have work for it to do. Each thread maintains an APC queue. When the thread is next scheduled, it will check that queue and run any APCs that are waiting for it, and then continue with its normal work. In our case, that work will be to run the RtlExitUserThread function and therefore exit gracefully.

queue-apc

Here we see how the thread and our shellcode meet. We use NtQueueApcThread to queue an APC onto the remote thread, using lpRemoteSection to point to the view containing the shellcode we mapped into the remote process earlier. Once the thread is alerted, it will check its APC queue and see our APC waiting for it.

alert-thread

A quick call to NtAlertResumeThread and the thread is alerted and runs our shellcode. Which of course pops the obligatory calc.

calc

Conclusion

I thought this was a neat injection method, with some quirks I hadn’t seen before, and I enjoyed porting it over to C and learning the concepts behind it in more detail. Hopefully others will find this useful as well.

Thanks again to b33f, Rasta, Adamant, and AsaurusRex for their help!