under void*->writeup DUCTF23

# binary_mail (DownUnderCTF-2023)
## overview
The code for the challenge as well as binaries and the solution can be found at
=> https://git.sr.ht/~gotplt/writeups
This is a challenge where you can send, receive and view messages.
I didn't get it in time: I had a working exploit on my machine, but it wouldn't work on the remote server. Read on to see what I got and what I missed from this awesome little challenge ;).
The messages are stored in local files, and from the source code you can see that there is a win() function available for us to jump to.
We also note the protections on the binary: NX and PIE are enabled (so ASLR applies to the binary itself) and there is no stack canary.
```
[*] '/home/dsp/ctf/downunder/binary_mail'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled
```
## Finding the bugs
By looking at the source we see that type-length-value (TLV) records are exchanged with every message to and from the binary. The types cover requesting input from the user, returning answers, reporting errors, etc.
The TL part of the TLV is defined as
```
typedef enum {
    TAG_RES_MSG,
    TAG_RES_ERROR,
    TAG_INPUT_REQ,
    TAG_INPUT_ANS,
    TAG_COMMAND,
    TAG_STR_PASSWORD,
    TAG_STR_FROM,
    TAG_STR_MESSAGE
} tag_t;

typedef struct {
    tag_t tag;
    unsigned long len;
} taglen_t;
```
which gives us 12 bytes on the wire: 4 for the enum, which should be a 32-bit integer, and 8 for the length (the fwrite in send_mail further down writes exactly 12 bytes per TL). Seeing that the length is unsigned, I originally thought that if I could overflow it I would be able to wrap it around to something small, so I was on the lookout for such a bug. Another class of bugs that is always applicable in such code is type confusion, or more generally manipulating the control part of a message (the TL) through the data part of the message (the V).
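To make the 12-byte layout concrete, here is a minimal sketch of packing and unpacking a TL. The helper names `pack_tl_sketch` and `unpack_tl_sketch` are hypothetical and are not the challenge's own pack_taglen(); the sketch assumes the tag is serialized as 4 bytes followed by the length as 8 bytes, both in host byte order (little-endian on amd64).

```c
#include <stdint.h>
#include <string.h>

/* Assumed wire size of one TL: 4-byte tag + 8-byte length. */
enum { TL_WIRE_SIZE = 12 };

/* Hypothetical helper: write tag then length into 12 contiguous bytes. */
void pack_tl_sketch(uint32_t tag, uint64_t len, unsigned char out[TL_WIRE_SIZE]) {
    memcpy(out, &tag, sizeof tag);              /* bytes 0..3  */
    memcpy(out + sizeof tag, &len, sizeof len); /* bytes 4..11 */
}

/* Hypothetical helper: the inverse of pack_tl_sketch(). */
void unpack_tl_sketch(const unsigned char in[TL_WIRE_SIZE], uint32_t *tag, uint64_t *len) {
    memcpy(tag, in, sizeof *tag);
    memcpy(len, in + sizeof *tag, sizeof *len);
}
```

Note that the in-memory struct is usually larger than this: on amd64 the compiler typically pads taglen_t to 16 bytes so that the unsigned long is 8-byte aligned, which is why the fields are copied one by one here instead of writing the struct as a whole.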
We also see a function which accesses the mailboxes in the filesystem as files
```
FILE* get_user_save_fp(char username[USERPASS_LEN], const char* mode) {
    char fname[USERPASS_LEN + 6];

    snprintf(fname, USERPASS_LEN + 6, "/tmp/%s", username);
    FILE* fp = fopen(fname, mode);

    return fp;
}
```
so we can't just open the flag file from there, since it would need to have a valid structure. The structure of a valid mailbox is
```
[TAG_STR_PASSWORD][PASS_LEN][...PASSWORD...][TAG_STR_FROM][FROM_LEN][...FROM...][TAG_STR_MESSAGE][MSG_LEN][...MESSAGE...]
```
The code seems to get the buffer sizes and writes right, and string termination bugs don't seem to apply, since strlen is not used extensively and the lengths are generally taken from the TL. However, the handle_auth() function, which checks the password on a user's mailbox before proceeding to display a message, has some overly generous error logging that reports back the type and length it read. In the context of CTFs this usually points to some sort of infoleak. I noticed it at the time, but then my attention turned elsewhere due to other findings. That was a mistake.
```
        read_taglen(fp, &tl);
        if(tl.tag != TAG_STR_PASSWORD || tl.len >= USERPASS_LEN) {
            print_tlv(TAG_RES_ERROR, "corrupted user file, got invalid taglen %d %lld", tl.tag, tl.len);
            return 0;
        }
```
Trying to find the most complex part of the code to look for bugs, I noticed the view_mail function. The idea is that it should open the mailbox, call handle_auth on it to verify the password stored at the beginning and then:
* read a taglen and, if that fails, report that there are no messages
* fail if the tag is not TAG_STR_FROM or the length is larger than allowed; otherwise copy "from: "+[...FROM...] to tmpbuf and add a newline
* read a taglen and fail if the tag is not TAG_STR_MESSAGE or the combined length of that message and the sender is larger than the buffer (remember that unsigned overflow?)
* read MSG_LEN message bytes into tmpbuf.
Below I have commented the part of this flow that will become relevant after the final finding.
```
    FILE* fp = handle_auth(tmpbuf);
    if(fp == 0) return;

    read_taglen(fp, &tl);
    if(feof(fp)) {
        print_tlv(TAG_RES_MSG, "no mail");
        return;
    }
    /*
     * if that length was attacker controlled, we could guide
     * the parser to read the next TLV from a location of our choice
     * probably in the message contents
     */
    unsigned long t1 = tl.len; 
    if(tl.tag != TAG_STR_FROM || t1 >= USERPASS_LEN) {
        print_tlv(TAG_RES_ERROR, "mail invalid from");
        return;
    }
    memcpy(tmpbuf, "from: ", 6);
    fread(tmpbuf + 6, 1, t1, fp);
    tmpbuf[6 + t1] = '\n';

    read_taglen(fp, &tl);
    unsigned long t2 = tl.len;
    if(tl.tag != TAG_STR_MESSAGE || (t1 + t2) >= USERPASS_LEN + MESSAGE_LEN) {
        print_tlv(TAG_RES_ERROR, "mail invalid message");
        return;
    }
    memcpy(tmpbuf + 6 + t1 + 1, "message: ", 9);
    fread(tmpbuf + 6 + t1 + 1 + 9, 1, t2, fp);

```
Finally we see a commented version of the send_mail function. I noticed the bug by interacting with the program and sending messages. Funnily enough, the first couple of times I created users with usernames of the same length ("testuser1", "testuser2", etc.) and the program behaved correctly. But when I created a username longer than the other one, I got the "mail invalid message" printout from the function above. The send_mail function:
* reads a username and authenticates it by opening the sender's mailbox and checking the password
* reads a recipient and opens their mailbox in append mode (so it will write right after their password)
* writes the sender's username TLV (the bug is here, because the L is actually still the recipient's username L)
* writes the message TLV
```
    if(handle_auth(tmpbuf) == 0) return; // auth sender or exit

    print_tlv(TAG_INPUT_REQ, "recipient");
    read_taglen(stdin, &tl); // tl is now recipient, t1 is still sender
    if(tl.tag != TAG_INPUT_ANS || tl.len >= USERPASS_LEN) {
        print_tlv(TAG_RES_ERROR, "invalid recipient input");
        return;
    }
    fread(tmpbuf2, 1, tl.len, stdin); // read recipient len bytes
    tmpbuf2[tl.len] = '\0';
    FILE* fp = get_user_save_fp(tmpbuf2, "a+");

    /* below on the taglen that was supposed to be the sender the length is still that of the recipient */
    pack_taglen(TAG_STR_FROM, tl.len, tmpbuf+USERPASS_LEN);
    fwrite(tmpbuf+USERPASS_LEN, 1, 12, fp); 
```

## Writing an exploit
Since we control the length in the FROM part of a message by controlling the length of the recipient's username, we can attempt to land the parser on a TLV that we created. Below I use T_ and L_ for TYPE and LENGTH, and RP, RU, SU for recipient password, recipient username and sender username respectively. Notice that by controlling L_RU we can jump into the contents of MSG and have the parser read our fake message.
```
[T_RP][L_RP][RP][T_SU][L_RU][SU][T_MSG][L_MSG][        MSG         ]
                                              [T_MSG1][L_MSG1][MSG1]
```
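The arithmetic behind the diagram can be sketched as follows. This assumes the layout shown above: the sender's username is written with its real length, each TL takes 12 bytes, and the parser skips L_RU bytes after the FROM taglen before reading the next one. The function name `recipient_len_for_landing` is made up for illustration.

```c
#include <stddef.h>

/* Assumed wire size of one TL: 4-byte tag + 8-byte length. */
enum { TL_WIRE_SIZE = 12 };

/*
 * Hypothetical helper: recipient username length (L_RU) needed so that the
 * parser's second read_taglen() starts k bytes into the MSG contents.
 * After the FROM taglen the parser skips L_RU bytes; the MSG contents begin
 * sender_len + 12 bytes after the start of SU (SU itself, then [T_MSG][L_MSG]).
 */
size_t recipient_len_for_landing(size_t sender_len, size_t k) {
    return sender_len + TL_WIRE_SIZE + k;
}
```

For example, with a 4-byte sender username the recipient username would need to be 16 bytes long for the fake TL to sit at the very start of MSG. The result still has to stay below USERPASS_LEN to pass the checks shown earlier.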
And what would that message be? Well, if you remember, we need to pass the length addition check in view_mail(), which adds L_SU (which is actually L_RU) and L_MSG and checks that the result is smaller than the sum of their maximum values. If we make the L_MSG that the parser reads (our L_MSG1) the max unsigned long value (0xffffffffffffffff), the addition wraps around and evaluates to L_RU - 1, which passes the check. Following that, fread() is going to read ulong_max bytes from the file, or until it reaches EOF.
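A small sketch of why the check passes. It mirrors the shape of the comparison in view_mail(), with `limit` standing in for USERPASS_LEN + MESSAGE_LEN (the values used in the example are placeholders, not taken from the source). Unsigned addition wraps modulo 2^64, so adding 0xffffffffffffffff is the same as subtracting 1.

```c
#include <stdint.h>

/*
 * Same shape as the view_mail() check, written as "passes" instead of "fails".
 * t1 is the FROM length, t2 the MESSAGE length, limit the buffer bound.
 */
int passes_length_check(uint64_t t1, uint64_t t2, uint64_t limit) {
    return (t1 + t2) < limit; /* the sum wraps around modulo 2^64 */
}
```

With t1 = 20 and t2 = 0xffffffffffffffff the sum is 19, so the check passes for any reasonable limit, while an honest oversized t2 such as 2000 is rejected.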
Since the binary has ASLR but we have a win function, my plan was to overwrite the 2 least significant bytes of the saved return address. Then, when ret executes, that value would be popped into rip and execution would go to the win function. I would also have my exploit run approximately 16 times until it hits a valid address (the last 3 nibbles of an address are known because ASLR only moves the page-aligned base and the offsets inside the segments don't change, so we only have to bruteforce the high nibble of the 2 bytes). By looking at the binary with gdb, I found the saved instruction pointer back to main from view_mail to be 1208 bytes after the beginning of tmpbuf (the "from: " string). Since tmpbuf is 1168 bytes long, we need to overflow 40 bytes past it (+ 2 to overwrite the 2 saved rip bytes). This is larger than the maximum length of a message (1024), so we need to send another message to be appended after this one, and put the rip overwrite payload in that.
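The numbers in this plan can be sketched in a few lines. The constants 1168 and 1208 are the ones measured above; 0x126a is the offset of win() as it appears in the gdb listing later in this writeup. The function names are hypothetical.

```c
#include <stdint.h>

enum {
    TMPBUF_SIZE   = 1168, /* size of tmpbuf, from the writeup */
    SAVED_RIP_OFF = 1208  /* saved return address, relative to tmpbuf start */
};

/* Bytes between the end of tmpbuf and the saved return address. */
unsigned padding_past_tmpbuf(void) {
    return SAVED_RIP_OFF - TMPBUF_SIZE;
}

/*
 * The 2 bytes to write over the low end of the saved return address:
 * the low 12 bits come from the known offset, the top nibble is a guess.
 */
uint16_t partial_overwrite_guess(uint16_t target_offset, unsigned nibble) {
    return (uint16_t)(((nibble & 0xFu) << 12) | (target_offset & 0x0FFFu));
}
```

For win() at offset 0x126a the candidates are 0x026a, 0x126a, ... 0xf26a, and exactly one of the 16 matches a given run, which is where the 1/16 success rate comes from. On amd64 the two bytes go onto the stack low byte first (0x6a, then the guessed byte).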

## Failing to get the flag
Since I was doing my testing on an older Debian 11 VM, I didn't initially have the right libc to run this binary (>= 2.34). So I downloaded 2.34, compiled it, and used patchelf to point the binary at the proper loader and rpath (just mentioning that here since it's useful)
```
patchelf --set-interpreter /path/to/your/libc/lib/ld-xxxx.so.2 --set-rpath /path/to/your/libc/lib/ binary
```
The exploit was working on my local machine 1/16th of the time, as expected. However, on their machine it was failing. A team-member suggested that it could be stack alignment (which was correct), so my idea was to find a nearby instruction to land on that wouldn't mess up the state too much and would be at an address divisible by 16. However, taking another look at the nearby instructions in the binary we see
```
gef➤  x/10xi *win-10
   0x1260:    mov    edi,eax    // this won't work due to the one more ret later.
   0x1262:    call   0x10d0
   0x1267:    nop
   0x1268:    pop    rbp
   0x1269:    ret               // this one.
   0x126a:    push   rbp        // not stack aligned
   0x126b:    mov    rbp,rsp
   0x126e:    lea    rax,[rip+0xd93]        # 0x2008 // we could land in here?
   0x1275:    mov    rdi,rax
   0x1278:    call   0x1070

gef➤  x/xi 0x1270
   0x1270:    add    eax,0xd93 // no, because rax will not have the correct system() arg ("cat /flag")

```
There is nothing! At this point I gave up, 6 minutes before the CTF ended. The next day I decided to try and use the infoleak possibility somehow. And sure enough, I could attempt to read an invalid file as a mailbox and get back its first 12 bytes (the TL). It just so happens that entries in /proc/$PID/maps are in the form start-end ... and start is 12 bytes long. So I could leak the base address of the binary. That would allow me to overwrite the saved return address with the address of a ret instruction in that binary (I now know all the bytes), which would eventually pop the following 8 bytes from the stack. Again, by knowing the base address I know the full address of the win() function and can ret2win properly.
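Turning the leaked error message back into an address could look roughly like this. It is a sketch under a few assumptions: the two numbers are the ones printed by the "got invalid taglen %d %lld" error in handle_auth(), the first maps line belongs to the binary, and its start address is exactly 12 hex digits. The function name `decode_leaked_base` and the example address are made up.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: the 12 ASCII hex characters at the start of the maps
 * file were parsed as a 4-byte tag and an 8-byte length. Copy both back into
 * a string in the same order and parse that string as a hex number.
 */
uint64_t decode_leaked_base(int32_t tag, int64_t len) {
    char text[13];
    memcpy(text, &tag, 4);     /* first 4 characters of the address */
    memcpy(text + 4, &len, 8); /* remaining 8 characters */
    text[12] = '\0';
    return strtoull(text, NULL, 16);
}
```

Adding the offset of win() (0x126a in the listing above) or of a ret instruction (0x1269) to the decoded base gives the full addresses needed for the final payload.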

## conclusion
Failure is usually better than success and I sure learned a lot from this challenge. Most importantly, never abandon a potential lead early. In these challenges, with high probability, everything is there for a reason. Thanks to the people that ran this great CTF; I really hope to play in it again next year :).