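AFL-fuzz in Practice

First Encounter

This post uses the AFL-TRAINING repository to learn fuzzing with AFL.

The required environment can be built with Docker. Go to the repository's environment directory and run

docker build . -t fuzz-training

to build the image. The build may fail because of network problems; retrying a few times is usually enough. It also helps to switch apt to the Tsinghua mirror and to prepare the AFL++ repository in advance.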

# Place this before apt-get update
RUN sed -i s@/archive.ubuntu.com/@/mirrors.tuna.tsinghua.edu.cn/@g /etc/apt/sources.list && sed -i s@/security.ubuntu.com/@/mirrors.tuna.tsinghua.edu.cn/@g /etc/apt/sources.list && apt-get clean

If the build also needs to git clone, a Docker proxy can be configured in ~/.docker/config.json

{
 "proxies":
 {
   "default":
   {
     "httpProxy": "http://192.168.1.12:58591",
     "httpsProxy": "http://192.168.1.12:58591",
     "noProxy": "*.test.example.com,.example2.com,127.0.0.0/8"
   }
 }
}

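Here 192.168.1.12 is the host machine's IP and the port is the proxy port. Remember to enable the proxy's "allow LAN" option, which lets machines on the local network use the proxy.

After the build finishes (it takes a bit over ten minutes), start the container

sudo docker run --privileged -ti --name=afl-train -e PASSMETHOD=env -e PASS=password fuzz-training /bin/bash

Then go to the quickstart directory and follow its tutorial as a test.

If it runs normally, the environment is ready.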

harness

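The directory provides a few files. We need to write a piece of code that feeds input to the target program, i.e. a harness. The code of library.c is as follows.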

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>

#include "library.h"

void lib_echo(char *data, ssize_t len){
	if(strlen(data) == 0) {
		return;
	}
	char *buf = calloc(1, len);
	strncpy(buf, data, len);
	printf("%s",buf);
	free(buf);

	// A crash so we can tell the harness is working for lib_echo
	if(data[0] == 'p') {
		if(data[1] == 'o') {
			if(data[2] =='p') {
				if(data[3] == '!') {
					assert(0);
				}
			}
		}
	}
}

int  lib_mul(int x, int y){
	if(x%2 == 0) {
		return y << x;
	} else if (y%2 == 0) {
		return x << y;
	} else if (x == 0) {
		return 0;
	} else if (y == 0) {
		return 0;
	} else {
		return x * y;
	}
}

There are two functions here. To fuzz them, the harness needs to read from standard input, for example

#include <unistd.h>
#include <string.h>
#include <stdio.h>

#include "library.h"

// fixed size buffer based on assumptions about the maximum size that is likely necessary to exercise all aspects of the target function
#define SIZE 50

int main() {
	// make sure buffer is initialized to eliminate variable behaviour that isn't dependent on the input.
	char input[SIZE] = {0};

	ssize_t length;
	length = read(STDIN_FILENO, input, SIZE);

	lib_echo(input, length);
}

For lib_mul, which takes two arguments, the following can be used

#include <unistd.h>
#include <string.h>
#include <stdio.h>

#include "library.h"

// fixed size buffer based on assumptions about the maximum size that is likely necessary to exercise all aspects of the target function
#define SIZE 100

int main(int argc, char* argv[]) {
	if((argc == 2) && strcmp(argv[1], "echo") == 0) {
		// make sure buffer is initialized to eliminate variable behaviour that isn't dependent on the input.
		char input[SIZE] = {0};

		ssize_t length;
		length = read(STDIN_FILENO, input, SIZE);

		lib_echo(input, length);
	} else if ((argc == 2) && strcmp(argv[1], "mul") == 0) {
		// initialize both operands so a short read cannot leave one of them undefined
		int a = 0, b = 0;
		read(STDIN_FILENO, &a, 4);
		read(STDIN_FILENO, &b, 4);
		printf("%d\n", lib_mul(a,b));
	} else {
		printf("Usage: %s mul|echo\n", argv[0]);
	}
}

When the whole input needs to be read, the harness can also take the input file path through argv.

#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, char* argv[]) {
    ifstream file;
    size_t length;
    file.open(argv[1]);
    file.seekg(0, ios::end);
    length = file.tellg();
    file.seekg(0, ios::beg);
    char *buffer = new char[length + 1];
    file.read(buffer, length);
    file.close();
    // do something with buffer
}

libxml2

This section fuzzes libxml2, a library for parsing XML data.

The vulnerability to find and reproduce is CVE-2015-8317.

CC=afl-clang-fast ./autogen.sh # you could also use afl-clang-lto, which is usually the better choice, but - oddly - in this case it takes longer to find the bug with an lto build.
AFL_USE_ASAN=1 make -j 4

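This configures and compiles the library.

First look at the vulnerability: the problem is in the xmlParseXMLDecl function.

Judging from the code, fuzzing either xmlReadFile or xmlReadMemory reaches this function, so we need a harness that calls one of them.

The libxml2 website provides some examples, and one of them can be picked and modified: http://www.xmlsoft.org/examples/

The harness also uses some of the AFL macros for fuzzing: https://github.com/AFLplusplus/AFLplusplus/blob/08ca4d54a55fe73e64a994c41a12af61f52e497e/instrumentation/README.persistent_mode.md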

#include "libxml/parser.h"
#include "libxml/tree.h"
#include <unistd.h>

__AFL_FUZZ_INIT();

int main(int argc, char **argv) {
    #ifdef __AFL_HAVE_MANUAL_CONTROL
        __AFL_INIT();
    #endif
    unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;  // must be after __AFL_INIT

    xmlInitParser();
    while (__AFL_LOOP(1000)) {
        int len = __AFL_FUZZ_TESTCASE_LEN;
        xmlDocPtr doc = xmlReadMemory((char *)buf, len, "a.xml", NULL, 0);
        if (doc != NULL) {
            xmlFreeDoc(doc);
        }
    }
    xmlCleanupParser();

    return(0);
}

The fallback definitions below give a glimpse of what these __AFL macros do

#ifndef __AFL_FUZZ_TESTCASE_LEN
  ssize_t fuzz_len;
  #define __AFL_FUZZ_TESTCASE_LEN fuzz_len
  unsigned char fuzz_buf[1024000];
  #define __AFL_FUZZ_TESTCASE_BUF fuzz_buf
  #define __AFL_FUZZ_INIT() void sync(void);
  #define __AFL_LOOP(x) ((fuzz_len = read(0, fuzz_buf, sizeof(fuzz_buf))) > 0 ? 1 : 0)
  #define __AFL_INIT() sync()
#endif

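In effect, the initialized xmlParser state is kept alive while the loop keeps reading new contents from buf, so every fuzz iteration runs on the existing state instead of starting a new process, which improves fuzzing throughput.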

Compile with instrumentation and start fuzzing

AFL_USE_ASAN=1 afl-clang-fast ./harness.c -I libxml2/include libxml2/.libs/libxml2.a -lz -lm -o fuzzer
mkdir in
echo "<hi></hi>" > in/a
afl-fuzz -i in -o out -x /home/fuzzer/AFLplusplus/dictionaries/xml.dict ./fuzzer @@

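The XML dictionary is added here so that the fuzzer generates more useful test cases.

For classifying crashes, see the approach described in this article: https://mundi-xu.github.io/2021/03/12/Start-Fuzzing-and-crashes-analysis/

First install exploitable and afl-utils; both can be installed with Python.

This harness works well for fuzzing, but reproducing crashes or classifying them with afl-collect needs a program that takes the crash file as ordinary input, which this harness does not, so the code is changed to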

#include "libxml/parser.h"
#include "libxml/tree.h"
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>


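int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "Usage: %s <input-file>\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    struct stat sb;
    if (stat(argv[1], &sb) == -1) {
        perror("stat");
        exit(EXIT_FAILURE);
    }

    FILE* input_file = fopen(argv[1], "rb");
    if (input_file == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }

    // exact-size buffer, so ASAN can still flag reads past the end of the input
    char* buf = malloc(sb.st_size);
    if (buf == NULL && sb.st_size > 0) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    size_t len = fread(buf, 1, sb.st_size, input_file);
    fclose(input_file);

    xmlInitParser();
    xmlDocPtr doc = xmlReadMemory(buf, (int)len, "a.xml", NULL, 0);
    if (doc != NULL) {
        xmlFreeDoc(doc);
    }
    xmlCleanupParser();
    free(buf);
    return(0);
}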

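Compile this version and test the crashes with it.

However, according to the call stacks, these are all errors raised by xmlFatalError in the code, and afl-collect found no crashes either

afl-collect -d crashes.db -e gdb_script -r -rr ./out ./in -j 8 -- ./fuzzer

So I extended the fuzzing time. After about 7 hours there were 44 crashes, but the target bug still had not been found. At least we learned about some of AFL's macro definitions.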

heartbleed

First configure the program

cd openssl
CC=afl-clang-fast CXX=afl-clang-fast++ ./config -d
AFL_USE_ASAN=1 make

This challenge does not need a new harness file; the work is done directly in handshake.cc. It is actually simple: only the data input part needs to be filled in.

// Copyright 2016 Google Inc. All Rights Reserved.
// Licensed under the Apache License, Version 2.0 (the "License");
#include <openssl/ssl.h>
#include <openssl/err.h>
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <unistd.h>

#ifndef CERT_PATH
# define CERT_PATH
#endif

SSL_CTX *Init() {
  SSL_library_init();
  SSL_load_error_strings();
  ERR_load_BIO_strings();
  OpenSSL_add_all_algorithms();
  SSL_CTX *sctx;
  assert (sctx = SSL_CTX_new(TLSv1_method()));
  /* These two file were created with this command:
      openssl req -x509 -newkey rsa:512 -keyout server.key \
     -out server.pem -days 9999 -nodes -subj /CN=a/
  */
  assert(SSL_CTX_use_certificate_file(sctx, "server.pem",
                                      SSL_FILETYPE_PEM));
  assert(SSL_CTX_use_PrivateKey_file(sctx, "server.key",
                                     SSL_FILETYPE_PEM));
  return sctx;
}

int main() {
  static SSL_CTX *sctx = Init();
  SSL *server = SSL_new(sctx);
  BIO *sinbio = BIO_new(BIO_s_mem());
  BIO *soutbio = BIO_new(BIO_s_mem());
  SSL_set_bio(server, sinbio, soutbio);
  SSL_set_accept_state(server);

  /* TODO: To spoof one end of the handshake, we need to write data to sinbio
   * here */
  uint8_t data[100] = {0};
  // read() returns ssize_t; a signed type keeps the error check meaningful
  ssize_t size = read(STDIN_FILENO, data, sizeof(data));
  if (size < 0) {
    printf("Failed to read from stdin \n");
    return (-1);
  }
  BIO_write(sinbio, data, (int)size);

  SSL_do_handshake(server);
  SSL_free(server);
  return 0;
}

Compile and run

AFL_USE_ASAN=1 afl-clang-fast++ -g handshake.cc openssl/libssl.a openssl/libcrypto.a -o handshake -I openssl/include -ldl
/home/fuzzer/AFLplusplus/utils/asan_cgroups/limit_memory.sh -u fuzzer afl-fuzz -i in -o out ./handshake

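At this point the script reported that swap had to be disabled, but nothing I tried managed to turn it off, so the fuzzing run could not continue. Let's look at how the vulnerability works instead.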

ntpq

In ntp-4.2.2, replace the body of the ntpqmain function with

#ifdef __AFL_HAVE_MANUAL_CONTROL
        __AFL_INIT();
#endif
        int datatype=0;
        int status=0;
        char data[1024*16] = {0};
        int length=0;
#ifdef __AFL_HAVE_MANUAL_CONTROL
        while (__AFL_LOOP(1000)) {
#endif
                datatype=0;
                status=0;
                memset(data,0,1024*16);
                read(0, &datatype, 1);
                read(0, &status, 1);
                length = read(0, data, 1024 * 16);
                cookedprint(datatype, length, data, status, stdout);
#ifdef __AFL_HAVE_MANUAL_CONTROL
        }
#endif
        return 0;

Compile and run

CC=afl-clang-fast ./configure && AFL_HARDEN=1 make -C ntpq
cd ..
mkdir in
echo aaa > in/a
afl-fuzz -i in -o out -x ntpq.dict ntp-4.2.2/ntpq/ntpq

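Crashes show up almost immediately. Classify them:

afl-collect -d crashes.db -e gdb_script -r -rr ./out ./in -j 8 -- ntp-4.2.2/ntpq/ntpq

Many of them are reported as exploitable. After afl-collect deduplicated the crashes and dropped the uninteresting ones, 6 cases remained. afl-collect also generated gdb_script, and passing that file to gdb reproduces the crash scenarios

gdb --command=in/gdb_script

Note that the exploitable plugin hints at how exploitable each crash is.

Analysis shows that this is an out-of-bounds access, which is also the vulnerable spot of CVE-2009-0159.

Next, let's see how to check coverage. Go to the ntp-4.2.8p10 directory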

CC=clang CFLAGS="--coverage -g -O0" ./configure && make -C ntpq
cd ..
for F in out/default/queue/id* ; do ./ntp-4.2.8p10/ntpq/ntpq < $F > /dev/null ; done
cd ./ntp-4.2.8p10/ntpq/ && llvm-cov gcov ntpq.c

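The result shows that the cookedprint function was not covered at all.

Summary

The remaining exercises teach how to fuzz other kinds of input. For environment variables, for example, main can be modified to first read from stdin and then call setenv to set the variable. afl-analyze examines each byte of an input file and classifies it, which helps identify the bytes of the input that matter.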