Fuzz Testing
Last updated
Paper Link: https://arxiv.org/abs/2312.04512
Smart contracts have emerged as the cornerstone of blockchain technology, enabling the automation and execution of agreements without the need for intermediaries. They have revolutionized the way agreements are executed in decentralized environments, offering transparency, efficiency, and trustlessness.
However, smart contracts, like any software, are susceptible to vulnerabilities that can be exploited by malicious actors. Common vulnerabilities include reentrancy attacks, integer overflow, authorization flaws, and logic errors. These vulnerabilities can result in financial loss, privacy breaches, and disruption of services. While the potential benefits of smart contracts are vast, ensuring the security of users who interact with them is paramount. By focusing on user security, we aim to foster a safer and more resilient ecosystem for smart contract adoption and usage.
Example. Imagine a scenario where a token contract allows users to buy tokens but not sell them, resulting in the token price continuously rising due to the inability to sell. This setup attracts more users to invest, believing they will benefit from the increasing value of the tokens. However, if the contract owner withdraws all funds at this point, the users' funds are drained from the contract, causing significant financial losses to investors. This is a classic example of a honeypot contract, deployed with malicious intent to exploit the trust and naivety of users in the blockchain ecosystem. To mitigate the risks associated with honeypot contracts, users must exercise caution and conduct thorough due diligence before interacting with smart contracts, especially those offering high returns or incentives that seem too good to be true. In addition, developers have a responsibility to prioritize security and transparency when deploying smart contracts. By adhering to best practices and conducting thorough security audits, developers can help prevent the creation of honeypot contracts that pose a threat to unsuspecting users.
Towards smart contract user security, we propose MuFuzz, a novel fuzzing paradigm that dynamically tests smart contracts and exposes potential vulnerabilities in them. Fuzzing has proven to be a practical technique for uncovering vulnerabilities in smart contracts. Users can run MuFuzz on smart contracts on the blockchain before interacting with them, allowing them to identify potential pitfalls beforehand and mitigate the associated risks.
In the following, we present the specific technical details of MuFuzz, which consists of three key components, namely sequence-aware mutation, mask-guided seed mutation, and dynamic-adaptive energy adjustment. Figure 1 shows the high-level architecture and analysis pipeline of MuFuzz.
MuFuzz begins by taking the contract source code as input, which is then compiled into three representations: bytecode, application binary interface (ABI), and abstract syntax tree (AST). The bytecode is disassembled into EVM instructions for fuzzing. Meanwhile, MuFuzz captures the data dependencies of all state variables in the contract. By analyzing the ABI and AST, MuFuzz determines which state variables are defined and which functions access them. Since smart contracts are stateful programs, MuFuzz ignores functions that do not access any state variables because they cannot affect the persistent state. MuFuzz then tracks each state variable and its read and write operations, such as assignments and comparisons.
Afterwards, MuFuzz derives a transaction sequence based on the information gathered from the data dependency analysis of the state variables. Put succinctly, MuFuzz approximately determines a sequence of transactions in which transaction T1 is executed before transaction T2 only if T1 writes a state variable V that T2 reads. As a result, MuFuzz is able to estimate the invocation order of each transaction in the sequence.
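The write-before-read rule can be illustrated with a short sketch. This is not MuFuzz's implementation: the function name `derive_sequence`, the input format, and the greedy tie-breaking are assumptions made for illustration. Each function is scheduled once the functions that write the variables it reads have been scheduled; when dependencies are cyclic, the function with the fewest unscheduled writers goes first, which keeps the ordering approximate.

```python
def derive_sequence(accesses):
    """Order functions so writers of a state variable tend to precede its readers.

    `accesses` maps a function name to a (reads, writes) pair of sets of
    state-variable names. This input format is hypothetical.
    """
    # predecessors[f] = functions that write a variable f reads
    predecessors = {name: set() for name in accesses}
    for reader, (reads, _) in accesses.items():
        for writer, (_, writes) in accesses.items():
            if writer != reader and reads & writes:
                predecessors[reader].add(writer)

    order = []
    remaining = set(accesses)
    while remaining:
        # Greedy choice: fewest unscheduled writers first. Sorting by name
        # makes ties deterministic and lets cyclic dependencies be broken.
        nxt = min(sorted(remaining), key=lambda f: len(predecessors[f] & remaining))
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Hypothetical read/write sets for a small wallet-like contract.
example = {
    "withdraw": ({"balance", "owner"}, {"balance"}),
    "deposit": (set(), {"balance"}),
    "setOwner": (set(), {"owner"}),
}
print(derive_sequence(example))  # ['deposit', 'setOwner', 'withdraw']
```

Here `withdraw` reads both `balance` and `owner`, so it is placed after the two functions that write them.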
MuFuzz then determines the inputs of the transaction sequence. A trivial way is to generate the test inputs of transactions randomly. However, purely random inputs rarely satisfy complicated branch conditions. To address this challenge, MuFuzz introduces a seed evolution paradigm that iteratively refines the test inputs of transactions. MuFuzz first adopts a branch-distance-feedback seed selection strategy, guiding the fuzzer to select high-quality seeds. Furthermore, MuFuzz employs a mask-guided seed mutation strategy, which allows the fuzzer to identify the parts of the test inputs that should not be mutated, thus guiding the seed mutation to hit target branches more efficiently. MuFuzz starts by creating an empty seed queue and a set of seeds as inputs, followed by performing seed selection and seed mutation, respectively.
Branch Distance Feedback. MuFuzz tracks the seed execution and records the branches that each test case covers. Whenever a test case covers a new branch, it is added to the seed queue. While this strategy has been shown to quickly traverse most of the branches, it still has difficulty in covering those branches that are guarded by strict conditions.
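A minimal sketch of how branch distance can complement coverage feedback is given below. The distance formulas are the conventional ones from search-based testing and are an assumption here, as are the names `branch_distance` and `SeedQueue`; the exact metric used by MuFuzz is not specified in this section. A seed is kept if it covers a new branch or gets closer to a branch that has not been covered yet.

```python
def branch_distance(op, left, right):
    """Distance to making `left op right` true; 0 means the condition holds."""
    if op == "==":
        return abs(left - right)
    if op == "!=":
        return 0 if left != right else 1
    if op == "<":
        return 0 if left < right else left - right + 1
    if op == "<=":
        return 0 if left <= right else left - right
    if op == ">":
        return 0 if left > right else right - left + 1
    if op == ">=":
        return 0 if left >= right else right - left
    raise ValueError(f"unsupported comparison: {op}")


class SeedQueue:
    """Keeps seeds that add coverage or approach an uncovered branch."""

    def __init__(self):
        self.seeds = []
        self.covered = set()
        self.best_distance = {}  # branch id -> smallest distance seen so far

    def consider(self, seed, covered_branches, distances):
        keep = False
        new_branches = set(covered_branches) - self.covered
        if new_branches:
            self.covered |= new_branches
            keep = True
        for branch, dist in distances.items():
            if branch in self.covered:
                continue  # already reached, distance no longer matters
            if dist < self.best_distance.get(branch, float("inf")):
                self.best_distance[branch] = dist
                keep = True
        if keep:
            self.seeds.append(seed)
        return keep
```

For a guard such as `require(x == 42)`, an input with `x = 10` has distance 32, so a later input with `x = 40` (distance 2) would be kept even though it covers no new branch.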
Mutation Masking. To further bias test input generation towards target branches, MuFuzz incorporates a mask-guided seed mutation strategy. The strategy derives from two observations: (1) certain parts of a test input that hits a deeply nested branch are critical to satisfying the conditions for reaching that branch; (2) certain parts of a test input that reduces the distance to a branch play a key role in approaching that branch. Therefore, to generate more mutated inputs that hit target branches, the crucial parts of the test inputs should not be mutated. Inspired by this, MuFuzz first customizes the selection of test inputs to mutate from the seed queue. It selects the inputs that either hit deeply nested branches or make the branch distance smaller. We say that a branch br is a nested branch if and only if br contains at least two nested conditional statements. Each nested branch is associated with a nested score, which is set to the number of nested conditional statements. After selecting the seeds to mutate, MuFuzz applies a mutation mask computation algorithm, which approximates the critical parts of the test inputs that must not be mutated. MuFuzz uses a set of mutation operators, including byte flipping, replacing bytes with interesting values, byte insertion, and byte deletion.
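The idea of a mutation mask can be sketched as follows. This is an illustrative approximation, not the algorithm from the paper: the helper names `compute_mask` and `mutate`, the per-byte probing strategy, and the `still_good` oracle are all assumptions. Each byte is perturbed in turn; if the perturbed input no longer reaches the target branch, that byte is treated as critical and frozen.

```python
import random


def compute_mask(seed, still_good):
    """Return a list of booleans; mask[i] is True if byte i may be mutated.

    `still_good(data)` is a hypothetical oracle that executes `data` and
    reports whether it still hits the target branch (or does not increase
    the branch distance).
    """
    mask = []
    for i in range(len(seed)):
        probe = bytearray(seed)
        probe[i] ^= 0xFF  # flip every bit of byte i
        mask.append(still_good(bytes(probe)))
    return mask


def mutate(seed, mask, rng):
    """Flip one random bit in a byte that the mask allows to change."""
    allowed = [i for i, ok in enumerate(mask) if ok]
    if not allowed:
        return bytes(seed)  # nothing may change; return the seed as-is
    out = bytearray(seed)
    i = rng.choice(allowed)
    out[i] ^= 1 << rng.randrange(8)
    return bytes(out)


# Toy target: the branch is reached only if the first four bytes match.
PREFIX = b"\xde\xad\xbe\xef"
seed = PREFIX + b"\x00\x01"
mask = compute_mask(seed, lambda data: data[:4] == PREFIX)
child = mutate(seed, mask, random.Random(0))
```

In this toy setting the mask freezes the four prefix bytes, so every mutated child still reaches the target branch. The sketch covers only bit flipping; insertion and deletion shift byte positions and would need a position-aware mask.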
In practice, after reviewing a large number of real-world smart contracts, we empirically observe that updates to state variables tend to be protected by strict branch conditions or hidden in deeply nested branches. Unfortunately, conventional fuzzers may waste substantial resources on common branches, while the energy allocated to deeply nested branches or branches that are likely to contain bugs is insufficient. To address this problem, MuFuzz adopts a dynamic-adaptive energy adjustment mechanism, which makes the allocation of fuzzing resources across branches more balanced and flexible.
MuFuzz is equipped with a pre-fuzz phase that executes a test input on an instrumented EVM to collect the exercised path. Given the path P, MuFuzz initializes the fuzzing resources. After that, it analyzes all split points (i.e., branch instructions) in P. During the pre-fuzz phase, MuFuzz sets a weight value for each exercised branch. Nested branches are assigned different weight values based on their nested score, and a branch covering a vulnerable instruction is assigned an additional weight value. It is worth mentioning that the pre-fuzz phase has little impact on the overall runtime overhead of the fuzzer.
In subsequent fuzzing rounds, MuFuzz dynamically adjusts resource allocation according to the weight value of each branch. That is, the higher the weight value of a target branch, the more fuzzing resources are allocated along the path to that branch. Moreover, MuFuzz also leverages the energy allocation feedback to guide seed mutation: seeds that reach branches covering vulnerable instructions are preferentially selected and fuzzed. With the dynamic-adaptive energy allocation strategy, MuFuzz gives these target branches sufficient attention, making the fuzzing process more balanced across branches.
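A weight-proportional allocation of this kind can be sketched in a few lines. The constants (`base`, `vuln_bonus`), the additive weight formula, and the function names are assumptions chosen for illustration; the source only states that weights grow with the nested score and that branches covering a vulnerable instruction receive an additional weight.

```python
def branch_weight(nested_score, has_vulnerable_instruction, base=1, vuln_bonus=2):
    """Hypothetical weight: base + nested score (+ bonus for vulnerable code)."""
    bonus = vuln_bonus if has_vulnerable_instruction else 0
    return base + nested_score + bonus


def allocate_energy(weights, total_energy):
    """Split `total_energy` (e.g. mutations per round) in proportion to weight."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return {branch: 0 for branch in weights}
    return {
        branch: total_energy * weight // total_weight
        for branch, weight in weights.items()
    }


# Hypothetical branches collected during the pre-fuzz phase.
weights = {
    "common": branch_weight(0, False),      # weight 1
    "nested": branch_weight(3, False),      # weight 4
    "nested_vuln": branch_weight(2, True),  # weight 5
}
energy = allocate_energy(weights, total_energy=1000)
print(energy)  # {'common': 100, 'nested': 400, 'nested_vuln': 500}
```

With these illustrative numbers, the common branch receives a tenth of the budget, while the nested branch that covers a vulnerable instruction receives half.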
| Vulnerability type | True positives |
| --- | --- |
| BLOCK DEPENDENCY | 15 |
| UNPROTECTED DELEGATECALL | 17 |
| ETHER FREEZING | 14 |
| INTEGER OVER-/UNDER-FLOW | 62 |
| REENTRANCY | 16 |
| UNPROTECTED SELF-DESTRUCT | 23 |
| STRICT ETHER EQUALITY | 19 |
| TRANSACTION ORIGIN USE | 2 |
| UNHANDLED EXCEPTION | 27 |
| Total | 195 |

Table 1 The nine types of smart contract vulnerabilities that can be detected by MuFuzz
MuFuzz can currently detect nine types of smart contract vulnerabilities. On 155 vulnerable smart contracts, MuFuzz uncovers 195 true positives, which are summarized in Table 1.
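| Vulnerability type | Reported alarms | True positives |
| --- | --- | --- |
| BLOCK DEPENDENCY | 21 | 20 |
| UNPROTECTED DELEGATECALL | 0 | 0 |
| ETHER FREEZING | 0 | 0 |
| INTEGER OVER-/UNDER-FLOW | 42 | 42 |
| REENTRANCY | 10 | 7 |
| UNPROTECTED SELF-DESTRUCT | 1 | 1 |
| STRICT ETHER EQUALITY | 2 | 2 |
| TRANSACTION ORIGIN USE | 0 | 0 |
| UNHANDLED EXCEPTION | 10 | 9 |
| Total | 86 | 81 |

Table 2 Real-World Case Studies of MuFuzz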
We randomly select 100 real smart contracts from Etherscan, each of which has more than 30,000 transactions on Ethereum. We manually check the bug detection results to identify the true positives. In addition, we present the overall branch coverage (i.e., the average of the 100 contract runs) of MuFuzz. Table 2 summarizes the experimental results. From the table, we can see that MuFuzz reports a total of 86 bug alarms. Out of the 100 contracts, 39 contracts are flagged as having at least one of these alarms. We manually verify the alarms and confirm that 94% of them (81 of 86) are true positives.
Overall, user security is a fundamental consideration in the design and deployment of smart contracts. By actively addressing prevalent vulnerabilities, we can establish a more secure ecosystem conducive to the widespread adoption and utilization of smart contracts. As the smart contract landscape continues to evolve and expand, strengthening user security is essential to fostering trust in decentralized systems. MuFuzz, which combines sequence-aware mutation, mask-guided seed mutation, and dynamic-adaptive energy adjustment, is a practical tool for dynamically testing smart contracts. By enabling users to identify potential security risks in advance, MuFuzz helps protect their interests.