Publications | Soha Hussein

2024

Generator-Based Fuzzers with Type-Based Targeted Mutation

Soha Hussein, Stephen McCamant, and Mike Whalen

2024

Abs PDF

As with any fuzzer, directing Generator-Based Fuzzers (GBF) to reach particular code targets can increase the fuzzer’s effectiveness. In previous work, coverage-guided fuzzers used a mix of static analysis, taint analysis, and constraint-solving approaches to address this problem. However, none of these techniques were particularly crafted for GBF where input generators are used to construct program inputs. The observation is that input generators carry information about the input structure that is naturally present through the typing composition of the program input.In this paper, we introduce a type-based mutation heuristic, along with constant string lookup, for Java GBF. Our key intuition is that if one can identify which sub-part (types) of the input will likely influence the branching decision, then focusing on mutating the choices of the generators constructing these types is likely to achieve the desired coverages. We used our technique to fuzz AWSLambda applications. Results compared to a baseline GBF tool show an almost 20% average improvement in application coverage, and larger improvements when third-party code is included.

2023

Structural Test Input Generation for 3-Address Code Coverage Using Path-Merged Symbolic Execution

Soha Hussein, Stephen McCamant, Elena Sherman, Vaibhav Sharma, and Mike Whalen

In 2023 IEEE/ACM International Conference on Automation of Software Test (AST), 2023

Abs PDF

Test input generation is one of the key applications of symbolic execution (SE). However, being a path-sensitive technique, SE often faces path explosion even when creating a branch-adequate test suite. Path-merging symbolic execution (PM-SE) alleviates the path explosion problem by summarizing regions of code into disjunctive constraints, thus traversing at once a set of paths with the same prefixes. Previous work has shown that PM-SE can reduce run-time up to 38%, though these improvements can be impaired if the summarized code results in complex constraints or introduces additional symbols that increase the number of branching points in the later execution. Considering these trade-offs, examining the ability of PM- SE to generate branch-adequate test inputs is an open research problem. This paper investigates it by developing a technique that extracts structural coverage-related queries from disjoint constraints. Using this approach, we extend PM-SE to generate branch-adequate test inputs. Experiments compare the effectiveness and efficiency of test input generation by SE and PM-SE techniques. Results show that those techniques are complementary. For some programs, PM-SE yields faster coverage, with fewer generated tests, while for others, SE performs better. In addition, each technique covers branches that the other fails to discover.
Java Ranger: Supporting String and Array Operations in Java Ranger (Competition Contribution)

Soha Hussein, Qiuchen Yan, Stephen McCamant, Vaibhav Sharma, and Michael W. Whalen

In Tools and Algorithms for the Construction and Analysis of Systems, 2023

Abs PDF

Java Ranger is a path-merging tool for Java Programs. It identifies branching regions of code and summarizes them by generating a disjunctive logical constraint that describes the behavior of the code region. Previously, Java Ranger showed that a reduction of 70% of execution paths is possible when used to merge branching regions of code that support numeric constraints.

2022

Counterexample-Guided Inductive Repair of Reactive Contracts

Soha Hussein, Sanjai Rayadurgam, Stephen McCamant, Vaibhav Sharma, and Mats Heimdahl

In Proceedings of the IEEE/ACM 10th International Conference on Formal Methods in Software Engineering, 2022

Abs PDF

Executable implementations are ultimately the only dependable representations of a software component’s behavior. Incorporating such a component in a rigorous model-based development of reactive systems poses challenges since a formal contract over its behaviors will have to be crafted for system verification. Simply hypothesizing a contract based on informal descriptions of the component is problematic: if it is too weak, we may fail in verifying valid system-level contracts; if it is too strong or simply erroneous, the system may fail in operation. Thus, establishing a valid and strong enough contract is crucially important.In this paper, we propose to repair the invalid hypothesized contract by replacing one or more of its sub-expressions with newly composed expressions, such that the new contract holds over the implementation. To this effect, we present a novel, sound, semantically minimal, and under reasonable assumptions terminating, and complete counterexample-guided general-purpose algorithm for repairing contracts. We implemented and evaluated our technique on more than 4,000 mutants with various complexities generated from 29 valid contracts for 4 non-trivial Java reactive components. Results show a successful repair rate of 81.51%, with 20.72% of the repairs matching the manually written contracts and 60.79% of the repairs describing non-trivial valid contracts.

2021

Counterexample Guided Inductive Repair of Reactive Contracts

Soha Hussein, Vaibhav Sharma, Stephen McCamant, Sanjai Rayadurgam, and Mats Heimdahl

In 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2021

Abs PDF

Using third-party executable components to build control systems poses challenges for veriﬁcation. This is because the informal behavior descriptions that typically accompany the components often fall short of the needed rigor. Consequently, there is a need to formalize a component contract that is strong enough to help establish system properties and also weak enough to account for all potential component behaviors in the system’s context. In this paper, we present a novel approach that allows an analyst to hypothesize a component contract, explore if the component meets the contract, and, if not, have automated support to help repair the contract. Preliminary results show that, in more than 32% of the cases, the repaired contract is logically equivalent to a developer-written one; in a further 63% of cases, it is a distinct, valid, and non-trivial property of the component.

2020

Java Ranger: Statically Summarizing Regions for Efficient Symbolic Execution of Java

Vaibhav Sharma, Soha Hussein, Michael W. Whalen, Stephen McCamant, and Willem Visser

In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2020

Abs PDF

Merging execution paths is a powerful technique for reducing path explosion in symbolic execution. One approach, introduced and dubbed “veritesting” by Avgerinos et al., works by translating abounded control flow region into a single constraint. This approach is a convenient way to achieve path merging as a modification to a pre-existing single-path symbolic execution engine. Previous work evaluated this approach for symbolic execution of binary code, but different design considerations apply when building tools for other languages. In this paper, we extend the previous approach for symbolic execution of Java. Because Java code typically contains many small dynamically dispatched methods, it is important to include them in multi-path regions; we introduce dynamic inlining of method-regions to do so modularly. Java’s typed memory structure is very different from the binary representation, but we show how the idea of static single assignment (SSA) form can be applied to object references to statically account for aliasing. We have implemented our algorithms in Java Ranger, an extension to the widely used Symbolic Pathfinder tool. In a set of nine benchmarks, Java Ranger reduces the running time and number of execution paths by a total of 38% and 71% respectively as compared to SPF. Our results are a significant improvement over the performance of JBMC, a recently released verification tool for Java bytecode. We also participated in a static verification competition at a top theory conference where other participants included state-of-the-art Java verifiers. JR won first place in the competition’s Java verification track.
Java Ranger at SV-COMP 2020 (Competition Contribution)

Vaibhav Sharma, Soha Hussein, Michael W. Whalen, Stephen McCamant, and Willem Visser

In Tools and Algorithms for the Construction and Analysis of Systems, 2020

Abs PDF

Path-merging is a known technique for accelerating symbolic execution. One technique, named “veritesting” by Avgerinos et al. uses summaries of bounded control-flow regions and has been shown to accelerate symbolic execution of binary code. But, when applied to symbolic execution of Java code, veritesting needs to be extended to summarize dynamically dispatched methods and exceptional control-flow. Such an extension of veritesting has been implemented in Java Ranger by implementing as an extension of Symbolic PathFinder, a symbolic executor for Java bytecode. In this paper, we briefly describe the architecture of Java Ranger and describe its setup for SV-COMP 2020.

2013

Controlling Data Flow with a Policy-Based Programming Language for the Web

Thierry Sans, Iliano Cervesato, and Soha Hussein

In Secure IT Systems, 2013

Abs PDF

It has become increasingly easy to write Web applications and other distributed programs by orchestrating invocations to remote third-party services. Increasingly, these third-party services themselves invoke other services and so on, making it difficult for the original application developer to anticipate where his/her data will end up. This may lead to privacy breaches or contractual violations. In this paper, we explore a simple distributed programming language that allows a web service provider to infer automatically where user data will travel to, and the developer to impose statically-checkable constraints on acceptable routes. For example, this may provide confidence that company data will not flow to a competitor, or that privacy-sensitive data goes through an anonymizer before being sent further out.

2012

Security-Policy Monitoring and Enforcement with JavaMOP

Soha Hussein, Patrick Meredith, and Grigore Roşlu

In Proceedings of the 7th Workshop on Programming Languages and Analysis for Security, 2012

Abs PDF

Software security attacks represent an ever growing problem. One way to make software more secure is to use Inlined Reference Monitors (IRMs), which allow security specifications to be inlined inside a target program to ensure its compliance with the desired security specifications. The IRM approach has been developed primarily by the security community. Runtime Verification (RV), on the other hand, is a software engineering approach, which is intended to formally encode system specifications within a target program such that those specifications can be later enforced during the execution of the program. Until now, the IRM and RV approaches have lived separate lives; in particular RV techniques have not been applied to the security domain, being used instead to aid program correctness and testing. This paper discusses the usage of a formalism-generic RV system, JavaMOP, as a means to specify IRMs, leveraging the careful engineering of the JavaMOP system for ensuring secure operation of software in an efficient manner.