Weidong Cui, Electrical Engineering and Computer Sciences, University of California at Berkeley, Technical Report No. UCB/EECS-2006-115
A growing variety of malware, including worms, spyware, and adware, threatens both personal and business computing. Modern malware has two defining features: (1) it evolves rapidly, and (2) self-propagating malware can spread very fast. Together, these features create a strong need for automatic responses to new, unknown malware. In this thesis, we aim to develop new techniques and systems that automate the detection of new, unknown malware, because detection is the first step of any response. Since no single panacea can detect all malware in every environment, we focus on one important environment, personal computers, and one important type of malware, computer worms.
Automatic malware detection faces two fundamental challenges: false alarms and scalability. We take a new approach to each. To minimize false alarms, we infer the intent of the user or the adversary (the malware author), because most benign software running on personal computers is user driven, and the authors behind different kinds of malware have distinct intents. To detect fast-spreading Internet worms early, we must monitor the Internet from a large number of vantage points, which raises a scalability problem: how to filter repeated probes. Our approach is to leverage protocol-independent replay of application dialog, a new technology that, given examples of an application session, can mimic both the initiator and responder sides of the session for a wide variety of application protocols without requiring any specifics about the particular application it mimics. We use replay to filter frequent multi-stage attacks by replaying the server-side responses.
To evaluate the effectiveness of our new approaches, we develop the following systems: (1) BINDER, a host-based detection system that can detect a wide class of malware on personal computers by identifying extrusions, that is, malicious outbound network requests that the user did not intend; and (2) GQ, a large-scale, high-fidelity honeyfarm system that can capture Internet worms by analyzing in real time the scanning probes seen on a quarter million Internet addresses, with emphasis on isolation, stringent control, and wide coverage.
Professor Randy H. Katz
Dissertation Committee Chair