Insider Threat Detection

From Large Scale Data Analysis Lab

Who are Insiders?

Insiders are users who are given permission to access computer systems, within a domain of normal behavior, so that they can accomplish their work efficiently. Malicious insiders abuse that access to perform harmful activities, including stealing, damaging, and modifying secret information.

Research on insider threats falls into several categories: the philosophy of malicious insiders, the behavior patterns and motivations of malicious insiders, and the detection of malicious insiders. Randazzo et al. reported six findings from their analysis of 80 cases.

  • The crimes are carried out "low and slow." On average, almost 32 months elapse before the victim organization detects them.
  • The insiders' methods were not very sophisticated. They seldom held technical roles, and their fraudulent behavior was not explicitly technical.
  • Fraud committed by managers differs from fraud committed by non-managers in damage and duration.
  • Most cases do not involve collusion.
  • An audit, a customer complaint, or a coworker's suspicion are the three most common ways insider incidents are detected.
  • The target of the fraud is personally identifiable information (PII).

Damage by Malicious Insiders

Attacks from insiders are hard to detect. According to a talk at RSA Conference 2013 based on a survey covering a 10-year period, "The average cost per incident is $412,000, and the average loss per industry is $15 million over ten years. In several instances, damages reached more than $1 billion."

The cost of an insider attack can be very large, yet the attack is hard to detect. Conventional security systems are designed to detect attacks from outside. There are few cases available for study. Compared to intruders, insiders have more sophisticated knowledge of the computer systems and data structures. Gunasekhar, Rao, and Basu study the definitions of different types of insiders, identify the difficulties in detection, and propose ways of prevention. In their study, the types of insiders are (1) pure insiders, (2) insider associates, (3) insider affiliates, and (4) outside affiliates. Pure insiders are the employees. Insider associates are other personnel who have physical access to the computers, such as security guards and janitors. Insider affiliates are the friends or spouses of employees, who have chances to access the computers when visiting. Outside affiliates can get into the network via an unprotected wireless network.


We begin our research on insider threats with the datasets generated by the CERT Division, in partnership with ExactData LLC and under sponsorship from DARPA I2O, for experimentation on insider threat investigations. The CERT dataset was synthesized by Glasser and Lindauer \cite{certpaper} by combining a relationship graph model, an asset graph model, a behavior model, a communications model, a topic model, a psychometric model, a decoy model, and threat scenarios. The relationship graph model describes the relational network in the organization. The asset graph model shows the associations between non-human assets and individuals in the organization. The behavior model shows the connection between employees and their assets; the behavior includes their relationships, interests, affinities, and psychological profiles. The communications model is a probabilistic expression of the current state of the organization. The topic model captures the topical interests of each employee; these interests are reflected in the data the employee consumes and produces. The psychometric model gives a set of static personality characteristics for each employee. The decoy model is a set of decoy files. The threat scenarios were developed in consultation with counter-intelligence experts. There are 5 different insider attack scenarios in the CERT dataset. An example of the scenarios in the CERT dataset is shown below:

 User who did not previously use removable drives or work after hours begins logging in after hours, 
 using a removable drive, and uploading data to Leaves the organization shortly thereafter.

The CERT insider threat dataset comes in 10 versions. In each version, millions of events of different types are partitioned into separate files: one for HTTP, one for logons, and so on. An insider threat incident is a combination of user actions that are spread out across multiple files.
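
Because a single incident spans several per-type files, a natural first step is to merge the files into one time-ordered stream. The following is a minimal sketch of that idea, not the dataset's own tooling: the Event record, its field names, and the sample rows are assumptions made for illustration, and the real CERT files have their own column layouts.

```python
import heapq
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical normalized event; the field names are assumptions."""
    time: datetime
    user: str
    source: str   # which per-type file the event came from, e.g. "logon"
    detail: str


def merge_streams(*streams: Iterable[Event]) -> Iterator[Event]:
    """Merge per-type event streams into one time-ordered stream.

    Each input stream must already be sorted by time.
    """
    return heapq.merge(*streams, key=lambda e: e.time)


# Made-up sample rows standing in for the per-type files.
logons = [
    Event(datetime(2010, 1, 4, 8, 0), "u1", "logon", "Logon"),
    Event(datetime(2010, 1, 4, 17, 0), "u1", "logon", "Logoff"),
]
http = [
    Event(datetime(2010, 1, 4, 9, 30), "u1", "http", "example.com"),
]
devices = [
    Event(datetime(2010, 1, 4, 9, 45), "u1", "device", "Connect"),
]

timeline = list(merge_streams(logons, http, devices))
for e in timeline:
    print(e.time.isoformat(), e.user, e.source, e.detail)
```

heapq.merge reads its inputs lazily, so the same pattern could be applied to generators that read large files row by row, provided each file is sorted by timestamp.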

Approaches to Anomalous Insider Detection

To detect malicious insiders, we take the following approaches:

  • Insider Threat Detection using kNN on Session Feature Vectors: We first look into the behavior patterns of malicious insiders. We consolidate the log streams into one and partition it into "sessions". We then extract a feature vector from each session and apply an unsupervised learning algorithm to detect malicious sessions.
  • Graph Analysis on Email Dataset: We also analyze the email transactions between employees and build a community graph from them. Abnormal transactions or outliers can then be discovered from the community graph.
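
The session-based approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the lab's implementation: the (day, hour, activity) event format, the logon-to-logoff session boundary, the four features, the 8-18 working day, and the use of the mean distance to the k nearest sessions as the outlier score are all choices made here for the example.

```python
import math
from typing import List, Sequence, Tuple

Event = Tuple[int, int, str]  # (day, hour, activity): hypothetical format

ACTIVITIES = ("http", "email", "device")  # assumed activity types


def split_sessions(events: Sequence[Event]) -> List[List[Event]]:
    """Split one user's time-ordered events into sessions.

    Assumption: a session runs from a "logon" through the next "logoff".
    """
    sessions: List[List[Event]] = []
    current: List[Event] = []
    for ev in events:
        current.append(ev)
        if ev[2] == "logoff":
            sessions.append(current)
            current = []
    if current:  # trailing session that never logged off
        sessions.append(current)
    return sessions


def features(session: Sequence[Event]) -> List[float]:
    """Count each activity type and flag sessions that start after hours."""
    counts = [float(sum(1 for e in session if e[2] == a)) for a in ACTIVITIES]
    start_hour = session[0][1]
    after_hours = 0.0 if 8 <= start_hour < 18 else 1.0
    return counts + [after_hours]


def knn_scores(vectors: Sequence[Sequence[float]], k: int) -> List[float]:
    """Outlier score: mean Euclidean distance to the k nearest other vectors."""
    scores = []
    for i, v in enumerate(vectors):
        dists = sorted(math.dist(v, w) for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def day(d: int, hour: int, acts: Sequence[str]) -> List[Event]:
    """Build one made-up session for the demonstration data."""
    return [(d, hour, "logon")] + [(d, hour, a) for a in acts] + [(d, hour, "logoff")]


# Four ordinary daytime sessions, then one late-night session with heavy
# removable-device use (all data invented for the example).
events: List[Event] = []
for d in range(4):
    events += day(d, 9, ["http", "http", "email"])
events += day(4, 23, ["device", "device", "device", "http"])

sessions = split_sessions(events)
vectors = [features(s) for s in sessions]
scores = knn_scores(vectors, k=2)
suspect = max(range(len(scores)), key=scores.__getitem__)
print("most anomalous session:", suspect, "score:", round(scores[suspect], 2))
```

Sessions that resemble many others score near zero, while a session far from all of its neighbors stands out. In practice the features would likely need scaling before distances are compared, and the choice of k is a tuning decision.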

External links