Proper data filtering can dramatically reduce the time and costs associated with e-discovery. Removing duplicate documents and emails makes the data set much more manageable for review. De-duplication has been shown to reduce the amount of data that must be reviewed by as much as 90%. After de-duplication, culling techniques can be applied to the metadata, typically by date, people, organizations, and keywords.
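To illustrate the general idea, the sketch below removes exact duplicates by hashing file content and then culls the remainder by a date range. It is a minimal example under stated assumptions, not a description of how CaseDriven works internally: the document records, field names (`name`, `content`, `date`), and the helper functions `deduplicate` and `cull_by_date` are all hypothetical.

```python
import hashlib
from datetime import date


def content_hash(data: bytes) -> str:
    """Return a SHA-256 digest that identifies byte-identical content."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(documents):
    """Keep the first document seen for each distinct content hash."""
    seen = set()
    unique = []
    for doc in documents:
        digest = content_hash(doc["content"])
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def cull_by_date(documents, start, end):
    """Keep only documents whose date metadata falls within [start, end]."""
    return [doc for doc in documents if start <= doc["date"] <= end]


# Hypothetical sample records; real collections would carry richer metadata.
docs = [
    {"name": "memo.txt", "content": b"Quarterly results attached.", "date": date(2020, 3, 1)},
    {"name": "memo_copy.txt", "content": b"Quarterly results attached.", "date": date(2020, 3, 2)},
    {"name": "old_note.txt", "content": b"Lunch on Friday?", "date": date(2015, 6, 9)},
]

unique_docs = deduplicate(docs)
in_range = cull_by_date(unique_docs, date(2019, 1, 1), date(2021, 1, 1))
```

Hashing catches only byte-identical copies, which is why it is usually paired with metadata culling afterward to narrow the set further.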
CaseDriven can also perform near de-duplication, which detects files that have the same content but are stored in different formats. While exact de-duplication can remove a dramatic number of duplicates, near de-duplication can also help uncover additional repositories of potentially discoverable electronic files.
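One common way to approximate near de-duplication is to compare the text extracted from each file using word shingles and Jaccard similarity. The sketch below assumes the text has already been extracted from each format; the function names and the 0.8 threshold are illustrative choices, not CaseDriven's actual method.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of the given size."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < size:
        # Short texts produce a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: shared shingles divided by all shingles."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.8):
    """Treat two texts as near-duplicates when their similarity meets the threshold."""
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold
```

Because the comparison ignores case and punctuation, the same wording extracted from, say, a word-processing file and a PDF would score as a match even though the files themselves hash differently.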
Search becomes critical within a large data set. CaseDriven can search the data in a variety of ways, including:
- Keyword
- Stemming and Fuzzy
- Proximity
- Boolean
- Email Thread Identification
- Conceptual and Clustering
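As a rough illustration of two of the methods listed above, the sketch below shows a simple fuzzy search (tolerating misspellings) and a proximity search (two terms within a set number of words of each other). It uses only Python's standard library; the function names, the similarity cutoff, and the distance limit are assumptions made for the example and do not reflect CaseDriven's search engine.

```python
import re
from difflib import get_close_matches


def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fuzzy_match(term, text, cutoff=0.8):
    """Return True if any word in the text closely resembles the term."""
    return bool(get_close_matches(term.lower(), tokenize(text), n=1, cutoff=cutoff))


def proximity_match(term_a, term_b, text, max_distance=5):
    """Return True if both terms occur within max_distance words of each other."""
    tokens = tokenize(text)
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    return any(abs(i - j) <= max_distance for i in positions_a for j in positions_b)
```

For example, `fuzzy_match("settlement", "The setlement offer was rejected.")` would still find the misspelled word, and `proximity_match("offer", "rejected", ...)` on the same sentence succeeds with `max_distance=2` because the two terms are two words apart.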