August 19, 2011

eDiscovery: Is the Age of the Cyborg at Hand?

by Alan Brooks

By:  Douglas E. Forrest, Esq.

A cyborg is part human, part machine. In practice, we’re talking about an organism whose abilities are enhanced by technology, not the Terminator.

While the term is generally applied to humans with technology that is physically attached, such as prosthetic devices, I’m going to stretch a bit here and use it to cover the coordinated, iterative, interwoven human/machine process of technology-assisted review. A new study shows, persuasively if not definitively, that technology-assisted review can achieve results superior to manual review at a fraction of the cost.

The study is written up in Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review, Richmond Journal of Law and Technology, Vol. XVII, Issue 3, Article 11 (2011), and was conducted by its authors Maura R. Grossman and Gordon V. Cormack.

Grossman (counsel at Wachtell, Lipton, Rosen & Katz and a prominent eDiscovery expert) and Cormack (a Professor of Computer Science at the University of Waterloo, where he is a Co-Director of its Information Retrieval Group) were both coordinators of the 2010 Legal Track of TREC (the National Institute of Standards and Technology’s seminal Text Retrieval Conference).

The TREC Legal Track, http://trec-legal.umiacs.umd.edu/, is where the vast, more general computer science field of Information Retrieval meets the specialized requirements of eDiscovery. A key aspect of the annual Legal Track is a kind of bake-off on a fairly large-scale eDiscovery project contested by teams from law firms, vendors and universities.

The study examines the 2009 interactive task, which used the Enron e-mail dataset “to determine which document sets (where a set was defined to be an email message with its attachments) should be produced in response to a production request for which a ‘topic authority’ was available to answer specific questions posed by a participating team.” (The results for the 2010 interactive task are not yet publicly available.)

The study focused on the results from the best two of the 11 entering teams. Both used human input (e.g., relevancy determinations made on a relatively small number of documents) on an iterative basis to hone machine-learning systems, which then analyzed the vastly larger sets of remaining documents.
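For readers who want a feel for what such an iterative loop looks like, here is a minimal Python sketch. It is an illustration only and not the method of either team in the study: the term-counting "model", the function names (`train`, `score`, `assisted_review`), the toy documents and the `reviewer` stand-in for a human coder are all assumptions made for this example.

```python
from collections import Counter

def tokenize(text):
    return text.lower().split()

def train(labeled):
    """Toy model: +1 per term seen in a relevant document, -1 per term in a non-relevant one."""
    weights = Counter()
    for text, relevant in labeled:
        for term in set(tokenize(text)):
            weights[term] += 1 if relevant else -1
    return weights

def score(weights, text):
    # Counter returns 0 for terms never seen in a human-coded document.
    return sum(weights[term] for term in set(tokenize(text)))

def assisted_review(documents, reviewer, seed_ids, rounds=3, batch_size=2):
    """Alternate human coding and machine ranking; return (predictions, number human-reviewed)."""
    # The human codes a small seed set first.
    labels = {i: reviewer(documents[i]) for i in seed_ids}
    for _ in range(rounds):
        weights = train([(documents[i], lab) for i, lab in labels.items()])
        unlabeled = [i for i in range(len(documents)) if i not in labels]
        if not unlabeled:
            break
        # The machine ranks the rest; the human codes only the top-ranked batch.
        unlabeled.sort(key=lambda i: score(weights, documents[i]), reverse=True)
        for i in unlabeled[:batch_size]:
            labels[i] = reviewer(documents[i])
    # Final pass: human codes stand where they exist, the model decides the remainder.
    weights = train([(documents[i], lab) for i, lab in labels.items()])
    predicted = {
        i: labels[i] if i in labels else score(weights, documents[i]) > 0
        for i in range(len(documents))
    }
    return predicted, len(labels)

# Hypothetical toy collection; a real review would involve far more documents.
documents = [
    "trading desk energy contract pricing",
    "lunch menu cafeteria friday",
    "energy contract renewal terms",
    "cafeteria closed friday holiday",
    "pricing energy trading strategy",
    "holiday party lunch invitation",
    "contract pricing dispute energy",
    "parking garage closed friday",
]

def reviewer(text):
    # Stand-in for a human relevancy determination (an assumption for this demo).
    return "energy" in text.split()

predicted, reviewed = assisted_review(documents, reviewer, seed_ids=[0, 1], rounds=1)
print(f"Human-reviewed {reviewed} of {len(documents)} documents")
```

The point of the sketch is the division of labor: the human codes only a small seed set plus the batches the machine ranks highest, and the machine extends those judgments to everything else. Real systems use far more capable models and validate the result statistically.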

The results were dramatic. As summarized in the study’s conclusion:

“…the myth that exhaustive manual review is the most effective – and therefore, the most defensible – approach to document review is strongly refuted. Technology-assisted review can (and does) yield more accurate results than exhaustive manual review, with much lower effort.”

But, as the conclusion goes on to say:

“Of course, not all technology-assisted reviews (and not all manual reviews) are created equal. The particular processes found to be superior in this study are both interactive, employing a combination of computer and human input. While these processes require the review of orders of magnitude fewer documents than exhaustive manual review, neither entails the naïve application of technology absent human judgment.”

While the study is statistically rigorous and, accordingly, heavily caveated, the bottom-line implications are clear: in the right hands, cyborg processes – the well-guided interactive, iterative interplay of human input and machine-learning applications – can lower the cost and raise the defensibility of eDiscovery review.

Doug is Eastern Regional Director – Discovery Strategy & Management at ILS.