Rem’s World: Sherlock Holmes and eDiscovery?

Remu Ogaki, Esq., Senior Project Manager, The CJK Group

Litigation has some surprising similarities to a Sherlock Holmes novel. It’s not enough for Holmes to point a finger at Mr. X and say, “You’re the murderer.” Holmes must build an argument, step by step, laying out evidence to support each and every point he wants to make.

Mr. X was in the room, because of A, B, C.

Mr. X had the murder weapon because of D, E, F.

Mr. X had the motive to kill because of G, H, I.

Issue tags in eDiscovery help a litigator play the Holmes role in a complex case.

Issue tags are electronic markers applied to a document to indicate, for later use, which legal issues the document relates to. This is useful because, with the click of a button, an attorney can pull up, for example, all the financial documents in the data set.

However, the information the litigator wants can be in tension with another important factor: review speed and, by extension, review cost.

Pace refers to the rate, in documents per hour, at which an eDiscovery contract worker reviews documents.

Since it is standard industry practice to bill eDiscovery review by the hour, for any given number of documents, the faster the reviewers move through them, the lower the discovery costs in litigation. I’m putting aside, for now, the other model of doc review, which prices review “per doc.” The per doc model will be explored in another installment of “Rem’s World.”

The Tradeoff

There is a tradeoff between “information” and “pace.” The greater the number of issue tags and factual matters the reviewers are instructed to seek out and categorize, the slower the review will proceed.

I’ve personally seen instructions that ask reviewers to categorize documents into as many as 22 different types of issues. In practical terms, simply scrolling through 22 different issue items looking for the one you need for a particular document can be time-consuming.

However, the biggest problem is the sheer number of criteria that a reviewer must both memorize and attempt to apply to each document. When there are 22 different possible ways a document can receive an issue tag, the complexity of analysis that must be applied to each and every document can be staggering.

Ideally, a reviewer is able to go through 40 or more documents an hour. This means a reviewer can spend roughly 90 seconds reading through an email, making a determination on how it should be categorized, and then applying that categorization through a series of clicks.
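The conversion from pace to time per document is simple arithmetic, and a minimal Python sketch makes it concrete. The function names and the 10,000-document batch are illustrative assumptions, not figures from any actual review.

```python
def seconds_per_document(docs_per_hour: float) -> float:
    """Average time available per document at a given review pace."""
    return 3600 / docs_per_hour


def review_hours(total_docs: int, docs_per_hour: float) -> float:
    """Reviewer-hours needed to get through a document set at that pace."""
    return total_docs / docs_per_hour


print(seconds_per_document(40))   # 90.0 seconds per document
print(review_hours(10_000, 40))   # 250.0 hours (hypothetical 10,000-document batch)
```

Because review is billed by the hour, the second function is effectively the cost driver: anything that lowers the pace raises the hours billed for the same set of documents.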

Consider how practical it is to expect someone to repeatedly and rapidly apply a 22-prong test to a document in 90 seconds over and over for hours.

Furthermore, there is an unfortunate but common misunderstanding that the “number of clicks” a reviewer must make to finish coding a document determines the amount of time it takes to review that document. For example, if a reviewer must apply 6-7 issue tags to a responsive document before moving to the next one, this can indeed add 5-10 seconds to the review process.

This confuses correlation with causation. A more complex analysis requires more clicks, so a review that requires numerous clicks is likely to be complex, and therefore slower.

However, on a slower, complex review where the average rate of review is around 30 documents per hour, the average time per document is around 120 seconds. While it is true that numerous clicks per document can add up over time, their impact is relatively small. Most of the increased time per document (from 90 seconds to 120 seconds or more) must be coming from somewhere else.
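To put rough numbers on that, here is a small Python sketch using only the figures quoted above (40 and 30 documents per hour, and 5-10 seconds of click overhead). The function name is a hypothetical one chosen for illustration.

```python
def click_share_of_slowdown(fast_pace: float, slow_pace: float,
                            click_seconds: float) -> float:
    """Fraction of the extra time per document that clicking could explain."""
    slowdown = 3600 / slow_pace - 3600 / fast_pace  # extra seconds per document
    return click_seconds / slowdown


slowdown = 3600 / 30 - 3600 / 40
print(slowdown)  # 30.0 extra seconds per document

for clicks in (5, 10):
    share = click_share_of_slowdown(40, 30, clicks)
    print(f"{clicks}s of clicking = {share:.0%} of the slowdown, "
          f"{slowdown - clicks:.0f}s unexplained")
```

On these figures, clicking accounts for somewhere between a sixth and a third of the 30 extra seconds, which leaves 20 to 25 seconds per document that the clicks cannot explain.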

That “somewhere else” is the increased complexity of analysis required. The greater the number of different categories of information asked for, the slower the review will go.

You may ask: for all practical purposes, aren’t these the same thing? They are not, and the difference matters in practice.

One unfortunate way in which I have seen this misunderstanding manifest in a real-life case is where the law firm believed it could circumvent any slowdown in review by minimizing the number of issue tags, while leaving the full complexity of the requested information intact.

The firm thought that by reducing clicks, they could still ask for the same amount of information and have a rapid pace, getting the best of both worlds.

The instructions reduced the number of issue tags to 5 headings, that is, 5 types of tags.

However, each tag had 4-6 “subheadings,” any of which could require the tag.

For example, one tag regarding “important communications” might be required if any of the following is true:

Alex and Beth talk regarding “X” between 2008 and 2011

Charlie and Dina talk regarding “X” between 2009 and 2014

Elizabeth has any contact with Widget Company at any time

Finn speaks to anyone regarding “Y” before 2015

The problem is that there are still 4 separate issues that must be searched for, each with its own people and timeline, and several of which have nothing to do with each other.
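One way to see why the single tag does not simplify anything is to write the four criteria out as explicit rules. The Python sketch below is purely illustrative: the `Doc` fields, the rule names, the inclusive treatment of the date ranges, and the modeling of Widget Company as a participant are all assumptions made for this example, and no review platform is being described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical summary of what a reviewer has read in one document."""
    participants: frozenset
    topics: frozenset
    year: int


# Each rule mirrors one criterion hidden under the single generic tag.
RULES = {
    "Alex/Beth on X, 2008-2011": lambda d: (
        {"Alex", "Beth"} <= d.participants
        and "X" in d.topics
        and 2008 <= d.year <= 2011
    ),
    "Charlie/Dina on X, 2009-2014": lambda d: (
        {"Charlie", "Dina"} <= d.participants
        and "X" in d.topics
        and 2009 <= d.year <= 2014
    ),
    "Elizabeth and Widget Company, any time": lambda d: (
        {"Elizabeth", "Widget Company"} <= d.participants
    ),
    "Finn on Y, before 2015": lambda d: (
        "Finn" in d.participants and "Y" in d.topics and d.year < 2015
    ),
}


def matching_criteria(doc: Doc) -> list:
    """Every criterion must be checked, even though only one tag exists."""
    return [name for name, rule in RULES.items() if rule(doc)]


def needs_important_communications_tag(doc: Doc) -> bool:
    return bool(matching_criteria(doc))


email = Doc(frozenset({"Charlie", "Dina"}), frozenset({"X"}), 2012)
print(matching_criteria(email))  # ['Charlie/Dina on X, 2009-2014']
```

The tag count drops to one, but the number of rules evaluated per document is unchanged. The reviewer is doing this same evaluation from memory.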

Furthermore, by collapsing the 4 criteria into a single generic “important communications” tag, the reviewers are deprived of the reminders that help them keep in mind the key information they are looking for.

For example, the tags could instead be labeled:

Communications: Alex/Beth 2008-2011

Communications: Charlie/Dina 2009-2014

Communications: Elizabeth Widgets

Communications: Finn pre-2015

Each tag would provide a visual cue that reminds the reviewer of what is important as they look through thousands of electronic files.

A tag that collapses everything into a generic name forces each reviewer to memorize voluminous information about what is being searched for. If that information appears only rarely in the dataset, it is easily forgotten, and relevant documents are missed.

What this underscores is that what truly matters is the “amount of information” requested from the reviewers. Requesting highly detailed and voluminous information from reviewers, condensed into a small handful of tags, will not achieve the same pace of review as a simple review. Such attempts may even be counterproductive.

So What Does This All Mean?

The reality of litigation is that at times, you really need a huge number of categories of information from your eDiscovery vendor. Complexity can be unavoidable.

However, one should also be aware that there is a very close relationship between the amount of “information” requested from the eDiscovery vendor and the pace, and therefore the cost, of the review. Every effort should be made to keep the issue tagging requested of the reviewers to an absolute minimum, and as simple as possible. What I described above is exponentially more challenging when your electronic data is in another language. The task of the Sherlock Holmes-type eDiscovery Project Manager is no easy feat, particularly when he/she is sleuthing for clues amid a sea of non-English gibberish!