TREC 2007 Legal Track

Interactive Challenge Task

March 18, 2007
Send comments to the mailing list or to

The principal goal of the TREC legal track is to support experimental investigation of alternative designs for systems to support "E-Discovery", the process by which evidence stored in digital form is made available for use in litigation. Additional details on the track are available at

In 2006, six research teams submitted "automatic runs": experiment results created without human intervention. This involved automatically indexing the collection, automatically generating queries from the "topic descriptions" provided with the collection, and automatically generating result sets (usually ranked in an order approximating decreasing probability of relevance, as estimated by the system). This process yields repeatable comparisons between alternative system designs, but three factors limit the degree to which the experiment results are representative of real applications.

First, automatic query generation is at best an imperfect approximation of what a real person would do. Indeed, approximating human behavior at this task is so difficult that it is common practice for researchers interested principally in system design to simply take all of the words from one or more fields of the topic description as if those words had been typed by the user as the query. Such an approach can yield useful comparisons of system capabilities, although it carries some risk of leaving unmodeled the compensating behavior of real users that might tend to minimize those differences in practice.

Second, fully automatic experiments do not attempt to model query refinement behavior, which both simulation studies and actual user studies have repeatedly identified as an important factor in the effective use of information retrieval systems.

Third, and less often remarked upon but potentially of greater importance early in the development of new technology, the form and content of the topic descriptions reflect a set of assumptions about system capabilities that may constrain the design space that can be explored in this way. In the TREC-2006 legal track, for example, the topic descriptions contained only natural language terms. That decision, taken early in the design of the track, naturally made it easier for teams to automate the generation of queries containing natural language terms than queries containing the metadata terms that are also present in the document collection.

For 2007, we are proposing an "interactive challenge task" in an effort to begin to explore these issues. We have patterned this task on a pilot effort conducted in 2006, in which a single professional searcher sought to identify, for each topic, relevant documents that automated systems would be unlikely to find. This effort was successful, identifying an average of 35 relevant documents per topic (over 39 topics) that were not highly ranked by any automatic run. Beyond the (unsurprising) confirmation that people and machines together can achieve more than machines alone, identifying these additional relevant documents can help system designers focus some of their efforts on a set of documents that has proven particularly challenging for present search technologies.

Our design for the TREC 2007 Legal Track Interactive Challenge Task differs from the 2006 pilot in three important ways: (1) searchers will focus on overall recall (as in a real e-discovery task) rather than only on documents that they expect automated systems would be unlikely to find, (2) we will focus on a small number of topics so that we can compare the results of different searchers who apply different search strategies to the same task, and (3) we'll add an element of competition so that we can have some fun with this!

Here's how things will work (all dates 2007):