Documentation for Terrier v5.x

Welcome to the documentation for the Terrier IR platform v5.x. If you are a new user, we recommend that you begin with one of the quickstart guides listed below. The quickstart guides introduce the core concepts of using Terrier in different scenarios. If you are looking for information about a particular function or component of Terrier, scroll down this page to the main Table of Contents.

Quickstart Guides

Running Batch IR Experiments with Terrier

This quickstart guide is designed for information retrieval students and researchers looking to use Terrier to experiment with or learn about some aspect of a search engine. The main learning outcomes are: how to download and install a local copy of the Terrier platform; how to produce an on-disk index from a collection of documents; and how to issue single queries as well as batches of queries over that index from the command line.

Integrating Terrier as a Search Engine into your Java Application

This quickstart guide is for software developers who want to use Terrier as a search engine within their own application. The guide covers how to import Terrier as an application dependency using Maven, how to create an index, how to index files within your Java program, and how to issue queries to the index. A variant of the quickstart shows the same steps using exclusively in-memory data structures.

Table of Contents

Platform Information

Overview: An overview of what the Terrier platform is, and what it can be used for.
What’s New: What has changed in the Terrier platform in the recent releases.
Terrier Architecture: An overview of the main architectural components of Terrier.
Terrier Add-on Components: An overview of the add-on components available for Terrier.
Query Language: A description of the query language that Terrier supports.
Future Features & Known Issues: Upcoming features in future releases.

Quickstart Guides

Running Batch IR Experiments with Terrier: A quickstart guide designed for information retrieval students and researchers.

Common Configuration Options

Configuring Terrier: A brief introduction to the configuration of Terrier.
Configuring Indexing: A guide to indexing, and how it can be configured to your needs.
Configuring Retrieval: A guide to the retrieval functionality, covering frequently used retrieval methodologies such as TF-IDF, Okapi’s BM25, language models (Hiemstra and Ponte & Croft) and weighting models from the probabilistic Divergence From Randomness (DFR) framework, as well as query expansion (pseudo-relevance feedback).
Configuring Retrieval using Controls: A summary of the controls that can be used to configure retrieval.
Configuring Real-time Index Structures: An introduction to the real-time index structures in Terrier.

Advanced Functionality

Learning to Rank with Terrier: A guide to using multiple retrieval features with learning to rank techniques to enhance search effectiveness.
Advanced Learning to Rank using Tagged Query Terms: A discussion of how query terms can be tagged to allow different retrieval features.
Pluggable Compression: A guide to configuring byte-level compression schemes to reduce the size of Terrier’s index structures.
Non-English Language Support: A description of Terrier's support for indexing and retrieving documents written in languages other than English.

Search Applications

Web-based Terrier: A guide to using the Web-based application of Terrier.
Desktop Search: A summary of the Desktop Search application of Terrier, available from GitHub.

Experiment Support

TREC Experiment Examples: An example of how to create an index and produce a TREC run on the WT2G and Blogs06 collections.
Evaluation of Experiments: Shows how the results of experiments can be evaluated using the in-built evaluation package in Terrier.

Extending Terrier

Developing with Terrier: An introduction to developing applications using Terrier.
Extending Indexing: An in-depth guide to extending indexing.
Indexer Details: More information about the roles of various classes in the indexing process.
Extending Retrieval: An in-depth guide to retrieval, covering how various retrieval functionalities can be integrated into Terrier and how you can use Terrier to obtain statistics about the terms and the collection.

Other Resources

Terrier API Javadoc: API documentation of each class in Terrier.
Description of DFR: Description of the Divergence From Randomness framework that Terrier implements.
Terrier Forum: The Terrier discussion forum is for developers and users of the Terrier platform to discuss the software, ask questions, post patches and share tips.
Terrier Wiki: Hints and tips, and configurations for various well-known corpora.
Bibliography: If you use Terrier in your research, please cite us!
Contacts: Terrier contacts.

Webpage: http://terrier.org
Contact: School of Computing Science
Copyright (C) 2004-2020 University of Glasgow. All Rights Reserved.