Lucene retrieval model

Claudia Santos Tue, 30 Dec 2008 01:03:42 -0800

Hello,

I would like to know more about Lucene's retrieval model, specifically
the Boolean model.
Is it the standard model or an extended model? That is, does it return only
the documents that match the Boolean expression, or does it include in the
search results all documents that contain at least one of the terms,
regardless of the Boolean connectors (AND, OR, NOT), and calculate a weight
between 0 and 1 for each of them?
The extended model gives a document that contains only one of the terms a
smaller value than a document that contains both.
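
To make the distinction concrete, here is a minimal sketch in Python. It is
only an illustration of the two behaviours I am asking about, not Lucene code:
the function names and the toy documents are hypothetical, and the graded
weight is simply the fraction of query terms a document contains, which is an
assumption and not the formula of any particular extended Boolean model.

```python
def strict_boolean_and(doc_terms, query_terms):
    """Standard Boolean AND: a document either matches or it does not."""
    return all(t in doc_terms for t in query_terms)


def graded_match(doc_terms, query_terms):
    """Illustrative graded weight in [0, 1]: fraction of query terms present."""
    hits = sum(1 for t in query_terms if t in doc_terms)
    return hits / len(query_terms)


# Toy collection (hypothetical data): document name -> set of terms.
docs = {
    "d1": {"lucene", "scoring"},
    "d2": {"lucene"},
    "d3": {"solr"},
}
query = ["lucene", "scoring"]

# Standard model: only documents satisfying the whole expression are returned.
strict_hits = [name for name, terms in docs.items()
               if strict_boolean_and(terms, query)]

# Graded model: every document gets a weight; partial matches score lower.
graded = {name: graded_match(terms, query) for name, terms in docs.items()}
```

Under the strict model only d1 is returned. Under the graded model d2, which
contains one of the two terms, still gets a weight, but a smaller one than d1.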


On the Apache Lucene Scoring page I did not find much about this:
"Lucene scoring uses a combination of the Vector Space Model (VSM) of
Information Retrieval and the Boolean model to determine how relevant a
given Document is to a User's query. In general, the idea behind the VSM is
the more times a query term appears in a document relative to the number of
times the term appears in all the documents in the collection, the more
relevant that document is to the query. It uses the Boolean model to first
narrow down the documents that need to be scored based on the use of boolean
logic in the Query specification. Lucene also adds some capabilities and
refinements onto this model to support boolean and fuzzy searching, but it
essentially remains a VSM based system at the heart."
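
As far as I understand the passage, it describes two steps: a Boolean filter
that selects candidate documents, followed by a VSM-style score for the
candidates only. A minimal sketch of that reading in Python follows. The names
boolean_filter and tf_idf_score and the toy documents are hypothetical, and the
score is a plain sum of term frequency times inverse document frequency, which
is an assumption and not Lucene's actual scoring formula.

```python
import math
from collections import Counter

# Toy collection (hypothetical data): document name -> list of tokens.
docs = {
    "d1": "lucene scoring uses the vector space model".split(),
    "d2": "lucene lucene index".split(),
    "d3": "boolean model of retrieval".split(),
}


def boolean_filter(docs, must=(), must_not=()):
    """Step 1: keep documents containing every `must` term and no `must_not` term."""
    return [
        name
        for name, terms in docs.items()
        if all(t in terms for t in must) and not any(t in terms for t in must_not)
    ]


def tf_idf_score(name, query_terms, docs):
    """Step 2: sum of tf * idf over the query terms (a simplified VSM-style score)."""
    counts = Counter(docs[name])
    n_docs = len(docs)
    score = 0.0
    for term in query_terms:
        doc_freq = sum(1 for terms in docs.values() if term in terms)
        if doc_freq == 0:
            continue  # a term absent from the collection contributes nothing
        idf = math.log(n_docs / doc_freq) + 1.0
        score += counts[term] * idf
    return score


# Only documents that pass the Boolean filter are scored and ranked.
candidates = boolean_filter(docs, must=["lucene"])
ranked = sorted(
    candidates,
    key=lambda name: tf_idf_score(name, ["lucene", "scoring"], docs),
    reverse=True,
)
```

Here d3 is never scored because it fails the Boolean step, while d1 ranks above
d2 because it also contains the rarer term "scoring".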

Thanks in advance for any responses
