Hello, I would like to know more about Lucene's retrieval model, specifically its Boolean model. Is it the standard Boolean model or the extended one? In other words, does it return only the documents that satisfy the Boolean expression, or does it include every document that contains at least one of the query terms, regardless of the Boolean connectors (AND, OR, NOT), and compute a weight between 0 and 1 for each result? In the extended model, a document that contains only one of the terms gets a lower score than a document that contains both.
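To make the distinction concrete, here is a minimal sketch of what I mean by the two models. It is not Lucene code. For the extended model I am assuming the p-norm formulation (Salton, Fox and Wu) with p = 2 and binary term weights; the function names and the two sample documents are made up for illustration.

```python
# Standard Boolean model: a document either matches (1.0) or does not (0.0).
def standard_boolean_and(doc_terms, query_terms):
    return 1.0 if all(t in doc_terms for t in query_terms) else 0.0


def standard_boolean_or(doc_terms, query_terms):
    return 1.0 if any(t in doc_terms for t in query_terms) else 0.0


# Extended Boolean model (assumed here: p-norm), giving a graded score in [0, 1].
def extended_boolean_or(weights, p=2.0):
    # ((w1^p + ... + wn^p) / n)^(1/p)
    return (sum(w ** p for w in weights) / len(weights)) ** (1.0 / p)


def extended_boolean_and(weights, p=2.0):
    # 1 - (((1 - w1)^p + ... + (1 - wn)^p) / n)^(1/p)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / len(weights)) ** (1.0 / p)


def term_weights(doc_terms, query_terms):
    # Binary weights keep the example small; a real system would more likely
    # use normalized tf-idf weights between 0 and 1.
    return [1.0 if t in doc_terms else 0.0 for t in query_terms]


query = ["lucene", "scoring"]
doc_both = {"lucene", "scoring", "model"}  # contains both query terms
doc_one = {"lucene", "index"}              # contains only one query term

# Standard AND rejects doc_one outright.
print(standard_boolean_and(doc_one, query))                 # 0.0
# Extended AND still gives doc_one a small score (about 0.29).
print(extended_boolean_and(term_weights(doc_one, query)))
# Extended OR ranks doc_one (about 0.71) below doc_both (1.0).
print(extended_boolean_or(term_weights(doc_one, query)))
print(extended_boolean_or(term_weights(doc_both, query)))
```

So under the standard model `doc_one` is simply excluded from an AND query, while under the extended model it is returned with a lower weight than `doc_both`. My question is which of these two behaviours Lucene follows.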
On the Apache Lucene "Scoring" page I did not find much about this: "Lucene scoring uses a combination of the Vector Space Model (VSM) of Information Retrieval and the Boolean model to determine how relevant a given Document is to a User's query. In general, the idea behind the VSM is the more times a query term appears in a document relative to the number of times the term appears in all the documents in the collection, the more relevant that document is to the query. It uses the Boolean model to first narrow down the documents that need to be scored based on the use of boolean logic in the Query specification. Lucene also adds some capabilities and refinements onto this model to support boolean and fuzzy searching, but it essentially remains a VSM based system at the heart."
Thanks in advance for any responses.