Hi,

I am trying to retrieve the Terms for a given set of documents (an int array
or BitSet) that is the result of a query.

// Index creation

// Query with an IndexSearcher

IndexSearcher searcher = new IndexSearcher(ir);
TopDocs docs = searcher.search(query, 100);

From "docs", an array of int document IDs can be extracted, representing the
set of documents that match the query. I need a way to get term frequencies
for this set only.

I have identified one solution: store term vectors during index creation,
using a FieldType with "type.setStoreTermVectors(true);", and then request
the stored term vector for each document of the set (with Terms acting like
a "single-document inverted index"):

// Terms of a document
Terms docterms = ir.getTermVector(docs.scoreDocs[0].doc, "contents");

However, I would then have to merge the Terms of every document in the set,
which is not a pleasant solution. Is there a way, using only the Lucene API,
to request the Terms for a given subset of the documents present in the
index?
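
For reference, the merge step I would like to avoid could look roughly like
the sketch below. It is plain Java with no Lucene calls: it assumes each
document's term vector has already been copied into a Map of term to
in-document frequency (for example by iterating the Terms returned by
getTermVector). The class name TermVectorMerge and the method merge are
hypothetical names chosen for illustration only.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: sums per-document term
// frequencies into a single map covering the whole result set.
class TermVectorMerge {

    // Each input map is assumed to hold term -> frequency within ONE document,
    // e.g. filled from the term vector of one hit in docs.scoreDocs.
    static Map<String, Long> merge(List<Map<String, Long>> perDocFreqs) {
        Map<String, Long> total = new TreeMap<>(); // sorted by term
        for (Map<String, Long> docFreqs : perDocFreqs) {
            for (Map.Entry<String, Long> e : docFreqs.entrySet()) {
                // Add this document's count to the running total for the term.
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up frequencies standing in for two documents' term vectors.
        Map<String, Long> doc0 = Map.of("lucene", 2L, "index", 1L);
        Map<String, Long> doc1 = Map.of("lucene", 1L, "query", 3L);
        System.out.println(merge(List.of(doc0, doc1)));
        // prints {index=1, lucene=3, query=3}
    }
}
```

This costs one term vector lookup per hit plus the merge itself, which is
why I am hoping the API offers something more direct.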

My question is similar to these:
- http://stackoverflow.com/questions/2924089/how-to-count-term-frequency-for-set-of-documents
- http://stackoverflow.com/questions/17789969/with-lucene-4-3-1-how-to-get-all-terms-which-occur-in-sub-range-of-all-docs






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Terms-of-a-given-set-of-documents-subset-of-the-full-index-tp4133702.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
