Re: Extracting data from Lucene index files

Venkateshprasanna Tue, 19 Dec 2006 19:21:17 -0800

> Take a look at TermDocs and TermEnum.

I need to get the frequency of each word in each of the documents I have
indexed.


This is what I could do with TermEnum and TermDocs. For each Term from the
TermEnum, I instantiate a TermDocs, and for each doc I try to get the
frequency of that Term.

    IndexReader ir = IndexReader.open("index file");
    TermEnum terms = ir.terms();
    while (terms.next()) {
        TermDocs docs = ir.termDocs(terms.term());

        while (docs.next()) {
            TermFreqVector tfv = ir.getTermFreqVector(docs.doc(), "contents");
            String indexTerms[] = tfv.getTerms();
            int indexFreqs[] = tfv.getTermFrequencies();

            for (int i = 0; i < indexTerms.length; i++) {
                System.out.println(indexTerms[i] + " " + indexFreqs[i]);
            }
        }
    }

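But I cannot find a way to get the frequency of only that term in that
document; I have to fetch the entire vector. As a result, the inner loop
prints every term of the document for each term that occurs in it, instead
of one frequency per term and document. How can I overcome this?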
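
To make the question concrete, here is a minimal plain-Java sketch of the
output I am after: exactly one frequency per (term, document) pair. It uses
no Lucene classes at all. The class name, the buildPostings helper and the
two sample documents are made up purely for illustration, and the
whitespace tokenizer stands in for whatever analyzer built the real index.

```java
import java.util.Map;
import java.util.TreeMap;

public class TermDocFrequencySketch {

    // Hypothetical helper: maps each term to (document number -> number of
    // occurrences of that term in that document).
    static Map<String, Map<Integer, Integer>> buildPostings(String[] docs) {
        Map<String, Map<Integer, Integer>> postings = new TreeMap<>();
        for (int doc = 0; doc < docs.length; doc++) {
            // Naive whitespace tokenization, an assumption for this sketch only.
            for (String token : docs[doc].toLowerCase().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                postings.computeIfAbsent(token, t -> new TreeMap<>())
                        .merge(doc, 1, Integer::sum);
            }
        }
        return postings;
    }

    public static void main(String[] args) {
        String[] docs = { "lucene index files", "index the index" };
        Map<String, Map<Integer, Integer>> postings = buildPostings(docs);

        // Outer loop over terms, inner loop over the documents containing
        // the term: one line per (term, document) pair, nothing repeated.
        for (Map.Entry<String, Map<Integer, Integer>> term : postings.entrySet()) {
            for (Map.Entry<Integer, Integer> posting : term.getValue().entrySet()) {
                System.out.println(term.getKey() + " doc=" + posting.getKey()
                        + " freq=" + posting.getValue());
            }
        }
    }
}
```

The nested loop has the same shape as the TermEnum/TermDocs loop above; the
difference is that the count is read directly from the posting for the
current term and document, so no per-document vector has to be loaded.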


