Hello Sengly

First of all, make sure that every Field you add to a Document is created with the constructor that enables term vectors (Field.TermVector.YES):

new Field("text", "your text...", Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES);

Unless the term vectors are explicitly stored at index time, they cannot be retrieved at search time.

Once the index has been built, you can use the suggested method getTermFreqVector() on the IndexReader.

To get the top n keywords from the hits object you can iterate over the first results.
Here is an example:

           // look at the first 10 hits at most
           for (int i = 0; i < Math.min(10, hits.length()); i++) {
               int docNumber = hits.id(i);
               // returns an array of term frequency vectors for the specified document
               TermFreqVector[] termsV = ir.getTermFreqVectors(docNumber);
               if (termsV == null) continue; // no term vectors stored for this document
               for (int xy = 0; xy < termsV.length; xy++) { // loop over all term vectors in the current document
                   String[] terms = termsV[xy].getTerms();
                   int[] freqs = termsV[xy].getTermFrequencies(); // freqs[k] belongs to terms[k]
                   for (int termsInArray = 0; termsInArray < terms.length; termsInArray++) {
                       // toDo: count the occurrence of the terms
                   }
               }
           }
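
The "toDo" step above (summing the counts over all hits and picking the top n) could look roughly like the sketch below. It does not use Lucene at all: the String[] and int[] arguments stand in for what getTerms() and getTermFrequencies() return for each term vector, and the class and method names (HitTermTotals, addVector, topN) as well as the sample data are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HitTermTotals {

    // Adds one term vector to the running totals.
    // terms[i] and freqs[i] are parallel arrays, like the ones returned by
    // TermFreqVector.getTerms() and TermFreqVector.getTermFrequencies().
    static void addVector(Map<String, Integer> totals, String[] terms, int[] freqs) {
        for (int i = 0; i < terms.length; i++) {
            totals.merge(terms[i], freqs[i], Integer::sum);
        }
    }

    // Returns the n terms with the highest totals, highest first;
    // terms with equal totals are ordered alphabetically.
    static List<Map.Entry<String, Integer>> topN(Map<String, Integer> totals, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(totals.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = new HashMap<>();
        // Made-up data standing in for the term vectors of two hits.
        addVector(totals, new String[] {"index", "lucene", "term"}, new int[] {2, 5, 1});
        addVector(totals, new String[] {"lucene", "term", "vector"}, new int[] {3, 4, 2});
        for (Map.Entry<String, Integer> e : topN(totals, 2)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Inside the innermost loop of the example above you would call addVector(totals, terms, freqs) once per term vector, and call topN(totals, n) after the loop over the hits has finished. Sorting the whole map is fine for a handful of hits; for very large result sets a bounded priority queue would probably be the better choice.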

Hope this helps.
Thomas


Sengly Heng wrote:
Hello all,

I would like to extract the term freq vector from the hit results as a total
vector not by document.

I have searched the mailing list and found that many people have discussed
this issue, but I still could not find the right solution. Everyone just
suggested looking at getTermFreqVector and TermEnum.

I wonder if someone has already done this before, and what your
solution was? Would you please share it?

Also, how can I get a list of the top n keywords from the hit results? I have
looked at HighFreqTerms (in the contribution repositories as well as the
one implemented by Luke), but that class is meant for getting the top n
keywords from a whole index and not from the hit results.

Thank you.

Best regards,

Sengly


