Erick, Solr's termfreq implementation also uses DocsEnum, on the assumption that freq is called on ascending doc IDs, which is valid when scoring from the hit list. If freq is requested for an out-of-order doc, a new DocsEnum has to be created.
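To make the reuse-or-recreate behaviour concrete, here is a minimal sketch in plain Java. It does not use Lucene: PostingsCursor is a hypothetical stand-in for a forward-only iterator such as DocsEnum, and TermFreqLookup, lastDocRequested, atDoc and cursorsCreated are names invented for this illustration. It only models the idea of keeping one iterator while doc IDs ascend and creating a fresh one when a request goes backwards.

```java
/** Hypothetical stand-in for a forward-only postings iterator (not Lucene's DocsEnum). */
final class PostingsCursor {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docIds; // doc IDs containing the term, in ascending order
    private final int[] freqs;  // term frequency for the doc at the same index
    private int pos = -1;       // -1 means "not positioned yet"

    PostingsCursor(int[] docIds, int[] freqs) {
        if (docIds.length != freqs.length) {
            throw new IllegalArgumentException("docIds and freqs must have the same length");
        }
        this.docIds = docIds;
        this.freqs = freqs;
    }

    /** Moves forward to the first doc ID >= target; can never move backwards. */
    int advance(int target) {
        do {
            pos++;
        } while (pos < docIds.length && docIds[pos] < target);
        return pos < docIds.length ? docIds[pos] : NO_MORE_DOCS;
    }

    /** Frequency for the doc the cursor is currently positioned on. */
    int freq() {
        return freqs[pos];
    }
}

/** Reuses one cursor for ascending requests; recreates it on out-of-order access. */
final class TermFreqLookup {
    private final int[] docIds;
    private final int[] freqs;
    private PostingsCursor cursor;
    private int atDoc = -1;            // doc the cursor is positioned on
    private int lastDocRequested = -1; // used to detect out-of-order requests
    private int cursorsCreated = 0;    // bookkeeping for the illustration only

    TermFreqLookup(int[] docIds, int[] freqs) {
        this.docIds = docIds;
        this.freqs = freqs;
    }

    int freq(int docId) {
        if (cursor == null || docId < lastDocRequested) {
            // Out-of-order request: a forward-only cursor cannot rewind, so start over.
            cursor = new PostingsCursor(docIds, freqs);
            atDoc = -1;
            cursorsCreated++;
        }
        lastDocRequested = docId;
        if (atDoc < docId) {
            atDoc = cursor.advance(docId);
        }
        // If the cursor landed past docId (or ran out), the term is not in that doc.
        return atDoc == docId ? cursor.freq() : 0;
    }

    int cursorsCreated() {
        return cursorsCreated;
    }
}
```

Under this model, looking up N documents in ascending doc ID order touches each posting at most once with a single cursor, so sorting the requested doc IDs before the lookups may be enough to avoid re-iterating from the start each time.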
Bianca, can you explain your use case in more detail? What did you mean by "having a new document"? Is a new document being added to the index? If so, you already have to reopen the searcher/reader anyway to get a new DocsEnum.

On Aug 19, 2014, at 08:26 AM, Erick Erickson <erickerick...@gmail.com> wrote:

Hmmm, I'm not at all an expert here, but Solr has a function query "termfreq" that does what you're doing, I think? I wonder if the code for that function query would be a good place to copy (or even make use of)? See TermFreqValueSource...

Maybe not helpful at all, but...

Erick

On Tue, Aug 19, 2014 at 7:04 AM, Bianca Pereira <aivykar...@gmail.com> wrote:
> Hi everybody,
>
> I would like to know your suggestions for calculating Term Frequency in a
> Lucene document. Currently I am using MultiFields.getTermDocsEnum,
> iterating through the returned DocsEnum 'de' and getting the frequency
> with de.freq() for the desired document.
>
> My solution gives me the result I want, but I am having time issues. For
> instance, I want to calculate the term frequency of a given term for N
> documents in a sequence. Every time I have a new document, I have to
> retrieve exactly the same DocsEnum again and iterate until I find the
> document I want. Of course I cannot cache the DocsEnum (yes, I made this
> huge mistake) because it is an iterator.
>
> Do you have any suggestions on how I can get Term Frequency quickly?
> The only suggestion I have had up to now was "Do it programmatically,
> don't use Lucene". Should this be the solution?
>
> Thank you.
>
> Regards,
> Bianca Pereira