Re: Calculate Term Frequency

Tri Cao Tue, 19 Aug 2014 09:58:32 -0700

Erick, Solr's termfreq implementation also uses DocsEnum, with the
assumption that freq is called on ascending doc IDs, which is valid when
scoring from the hit list. If freq is requested for an out-of-order doc,
a new DocsEnum has to be created.
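
To make the ascending-order assumption concrete, here is a minimal sketch
of the pattern. It does not use Lucene itself: ForwardPostings is a
hypothetical stand-in for a forward-only iterator such as DocsEnum, backed
by plain arrays, and TermFreqLookup is a hypothetical wrapper that reuses
the iterator while requests ascend and creates a fresh one only when a
request goes backwards. The real Solr code may differ in its details.

```java
/** Hypothetical stand-in for a forward-only postings iterator (not a Lucene class). */
final class ForwardPostings {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs;   // ascending doc IDs that contain the term
    private final int[] freqs;  // term frequency for each entry in docs
    private int pos = -1;       // -1 means "not positioned yet"

    ForwardPostings(int[] docs, int[] freqs) {
        this.docs = docs;
        this.freqs = freqs;
    }

    /** Current doc ID, -1 before the first advance, NO_MORE_DOCS when exhausted. */
    int docID() {
        if (pos < 0) return -1;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    /** Moves forward to the first doc ID >= target; it can never move backwards. */
    int advance(int target) {
        do {
            pos++;
        } while (pos < docs.length && docs[pos] < target);
        return docID();
    }

    /** Frequency for the current doc; only valid while positioned on a doc. */
    int freq() {
        return freqs[pos];
    }
}

/** Hypothetical wrapper: reuses one iterator for ascending requests. */
final class TermFreqLookup {
    private final int[] docs;
    private final int[] freqs;
    private ForwardPostings postings;
    private int lastRequested;
    private int created = 0;

    TermFreqLookup(int[] docs, int[] freqs) {
        this.docs = docs;
        this.freqs = freqs;
        reset();
    }

    /** Creates a fresh iterator; this is the cost paid for an out-of-order request. */
    private void reset() {
        postings = new ForwardPostings(docs, freqs);
        lastRequested = -1;
        created++;
    }

    /** Returns the term frequency in doc, or 0 if the term does not occur there. */
    int termFreq(int doc) {
        if (doc < lastRequested) {
            reset(); // the iterator is already past this doc
        }
        lastRequested = doc;
        int current = postings.docID();
        if (current < doc) {
            current = postings.advance(doc);
        }
        return current == doc ? postings.freq() : 0;
    }

    /** Number of iterators created so far, exposed to show the effect of ordering. */
    int iteratorsCreated() {
        return created;
    }
}
```

With postings for docs {2, 5, 9}, asking for docs 2, 4, 5, 9 in that order
uses a single iterator, while asking for doc 2 again afterwards forces a
second one. Under these assumptions, sorting the doc IDs before looking up
frequencies keeps the whole batch to one pass.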


Bianca, can you explain your use case in more detail? What did you mean
by having a new document? Is a new document added to the index? If so,
you already have to reopen the searcher/reader anyway to get a new
DocsEnum.

On Aug 19, 2014, at 08:26 AM, Erick Erickson <[email protected]> wrote:

Hmmm, I'm not at all an expert here, but Solr has a function
query "termfreq" that does what you're doing I think? I wonder
if the code for that function query would be a good place to
copy (or even make use of)? See TermFreqValueSource...

Maybe not helpful at all, but...
Erick

On Tue, Aug 19, 2014 at 7:04 AM, Bianca Pereira <[email protected]> wrote:
> Hi everybody,
>
> I would like to know your suggestions to calculate Term Frequency in a
> Lucene document. Currently I am using MultiFields.getTermDocsEnum,
> iterating through the DocsEnum 'de' returned and getting the frequency
> with de.freq() for the desired document.
>
> My solution gives me the result I want but I am having time issues. For
> instance, I want to calculate the term frequency for a given term for N
> documents in a sequence. Then, every time I have a new document I have
> to retrieve exactly the same DocsEnum again and iterate until I find the
> document I want. Of course I cannot cache DocsEnum (yes, I made this
> huge mistake) because it is an iterator.
>
> Do you have any suggestions on how I can get Term Frequency in a fast
> way? The only suggestion I have had up to now was "Do it
> programmatically, don't use Lucene". Should this be the solution?
>
> Thank you.
>
> Regards,
> Bianca Pereira

