Re: HighFreqTerms for results set

2011-07-21 Thread Mihai Caraman
It's only available in Solr and it's based on UnInvertedField. Lucene 3.4.0 should have it implemented too. I ran a small index in Solr and it does the job by showing

Re: HighFreqTerms for results set

2011-07-20 Thread Israel Tsadok
This is very interesting. Do you know how query faceting is implemented?

Re: HighFreqTerms for results set

2011-07-20 Thread Erik Fäßler
I might be mistaken here, but why exactly wouldn't you use the facet approach? I don't know exactly how to do this in core Lucene, but with Solr it works very well, also for multi-valued fields. You could just say "give me the 100 most frequent terms in field X" for each field you're inter
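
As a rough illustration of the "give me the 100 most frequent terms in field X" request, here is a minimal sketch that only builds a Solr select URL and does not send it. The parameter names (`facet`, `facet.field`, `facet.limit`, `rows`, `wt`) are standard Solr request parameters; the base URL, the query and the field name `body` are assumptions made up for the example, and `facet_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def facet_url(base_url, query, field, limit=100):
    """Build a Solr select URL asking for the top `limit` terms of `field`
    among the documents matching `query` (hypothetical helper)."""
    params = {
        "q": query,            # the search whose result set we want to summarize
        "rows": 0,             # no documents needed, only the facet counts
        "facet": "true",       # switch faceting on
        "facet.field": field,  # field whose indexed terms are counted
        "facet.limit": limit,  # keep only the most frequent terms
        "wt": "json",          # response format
    }
    return base_url + "/select?" + urlencode(params)


# Assumed host, query and field name, for illustration only.
url = facet_url("http://localhost:8983/solr", "body:lucene", "body")
print(url)
```

The counts come back per indexed term of the field, restricted to the documents that match `q`, which is why this also applies to tokenized, multi-valued fields.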

Re: Re: HighFreqTerms for results set

2011-07-19 Thread Mihai Caraman
Yeah, that's too slow to use. Thank you very much for your answers. I really appreciate it. All the best, Mihai C

Re: Re: HighFreqTerms for results set

2011-07-19 Thread Israel Tsadok
On Tue, Jul 19, 2011 at 12:20 PM, wrote: > Israel, if you have this implemented, I'd appreciate if you can crunch some > numbers so I know how slow it actually is, for future comparison? Let's say > on 100.000 results, each of which have up to 50 words, or 50.000 results > with 100 words each ...

Re: Re: HighFreqTerms for results set

2011-07-19 Thread caraman . mihai
Israel, if you have this implemented, I'd appreciate if you can crunch some numbers so I know how slow it actually is, for future comparison? Let's say on 100.000 results, each of which have up to 50 words, or 50.000 results with 100 words each ... how much time does it take or how many queri

Re: HighFreqTerms for results set

2011-07-19 Thread Israel Tsadok
We faced this problem a long time ago, and ended up just extracting all the matching documents, re-analyzing and counting the terms using a MultiSet. It was very slow, but it worked. You might
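
The brute-force approach described in this message (fetch every matching document, re-analyze its text, count terms in a multiset) can be sketched roughly as follows. This is not the poster's code: it assumes the matching documents' text has already been retrieved as plain strings, uses a regex tokenizer as a stand-in for a real analyzer, and uses `collections.Counter` as the multiset. The names `analyze` and `top_terms` are hypothetical.

```python
import re
from collections import Counter


def analyze(text):
    """Stand-in for a real analyzer (assumption): lowercase the text
    and split it on non-word characters."""
    return re.findall(r"\w+", text.lower())


def top_terms(matching_texts, n=10):
    """Count every term across the matching documents and return the
    n most frequent (term, count) pairs."""
    counts = Counter()  # plays the role of the multiset
    for text in matching_texts:
        counts.update(analyze(text))
    return counts.most_common(n)


# Made-up result set, for illustration only.
hits = [
    "Lucene indexes text",
    "Solr builds on Lucene",
    "text search with Lucene",
]
print(top_terms(hits, 2))  # [('lucene', 3), ('text', 2)]
```

The cost grows with the total number of tokens in the result set, since every matching document is read and tokenized again, which fits the "very slow, but it worked" description.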

Re: HighFreqTerms for results set

2011-07-18 Thread Mihai Caraman
Faceted search is for single-term fields, right? Isn't it bad practice to apply it to each word in each field in the result set? (If it's even possible.) Again, I want to find the most frequent word in a result set. Words are in fields that contain phrases, not in their own field. 2011/7/18

Re: HighFreqTerms for results set

2011-07-18 Thread Manish Bafna
Use Facet by that field. It will bring up top words. On Mon, Jul 18, 2011 at 6:03 PM, Mihai Caraman wrote: > So I looked around and found no viable solution for this problem: > How to extract the most frequent terms in the search result set after > submitting the query. > > HighFreqTerms >

HighFreqTerms for results set

2011-07-18 Thread Mihai Caraman
So I looked around and found no viable solution for this problem: How to extract the most frequent terms in the search result set after submitting the query. HighFreqTerms and docFreq