I am sorry for reviving this thread after 6 months,
but we still have problems with faceted search on full-text fields.
We are trying to get the most frequent words in a text field across documents created within the last hour.
The faceted search takes too much time even though the number of matching documents
(created_at within 1
Another thing you can try is trunk. This specific case has been
improved by an order of magnitude recently.
The case that has been sped up is initial population of the
filterCache, or when the filterCache can't hold all of the unique
values, or when faceting is configured to not use the
the facet
counts of all 1M of facet terms, or did you limit the number of facet terms
returned to a small number?
Also did your entire index fit within RAM?
--- On Sat, 6/5/10, Furkan Kuru furkank...@gmail.com wrote:
From: Furkan Kuru furkank...@gmail.com
Subject: Re: Faceted Search Slows Down
On Sun, Jun 6, 2010 at 7:38 AM, Furkan Kuru furkank...@gmail.com wrote:
facet.limit = default value 100
facet.mincount is 1
The number of documents that match the query is 8-10K on average. I did not
calculate the terms (maybe using facet.limit=-1 and facet.mincount=1)
My index entirely
We are trying to provide real-time search, so the index changes almost every
minute.
We commit after every 100 documents received.
The facet search is executed every 5 minutes.
Here is the stats result after facet search with normal facet.method=fc (it
took 95 seconds)
name: fieldValueCache
On Sun, Jun 6, 2010 at 1:12 PM, Furkan Kuru furkank...@gmail.com wrote:
We are trying to provide real-time search, so the index changes almost every
minute.
We commit after every 100 documents received.
The facet search is executed every 5 minutes.
OK, that's the problem - pretty much every
Using the Zoie/Bobo combination gives you realtime faceting. (Lucene based)
http://sna-projects.com/zoie/
http://sna-projects.com/bobo/
wiki write-up:
http://snaprojects.jira.com/wiki/display/BOBO/Realtime+Faceting+with+Zoie
We can take this over to the zoie/bobo mailing list if you have
OK, I will have a look at distributed search and the multi-core Solr solution.
Thank you Yonik,
On Sun, Jun 6, 2010 at 8:54 PM, Yonik Seeley yo...@lucidimagination.com wrote:
On Sun, Jun 6, 2010 at 1:12 PM, Furkan Kuru furkank...@gmail.com wrote:
We try to provide real-time search. So the index is
strategy still work in this case? If
not, is there any other way to mitigate the cache re-building problem of facet
search?
--- On Sun, 6/6/10, Yonik Seeley yo...@lucidimagination.com wrote:
From: Yonik Seeley yo...@lucidimagination.com
Subject: Re: Faceted Search Slows Down as index gets
The documents' full-text fields are 140 characters long (tweets).
Actually I had looked at those parameters and thought no change was
necessary because the terms per document would be few and the unique term
count was nearly 1 M. I don't know exactly but average term count per
document text can be
Hello,
I have been dealing with real-time data.
As the total number of indexed documents gets larger (now 5 M),
a faceted search on a text field limited by the creation time, which we use
to find the most used word in all these text fields, slows down.
query string: created_time:[NOW-1HOUR
Faceting on a full-text field is hard.
What version of Solr are you using?
If it's 1.4 or later, try setting
facet.method=enum
And to use the filterCache less, try
facet.enum.cache.minDf=100
-Yonik
http://www.lucidimagination.com
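
For anyone following along, here is a minimal sketch of how the two parameters suggested above could be attached to a facet request. The parameter names facet.method and facet.enum.cache.minDf come from the message above; everything else is an assumption made for illustration: the host and port, the field name "text", and the upper bound of the date range (the original query in this thread is cut off after NOW-1HOUR). It only builds the request URL and does not contact a server.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_facet_url(base_url, field, query, min_df=100, limit=100, mincount=1):
    """Build a Solr select URL that facets on `field` using facet.method=enum.

    base_url and field are hypothetical placeholders, not values from the thread.
    """
    params = [
        ("q", query),
        ("rows", 0),                         # only facet counts are wanted, no documents
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", "enum"),            # walk the terms instead of un-inverting the field
        ("facet.enum.cache.minDf", min_df),  # terms with lower doc freq skip the filterCache
        ("facet.limit", limit),
        ("facet.mincount", mincount),
    ]
    return base_url + "?" + urlencode(params)


# Assumed values: local Solr instance, a field called "text",
# and a range that ends at NOW (the thread's query is truncated).
url = build_facet_url(
    "http://localhost:8983/solr/select",
    "text",
    "created_time:[NOW-1HOUR TO NOW]",
)
print(url)

# Round-trip check that the parameters survive URL encoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["facet.method"], decoded["facet.enum.cache.minDf"])
```

Raising min_df keeps more low-frequency terms out of the filterCache at the cost of computing their counts on each request, so the value 100 is a starting point to tune rather than a recommendation.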
On Fri, Jun 4, 2010 at 10:31 AM, Furkan Kuru
I am using version 1.4.
I have tried your suggestion,
it takes around 25-30 seconds now.
Thank you,
On Fri, Jun 4, 2010 at 5:54 PM, Yonik Seeley yo...@lucidimagination.com wrote:
Faceting on a full-text field is hard.
What version of Solr are you using?
If it's 1.4 or later, try setting
Slows Down as index gets larger
To: solr-user@lucene.apache.org, yo...@lucidimagination.com
Date: Friday, June 4, 2010, 11:25 AM
I am using version 1.4.
I have tried your suggestion,
it takes around 25-30 seconds now.
Thank you,
On Fri, Jun 4, 2010 at 5:54 PM, Yonik Seeley
yo
On Fri, Jun 4, 2010 at 7:33 PM, Andy angelf...@yahoo.com wrote:
Yonik,
Just curious: why does using enum improve facet performance?
Furkan was faceting on a text field with each word being a facet value. I'd
imagine that'd mean there's a large number of facet values. According to the