Re: Facet Performance

2020-06-17 Thread Erick Erickson
queryResultCache doesn’t really help with faceting, even if it’s hit for the main query. That cache only stores a subset of the hits, and to facet properly you need the entire result set…. > On Jun 17, 2020, at 12:47 PM, James Bodkin > wrote: > > We've noticed that the filterCache uses a sig

Re: Facet Performance

2020-06-17 Thread James Bodkin
We've noticed that the filterCache uses a significant amount of memory, as we've assigned 8GB Heap per instance. In total, we have 32 shards with 2 replicas, hence (8*32*2) 512G Heap space alone, further memory is required to ensure the index is always memory mapped for performance reasons. Ide
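The heap total quoted in this message is a plain product of the three figures James gives. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the result covers JVM heap only, not the additional OS memory the message says is needed for memory-mapping the index.

```python
def total_heap_gb(heap_gb_per_instance, shards, replicas_per_shard):
    """Total JVM heap across all Solr instances, in GB (heap only)."""
    instances = shards * replicas_per_shard
    return heap_gb_per_instance * instances

# Figures taken from the message: 8GB heap, 32 shards, 2 replicas each.
print(total_heap_gb(8, 32, 2))
```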

Re: Facet Performance

2020-06-17 Thread Michael Gibney
To expand a bit on what Erick said regarding performance: my sense is that the RefGuide assertion that "docValues=true" makes faceting "faster" could use some qualification/clarification. My take, fwiw: First, to reiterate/paraphrase what Erick said: the "faster" assertion is not comparing to "fac

Re: Facet Performance

2020-06-17 Thread Erick Erickson
Uninvertible is a safety mechanism to make sure that you don’t _unknowingly_ use a docValues=false field for faceting/grouping/sorting/function queries. The primary point of docValues=true is twofold: 1> reduce Java heap requirements by using the OS memory to hold it 2> uninverting can be expen

Re: Facet Performance

2020-06-17 Thread James Bodkin
The large majority of the relevant fields have fewer than 20 unique values. We have two fields over that with 150 unique values and 5300 unique values respectively. At the moment, our filterCache is configured with a maximum size of 8192. From the DocValues documentation (https://lucene.apac

Re: Facet Performance

2020-06-17 Thread Anthony Groves
Ah, interesting! So if the number of possible values is low (like <= 10), it is faster to *not* use docValues on that (indexed) faceted field? Does this hold true even when using faceting techniques like tag and exclusion? Thanks, Anthony On Wed, Jun 17, 2020 at 9:37 AM David Smiley wrote: > I

Re: Facet Performance

2020-06-17 Thread David Smiley
I strongly recommend setting indexed=true on a field you facet on for the purposes of efficient refinement (fq=field:value). But it strictly isn't required, as you have discovered. ~ David On Wed, Jun 17, 2020 at 9:02 AM Michael Gibney wrote: > facet.method=enum works by executing a query (ag

Re: Facet Performance

2020-06-17 Thread Michael Gibney
facet.method=enum works by executing a query (against indexed values) for each indexed value in a given field (which, for indexed=false, is "no values"). So that explains why facet.method=enum no longer works. I was going to suggest that you might not want to set indexed=false on the docValues face
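Michael's description of facet.method=enum (one query per indexed value, intersected with the main result set) can be illustrated with a toy model. This is only a sketch of the idea, not Solr's implementation: the in-memory inverted index, the function names, and the sample documents are all assumptions made for illustration. It also shows why a field with no indexed terms yields no facet counts under this method.

```python
from collections import defaultdict

def build_inverted_index(docs, field):
    """Map each indexed term of `field` to the set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for value in doc.get(field, []):
            index[value].add(doc_id)
    return index

def enum_facet(inverted_index, result_set):
    """Count matches per term: one set intersection per indexed term."""
    counts = {}
    for term, postings in inverted_index.items():
        n = len(postings & result_set)
        if n:
            counts[term] = n
    return counts

# Hypothetical sample data: three documents with a multivalued "color" field.
docs = {
    1: {"color": ["red"]},
    2: {"color": ["red", "blue"]},
    3: {"color": ["green"]},
}
index = build_inverted_index(docs, "color")

# Docs 1 and 2 stand in for the result set of the main query.
print(enum_facet(index, {1, 2}))

# With indexed=false there are no indexed terms to enumerate,
# so the same loop produces no facet counts at all.
print(enum_facet({}, {1, 2}))
```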

Re: Facet Performance

2020-06-17 Thread James Bodkin
Thanks, I've implemented some queries that improve the first-hit execution for faceting. Since turning off indexed on those fields, we've noticed that facet.method=enum no longer returns the facets when used. Using facet.method=fc/fcs is significantly slower compared to facet.method=enum for us

Re: Facet Performance

2020-06-16 Thread Erick Erickson
Ok, I see the disconnect... Necessary parts of the index are read from disk lazily. So your newSearcher or firstSearcher query needs to do whatever operation causes the relevant parts of the index to be read. In this case, probably just facet on all the fields you care about. I'd add sorting too if

Re: Facet Performance

2020-06-16 Thread James Bodkin
I've been trying to build a query that I can use in newSearcher based off the information in your previous e-mail. I thought you meant to build a *:* query as per Query 1 in my previous e-mail but I'm still seeing the first-hit execution. Now I'm wondering if you meant to create a *:* query with

Re: Facet Performance

2020-06-16 Thread Erick Erickson
Did you try the autowarming like I mentioned in my previous e-mail? > On Jun 16, 2020, at 10:18 AM, James Bodkin > wrote: > > We've changed the schema to enable docValues for these fields and this led to > an improvement in the response time. We found a further improvement by also > switching

Re: Facet Performance

2020-06-16 Thread James Bodkin
We've changed the schema to enable docValues for these fields and this led to an improvement in the response time. We found a further improvement by also switching off indexed as these fields are used for faceting and filtering only. Since those changes, we've found that the first-execution for q

Re: Facet Performance

2020-06-12 Thread Erick Erickson
I question whether filterCache has anything to do with it, I suspect what’s really happening is that the first time you’re reading the relevant bits from disk into memory. And to double check you should have docValues enabled for all these fields. The “uninverting” process can be very expensive, and

Re: Facet Performance

2020-06-12 Thread James Bodkin
We've run the performance test after changing the fields to be of the type string. We're seeing improved performance, especially after the first time the query has run. The first run is taking around 1-2 seconds rather than 6-8 seconds and when the filter cache is present, the response time is a

Re: Facet Performance

2020-06-11 Thread James Bodkin
Could you explain why the performance is an issue for points-based fields? I've looked through the referenced issue (which is fixed in the version we are running) but I'm missing the link between the two. Is there an issue to improve this for points-based fields? We're going to change the field

Re: Facet Performance

2020-06-11 Thread Erick Erickson
There’s a lot of confusion about using points-based fields for faceting, see: https://issues.apache.org/jira/browse/SOLR-13227 for instance. Two options you might try: 1> copyField to a string field and facet on that (won’t work, of course, for any kind of interval/range facet) 2> use the deprec

Facet Performance

2020-06-11 Thread James Bodkin
We’ve been running a load test against our index and have noticed that the facet queries are significantly slower than we would like. Currently these types of queries are taking several seconds to execute and we are wondering if it would be possible to speed these up. Repeating the same query over a

Re: Facet performance problem

2018-02-20 Thread Shawn Heisey
On 2/20/2018 1:18 AM, LOPEZ-CORTES Mariano-ext wrote: We return a facet list of values in "motifPresence" field (person status). Status: [ ] status1 [x] status2 [x] status3 The user then selects 1 or multiple status (It's this step that we called "facet filtering

RE: Facet performance problem

2018-02-20 Thread LOPEZ-CORTES Mariano-ext
solution? -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Monday, February 19, 2018 18:18 To: solr-user Subject: Re: Facet performance problem I'm confused here. What do you mean by "facet filtering"? Your examples have no facets at all, just

Re: Facet performance problem

2018-02-19 Thread Erick Erickson
I'm confused here. What do you mean by "facet filtering"? Your examples have no facets at all, just a _filter query_. I'll assume you want to use filter query (fq), and faceting has nothing to do with it. This is one of the tricky bits of docValues. While it's _possible_ to search on a field that'

Facet performance problem

2018-02-19 Thread MOUSSA MZE Oussama-ext
Hi, We have the following environment: 3-node cluster, 1 shard, replication factor = 2, 8GB per node, 29 million documents. We are faceting over the field "motifPresence", defined as follows: Once the user selects a motifPresence filter, we execute the search again with: fq: (value1 OR value2 OR value3 OR

Re: Really slow facet performance in 6.6

2017-10-25 Thread Yonik Seeley
On Mon, Oct 23, 2017 at 3:06 PM, John Davis wrote: > Hello, > > We are seeing really slow facet performance with new solr release. This is > on an index of 2M documents. A few things we've tried: What happens when you run this facet request again? The first time a UIF faceting

Re: Really slow facet performance in 6.6

2017-10-24 Thread Günter Hipler
have a look for more background information: https://issues.apache.org/jira/browse/SOLR-8096 it's not only related to version 6.6. It's a question of design since 5.x Günter On 23.10.2017 21:06, John Davis wrote: Hello, We are seeing really slow facet performance with new so

Re: Really slow facet performance in 6.6

2017-10-23 Thread Toke Eskildsen
John Davis wrote: > We are seeing really slow facet performance with new solr release. > This is on an index of 2M documents. I am currently running some performance experiments on simple String faceting, comparing Solr 4 & 6. There is definitely a performance difference, but it is n

Really slow facet performance in 6.6

2017-10-23 Thread John Davis
Hello, We are seeing really slow facet performance with new solr release. This is on an index of 2M documents. A few things we've tried: 1. method=uif however that didn't help much (the facet fields have docValues=false since they are multi-valued). Debug info below. 2. changing query

Re: JSON facet performance for aggregations

2017-05-25 Thread Saman Rasheed
gex: is there a way to narrow it down to just one regex term, e.g. *thing*, so it will find soothing, something, everything each with their number of occurrences for the document? thanks, ~ From: Yonik Seeley Sent: 24 May 2017

Re: JSON facet performance for aggregations

2017-05-24 Thread Yonik Seeley
On Mon, May 8, 2017 at 11:27 AM, Yonik Seeley wrote: > I opened https://issues.apache.org/jira/browse/SOLR-10634 to address > this performance issue. OK, this has been committed. A quick test shows about a 30x speedup when faceting on a string/numeric docvalues field with 100K unique values and d

Re: JSON facet performance for aggregations

2017-05-08 Thread Yonik Seeley
ming at that case? > > Please advise. > > Thanks > Mikhail > > -Original Message- > From: Yonik Seeley [mailto:ysee...@gmail.com] > Sent: Sunday, May 07, 2017 6:25 PM > To: solr-user@lucene.apache.org > Subject: Re: JSON facet performance for aggregations >

RE: JSON facet performance for aggregations

2017-05-08 Thread Mikhail Ibraheem
recommend streaming at that case? Please advise. Thanks Mikhail -Original Message- From: Yonik Seeley [mailto:ysee...@gmail.com] Sent: Sunday, May 07, 2017 6:25 PM To: solr-user@lucene.apache.org Subject: Re: JSON facet performance for aggregations OK, so I think I know what's going on.

Re: JSON facet performance for aggregations

2017-05-07 Thread Yonik Seeley
> Thanks > Mikhail > > -Original Message- > From: Yonik Seeley [mailto:ysee...@gmail.com] > Sent: Sunday, April 30, 2017 1:04 PM > To: solr-user@lucene.apache.org > Subject: Re: JSON facet performance for aggregations > > It is odd there would be quite such a big perfor

RE: JSON facet performance for aggregations

2017-04-30 Thread Mikhail Ibraheem
Hi Yonik, We are using Solr 6.5 Both studentId and grades are double: We have 1.5 million records. Thanks Mikhail -Original Message- From: Yonik Seeley [mailto:ysee...@gmail.com] Sent: Sunday, April 30, 2017 1:04 PM To: solr-user@lucene.apache.org Subject: Re: JSON facet performance

Re: JSON facet performance for aggregations

2017-04-30 Thread Yonik Seeley
ms: > > json.facet={ >studentId:{ > type:terms, > limit:-1, > field:" studentId " > >} > } > > > Thanks > > > -Original Message- > From: Vijay Tiwary [mailto:vijaykr.tiw...@gmail.com] > Sent: Sunday, April 30, 2017 10:44 AM

RE: JSON facet performance for aggregations

2017-04-30 Thread Mikhail Ibraheem
- From: Vijay Tiwary [mailto:vijaykr.tiw...@gmail.com] Sent: Sunday, April 30, 2017 10:44 AM To: solr-user@lucene.apache.org Subject: RE: JSON facet performance for aggregations Please enable doc values and try. There is a bug in the source code which causes json facet on string field to run ver

RE: JSON facet performance for aggregations

2017-04-30 Thread Vijay Tiwary
ld. > It is huge difference between json and flat here. Do you know the reason > for this? Is there a way to improve it ? > > -Original Message- > From: Vijay Tiwary [mailto:vijaykr.tiw...@gmail.com] > Sent: Sunday, April 30, 2017 9:58 AM > To: solr-user@lucene.apache.org >

RE: JSON facet performance for aggregations

2017-04-30 Thread Mikhail Ibraheem
@lucene.apache.org Subject: Re: JSON facet performance for aggregations Json facet on string fields run lot slower than on numeric fields. Try and see if you can represent studentid as a numeric field. On Apr 30, 2017 1:19 PM, "Mikhail Ibraheem" wrote: > Hi, > > I am trying to do ag

Re: JSON facet performance for aggregations

2017-04-30 Thread Vijay Tiwary
ds : > stats=true&facet=true&stats.field={!tag=piv1 > sum=true}grades&facet.pivot={!stats=piv1}studentId > > > > We are hoping to use one approach json or flat for all our services. JSON > facet performance is better for many case. > > > > Please advise on why the performance for this is so bad and if we can > improve it. Also what is the default algorithm used for json facet. > > > > Thanks > > Mikhail >

JSON facet performance for aggregations

2017-04-30 Thread Mikhail Ibraheem
&stats.field={!tag=piv1 sum=true}grades&facet.pivot={!stats=piv1}studentId We are hoping to use one approach json or flat for all our services. JSON facet performance is better for many cases. Please advise on why the performance for this is so bad and if we can improve it. Also what i

Re: prefix facet performance

2017-04-24 Thread Yonik Seeley
> - > --- > Alessandro Benedetti > Search Consultant, R&D Software Engineer, Director > Sease Ltd. - www.sease.io > -- > View this message in context: > http://lucene.472066.n3.nabble.com/prefix-facet-performance-tp4330684p4331553.html > Sent from the Solr - User mailing list archive at Nabble.com.

Re: prefix facet performance

2017-04-24 Thread alessandro.benedetti
- --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io -- View this message in context: http://lucene.472066.n3.nabble.com/prefix-facet-performance-tp4330684p4331553.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: prefix facet performance

2017-04-21 Thread Maria Muslea
I see. Once I specify a prefix the number of terms is MUCH smaller. Thank you again for all your help. Maria On Fri, Apr 21, 2017 at 1:46 PM, Yonik Seeley wrote: > On Fri, Apr 21, 2017 at 4:25 PM, Maria Muslea > wrote: > > The field is: > > > > > > > > and using unique() I found that it has

Re: prefix facet performance

2017-04-21 Thread Yonik Seeley
On Fri, Apr 21, 2017 at 4:25 PM, Maria Muslea wrote: > The field is: > > > > and using unique() I found that it has 700K+ unique values. > > The query before (that takes ~10s): > > wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=concept&facet.prefix=A/ > > the query after (that is almost

Re: prefix facet performance

2017-04-21 Thread Maria Muslea
t, R&D Software Engineer, Director > Sease Ltd. - www.sease.io > -- > View this message in context: http://lucene.472066.n3. > nabble.com/prefix-facet-performance-tp4330684p4331309.html > Sent from the Solr - User mailing list archive at Nabble.com. >

Re: prefix facet performance

2017-04-21 Thread alessandro.benedetti
Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io -- View this message in context: http://lucene.472066.n3.nabble.com/prefix-facet-performance-tp4330684p4331309.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: prefix facet performance

2017-04-21 Thread Maria Muslea
e schema config for your > field, > > Cheers > > > > - > --- > Alessandro Benedetti > Search Consultant, R&D Software Engineer, Director > Sease Ltd. - www.sease.io > -- > View this message in context: http://lucene.472066.n3. > nabble.co

Re: prefix facet performance

2017-04-21 Thread alessandro.benedetti
. Let's start from the algorithm you are using and the schema config for your field, Cheers - --- Alessandro Benedetti Search Consultant, R&D Software Engineer, Director Sease Ltd. - www.sease.io -- View this message in context: http://lucene.472066.n3.nabble.com/pre

Re: prefix facet performance

2017-04-18 Thread Maria Muslea
Hmmm, not sure. Probably in the range of 100K-500K. Before writing the email I was just looking at: http://yonik.com/facet-performance/ Wow, using facet.method=enum makes a big difference. I will read on it to understand what it does. Thank you so much. Maria On Tue, Apr 18, 2017 at 5:21 PM
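The change Maria reports here amounts to adding one parameter to the prefix-facet request quoted elsewhere in this thread. A minimal sketch of building both request URLs follows; the host, port, and core name are hypothetical placeholders, and the helper name is an assumption. Only the query parameters themselves (q, rows, facet, facet.field, facet.prefix, facet.method) come from the thread.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint: adjust host, port and core name to your setup.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"

def prefix_facet_url(field, prefix, method=None):
    """Build a facet-only request (rows=0) restricted to terms with `prefix`."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),
    ]
    if method is not None:
        # e.g. "enum", as suggested in the thread
        params.append(("facet.method", method))
    return SOLR_SELECT + "?" + urlencode(params)

before = prefix_facet_url("concept", "A/")
after = prefix_facet_url("concept", "A/", method="enum")
print(before)
print(after)
```

Timing the two URLs against the same index is the simplest way to check whether the difference reported in the thread also holds for a given field; the result depends on the number of unique terms and on the filterCache configuration.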

Re: prefix facet performance

2017-04-18 Thread Yonik Seeley
How many unique values in the index? You could try facet.method=enum -Yonik On Tue, Apr 18, 2017 at 8:16 PM, Maria Muslea wrote: > Hi, > > I have ~40K documents in SOLR (not many) and a multivalued facet field that > contains at least 2K values per document. > > The values of the facet field lo

prefix facet performance

2017-04-18 Thread Maria Muslea
Hi, I have ~40K documents in SOLR (not many) and a multivalued facet field that contains at least 2K values per document. The values of the facet field look like: A/B, A/C, A/D, C/E, M/F, etc, and I use facet.prefix. q=*:*&rows=0&facet=true&facet.field=concept&facet.prefix=A/ with "concept" de

Re: 5.4 facet performance thumbs-up

2015-12-23 Thread Yonik Seeley
Awesome, thanks for the feedback! -Yonik On Tue, Dec 22, 2015 at 5:36 PM, Aigner, Max wrote: > I'm happy to report that we are seeing significant speed-ups in our queries > with Json facets on 5.4 vs regular facets on 5.1. Our queries contain mostly > terms facets, many of them with exclusion

5.4 facet performance thumbs-up

2015-12-22 Thread Aigner, Max
I'm happy to report that we are seeing significant speed-ups in our queries with Json facets on 5.4 vs regular facets on 5.1. Our queries contain mostly terms facets, many of them with exclusion tags and prefix filtering. Nice work!

Re: Reply: (Issue) How improve solr facet performance

2014-05-27 Thread david.w.smi...@gmail.com
t > the performance, it is improved about 2.5 times compared with no parameter. > So, It is improved a lot under your advice, thanks a lot. > 2. > Now I have another performance issue, It's the group performance. > The number of data is as same as facet perform

Reply: (Issue) How improve solr facet performance

2014-05-27 Thread Alice.H.Yang (mis.cnsh04.Newegg) 41493
s, Alice Yang +86-021-51530666*41493 Floor 19,KaiKai Plaza,888,Wanhandu Rd,Shanghai(200042) -Original Message- From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk] Sent: May 24, 2014 15:17 To: solr-user@lucene.apache.org Subject: RE: (Issue) How improve solr facet performance Alice.H.Yang (mis.cnsh0

RE: (Issue) How improve solr facet performance

2014-05-24 Thread Toke Eskildsen
Alice.H.Yang (mis.cnsh04.Newegg) 41493 [alice.h.y...@newegg.com] wrote: > 1. I'm sorry, I have made a mistake, the total number of documents is 32 > Million, not 320 Million. > 2. The system memory is large for solr index, OS total has 256G, I set the > solr tomcat HEAPSIZE="-Xms25G -Xmx100G"

fw: (Issue) How improve solr facet performance

2014-05-23 Thread Alice.H.Yang (mis.cnsh04.Newegg) 41493
ay 23, 2014 8:08 PM To: d...@lucene.apache.org Subject: Re: (Issue) How improve solr facet performance On Fri, 2014-05-23 at 11:45 +0200, Alice.H.Yang (mis.cnsh04.Newegg) 41493 wrote: > We are blocked by solr facet performance when query hits many > documents. (about 10,000,000) [320M document

RE: Facet performance

2013-10-23 Thread Lemke, Michael SZ/HZA-ZSW
On Tue, October 22, 2013 5:23 PM Michael Lemke wrote: >On Tue, October 22, 2013 9:23 AM Toke Eskildsen wrote: >>On Mon, 2013-10-21 at 16:57 +0200, Lemke, Michael SZ/HZA-ZSW wrote: >>> QTime fc: >>>never returns, webserver restarts itself after 30 min with 100% CPU >>> load >> >>It might be

RE: Facet performance

2013-10-23 Thread Toke Eskildsen
On Tue, 2013-10-22 at 17:25 +0200, Lemke, Michael SZ/HZA-ZSW wrote: > On Tue, October 22, 2013 11:54 AM Andre Bois-Crettez wrote: > >> This is with Solr 1.4. > >Really ? > >This sound really outdated to me. > >Have you tried a tried more recent version, 4.5 just went out ? > > Sorry, can't. Too m

RE: Facet performance

2013-10-22 Thread Lemke, Michael SZ/HZA-ZSW
On Tue, October 22, 2013 11:54 AM Andre Bois-Crettez wrote: > >> This is with Solr 1.4. >Really ? >This sound really outdated to me. >Have you tried a tried more recent version, 4.5 just went out ? Sorry, can't. Too much `grown' stuff. Michael

RE: Facet performance

2013-10-22 Thread Lemke, Michael SZ/HZA-ZSW
On Tue, October 22, 2013 9:23 AM Toke Eskildsen wrote: >On Mon, 2013-10-21 at 16:57 +0200, Lemke, Michael SZ/HZA-ZSW wrote: >> QTime fc: >>never returns, webserver restarts itself after 30 min with 100% CPU >> load > >It might be because it dies due to garbage collection. But since more >m

Re: Facet performance

2013-10-22 Thread Andre Bois-Crettez
This is with Solr 1.4. Really? This sounds really outdated to me. Have you tried a more recent version? 4.5 just went out. -- André Bois-Crettez Software Architect Search Developer http://www.kelkoo.com/ Kelkoo SAS Société par Actions Simplifiée Au capital de € 4.168.964,30 Siège socia

RE: Facet performance

2013-10-22 Thread Toke Eskildsen
On Mon, 2013-10-21 at 16:57 +0200, Lemke, Michael SZ/HZA-ZSW wrote: > QTime enum: > 1st call: 1200 > subsequent calls: 200 Those numbers seems fine. > QTime fc: >never returns, webserver restarts itself after 30 min with 100% CPU > load It might be because it dies due to garba

RE: Facet performance

2013-10-21 Thread Lemke, Michael SZ/HZA-ZSW
On Mon, October 21, 2013 10:04 AM, Toke Eskildsen wrote: >On Fri, 2013-10-18 at 18:30 +0200, Lemke, Michael SZ/HZA-ZSW wrote: >> Toke Eskildsen wrote: >> > Unfortunately the enum-solution is normally quite slow when there >> > are enough unique values to trigger the "too many > values"-exception. >

RE: Facet performance

2013-10-21 Thread Toke Eskildsen
On Fri, 2013-10-18 at 18:30 +0200, Lemke, Michael SZ/HZA-ZSW wrote: > Toke Eskildsen [mailto:t...@statsbiblioteket.dk] wrote: > > Unfortunately the enum-solution is normally quite slow when there > > are enough unique values to trigger the "too many > values"-exception. > > [...] > > [...] And yes

RE: Facet performance

2013-10-18 Thread Chris Hostetter
: >> 1. q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0 : >> 2. q=word&facet.field=CONTENT&facet=true&facet.prefix=a&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0 : > : >> The only difference is an empty facet.prefix in the

Re: Facet performance

2013-10-18 Thread Otis Gospodnetic
DocValues is the new black http://wiki.apache.org/solr/DocValues Otis -- Solr & ElasticSearch Support -- http://sematext.com/ SOLR Performance Monitoring -- http://sematext.com/spm On Fri, Oct 18, 2013 at 12:30 PM, Lemke, Michael SZ/HZA-ZSW wrote: > Toke Eskildsen [mailto:t...@statsbiblioteke

RE: Facet performance

2013-10-18 Thread Lemke, Michael SZ/HZA-ZSW
Toke Eskildsen [mailto:t...@statsbiblioteket.dk] wrote: >Lemke, Michael SZ/HZA-ZSW [lemke...@schaeffler.com] wrote: >> 1. >> q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0 >> 2. >> q=word&facet.field=CONTENT&facet=true&facet.prefix=a&

RE: Facet performance

2013-10-18 Thread Toke Eskildsen
Lemke, Michael SZ/HZA-ZSW [lemke...@schaeffler.com] wrote: > 1. > q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0 > 2. > q=word&facet.field=CONTENT&facet=true&facet.prefix=a&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0 > T

Facet performance

2013-10-18 Thread Lemke, Michael SZ/HZA-ZSW
I am working with Solr facet fields and have come across a performance problem I don't understand. Consider these two queries: 1. q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0 2. q=word&facet.field=CONTENT&facet=true&facet.prefix=a&fac

Re: Multivalued fields and facet performance

2011-01-10 Thread Howard Lee
rch :: http://search-lucene.com/ > > > > - Original Message > > From: Howard Lee > > To: solr-user@lucene.apache.org > > Sent: Mon, January 10, 2011 8:59:03 AM > > Subject: Multivalued fields and facet performance > > > > Hi, > > > >

Re: Multivalued fields and facet performance

2011-01-10 Thread Otis Gospodnetic
r-user@lucene.apache.org > Sent: Mon, January 10, 2011 8:59:03 AM > Subject: Multivalued fields and facet performance > > Hi, > > I'd appreciate some explanation on what may be going on in the following > scenario using multivalued fields and facets. > > Solr ve

Multivalued fields and facet performance

2011-01-10 Thread Howard Lee
Hi, I'd appreciate some explanation on what may be going on in the following scenario using multivalued fields and facets. Solr version: 1.5 Our index contains 35 million docs, and our search is using 2 multivalued fields as facets. There are approx 5 million different values in one field and 50

facet performance when number of values is large

2010-03-03 Thread Andy
I have a facet field whose values are created by users. So potentially there could be a very large number of values. is that going to be a problem performance-wise? A few more questions to help me understand how facet works: - after the filter cache warmed up, will the (if any) performance probl

Re: facet performance tips

2009-08-13 Thread Jason Rutherglen
> On Thu, Aug 13, 2009 at 9:55 AM, Fuad Efendi wrote: >> It seems BOBO-Browse is alternate faceting engine; would be interesting to >> compare performance with SOLR... Distributed? >> >> >> -Original Message- >> From: Jason Rutherglen [mailto:jason.rutherg.

RE: facet performance tips

2009-08-13 Thread Fuad Efendi
h SOLR... Distributed? > > > -Original Message- > From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com] > Sent: August-12-09 6:12 PM > To: solr-user@lucene.apache.org > Subject: Re: facet performance tips > > For your fields with many terms you may want to

Re: facet performance tips

2009-08-13 Thread Jason Rutherglen
uld be interesting to > compare performance with SOLR... Distributed? > > > -Original Message- > From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com] > Sent: August-12-09 6:12 PM > To: solr-user@lucene.apache.org > Subject: Re: facet performance tips > > For

RE: facet performance tips

2009-08-13 Thread Fuad Efendi
Interesting, it has "BoboRequestHandler implements SolrRequestHandler" - easy to try it; and shards support [Fuad Efendi] It seems BOBO-Browse is alternate faceting engine; would be interesting to compare performance with SOLR... Distributed? [Jason Rutherglen] For your fields with many terms

RE: facet performance tips

2009-08-13 Thread Fuad Efendi
It seems BOBO-Browse is alternate faceting engine; would be interesting to compare performance with SOLR... Distributed? -Original Message- From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com] Sent: August-12-09 6:12 PM To: solr-user@lucene.apache.org Subject: Re: facet

RE: facet performance tips

2009-08-13 Thread Fuad Efendi
vé [mailto:jerome.et...@gmail.com] Sent: August-13-09 5:38 AM To: solr-user@lucene.apache.org Subject: Re: facet performance tips Thanks everyone for your advices. I increased my filterCache, and the faceting performances improved greatly. My faceted field can have at the moment ~4 different t

Re: facet performance tips

2009-08-13 Thread Jérôme Etévé
Thanks everyone for your advice. I increased my filterCache, and the faceting performance improved greatly. My faceted field can have at the moment ~4 different terms, so I did set a filterCache size of 5 and it works very well. However, I'm planning to increase the number of terms to

Re: facet performance tips

2009-08-12 Thread Stephen Duncan Jr
Note that depending on the profile of your field (full text and how many unique terms on average per document), the improvements from 1.4 may not apply, as you may exceed the limits of the new faceting technique in Solr 1.4. -Stephen On Wed, Aug 12, 2009 at 2:12 PM, Erik Hatcher wrote: > Yes, in

Re: facet performance tips

2009-08-12 Thread Jason Rutherglen
> > > > -Original Message- > From: Erik Hatcher [mailto:ehatc...@apache.org] > Sent: August-12-09 2:12 PM > To: solr-user@lucene.apache.org > Subject: Re: facet performance tips > > Yes, increasing the filterCache size will help with Solr 1.3 > performance.

RE: facet performance tips

2009-08-12 Thread Fuad Efendi
al Message- From: Erik Hatcher [mailto:ehatc...@apache.org] Sent: August-12-09 2:12 PM To: solr-user@lucene.apache.org Subject: Re: facet performance tips Yes, increasing the filterCache size will help with Solr 1.3 performance. Do note that trunk (soon Solr 1.4) has dramatically improved fac

Re: facet performance tips

2009-08-12 Thread Erik Hatcher
Yes, increasing the filterCache size will help with Solr 1.3 performance. Do note that trunk (soon Solr 1.4) has dramatically improved faceting performance. Erik On Aug 12, 2009, at 1:30 PM, Jérôme Etévé wrote: Hi everyone, I'm using some faceting on a solr index containing ~ 1

RE: facet performance tips

2009-08-12 Thread Manepalli, Kalyan
-Original Message- From: Jérôme Etévé [mailto:jerome.et...@gmail.com] Sent: Wednesday, August 12, 2009 12:31 PM To: solr-user@lucene.apache.org Subject: facet performance tips Hi everyone, I'm using some faceting on a solr index containing ~ 160K documents. I perform facets on multivalued s

facet performance tips

2009-08-12 Thread Jérôme Etévé
Hi everyone, I'm using some faceting on a solr index containing ~ 160K documents. I perform facets on multivalued string fields. The number of possible different values is quite large. Enabling facets degrades the performance by a factor of 3. Because I'm using solr 1.3, I guess the faceting mak

Re: Facet Performance

2008-07-31 Thread Funtick
Hoss, This is still an extremely interesting area for possible improvements; I simply don't want the topic to die http://www.nabble.com/Facet-Performance-td7746964.html http://issues.apache.org/jira/browse/SOLR-665 http://issues.apache.org/jira/browse/SOLR-667 http://issues.apache.org/jira/b

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Erik Hatcher wrote: On Dec 8, 2006, at 2:15 PM, Andrew Nagy wrote: My data is 492,000 records of book data. I am faceting on 4 fields: author, subject, language, format. Format and language are fairly simple as there are only a few unique terms. Author and subject however are much differe

Re: Facet Performance

2006-12-08 Thread Chris Hostetter
: Unfortunately which strategy will be chosen is currently undocumented : and control is a bit oblique: If the field is tokenized or multivalued : or Boolean, the FilterQuery method will be used; otherwise the : FieldCache method. I expect I or others will improve that shortly. Bear in mind, wh

Re: Facet Performance

2006-12-08 Thread Erik Hatcher
On Dec 8, 2006, at 2:15 PM, Andrew Nagy wrote: My data is 492,000 records of book data. I am faceting on 4 fields: author, subject, language, format. Format and language are fairly simple as there are only a few unique terms. Author and subject however are much different in that there are

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
J.J. Larrea wrote: Unfortunately which strategy will be chosen is currently undocumented and control is a bit oblique: If the field is tokenized or multivalued or Boolean, the FilterQuery method will be used; otherwise the FieldCache method. I expect I or others will improve that shortly.

Re: Facet Performance

2006-12-08 Thread Yonik Seeley
On 12/8/06, J.J. Larrea <[EMAIL PROTECTED]> wrote: Unfortunately which strategy will be chosen is currently undocumented and control is a bit oblique: If the field is tokenized or multivalued or Boolean, the FilterQuery method will be used; otherwise the FieldCache method. If anyone had time

Re: Facet Performance

2006-12-08 Thread J.J. Larrea
Andrew Nagy, ditto on what Yonik said. Here is some further elaboration: I am doing much the same thing (faceting on Author etc.). When my Author field was defined as a solr.TextField, even using solr.KeywordTokenizerFactory so it wasn't actually tokenized, the faceting code chose the QueryFilt

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Yonik Seeley wrote: Are they multivalued, and do they need to be. Anything that is of type "string" and not multivalued will use the lucene FieldCache rather than the filterCache. The author field is multivalued. Will this be a strong performance issue? I could make multiple author fields as

Re: Facet Performance

2006-12-08 Thread Yonik Seeley
On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote: Chris Hostetter wrote: >: Could you suggest a better configuration based on this? > >If that's what your stats look like after a single request, then i would >guess you would need to make your cache size at least 1.6 million in order >for it to

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Chris Hostetter wrote: : Could you suggest a better configuration based on this? If that's what your stats look like after a single request, then i would guess you would need to make your cache size at least 1.6 million in order for it to be of any use in improving your facet speed. Would th

Re: Facet Performance

2006-12-08 Thread Yonik Seeley
On 12/8/06, Chris Hostetter <[EMAIL PROTECTED]> wrote: : My data is 492,000 records of book data. I am faceting on 4 fields: : author, subject, language, format. : Format and language are fairly simple as there are only a few unique : terms. Author and subject however are much different in that

Re: Facet Performance

2006-12-08 Thread Chris Hostetter
: Here are the stats, I'm still a newbie to SOLR, so I'm not totally sure : what this all means: : lookups : 1530036 : hits : 2 : hitratio : 0.00 : inserts : 1530035 : evictions : 1504435 : size : 25600 those numbers are telling you that your cache is capable of holding 25,600 items. you have attem
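The reading of these cache statistics can be checked with a little arithmetic. The sketch below uses only the numbers quoted in the message; the variable names are assumptions. It shows that nearly every lookup missed and that the entries still resident (inserts minus evictions) exactly fill the cache.

```python
# Figures copied from the filterCache stats quoted in the message.
stats = {
    "lookups": 1530036,
    "hits": 2,
    "inserts": 1530035,
    "evictions": 1504435,
    "size": 25600,
}

hit_ratio = stats["hits"] / stats["lookups"]       # fraction of lookups served from cache
misses = stats["lookups"] - stats["hits"]          # lookups that had to be computed
resident = stats["inserts"] - stats["evictions"]   # entries still held in the cache

print(f"hit ratio: {hit_ratio:.2f}")
print(f"misses: {misses}")
print(f"resident entries: {resident} (cache full: {resident == stats['size']})")
```

Each miss leads to an insert, and once the cache is full each new insert evicts an older entry, which is why the eviction count is so close to the insert count.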

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Yonik Seeley wrote: On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote: I changed the filterCache to the following: However a search that normally takes .04s is taking 74 seconds once I use the facets since I am faceting on 4 fields. The first time or subsequent times? Is your filterCa

Re: Facet Performance

2006-12-08 Thread Yonik Seeley
On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote: I changed the filterCache to the following: However a search that normally takes .04s is taking 74 seconds once I use the facets since I am faceting on 4 fields. The first time or subsequent times? Is your filterCache big enough yet? Wha

Re: Facet Performance

2006-12-08 Thread Andrew Nagy
Yonik Seeley wrote: 1) facet on single-valued strings if you can 2) if you can't do (1) then enlarge the fieldcache so that the number of filters (one per possible term in the field you are filtering on) can fit. I changed the filterCache to the following: However a search that normally t
