queryResultCache doesn’t really help with faceting, even if it’s hit for the
main query.
That cache only stores a subset of the hits, and to facet properly you need
the entire result set….
> On Jun 17, 2020, at 12:47 PM, James Bodkin
> wrote:
>
> We've noticed that the filterCache uses a sig
We've noticed that the filterCache uses a significant amount of memory, as
we've assigned 8GB Heap per instance.
In total, we have 32 shards with 2 replicas, hence (8*32*2) 512G of heap space
alone. Further memory is required to ensure the index is always memory mapped
for performance reasons.
Ide
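To make the heap arithmetic above explicit, here is a minimal sketch. It assumes, as the message states, one 8GB heap per instance and one instance per shard replica; the function name is hypothetical.

```python
def total_heap_gb(heap_per_instance_gb: int, shards: int, replicas_per_shard: int) -> int:
    """Total JVM heap across the cluster, assuming one instance per shard replica."""
    return heap_per_instance_gb * shards * replicas_per_shard


# Figures from the message: 8GB heap, 32 shards, 2 replicas each.
print(total_heap_gb(8, 32, 2))  # 512
```

This counts heap only; the OS memory needed to keep the index memory mapped comes on top of it.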
To expand a bit on what Erick said regarding performance: my sense is
that the RefGuide assertion that "docValues=true" makes faceting
"faster" could use some qualification/clarification. My take, fwiw:
First, to reiterate/paraphrase what Erick said: the "faster" assertion
is not comparing to "fac
Uninvertible is a safety mechanism to make sure that you don’t _unknowingly_
use a docValues=false
field for faceting/grouping/sorting/function queries. The primary point of
docValues=true is twofold:
1> reduce Java heap requirements by using the OS memory to hold it
2> uninverting can be expen
The large majority of the relevant fields have fewer than 20 unique values. We
have two fields over that, with 150 unique values and 5300 unique values
respectively.
At the moment, our filterCache is configured with a maximum size of 8192.
From the DocValues documentation
(https://lucene.apac
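For a rough feel of why a filterCache with a maximum size of 8192 can take a significant share of the heap, here is a back-of-the-envelope sketch. It assumes the worst case, where every cached filter is held as a full bitset of one bit per document in the core; small result sets are normally stored more compactly, so real usage is usually lower. The `max_doc` value of 10 million is a hypothetical figure, not one from this thread.

```python
def filter_cache_worst_case_bytes(max_entries: int, max_doc: int) -> int:
    """Upper bound on filterCache memory if every entry is a full bitset."""
    bytes_per_bitset = (max_doc + 7) // 8  # one bit per document, rounded up
    return max_entries * bytes_per_bitset


# 8192 entries (from the message) and a hypothetical 10 million documents per core.
worst_case = filter_cache_worst_case_bytes(8192, 10_000_000)
print(worst_case)  # 10240000000 bytes, more than an 8GB heap
```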
Ah, interesting! So if the number of possible values is low (like <= 10),
it is faster to *not* use docvalues on that (indexed) faceted field?
Does this hold true even when using faceting techniques like tag and
exclusion?
Thanks,
Anthony
On Wed, Jun 17, 2020 at 9:37 AM David Smiley
wrote:
> I
I strongly recommend setting indexed=true on a field you facet on for the
purposes of efficient refinement (fq=field:value). But it isn't strictly
required, as you have discovered.
~ David
On Wed, Jun 17, 2020 at 9:02 AM Michael Gibney
wrote:
> facet.method=enum works by executing a query (ag
facet.method=enum works by executing a query (against indexed values)
for each indexed value in a given field (which, for indexed=false, is
"no values"). So that explains why facet.method=enum no longer works.
I was going to suggest that you might not want to set indexed=false on
the docValues face
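The enum behaviour described above can be sketched as follows. This is a toy model in Python, not Solr's actual Java implementation: `postings` is a hypothetical map from each indexed term to the set of documents containing it, and `base_docs` stands in for the documents matching the main query.

```python
from typing import Dict, Set


def enum_facet_counts(postings: Dict[str, Set[int]], base_docs: Set[int]) -> Dict[str, int]:
    """Count facet values by running one term 'query' per indexed value."""
    counts = {}
    for term, docs in postings.items():
        # Intersect the term's documents with the main result set.
        n = len(docs & base_docs)
        if n > 0:
            counts[term] = n
    return counts


indexed = {"red": {1, 2, 3}, "blue": {3, 4}}
print(enum_facet_counts(indexed, {2, 3}))  # {'red': 2, 'blue': 1}

# With indexed=false there are no indexed terms to iterate over,
# so the same procedure has nothing to count.
print(enum_facet_counts({}, {2, 3}))  # {}
```

The second call mirrors the situation in the thread: no indexed values means no per-term queries, hence no facets.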
Thanks, I've implemented some queries that improve the first-hit execution for
faceting.
Since turning off indexed on those fields, we've noticed that facet.method=enum
no longer returns the facets when used.
Using facet.method=fc/fcs is significantly slower compared to facet.method=enum
for us
Ok, I see the disconnect... Necessary parts of the index are read from disk
lazily. So your newSearcher or firstSearcher query needs to do whatever
operation causes the relevant parts of the index to be read. In this case,
probably just facet on all the fields you care about. I'd add sorting too
if
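As a sketch of what such a warming request could contain, the snippet below builds the parameters for a match-all query that facets on every field of interest and optionally sorts. The field names `category` and `brand` and the sort `price asc` are hypothetical placeholders; the same parameters would go into the newSearcher or firstSearcher query in solrconfig.xml.

```python
from urllib.parse import urlencode, parse_qs


def warming_params(facet_fields, sort=None):
    """Parameters for a warming query that touches all facet fields."""
    params = [("q", "*:*"), ("rows", "1"), ("facet", "true")]
    # One facet.field parameter per field that real queries facet on.
    params += [("facet.field", f) for f in facet_fields]
    if sort:
        params.append(("sort", sort))
    return params


query_string = urlencode(warming_params(["category", "brand"], sort="price asc"))
print(query_string)
```

The point is only that the warming query performs the same kind of operation (faceting, sorting) as the slow first request, so the relevant index data is read before users hit it.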
I've been trying to build a query that I can use in newSearcher based off the
information in your previous e-mail. I thought you meant to build a *:* query
as per Query 1 in my previous e-mail but I'm still seeing the first-hit
execution.
Now I'm wondering if you meant to create a *:* query with
Did you try the autowarming like I mentioned in my previous e-mail?
> On Jun 16, 2020, at 10:18 AM, James Bodkin
> wrote:
>
> We've changed the schema to enable docValues for these fields and this led to
> an improvement in the response time. We found a further improvement by also
> switching
We've changed the schema to enable docValues for these fields and this led to
an improvement in the response time. We found a further improvement by also
switching off indexed as these fields are used for faceting and filtering only.
Since those changes, we've found that the first-execution for q
I question whether filterCache has anything to do with it. I suspect what’s
really happening is that the first time, you’re reading the relevant bits from disk
into memory. And to double check, you should have docValues enabled for all these
fields. The “uninverting” process can be very expensive, and
We've run the performance test after changing the fields to be of the type
string. We're seeing improved performance, especially after the first time the
query has run. The first run is taking around 1-2 seconds rather than 6-8
seconds and when the filter cache is present, the response time is a
Could you explain why the performance is an issue for points-based fields? I've
looked through the referenced issue (which is fixed in the version we are
running) but I'm missing the link between the two. Is there an issue to improve
this for points-based fields?
We're going to change the field
There’s a lot of confusion about using points-based fields for faceting, see:
https://issues.apache.org/jira/browse/SOLR-13227 for instance.
Two options you might try:
1> copyField to a string field and facet on that (won’t work, of course, for
any kind of interval/range facet)
2> use the deprec
We’ve been running a load test against our index and have noticed that the
facet queries are significantly slower than we would like.
Currently these types of queries are taking several seconds to execute, and we are
wondering if it would be possible to speed these up.
Repeating the same query over a
On 2/20/2018 1:18 AM, LOPEZ-CORTES Mariano-ext wrote:
We return a facet list of values in "motifPresence" field (person status).
Status:
[ ] status1
[x] status2
[x] status3
The user then selects 1 or multiple status (It's this step that we called "facet
filtering
solution?
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Monday, February 19, 2018 18:18
To: solr-user
Subject: Re: Facet performance problem
I'm confused here. What do you mean by "facet filtering"? Your examples have no
facets at all, just
I'm confused here. What do you mean by "facet filtering"? Your
examples have no facets at all, just a _filter query_.
I'll assume you want to use filter query (fq), and faceting has
nothing to do with it. This is one of the tricky bits of docValues.
While it's _possible_ to search on a field that'
Hi
We have the following environment:
3 nodes cluster
1 shard
Replication factor = 2
8GB per node
29 million documents
We facet on the field "motifPresence", defined as follows:
Once the user selects a motifPresence filter, we execute the search again with:
fq: (value1 OR value2 OR value3 OR
On Mon, Oct 23, 2017 at 3:06 PM, John Davis wrote:
> Hello,
>
> We are seeing really slow facet performance with new solr release. This is
> on an index of 2M documents. A few things we've tried:
What happens when you run this facet request again?
The first time a UIF faceting
have a look for more background information:
https://issues.apache.org/jira/browse/SOLR-8096
it's not only related to version 6.6. It's a question of design since 5.x
Günter
On 23.10.2017 21:06, John Davis wrote:
Hello,
We are seeing really slow facet performance with new so
John Davis wrote:
> We are seeing really slow facet performance with new solr release.
> This is on an index of 2M documents.
I am currently running some performance experiments on simple String faceting,
comparing Solr 4 & 6. There is definitely a performance difference, but it is
n
Hello,
We are seeing really slow facet performance with new solr release. This is
on an index of 2M documents. A few things we've tried:
1. method=uif however that didn't help much (the facet fields have
docValues=false since they are multi-valued). Debug info below.
2. changing query
gex:
is there a way to narrow it down to just one regex term, e.g. *thing*, so it
will find soothing,
somthing, everything each with their number of occurrences for the document?
thanks,
~
From: Yonik Seeley
Sent: 24 May 2017
On Mon, May 8, 2017 at 11:27 AM, Yonik Seeley wrote:
> I opened https://issues.apache.org/jira/browse/SOLR-10634 to address
> this performance issue.
OK, this has been committed.
A quick test shows about a 30x speedup when faceting on a
string/numeric docvalues field with 100K unique values and d
ming at that case?
>
> Please advise.
>
> Thanks
> Mikhail
>
> -Original Message-
> From: Yonik Seeley [mailto:ysee...@gmail.com]
> Sent: Sunday, May 07, 2017 6:25 PM
> To: solr-user@lucene.apache.org
> Subject: Re: JSON facet performance for aggregations
>
recommend streaming at that case?
Please advise.
Thanks
Mikhail
-Original Message-
From: Yonik Seeley [mailto:ysee...@gmail.com]
Sent: Sunday, May 07, 2017 6:25 PM
To: solr-user@lucene.apache.org
Subject: Re: JSON facet performance for aggregations
OK, so I think I know what's going on.
> Thanks
> Mikhail
>
> -Original Message-
> From: Yonik Seeley [mailto:ysee...@gmail.com]
> Sent: Sunday, April 30, 2017 1:04 PM
> To: solr-user@lucene.apache.org
> Subject: Re: JSON facet performance for aggregations
>
> It is odd there would be quite such a big perfor
Hi Yonik,
We are using Solr 6.5
Both studentId and grades are double:
We have 1.5 million records.
Thanks
Mikhail
-Original Message-
From: Yonik Seeley [mailto:ysee...@gmail.com]
Sent: Sunday, April 30, 2017 1:04 PM
To: solr-user@lucene.apache.org
Subject: Re: JSON facet performance
ms:
>
> json.facet={
>studentId:{
> type:terms,
> limit:-1,
> field:" studentId "
>
>}
> }
>
>
> Thanks
>
>
> -Original Message-
> From: Vijay Tiwary [mailto:vijaykr.tiw...@gmail.com]
> Sent: Sunday, April 30, 2017 10:44 AM
-
From: Vijay Tiwary [mailto:vijaykr.tiw...@gmail.com]
Sent: Sunday, April 30, 2017 10:44 AM
To: solr-user@lucene.apache.org
Subject: RE: JSON facet performance for aggregations
Please enable doc values and try.
There is a bug in the source code which causes json facet on string field to
run ver
ld.
> It is huge difference between json and flat here. Do you know the reason
> for this? Is there a way to improve it ?
>
> -Original Message-
> From: Vijay Tiwary [mailto:vijaykr.tiw...@gmail.com]
> Sent: Sunday, April 30, 2017 9:58 AM
> To: solr-user@lucene.apache.org
>
@lucene.apache.org
Subject: Re: JSON facet performance for aggregations
JSON facets on string fields run a lot slower than on numeric fields. Try and see
if you can represent studentid as a numeric field.
On Apr 30, 2017 1:19 PM, "Mikhail Ibraheem"
wrote:
> Hi,
>
> I am trying to do ag
ds :
> stats=true&facet=true&stats.field={!tag=piv1
> sum=true}grades&facet.pivot={!stats=piv1}studentId
>
>
>
> We are hoping to use one approach, json or flat, for all our services. JSON
> facet performance is better for many cases.
>
>
>
> Please advise on why the performance for this is so bad and if we can
> improve it. Also what is the default algorithm used for json facet.
>
>
>
> Thanks
>
> Mikhail
>
&stats.field={!tag=piv1
sum=true}grades&facet.pivot={!stats=piv1}studentId
We are hoping to use one approach, json or flat, for all our services. JSON facet
performance is better for many cases.
Please advise on why the performance for this is so bad and if we can improve
it. Also what i
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/prefix-facet-performance-tp4330684p4331553.html
> Sent from the Solr - User mailing list archive at Nabble.com.
I see. Once I specify a prefix the number of terms is MUCH smaller.
Thank you again for all your help.
Maria
On Fri, Apr 21, 2017 at 1:46 PM, Yonik Seeley wrote:
> On Fri, Apr 21, 2017 at 4:25 PM, Maria Muslea
> wrote:
> > The field is:
> >
> >
> >
> > and using unique() I found that it has
On Fri, Apr 21, 2017 at 4:25 PM, Maria Muslea wrote:
> The field is:
>
>
>
> and using unique() I found that it has 700K+ unique values.
>
> The query before (that takes ~10s):
>
> wt=json&indent=true&q=*:*&rows=0&facet=true&facet.field=concept&facet.prefix=A/
>
> the query after (that is almost
t, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.
> nabble.com/prefix-facet-performance-tp4330684p4331309.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
e schema config for your
> field,
>
> Cheers
>
>
>
> -
> ---
> Alessandro Benedetti
> Search Consultant, R&D Software Engineer, Director
> Sease Ltd. - www.sease.io
> --
> View this message in context: http://lucene.472066.n3.
> nabble.co
.
Let's start from the algorithm you are using and the schema config for your
field,
Cheers
-
---
Alessandro Benedetti
Search Consultant, R&D Software Engineer, Director
Sease Ltd. - www.sease.io
--
View this message in context:
http://lucene.472066.n3.nabble.com/pre
Hmmm, not sure. Probably in the range of 100K-500K.
Before writing the email I was just looking at:
http://yonik.com/facet-performance/
Wow, using facet.method=enum makes a big difference. I will read on it to
understand what it does.
Thank you so much.
Maria
On Tue, Apr 18, 2017 at 5:21 PM
How many unique values in the index?
You could try facet.method=enum
-Yonik
On Tue, Apr 18, 2017 at 8:16 PM, Maria Muslea wrote:
> Hi,
>
> I have ~40K documents in SOLR (not many) and a multivalued facet field that
> contains at least 2K values per document.
>
> The values of the facet field lo
Hi,
I have ~40K documents in SOLR (not many) and a multivalued facet field that
contains at least 2K values per document.
The values of the facet field look like: A/B, A/C, A/D, C/E, M/F, etc, and
I use facet.prefix.
q=*:*&rows=0&facet=true&facet.field=concept&facet.prefix=A/
with "concept" de
Awesome, thanks for the feedback!
-Yonik
On Tue, Dec 22, 2015 at 5:36 PM, Aigner, Max wrote:
> I'm happy to report that we are seeing significant speed-ups in our queries
> with Json facets on 5.4 vs regular facets on 5.1. Our queries contain mostly
> terms facets, many of them with exclusion
I'm happy to report that we are seeing significant speed-ups in our queries
with Json facets on 5.4 vs regular facets on 5.1. Our queries contain mostly
terms facets, many of them with exclusion tags and prefix filtering.
Nice work!
t
> the performance, it is improved about 2.5 times compared with no parameter.
> So, It is improved a lot under your advice, thanks a lot.
> 2.
> Now I have another performance issue, It's the group performance.
> The number of data is as same as facet perform
s,
Alice Yang
+86-021-51530666*41493
Floor 19,KaiKai Plaza,888,Wanhandu Rd,Shanghai(200042)
-Original Message-
From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
Sent: May 24, 2014 15:17
To: solr-user@lucene.apache.org
Subject: RE: (Issue) How improve solr facet performance
Alice.H.Yang (mis.cnsh0
Alice.H.Yang (mis.cnsh04.Newegg) 41493 [alice.h.y...@newegg.com] wrote:
> 1. I'm sorry, I have made a mistake, the total number of documents is 32
> Million, not 320 Million.
> 2. The system memory is large for solr index, OS total has 256G, I set the
> solr tomcat HEAPSIZE="-Xms25G -Xmx100G"
ay 23, 2014 8:08 PM
To: d...@lucene.apache.org
Subject: Re: (Issue) How improve solr facet performance
On Fri, 2014-05-23 at 11:45 +0200, Alice.H.Yang (mis.cnsh04.Newegg)
41493 wrote:
> We are blocked by solr facet performance when query hits many
> documents. (about 10,000,000)
[320M document
On Tue, October 22, 2013 5:23 PM Michael Lemke wrote:
>On Tue, October 22, 2013 9:23 AM Toke Eskildsen wrote:
>>On Mon, 2013-10-21 at 16:57 +0200, Lemke, Michael SZ/HZA-ZSW wrote:
>>> QTime fc:
>>>never returns, webserver restarts itself after 30 min with 100% CPU
>>> load
>>
>>It might be
On Tue, 2013-10-22 at 17:25 +0200, Lemke, Michael SZ/HZA-ZSW wrote:
> On Tue, October 22, 2013 11:54 AM Andre Bois-Crettez wrote:
> >> This is with Solr 1.4.
> >Really?
> >This sounds really outdated to me.
> >Have you tried a more recent version? 4.5 just came out.
>
> Sorry, can't. Too m
On Tue, October 22, 2013 11:54 AM Andre Bois-Crettez wrote:
>
>> This is with Solr 1.4.
>Really?
>This sounds really outdated to me.
>Have you tried a more recent version? 4.5 just came out.
Sorry, can't. Too much `grown' stuff.
Michael
On Tue, October 22, 2013 9:23 AM Toke Eskildsen wrote:
>On Mon, 2013-10-21 at 16:57 +0200, Lemke, Michael SZ/HZA-ZSW wrote:
>> QTime fc:
>>never returns, webserver restarts itself after 30 min with 100% CPU
>> load
>
>It might be because it dies due to garbage collection. But since more
>m
This is with Solr 1.4.
Really?
This sounds really outdated to me.
Have you tried a more recent version? 4.5 just came out.
--
André Bois-Crettez
Software Architect
Search Developer
http://www.kelkoo.com/
Kelkoo SAS
Société par Actions Simplifiée
Au capital de € 4.168.964,30
Siège socia
On Mon, 2013-10-21 at 16:57 +0200, Lemke, Michael SZ/HZA-ZSW wrote:
> QTime enum:
> 1st call: 1200
> subsequent calls: 200
Those numbers seem fine.
> QTime fc:
>never returns, webserver restarts itself after 30 min with 100% CPU
> load
It might be because it dies due to garba
On Mon, October 21, 2013 10:04 AM, Toke Eskildsen wrote:
>On Fri, 2013-10-18 at 18:30 +0200, Lemke, Michael SZ/HZA-ZSW wrote:
>> Toke Eskildsen wrote:
>> > Unfortunately the enum-solution is normally quite slow when there
>> > are enough unique values to trigger the "too many values"-exception.
>
On Fri, 2013-10-18 at 18:30 +0200, Lemke, Michael SZ/HZA-ZSW wrote:
> Toke Eskildsen [mailto:t...@statsbiblioteket.dk] wrote:
> > Unfortunately the enum-solution is normally quite slow when there
> > are enough unique values to trigger the "too many values"-exception.
> > [...]
>
> [...] And yes
: >> 1.
q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0
: >> 2.
q=word&facet.field=CONTENT&facet=true&facet.prefix=a&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0
: >
: >> The only difference is an empty facet.prefix in the
DocValues is the new black
http://wiki.apache.org/solr/DocValues
Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
SOLR Performance Monitoring -- http://sematext.com/spm
On Fri, Oct 18, 2013 at 12:30 PM, Lemke, Michael SZ/HZA-ZSW
wrote:
> Toke Eskildsen [mailto:t...@statsbiblioteke
Toke Eskildsen [mailto:t...@statsbiblioteket.dk] wrote:
>Lemke, Michael SZ/HZA-ZSW [lemke...@schaeffler.com] wrote:
>> 1.
>> q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0
>> 2.
>> q=word&facet.field=CONTENT&facet=true&facet.prefix=a&
Lemke, Michael SZ/HZA-ZSW [lemke...@schaeffler.com] wrote:
> 1.
> q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0
> 2.
> q=word&facet.field=CONTENT&facet=true&facet.prefix=a&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0
> T
I am working with Solr facet fields and have come across a
performance problem I don't understand. Consider these
two queries:
1.
q=word&facet.field=CONTENT&facet=true&facet.prefix=&facet.limit=10&facet.mincount=1&facet.method=enum&rows=0
2.
q=word&facet.field=CONTENT&facet=true&facet.prefix=a&fac
rch :: http://search-lucene.com/
>
>
>
> - Original Message
> > From: Howard Lee
> > To: solr-user@lucene.apache.org
> > Sent: Mon, January 10, 2011 8:59:03 AM
> > Subject: Multivalued fields and facet performance
> >
> > Hi,
> >
> >
r-user@lucene.apache.org
> Sent: Mon, January 10, 2011 8:59:03 AM
> Subject: Multivalued fields and facet performance
>
> Hi,
>
> I'd appreciate some explanation on what may be going on in the following
> scenario using multivalued fields and facets.
>
> Solr ve
Hi,
I'd appreciate some explanation on what may be going on in the following
scenario using multivalued fields and facets.
Solr version: 1.5
Our index contains 35 million docs, and our search is using 2 multivalued
fields as facets. There are approx 5 million different values in one field
and 50
I have a facet field whose values are created by users. So potentially there
could be a very large number of values. is that going to be a problem
performance-wise?
A few more questions to help me understand how facet works:
- after the filter cache warmed up, will the (if any) performance probl
> On Thu, Aug 13, 2009 at 9:55 AM, Fuad Efendi wrote:
>> It seems BOBO-Browse is alternate faceting engine; would be interesting to
>> compare performance with SOLR... Distributed?
>>
>>
>> -Original Message-
>> From: Jason Rutherglen [mailto:jason.rutherg.
h SOLR... Distributed?
>
>
> -Original Message-
> From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com]
> Sent: August-12-09 6:12 PM
> To: solr-user@lucene.apache.org
> Subject: Re: facet performance tips
>
> For your fields with many terms you may want to
uld be interesting to
> compare performance with SOLR... Distributed?
>
>
> -Original Message-
> From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com]
> Sent: August-12-09 6:12 PM
> To: solr-user@lucene.apache.org
> Subject: Re: facet performance tips
>
> For
Interesting, it has "BoboRequestHandler implements SolrRequestHandler"
- easy to try it; and shards support
[Fuad Efendi] It seems BOBO-Browse is alternate faceting engine; would be
interesting to
compare performance with SOLR... Distributed?
[Jason Rutherglen] For your fields with many terms
It seems BOBO-Browse is alternate faceting engine; would be interesting to
compare performance with SOLR... Distributed?
-Original Message-
From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com]
Sent: August-12-09 6:12 PM
To: solr-user@lucene.apache.org
Subject: Re: facet
vé [mailto:jerome.et...@gmail.com]
Sent: August-13-09 5:38 AM
To: solr-user@lucene.apache.org
Subject: Re: facet performance tips
Thanks everyone for your advice.
I increased my filterCache, and the faceting performance improved greatly.
My faceted field can have at the moment ~4 different t
Thanks everyone for your advice.
I increased my filterCache, and the faceting performance improved greatly.
My faceted field can have at the moment ~4 different terms, so I
did set a filterCache size of 5 and it works very well.
However, I'm planning to increase the number of terms to
Note that depending on the profile of your field (full text and how many
unique terms on average per document), the improvements from 1.4 may not
apply, as you may exceed the limits of the new faceting technique in Solr
1.4.
-Stephen
On Wed, Aug 12, 2009 at 2:12 PM, Erik Hatcher wrote:
> Yes, in
>
>
>
> -Original Message-
> From: Erik Hatcher [mailto:ehatc...@apache.org]
> Sent: August-12-09 2:12 PM
> To: solr-user@lucene.apache.org
> Subject: Re: facet performance tips
>
> Yes, increasing the filterCache size will help with Solr 1.3
> performance.
al Message-
From: Erik Hatcher [mailto:ehatc...@apache.org]
Sent: August-12-09 2:12 PM
To: solr-user@lucene.apache.org
Subject: Re: facet performance tips
Yes, increasing the filterCache size will help with Solr 1.3
performance.
Do note that trunk (soon Solr 1.4) has dramatically improved fac
Yes, increasing the filterCache size will help with Solr 1.3
performance.
Do note that trunk (soon Solr 1.4) has dramatically improved faceting
performance.
Erik
On Aug 12, 2009, at 1:30 PM, Jérôme Etévé wrote:
Hi everyone,
I'm using some faceting on a solr index containing ~ 1
-Original Message-
From: Jérôme Etévé [mailto:jerome.et...@gmail.com]
Sent: Wednesday, August 12, 2009 12:31 PM
To: solr-user@lucene.apache.org
Subject: facet performance tips
Hi everyone,
I'm using some faceting on a solr index containing ~ 160K documents.
I perform facets on multivalued s
Hi everyone,
I'm using some faceting on a solr index containing ~ 160K documents.
I perform facets on multivalued string fields. The number of possible
different values is quite large.
Enabling facets degrades the performance by a factor of 3.
Because I'm using solr 1.3, I guess the faceting mak
Hoss,
This is still an extremely interesting area for possible improvements; I simply
don't want the topic to die
http://www.nabble.com/Facet-Performance-td7746964.html
http://issues.apache.org/jira/browse/SOLR-665
http://issues.apache.org/jira/browse/SOLR-667
http://issues.apache.org/jira/b
Erik Hatcher wrote:
On Dec 8, 2006, at 2:15 PM, Andrew Nagy wrote:
My data is 492,000 records of book data. I am faceting on 4 fields:
author, subject, language, format.
Format and language are fairly simple as there are only a few unique
terms. Author and subject however are much differe
: Unfortunately which strategy will be chosen is currently undocumented
: and control is a bit oblique: If the field is tokenized or multivalued
: or Boolean, the FilterQuery method will be used; otherwise the
: FieldCache method. I expect I or others will improve that shortly.
Bear in mind, wh
On Dec 8, 2006, at 2:15 PM, Andrew Nagy wrote:
My data is 492,000 records of book data. I am faceting on 4
fields: author, subject, language, format.
Format and language are fairly simple as there are only a few
unique terms. Author and subject however are much different in
that there are
J.J. Larrea wrote:
Unfortunately which strategy will be chosen is currently undocumented and
control is a bit oblique: If the field is tokenized or multivalued or Boolean,
the FilterQuery method will be used; otherwise the FieldCache method. I expect
I or others will improve that shortly.
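The selection rule as J.J. Larrea states it can be written down as a small sketch. This only restates the sentence above for the Solr version discussed in this 2006 thread; the function and argument names are hypothetical and this is not Solr's code.

```python
def choose_facet_strategy(tokenized: bool, multivalued: bool, boolean: bool) -> str:
    """Strategy choice as described in the thread (2006-era behaviour)."""
    if tokenized or multivalued or boolean:
        return "FilterQuery"  # one filter per term, served from the filterCache
    return "FieldCache"       # single-valued, untokenized fields


print(choose_facet_strategy(tokenized=False, multivalued=True, boolean=False))
print(choose_facet_strategy(tokenized=False, multivalued=False, boolean=False))
```

Under this rule a multivalued author field lands on the FilterQuery path, which is why the filterCache size matters so much in the rest of the thread.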
On 12/8/06, J.J. Larrea <[EMAIL PROTECTED]> wrote:
Unfortunately which strategy will be chosen is currently undocumented and
control is a bit oblique: If the field is tokenized or multivalued or Boolean,
the FilterQuery method will be used; otherwise the FieldCache method.
If anyone had time
Andrew Nagy, ditto on what Yonik said. Here is some further elaboration:
I am doing much the same thing (faceting on Author etc.). When my Author field
was defined as a solr.TextField, even using solr.KeywordTokenizerFactory so it
wasn't actually tokenized, the faceting code chose the QueryFilt
Yonik Seeley wrote:
Are they multivalued, and do they need to be?
Anything that is of type "string" and not multivalued will use the
lucene FieldCache rather than the filterCache.
The author field is multivalued. Will this be a strong performance issue?
I could make multiple author fields as
On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote:
Chris Hostetter wrote:
>: Could you suggest a better configuration based on this?
>
>If that's what your stats look like after a single request, then i would
>guess you would need to make your cache size at least 1.6 million in order
>for it to
Chris Hostetter wrote:
: Could you suggest a better configuration based on this?
If that's what your stats look like after a single request, then i would
guess you would need to make your cache size at least 1.6 million in order
for it to be of any use in improving your facet speed.
Would th
On 12/8/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: My data is 492,000 records of book data. I am faceting on 4 fields:
: author, subject, language, format.
: Format and language are fairly simple as there are only a few unique
: terms. Author and subject however are much different in that
: Here are the stats, I'm still a newbie to SOLR, so I'm not totally sure
: what this all means:
: lookups : 1530036
: hits : 2
: hitratio : 0.00
: inserts : 1530035
: evictions : 1504435
: size : 25600
those numbers are telling you that your cache is capable of holding 25,600
items. you have attem
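The quoted cache statistics can be checked with a few lines of arithmetic. The numbers are taken from the stats above; the variable names are only for illustration.

```python
stats = {
    "lookups": 1530036,
    "hits": 2,
    "inserts": 1530035,
    "evictions": 1504435,
    "size": 25600,
}

# Fraction of lookups answered from the cache.
hit_ratio = stats["hits"] / stats["lookups"]

# Entries still resident: everything inserted minus everything evicted.
resident = stats["inserts"] - stats["evictions"]

print(f"hit ratio: {hit_ratio:.2f}")  # hit ratio: 0.00
print(resident == stats["size"])      # True
```

Nearly every lookup missed and caused an insert, and all but 25,600 of those inserts were evicted again, so the cache is full and being churned within a single request.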
Yonik Seeley wrote:
On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote:
I changed the filterCache to the following:
However a search that normally takes .04s is taking 74 seconds once I
use the facets since I am faceting on 4 fields.
The first time or subsequent times?
Is your filterCa
On 12/8/06, Andrew Nagy <[EMAIL PROTECTED]> wrote:
I changed the filterCache to the following:
However a search that normally takes .04s is taking 74 seconds once I
use the facets since I am faceting on 4 fields.
The first time or subsequent times?
Is your filterCache big enough yet? Wha
Yonik Seeley wrote:
1) facet on single-valued strings if you can
2) if you can't do (1) then enlarge the fieldcache so that the number
of filters (one per possible term in the field you are filtering on)
can fit.
I changed the filterCache to the following:
However a search that normally t