Jason,

If I just use stock Solr 4.0 without modifying the source code, does that mean 
multi-value faceting will be very slow when I'm constantly inserting/updating 
documents? 

Which open source library are you referring to? Will Solr adopt this 
per-segment approach any time soon?

Thanks


________________________________
 From: Jason Rutherglen <jason.rutherg...@gmail.com>
To: solr-user@lucene.apache.org 
Sent: Saturday, July 7, 2012 2:05 PM
Subject: Re: Nrt and caching
 
Andy,

You'd need to hack on the Solr code, specifically the SimpleFacets class.
Solr uses UnInvertedField to build an in-memory doc -> terms mapping, which
would need to be cached per-segment.  Then you'd need to aggregate the
resultant per-segment counts.
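
A minimal sketch of that idea, in plain Java with no Solr or Lucene
dependency. Everything named here (PerSegmentFacetCache, facetCounts, the
string segment keys, the list-of-terms document model) is hypothetical and
is not the SimpleFacets or UnInvertedField API. It also assumes each
document lists a term at most once, and it ignores query filtering and
deleted documents, which a real implementation would have to handle:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of per-segment facet caching; not a Solr class. */
final class PerSegmentFacetCache {
    // Segment key -> (term -> doc count). A segment never changes once it is
    // written, so its entry stays valid until the segment is merged away.
    private final Map<String, Map<String, Integer>> cache = new HashMap<>();
    private int computations = 0;

    /** Stand-in for the expensive un-inverting step, done for one segment. */
    private Map<String, Integer> computeCounts(List<List<String>> docs) {
        computations++;
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> terms : docs) {
            for (String term : terms) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    /**
     * Returns facet counts over all live segments. Only segments that are
     * not yet cached are computed; the rest are reused and summed.
     */
    Map<String, Integer> facetCounts(Map<String, List<List<String>>> segments) {
        // Drop cache entries for segments that no longer exist.
        cache.keySet().retainAll(segments.keySet());

        Map<String, Integer> total = new HashMap<>();
        for (Map.Entry<String, List<List<String>>> e : segments.entrySet()) {
            Map<String, Integer> perSegment = cache.get(e.getKey());
            if (perSegment == null) {
                perSegment = computeCounts(e.getValue());
                cache.put(e.getKey(), perSegment);
            }
            // Aggregate the per-segment counts into the overall result.
            perSegment.forEach((term, n) -> total.merge(term, n, Integer::sum));
        }
        return total;
    }

    /** Number of times a segment had to be computed from scratch. */
    int computations() {
        return computations;
    }
}
```

With this shape, a soft commit that adds one small segment only costs one
new computeCounts call; the older segments are served from the cache. The
price is the extra summing loop that every request now pays.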

There is another open source library that has taken the same basic faceting
approach (it is per-segment), and could, loosely speaking, be faster; however,
it is built for Lucene 3.x at the moment.

On Sat, Jul 7, 2012 at 12:21 PM, Andy <angelf...@yahoo.com> wrote:

> So if I want to use multi-value faceting with NRT, I'd need to convert the
> cache to per-segment? How do I do that?
>
> Thanks.
>
>
> ________________________________
>  From: Jason Rutherglen <jason.rutherg...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Saturday, July 7, 2012 11:32 AM
> Subject: Re: Nrt and caching
>
> The field caches are per-segment, which are used for sorting and basic
> [slower] facets.  The result set, document, filter, and multi-value facet
> caches are [in Solr] per-multi-segment.
>
> Of these, the document, filter, and multi-value facet caches could be
> converted to be [performant] per-segment, as with some other Apache
> licensed Lucene based search engines.
>
> On Sat, Jul 7, 2012 at 10:42 AM, Yonik Seeley <yo...@lucidimagination.com>
> wrote:
>
> > On Sat, Jul 7, 2012 at 9:59 AM, Jason Rutherglen
> > <jason.rutherg...@gmail.com> wrote:
> > > Currently the caches are stored per-multiple-segments, meaning after
> > > each 'soft' commit, the cache(s) will be purged.
> >
> > Depends which caches.  Some caches are per-segment, and some caches
> > are top level.
> > It's also a trade-off... for some things, per-segment data structures
> > would indeed turn around quicker on a reopen, but every query would be
> > slower for it.
> >
> > -Yonik
> > http://lucidimagination.com
> >
>
