On Thu, Sep 24, 2015 at 10:16 AM, Alessandro Benedetti
<benedetti.ale...@gmail.com> wrote:
> Yonik, I am really excited about the JSON faceting module.
> I find it really interesting.
> Are there any pros/cons to using it, or is it definitely the "approach of
> the future"?

Thanks!

The main con of the new stuff is that it doesn't yet have everything the
old stuff has.  But it already has new stuff that the old stuff
doesn't have (like sorting by any statistic and rudimentary block-join
integration).
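
Sorting buckets by a statistic might look roughly like the sketch below.
The host, the collection name and the publish_year field are assumptions
made up for illustration; only the general shape of the request (a
sub-facet statistic referenced from "sort") is the point.  The curl call
is wrapped in a function so nothing is sent until you invoke it against
a live Solr instance.

```shell
# Hypothetical endpoint; adjust host and collection to your setup.
SOLR_URL="http://localhost:8983/solr/hebis/query"

# Terms facet on author_facet, with buckets ordered by a computed
# statistic instead of by count. "publish_year" is an assumed numeric
# field; "avg_year" is just a label chosen for this example.
JSON_FACET='{
  authors : {
    type  : terms,
    field : author_facet,
    limit : 30,
    sort  : "avg_year desc",
    facet : { avg_year : "avg(publish_year)" }
  }
}'

# Sends the query; --data-urlencode takes care of escaping the JSON.
run_query() {
  curl "$SOLR_URL" \
    --data-urlencode 'q=darwin' \
    --data-urlencode "json.facet=$JSON_FACET"
}
```

Calling run_query should return the top 30 authors ordered by the
average of the assumed publish_year field rather than by document count.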

And yes, I do see it as "the future", a platform for integrating the
disparate features that have been developed for Solr over time, but
don't always work that well together:
 - search
 - statistics
 - grouping
 - joins


> I saw your benchmarks and they seem impressive.
>
> I have not read the whole thread in detail, just briefly, but is JSON
> faceting using different faceting algorithms from the standard ones
> (enum and fc)?

I wouldn't say different fundamental algorithms yet... (compared to
4.10) but different code (to support some of the new features) and in
some places more optimized.

> I cannot find the algorithm parameter to be passed in the JSON facets.

There is an undocumented "method" parameter - I need to enable that to
allow switching between the docvalues approach and the UnInvertedField
approach.
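
Once that switch is enabled, choosing the approach per facet could look
something like the sketch below.  This is an assumption, not something
confirmed in this thread: it presumes "method" accepts the values "uif"
(UnInvertedField) and "dv" (docValues), as it does in later Solr
releases, and the host and collection names are placeholders.

```shell
# Builds a json.facet body for the author_facet field.
# $1 = facet method to request (assumed values: uif, dv).
facet_body() {
  printf '{ authors : { type:terms, field:author_facet, limit:30, method:%s } }' "$1"
}

JSON_FACET_UIF=$(facet_body uif)   # UnInvertedField approach
JSON_FACET_DV=$(facet_body dv)     # docValues approach

# To compare timings against a live (hypothetical) instance:
# curl http://localhost:8983/solr/hebis/query \
#   --data-urlencode 'q=darwin' \
#   --data-urlencode "json.facet=$JSON_FACET_UIF"
```

Running the same query with each body and comparing QTime would be one
way to see which approach suits a given field.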

-Yonik


> Are they using a completely different approach?
> Is the algorithm used documented anywhere?
> This could give very good insight into when to use them.
>
> Cheers
>
> 2015-09-24 14:58 GMT+01:00 Yonik Seeley <ysee...@gmail.com>:
>
>> On Mon, Sep 21, 2015 at 8:09 AM, Uwe Reh <r...@hebis.uni-frankfurt.de>
>> wrote:
>> > our bibliographic index (~20M entries) runs fine with Solr 4.10.3
>> > With Solr 5.3 faceted searching is constantly incredibly slow (~20 seconds)
>> [...]
>> >
>> > The 'fieldValueCache' seems to be unused (no inserts nor lookups) in Solr
>> > 5.3. In Solr 4.10 the 'fieldValueCache' is in heavy use with a
>> > cumulative_hitratio of 1.
>>
>>
>> Indeed.  Use of the fieldValueCache (UnInvertedField) was secretly
>> removed as part of LUCENE-5666, causing these performance regressions.
>>
>> This code had evolved over years to be very fast for specific use
>> cases.  No one facet algorithm is going to be optimal for everyone, so
>> it's important we have multiple.  But use of the UnInvertedField was
>> removed without any notification or discussion whatsoever (and
>> obviously no benchmarking), and Solr devs only discovered later, in
>> SOLR-7190, that it was essentially dead code.
>>
>>
>> When I brought back my "JSON Facet API" work to Solr (which was based
>> on 4.10.x) it came with a heavily modified version of UnInvertedField
>> that is available via the JSON Facet API.  It might currently work
>> better for your use case.
>>
>> On your normal (non-docValues) index, you can try something like the
>> following to see what the performance would be:
>>
>> $ curl http://yxz/solr/hebis/query -d 'q=darwin&
>> json.facet={
>>   authors : { type:terms, field:author_facet, limit:30 },
>>   material_access : { type:terms, field:material_access, limit:30 },
>>   material_brief : { type:terms, field:material_brief, limit:30 },
>>   rvk : { type:terms, field:rvk_facet, limit:30 },
>>   lang : { type:terms, field:language, limit:30 },
>>   dept : { type:terms, field:department_3, limit:30 }
>> }'
>>
>> There were other changes in LUCENE-5666 that will probably slow down
>> faceting on the single valued fields as well (so this may still be a
>> little slower than 4.10.x), but hopefully it would be more
>> competitive.
>>
>> -Yonik
>>
>
>
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card - http://about.me/alessandro_benedetti
> Blog - http://alexbenedetti.blogspot.co.uk
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
