Thank you Yonik.

So I would probably advise people to "keep your indexed=true" and think
about _adding_ docValues when there is memory pressure or a clear
performance issue for _specific_ uses.
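
To make that concrete, here is a rough schema.xml sketch. The field
names are made up for illustration, and it assumes the stock "string"
(solr.StrField) field type; treat it as one possible layout, not a
recommendation for any particular schema.

```xml
<!-- Hypothetical field names; assumes the stock "string" (solr.StrField) type. -->

<!-- Faceted and filtered on: keep indexed="true" for efficient filters,
     and add docValues for the faceting/sorting side. -->
<field name="category" type="string" indexed="true" stored="true" docValues="true"/>

<!-- Sort-only: never searched or returned, so docValues alone may be enough. -->
<field name="title_sort" type="string" indexed="false" stored="false" docValues="true"/>
```

Note that turning docValues on for an existing field means reindexing
the documents that carry that field.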

But if we are keeping indexed=true, then adding docValues=true will
STILL use at least as much memory, however efficient docValues are
themselves, right? Or will something that is normally loaded and uses
memory stay unloaded in this combined scenario?

Regards,
   Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 9 November 2015 at 11:57, Yonik Seeley <ysee...@gmail.com> wrote:
> On Mon, Nov 9, 2015 at 11:19 AM, Alexandre Rafalovitch
> <arafa...@gmail.com> wrote:
>> I thought docValues were per segment, so the price of un-inversion was
>> effectively paid on each commit for all the segments, as opposed to
>> just the updated one.
>
> Both the field cache (i.e. uninverting indexed values) and docValues
> are mostly per-segment (I say mostly because some uses still require
> building a global ord map).
>
> But even when things are mostly per-segment, you hit major segment
> merges and the cost of un-inversion (when you aren't using docValues)
> is non-trivial.
>
>> I admit I also find the story around docValues to be very confusing at
>> the moment. Especially on the interplay with "indexed=false".
>
> You still need "indexed=true" for efficient filters on the field.
> Hence if you're faceting on a field and want to use docValues, you
> probably want to keep the "indexed=true" on the field as well.
>
> -Yonik
>
>
>> It would
>> make a VERY good article to have this clarified somehow by people in
>> the know.
>>
>> Regards,
>>    Alex.
>> ----
>> Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
>> http://www.solr-start.com/
>>
>>
>> On 9 November 2015 at 11:04, Yonik Seeley <ysee...@gmail.com> wrote:
>>> On Mon, Nov 9, 2015 at 10:55 AM, Demian Katz <demian.k...@villanova.edu> 
>>> wrote:
>>>> I understand that by adding "docValues=true" to some of my fields, I can 
>>>> improve sorting/faceting performance.
>>>
>>> I don't think this is true in the general sense.
>>> docValues are built at index-time, so what you will save is initial
>>> un-inversion time (i.e. the first time a field is used after a new
>>> searcher is opened).
>>> After that point, docValues may be slightly slower.
>>>
>>> The other advantage of docValues is memory use... much/most of it is
>>> essentially "off-heap", being memory-mapped from disk.  This cuts down
>>> on memory issues and helps reduce longer GC pauses.
>>>
>>> docValues are good in general, and I think we should default to them
>>> more for Solr 6, but they are not better in all ways.
>>>
>>>> However, I have a couple of questions:
>>>>
>>>>
>>>> 1.)    Will Solr always take proper advantage of docValues when it is 
>>>> turned on
>>>
>>> Yes.
>>>
>>>> , or will I gain greater performance by turning off stored/indexed in
>>>> situations where only docValues are necessary (e.g. a sort-only field)?
>>>>
>>>> 2.)    Will adding docValues to a field introduce significant performance 
>>>> penalties for non-docValues uses of that field, beyond the obvious fact 
>>>> that the additional data will consume more disk and memory?
>>>
>>> No, it's a separate part of the index.
>>>
>>> -Yonik
>>>
>>>
>>>> I'm asking this question because the existing schema has some 
>>>> multi-purpose fields, and I'm trying to determine whether I should just 
>>>> add "docValues=true" wherever it might help, or if I need to take a more 
>>>> thoughtful approach and potentially split some fields with copyFields, 
>>>> etc. This is particularly significant because my schema makes use of some 
>>>> dynamic field suffixes, and I'm not sure if I need to add new suffixes to 
>>>> differentiate docValues/non-docValues fields, or if it's okay to turn on 
>>>> docValues across the board "just in case."
>>>>
>>>> Apologies if these questions have already been answered - I couldn't find 
>>>> a totally clear answer in the places I searched.
>>>>
>>>> Thanks!
>>>>
>>>> - Demian
