I think where Ishan is going with his question is this:

1. _version_ never needs to be searchable, so indexed=false makes sense.

2. _version_ typically needs to be read when performing an update and, 
possibly, a delete, so stored=true makes sense.

3. _version_ would never be used for either sorting or faceting.

4. Given the above, is using docValues=true for _version_ a good idea?

Looking at the documentation:
https://cwiki.apache.org/confluence/display/solr/DocValues

And a bit more background:
http://lucidworks.com/blog/fun-with-docvalues-in-solr-4-2/

My take is a simple “no”. Since docValues is, in essence, column-oriented 
storage (and can be seen, I think, as an alternate index format), what benefit 
is to be gained for the _version_ field? The primary benefits of docValues 
are in sorting and faceting operations (maybe grouping?). These 
operations are never performed on the _version_ field, are they?
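
To make the column-oriented point concrete, here is a toy Python model. It is 
only a sketch: it is not Lucene code, and the names (stored_rows, doc_values, 
the helper functions) and the sample data are made up for illustration. Stored 
fields keep all fields of one document together; docValues keep one field's 
values for all documents together.

```python
# Toy model only: contrasts row-oriented and column-oriented layouts.
# This is not how Lucene lays data out on disk; all names are hypothetical.

# Row-oriented ("stored fields"): one record per document.
stored_rows = [
    {"id": "a", "price": 30, "_version_": 1503},
    {"id": "b", "price": 10, "_version_": 1501},
    {"id": "c", "price": 20, "_version_": 1502},
]

# Column-oriented ("docValues"): one array per field, indexed by doc number.
doc_values = {
    field: [row[field] for row in stored_rows]
    for field in ("id", "price", "_version_")
}


def version_from_stored(doc_num):
    """Fetch one document's whole record, then pick the field out of it."""
    return stored_rows[doc_num]["_version_"]


def version_from_doc_values(doc_num):
    """Read a single slot of the per-field column."""
    return doc_values["_version_"][doc_num]


def sort_docs_by(field):
    """Sorting reads only one column, which is where this layout pays off."""
    column = doc_values[field]
    return sorted(range(len(column)), key=column.__getitem__)
```

Both lookups return the same value for a given document; the difference is in 
what has to be read to get it. Only something like sort_docs_by, which scans 
one field across many documents, clearly favors the column layout.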

My remaining question is whether it makes sense to set indexed=false on 
_version_. The example schemas set indexed=true. Does Solr itself perform 
searches internally on _version_? If so, then indexed=true is required; 
otherwise, it seems like useless overhead.

Note: I have been using optimistic concurrency control in one application, so 
I am interested in this possible optimization. Have there been any changes in 
this space between 4.x and 5.x?
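
For context, this is roughly how I understand the _version_ check on update to 
behave, written as a toy in-memory Python model. It is a sketch under stated 
assumptions, not Solr code: ToyIndex, VersionConflict, and the counter used as 
a version clock are hypothetical names. The rules for the incoming _version_ 
value (greater than 1, equal to 1, negative, zero) follow the Solr reference 
guide as I read it.

```python
# Toy in-memory model of the _version_ check performed on update.
# ToyIndex and VersionConflict are hypothetical names; real Solr reports
# a failed check as an HTTP 409 (Conflict) error.
import itertools


class VersionConflict(Exception):
    pass


class ToyIndex:
    def __init__(self):
        self._docs = {}                     # id -> (version, fields)
        self._clock = itertools.count(100)  # stand-in for the version clock

    def current_version(self, doc_id):
        """Per-document lookup needed on every update; None if absent."""
        entry = self._docs.get(doc_id)
        return entry[0] if entry else None

    def update(self, doc_id, fields, version=0):
        found = self.current_version(doc_id)
        if version > 1 and found != version:
            raise VersionConflict(f"expected {version}, found {found}")
        if version == 1 and found is None:
            raise VersionConflict("document must exist")
        if version < 0 and found is not None:
            raise VersionConflict("document must not exist")
        # version == 0 means no check: plain overwrite.
        new_version = next(self._clock)
        self._docs[doc_id] = (new_version, dict(fields))
        return new_version
```

Whichever storage is used, the value behind current_version has to be 
retrievable per document on each update, which is why the choice between a 
stored field and docValues matters for this use case.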

Thanks,
Charlie

From: Joel Bernstein [mailto:joels...@gmail.com]
Sent: Monday, June 22, 2015 11:55 AM
To: lucene dev
Subject: Re: Version field as DV

In general, DocValues were built to support large-scale random access use cases 
such as faceting and sorting. They have performance characteristics similar to 
the FieldCache. But unlike the FieldCache, you can trade off memory and 
performance by selecting different DocValues formats.

Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, Jun 22, 2015 at 10:41 AM, Ishan Chattopadhyaya 
<ichattopadhy...@gmail.com> wrote:
Hi all,
I am looking to try out _version_ as a docvalue (SOLR-6337) as a precursor to 
SOLR-5944. Towards that, I want the _version_ field to be stored=false, 
indexed=false, docValues=true.
Does someone know about the performance implications of retrieving the 
_version_ as a docvalue, e.g. accessing docvalue vs. a stored field? Is there 
any known inefficiency when using a docvalue (as opposed to a stored field) due 
to random disk seeks, for example?
Regards,
Ishan


