On Thu, 2019-02-07 at 11:24 +0900, Yasufumi Mizoguchi wrote:
> Actually, stored is compressed but I believed that docValues was
> compressed in some strategies depending on field's values/density
> as following java doc says.
>
> https://lucene.apache.org/core/7_6_0/core/org/apache/lucene/codecs/lucene70/Lucene70DocValuesFormat.html

In scenarios with low diversity in Strings (city names for example),
DocValues de-duplication can work very well. It is hard to generally
compare the size of stored vs. doc values as the strategies are very
different and the relative difference is highly dependent on content.
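
To illustrate the de-duplication point, here is a toy sketch in Python.
It is not Lucene's actual on-disk format, and the function name and the
sample city list are made up for the example. It only shows the general
idea: each distinct string is stored once, and every document then
holds a small integer (an ordinal) instead of the full string.

```python
def dictionary_encode(values):
    """Toy ordinal encoding: one sorted list of distinct terms,
    plus one integer per document pointing into that list."""
    terms = sorted(set(values))
    ordinal_of = {term: i for i, term in enumerate(terms)}
    return terms, [ordinal_of[v] for v in values]


# Hypothetical low-diversity field: one city name per document.
cities = ["Aarhus", "Copenhagen", "Aarhus", "Odense", "Copenhagen", "Aarhus"]
terms, ords = dictionary_encode(cities)

# Decoding a document's value is a lookup by ordinal.
decoded = [terms[o] for o in ords]
```

With few distinct values and many documents, the term list stays tiny
and the per-document cost is just the ordinal. With mostly unique
values the term list grows with the corpus and the benefit largely
disappears, which is why the comparison depends so much on content.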

As for query performance, Shawn is technically correct that there will
be no impact on query performance (as long as you don't use
indexed=false, docValues=true). But the choice does influence document
retrieval time. Under most circumstances the difference will be small,
but if you retrieve a large number of documents or your corpus is large
(measured in documents), it can be significant:


https://lucene.apache.org/solr/guide/7_6/docvalues.html#retrieving-docvalues-during-search

Specifically, the Solr 7 series has poor doc values performance for
random access (which is what document retrieval uses) on indexes with
many documents.

- Toke Eskildsen, Royal Danish Library

