On 12/22/2017 12:45 PM, Tech Id wrote:
> It seems that stored="false" docValues="true" is the default in Solr's
> github and the recommended way to go.

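For reference, the setting in question, next to its stored counterpart, would look roughly like this in a schema. This is only a sketch: the field names are made up for illustration, and "string" is assumed to be a fieldType defined elsewhere in the schema.

```xml
<!-- Hypothetical field names; "string" is assumed to be a fieldType
     defined elsewhere in the schema. -->

<!-- Values kept in docValues only, nothing in stored field data -->
<field name="category_dv" type="string" indexed="true" stored="false" docValues="true"/>

<!-- Values kept in compressed stored field data only, no docValues -->
<field name="category_stored" type="string" indexed="true" stored="true" docValues="false"/>
```
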
Like most things in Solr, there's no simple answer. It depends.

For the purposes of information retrieval (not facets, grouping, or sorting), whether you want stored or docValues will depend on a number of factors.

Stored field data is compressed in the index. This means that it takes additional CPU processing to get the data from the index, but less data must be read from disk.

DocValues is stored very differently. With docValues, the data is NOT compressed, and all of the values for one field for all documents across the entire index segment are written in one place, separately from any other field's docValues data.

If you are returning all fields for a document and there are more than a few fields, then accessing stored data and decompressing it is probably going to be faster than accessing docValues data. For one thing, all the stored data for a single document is compressed and written together. With docValues, each field is in a different place, so multiple parts of the disk will need to be accessed to get results for multiple fields of a single document.

If the index is small enough that it can easily be cached by the OS, then docValues will probably be faster, because accessing the data will be lightning fast and no decompression step is necessary. But if the index is too big to be fully cached, then only experimentation will allow you to know which is better.

For facets, grouping, and/or sorting, using docValues instead of indexed data (indexed="true") will generally offer better performance, and WILL use less heap memory. Frequently, deciding which way performs better requires experimentation. Using indexed data and a larger heap could perform better in some situations.

For information retrieval, stored is *usually* better than docValues, but not always.

Thanks,
Shawn