[ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154456#comment-15154456
 ] 

Ishan Chattopadhyaya edited comment on SOLR-8220 at 2/19/16 4:40 PM:
---------------------------------------------------------------------

Notes for the reference guide:

Page: https://cwiki.apache.org/confluence/display/solr/Defining+Fields
{code}
Property=useDocValuesAsStored
Description=If the field has docValues enabled, setting this to true allows 
the field to be treated as a regular stored field (even if it has 
stored=false). The field is then returned alongside regular stored fields 
when they are requested using the fl parameter.
Values=true or false
Implicit default=false for schema versions <1.6, true for schema versions >=1.6
{code}
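
To make the implicit default concrete, here is a minimal sketch in Python (not Solr code) of how the effective value could be derived from the schema version and an optional explicit setting. The function name and its arguments are hypothetical and exist only for illustration; the rule itself is taken from the table above.

```python
from typing import Optional


def effective_use_doc_values_as_stored(
    schema_version: float, explicit: Optional[bool] = None
) -> bool:
    """Illustrative model only; the name and signature are assumptions."""
    # An explicit useDocValuesAsStored setting always wins.
    if explicit is not None:
        return explicit
    # Otherwise the implicit default depends on the schema version:
    # false for versions <1.6, true for versions >=1.6.
    return schema_version >= 1.6


if __name__ == "__main__":
    print(effective_use_doc_values_as_stored(1.5))
    print(effective_use_doc_values_as_stored(1.6))
    print(effective_use_doc_values_as_stored(1.6, explicit=False))
```

The property only has an effect on fields that have docValues enabled; the sketch does not model that condition.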


Page: https://cwiki.apache.org/confluence/display/solr/DocValues
{code}
<New section> Retrieving docValues during search:

Field values retrieved during search queries are typically returned from stored 
values. However, starting with schema version 1.6, all non-stored docValues 
fields are also returned along with stored fields when all fields (or 
pattern-matching globs) are requested (e.g. fl=*) in search queries. This 
behavior can be turned on or off by setting the useDocValuesAsStored property 
on a field or a field type to true (the implicit default since schema 
version 1.6) or false (the implicit default up to schema version 1.5). See 
https://cwiki.apache.org/confluence/display/solr/Defining+Fields

Note that enabling this property has performance implications because docValues 
are column-oriented and may therefore incur additional cost to retrieve for 
each returned document. Also note that when non-stored fields are returned from 
docValues (the default in schema versions 1.6+, unless useDocValuesAsStored is 
false), the values of a multi-valued field are returned in sorted order (not 
insertion order). If you require multi-valued fields to be returned in the 
original insertion order, make the multi-valued field stored (such a change 
requires re-indexing).
{code}
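
The ordering caveat for multi-valued fields can be illustrated with a small Python model (again, not Solr code). The function name is hypothetical, and Python's built-in sort stands in for whatever ordering docValues actually uses, which is an assumption; the sketch only shows that insertion order is lost when values come from docValues.

```python
from typing import List


def returned_values(indexed_values: List[str], stored: bool) -> List[str]:
    """Illustrative model only; the name and signature are assumptions."""
    if stored:
        # Stored fields keep the values in the order they were indexed.
        return list(indexed_values)
    # Values read back from docValues come out in sorted order instead.
    return sorted(indexed_values)


if __name__ == "__main__":
    values = ["pear", "apple", "mango"]
    print(returned_values(values, stored=True))
    print(returned_values(values, stored=False))
```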

Page: 
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-Thefl%28FieldList%29Parameter
{code}
Note: Starting with schema version 1.6, if there are non-stored fields with 
docValues enabled in the index, then a glob pattern like * in the fl parameter 
will retrieve those fields. This is not the case if those fields explicitly 
set useDocValuesAsStored to false in their field definition (see 
https://cwiki.apache.org/confluence/display/solr/Defining+Fields) or if the 
schema version is <1.6. However, something like fl=dvfield or fl=*,dvfield 
(where dvfield is a non-stored field with docValues enabled) would retrieve 
dvfield irrespective of the useDocValuesAsStored value. (See SOLR-8220 for more 
details.)
{code}
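
The difference between glob patterns and explicitly named fields in fl can be sketched as follows. This is a simplified Python model of the rule in the note above, not the Solr implementation: the Field class, resolve_fl, and the sample field names are all hypothetical, and other fl features are ignored.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Dict, List


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for the relevant schema field properties."""
    stored: bool
    doc_values: bool
    use_doc_values_as_stored: bool


def resolve_fl(fl: str, fields: Dict[str, Field]) -> List[str]:
    """Return the field names this simplified model would retrieve for fl."""
    returned = set()
    for token in filter(None, re.split(r"[,\s]+", fl)):
        is_glob = "*" in token or "?" in token
        for name, field in fields.items():
            if is_glob:
                # Globs pick up non-stored docValues fields only when
                # useDocValuesAsStored is in effect for the field.
                visible = field.stored or (
                    field.doc_values and field.use_doc_values_as_stored
                )
                if visible and fnmatchcase(name, token):
                    returned.add(name)
            elif name == token and (field.stored or field.doc_values):
                # Explicitly named fields are retrieved from docValues
                # irrespective of useDocValuesAsStored.
                returned.add(name)
    return sorted(returned)


# Sample fields, all hypothetical.
FIELDS = {
    "id": Field(stored=True, doc_values=False, use_doc_values_as_stored=False),
    "dvfield": Field(stored=False, doc_values=True, use_doc_values_as_stored=False),
    "dvdefault": Field(stored=False, doc_values=True, use_doc_values_as_stored=True),
}

if __name__ == "__main__":
    print(resolve_fl("*", FIELDS))
    print(resolve_fl("*,dvfield", FIELDS))
    print(resolve_fl("dvfield", FIELDS))
```

In this model, dvfield has useDocValuesAsStored set to false, so fl=* skips it while fl=*,dvfield and fl=dvfield still return it.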

Could someone please review and update the ref guide with the above 
information? And please feel free to reorganize, modify, drop, or rephrase any 
of this.



> Read field from docValues for non stored fields
> -----------------------------------------------
>
>                 Key: SOLR-8220
>                 URL: https://issues.apache.org/jira/browse/SOLR-8220
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Keith Laban
>            Assignee: Shalin Shekhar Mangar
>         Attachments: SOLR-8220-5x.patch, SOLR-8220-branch_5x.patch, 
> SOLR-8220-ishan.patch, SOLR-8220-ishan.patch, SOLR-8220-ishan.patch, 
> SOLR-8220-ishan.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, 
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, 
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, 
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, 
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, 
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, 
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, 
> SOLR-8220.patch
>
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values from docValues; 
> facets, analytics, streaming, etc. all seem to do it their own way. Perhaps 
> some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return field from docValue if the field is not stored and in docValues, 
> if the field is stored return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>    -- return stored fields and docValue fields
> 2a - would be easiest implementation and might be sufficient for a first 
> pass. 2b - is current behavior


