[jira] [Commented] (SOLR-8220) Read field from docValues for non stored fields

Ishan Chattopadhyaya (JIRA) Fri, 20 Nov 2015 13:14:32 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-8220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018805#comment-15018805
 ]


Ishan Chattopadhyaya commented on SOLR-8220:
--------------------------------------------

bq. I'm still a bit confused how SOLR-8276 works for you, I get an NPE when 
trying to pull back the non-indexed/non-stored field in the current impl.

I added another document (id=4) to the test at SOLR-8276. I see no problems 
whatsoever with single-valued string dv fields, which internally use 
SortedDocValues. The test passes fine. Also, the test at BasicFunctionalityTest 
works fine with the {{test_s_dvo}} field. Both SOLR-8276 and the latter test 
use the latest patch here. So, as per the tests, createField seems to do 
its job. Am I missing something?

However, beyond this point, should we avoid using 
schemaField.getType().createField() for fields in the Lucene StoredDocument, 
and instead do this decoration on the SolrDocument that is created from that 
StoredDocument? (See my comment before this one.)
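
To make the second option concrete, here is a minimal sketch of decorating the response document after the stored fields have been loaded. It deliberately uses plain maps instead of Solr or Lucene classes: DocValuesDecorator, FieldInfo, and decorate are hypothetical names invented for illustration, and this is not what the attached patches do. A missing docValues entry is simply skipped rather than dereferenced.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins only: none of these types are Solr or Lucene classes.
final class DocValuesDecorator {

    /** Minimal per-field schema flags (assumption, not Solr's SchemaField). */
    static final class FieldInfo {
        final boolean stored;
        final boolean docValues;

        FieldInfo(boolean stored, boolean docValues) {
            this.stored = stored;
            this.docValues = docValues;
        }
    }

    /**
     * Builds the response document for one hit. Stored fields are copied as-is;
     * a requested field that is not stored but has docValues is filled in from
     * the per-document docValues map. Fields with no value are left out.
     */
    static Map<String, Object> decorate(Map<String, Object> storedFields,
                                        Map<String, Object> docValuesForDoc,
                                        Map<String, FieldInfo> schema,
                                        Set<String> requested) {
        Map<String, Object> response = new LinkedHashMap<>();
        for (String name : requested) {
            FieldInfo info = schema.get(name);
            if (info == null) {
                continue; // unknown field: nothing to return
            }
            Object value = null;
            if (info.stored) {
                value = storedFields.get(name);
            } else if (info.docValues) {
                value = docValuesForDoc.get(name);
            }
            if (value != null) {
                response.put(name, value); // skip documents that lack the field
            }
        }
        return response;
    }
}
```

With this shape the StoredDocument is never modified, so no field objects need to be created for values that only exist in docValues.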

> Read field from docValues for non stored fields
> -----------------------------------------------
>
>                 Key: SOLR-8220
>                 URL: https://issues.apache.org/jira/browse/SOLR-8220
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Keith Laban
>         Attachments: SOLR-8220-ishan.patch, SOLR-8220-ishan.patch, 
> SOLR-8220-ishan.patch, SOLR-8220-ishan.patch, SOLR-8220.patch, 
> SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch, SOLR-8220.patch
>
>
> Many times a value will be both stored="true" and docValues="true" which 
> requires redundant data to be stored on disk. Since reading from docValues is 
> both efficient and a common practice (facets, analytics, streaming, etc), 
> reading values from docValues when a stored version of the field does not 
> exist would be a valuable disk usage optimization.
> The only caveat with this that I can see would be for multiValued fields as 
> they would always be returned sorted in the docValues approach. I believe 
> this is a fair compromise.
> I've done a rough implementation for this as a field transform, but I think 
> it should live closer to where stored fields are loaded in the 
> SolrIndexSearcher.
> Two open questions/observations:
> 1) There doesn't seem to be a standard way to read values from docValues; 
> facets, analytics, streaming, etc. all seem to do it their own way. 
> Perhaps some of this logic should be centralized.
> 2) What will the API behavior be? (Below is my proposed implementation)
> Parameters for fl:
> - fl="docValueField"
>   -- return the field from docValues if it is not stored but has docValues; 
> if the field is stored, return it from stored fields
> - fl="*"
>   -- return only stored fields
> - fl="+"
>    -- return stored fields and docValue fields
> 2a - would be the easiest implementation and might be sufficient for a first 
> pass. 2b - is the current behavior
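
The three fl forms proposed in the description can be read as a small decision table. The sketch below is one possible interpretation, assuming a schema reduced to two booleans per field. FlResolver, Field, Source, and resolve are hypothetical names, not Solr APIs, and fl="+" is only a proposal from this issue.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the proposed fl semantics; not Solr code.
final class FlResolver {

    /** Where a returned field would be read from. */
    enum Source { STORED, DOC_VALUES }

    /** Minimal per-field schema flags (assumption for this sketch). */
    static final class Field {
        final boolean stored;
        final boolean docValues;

        Field(boolean stored, boolean docValues) {
            this.stored = stored;
            this.docValues = docValues;
        }
    }

    /**
     * Resolves a single fl value against the schema:
     *   "*"    returns stored fields only,
     *   "+"    returns stored fields plus non-stored docValues fields,
     *   a name returns that field from stored fields if stored,
     *          otherwise from docValues if it has them.
     */
    static Map<String, Source> resolve(String fl, Map<String, Field> schema) {
        Map<String, Source> result = new TreeMap<>();
        boolean storedOnly = fl.equals("*");
        boolean wildcard = storedOnly || fl.equals("+");
        for (Map.Entry<String, Field> entry : schema.entrySet()) {
            String name = entry.getKey();
            Field field = entry.getValue();
            if (!wildcard && !fl.equals(name)) {
                continue; // an explicit name selects exactly one field
            }
            if (field.stored) {
                result.put(name, Source.STORED); // stored always wins
            } else if (field.docValues && !storedOnly) {
                result.put(name, Source.DOC_VALUES);
            }
        }
        return result;
    }
}
```

Under this reading, a field that is neither stored nor has docValues is never returned, and a field that is both is always read from stored fields.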



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

