[ 
https://issues.apache.org/jira/browse/SOLR-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16341929#comment-16341929
 ] 

Joel Bernstein edited comment on SOLR-8362 at 1/27/18 3:25 AM:
---------------------------------------------------------------

I wanted to comment on re-indexing text fields with streaming expressions. 
There is a straightforward approach that does not require docValues, described 
here:

[http://joelsolr.blogspot.com/2016/10/solr-63-batch-jobs-parallel-etl-and.html]
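As a rough illustration of this kind of job (a sketch, not copied from the post), the Python below builds a daemon(update(topic(...))) streaming expression and the /stream URL it would be sent to. The collection names, the field names, the ids "reindex_daemon" and "reindex_topic", and the helper functions themselves are all hypothetical. It also assumes that topic reads stored fields, so the text field needs to be stored but does not need docValues.

```python
from urllib.parse import urlencode


def reindex_expression(source, dest, checkpoints, fields, batch_size=500):
    """Build a daemon(update(topic(...))) expression as a plain string.

    All collection and field names are supplied by the caller; the ids
    "reindex_daemon" and "reindex_topic" are made up for this sketch.
    """
    fl = ",".join(fields)
    # topic() pulls documents from `source` and stores its position
    # in the `checkpoints` collection, so the job can resume.
    topic = (
        f'topic({checkpoints}, {source}, q="*:*", fl="{fl}", '
        f'id="reindex_topic", initialCheckpoint=0)'
    )
    # update() sends each batch of tuples to the destination collection.
    update = f"update({dest}, batchSize={batch_size}, {topic})"
    # daemon() re-runs the wrapped expression at a fixed interval (ms).
    return f'daemon(id="reindex_daemon", runInterval="1000", {update})'


def stream_url(base_url, collection, expr):
    """URL that submits `expr` to the collection's /stream handler."""
    return f"{base_url}/{collection}/stream?" + urlencode({"expr": expr})


if __name__ == "__main__":
    expr = reindex_expression("source", "dest", "checkpoints", ["id", "body_t"])
    print(expr)
    print(stream_url("http://localhost:8983/solr", "source", expr))
```

Nothing here contacts Solr; it only prints the expression and URL so they can be reviewed before being submitted.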

I'm not convinced that putting text fields into docValues is the way to go. The 
*significantTerms*, *features* and *train* streaming expressions can do some 
really nice things with text fields in the inverted index. In the next release, 
the new *termVectors* expression will let you create on-the-fly tf-idf *term 
vectors*, which can be used for all kinds of text analytics. 

 


> Add docValues support for TextField
> -----------------------------------
>
>                 Key: SOLR-8362
>                 URL: https://issues.apache.org/jira/browse/SOLR-8362
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Hoss Man
>            Priority: Major
>
> At the last Lucene/Solr Revolution, Toke asked why TextField doesn't support 
> docValues.  The short answer is that no one ever added it, but the longer 
> answer is that we would have to think carefully through the _intent_ of 
> supporting docValues for a "tokenized" field like TextField, and how to 
> support the various conflicting use cases where they could be handy.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
