[ https://issues.apache.org/jira/browse/SOLR-10351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joel Bernstein updated SOLR-10351:
----------------------------------
    Labels: NLP Streaming  (was: )

> Add analyze Stream Evaluator to support streaming NLP
> -----------------------------------------------------
>
>                 Key: SOLR-10351
>                 URL: https://issues.apache.org/jira/browse/SOLR-10351
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public)
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>              Labels: NLP, Streaming
>             Fix For: 6.6
>
>
> The *analyze* Stream Evaluator uses a Solr analyzer to return a collection of
> tokens from a *text field*. The collection of tokens can then be streamed out
> by the *cartesianProduct* Streaming Expression or attached to documents as
> multi-valued fields by the *select* Streaming Expression.
> This allows Streaming Expressions to leverage all the existing tokenizers and
> filters and provides a place for future NLP analyzers to be added to
> Streaming Expressions.
> Sample syntax:
> {code}
> cartesianProduct(expr, analyze(analyzerField, textField) as outfield )
> {code}
> {code}
> select(expr, analyze(analyzerField, textField) as outfield )
> {code}
> Combined with Solr's batch text processing capabilities this provides an
> entire parallel NLP framework. Solr's batch processing capabilities are
> described here:
> *Batch jobs, Parallel ETL and Streaming Text Transformation*
> http://joelsolr.blogspot.com/2016/10/solr-63-batch-jobs-parallel-etl-and.html
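
As a rough illustration of the sample syntax above, the sketch below assembles the two expressions as strings and wraps one in a request URL for Solr's /stream handler. It only builds text and sends nothing. The helper names (analyze_expr, stream_url), the collection name collection1, and the field names body_analyzer and body_t are hypothetical and do not come from the issue. The argument order of analyze follows the issue description and may differ in a released Solr version.

```python
from urllib.parse import urlencode

# Wrappers named in the issue description as consumers of analyze().
SUPPORTED_WRAPPERS = ("cartesianProduct", "select")


def analyze_expr(wrapper, inner_expr, analyzer_field, text_field, out_field):
    """Build an expression string following the issue's sample syntax.

    Hypothetical helper: argument order of analyze() is taken from the
    issue text (analyzerField, textField) and is an assumption here.
    """
    if wrapper not in SUPPORTED_WRAPPERS:
        raise ValueError(f"unsupported wrapper: {wrapper}")
    return (
        f"{wrapper}({inner_expr}, "
        f"analyze({analyzer_field}, {text_field}) as {out_field})"
    )


def stream_url(base_url, collection, expr):
    """Return a /stream handler URL with the expression URL-encoded."""
    return f"{base_url}/{collection}/stream?" + urlencode({"expr": expr})


if __name__ == "__main__":
    # Inner stream: a plain search expression over a hypothetical collection.
    inner = 'search(collection1, q="*:*", fl="id,body_t", sort="id asc")'

    # One output tuple per token, per the cartesianProduct description.
    per_token = analyze_expr(
        "cartesianProduct", inner, "body_analyzer", "body_t", "token"
    )
    # Tokens attached to each document as a multi-valued field.
    per_doc = analyze_expr("select", inner, "body_analyzer", "body_t", "tokens")

    print(per_token)
    print(per_doc)
    print(stream_url("http://localhost:8983/solr", "collection1", per_token))
```

The two wrappers differ only in the shape of the output: cartesianProduct would stream the tokens out individually, while select would keep one tuple per document with the tokens in outfield.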