[ https://issues.apache.org/jira/browse/SOLR-9240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joel Bernstein reassigned SOLR-9240:
------------------------------------

    Assignee: Joel Bernstein

> Add partitionKeys parameter to the topic() Streaming Expression
> ---------------------------------------------------------------
>
>                 Key: SOLR-9240
>                 URL: https://issues.apache.org/jira/browse/SOLR-9240
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>
> Currently the topic() function doesn't accept a partitionKeys parameter like
> the search() function does. This means the topic() function can't be wrapped
> by the parallel() function to run across worker nodes.
> It would be useful to support parallelizing the topic function because it
> would provide a general purpose parallelized approach for processing batches
> of data as they enter the index.
> For example this would allow a classify() function to be wrapped around a
> topic() function to classify documents in parallel across worker nodes.
> Sample syntax:
> {code}
> parallel(daemon(update(classify(topic(..., partitionKeys="id")))))
> {code}
> The example above would send a daemon to worker nodes that would classify all
> new documents returned by the topic() function. The update function would
> send the output of classify() to a SolrCloud collection for indexing.
> The partitionKeys parameter would ensure that each worker would receive a
> partition of the results returned by the topic() function. This allows the
> classify() function to be run in parallel.
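
To illustrate the partitioning idea described in the issue, here is a minimal Python sketch of how hashing a partition key can split a result set so that each worker receives a disjoint share. It is not Solr code and does not reflect how Solr implements partitionKeys internally. The function names (worker_for, partition), the choice of CRC32 as the hash, and the sample documents are all assumptions made for illustration only.

```python
import zlib


def worker_for(doc, partition_keys, num_workers):
    """Return the index of the (hypothetical) worker that should receive doc."""
    # Combine the values of the partition key fields into a single string.
    key = "|".join(str(doc[k]) for k in partition_keys)
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    # The hash function is an assumption, not what Solr uses.
    return zlib.crc32(key.encode("utf-8")) % num_workers


def partition(docs, partition_keys, num_workers):
    """Split docs into num_workers disjoint buckets based on the key hash."""
    buckets = [[] for _ in range(num_workers)]
    for doc in docs:
        buckets[worker_for(doc, partition_keys, num_workers)].append(doc)
    return buckets


# Made-up documents standing in for the results of a topic() call.
docs = [{"id": f"doc-{i}"} for i in range(10)]
buckets = partition(docs, ["id"], num_workers=3)
```

Because the worker index depends only on the key values, a given document always maps to the same worker, and no document is handed to two workers. That property is what would let a function such as classify() run independently on each partition.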