[ https://issues.apache.org/jira/browse/SOLR-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076415#comment-15076415 ]
Joel Bernstein commented on SOLR-7535:
--------------------------------------

Yes, currently partitioning is only done as part of the search(), so any workflow that requires re-partitioning has to be done in multiple steps. That's why this ticket is so important: the UpdateStream allows for write-backs.

In the example above, the first join would need to be wrapped in an UpdateStream and sent to a temp index. The temp index would then be used for the next steps.

In the future we can look at faster ways to re-partition. One example would be to have the workers repartition to local disk. Then the second step could read from the worker nodes rather than searching. This still involves multiple steps, but it would be much faster.

> Add UpdateStream to Streaming API and Streaming Expression
> ----------------------------------------------------------
>
>                 Key: SOLR-7535
>                 URL: https://issues.apache.org/jira/browse/SOLR-7535
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java, SolrJ
>            Reporter: Joel Bernstein
>            Priority: Minor
>         Attachments: SOLR-7535.patch, SOLR-7535.patch
>
>
> The ticket adds an UpdateStream implementation to the Streaming API and
> streaming expressions. The UpdateStream will wrap a TupleStream and send the
> Tuples it reads to a SolrCloud collection to be indexed.
> This will allow users to pull data from different Solr Cloud collections,
> merge and transform the streams and send the transformed data to another Solr
> Cloud collection.
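
The multi-step workflow described in the comment (join on one key, write the result back to a temp index, then re-partition on a different key for the next join) can be sketched as a toy simulation in plain Python. This is not Solr code and does not use the Streaming API: the helper names (partition, parallel_join), the sample collections, and the use of CRC32 as the partitioning hash are all assumptions made for illustration. An in-memory list stands in for the temp index that an UpdateStream would write to.

```python
import zlib
from collections import defaultdict


def partition(docs, key, num_workers):
    """Hash-partition docs on `key` (hypothetical stand-in for search()-time partitioning)."""
    shards = defaultdict(list)
    for doc in docs:
        worker = zlib.crc32(str(doc[key]).encode()) % num_workers
        shards[worker].append(doc)
    return shards


def parallel_join(left, right, key, num_workers):
    """Inner-join two collections; each simulated worker only sees its own partition."""
    left_parts = partition(left, key, num_workers)
    right_parts = partition(right, key, num_workers)
    joined = []
    for worker in range(num_workers):
        # Build a per-worker lookup table for the right-hand side.
        lookup = defaultdict(list)
        for r in right_parts.get(worker, []):
            lookup[r[key]].append(r)
        for l in left_parts.get(worker, []):
            for r in lookup.get(l[key], []):
                joined.append({**l, **r})
    return joined


# Hypothetical sample data.
orders = [{"order_id": 1, "customer_id": 10}, {"order_id": 2, "customer_id": 11}]
customers = [{"customer_id": 10, "region_id": 100}, {"customer_id": 11, "region_id": 101}]
regions = [{"region_id": 100, "region": "EU"}, {"region_id": 101, "region": "US"}]

# Step 1: join on customer_id and "write back" the result (the role of the UpdateStream).
temp_index = parallel_join(orders, customers, "customer_id", num_workers=2)

# Step 2: read the temp index again, now partitioned on region_id, for the next join.
result = parallel_join(temp_index, regions, "region_id", num_workers=2)
```

The second join cannot reuse the partitions from the first one, because the tuples were distributed by customer_id rather than region_id. That is the reason for the intermediate write-back in this sketch.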