[ https://issues.apache.org/jira/browse/SOLR-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076434#comment-15076434 ]
Jason Gerlowski commented on SOLR-7535:
---------------------------------------
bq. The partitionKeys will get added to the search(...)
Ah, I see my mistake here. On the Streaming Expressions wiki page
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions), I read
bq. The parallel function requires that the partitionKeys parameter be provided to the underlying searches.
and interpreted it to mean that if I provided the parameter to {{parallel()}},
it would be passed through to the searches. On a second look, the example
clearly shows that the caller needs to put it on the underlying {{search()}}
themselves.
I was seeing duplicate documents indexed because I wasn't setting
{{partitionKeys}} on the searches, so each worker read the full result set
instead of its own partition. So that's clearly my fault.
That said, it'd be nice if there were a way to detect this misconfiguration
from within {{ParallelStream}}. It'd be easy to do some sort of naive check,
such as ensuring the underlying expression contains the string 'partitionKeys'.
There are a lot of obvious problems with that, but it might be better than
nothing, and it would let us emit a helpful error message or warning. Or maybe
this isn't important enough to worry about at this point.
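
For illustration, here is a rough sketch of the kind of naive check I have in
mind. It is standalone Java and does not touch the real {{ParallelStream}}
code. The class name, method names, warning text and the sample expressions
are all hypothetical, and where such a check would actually hook in is left
open.

```java
import java.util.Optional;

// Hypothetical helper, not part of Solr: a plain substring check on the
// text of the expression wrapped by parallel().
final class PartitionKeysCheck {
    private static final String PARAM = "partitionKeys";

    // True if the inner expression text mentions partitionKeys anywhere.
    // This is deliberately naive: it also matches the word inside a query
    // string or a field name, so it can produce false negatives for the
    // misconfiguration it is meant to catch.
    static boolean mentionsPartitionKeys(String innerExpression) {
        return innerExpression != null && innerExpression.contains(PARAM);
    }

    // Returns a warning message when the parameter appears to be missing,
    // or an empty Optional when the expression mentions it.
    static Optional<String> warningFor(String innerExpression) {
        if (mentionsPartitionKeys(innerExpression)) {
            return Optional.empty();
        }
        return Optional.of("The expression wrapped by parallel() does not mention '"
                + PARAM + "'; each worker may read the full result set.");
    }

    public static void main(String[] args) {
        // Illustrative expressions only; collection and field names are made up.
        String partitioned =
                "search(collection1, q=\"*:*\", fl=\"id\", sort=\"id asc\", partitionKeys=\"id\")";
        String unpartitioned =
                "search(collection1, q=\"*:*\", fl=\"id\", sort=\"id asc\")";

        System.out.println(warningFor(partitioned).orElse("no warning"));
        System.out.println(warningFor(unpartitioned).orElse("no warning"));
    }
}
```

A warning rather than a hard error seems the safer default here, since a
substring match cannot tell whether the parameter is really set on every
underlying {{search()}}.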
> Add UpdateStream to Streaming API and Streaming Expression
> ----------------------------------------------------------
>
> Key: SOLR-7535
> URL: https://issues.apache.org/jira/browse/SOLR-7535
> Project: Solr
> Issue Type: New Feature
> Components: clients - java, SolrJ
> Reporter: Joel Bernstein
> Priority: Minor
> Attachments: SOLR-7535.patch, SOLR-7535.patch
>
>
> The ticket adds an UpdateStream implementation to the Streaming API and
> streaming expressions. The UpdateStream will wrap a TupleStream and send the
> Tuples it reads to a SolrCloud collection to be indexed.
> This will allow users to pull data from different SolrCloud collections,
> merge and transform the streams, and send the transformed data to another
> SolrCloud collection.