[ 
https://issues.apache.org/jira/browse/SOLR-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076434#comment-15076434
 ] 

Jason Gerlowski commented on SOLR-7535:
---------------------------------------

bq. The partitionKeys will get added to the search(...)

Ah, I see my mistake here.  Reading the Streaming Expressions wiki page 
(https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions), I read

bq. The parallel function requires that the partitionKeys parameter be provided 
to the underlying searches.

and interpreted it to mean that if I provided the parameter to {{parallel()}}, 
it would be passed through.  But on second glance, the example clearly shows 
that the caller needs to put it on the underlying {{search()}} themselves.

I was seeing duplicate documents indexed because I wasn't providing 
{{partitionKeys}} on the underlying searches.  So that's clearly my fault.

That said, it'd be nice if there were a way to detect this misconfiguration from 
within {{ParallelStream}}.  It'd be easy to do some sort of dumb check, such as 
ensuring the underlying expression contains the string 'partitionKeys'.  
There are a lot of obvious issues with that, but it might be better than nothing, 
and it would let us spit out a helpful error message or warning.  Or maybe this 
isn't really important enough to worry about at this point.
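
To make the "dumb check" concrete, here is a rough sketch of what I have in 
mind.  It is plain standalone Java, not actual {{ParallelStream}} code: the 
class name, method names, warning text, and the collection/field names in the 
sample expressions are all made up for illustration.

```java
import java.util.Optional;

// Hypothetical helper, not part of Solr: illustrates the naive string check only.
final class PartitionKeysCheck {

    // Naive check: true if the expression text mentions partitionKeys anywhere.
    // Known weakness: it also matches the word inside a query string or field name.
    static boolean mentionsPartitionKeys(String expression) {
        return expression != null && expression.contains("partitionKeys");
    }

    // Returns a warning message when the wrapped expression never mentions
    // partitionKeys, or an empty Optional when the naive check passes.
    static Optional<String> warningFor(String innerExpression) {
        if (mentionsPartitionKeys(innerExpression)) {
            return Optional.empty();
        }
        return Optional.of(
            "parallel(): no partitionKeys found in the wrapped expression; "
            + "every worker may read the full result set");
    }

    public static void main(String[] args) {
        // Sample expressions; collection and field names are placeholders.
        String partitioned =
            "search(collection1, q=\"*:*\", fl=\"id,a_s\", sort=\"a_s asc\", partitionKeys=\"a_s\")";
        String unpartitioned =
            "search(collection1, q=\"*:*\", fl=\"id,a_s\", sort=\"a_s asc\")";

        System.out.println(warningFor(partitioned).orElse("ok"));
        System.out.println(warningFor(unpartitioned).orElse("ok"));
    }
}
```

The first sample passes the check and the second produces the warning.  The 
obvious hole is that something like q="partitionKeys" would also pass, so this 
could only ever back a warning, not a hard guarantee.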



> Add UpdateStream to Streaming API and Streaming Expression
> ----------------------------------------------------------
>
>                 Key: SOLR-7535
>                 URL: https://issues.apache.org/jira/browse/SOLR-7535
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java, SolrJ
>            Reporter: Joel Bernstein
>            Priority: Minor
>         Attachments: SOLR-7535.patch, SOLR-7535.patch
>
>
> The ticket adds an UpdateStream implementation to the Streaming API and 
> streaming expressions. The UpdateStream will wrap a TupleStream and send the 
> Tuples it reads to a SolrCloud collection to be indexed.
> This will allow users to pull data from different SolrCloud collections, 
> merge and transform the streams, and send the transformed data to another 
> SolrCloud collection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
