[ 
https://issues.apache.org/jira/browse/SOLR-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dennis Gove reassigned SOLR-8188:
---------------------------------

    Assignee: Dennis Gove

> Add hash style joins to the Streaming API and Streaming Expressions
> -------------------------------------------------------------------
>
>                 Key: SOLR-8188
>                 URL: https://issues.apache.org/jira/browse/SOLR-8188
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>            Reporter: Dennis Gove
>            Assignee: Dennis Gove
>            Priority: Minor
>         Attachments: SOLR-8188.patch, SOLR-8188.patch
>
>
> Add HashJoinStream and OuterHashJoinStream to the Streaming API to allow for 
> optimized joining between sub-streams.
> HashJoinStream is similar to an InnerJoinStream except that it does not 
> insist on any particular order and will read all values from the stream being 
> hashed (hashStream) when open() is called. During read() it will return the 
> next tuple from the stream not being hashed (fullStream) which has at least 
> one matching record in hashStream. It will return a tuple which is the merge 
> of both tuples. If the tuple from the fullStream matches with more than one 
> tuple from the hashStream then calling read() will return the merge with the 
> next matching tuple. The order of the resulting stream is the order of the 
> fullStream.
> OuterHashJoinStream is similar to a HashJoinStream and LeftOuterJoinStream in 
> that a tuple from fullStream will be returned even if it doesn't have a 
> matching record in hashStream. All other pieces are identical.
> In expression form:
> {code}
> hashJoin(
>   search(collection1, q=*:*, fl="fieldA, fieldB, fieldC", ...),
>   hashed=search(collection2, q=*:*, fl="fieldA, fieldB, fieldE", ...),
>   on="fieldA, fieldB"
> )
> {code}
> {code}
> outerHashJoin(
>   search(collection1, q=*:*, fl="fieldA, fieldB, fieldC", ...),
>   hashed=search(collection2, q=*:*, fl="fieldA, fieldB, fieldE", ...),
>   on="fieldA, fieldB"
> )
> {code}
> As you can see, the hashStream is a named parameter, which makes it very 
> clear which stream should be hashed.
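
The join behaviour described in the issue above can be sketched in plain Java.
This is only an illustration under stated assumptions, not the code in the
attached patch: tuples are modelled as Map<String, Object>, streams as
in-memory lists, and the names HashJoinSketch, join, keyOf and tuple are
hypothetical. The issue does not say which side wins when both tuples carry
the same non-join field, so the sketch assumes the fullStream value is kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the hashJoin / outerHashJoin semantics.
class HashJoinSketch {

    // Builds the hash key from the values of the "on" fields.
    // Simplification: a tuple missing a field contributes null to the key.
    static List<Object> keyOf(Map<String, Object> tuple, List<String> on) {
        List<Object> key = new ArrayList<>();
        for (String field : on) {
            key.add(tuple.get(field));
        }
        return key;
    }

    // outer == false behaves like hashJoin, outer == true like outerHashJoin.
    static List<Map<String, Object>> join(List<Map<String, Object>> full,
                                          List<Map<String, Object>> hashed,
                                          List<String> on,
                                          boolean outer) {
        // Analogue of open(): read the hashed side completely and index it.
        Map<List<Object>, List<Map<String, Object>>> index = new HashMap<>();
        for (Map<String, Object> h : hashed) {
            index.computeIfAbsent(keyOf(h, on), k -> new ArrayList<>()).add(h);
        }

        // Analogue of repeated read(): walk the full side in its own order.
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> f : full) {
            List<Map<String, Object>> matches = index.get(keyOf(f, on));
            if (matches == null) {
                if (outer) {
                    out.add(new LinkedHashMap<>(f)); // unmatched, passed through
                }
                continue;
            }
            // One merged tuple per matching hashed tuple.
            for (Map<String, Object> h : matches) {
                Map<String, Object> merged = new LinkedHashMap<>(h);
                merged.putAll(f); // assumption: full-side values win on conflict
                out.add(merged);
            }
        }
        return out;
    }

    // Convenience builder: tuple("fieldA", 1, "fieldB", "x").
    static Map<String, Object> tuple(Object... kv) {
        Map<String, Object> t = new LinkedHashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            t.put((String) kv[i], kv[i + 1]);
        }
        return t;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> full = Arrays.asList(
            tuple("fieldA", 1, "fieldB", "x", "fieldC", "c1"),
            tuple("fieldA", 2, "fieldB", "y", "fieldC", "c2"));
        List<Map<String, Object>> hashed = Arrays.asList(
            tuple("fieldA", 1, "fieldB", "x", "fieldE", "e1"),
            tuple("fieldA", 1, "fieldB", "x", "fieldE", "e2"));
        List<String> on = Arrays.asList("fieldA", "fieldB");

        System.out.println(join(full, hashed, on, false));
        System.out.println(join(full, hashed, on, true));
    }
}
```

Only the hashed side is held in memory, which is why the expression syntax
asks the caller to pick it explicitly; the smaller of the two streams is
usually the natural choice.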



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

