[ https://issues.apache.org/jira/browse/SOLR-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dennis Gove updated SOLR-8188:
------------------------------
    Attachment: SOLR-8188.patch

All tests pass.

> Add hash style joins to the Streaming API and Streaming Expressions
> -------------------------------------------------------------------
>
>                 Key: SOLR-8188
>                 URL: https://issues.apache.org/jira/browse/SOLR-8188
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrJ
>            Reporter: Dennis Gove
>            Priority: Minor
>         Attachments: SOLR-8188.patch
>
>
> Add HashJoinStream and OuterHashJoinStream to the Streaming API to allow for
> optimized joining between sub-streams.
> HashJoinStream is similar to an InnerJoinStream except that it does not
> require any particular order and will read all values from the stream being
> hashed (hashStream) when open() is called. During read() it will return the
> next tuple from the stream not being hashed (fullStream) which has at least
> one matching record in hashStream. The returned tuple is the merge of both
> tuples. If the tuple from the fullStream matches more than one tuple from
> the hashStream, then each subsequent call to read() will return the merge
> with the next matching tuple. The order of the resulting stream is the order
> of the fullStream.
> OuterHashJoinStream is similar to a HashJoinStream and LeftOuterJoinStream in
> that a tuple from fullStream will be returned even if it doesn't have a
> matching record in hashStream. All other pieces are identical.
> In expression form
> {code}
> hashJoin(
>   search(collection1, q=*:*, fl="fieldA, fieldB, fieldC", ...),
>   hashed=search(collection2, q=*:*, fl="fieldA, fieldB, fieldE", ...),
>   on="fieldA, fieldB"
> )
> {code}
> {code}
> outerHashJoin(
>   search(collection1, q=*:*, fl="fieldA, fieldB, fieldC", ...),
>   hashed=search(collection2, q=*:*, fl="fieldA, fieldB, fieldE", ...),
>   on="fieldA, fieldB"
> )
> {code}
> As you can see, the hashStream is a named parameter (hashed=), which makes it
> very clear which stream should be hashed.
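
For readers who want to see the described semantics in isolation, here is a minimal sketch in plain Java. It is not the code from SOLR-8188.patch and does not use the SolrJ stream classes. The class name HashJoinSketch, the helpers tuple() and key(), and the use of in-memory lists in place of streams are all assumptions made for illustration. It also assumes that on a field-name collision the hashed tuple's value wins during the merge, which the description above does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; hypothetical names, not the SOLR-8188 implementation. */
public class HashJoinSketch {

    /** Builds a tuple from alternating field names and values (hypothetical helper). */
    public static Map<String, Object> tuple(Object... kv) {
        Map<String, Object> t = new LinkedHashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            t.put((String) kv[i], kv[i + 1]);
        }
        return t;
    }

    /** Collects the values of the "on" fields; the resulting list is the hash key. */
    public static List<Object> key(Map<String, Object> t, List<String> on) {
        List<Object> k = new ArrayList<>();
        for (String field : on) {
            k.add(t.get(field));
        }
        return k;
    }

    /**
     * Joins fullStream against hashStream on the given fields.
     * outer=false mimics the described hashJoin, outer=true the outerHashJoin.
     */
    public static List<Map<String, Object>> join(List<Map<String, Object>> fullStream,
                                                 List<Map<String, Object>> hashStream,
                                                 List<String> on,
                                                 boolean outer) {
        // Analogous to open(): read the hashed side completely into memory.
        Map<List<Object>, List<Map<String, Object>>> hashed = new HashMap<>();
        for (Map<String, Object> h : hashStream) {
            hashed.computeIfAbsent(key(h, on), k -> new ArrayList<>()).add(h);
        }

        // Analogous to repeated read() calls: walk fullStream in its own order.
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> f : fullStream) {
            List<Map<String, Object>> matches = hashed.get(key(f, on));
            if (matches == null) {
                if (outer) {
                    out.add(new LinkedHashMap<>(f)); // unmatched tuple kept as-is
                }
                continue;
            }
            for (Map<String, Object> h : matches) {
                Map<String, Object> merged = new LinkedHashMap<>(f);
                merged.putAll(h); // assumption: hashed-side values win on collision
                out.add(merged);  // one merged tuple per matching hashed tuple
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> full = Arrays.asList(
            tuple("fieldA", 1, "fieldB", "x", "fieldC", "c1"),
            tuple("fieldA", 2, "fieldB", "y", "fieldC", "c2"));
        List<Map<String, Object>> hashedSide = Arrays.asList(
            tuple("fieldA", 1, "fieldB", "x", "fieldE", "e1"),
            tuple("fieldA", 1, "fieldB", "x", "fieldE", "e2"));
        List<String> on = Arrays.asList("fieldA", "fieldB");

        System.out.println("inner: " + join(full, hashedSide, on, false));
        System.out.println("outer: " + join(full, hashedSide, on, true));
    }
}
```

The sketch keeps the whole hashed side in memory, so, as with the proposal above, the smaller stream is the natural choice for the hashed side. Only the full side's order carries through to the output.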