[ https://issues.apache.org/jira/browse/SOLR-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dennis Gove updated SOLR-7525:
------------------------------
    Attachment: SOLR-7525.patch

Rebases off of trunk and adds a DistinctOperation for use in the ReducerStream. The DistinctOperation ensures that for any given group only a single tuple will be returned. Currently it returns the first tuple in a group, but a possible enhancement down the road would be a parameter asking for some other tuple in the group (such as the first in a sub-sorted list).

Also, while implementing this I realized that the UniqueStream can be refactored to be just a type of ReducerStream with DistinctOperation. That change is not included in this patch but will be done under a separate ticket.

Also of note, I'm not sure the getChildren() function declared in TupleStream is necessary any longer. If I recall correctly, that function was used by the StreamHandler when passing streams to workers, but since all of that has been changed to pass the result of toExpression(....), I think we can get rid of getChildren(). I will explore that possibility.

> Add ComplementStream to the Streaming API and Streaming Expressions
> -------------------------------------------------------------------
>
>                 Key: SOLR-7525
>                 URL: https://issues.apache.org/jira/browse/SOLR-7525
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrJ
>            Reporter: Joel Bernstein
>            Priority: Minor
>         Attachments: SOLR-7525.patch, SOLR-7525.patch
>
>
> This ticket adds a ComplementStream to the Streaming API and Streaming
> Expression language.
> The ComplementStream will wrap two TupleStreams (StreamA, StreamB) and emit
> Tuples from StreamA that are not in StreamB.
> Streaming API Syntax:
> {code}
> ComplementStream cstream = new ComplementStream(streamA, streamB, comp);
> {code}
> Streaming Expression syntax:
> {code}
> complement(search(...), search(...), on(...))
> {code}
> Internal implementation will rely on the ReducerStream. The ComplementStream
> can be parallelized using the ParallelStream.
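
For readers unfamiliar with the two behaviours discussed in this message, here is a minimal standalone sketch in plain Java. It is not the code in the attached patch and does not use the SolrJ streaming classes: the class name ComplementSketch and the methods distinctFirst and complement are hypothetical. It assumes both inputs are already sorted by the same comparator, and that a value repeated in StreamA but absent from StreamB is emitted each time it occurs (the ticket does not say how duplicates are handled).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; the names below are not part of SolrJ.
class ComplementSketch {

    // Keeps the first element of each group of equal elements.
    // Assumes the input is sorted by comp, so equal elements are adjacent.
    static <T> List<T> distinctFirst(List<T> sorted, Comparator<? super T> comp) {
        List<T> out = new ArrayList<>();
        T previous = null;
        boolean first = true;
        for (T item : sorted) {
            if (first || comp.compare(previous, item) != 0) {
                out.add(item); // first element of a new group
            }
            previous = item;
            first = false;
        }
        return out;
    }

    // Emits the elements of a that have no equal element in b.
    // Assumes both lists are sorted ascending by comp; walks b with one cursor.
    static <T> List<T> complement(List<T> a, List<T> b, Comparator<? super T> comp) {
        List<T> out = new ArrayList<>();
        int j = 0;
        for (T item : a) {
            // Skip elements of b that sort before the current element of a.
            while (j < b.size() && comp.compare(b.get(j), item) < 0) {
                j++;
            }
            // Emit when b is exhausted or its current element differs.
            if (j == b.size() || comp.compare(b.get(j), item) != 0) {
                out.add(item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> streamA = Arrays.asList(1, 2, 2, 4, 5);
        List<Integer> streamB = Arrays.asList(2, 3, 5);
        Comparator<Integer> comp = Comparator.<Integer>naturalOrder();

        System.out.println(distinctFirst(streamA, comp));       // prints [1, 2, 4, 5]
        System.out.println(complement(streamA, streamB, comp)); // prints [1, 4]
    }
}
```

Because both methods make a single forward pass over sorted input, neither needs to buffer a whole stream, which is presumably what makes a sorted, reducer-style implementation attractive here.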