[ 
https://issues.apache.org/jira/browse/SOLR-8962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15234128#comment-15234128
 ] 

Joel Bernstein edited comment on SOLR-8962 at 4/10/16 2:01 PM:
---------------------------------------------------------------

Looks good.

One thing we can consider in future implementations is a fork/join merge sort. 
The gatherNodes function is going to return a stream of Tuples that contains 
long runs of pre-sorted Tuples, because the /export handler returns the nodes 
already sorted. But because the traversal is done in batches, the stream as a 
whole is a sequence of sorted runs rather than one fully sorted stream. I 
suspect this will work nicely with a fork/join merge sort, plus we get the 
threading. In my testing, sorting scales really nicely in parallel because the 
memory locality of sorts is very tight.
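
To make the idea concrete, here is a minimal sketch of a fork/join merge sort over 
plain long values. It is not taken from the attached patch: the class name 
ForkJoinMergeSort, the THRESHOLD cutoff and the use of a long[] in place of 
Tuples are all assumptions made for illustration. The early return before the 
merge shows where already-ordered runs could save work.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical illustration only; not the SOLR-8962 implementation.
class ForkJoinMergeSort extends RecursiveAction {
    // Assumed cutoff below which a slice is sorted sequentially.
    private static final int THRESHOLD = 1024;

    private final long[] data;
    private final long[] scratch;
    private final int lo; // inclusive
    private final int hi; // exclusive

    ForkJoinMergeSort(long[] data, long[] scratch, int lo, int hi) {
        this.data = data;
        this.scratch = scratch;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected void compute() {
        if (hi - lo <= THRESHOLD) {
            Arrays.sort(data, lo, hi);
            return;
        }
        int mid = (lo + hi) >>> 1;
        // Sort both halves, potentially on different worker threads.
        invokeAll(new ForkJoinMergeSort(data, scratch, lo, mid),
                  new ForkJoinMergeSort(data, scratch, mid, hi));
        // If the halves are already in order, no merge is needed.
        if (data[mid - 1] <= data[mid]) {
            return;
        }
        merge(mid);
    }

    // Merges the sorted ranges [lo, mid) and [mid, hi) back into data.
    private void merge(int mid) {
        System.arraycopy(data, lo, scratch, lo, hi - lo);
        int i = lo, j = mid, k = lo;
        while (i < mid && j < hi) {
            data[k++] = (scratch[i] <= scratch[j]) ? scratch[i++] : scratch[j++];
        }
        while (i < mid) {
            data[k++] = scratch[i++];
        }
        while (j < hi) {
            data[k++] = scratch[j++];
        }
    }

    // Sorts values in place using the common fork/join pool.
    static void sort(long[] values) {
        if (values.length < 2) {
            return;
        }
        ForkJoinPool.commonPool().invoke(
            new ForkJoinMergeSort(values, new long[values.length], 0, values.length));
    }
}
```

Calling ForkJoinMergeSort.sort(values) sorts the array in place. The JDK's 
Arrays.parallelSort is also built on the fork/join framework, so it may be a 
reasonable baseline to compare against before writing anything custom.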


> Add sort Streaming Expression
> -----------------------------
>
>                 Key: SOLR-8962
>                 URL: https://issues.apache.org/jira/browse/SOLR-8962
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Joel Bernstein
>            Priority: Critical
>             Fix For: 6.1
>
>         Attachments: SOLR-8962.patch
>
>
> The sort Streaming Expression does an in-memory sort of the Tuples returned 
> by its underlying stream. This is intended to be used for sorting sets 
> gathered during local graph traversals. This will make it easy to gather sets 
> during a traversal and use all of the sort based set operations (merge, 
> innerJoin, outerJoin, reduce, complement, intersect). 
> This will be particularly useful with the gatherNodes expression (SOLR-8925). 
> Sample syntax:
> {code}
> intersect(
>        sort(gatherNodes(...), "fieldA asc"),
>        sort(gatherNodes(...), "fieldA asc"),
>        on)
> {code}
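
As a rough illustration of why the in-memory sort matters for the set 
operations listed in the description, the sketch below sorts two small lists 
and then intersects them in a single forward pass. The class SortedIntersect 
and its methods are hypothetical stand-ins, not Solr classes, and the choice 
to emit elements from the left input on a match is an assumption about the 
intended semantics.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for sort(...) feeding intersect(...); not Solr code.
class SortedIntersect {

    // In-memory sort of a copy of the input, analogous to wrapping a stream in sort(...).
    static <T> List<T> sorted(List<T> in, Comparator<? super T> cmp) {
        List<T> copy = new ArrayList<>(in);
        copy.sort(cmp);
        return copy;
    }

    // Returns the elements of left that have a match in right.
    // Both inputs must already be sorted by cmp for the single pass to be correct.
    static <T> List<T> intersect(List<T> left, List<T> right, Comparator<? super T> cmp) {
        List<T> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int c = cmp.compare(left.get(i), right.get(j));
            if (c < 0) {
                i++;                  // left element is smaller: no match for it
            } else if (c > 0) {
                j++;                  // right element is smaller: advance right
            } else {
                out.add(left.get(i)); // match: emit the left element
                i++;
            }
        }
        return out;
    }
}
```

Because both inputs are ordered on the same key, neither side ever needs to be 
rescanned, which is the property the sort expression provides to the set 
operations.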


