[ 
https://issues.apache.org/jira/browse/SOLR-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210206#comment-15210206
 ] 

Joel Bernstein commented on SOLR-8888:
--------------------------------------

I've been digging into the implementation and it looks like Streaming provides 
some real advantages.

The biggest advantage comes from the ability to sort entire result sets by 
Node id and to do this in parallel across the cluster. This means that once 
the Nodes arrive at the worker they can simply be written to memory-mapped 
files for the bookkeeping. The bookkeeping files need to be sorted by Node id 
and most likely need offset information to support binary searching and 
skipping during intersections. I looked at using MapDB for the bookkeeping, 
and if the data weren't already arriving sorted it would have been the 
approach to use. But even as fast as MapDB is, managing the BTrees adds 
overhead that we don't need.

So, to get the maximum speed in reading and writing the bookkeeping files, 
I'm planning on just using memory-mapped files with offsets. This will take 
more time to develop but will pay off when there are large traversals.
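
To make the idea concrete, here is a rough sketch of a sorted, memory-mapped 
bookkeeping file with binary search and skipping during an intersection. It is 
not the planned implementation. It assumes Node ids are unique and can be 
represented as longs, and it uses fixed-width records so that the offset of a 
record is simply index * 8. The class and method names (SortedNodeFile, write, 
openReadOnly, find, intersect) are made up for illustration.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one bookkeeping file holding sorted, unique long Node ids.
final class SortedNodeFile {
    // Assumption: fixed-width records, so record i starts at byte i * RECORD_BYTES.
    private static final int RECORD_BYTES = Long.BYTES;

    // Writes ids that already arrive sorted ascending; no index structure is built.
    static void write(Path file, long[] sortedIds) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0,
                    (long) sortedIds.length * RECORD_BYTES);
            for (long id : sortedIds) {
                buf.putLong(id);
            }
            buf.force(); // flush the mapped pages to the file
        }
    }

    // The mapping stays valid after the channel is closed.
    static MappedByteBuffer openReadOnly(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    // Binary search starting at record fromIndex. Returns the record index,
    // or -(insertionPoint + 1) when the id is absent (as Arrays.binarySearch does).
    static int find(MappedByteBuffer buf, long id, int fromIndex) {
        int lo = fromIndex;
        int hi = buf.capacity() / RECORD_BYTES - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long candidate = buf.getLong(mid * RECORD_BYTES);
            if (candidate < id) {
                lo = mid + 1;
            } else if (candidate > id) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -(lo + 1);
    }

    // Intersects two sorted files. Each lookup in b resumes where the previous
    // one ended, so records of b that were already passed are skipped.
    static List<Long> intersect(MappedByteBuffer a, MappedByteBuffer b) {
        List<Long> out = new ArrayList<>();
        int from = 0;
        int count = a.capacity() / RECORD_BYTES;
        for (int i = 0; i < count; i++) {
            long id = a.getLong(i * RECORD_BYTES);
            int pos = find(b, id, from);
            if (pos >= 0) {
                out.add(id);
                from = pos + 1;
            } else {
                from = -(pos + 1);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path visited = Files.createTempFile("visited", ".bin");
        Path frontier = Files.createTempFile("frontier", ".bin");
        write(visited, new long[] {1, 3, 5, 7, 9});
        write(frontier, new long[] {2, 3, 4, 7, 8, 10});
        System.out.println(intersect(openReadOnly(visited), openReadOnly(frontier)));
    }
}
```

Because the ids arrive pre-sorted, writing is a plain sequential append and 
lookups need nothing beyond the record offsets, which is the overhead a BTree 
would otherwise add. Variable-width records (for example string Node ids) would 
need a separate offsets table instead of the index * 8 shortcut used here.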



> Add shortestPath Streaming Expression
> -------------------------------------
>
>                 Key: SOLR-8888
>                 URL: https://issues.apache.org/jira/browse/SOLR-8888
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Joel Bernstein
>
> This ticket is to implement a distributed shortest path graph traversal as a 
> Streaming Expression.
> Possible expression syntax:
> {code}
> shortestPath(collection, 
>                      from="colA:node1", 
>                      to="colB:node2", 
>                      fq="limiting query", 
>                      maxDepth="10")
> {code}
> This would start from colA:node1 and traverse from colA to colB iteratively 
> until it finds colB:node2. The shortestPath function would emit Tuples 
> representing the shortest path.
> The optional fq could be used to apply a filter on the traversal.



