[ 
https://issues.apache.org/jira/browse/SOLR-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15214407#comment-15214407
 ] 

Joel Bernstein edited comment on SOLR-8888 at 3/28/16 5:14 PM:
---------------------------------------------------------------

First patch, which implements a breadth-first search using a threaded nested 
loop join. Each join in the traversal is split into batches, and the batches 
are executed in threads within the worker node. This approach spreads the join 
across all replicas. The bottleneck in this scenario will be the network, as 
potentially dozens of search nodes will be returning nodes in parallel to the 
same worker to satisfy the join. This bottleneck can be greatly reduced by 
compression: because the edges are returned sorted by the toField, large 
amounts of repeated data will be streamed in the same compression block. 
SOLR-8910 has been opened to add LZ4 compression to the /export handler. 
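
To illustrate the compression argument, here is a rough sketch that compares 
the compressed size of the same edge list when it is sorted by the to-node and 
when it is shuffled. Everything in it is an assumption made for illustration: 
the class and method names are hypothetical, the edges are synthetic, and 
java.util.zip.Deflater stands in for LZ4 because the JDK does not ship an LZ4 
codec. It is not taken from the patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;
import java.util.zip.Deflater;

public class SortedCompressionSketch {

    /** Builds synthetic {from, to} edges: each to-node receives edgesPerNode edges. */
    public static List<String[]> syntheticEdges(int toNodes, int edgesPerNode, long seed) {
        Random rnd = new Random(seed);
        List<String[]> edges = new ArrayList<>();
        for (int t = 0; t < toNodes; t++) {
            String to = String.format("%016x", rnd.nextLong());
            for (int e = 0; e < edgesPerNode; e++) {
                String from = String.format("%08d", rnd.nextInt(100_000_000));
                edges.add(new String[] {from, to});
            }
        }
        return edges;
    }

    /** Serializes the edges as "from,to" lines and returns the deflated size in bytes. */
    public static int compressedSize(List<String[]> edges) {
        StringBuilder sb = new StringBuilder();
        for (String[] edge : edges) {
            sb.append(edge[0]).append(',').append(edge[1]).append('\n');
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater();   // stand-in for LZ4 (assumption)
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[8192];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    /** Compressed size when the edges are ordered by the to-node, as the export would return them. */
    public static int sizeSortedByTo(List<String[]> edges) {
        List<String[]> copy = new ArrayList<>(edges);
        copy.sort(Comparator.comparing((String[] edge) -> edge[1]));
        return compressedSize(copy);
    }

    /** Compressed size when the same edges arrive in a random order. */
    public static int sizeShuffled(List<String[]> edges, long seed) {
        List<String[]> copy = new ArrayList<>(edges);
        Collections.shuffle(copy, new Random(seed));
        return compressedSize(copy);
    }
}
```

With many distinct to-nodes, the sorted ordering places each repeated to-value 
next to its duplicates, so the compressor can replace it with a short 
back-reference. How large the saving is on real data with LZ4 would need to be 
measured.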

In my last comment I mentioned using sorted memory-mapped files for the 
bookkeeping. In this patch all bookkeeping is done in memory using HashMaps. 
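
The traversal described above can be sketched roughly as follows. This is a 
minimal illustration, not the code in the patch: BatchedBfsSketch, EdgeSource, 
inMemory and the batchSize/threads parameters are hypothetical names, and the 
in-memory edge source stands in for the searches that the patch sends to the 
replicas. Each level of the breadth-first search is split into batches, each 
batch is joined in its own thread, and a HashMap records the node from which 
every visited node was first reached.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchedBfsSketch {

    /** Hypothetical stand-in for one join query against the search nodes. */
    public interface EdgeSource {
        /** Returns {from, to} pairs for every edge leaving a node in the batch. */
        List<String[]> edgesFrom(List<String> batch);
    }

    /** In-memory edge source, used here only for illustration. */
    public static EdgeSource inMemory(Map<String, List<String>> adjacency) {
        return batch -> {
            List<String[]> edges = new ArrayList<>();
            for (String from : batch) {
                for (String to : adjacency.getOrDefault(from, List.of())) {
                    edges.add(new String[] {from, to});
                }
            }
            return edges;
        };
    }

    /** Returns the nodes on a shortest path, or an empty list if none is found within maxDepth hops. */
    public static List<String> shortestPath(EdgeSource source, String from, String to,
                                            int maxDepth, int batchSize, int threads) {
        if (from.equals(to)) {
            return List.of(from);
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Bookkeeping: visited node -> node it was first reached from.
            Map<String, String> parent = new HashMap<>();
            parent.put(from, null);
            List<String> frontier = List.of(from);
            for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
                // Split the current level into batches and join each batch in a thread.
                List<Future<List<String[]>>> joins = new ArrayList<>();
                for (int i = 0; i < frontier.size(); i += batchSize) {
                    List<String> batch =
                        frontier.subList(i, Math.min(i + batchSize, frontier.size()));
                    joins.add(pool.submit(() -> source.edgesFrom(batch)));
                }
                // Merge the results on the calling thread, so the HashMap needs no locking.
                List<String> next = new ArrayList<>();
                for (Future<List<String[]>> join : joins) {
                    for (String[] edge : join.get()) {
                        if (parent.containsKey(edge[1])) {
                            continue; // already reached by a path that is no longer
                        }
                        parent.put(edge[1], edge[0]);
                        if (edge[1].equals(to)) {
                            return buildPath(parent, to);
                        }
                        next.add(edge[1]);
                    }
                }
                frontier = next;
            }
            return List.of();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("join batch failed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    /** Walks the parent links back from the target to the start node. */
    private static List<String> buildPath(Map<String, String> parent, String to) {
        LinkedList<String> path = new LinkedList<>();
        for (String node = to; node != null; node = parent.get(node)) {
            path.addFirst(node);
        }
        return path;
    }
}
```

Merging the batch results on a single thread keeps the bookkeeping simple, at 
the cost of serializing that step. A real implementation would also need to 
bound the memory used by the HashMap on large graphs.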


> Add shortestPath Streaming Expression
> -------------------------------------
>
>                 Key: SOLR-8888
>                 URL: https://issues.apache.org/jira/browse/SOLR-8888
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Joel Bernstein
>         Attachments: SOLR-8888.patch
>
>
> This ticket is to implement a distributed shortest path graph traversal as a 
> Streaming Expression.
> Possible expression syntax:
> {code}
> shortestPath(collection, 
>                      from="colA:node1", 
>                      to="colB:node2", 
>                      fq="limiting query", 
>                      maxDepth="10")
> {code}
> This would start from colA:node1 and traverse from colA to colB iteratively 
> until it finds colB:node2. The shortestPath function would emit Tuples 
> representing the shortest path.
> The optional fq could be used to apply a filter on the traversal.


