[ https://issues.apache.org/jira/browse/SOLR-8176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15208763#comment-15208763 ]
Jack Krupansky commented on SOLR-8176:
--------------------------------------
To what extent can the graph traversal be parallelized for the data on a single
node? The eternal question with Solr is how much data you can put on a node
before you need to shard, or how big each shard can be. I'm curious how graph
traversal affects that calculation, and also how merge policy and segment size
should be configured so that segments can be traversed in parallel. If there
were a more ideal way to organize the nodes in segments, maybe people could
pack a lot more data on fat nodes to reduce the inter-node delays.
Alternatively, maybe having more nodes means more of the operations can be done
in parallel without contending for local machine resources. Interesting tradeoffs.
> Model distributed graph traversals with Streaming Expressions
> -------------------------------------------------------------
>
> Key: SOLR-8176
> URL: https://issues.apache.org/jira/browse/SOLR-8176
> Project: Solr
> Issue Type: New Feature
> Components: clients - java, SolrCloud, SolrJ
> Affects Versions: master
> Reporter: Joel Bernstein
> Labels: Graph
> Fix For: master
>
>
> I think it would be useful to model a few *distributed graph traversal* use
> cases with Solr's *Streaming Expression* language. This ticket will explore
> different approaches with a goal of implementing two or three common graph
> traversal use cases.