Re: [TinkerPop] GraphActors as a new distributed computing framework in TinkerPop

Stephen Mallette Mon, 19 Dec 2016 05:09:32 -0800

Gremlin Server should probably start the Akka cluster (of course, I still
don't have a solid understanding of the clustering capabilities of Akka
just yet). I also wonder if there is any value to Gremlin Server embedding
Akka so it could piggyback on some of the features Akka has (e.g. perhaps
Gremlin Server instances could be aware of each other, which might yield
some interesting features).


> Thus, I don’t think GremlinServer really needs to come into play.

Unless I'm missing something, I'm not sure we should say it quite that way
- that's a bit more JVM-centric sounding. So as not to be confused, non-JVM
GLVs would still require Gremlin Server, right?

On Thu, Dec 15, 2016 at 11:47 AM, Marko Rodriguez <[email protected]>
wrote:

> Hi,
>
> How will this get deployed? Each database instance (alternatively
> gremlin-server) shipping a version of akka-actor and akka-cluster?
>
>
> This is a good question. As I’m seeing it lately, I think we treat it just
> like spark-gremlin/. That is, let’s assume a multi-machine graph database:
>
> 1. User has a graph database across 3 nodes in a cluster.
> 2. User has Akka Cluster setup on those 3 nodes. (like they would have
> SparkServer or Hadoop).
> 3. akka-gremlin/ “jobs” have a configuration with information about the
> Akka cluster and the graph database partitions.
>
> Thus, I don’t think GremlinServer really needs to come into play. However,
> I sort of think that down the line, GremlinServer should support the
> spawning of “services.” For instance, it would be great if GremlinServer,
> when deployed, could spawn a SparkServer cluster or an Akka Cluster…
> This removes the headache for users having to install and configure stuff.
> It would be great if GremlinServer was like a Docker or something.
>
> bin/gremlin-server.sh -i akka.gremlin.plugin -c akka.properties
>
> Dunno. Stephen would have more to say.
>
> What does it mean for performance? Here's my understanding... thoughts?
>
> 1. *A sharded graph database*: as long as the data is local it'll scale
> linearly, then it needs some synchronisation (i.e. hand off the traversal
> to the instance where the data is local again). I.e. there'll be a sweet
> spot of replication vs. shards for each use case.
> 2. *A replicated graph database*: should scale linearly for most
> traversals
> 3. *A single machine graph database*: should scale linearly for most
> traversals
>
>
> So there will be traverser migration when a traverser no longer references
> data in its current partition. That is a message pass. You don’t want just
> full replication because then you aren’t load balancing your traversals
> across machines. Even if you have a replicated graph database, you will
> want to create logical partitions so that traversers will be forced to move
> between machines. When it’s worth doing that or when you should just use
> standard iterator Gremlin execution is a fine line… how much data will your
> traversal touch?
>
> Marko.
