This would be great for me. In Unipop we want to enable running heavy queries in a distributed manner. We figured we could implement some kind of UnipopSparkComputer that utilizes the current Spark implementation, but from a quick check we didn't find an obvious way to do that.
Might DefaultInputRDD be a good solution for us?

Cheers,
Ran

On Wed, 2 Dec 2015 at 22:23 Marko Rodriguez <[email protected]> wrote:

> Hello,
>
> It is possible for us to provide a DefaultInputRDD and DefaultInputFormat
> to allow any OLTP graph system to easily load the data into
> Giraph/Spark/etc.
>
> https://issues.apache.org/jira/browse/TINKERPOP3-1015
>
> This is "quick and dirty" as it's single threaded -- no splits. It uses
> Graph.vertices() to stream in the vertices one at a time.
>
> Would people be interested in this feature? It would allow you to, for
> example, use Spark with Neo4j. Also, another thing we could do to make this
> efficient is:
>
>     List<Iterator<Vertex>> Graph.vertexSplits(int numberOfSplits)
>
> Then each graph provider can specify how to do parallel reads. The default
> implementation would be:
>
>     List<Iterator<Vertex>> splits = new ArrayList<>(numberOfSplits);
>     splits.add(this.vertices());
>     return splits;
>
> Anywho…. random idea as I was doing some Spark InputRDD test suite stuff.
>
> Take care,
> Marko.
>
> http://markorodriguez.com
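
To make the vertexSplits idea quoted above concrete, here is a minimal sketch of how a default single-split implementation and a provider-specific override might fit together. SplittableGraph and InMemoryGraph are hypothetical stand-ins invented for illustration; they are not TinkerPop classes, and the proposed Graph.vertexSplits(int) is only an idea from the thread, not an existing API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the Graph interface; only the two methods
// discussed in the thread are modelled here.
interface SplittableGraph<V> {
    Iterator<V> vertices();

    // Default behaviour as proposed: a single split that streams every
    // vertex, regardless of how many splits were requested.
    default List<Iterator<V>> vertexSplits(int numberOfSplits) {
        List<Iterator<V>> splits = new ArrayList<>(numberOfSplits);
        splits.add(vertices());
        return splits;
    }
}

// Hypothetical provider that holds its vertices in memory and can
// therefore hand out contiguous, non-overlapping ranges.
final class InMemoryGraph<V> implements SplittableGraph<V> {
    private final List<V> store;

    InMemoryGraph(List<V> store) {
        this.store = store;
    }

    @Override
    public Iterator<V> vertices() {
        return store.iterator();
    }

    @Override
    public List<Iterator<V>> vertexSplits(int numberOfSplits) {
        if (numberOfSplits < 1) {
            throw new IllegalArgumentException("numberOfSplits must be >= 1");
        }
        List<Iterator<V>> splits = new ArrayList<>(numberOfSplits);
        int size = store.size();
        for (int i = 0; i < numberOfSplits; i++) {
            // Range boundaries are computed in long to avoid int overflow.
            int from = (int) ((long) size * i / numberOfSplits);
            int to = (int) ((long) size * (i + 1) / numberOfSplits);
            splits.add(store.subList(from, to).iterator());
        }
        return splits;
    }
}

public class VertexSplitsSketch {
    public static void main(String[] args) {
        SplittableGraph<String> graph =
                new InMemoryGraph<>(Arrays.asList("v1", "v2", "v3", "v4", "v5"));
        int index = 0;
        for (Iterator<String> split : graph.vertexSplits(2)) {
            StringBuilder line = new StringBuilder("split " + index++ + ":");
            while (split.hasNext()) {
                line.append(' ').append(split.next());
            }
            System.out.println(line);
        }
    }
}
```

A provider that cannot partition its data would simply inherit the default and behave like the single-threaded version, while one that can (as the in-memory example does) would let each split be read by a separate worker.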
