Re: [DISCUSS] DefaultInputRDD and DefaultInputFormat

Ran Magen Thu, 03 Dec 2015 13:12:53 -0800

This would be great for me.
In Unipop we want to enable running heavy queries in a distributed manner.
We figured we could implement some kind of UnipopSparkComputer that
utilizes the current Spark implementation, but from a quick check we didn't
find an obvious way to do that.


Might DefaultInputRDD be a good solution for us?

Cheers,
Ran

On Wed, 2 Dec 2015 at 22:23 Marko Rodriguez <[email protected]> wrote:

> Hello,
>
> It is possible for us to provide a DefaultInputRDD and DefaultInputFormat
> to allow any OLTP graph system to easily load the data into
> Giraph/Spark/etc.
>
>         https://issues.apache.org/jira/browse/TINKERPOP3-1015
>
> This is "quick and dirty" as it's single-threaded -- no splits. It uses
> Graph.vertices() to stream in the vertices one at a time.
>
> Would people be interested in this feature? It would allow you to, for
> example, use Spark with Neo4j. Also, another thing we could do to make this
> efficient is:
>
>         List<Iterator<Vertex>> Graph.vertexSplits(int numberOfSplits)
>
> Then each graph provider can specify how to do parallel reads. The default
> implementation would be:
>
>         List<Iterator<Vertex>> splits = new ArrayList<>(numberOfSplits);
>         splits.add(this.vertices());
>         return splits;
>
> Anywho…. random idea as I was doing some Spark InputRDD test suite stuff.
>
> Take care,
> Marko.
>
> http://markorodriguez.com
>
>
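
A minimal sketch of the vertexSplits idea from the quoted message, for readers who want to see how a provider override could sit next to the single-split default. Everything here is an assumption made for illustration: SplittableGraph and ListGraph are hypothetical stand-ins, not TinkerPop classes, and the type parameter V stands in for Vertex so the example compiles without any TinkerPop dependency.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class VertexSplitsSketch {

    // Hypothetical stand-in for a graph that can stream its vertices.
    // V plays the role of Vertex; this is not the TinkerPop Graph interface.
    interface SplittableGraph<V> {
        Iterator<V> vertices();

        // Default from the proposal: ignore the requested count and return
        // one split that streams every vertex through a single iterator.
        default List<Iterator<V>> vertexSplits(int numberOfSplits) {
            List<Iterator<V>> splits = new ArrayList<>(numberOfSplits);
            splits.add(this.vertices());
            return splits;
        }
    }

    // Hypothetical provider that knows how to partition its own data.
    // A real provider would split by partition, key range, or similar.
    static class ListGraph<V> implements SplittableGraph<V> {
        private final List<V> data;

        ListGraph(List<V> data) {
            this.data = data;
        }

        @Override
        public Iterator<V> vertices() {
            return data.iterator();
        }

        @Override
        public List<Iterator<V>> vertexSplits(int numberOfSplits) {
            List<Iterator<V>> splits = new ArrayList<>(numberOfSplits);
            int n = data.size();
            for (int i = 0; i < numberOfSplits; i++) {
                // Contiguous ranges that together cover [0, n) exactly once.
                int from = (int) ((long) i * n / numberOfSplits);
                int to = (int) ((long) (i + 1) * n / numberOfSplits);
                splits.add(data.subList(from, to).iterator());
            }
            return splits;
        }
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("v1", "v2", "v3", "v4", "v5");
        SplittableGraph<String> fallback = ids::iterator;      // uses the default
        SplittableGraph<String> provider = new ListGraph<>(ids); // overrides it
        System.out.println(fallback.vertexSplits(4).size());
        System.out.println(provider.vertexSplits(4).size());
    }
}
```

With this shape, a loader would call vertexSplits(n) and hand each iterator to its own reader; graphs that do not override the method would still work, just with one reader doing all the work.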
