Hi Vasia, I had started tinkering with it in my spare time in a separate
repo.  There really isn't much to collaborate on at this point.  I was
mostly trying to understand the parallels between Flink and Spark so that I
could see how a FlinkGraphComputer might be implemented, given what
I'd seen of the Spark implementation Marko did.  I had expected to
contribute the work to Flink (rather than keep it here on the TinkerPop
side).  Anyway, I don't have much else to offer - Marko can probably get you
running much faster than I can, as that area is where he holds the most
expertise.  You should probably keep an eye out for his comments.



On Wed, Nov 25, 2015 at 11:38 AM, Vasiliki Kalavri <[email protected]> wrote:

> Hi James and TinkerPop community,
>
> thanks a lot for starting this discussion!
> I am Vasia, Apache Flink PMC and core Gelly developer. Nice to meet you ;)
>
> I'm only starting to get familiar with the TinkerPop project, but it seems
> that it can play well with Flink.
> As you already noticed, a FlinkGraphComputer should be straightforward to
> implement. Gelly has a vertex-centric API that is similar to the
> scatter-gather model [1] and a gather-sum-apply API [2] that is closer to
> the PowerGraph model. These are built on top of Flink's delta iteration
> operators [3], which are more generic and could also be used directly for the
> FlinkGraphComputer, if the existing Gelly abstractions won't work.
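
To make the scatter-gather model mentioned above a bit more concrete, here is
a minimal sketch in plain Python.  It is not Gelly or TinkerPop code, and the
function and variable names are hypothetical; it only assumes the usual
superstep structure, where vertices that changed send messages along their
out-edges and receiving vertices fold those messages into a new value.

```python
# Pure-Python sketch of a scatter-gather (vertex-centric) iteration.
# Not Gelly or TinkerPop code; every name here is hypothetical.

def scatter_gather_sssp(edges, source, max_supersteps=100):
    """Single-source shortest paths over directed, weighted edges.

    edges: iterable of (src, dst, weight) tuples.
    Returns a dict mapping each vertex to its distance from source.
    """
    edges = list(edges)
    vertices = {v for s, d, _ in edges for v in (s, d)} | {source}
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0.0
    active = {source}  # vertices whose value changed in the last superstep

    for _ in range(max_supersteps):
        if not active:
            break
        # Scatter: each active vertex sends a candidate distance to its neighbours.
        inbox = {}
        for s, d, w in edges:
            if s in active:
                inbox.setdefault(d, []).append(dist[s] + w)
        # Gather: each receiver keeps the minimum and stays active only if improved.
        active = set()
        for v, messages in inbox.items():
            best = min(messages)
            if best < dist[v]:
                dist[v] = best
                active.add(v)
    return dist
```

Only vertices that changed in the previous superstep do any work in the next
one, which is roughly the idea a delta iteration exploits.  How this would map
onto a real FlinkGraphComputer is left open here.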
>
> Regarding the difference between stream and batch in Flink: Flink is a
> streaming dataflow engine, on top of which you can run both streaming and
> batch jobs. A batch job is simply seen by Flink as a job operating on a
> finite stream. Accordingly, Flink has both a stream and a batch API. Gelly
> is currently built on top of the batch API, i.e. the DataSet API.
>
> James mentioned on the Flink mailing list that someone has already started
> working on a FlinkGraphComputer. Is there a JIRA for this? Let me know if
> you have questions or think I can help in some way!
>
> Cheers,
> -Vasia.
>
> [1]: https://ci.apache.org/projects/flink/flink-docs-master/libs/gelly_guide.html#vertex-centric-iterations
> [2]: https://ci.apache.org/projects/flink/flink-docs-master/libs/gelly_guide.html#gather-sum-apply-iterations
> [3]: https://ci.apache.org/projects/flink/flink-docs-master/apis/iterations.html#delta-iterate-operator
>
> On 25 November 2015 at 17:05, James Thornton <[email protected]>
> wrote:
>
> > Hi Vasia -
> >
> > Welcome to TinkerPop (linking you into the Flink thread as requested)...
> >
> > - James
> >
> > On Mon, Nov 23, 2015 at 10:01 AM, Marko Rodriguez <[email protected]>
> > wrote:
> >
> > > Hi James,
> > >
> > > > Thank you for always having an ear to the tech pulse. If it wasn't for
> > > > you, I would still be excited about XMPP and would be programming in
> > > > Tcl/Tk.
> > >
> > > > Given my 20 minute review of their docs... It would be cool if, like the
> > > > "Table API," they also had a "Graph API" that was just TinkerPop
> > > > Graph/Vertex/Edge. That could be super intrusive, so as a simple step --
> > > > they already have a "vertex-centric" API and thus, having a
> > > > FlinkGraphComputer implementation seems "easy." Then from there, Gremlin
> > > > should just work. I don't really understand the difference between
> > > > stream and batch unless they are talking about the difference between
> > > > "Storm" and "MapReduce"? Would be cool to see how TinkerPop fits into
> > > > the stream-scene.
> > >
> > > > Next, their fluent API is similar to Spark's and I would argue that
> > > > Gremlin's API is much nicer than just low-level primitives like map(),
> > > > flatMap(), etc. Thus, they could really benefit from having a full graph
> > > > query language already available for their users. (As a side note, it's
> > > > really nice to see more and more systems use functional/fluent APIs, as
> > > > this really trains the next generation to think like this, which is
> > > > important as Gremlin is purely this! Hopefully the SQL model of querying
> > > > starts to look odd to people in comparison.)
> > >
> > > I just sent out this tweet:
> > >         https://twitter.com/apachetinkerpop/status/668820458599530497
> > >
> > > If they seem positive, I can detail in JIRA what would be required for
> > > them to have TinkerPop-support.
> > >
> > > Thanks again James,
> > > Marko.
> > >
> > > http://markorodriguez.com
> > >
> > > On Nov 19, 2015, at 12:19 PM, James Thornton <[email protected]>
> > > wrote:
> > >
> > > > Hi -
> > > >
> > > > Apache Flink has a graph API named Gelly...
> > > >
> > > >
> https://flink.apache.org/news/2015/08/24/introducing-flink-gelly.html
> > > >
> > > > > ...and Flink's "dedicated support for iterative operations" should
> > > > > pair well with Gremlin:
> > > >
> > > >  https://flink.apache.org/features.html
> > > >
> > > > Has anyone dug into this yet?
> > > >
> > > > - James
> > > >
> > > >
> > > > --
> > > > > James Thornton, *http://electricspeed.com <http://electricspeed.com>*
> > >
> > >
> >
> >
> > --
> > James Thornton, *http://electricspeed.com <http://electricspeed.com>*
> >
>
