Got it.

On Mon, Jun 13, 2016 at 10:05 PM, Saikat Kanjilal <[email protected]>
wrote:

> That's a responsibility of the graph db, not flume; flume is responsible
> for delivering the events and has no understanding of the connectivity of
> the data.  The goal in using flume is to connect incoming data that is
> heterogeneous and transform that data before dumping it into the graph db.
>
> Sent from my iPhone
>
> > On Jun 13, 2016, at 11:09 AM, Lior Zeno <[email protected]> wrote:
> >
> > I got this part. How are events linked together? Do you expect an
> > adjacency list incorporated in the header?
> >
> > On Mon, Jun 13, 2016 at 8:59 PM, Saikat Kanjilal <[email protected]>
> > wrote:
> >
> >> The use case is a flume developer wanting to connect data coming into
> and
> >> out of flume sinks/sources to a graph database
> >>
> >> Sent from my iPhone
> >>
> >>> On Jun 13, 2016, at 10:55 AM, Lior Zeno <[email protected]> wrote:
> >>>
> >>> I'm not sure that I follow here. Can you please give a detailed
> >>> use-case?
> >>>
> >>>> On Mon, Jun 13, 2016 at 7:20 AM, Lior Zeno <[email protected]>
> wrote:
> >>>>
> >>>> Thanks. I'll review this and share my comments later on today.
> >>>>> On Jun 13, 2016 2:30 AM, "Saikat Kanjilal" <[email protected]>
> >> wrote:
> >>>>>
> >>>>> Motivation/Design: The graph source/sink plugin will be used to apply
> >>>>> custom transformations to connected data and dynamically send the
> >>>>> transformed data to any sink; examples of destination sinks include
> >>>>> Elasticsearch, relational databases, Spark RDDs, etc.  Note that this
> >>>>> plugin will serve as a source and a sink depending on the
> >>>>> configuration.  For v1 I am targeting that we plug into the neo4j
> >>>>> database using the neo4j-jdbc interface (
> >>>>> https://github.com/larusba/neo4j-jdbc)
> >>>>> to build http payloads to talk to neo4j.  Once the neo4j interface is
> >>>>> in place, it will allow us to build generic interfaces and plug in any
> >>>>> graph store in the future.
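A hedged sketch of the neo4j-jdbc wiring described above, as I understand it. The `jdbc:neo4j://` URL scheme matches the larusba/neo4j-jdbc driver; the host, port, and credentials are illustrative. Actually running a query requires the driver jar on the classpath and a reachable Neo4j instance, so the main method only attempts a connection when a host argument is supplied:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class Neo4jJdbcSketch {
    // Build the JDBC URL for a Neo4j HTTP endpoint (larusba/neo4j-jdbc style).
    static String jdbcUrl(String host, int port) {
        return "jdbc:neo4j://" + host + ":" + port;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No host supplied: just show the URL the plugin would use.
            System.out.println(jdbcUrl("localhost", 7474));
            return;
        }
        // Cypher travels over JDBC as an ordinary statement string;
        // this path needs the neo4j-jdbc driver and a live server.
        try (Connection conn = DriverManager.getConnection(jdbcUrl(args[0], 7474), "neo4j", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("MATCH (n) RETURN count(n) AS c")) {
            while (rs.next()) {
                System.out.println("node count: " + rs.getLong("c"));
            }
        }
    }
}
```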
> >>>>> The design will consist of a hybrid piece of infrastructure serving
> >>>>> both as a source and a sink, connected to the current flume
> >>>>> infrastructure (since all the current sinks and sources live in their
> >>>>> own directories I would suggest this live somewhere else in the flume
> >>>>> directory structure).  Listed below are some classes I have partially
> >>>>> configured to kick off this discussion:
> >>>>> NeoRestClient
> >>>>> Roles and responsibilities: interface to neo4j; pack and unpack data
> >>>>> structures to perform CRUD operations on a local or remote neo4j
> >>>>> instance
> >>>>> APIs:
> >>>>> //inputs: flume event
> >>>>> //outputs: flume data structure identifying success metrics around
> >>>>> //the operation
> >>>>> //description: transform the flume event into a graph node
> >>>>> insertNode(NeoNode nodeToInsert)
> >>>>> searchNode(NeoNode nodeToSearch, Algorithm useAStarOrDijkstra)
> >>>>> deleteNode(NeoNode nodeToDelete)
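A minimal Java sketch of the NeoRestClient surface listed above, assuming illustrative stand-in types: `NeoNode` here is just a label plus a property map, and `Algorithm` names the two search strategies mentioned; none of this is the final API. It shows the kind of Cypher CREATE statement an insert might send over neo4j-jdbc:

```java
import java.util.Map;

public class NeoRestClient {
    // Stand-in for a graph node built from a Flume event (assumed shape).
    public static class NeoNode {
        final String label;
        final Map<String, String> properties;
        public NeoNode(String label, Map<String, String> properties) {
            this.label = label;
            this.properties = properties;
        }
    }

    // The two search algorithms mentioned in the searchNode signature.
    public enum Algorithm { A_STAR, DIJKSTRA }

    // Build the Cypher CREATE statement that an insert would send to neo4j.
    public String toCreateCypher(NeoNode node) {
        StringBuilder sb = new StringBuilder("CREATE (n:" + node.label + " {");
        boolean first = true;
        for (Map.Entry<String, String> e : node.properties.entrySet()) {
            if (!first) sb.append(", ");
            sb.append(e.getKey()).append(": '").append(e.getValue()).append("'");
            first = false;
        }
        return sb.append("})").toString();
    }

    public static void main(String[] args) {
        NeoRestClient client = new NeoRestClient();
        NeoNode node = new NeoNode("FlumeEvent", Map.of("source", "syslog"));
        // Prints: CREATE (n:FlumeEvent {source: 'syslog'})
        System.out.println(client.toCreateCypher(node));
    }
}
```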
> >>>>>
> >>>>>
> >>>>> Note that I would also like to offer up the chance to present Cypher
> >>>>> queries (http://neo4j.com/developer/cypher-query-language/) to the
> >>>>> source/sink infrastructure.
> >>>>>
> >>>>> Neo4jDynamicSerializer
> >>>>> Roles and responsibilities: serialize flume headers and body and use
> >>>>> the NeoRestClient to perform CRUD operations on neo4j
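A hedged sketch of what the serializer step might look like, assuming a stand-in `Event` type that mirrors the headers map and byte[] body of `org.apache.flume.Event` (the mapping itself is my assumption, not part of the proposal): headers become node properties as-is and the body is stored under a `body` key:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class Neo4jDynamicSerializer {
    // Stand-in for a Flume event: a headers map plus a raw byte[] body.
    public static class Event {
        final Map<String, String> headers;
        final byte[] body;
        public Event(Map<String, String> headers, byte[] body) {
            this.headers = headers;
            this.body = body;
        }
    }

    // Flatten an event into the property map of a graph node:
    // headers are copied as-is, the body is decoded and stored as "body".
    public Map<String, String> toProperties(Event event) {
        Map<String, String> props = new HashMap<>(event.headers);
        props.put("body", new String(event.body, StandardCharsets.UTF_8));
        return props;
    }
}
```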
> >>>>>
> >>>>>
> >>>>> Both the source and the sink will share the infrastructure described
> >>>>> above.
> >>>>>
> >>>>>
> >>>>> That should be enough of a first cut for the design/motivation and
> >>>>> JIRA details; I would love to kick off the discussion at this point.
> >>>>> Thanks in advance
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>> From: [email protected]
> >>>>>> To: [email protected]
> >>>>>> Subject: [Discuss graph source/sink design proposal]
> >>>>>> Date: Sun, 12 Jun 2016 15:01:14 -0700
> >>>>>>
> >>>>>> Jira with details here:
> >>>>> https://issues.apache.org/jira/browse/FLUME-2035
> >>>>>>
> >>>>>> Please respond with your questions.
> >>
>
