Motivation/Design:

The graph source/sink plugin will be used to apply custom transformations to connected data and to dynamically send the transformed data to any sink. Examples of destination sinks include elasticsearch, relational databases, Spark RDDs, etc. Note that this plugin will serve as a source or a sink depending on its configuration.

For v1 I am targeting the neo4j database, using the neo4j-jdbc interface (https://github.com/larusba/neo4j-jdbc) to build HTTP payloads to talk to neo4j. Once our neo4j interface is in place, it should allow us to build generic interfaces and plug in any graph store in the future.

The design will consist of a hybrid piece of infrastructure, serving both as a source and a sink, connected to the current flume infrastructure. (Since all the current sinks and sources live in their own directories, I would suggest this live somewhere else in the flume directory structure.)

Listed below are some classes I have partially sketched out to kick off this discussion.

NeoRestClient
Roles and responsibilities: interface to neo4j; pack and unpack data structures to perform CRUD operations on a local or remote neo4j instance.

APIs:
  // inputs: flume event
  // outputs: flume data structure identifying success metrics around the operation
  // description: transform the flume event into a graph node
  insertNode(NeoNode nodeToInsert)
  searchNode(NeoNode nodeToSearch, Algorithm useAStarOrDijkstra)
  deleteNode(NeoNode nodeToDelete)
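To make the NeoRestClient API above a little more concrete, here is a minimal Java sketch of how the three operations could be shaped. Everything beyond the names insertNode, searchNode, deleteNode, NeoNode and Algorithm is an assumption made for illustration: the OperationResult type, the fields of NeoNode, the fromEvent helper, the "id" header, the enum constants, and the InMemoryNeoClient stand-in are all hypothetical. The sketch does not talk to neo4j or depend on Flume; a headers map and a byte[] body stand in for a flume event.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical node type: an id plus string properties. */
final class NeoNode {
    final String id;
    final Map<String, String> properties;

    NeoNode(String id, Map<String, String> properties) {
        this.id = id;
        this.properties = new LinkedHashMap<>(properties);
    }

    /**
     * Builds a node from event-like data. The headers map and body bytes are
     * stand-ins for a flume event; the "id" header is an assumed convention.
     */
    static NeoNode fromEvent(Map<String, String> headers, byte[] body) {
        Map<String, String> props = new LinkedHashMap<>(headers);
        props.put("body", new String(body, StandardCharsets.UTF_8));
        return new NeoNode(headers.getOrDefault("id", "unknown"), props);
    }
}

/** Hypothetical result type carrying simple success metrics. */
final class OperationResult {
    final boolean success;
    final int nodesAffected;

    OperationResult(boolean success, int nodesAffected) {
        this.success = success;
        this.nodesAffected = nodesAffected;
    }
}

/** Search strategies named in the proposal; constant names are assumed. */
enum Algorithm { A_STAR, DIJKSTRA }

/** The three operations listed in the proposal; return types are assumed. */
interface NeoRestClient {
    OperationResult insertNode(NeoNode nodeToInsert);
    Optional<NeoNode> searchNode(NeoNode nodeToSearch, Algorithm useAStarOrDijkstra);
    OperationResult deleteNode(NeoNode nodeToDelete);
}

/** In-memory stand-in so the sketch runs without a neo4j instance. */
final class InMemoryNeoClient implements NeoRestClient {
    private final Map<String, NeoNode> store = new HashMap<>();

    @Override
    public OperationResult insertNode(NeoNode nodeToInsert) {
        store.put(nodeToInsert.id, nodeToInsert);
        return new OperationResult(true, 1);
    }

    @Override
    public Optional<NeoNode> searchNode(NeoNode nodeToSearch, Algorithm useAStarOrDijkstra) {
        // The algorithm is ignored here: this stand-in does a plain id lookup,
        // not a graph traversal.
        return Optional.ofNullable(store.get(nodeToSearch.id));
    }

    @Override
    public OperationResult deleteNode(NeoNode nodeToDelete) {
        boolean removed = store.remove(nodeToDelete.id) != null;
        return new OperationResult(removed, removed ? 1 : 0);
    }
}
```

A real implementation would replace InMemoryNeoClient with one that issues requests to neo4j; keeping the operations behind an interface is what would let another graph store be plugged in later.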
Note that I would also like to offer the ability to submit Cypher queries (http://neo4j.com/developer/cypher-query-language/) to the source/sink infrastructure.

Neo4jDynamicSerializer
Roles and responsibilities: serialize flume headers and body, and use the NeoRestClient to perform CRUD operations on neo4j.

Both the source and the sink will use the same infrastructure described above.

That should be enough of a first cut for design/motivation and JIRA details; I would love to kick off the discussion at this point.

Thanks in advance

> From: [email protected]
> To: [email protected]
> Subject: [Discuss graph source/sink design proposal]
> Date: Sun, 12 Jun 2016 15:01:14 -0700
>
> Jira with details here: https://issues.apache.org/jira/browse/FLUME-2035
>
> Please respond with your questions.
