I would be happy to help with this (although that's probably obvious from the above email), but I realize we'll probably want to chat a bit about this. It's certainly not a small change =)
-- Jimmy

On Sat, Nov 14, 2015 at 2:28 AM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:

> Hi Folks,
>
> Mike Joyce and myself have been working on a Tinkerpop implementation of
> Node and NodeDB (generated through WebGraph) which builds a Vertex input,
> used by Tinkerpop, subsequently Gremlin and persisted into a graph database
> such as TitanDB.
> We have analyzed the problem quite a bit and came across the following I/O
> formats
>
> http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#script-io-format
> I've implemented a PropertyWebGraphVertex writable in Nutch which builds
> off of NodeDB (and others) to enable us to write out to
> the ScriptOutputFormat. Essentially we address the issues of parent child
> Vs child parent e.g. Outlinks Vs Inlinks respectively.
> The work from there then consists of an external process (to Nutch)
> invoking a Groovy script from within Gremlin to ingest data into TitanDB.
> During the course of this work we have realized that mapred and mapreduce
> API's are NOT ok within trunk if we want to move Nutch to accommodate the
> above described architecture.
>
> Breath of fresh air and a deep breath...
>
> What do you guys think about branching trunk into a 3.X branch with every
> mapred --> mapreduce package addressed.
> Mike, Sujen and myself talked today. We want to touch base with everyone
> within dev@ as it lends itself very much to the work undertaken by
> https://issues.apache.org/jira/browse/NUTCH-2097
>
> It does not however totally rearrange the codebase. It will however
> generate a genuine graph output based upon
>
> http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#script-io-format
> We can have a gremlin script as part of $NUTCH_HOME/conf which merely
> ingests data (along with a config file) to a GraphDB such as Titan.
>
> What does everyone think?
> Thanks
> Lewis
>
> --
> *Lewis*
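
To make the outlink vs. inlink point in the quoted message concrete, here is a minimal sketch of the idea, not the actual PropertyWebGraphVertex code. It assumes a hypothetical tab-separated line layout (`url`, `out=`, `in=`); the real layout would be whatever the Groovy script used with the script I/O format expects. The function names and example URLs are made up for illustration.

```python
from collections import defaultdict


def invert_links(outlinks):
    """Derive inlinks (child -> parents) from outlinks (parent -> children)."""
    inlinks = defaultdict(list)
    for parent, children in outlinks.items():
        for child in children:
            inlinks[child].append(parent)
    # Sort so the output is deterministic.
    return {url: sorted(parents) for url, parents in inlinks.items()}


def vertex_line(url, outlinks, inlinks):
    """Format one vertex as a text line (hypothetical layout, an assumption)."""
    outs = ",".join(sorted(outlinks.get(url, [])))
    ins = ",".join(sorted(inlinks.get(url, [])))
    return f"{url}\tout={outs}\tin={ins}"


# Made-up example data: parent URL -> list of child URLs.
outlinks = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://c.example/"],
}
inlinks = invert_links(outlinks)

# One line per vertex, carrying both edge directions.
urls = sorted(set(outlinks) | set(inlinks))
lines = [vertex_line(u, outlinks, inlinks) for u in urls]
for line in lines:
    print(line)
```

Each vertex carries both directions on one line, so a downstream ingest script could create parent-to-child and child-to-parent edges without a second pass over the data.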