I would be happy to help with this (although that's probably obvious from
the above email), but I realize we'll probably want to chat a bit about
this. It's certainly not a small change =)


-- Jimmy

On Sat, Nov 14, 2015 at 2:28 AM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:

> Hi Folks,
>
> Mike Joyce and I have been working on a Tinkerpop implementation of
> Node and NodeDB (generated through WebGraph) which builds a Vertex input,
> used by Tinkerpop, subsequently Gremlin and persisted into a graph database
> such as TitanDB.
> We have analyzed the problem quite a bit and came across the following I/O
> format:
>
> http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#script-io-format
> I've implemented a PropertyWebGraphVertex writable in Nutch which builds
> off of NodeDB (and others) to enable us to write out to
> the ScriptOutputFormat. Essentially we address the issue of parent-to-child
> vs. child-to-parent relationships, i.e. outlinks vs. inlinks respectively.
> The work from there then consists of an external process (to Nutch)
> invoking a Groovy script from within Gremlin to ingest data into TitanDB.
> During the course of this work we have realized that mapred and mapreduce
> APIs are NOT ok within trunk if we want to move Nutch to accommodate the
> above described architecture.
>
> Breath of fresh air and a deep breath...
>
> What do you guys think about branching trunk into a 3.X branch with every
> mapred --> mapreduce package addressed?
> Mike, Sujen and I talked today. We want to touch base with everyone
> within dev@ as it lends itself very much to the work undertaken by
> https://issues.apache.org/jira/browse/NUTCH-2097
>
> It does not, however, totally rearrange the codebase. It will, however,
> generate a genuine graph output based upon:
>
> http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#script-io-format
> We can have a gremlin script as part of $NUTCH_HOME/conf which merely
> ingests data (along with a config file) to a GraphDB such as Titan.
>
> What does everyone think?
> Thanks
> Lewis
>
> --
> *Lewis*
>
