Re: Nifi to Titan... How?

Pat Trainor Fri, 20 May 2016 21:41:41 -0700

Matt,

Great response, and thanks for writing. I would imagine querying would be
left to Java, Gephi, etc., and a Cassandra query (QueryCassandra) would
seem to do well, or even the REST API in Titan 1.0.0? But then again, I see
where Titan queries would be valuable, with Traversals not being the least
important.


I don't know what Nifi uses on the backend, but the Java API to the Gremlin
server has been very simple to incorporate, as in:

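    // TinkerPop 2.x Blueprints API, using the in-memory TinkerGraph
    import com.tinkerpop.blueprints.Direction;
    import com.tinkerpop.blueprints.Edge;
    import com.tinkerpop.blueprints.Graph;
    import com.tinkerpop.blueprints.Vertex;
    import com.tinkerpop.blueprints.impls.tg.TinkerGraph;

    public String doTest() {
        Graph graph = new TinkerGraph();
        // Passing null lets the graph assign the element IDs.
        Vertex a = graph.addVertex(null);
        Vertex b = graph.addVertex(null);
        a.setProperty("name", "Fred");
        b.setProperty("name", "Frank");
        // The edge runs from a (its OUT vertex) to b (its IN vertex).
        Edge e = graph.addEdge(null, a, b, "knows");
        String r = e.getVertex(Direction.OUT).getProperty("name")
                + "--" + e.getLabel()
                + "-->" + e.getVertex(Direction.IN).getProperty("name");
        return r;
    }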

*results in:*
Fred--knows-->Frank

I'm grabbing NEs (Named Entities in Stanford-speak) from scraped HTML pages
now, and it makes sense for early stuff to make the name of the NE the key.
But that is too far down the rabbit hole for this discussion on this list.

What I will probably end up doing to get Nifi talking to the parsers is
take advantage of its ability to send to & receive from endpoints (TCP,
UDP, even HTTP): Nifi sends content to a daemon I'll write that is sitting
on that endpoint. It may even be simpler to start another Nifi flow with an
endpoint that is waiting for my daemon to send it the results... Need to
experiment...

I'm just really trying not to load code from disk every time there is text
to be worked with... (sent from Nifi)...
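
A minimal sketch of what such a resident daemon could look like, using only
the JDK. Everything specific here is an assumption for illustration: the
class name ParserDaemon, the default port 9099, the one-line-in/one-line-out
protocol, and process(), which just counts tokens as a stand-in for a real
parser. The point is only that the parsing code is loaded once and stays in
memory while a flow sends it text over TCP.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Hypothetical long-running parser daemon; all names are illustrative. */
final class ParserDaemon {

    /** Stand-in for the real parser: counts whitespace-separated tokens. */
    static String process(String text) {
        String trimmed = text.trim();
        int tokens = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        return "tokens=" + tokens;
    }

    /** Accepts one client and answers each newline-terminated line until it disconnects. */
    static void serveOne(ServerSocket server) throws IOException {
        try (Socket client = server.accept();
             BufferedReader in = new BufferedReader(new InputStreamReader(
                     client.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(new OutputStreamWriter(
                     client.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(process(line)); // one reply line per input line
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Port 9099 is an arbitrary assumption; pass another as the first argument.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 9099;
        try (ServerSocket server = new ServerSocket(port)) {
            while (true) {
                serveOne(server); // one client at a time keeps the sketch short
            }
        }
    }
}
```

Serving one client at a time is a deliberate simplification; a real daemon
would hand each connection to a thread pool and would need to agree on
message framing with whatever sits on the sending side.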

Your idea for a GraphProcessor sounds extremely interesting! I write in
lots of languages, so I'm sitting here wondering if Nifi is doing compiles
on the fly, or...

Thanks to the NSA! But it makes me wonder what they're using NOW... :)

Thanks again...

pat
:)




Thanks!

pat <http://about.me/PatTrainor>
( ͡° ͜ʖ ͡°)

LinkedIn <http://LinkedIn.com/in/PatTrainor>
Hobby/Fun Blog <http://blog.atcp.us>
Sales Engineering University <http://seu.atcp.us>



"A wise man can learn more from a foolish *question* than a fool can learn
from a wise *answer*". ~ Bruce Lee.

On Fri, May 20, 2016 at 11:20 PM, Matt Burgess <mattyb...@gmail.com> wrote:

> Pat,
>
> No worries, this discussion is relevant to the devs group :) I
> appreciate and share your interest in getting connected data into
> graph databases in order to get more insight out of the data.
>
> It may be possible to put the data directly into the backing store
> (Cassandra, e.g.) via NiFi, but in my experience that may be a fragile
> and possibly non-future-proof solution. IMHO I think you're on the
> right track with respect to taking graph-ready data and putting it
> directly into a graph DB like Titan.
>
> To that end, and there are some email thread(s) in this (and/or the
> users list) that mention it, I think we need a PutToGraphDatabase
> processor. In my mind, this processor takes GraphSON (or some other
> supported format) and writes it to a Tinkerpop graph DB (to include
> Titan, Neo4J, etc.). Conversion of input data should be possible with
> existing processors in NiFi, and such a Put processor would allow the
> user to pick the destination DB (Titan, Neo4J, Sail, OrientDB, e.g.)
>
> Querying existing graphs is a different animal; in fact, it's so
> complex that a DSL like Gremlin is likely the best play (as mentioned
> in the thread), but certainly a processor that hides the scaffolding
> would be helpful. Maybe we can get a graph bundle with
> PutToGraphDatabase and QueryGraphDatabase processors. If you agree
> (partially or wholly), please feel free to log Jira(s) to add such
> things :)
>
> As it turns out, I may have some free time this weekend (thanks
> Grandma for watching the kids!), I've wanted the PutToGraphDB
> processor for a while, as well as a Tinkerpop-enabled
> Site-to-Site-client to ingest NiFi flow files (containing graph-ready
> data) that writes to graph databases. Stay tuned to this list, if I
> get something useful I will be sure to share it. Also I would be very
> appreciative for any guidance, suggestions, and review you'd like to
> share :)
>
> Cheers,
> Matt
>
>
> On Fri, May 20, 2016 at 10:02 PM, Pat Trainor <pat.trai...@gmail.com>
> wrote:
> > Joe Witt & Andre,
> >
> > You guys are very nice, but I thought this was the only mailing list for
> > Nifi... I now realize that there is a user's list, where this kind of
> > question would be more appropriate.
> >
> > Thanks for not tazing me... :)
> >
> > As for the great replies:
> >
> > Joe,
> >
> > I want the output to be used in Titan for analysis (which is why I
> > figured a Nifi->Titan connection was needed), but I didn't think of going
> > right to Titan's DB store. I'm wondering now whether Titan can still do
> > its thing if graphs are created/modified in this fashion.
> >
> > I'm only into Titan/Cassandra for a month or so now, starting off on
> > Hadoop. I will go on the user list and see if anyone has taken this
> > approach, and I'll try to figure out which connector (processor) would
> > update/insert into Cassandra.
> >
> > Very insightful!
> >
> > Andre,
> >
> > This is promising, but running scripts is a direction I really don't want
> > to go in. I like the speed/performance of code held in memory (like a
> > daemon), instead of loading it off the HD every time it's needed. Loading
> > from disk would seem to me to be a limiting factor for scaling.
> >
> > Also, I'm really trying to not add anything more to the zoo I have now.
> > If it is inevitable, so be it. I can write code in Java that talks to the
> > tinkerpop3 stack and I can see the data in cassandra, but again, I have
> > to load up the java program each time it is run.
> >
> > If a Processor in Nifi could just convert to the needed format/syntax and
> > act against either Titan, Cassandra, or an existing, running component
> > directly, it would make the flow of data very fast (in my mind).
> >
> > I need to figure out what other users are doing, and I will do that on
> > the user's list...
> > Maybe like Joe said, I can go from Nifi processor directly to
> > Cassandra...
> > Sounds very interesting...
> >
> > Thanks again for putting up with a non-dev question, folks!
> >
> > pat
> > :)
> >
> >
> >
> >
> >
> > Thanks!
> >
> > pat <http://about.me/PatTrainor>
> > ( ͡° ͜ʖ ͡°)
> >
> > LinkedIn <http://LinkedIn.com/in/PatTrainor>
> > Hobby/Fun Blog <http://blog.atcp.us>
> > Sales Engineering University <http://seu.atcp.us>
> >
> >
> >
> > "A wise man can learn more from a foolish *question* than a fool can
> > learn from a wise *answer*". ~ Bruce Lee.
> >
> > On Fri, May 20, 2016 at 12:22 AM, Joe Witt <joe.w...@gmail.com> wrote:
> >
> >> Pat
> >>
> >> It looks like Titan can be backed by Apache Cassandra or Apache HBase.
> >> NiFi can deliver to both of those.  Would that take care of what
> >> you're looking to do?
> >>
> >> Thanks
> >> Joe
> >>
> >> On Thu, May 19, 2016 at 11:16 PM, Pat Trainor <pat.trai...@gmail.com>
> >> wrote:
> >> > Thanks to everyone making this now open sourced tool awesome.
> >> >
> >> > I can't find anything that directly links the 2.
> >> >
> >> > I want to use Nifi to coordinate page scraping, parsing, and finally
> >> > throw Titan data for graphs.
> >> >
> >> > Do I have to use Kafka or Spark for this?
> >> >
> >> > I'm looking at the output mechanisms (Processors), and I'm not a JSON
> >> > expert, but I can write anything needed in Java...
> >> >
> >> > I kind of like the elegance of Titan on Cassandra, and am reluctant to
> >> > add more animals to my little ark!
> >> >
> >> > Just looking for pointers to the tech that would fit neatly...
> >> >
> >> > Thanks in advance for your insights!
> >> >
> >> > pat <http://about.me/PatTrainor>
> >> > ( ͡° ͜ʖ ͡°)
> >>
>
