Adam, I was able to import the full template, thanks. A couple of things...
The ExtractText processor works by adding user-defined properties (the + icon in the top-right of the properties window), where the property name is a destination attribute and the value is a regular expression. Right now there aren't any regular expressions defined, so that processor will always route the file to 'unmatched'. Generally you would route the matched files to the next processor and auto-terminate the unmatched relationship (assuming you want to filter out non-matches).

Do you know if MongoDB supports inserting a CSV file through their Java client? Do you have similar code that already does this in Storm? I am honestly not that familiar with MongoDB, but the PutMongo processor takes the incoming data and calls:

    Document doc = Document.parse(new String(content, charset));

Looking at that Document.parse() method, it looks like it expects a JSON document, so I just want to make sure that we expect CSV insertions to work here. In researching this, it looks like Mongo has a bulk-import utility, mongoimport [1], that handles CSV, but it is a command-line utility.

-Bryan

[1] http://docs.mongodb.org/manual/reference/program/mongoimport/

On Mon, Sep 21, 2015 at 3:19 PM, Adam Williams <[email protected]> wrote:

> Sorry about that, this should work.
> Attached the template and the below error:
>
> 2015-09-21 14:36:02,821 ERROR [Timer-Driven Process Thread-10]
> o.a.nifi.processors.mongodb.PutMongo
> PutMongo[id=480877a4-f349-4ef7-9538-8e3e3e108e06] Failed to insert
> StandardFlowFileRecord[uuid=bbd7048f-d5a1-4db4-b938-da64b67e810e,claim=org.apache.nifi.controller.repository.claim.StandardContentClaim@8893ae38,offset=0,name=GDELT.MASTERREDUCEDV2.TXT,size=6581409407]
> into MongoDB due to java.lang.NegativeArraySizeException:
> java.lang.NegativeArraySizeException
>
> ------------------------------
> Date: Mon, 21 Sep 2015 15:12:43 -0400
> Subject: Re: CSV to Mongo
> From: [email protected]
> To: [email protected]
>
> Adam,
>
> I imported the template and it looks like it only captured the PutMongo
> processor. Can you try deselecting everything on the graph and creating the
> template again so we can take a look at the rest of the flow? Or, if you
> have other stuff on your graph, select all of the processors you described
> so they all get captured.
>
> Also, can you provide any of the stack trace for the exception you are
> seeing? The log is in NIFI_HOME/logs/nifi-app.log
>
> Thanks,
>
> Bryan
>
> On Mon, Sep 21, 2015 at 3:03 PM, Bryan Bende <[email protected]> wrote:
>
> Adam,
>
> Thanks for attaching the template, we will take a look and see what is
> going on.
>
> Thanks,
>
> Bryan
>
> On Mon, Sep 21, 2015 at 2:50 PM, Adam Williams <[email protected]> wrote:
>
> Hey Joe,
>
> Sure thing. I attached the template; I'm just taking the GDELT data set
> for the GetFile processor, which works. The error I get is a negative array.
>
> >
> > Date: Mon, 21 Sep 2015 14:24:50 -0400
> > Subject: Re: CSV to Mongo
> > From: [email protected]
> > To: [email protected]
> >
> > Adam,
> >
> > Regarding moving from Storm to NiFi, I'd say they make better teammates
> > than competitors. The use case outlined above should be quite easy
> > for NiFi, but there are analytic/processing functions Storm is probably
> > a better answer for. We're happy to help explore that with you as you
> > progress.
> >
> > If you ever run into an ArrayIndexBoundsException, then it will
> > always be 100% a coding error. Would you mind sending your
> > flow.xml.gz over or making a template of the flow (assuming it
> > contains nothing sensitive)? If at all possible, sample data which
> > exposes the issue would be ideal. As an alternative, can you go ahead
> > and send us the resulting stack trace/error that comes out?
> >
> > We'll get this addressed.
> >
> > Thanks
> > Joe
> >
> > On Mon, Sep 21, 2015 at 2:17 PM, Adam Williams
> > <[email protected]> wrote:
> > > Hello,
> > >
> > > I'm moving from Storm to NiFi and trying to do a simple test with
> > > getting a large CSV file dumped into MongoDB. The CSV file has a
> > > header with column names and it is structured; my only problem is
> > > dumping it into MongoDB. At a high level, do the following processor
> > > steps look correct? All I want is to just pull the whole CSV file
> > > over to MongoDB without a regex or anything fancy (yet). I eventually
> > > always seem to hit trouble with array index problems with the
> > > PutMongo processor:
> > >
> > > GetFile --> ExtractText --> RouteOnAttribute (not a null line) --> PutMongo
> > >
> > > Does that seem to be the right way to do this in NiFi?
> > >
> > > Thank you,
> > > Adam
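One more observation on the stack trace above: the flow file is 6,581,409,407 bytes, which is larger than Java's Integer.MAX_VALUE (2,147,483,647), the maximum size of a single array. I haven't confirmed this is what's happening inside PutMongo, but if the processor reads the entire content into one byte[] before calling Document.parse(), narrowing that long length to an int would wrap around to a negative number, which is exactly what a NegativeArraySizeException suggests. The arithmetic fits (simulated here in Python, since it's Java's cast semantics that matter):

```python
# Simulate Java's narrowing cast from long to int for the flow file size
# reported in the stack trace (6,581,409,407 bytes).
size = 6581409407
INT_MAX = 2**31 - 1

def java_int_cast(n):
    """Wrap an arbitrary integer into Java's signed 32-bit int range."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n > INT_MAX else n

print(size > INT_MAX)       # True: bigger than any Java array can hold
print(java_int_cast(size))  # -2008525185: a negative array size
```

If that's the cause, splitting the file into smaller flow files before PutMongo (or loading it outside NiFi) would sidestep the error regardless of any fix to the processor.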

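Also, coming back to the CSV question: if Document.parse() really does require JSON, one workaround would be to convert each CSV row into a JSON document before it reaches PutMongo. A rough sketch of just that conversion step, in plain Python with a made-up sample (the field names here are illustrative, not the actual GDELT schema):

```python
import csv
import io
import json

# Illustrative sample only; a real GDELT file has its own header and columns.
SAMPLE = "date,actor,count\n20150921,USA,42\n"

def csv_to_json_docs(text):
    """Turn each CSV data row into a JSON document string,
    using the header row for the field names."""
    reader = csv.DictReader(io.StringIO(text))
    return [json.dumps(dict(row)) for row in reader]

docs = csv_to_json_docs(SAMPLE)
print(docs[0])  # {"date": "20150921", "actor": "USA", "count": "42"}
```

For a one-time bulk load, the mongoimport utility in [1] does essentially this for you at the command line with --type csv --headerline, so that may be the easier route than pushing the conversion through the flow.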