Re: ListenUDP source IP for single datagram

2016-01-01 Thread Joe Witt
Doug,

Did a quick scan through and it does look like we're not exposing
this.  I agree with you that this would be cool/useful.  It looks like
this line is the one that is throwing away this useful context today:

https://github.com/apache/nifi/blob/master/nifi-commons/nifi-socket-utils/src/main/java/org/apache/nifi/io/nio/DatagramChannelReader.java#L52

The return type of the receive call is a SocketAddress of the sender
and that is precisely what it looks like you'd want.  That would make
for a great flow file attribute to add.
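
For illustration, a minimal sketch of the idea being discussed: keeping the SocketAddress that DatagramChannel.receive() returns instead of discarding it. The helper method and the "udp.sender" attribute name are hypothetical, not existing NiFi APIs:

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;

public class SenderCapture {

    // Hypothetical helper: like DatagramChannelReader's read, but keeps the
    // SocketAddress that receive() returns instead of throwing it away.
    public static String receiveWithSender(DatagramChannel channel, ByteBuffer buffer) throws Exception {
        InetSocketAddress sender = (InetSocketAddress) channel.receive(buffer);
        if (sender == null) {
            return null; // no datagram available (nonblocking mode)
        }
        // This value could be attached to the FlowFile, e.g. as a
        // hypothetical "udp.sender" attribute.
        return sender.getAddress().getHostAddress();
    }

    public static void main(String[] args) throws Exception {
        // Loopback demo: send ourselves a datagram and capture the sender IP.
        try (DatagramChannel receiver = DatagramChannel.open().bind(new InetSocketAddress("127.0.0.1", 0));
             DatagramChannel sender = DatagramChannel.open()) {
            sender.send(ByteBuffer.wrap("hi".getBytes()), receiver.getLocalAddress());
            String ip = receiveWithSender(receiver, ByteBuffer.allocate(64));
            if (!"127.0.0.1".equals(ip)) {
                throw new AssertionError("unexpected sender: " + ip);
            }
            System.out.println(ip); // prints 127.0.0.1
        }
    }
}
```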

Thanks
Joe

On Wed, Dec 30, 2015 at 10:16 PM, Doug Royal  wrote:
> In the ListenUDP processor, when using the "FlowFile per Datagram" option,
> I'd like to be able to track the message source IP.
>
> I've dug through the source code, and there isn't a way to do this now.
>
> I was wondering if anyone had an interest in, or concerns about this
> feature?
>
> Thanks,
> Doug
> --
> :wq


Re: How to iterate through complex JSON objects.

2016-01-01 Thread Joe Witt
Igor,

The term ETL has a lot of baggage associated with it.  What NiFi was
built to do is dataflow management.  There are already a lot of tools
out there that address the typical relational database ETL space and
NiFi doesn't need to replicate all of those functions.  So it's probably
best to just focus on use cases/problems and see whether NiFi handles them
nicely now, doesn't handle them nicely now but should be made to do
so, or doesn't handle them nicely and it should always be left to some
other system.

Thanks
Joe

On Wed, Dec 16, 2015 at 7:10 PM, Igor Kravzov  wrote:
> The question is "Is NiFi supposed to be a full ETL tool"?
> On Dec 16, 2015 11:27 AM, "Angry Duck Studio" 
> wrote:
>
>> Shweta,
>>
>> I think your issue demonstrates one of my minor complaints with NiFi --
>> that you always have to think in terms of several little, built-in pieces
>> to get a simple job done. Sometimes it's fun, like a puzzle, but other
>> times, I don't feel like dealing with it. That's why I wrote this:
>> https://github.com/mring33621/scripting-for-nifi. A short, custom JS or
>> Groovy script could have handled your JSON data munging in a single stroke.
>>
>> -Matt
>>
>> On Tue, Dec 15, 2015 at 8:40 PM, shweta  wrote:
>>
>> > Thanks Bryan!! In fact I followed the exact approach you described. I
>> > just wasn't aware of the MergeContent processor, so I wrote a custom
>> > script to combine the different outputs and ran it with
>> > ExecuteStreamCommand.
>> > Will try the same with MergeContent.
>> >
>> >
>> >
>> > --
>> > View this message in context:
>> >
>> http://apache-nifi-developer-list.39713.n7.nabble.com/How-to-iterate-through-complex-JSON-objects-tp5776p5806.html
>> > Sent from the Apache NiFi Developer List mailing list archive at
>> > Nabble.com.
>> >
>>


Re: Common data exchange formats and tabular data

2016-01-01 Thread Joe Witt
Toivo - this thread seems important and does not appear to have come
to a resolution.  Do you want to pick this back up, or are you
comfortable with where it is for now?

On Wed, Dec 2, 2015 at 12:39 PM, dcave  wrote:
> Adding multiple input and output format support would complicate the
> usability and ongoing maintenance of the SQL/NoSQL processors.
> Additionally, as you suggested, it is impossible to select a "correct"
> format or set of formats that can handle all potential needs.
>
> A simpler and more streamlined solution is to put the emphasis on having
> Convert processors available that can handle specific cases as they come up
> as your last comment suggested.  This also keeps processor focus on one
> specific task rather than having Get/Put/Convert hybrids that can lead to
> unneeded complexity and code bloat.
>
> This is in line with Benjamin's line of work.
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/Common-data-exchange-formats-and-tabular-data-tp3508p5551.html


Re: Should we have an easier way to implement ListenBlah processors?

2016-01-01 Thread Joe Witt
Hello Andre,

Wanted to follow up with you and see how things are developing here
and offer any assistance.

Thanks
Joe

On Mon, Nov 30, 2015 at 11:09 AM, Andre  wrote:
> Hi Joe,
>
> Damn! I should have suspected it... :-(
>
> Anyhow, the code is starting to take shape here :
>
> https://github.com/trixpan/nifi-lumberjack-bundle
> (I will send a PR when I get closer to complete)
>
> I believe I've cleaned out most of the netty dependencies, except for the
> issues I faced trying to get TLS to work (netty is so much easier for
> this sort of thing!), so in the meantime I am simply using socat to play
> the TLS termination point until I can get my head around it.
>
> The code heavily follows the ListenSyslog pattern (don't pay attention
> to the mess, I will clean it up later) but still doesn't inject
> flowfiles. So far I have been able to receive the initial window size
> packets; however, no traffic is processed yet (I suspect it is a
> problem with the frame decoder switch statement but will have to
> confirm tomorrow morning).
>
> Cheers
>
>
> On Mon, Nov 30, 2015 at 1:45 PM, Joe Witt  wrote:
>> Hello
>>
>> I think it is fair to say that building listeners, where NiFi is
>> acting as a recipient of data being pushed to it, is among the harder
>> extension points to build.  I also think that the outline of the
>> general pattern is fair at a high level but that a good API to make
>> that more repeatable generically is pretty hard for the reasons Tony
>> mentions.
>>
>> That said there are things we can do and should do to just keep making
>> such things a little bit easier over time.  We're happy to help you
>> walk through NIFI-856 and if we find things to chip away at
>> generically then let's identify them and work them off.
>>
>> Thanks
>> Joe
>>
>> On Sun, Nov 29, 2015 at 9:37 PM, Tony Kurc  wrote:
>>> Many of these use BlockingQueues to pass between the "listener" and the
>>> "workers". I do think some library support for this would be a good idea,
>>> to help provide thread-safe publication and make exception handling a
>>> little easier. I think providing any of the networking support may be
>>> tricky because of the diversity of client APIs and libraries. But a nice
>>> inter-processor pub/sub API seems like a good move. I noticed Joe Witt
>>> mentioned adding a little delay on poll in his ListenSyslog, and I mentioned
>>> how Coverity and FindBugs complain about ignoring the return code of
>>> offer(). Can you put a ticket in for this? I wouldn't mind working it.
>>> On Nov 29, 2015 9:01 PM, "Andre"  wrote:
>>>
 Hi,

 I am trying to give it a go on NIFI-856 (I'm not a coder either but
 decided to take the challenge).

 As I try to get my head around it, I've noticed that coding "Listeners"
 for NiFi is sort of a nightmare (please take no offense, it is just a
 sincere observation from an unskilled coder).

 I've looked at ListenHTTP, ListenUDP and ListenSyslog for inspiration,
 and I got the impression they all need to tackle the handoff of
 messages between the Listener "daemon" and the onTrigger method
 (e.g. syslogEvents within ListenSyslog).

 Is this understanding correct?

 If so, shouldn't the framework contain something that makes it easier
 to write this sort of processor instead of having to recreate it each time?

 I say that because nearly every Listen processor will:

 1. Bind to a socket;
 2. Receive messages;
 3. Read messages;
 4. Add them to a FlowFile;
 5. Acknowledge receipt;

 Where steps 4 and 5 are only applicable to some protocols (like Logstash,
 RELP, RLTP, etc.).

 Keen to hear your thoughts

 Andre
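

The queue-based handoff that Tony describes, and that steps 1-5 above imply, could be sketched roughly as follows. Class and method names here are made up for illustration; a real processor would also need the socket handling, batching, and framework wiring:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Rough sketch (all names hypothetical) of the handoff the Listen* processors
// share: a listener thread offers received messages onto a bounded queue and
// onTrigger polls them off.
public class ListenerHandoff {

    private final BlockingQueue<byte[]> events = new ArrayBlockingQueue<>(1000);

    // Called by the listener "daemon" thread for each received message.
    // Checking offer()'s return (the Coverity/FindBugs complaint) makes a
    // full queue a visible drop rather than a silent one.
    public boolean enqueue(byte[] message) {
        return events.offer(message);
    }

    // Called from onTrigger: poll with a short timeout rather than spin,
    // akin to the small delay mentioned for ListenSyslog.
    public byte[] nextEvent() throws InterruptedException {
        return events.poll(20, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ListenerHandoff handoff = new ListenerHandoff();
        boolean accepted = handoff.enqueue("hello".getBytes());
        byte[] event = handoff.nextEvent();
        System.out.println(accepted + " " + new String(event)); // prints: true hello
    }
}
```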



Re: Merging csv files based on criterias

2016-01-01 Thread Joe Witt
Yaismel,

My best guess is that accomplishing this would require custom code to
handle the merging logic.  As described thus far I believe I understand
the use case, but I still have a lot of questions about the frequency of
arrival for each dataset, how to handle misses (where one side doesn't
have a reference for a row in the other side), what to do with
out-of-order arrival, etc.

Thanks
Joe
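
For illustration only, that custom merging logic might look like the sketch below: a hash join of two CSV datasets on their first column. All names are hypothetical, and it sidesteps exactly the open questions above: misses are silently dropped, arrival order is assumed resolved, and 15-million-row files would likely need a sorted-merge or spill-to-disk strategy rather than an in-memory map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CsvJoin {

    // Hypothetical sketch: join rows of two CSV datasets on the first column.
    // Rows without a match on the other side are simply skipped here.
    public static List<String> join(List<String> left, List<String> right) {
        // Index the right-hand rows by their key column.
        Map<String, String> rightByKey = new HashMap<>();
        for (String row : right) {
            String[] parts = row.split(",", 2);
            rightByKey.put(parts[0], parts[1]);
        }
        // Emit one combined row per left-hand row that has a match.
        List<String> merged = new ArrayList<>();
        for (String row : left) {
            String[] parts = row.split(",", 2);
            String match = rightByKey.get(parts[0]);
            if (match != null) {
                merged.add(row + "," + match);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("1,alice", "2,bob");
        List<String> b = Arrays.asList("2,admin", "3,guest");
        System.out.println(join(a, b)); // prints [2,bob,admin]
    }
}
```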

On Fri, Nov 13, 2015 at 6:18 PM, Yaismel Miranda Pons
 wrote:
> Hi Joe, thanks for taking the time to answer.
>
> This is the scenario I'm trying to accomplish with NiFi: I want to create a
> simple dataflow for automating the process of ingesting CSV data found in
> some datasets. The dataset could come from either an HTTP endpoint or just
> be CSV files in a directory, and it has to be ingested every month.
> I was able to implement this scenario with NiFi when the data is just a
> single CSV file, but I have some cases where there can be 2 or more related
> CSV files. I would like to know if there is an effective way in NiFi to
> combine these CSV files into a single one, based on specific criteria. Each
> file contains around 15 million records.
>
> Thanks
> Yaismel
>
>
>
> --
> View this message in context: 
> http://apache-nifi-developer-list.39713.n7.nabble.com/Merging-csv-files-based-on-criterias-tp4711p4873.html


[GitHub] nifi pull request: added embedded Kafka server and tests

2016-01-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/134


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-1338 - Clarifying InvokeHTTPDocumentation

2016-01-01 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/nifi/pull/153




[GitHub] nifi pull request: NIFI-1212 Handling text/csv files as plain text...

2016-01-01 Thread apiri
GitHub user apiri opened a pull request:

https://github.com/apache/nifi/pull/154

NIFI-1212 Handling text/csv files as plain text within the content viewer

Perhaps not as full-featured as something like a grid viewer, but this adds
handling of text/csv files as plain text so they can be viewed within the
StandardContentViewerController.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apiri/incubator-nifi nifi-1212

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/154.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #154


commit bff63b0aeeca20e9a5798f9cf2f0389865c833f3
Author: Aldrin Piri 
Date:   2016-01-01T19:10:39Z

NIFI-1212 Treating text/csv files as plain text so they may also be 
displayed.



