Re: Should we have an easier way to implement ListenBlah processors?

2015-11-29 Thread Joe Witt
Hello, I think it is fair to say that building listeners, where NiFi is acting as a recipient of data being pushed to it, is among the harder extension points to build. I also think that the outline of the general pattern is fair at a high level but that a good API to make that more repeatable gen

Re: Should we have an easier way to implement ListenBlah processors?

2015-11-29 Thread Tony Kurc
Many of these use BlockingQueues to pass data between the "listener" and the "workers". I do think some library support for this would be a good idea, to help provide thread-safe publication and make exception handling a little easier. I think providing any of the networking support may be tricky becaus
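For readers unfamiliar with the pattern Tony describes, here is a minimal sketch of a listener-to-worker handoff over a bounded java.util.concurrent.BlockingQueue. The class name ListenerHandoff and its methods publish and pollBatch are hypothetical, made up for illustration; they are not part of NiFi or of any library proposed in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not a NiFi class: hands messages from one
// "listener" thread to one or more "worker" threads.
public class ListenerHandoff {
    // Bounded so that a slow worker eventually pushes back on the listener
    // instead of letting the queue grow without limit.
    private final BlockingQueue<byte[]> queue;

    public ListenerHandoff(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    // Called from the listener thread. Returns false if the queue stayed
    // full for the whole timeout, or if the thread was interrupted.
    public boolean publish(byte[] message, long timeoutMillis) {
        try {
            return queue.offer(message, timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    // Called from a worker thread. Waits up to timeoutMillis for a first
    // message, then drains whatever else is already queued, up to maxBatch.
    public List<byte[]> pollBatch(int maxBatch, long timeoutMillis) {
        List<byte[]> batch = new ArrayList<>();
        if (maxBatch < 1) {
            return batch;
        }
        try {
            byte[] first = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (first == null) {
                return batch; // nothing arrived within the timeout
            }
            batch.add(first);
            queue.drainTo(batch, maxBatch - 1); // non-blocking
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return batch;
    }
}
```

In a processor, the worker side would presumably be called from the scheduled work method and the listener side from whatever thread accepts network data; that wiring is left out here because it depends on the transport.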

Should we have an easier way to implement ListenBlah processors?

2015-11-29 Thread Andre
Hi, I am trying to give it a go on NIFI-856 (I'm not a coder either but decided to take the challenge). As I try to get my head around it I've noticed that coding "Listeners" for NiFi is sort of a nightmare. (Please take no offense; it is just a sincere statement from an unskilled coder.) I've looked at

Re: Can anyone explain the WHY of this PutHDFS logic?

2015-11-29 Thread Mark Petronic
Thanks Tony, that makes sense. On Sun, Nov 29, 2015 at 11:11 AM, Tony Kurc wrote: > Mark, > I didn't write this code, but I believe that this code will trigger when > the namenode is out to lunch (e.g. garbage collecting or a big queue of > operations) and *maybe* safe mode. Hadoop code in genera

Re: Can anyone explain the WHY of this PutHDFS logic?

2015-11-29 Thread Tony Kurc
Mark, I didn't write this code, but I believe that this code will trigger when the namenode is out to lunch (e.g. garbage collecting or a big queue of operations) and *maybe* safe mode. Hadoop code in general really doesn't have the concept of flow control (i.e. RPC feedback that it is too busy, and
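As a rough illustration of the kind of retry-and-sleep loop under discussion, here is a generic sketch. It is not the PutHDFS code; the class RetryWithSleep, the method retryUntil, and its parameters are hypothetical names chosen for this example. It assumes the caller can express "did the operation succeed yet?" as a boolean check, which stands in for whatever condition the real code polls when the namenode is slow to respond.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not taken from NiFi or Hadoop.
public class RetryWithSleep {
    // Runs `attempt` up to maxAttempts times, sleeping sleepMillis between
    // failed attempts. Returns the 1-based attempt number that succeeded,
    // or -1 if every attempt failed or the thread was interrupted.
    public static int retryUntil(BooleanSupplier attempt, int maxAttempts, long sleepMillis) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return i;
            }
            if (i < maxAttempts) { // no point sleeping after the last try
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return -1;
                }
            }
        }
        return -1;
    }
}
```

The worst-case cost is roughly (maxAttempts - 1) * sleepMillis per call, which is the bottleneck concern raised in the original question: with no feedback from the remote side, the caller can only wait and try again.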

Can anyone explain the WHY of this PutHDFS logic?

2015-11-29 Thread Mark Petronic
This is the sort of "mystery" code that really should have some explicit code comments. :) What are the underlying reasons for this retry logic? This could definitely lead to a bottleneck if this loop has to run and sleep numerous times. Just wondering what, in HDFS, results in the need to do this?