Re: Processors on the fly for many sensor devices

2016-09-28 Thread Andrew Psaltis
Davy,
The processor I have been working on may meet your needs. You are correct:
I have not pushed the source for it yet, as I am still working through some
hurdles. The one thing left to work out is how you would add the processors
dynamically; I suppose you may be able to use the NiFi REST API. I would
imagine there are quite a number of these devices that you would need
processors for. In the use case I have been working on, there may be 600 or
so endpoints that I need to connect to, and I'm trying to figure out whether
it makes sense to do it this way.

I hope to be in a position soon to push the code I have for the GetTCP
processor.




On Thu, Sep 29, 2016 at 5:22 AM, Bryan Bende  wrote:

> Just wanted to clarify something about ListenTCP... it does support
> multiple incoming connections; however, if you are using the batch output
> capability, one flow file will contain data across all the connections.
>
> I do agree with Andrew that based on the description it sounds like NiFi
> is expected to be the client that initiates a connection and keeps reading
> for some amount of time/threshold, like a GetTCP processor.
>
> Currently we have ListenTCP which is waiting for incoming connections, and
> PutTCP which makes a connection, but only writes data.

-- 
Thanks,
Andrew

Subscribe to my book: Streaming Data 

twitter: @itmdata


Re: Processors on the fly for many sensor devices

2016-09-28 Thread Bryan Bende
Just wanted to clarify something about ListenTCP... it does support
multiple incoming connections; however, if you are using the batch output
capability, one flow file will contain data across all the connections.

I do agree with Andrew that based on the description it sounds like NiFi is
expected to be the client that initiates a connection and keeps reading for
some amount of time/threshold, like a GetTCP processor.

Currently we have ListenTCP which is waiting for incoming connections, and
PutTCP which makes a connection, but only writes data.
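
A minimal sketch of the client-initiated read described above, in plain Python rather than NiFi code. It is not the GetTCP processor discussed in this thread; the function names, the idle timeout, and the byte limit are assumptions chosen for illustration. It connects out to a sensor and keeps reading until the sensor closes the connection, an idle timeout passes, or a size threshold is reached.

```python
import socket


def read_from_sensor(conn, max_bytes=65536, idle_timeout=5.0, chunk_size=4096):
    """Read from an already-connected socket until the peer closes,
    the idle timeout expires, or max_bytes have been collected."""
    conn.settimeout(idle_timeout)
    chunks = []
    total = 0
    while total < max_bytes:
        try:
            # Never ask for more than the remaining byte budget.
            data = conn.recv(min(chunk_size, max_bytes - total))
        except socket.timeout:
            break  # nothing arrived within the idle window
        if not data:
            break  # the sensor closed the connection
        chunks.append(data)
        total += len(data)
    return b"".join(chunks)


def poll_sensor(host, port, **kwargs):
    """Connect to one sensor as a client and return whatever it sends.
    host and port would come from the sensor inventory (hypothetical)."""
    with socket.create_connection((host, port), timeout=10.0) as conn:
        return read_from_sensor(conn, **kwargs)
```

With roughly one connection per sensor, something like `poll_sensor("10.0.0.5", 4001, idle_timeout=2.0)` (address made up) would be called once per inventory entry; how that maps onto processors or threads is exactly the open question in this thread.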

On Wed, Sep 28, 2016 at 5:02 PM, Andrew Psaltis 
wrote:

> Davy,
> It sounds like you need a GetTCP type of processor that connects from NiFi
> to the TCP endpoint on the sensor, is that correct?
>
> Thanks,
> Andrew


Re: Processors on the fly for many sensor devices

2016-09-28 Thread Davy De Waele
Correct... but I don't think that one is in the official NiFi distribution.
I did stumble upon your repo / JIRA issue. (The repo didn't contain any
sources, only binaries, I think.)

But I guess we would still need some way of dynamically adding these
processors there (as it would require one processor per sensor).

Thx

On Wednesday, 28 September 2016, Andrew Psaltis 
wrote:

> Davy,
> It sounds like you need a GetTCP type of processor that connects from NiFi
> to the TCP endpoint on the sensor, is that correct?
>
> Thanks,
> Andrew


Re: Processors on the fly for many sensor devices

2016-09-28 Thread Andrew Psaltis
Davy,
It sounds like you need a GetTCP type of processor that connects from NiFi
to the TCP endpoint on the sensor, is that correct?

Thanks,
Andrew

On Thu, Sep 29, 2016 at 4:46 AM, Davy De Waele  wrote:

> Hi,
>
> Thanks for the response ... it's an existing network of sensors. The
> sensors spit out data over a serial interface that is exposed over a tcp
> connection. (rs232 -> ethernet converter in the sensor).
> The current sensor architecture involves clients making direct connections
> to the individual sensors. (establishing a tcp connection to the specific
> ip of the sensor).
>
> If I understand correctly, ListenTCP would not work in this case for
> multiple sensors.
>
> Are you talking about a setup where the sensors would be in a "client"
> mode where each sensor would establish a TCP connection to a single
> ListenTCP processor?
>
> Thx


-- 
Thanks,
Andrew



Re: Processors on the fly for many sensor devices

2016-09-28 Thread Davy De Waele
Hi,

Thanks for the response ... it's an existing network of sensors. The
sensors spit out data over a serial interface that is exposed over a TCP
connection (an RS-232 -> Ethernet converter in the sensor).
The current sensor architecture involves clients making direct connections
to the individual sensors (establishing a TCP connection to the specific
IP of the sensor).

If I understand correctly, ListenTCP would not work in this case for
multiple sensors.

Are you talking about a setup where the sensors would be in a "client" mode
where each sensor would establish a TCP connection to a single ListenTCP
processor?

Thx



On Wed, Sep 28, 2016 at 10:03 PM, Joe Witt  wrote:

> Hello
>
> Can you talk a bit about why you'd want ListenTCP processors tied to a
> given sensor?  You should be able to have many sensors send to a single
> ListenTCP.  Each stream will be between a source/sensor and NiFi, so
> data won't be getting intermingled there.  If we're not providing
> enough session/stream metadata on the flow files to make demux of the
> streams easy using something like RouteOnAttribute or whatnot, we
> definitely should.
>
> Now, that said, you could certainly programmatically deploy (via the
> REST API) instances of these processors along the lines of what your
> endpoint registry tells you.  It just seems on the surface like doing
> so would be avoidable at least for the listening of data.  Typically
> such a registry would be useful to do additional tagging/enrichment of
> the data and would occur once it is in the flow.
>
> Thanks
> Joe


Re: Processors on the fly for many sensor devices

2016-09-28 Thread Joe Witt
Hello

Can you talk a bit about why you'd want ListenTCP processors tied to a
given sensor?  You should be able to have many sensors send to a single
ListenTCP.  Each stream will be between a source/sensor and NiFi, so
data won't be getting intermingled there.  If we're not providing
enough session/stream metadata on the flow files to make demux of the
streams easy using something like RouteOnAttribute or whatnot, we
definitely should.

Now, that said, you could certainly programmatically deploy (via the
REST API) instances of these processors along the lines of what your
endpoint registry tells you.  It just seems on the surface like doing
so would be avoidable at least for the listening of data.  Typically
such a registry would be useful to do additional tagging/enrichment of
the data and would occur once it is in the flow.

Thanks
Joe
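
A rough sketch of the programmatic deployment mentioned above: build one processor-creation request per entry in the sensor inventory and POST it to the NiFi REST API. The endpoint path (`/process-groups/{id}/processors`) and the entity shape (`revision` plus `component`) reflect my understanding of the NiFi 1.x REST API and should be checked against the REST API documentation for your version. The inventory record format, the `Port` property mapping, the naming scheme, and the canvas layout are assumptions made for illustration, and an unsecured NiFi instance is assumed.

```python
import json
import urllib.request


def processor_entity(sensor, processor_type, index=0):
    """Build the JSON body for creating one processor.
    `sensor` is a hypothetical inventory record such as
    {"name": "s1", "ip": "10.0.0.5", "port": 4001}."""
    return {
        # A new component starts at revision version 0.
        "revision": {"version": 0},
        "component": {
            "type": processor_type,
            "name": "sensor-{}".format(sensor["name"]),
            # Stack processors vertically so they do not overlap.
            "position": {"x": 0.0, "y": 150.0 * index},
            # Assumption: the processor exposes a "Port" property.
            "config": {"properties": {"Port": str(sensor["port"])}},
        },
    }


def create_processor(base_url, group_id, entity):
    """POST the entity to NiFi; base_url is e.g. http://host:8080/nifi-api."""
    url = "{}/process-groups/{}/processors".format(base_url.rstrip("/"), group_id)
    request = urllib.request.Request(
        url,
        data=json.dumps(entity).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A driver would fetch the inventory from the registry's REST endpoint, call `processor_entity` for each record, and pass the result to `create_processor`. Connecting, starting, and later removing the processors are separate REST calls not shown here.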

On Wed, Sep 28, 2016 at 3:39 PM, Davy De Waele  wrote:
> We have a large number of sensors that send out data via TCP. The idea is to
> use a ListenTCP processor in NiFi to capture the data and do some filtering /
> basic transformation before sending it upstream into our stack.
>
> We can configure individual ListenTCP processors for each sensor, and that
> works fine when the number of sensors is small, but once you hit a larger
> number it becomes cumbersome and difficult to manage.
>
> We have an inventory of those sensors (exposed via a REST service endpoint)
> containing the sensor TCP information, such as IP and port.
>
> Is there an easy way to create these ListenTCP processors on the fly based
> on a REST endpoint or some other external configuration? How would that
> work?
>
> Thx.


Processors on the fly for many sensor devices

2016-09-28 Thread Davy De Waele
We have a large number of sensors that send out data via TCP. The idea is
to use a ListenTCP processor in NiFi to capture the data and do some
filtering / basic transformation before sending it upstream into our stack.

We can configure individual ListenTCP processors for each sensor, and that
works fine when the number of sensors is small, but once you hit a larger
number it becomes cumbersome and difficult to manage.

We have an inventory of those sensors (exposed via a REST service
endpoint) containing the sensor TCP information, such as IP and port.

Is there an easy way to create these ListenTCP processors on the fly based
on a REST endpoint or some other external configuration? How would that
work?

Thx.


Re: UI: feedback on the processor 'color' in NiFi 1.0

2016-09-28 Thread Andy LoPresto
I think Rob’s layer idea is also very useful, and I agree with Scott’s elaboration on
it. Different user roles are interested in different metrics (a “designer” 
laying out components and solving a transformation problem vs. a “monitor” 
watching realtime data flow and keeping operations stable). As the available 
features grow, making them usable, understandable, and performant is very 
important. New roles are adopting NiFi as well, so there will be a balance 
between conservative adherence to previous user experience and growing the 
reach and audience of the project.

Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Sep 28, 2016, at 11:09 AM, Andrew Grande  wrote:
> 
> I think we are over designing. I like the big ideas, but really would love 
> simple functionality that was there before, based on user reactions I 
> observed first-hand.
> 
> Andrew

Re: UI: feedback on the processor 'color' in NiFi 1.0

2016-09-28 Thread Scott Aslan
Does the Birdseye view as a layer not solve most of the 'simple
functionality' being requested by users?

On Wed, Sep 28, 2016 at 2:21 PM, Andy LoPresto  wrote:

> I think Rob’s layer idea is also very useful, and I agree with Scott’s
> elaboration on it. Different user roles are interested in different metrics
> (a “designer” laying out components and solving a transformation problem
> vs. a “monitor” watching realtime data flow and keeping operations stable).
> As the available features grow, making them usable, understandable, and
> performant is very important. New roles are adopting NiFi as well, so there
> will be a balance between conservative adherence to previous user
> experience and growing the reach and audience of the project.
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com *
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

Re: UI: feedback on the processor 'color' in NiFi 1.0

2016-09-28 Thread Andrew Grande
I think we are over-designing. I like the big ideas, but really would love
simple functionality that was there before, based on user reactions I
observed first-hand.

Andrew

On Wed, Sep 28, 2016, 2:03 PM Scott Aslan  wrote:

> Rob - I really like your idea of layers! We could have a 'traffic' layer
> that could highlight areas of back pressure while the data is flowing. We
> could even allow the user to customize the threshold for warnings or alert
> values as well as the colors used for the different states of data flowing
> (E.g. green means data flowing normally, yellow is between the two user
> defined thresholds, and red would be over)...maybe this layer is just the
> color of the drop shadow for each element on the canvas and this view can
> be toggled on or off.
>
> Andrew - I can def see the value in coloring different phases of a flow (E.g.
> flow terminator colored in red). I wonder if we could create a list of
> these common phases and either let the user assign the processor/element to
> a phase while they are configuring it or maybe we can automatically detect
> certain well defined phases. Would also be cool to allow the user to input
> custom colors for each phase and also to be able to toggle the view on/off.
>
> Andrew - Also on the topic of coloring elements on the canvas... I was
> thinking about zooming out on the canvas and how quickly the current UX of
> colored icons becomes unhelpful...meanwhile the Birdseye view does color
> the processors in a very useful way when zoomed out...would it make sense
> to switch out the canvas for the Birdseye view once we have sufficiently
> zoomed out? I think this would satisfy most of the cases for
> needing/wanting color. Also, nifi could allow users to toggle
> the Birdseye view as one of the 'layers' even when they are zoomed in...
>
> -Scott
>
> On Wed, Sep 28, 2016 at 9:41 AM, Russell Bateman <
> russell.bate...@perfectsearchcorp.com> wrote:
>
>> After thinking on it a bit, I agree that Manish's suggestion could be a
>> good idea as an option (the way *additionalDetails.html* is an option).
>> It would be easier if they were *.png* files rather than formal icon
>> files only with a "width x length" limit.
>>
>> My two cents,
>>
>> Russ
>>
>> On 09/28/2016 12:57 AM, Manish Gupta 8 wrote:
>>
>> I think one of the things that will really help in complex data flows from
>> a UI perspective is “colored icons” on each processor. Not sure if this is
>> already part of 1.0, but from my experience, icons definitely help a lot in
>> quickly understanding complex flows. Those icons can be fixed (embedded
>> within the nar) or may be dynamic (user defined icon file for different
>> processors) – just a suggestion.
>>
>>
>>
>> Regards,
>>
>> Manish
>>
>>
>>
>> *From:* Andrew Grande [mailto:apere...@gmail.com ]
>> *Sent:* Tuesday, September 20, 2016 10:40 PM
>> *To:* users@nifi.apache.org
>> *Subject:* Re: UI: feedback on the processor 'color' in NiFi 1.0
>>
>>
>>
>> No need to go wild, changing processor colors should be enough, IMO. PG
>> and RPG are possible candidates, but they are different enough already, I
>> guess.
>>
>> What I heard quite often was to differentiate between regular processors,
>> incoming sources of data and out only (data producers?). Maybe even with a
>> shape?
>>
>> Andrew
>>
>>
>>
>> On Tue, Sep 20, 2016, 12:35 PM Rob Moran  wrote:
>>
>> Good points. I was thinking a label would be tied to the group of
>> components to which it was applied, but that could also introduce problems
>> as things move and are added to a flow.
>>
>>
>>
>> So would you all expect to be able to change the color of every component
>> type, or just processors?
>>
>>
>>
>> Andrew - your comment about coloring terminators red is interesting as
>> well. What are some other parts of a flow you might use color to identify?
>> Along with backpressure, we could explore other ways to call these things
>> out so users do not come up with their own methods. Perhaps there are layer
>> options, like on a map (e.g., "show terrain" or "show traffic").
>>
>>
>> Rob
>>
>>
>>
>> On Tue, Sep 20, 2016 at 11:23 AM, Andrew Grande 
>> wrote:
>>
>> I agree. Labels are great for grouping, beyond PGs. Processor colors
>> individually add value. E.g. flow terminator colored in red was a very
>> common pattern I used. Besides, labels are not grouped with components, so
>> moving things and re-arranging is a pain.
>>
>> Andrew
>>
>>
>>
>> On Tue, Sep 20, 2016, 11:21 AM Joe Skora < 
>> jsk...@gmail.com> wrote:
>>
>> Rob,
>>
>> The labelling functionality you described sounds very useful in general.
>> But, I miss the processor color too.
>>
>> I think labels are really useful for identifying groups of components and
>> areas in the flow, but I worry that needing to use them in volume for
>> processor coloring will increase the API and browser canvas load for
>> elements that don't actually affect the flow.
>>
>>
>>
>> On Tue, Sep 20, 2016 at 10:40 AM, Rob Moran < 
>> rmo...@gmail.com> wro

Re: UI: feedback on the processor 'color' in NiFi 1.0

2016-09-28 Thread Scott Aslan
Rob - I really like your idea of layers! We could have a 'traffic' layer
that could highlight areas of back pressure while the data is flowing. We
could even allow the user to customize the threshold for warnings or alert
values as well as the colors used for the different states of data flowing
(E.g. green means data flowing normally, yellow is between the two user
defined thresholds, and red would be over)...maybe this layer is just the
color of the drop shadow for each element on the canvas and this view can
be toggled on or off.
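
As a rough illustration of the threshold idea, here is a minimal Python
sketch. Nothing in it is a NiFi API: the function name `traffic_color`, the
default threshold values, and the use of queue utilization (0.0 to 1.0) as
the input are all assumptions made up for this example.

```python
def traffic_color(utilization, warn=0.5, alert=0.8):
    """Map a queue utilization (0.0-1.0) to a 'traffic' layer color.

    warn and alert stand in for the two user-defined thresholds;
    the default values are arbitrary placeholders.
    """
    if not 0.0 <= warn <= alert:
        raise ValueError("expected 0 <= warn <= alert")
    if utilization < warn:
        return "green"   # data flowing normally
    if utilization <= alert:
        return "yellow"  # between the two thresholds
    return "red"         # over the upper threshold


for value in (0.2, 0.6, 0.9):
    print(value, traffic_color(value))
```

The same mapping could drive the drop shadow color, with the layer toggle
simply deciding whether the result is rendered at all.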

Andrew - I can def see the value in coloring different phases of a flow (E.g.
flow terminator colored in red). I wonder if we could create a list of
these common phases and either let the user assign the processor/element to
a phase while they are configuring it or maybe we can automatically detect
certain well defined phases. Would also be cool to allow the user to input
custom colors for each phase and also to be able to toggle the view on/off.

Andrew - Also on the topic of coloring elements on the canvas...I was
thinking about zooming out on the canvas and how quickly the current UX of
colored icons becomes unhelpful...meanwhile the Birdseye view does color
the processors in a very useful way when zoomed out...would it make sense
to switch out the canvas for the Birdseye view once we have sufficiently
zoomed out? I think this would satisfy most of the cases for
needing/wanting color. Also, nifi could allow users to toggle
the Birdseye view as one of the 'layers' even when they are zoomed in...

-Scott

On Wed, Sep 28, 2016 at 9:41 AM, Russell Bateman <
russell.bate...@perfectsearchcorp.com> wrote:

> After thinking on it a bit, I agree that Manish's suggestion could be a
> good idea as an option (the way *additionalDetails.html* is an option).
> It would be easier if they were *.png* files rather than formal icon
> files only with a "width x length" limit.
>
> My two cents,
>
> Russ
>
> On 09/28/2016 12:57 AM, Manish Gupta 8 wrote:
>
> I think one of the things that will really help in complex data flow from
> UI perspective is “colored icons” on each processor. Not sure if this
> already part of 1.0, but from my experience, icons definitely help a lot in
> quickly understanding complex flows. Those icons can be fixed (embedded
> within the nar) or may be dynamic (user defined icon file for different
> processors) – just a suggestion.
>
>
>
> Regards,
>
> Manish
>
>
>
> *From:* Andrew Grande [mailto:apere...@gmail.com ]
> *Sent:* Tuesday, September 20, 2016 10:40 PM
> *To:* users@nifi.apache.org
> *Subject:* Re: UI: feedback on the processor 'color' in NiFi 1.0
>
>
>
> No need to go wild, changing processor colors should be enough, IMO. PG
> and RPG are possible candidates, but they are different enough already, I
> guess.
>
> What I heard quite often was to differentiate between regular processors,
> incoming sources of data and out only (data producers?). Maybe even with a
> shape?
>
> Andrew
>
>
>
> On Tue, Sep 20, 2016, 12:35 PM Rob Moran  wrote:
>
> Good points. I was thinking a label would be tied to the group of
> components to which it was applied, but that could also introduce problems
> as things move and are added to a flow.
>
>
>
> So would you all expect to be able to change the color of every component
> type, or just processors?
>
>
>
> Andrew - your comment about coloring terminators red is interesting as
> well. What are some other parts of a flow you might use color to identify?
> Along with backpressure, we could explore other ways to call these things
> out so users do not come up with their own methods. Perhaps there are layer
> options, like on a map (e.g., "show terrain" or "show traffic").
>
>
> Rob
>
>
>
> On Tue, Sep 20, 2016 at 11:23 AM, Andrew Grande 
> wrote:
>
> I agree. Labels are great for grouping, beyond PGs. Processor colors
> individually add value. E.g. flow terminator colored in red was a very
> common pattern I used. Besides, labels are not grouped with components, so
> moving things and re-arranging is a pain.
>
> Andrew
>
>
>
> On Tue, Sep 20, 2016, 11:21 AM Joe Skora < 
> jsk...@gmail.com> wrote:
>
> Rob,
>
> The labelling functionality you described sounds very useful in general.
> But, I miss the processor color too.
>
> I think labels are really useful for identifying groups of components and
> areas in the flow, but I worry that needing to use them in volume for
> processor coloring will increase the API and browser canvas load for
> elements that don't actually affect the flow.
>
>
>
> On Tue, Sep 20, 2016 at 10:40 AM, Rob Moran < 
> rmo...@gmail.com> wrote:
>
> What if we promote the use of Labels as a way to highlight things. We
> could add functionality to expand their usefulness as a way to highlight
> things on the canvas. I believe that is their intended use.
>
>
>
> Today you can create a label and change its color to highlight single or
> multiple components. Even better you can do it for any component (not just
> proc

Re: Remove top N lines from a text file

2016-09-28 Thread Mark Payne
Peter,

Another option that may be a lot easier for you is to use the RouteText
processor.

If you set Matching Strategy to "Satisfies Expression", you can use the
Expression Language to inspect FlowFile attributes, etc. But the RouteText
processor also exposes two additional variables, _line_ and _lineNo_, that
you can use to route individual lines. So you set the Routing Strategy to
"Route to 'matched' if line matches all conditions" and add a property with
a value of: ${lineNo:gt(5)}

That would route the first 5 lines to the 'unmatched' relationship (lineNo
is 1 for the first line), and all the other lines would be routed to the
'matched' relationship. Then you can just auto-terminate 'unmatched'.

This should help to avoid having to write any custom scripts.
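
For anyone who wants to see the routing rule spelled out, the sketch below
mimics it in plain Python. It is not how RouteText is implemented, only an
illustration of the lineNo > 5 condition; `route_lines` and its `skip`
parameter are hypothetical names.

```python
def route_lines(text, skip=5):
    """Split text into (matched, unmatched) the way the rule above would.

    Line numbers start at 1; a line is 'matched' when its number is
    greater than `skip`, otherwise it is 'unmatched'.
    """
    matched, unmatched = [], []
    for line_no, line in enumerate(text.splitlines(keepends=True), start=1):
        if line_no > skip:
            matched.append(line)
        else:
            unmatched.append(line)
    return "".join(matched), "".join(unmatched)


sample = "".join(f"row{i}\n" for i in range(1, 8))
matched, unmatched = route_lines(sample)
print(matched, end="")
```

Dropping the `unmatched` half corresponds to auto-terminating that
relationship.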

Thanks
-Mark


> On Sep 28, 2016, at 7:03 AM, Pierre Villard  
> wrote:
> 
> Hi Peter,
> 
> I would recommend the following blog post by Matt:
> http://funnifi.blogspot.fr/2016/02/executescript-processor-replacing-flow.html
>  
> 
> 
> Pierre
> 
> 2016-09-28 13:01 GMT+02:00 Andrew Grande:
> A Groovy script or a simple sed command invoked via ExecuteStreamCommand
> should do the job.
> 
> Andrew
> 
> 
> On Wed, Sep 28, 2016, 12:18 AM Peter Wicks (pwicks) wrote:
> I have a CSV file where the first few lines are a summary of the report
> parameters that were used to generate it. I want to strip these off in NiFi.
>
> I’ve considered using a RegEx to match the {N} top lines, but am wondering if
> a Groovy script might be a better option? I want to keep the file intact, so
> splitting it by line ending and routing all of the lines through a
> RouteOnAttribute seems excessive.
> 
>  
> 
> I’ve never built a Groovy script, any examples on how I might go about this?
> 
>  
> 
> Thanks,
> 
>   Peter
> 
> 



Re: nifi Rest API to get full details of the flow.

2016-09-28 Thread Sandeep Khurana
I am not able to locate the REST API that gives a summary of a flow. I am
checking at https://nifi.apache.org/docs/nifi-docs/rest-api/. Maybe I am
missing something?

On Wed, Sep 28, 2016 at 4:30 PM, Andrew Grande  wrote:

> This isn't an ideal approach, IMO. There is a standard API to get a
> summary of the flow and the status of every processor; check which URL the
> Summary tab is invoking. You can then drill into any specific component
> by ID.
>
> Andrew
>
> On Wed, Sep 28, 2016, 6:42 AM Sandeep Khurana 
> wrote:
>
>> Just now looked at the flow.xml.gz file. It serves the purpose. Thx
>>
>> On Wed, Sep 28, 2016 at 4:02 PM, Sandeep Khurana 
>> wrote:
>>
>>> Hello
>>>
>>> Is there a way to get the full details of the flow which I created from
>>> the NiFi UI ?
>>>
>>> I want to get the IDs of processors programmatically (without looking them
>>> up in the NiFi UI) and then, based upon some conditions, see the status of
>>> one or more processors.
>>>
>>>  Is there any way ?
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Thanks and regards
>> Sandeep Khurana
>>
>


-- 
Thanks and regards
Sandeep Khurana


Re: State management not captured in cluster mode

2016-09-28 Thread Pierre Villard
At the moment the state management description (if any) of a processor is
available by right clicking on it and going into "View state".

Pierre

2016-09-28 16:15 GMT+02:00 Bryan Bende :

> What I was referring to is in the code of each processor, it is annotated
> with something describing the state.
>
> For example:
> https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ListFile.java#L102-L105
>
> Currently this is not visible anywhere in the NiFi UI, but what Joe
> referenced was an improvement that Pierre submitted so that we can display
> this information in the documentation for the processor when clicking
> "usage".
>
>
> On Wed, Sep 28, 2016 at 10:09 AM, Selvam Raman  wrote:
>
>> Hi Bryan,
>>
>> Thanks for the information. Can you please share a picture showing where I
>> can see the state (local or cluster)? I could not see it anywhere.
>>
>> On Wed, Sep 28, 2016 at 1:22 PM, Bryan Bende  wrote:
>>
>>> Hi Selvam,
>>>
>>> It depends what processor you are using. For example, ListFile using a
>>> local file path will always store state locally even when clustered because
>>> no other node can take over that state since the directory to list only
>>> exists on that node. Each processor has an annotation at the top of it
>>> which specifies what type of state it stores, local or clustered. We should
>>> consider adding this to the docs if it's not included already.
>>>
>>> -Bryan
>>>
>>>
>>> On Wednesday, September 28, 2016, Selvam Raman  wrote:
>>>
>>>> Hi,
>>>>
>>>> This is my state-management.xml attribute
>>>>
>>>> <local-provider>
>>>>   <id>local-provider</id>
>>>>   <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
>>>>   <property name="Directory">./state/local</property>
>>>> </local-provider>
>>>>
>>>> <cluster-provider>
>>>>   <id>zk-provider</id>
>>>>   <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
>>>>   <property name="Connect String">hostname:2181,hostname:2181</property>
>>>>   <property name="Root Node">/opt/nifiroot</property>
>>>>   <property name="Session Timeout">10 seconds</property>
>>>>   <property name="Access Control">Open</property>
>>>> </cluster-provider>
>>>>
>>>> This is my nifi.properties file attributes
>>>>
>>>> # State Management #
>>>> nifi.state.management.configuration.file=./conf/state-management.xml
>>>> # The ID of the local state provider
>>>> nifi.state.management.provider.local=local-provider
>>>> # The ID of the cluster-wide state provider. This will be ignored if
>>>> # NiFi is not clustered but must be populated if running in a cluster.
>>>> nifi.state.management.provider.cluster=zk-provider
>>>> # Specifies whether or not this instance of NiFi should run an embedded
>>>> # ZooKeeper server
>>>> nifi.state.management.embedded.zookeeper.start=false
>>>> # Properties file that provides the ZooKeeper properties to use if
>>>> # <nifi.state.management.embedded.zookeeper.start> is set to true
>>>> nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
>>>>
>>>> # zookeeper properties, used for cluster management #
>>>> nifi.zookeeper.connect.string=hostname:2181,hostname:2181
>>>> nifi.zookeeper.connect.timeout=3 secs
>>>> nifi.zookeeper.session.timeout=3 secs
>>>> nifi.zookeeper.root.node=/opt/nifiroot
>>>>
>>>> The question here is: I am running NiFi in cluster mode and I expect
>>>> state to be stored in zk-provider, but the state is stored in
>>>> local-provider.
>>>>
>>>> local-state provider:
>>>> /home/nifi/nifi-1.0.0/state/local/partition-*
>>>>
>>>> zk-provider:(empty directory)
>>>> /opt/nifiroot
>>>>
>>>> any help on this.
>>>>
>>>>
>>>>
>>>> --
>>>> Selvam Raman
>>>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>>>
>>>
>>>
>>> --
>>> Sent from Gmail Mobile
>>>
>>
>>
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>
>


Re: InvokeHTTP request fails for huge size

2016-09-28 Thread Selvam Raman
Hi LoPresto/Villard,

Yes, I am fetching data from an open source (OAI) server into my S3 bucket.
The open source server has 14k HTTP requests, and each request downloads
files. File sizes vary from KB to MB.
I have created a NiFi workflow (listofurl - invokehttp - putS3Object) to get
the data and put it into S3. Some InvokeHTTP requests fail because of a
server-side heap space problem.

Is there any way we can sort out this issue?

Thanks,
selvam R

On Tue, Sep 27, 2016 at 7:28 PM, Andy LoPresto  wrote:

> Selvam,
>
> Without more information, it does appear the error is on the "server
> side", as NiFi does not run on Tomcat. Is this a remote connection between
> two different servers or are both applications running on the same machine?
>
> I don’t know what the heap size is on the Tomcat server you appear to be
> connecting to, and I am not sure why you would have a screenshot of that
> HTML response.
>
> Is the Tomcat server hosting a 200+ MB file that you want to ingest into
> NiFi?
>
> Andy LoPresto
> alopre...@apache.org
> *alopresto.apa...@gmail.com *
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>
> On Sep 27, 2016, at 9:38 AM, Pierre Villard 
> wrote:
>
> Hi,
>
> Could you give more details about what you are trying to achieve ?
>
> Pierre
>
> 2016-09-27 13:29 GMT+02:00 Selvam Raman :
>
>> HTTP Status 500 - Servlet execution threw an exception
>> --
>>
>> *type* Exception report
>>
>> *message* *Servlet execution threw an exception*
>>
>> *description* *The server encountered an internal error that prevented
>> it from fulfilling this request.*
>>
>> *exception*
>>
>> javax.servlet.ServletException: Servlet execution threw an exception
>>
>>
>> *root cause*
>>
>> java.lang.OutOfMemoryError: Java heap space
>>
>>
>> *note* *The full stack trace of the root cause is available in the
>> Apache Tomcat/6.0.45 logs.*
>> --
>> Apache Tomcat/6.0.45
>>
>> I am getting the above error when I try to get 200+ MB of data. Is
>> there any way to get the data? Hopefully this error is on the server side,
>> not the NiFi side.
>>
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>
>
>


-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"


Re: State management not captured in cluster mode

2016-09-28 Thread Bryan Bende
What I was referring to is in the code of each processor, it is annotated
with something describing the state.

For example:
https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ListFile.java#L102-L105

Currently this is not visible anywhere in the NiFi UI, but what Joe
referenced was an improvement that Pierre submitted so that we can display
this information in the documentation for the processor when clicking
"usage".


On Wed, Sep 28, 2016 at 10:09 AM, Selvam Raman  wrote:

> Hi Bryan,
>
> Thanks for the information. Can you please share a picture showing where I
> can see the state (local or cluster)? I could not see it anywhere.
>
> On Wed, Sep 28, 2016 at 1:22 PM, Bryan Bende  wrote:
>
>> Hi Selvam,
>>
>> It depends what processor you are using. For example, ListFile using a
>> local file path will always store state locally even when clustered because
>> no other node can take over that state since the directory to list only
>> exists on that node. Each processor has an annotation at the top of it
>> which specifies what type of state it stores, local or clustered. We should
>> consider adding this to the docs if it's not included already.
>>
>> -Bryan
>>
>>
>> On Wednesday, September 28, 2016, Selvam Raman  wrote:
>>
>>> Hi,
>>>
>>> This is my state-management.xml attribute
>>>
>>> <local-provider>
>>>   <id>local-provider</id>
>>>   <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
>>>   <property name="Directory">./state/local</property>
>>> </local-provider>
>>>
>>> <cluster-provider>
>>>   <id>zk-provider</id>
>>>   <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
>>>   <property name="Connect String">hostname:2181,hostname:2181</property>
>>>   <property name="Root Node">/opt/nifiroot</property>
>>>   <property name="Session Timeout">10 seconds</property>
>>>   <property name="Access Control">Open</property>
>>> </cluster-provider>
>>>
>>> This is my nifi.properties file attributes
>>>
>>> # State Management #
>>> nifi.state.management.configuration.file=./conf/state-management.xml
>>> # The ID of the local state provider
>>> nifi.state.management.provider.local=local-provider
>>> # The ID of the cluster-wide state provider. This will be ignored if
>>> # NiFi is not clustered but must be populated if running in a cluster.
>>> nifi.state.management.provider.cluster=zk-provider
>>> # Specifies whether or not this instance of NiFi should run an embedded
>>> # ZooKeeper server
>>> nifi.state.management.embedded.zookeeper.start=false
>>> # Properties file that provides the ZooKeeper properties to use if
>>> # <nifi.state.management.embedded.zookeeper.start> is set to true
>>> nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
>>>
>>> # zookeeper properties, used for cluster management #
>>> nifi.zookeeper.connect.string=hostname:2181,hostname:2181
>>> nifi.zookeeper.connect.timeout=3 secs
>>> nifi.zookeeper.session.timeout=3 secs
>>> nifi.zookeeper.root.node=/opt/nifiroot
>>>
>>> The question here is: I am running NiFi in cluster mode and I expect
>>> state to be stored in zk-provider, but the state is stored in
>>> local-provider.
>>>
>>> local-state provider:
>>> /home/nifi/nifi-1.0.0/state/local/partition-*
>>>
>>> zk-provider:(empty directory)
>>> /opt/nifiroot
>>>
>>> any help on this.
>>>
>>>
>>>
>>> --
>>> Selvam Raman
>>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>>
>>
>>
>> --
>> Sent from Gmail Mobile
>>
>
>
>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>


Re: InvokeHTTP request fails for huge size

2016-09-28 Thread Joe Witt
Is "Use Chunked Encoding" set to false (as is the default) or has that
been changed to true?  You'll probably need that set to true.

Thanks
Joe

On Wed, Sep 28, 2016 at 10:26 AM, Selvam Raman  wrote:
> Hi LoPresto/Villard,
>
> Yes, I am fetching data from an open source (OAI) server into my S3 bucket.
> The open source server has 14k HTTP requests, and each request downloads
> files. File sizes vary from KB to MB.
> I have created a NiFi workflow (listofurl - invokehttp - putS3Object) to get
> the data and put it into S3. Some InvokeHTTP requests fail because of a
> server-side heap space problem.
>
> Is there any way we can sort out this issue?
>
> Thanks,
> selvam R
>
> On Tue, Sep 27, 2016 at 7:28 PM, Andy LoPresto  wrote:
>>
>> Selvam,
>>
>> Without more information, it does appear the error is on the "server
>> side", as NiFi does not run on Tomcat. Is this a remote connection between
>> two different servers or are both applications running on the same machine?
>>
>> I don’t know what the heap size is on the Tomcat server you appear to be
>> connecting to, and I am not sure why you would have a screenshot of that
>> HTML response.
>>
>> Is the Tomcat server hosting a 200+ MB file that you want to ingest into
>> NiFi?
>>
>> Andy LoPresto
>> alopre...@apache.org
>> alopresto.apa...@gmail.com
>> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
>>
>> On Sep 27, 2016, at 9:38 AM, Pierre Villard 
>> wrote:
>>
>> Hi,
>>
>> Could you give more details about what you are trying to achieve ?
>>
>> Pierre
>>
>> 2016-09-27 13:29 GMT+02:00 Selvam Raman :
>>>
>>> HTTP Status 500 - Servlet execution threw an exception
>>>
>>> 
>>>
>>> type Exception report
>>>
>>> message Servlet execution threw an exception
>>>
>>> description The server encountered an internal error that prevented it
>>> from fulfilling this request.
>>>
>>> exception
>>>
>>> javax.servlet.ServletException: Servlet execution threw an exception
>>>
>>>
>>> root cause
>>>
>>> java.lang.OutOfMemoryError: Java heap space
>>>
>>>
>>> note The full stack trace of the root cause is available in the Apache
>>> Tomcat/6.0.45 logs.
>>>
>>> 
>>>
>>> Apache Tomcat/6.0.45
>>>
>>>
>>> I am getting the above error when I try to get 200+ MB of data. Is
>>> there any way to get the data? Hopefully this error is on the server side,
>>> not the NiFi side.
>>>
>>>
>>> --
>>> Selvam Raman
>>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>>
>>
>
>
>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"


Re: State management not captured in cluster mode

2016-09-28 Thread Selvam Raman
Hi Bryan,

Thanks for the information. Can you please share a picture showing where I can
see the state (local or cluster)? I could not see it anywhere.

On Wed, Sep 28, 2016 at 1:22 PM, Bryan Bende  wrote:

> Hi Selvam,
>
> It depends what processor you are using. For example, ListFile using a
> local file path will always store state locally even when clustered because
> no other node can take over that state since the directory to list only
> exists on that node. Each processor has an annotation at the top of it
> which specifies what type of state it stores, local or clustered. We should
> consider adding this to the docs if it's not included already.
>
> -Bryan
>
>
> On Wednesday, September 28, 2016, Selvam Raman  wrote:
>
>> Hi,
>>
>> This is my state-management.xml attribute
>>
>> <local-provider>
>>   <id>local-provider</id>
>>   <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
>>   <property name="Directory">./state/local</property>
>> </local-provider>
>>
>> <cluster-provider>
>>   <id>zk-provider</id>
>>   <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
>>   <property name="Connect String">hostname:2181,hostname:2181</property>
>>   <property name="Root Node">/opt/nifiroot</property>
>>   <property name="Session Timeout">10 seconds</property>
>>   <property name="Access Control">Open</property>
>> </cluster-provider>
>>
>> This is my nifi.properties file attributes
>>
>> # State Management #
>> nifi.state.management.configuration.file=./conf/state-management.xml
>> # The ID of the local state provider
>> nifi.state.management.provider.local=local-provider
>> # The ID of the cluster-wide state provider. This will be ignored if NiFi
>> # is not clustered but must be populated if running in a cluster.
>> nifi.state.management.provider.cluster=zk-provider
>> # Specifies whether or not this instance of NiFi should run an embedded
>> # ZooKeeper server
>> nifi.state.management.embedded.zookeeper.start=false
>> # Properties file that provides the ZooKeeper properties to use if
>> # <nifi.state.management.embedded.zookeeper.start> is set to true
>> nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
>>
>> # zookeeper properties, used for cluster management #
>> nifi.zookeeper.connect.string=hostname:2181,hostname:2181
>> nifi.zookeeper.connect.timeout=3 secs
>> nifi.zookeeper.session.timeout=3 secs
>> nifi.zookeeper.root.node=/opt/nifiroot
>>
>> The question here is: I am running NiFi in cluster mode and I expect
>> state to be stored in zk-provider, but the state is stored in
>> local-provider.
>>
>> local-state provider:
>> /home/nifi/nifi-1.0.0/state/local/partition-*
>>
>> zk-provider:(empty directory)
>> /opt/nifiroot
>>
>> any help on this.
>>
>>
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>>
>
>
> --
> Sent from Gmail Mobile
>



-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"


Re: UI: feedback on the processor 'color' in NiFi 1.0

2016-09-28 Thread Russell Bateman
After thinking on it a bit, I agree that Manish's suggestion could be a
good idea as an option (the way /additionalDetails.html/ is an option). 
It would be easier if they were /.png/ files rather than formal icon 
files only with a "width x length" limit.


My two cents,

Russ

On 09/28/2016 12:57 AM, Manish Gupta 8 wrote:


I think one of the things that will really help in complex data flow 
from UI perspective is “colored icons” on each processor. Not sure if 
this already part of 1.0, but from my experience, icons definitely 
help a lot in quickly understanding complex flows. Those icons can be 
fixed (embedded within the nar) or may be dynamic (user defined icon 
file for different processors) – just a suggestion.


Regards,

Manish

*From:*Andrew Grande [mailto:apere...@gmail.com]
*Sent:* Tuesday, September 20, 2016 10:40 PM
*To:* users@nifi.apache.org
*Subject:* Re: UI: feedback on the processor 'color' in NiFi 1.0

No need to go wild, changing processor colors should be enough, IMO. 
PG and RPG are possible candidates, but they are different enough 
already, I guess.


What I heard quite often was to differentiate between regular 
processors, incoming sources of data and out only (data producers?). 
Maybe even with a shape?


Andrew

On Tue, Sep 20, 2016, 12:35 PM Rob Moran wrote:


Good points. I was thinking a label would be tied to the group of
components to which it was applied, but that could also introduce
problems as things move and are added to a flow.

So would you all expect to be able to change the color of every
component type, or just processors?

Andrew - your comment about coloring terminators red is
interesting as well. What are some other parts of a flow you might
use color to identify? Along with backpressure, we could explore
other ways to call these things out so users do not come up with
their own methods. Perhaps there are layer options, like on a map
(e.g., "show terrain" or "show traffic").


Rob

On Tue, Sep 20, 2016 at 11:23 AM, Andrew Grande
<apere...@gmail.com> wrote:

I agree. Labels are great for grouping, beyond PGs. Processor
colors individually add value. E.g. flow terminator colored in
red was a very common pattern I used. Besides, labels are not
grouped with components, so moving things and re-arranging is
a pain.

Andrew

On Tue, Sep 20, 2016, 11:21 AM Joe Skora <jsk...@gmail.com> wrote:

Rob,

The labelling functionality you described sounds very
useful in general.  But, I miss the processor color too.

I think labels are really useful for identifying groups of
components and areas in the flow, but I worry that needing
to use them in volume for processor coloring will increase
the API and browser canvas load for elements that don't
actually affect the flow.

On Tue, Sep 20, 2016 at 10:40 AM, Rob Moran
<rmo...@gmail.com> wrote:

What if we promote the use of Labels as a way to
highlight things. We could add functionality to expand
their usefulness as a way to highlight things on the
canvas. I believe that is their intended use.

Today you can create a label and change its color to
highlight single or multiple components. Even better
you can do it for any component (not just processors).

To expand on functionality, I'm imagining a context
menu and palette action to "Label" a selected
component or components. This would prompt a user to
pick a background and add text which would place a
label around everything once it's applied.


Rob

On Mon, Sep 19, 2016 at 6:42 PM, Jeff
<jtsw...@gmail.com> wrote:

I was thinking, in addition to changing the color
of the icon on the processor, that the color of
the drop shadow could be changed as well.  That
would provide more contrast, but preserve
readability, in my opinion.

On Mon, Sep 19, 2016 at 6:39 PM Andrew Grande
<apere...@gmail.com> wrote:

Hi All,

Rolling with UI feedback threads. This time
I'd like to discuss how NiFi 'lost' its
ability to change processor boxes color. I.e.
as you can see from a screenshot attached, it
does change color for the processor in the
flow overview panel, but the processor itself
only changes the icon in the top-left o

Re: State management not captured in cluster mode

2016-09-28 Thread Joe Witt
Great timing.  Pierre put in a JIRA/PR for this yesterday.

  https://issues.apache.org/jira/browse/NIFI-2832

Thanks
Joe

On Wed, Sep 28, 2016 at 8:22 AM, Bryan Bende  wrote:
> Hi Selvam,
>
> It depends what processor you are using. For example, ListFile using a local
> file path will always store state locally even when clustered because no
> other node can take over that state since the directory to list only exists
> on that node. Each processor has an annotation at the top of it which
> specifies what type of state it stores, local or clustered. We should
> consider adding this to the docs if it's not included already.
>
> -Bryan
>
>
> On Wednesday, September 28, 2016, Selvam Raman  wrote:
>>
>> Hi,
>>
>> This is my state-management.xml attribute
>>
>> <local-provider>
>>   <id>local-provider</id>
>>   <class>org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider</class>
>>   <property name="Directory">./state/local</property>
>> </local-provider>
>>
>> <cluster-provider>
>>   <id>zk-provider</id>
>>   <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
>>   <property name="Connect String">hostname:2181,hostname:2181</property>
>>   <property name="Root Node">/opt/nifiroot</property>
>>   <property name="Session Timeout">10 seconds</property>
>>   <property name="Access Control">Open</property>
>> </cluster-provider>
>>
>> This is my nifi.properties file attributes
>>
>> # State Management #
>> nifi.state.management.configuration.file=./conf/state-management.xml
>> # The ID of the local state provider
>> nifi.state.management.provider.local=local-provider
>> # The ID of the cluster-wide state provider. This will be ignored if NiFi
>> # is not clustered but must be populated if running in a cluster.
>> nifi.state.management.provider.cluster=zk-provider
>> # Specifies whether or not this instance of NiFi should run an embedded
>> # ZooKeeper server
>> nifi.state.management.embedded.zookeeper.start=false
>> # Properties file that provides the ZooKeeper properties to use if
>> # <nifi.state.management.embedded.zookeeper.start> is set to true
>> nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
>>
>> # zookeeper properties, used for cluster management #
>> nifi.zookeeper.connect.string=hostname:2181,hostname:2181
>> nifi.zookeeper.connect.timeout=3 secs
>> nifi.zookeeper.session.timeout=3 secs
>> nifi.zookeeper.root.node=/opt/nifiroot
>>
>> The question here is: I am running NiFi in cluster mode and I expect
>> state to be stored in zk-provider, but the state is stored in
>> local-provider.
>>
>> local-state provider:
>> /home/nifi/nifi-1.0.0/state/local/partition-*
>>
>> zk-provider:(empty directory)
>> /opt/nifiroot
>>
>> any help on this.
>>
>>
>>
>> --
>> Selvam Raman
>> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>
>
>
> --
> Sent from Gmail Mobile


Re: Create NiFi Templates

2016-09-28 Thread Matt Burgess
Ashish,

I don't have the 0.7 docs in front of me so I'm not sure if/how it is
possible there, but it is definitely possible via the 1.0 REST API
[1]. Procedure is as follows:

If the process group already exists in the flow (and you know its ID),
here are some REST API calls that should create a template from it:

1) Determine the parent process group ID. If the process group is at
the top level, I believe you can use "root". In any case you could get
the parentGroupId by doing a GET
nifi-api/process-groups/<process-group-id>; the parentGroupId is
available in the response, as are "clientId" and "version", which you
can use below.

2) Create a snippet by POST'ing something similar to the following to
nifi-api/snippet:

{"snippet":{"parentGroupId":"48b93e09-0157-1000-9e3a-f95fffd1a69e","processors":{},"funnels":{},"inputPorts":{},"outputPorts":{},"remoteProcessGroups":{},"processGroups":{"70c6ac33-0157-1000-ecd5-a2680dd6e8ad":{"clientId":"70c56ee6-0157-1000-b298-89976e64b8a5","version":2}},"connections":{},"labels":{}}}

Here "parentGroupId" is the ID of the parent process group, and
"processGroups" contains only the ID of the process group you want to
create a template from. Also, "version" should be the latest version of
the process group, which I believe should be one greater than the
version you retrieved in Step 1 (but equal might work too; I can't
remember).

The POST returns a JSON payload with (among other things) a snippet ID.

3) Use the snippet ID in a POST
nifi-api/process-groups//templates:

{"name":"","description":"","snippetId":""}

This returns a JSON payload with (among other things) a template ID.

4) Use the template ID in a GET
nifi-api/templates//download, the payload will be the
contents of the template.
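
As a rough illustration of steps 2 and 3 above, here is a minimal Python
sketch that only builds the two JSON request bodies; it makes no HTTP calls.
The helper names are hypothetical, and the field layout is copied from the
example payload in step 2, so treat it as a sketch rather than a reference
for the REST API.

```python
import json


def build_snippet_payload(parent_group_id, process_group_id, client_id, version):
    """Body for the POST to nifi-api/snippet (step 2), mirroring the example above."""
    # Component maps that stay empty because only a process group is selected.
    empty_keys = ("processors", "funnels", "inputPorts", "outputPorts",
                  "remoteProcessGroups", "connections", "labels")
    snippet = {key: {} for key in empty_keys}
    snippet["parentGroupId"] = parent_group_id
    # The process group to template, with the clientId/version from step 1.
    snippet["processGroups"] = {
        process_group_id: {"clientId": client_id, "version": version}
    }
    return {"snippet": snippet}


def build_template_payload(name, description, snippet_id):
    """Body for the template-creation POST (step 3)."""
    return {"name": name, "description": description, "snippetId": snippet_id}


def as_json(payload):
    """Serialize a payload the way it would be sent on the wire."""
    return json.dumps(payload)
```

The snippet ID returned by the first POST is what goes into
build_template_payload; the resulting dictionaries can be sent with whatever
HTTP client you prefer.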

Maybe someone with the 0.7 REST API docs handy can check for similar
API calls that would create a snippet, create a template, then export
the template.

Regards,
Matt

[1] https://nifi.apache.org/docs/nifi-docs/rest-api/index.html

On Wed, Sep 28, 2016 at 8:02 AM, Oleg Zhurakousky
 wrote:
> Ashish
>
> As Joe pointed out, while “possible”, at the moment we don’t expose a direct
> public API to accomplish that. When I say “possible” I am of course referring
> to the internal API that is used by NiFi (the NiFi UI, that is) to wire up
> flows, but at the moment it is neither public nor the preferred approach.
> That said, I am still wondering what issue you are experiencing. I mean, you
> are clearly describing what you want to do, but I am missing the “why” part.
> Please don’t take it the wrong way. You may very well have legitimate
> reasons to do what you need to do. All we are trying to do is understand
> those reasons so we in the NiFi community can determine if it is indeed a
> missing yet valuable feature we can/should put on the roadmap.
>
> Cheers
> Oleg
>
>
>> On Sep 28, 2016, at 7:49 AM, Ashish Agarwal 10  
>> wrote:
>>
>> Hello,
>>
>> I am using NiFi 0.7.0.
>> I want to create a template of a process group.
>> Instead of creating it from the UI, I want to design a flow that creates a
>> process group template based on its ID, downloads it, and stores it
>> somewhere locally.
>> Is this possible? If yes, is there a method/API available for it?
>>
>> Thank you
>> Ashish Agarwal
>>
>> -Original Message-
>> From: Joe Witt [mailto:joe.w...@gmail.com]
>> Sent: Tuesday, September 27, 2016 6:06 PM
>> To: users@nifi.apache.org
>> Subject: Re: Create NiFi Templates
>>
>> Ashish,
>>
>> If the question is 'at runtime what are the ways I could trigger the
>> creation of a NiFi template?'
>>
>>  You could call the REST API endpoint using some mechanism other than
>> the NiFi UI or you could use the NiFi UI.
>>
>> If the question is 'can I programmatically create a template during
>> development time?'
>>
>>  We don't have any direct public APIs to accomplish this that I am
>> aware of, though it is an interesting idea.
>>
>> Can you help direct us to where you're more interested right now so we
>> can help most effectively?
>>
>> Thanks
>> Joe
>>
>> On Tue, Sep 27, 2016 at 8:31 AM, Oleg Zhurakousky
>>  wrote:
>>> Ashish
>>>
>>> I am not sure I fully understand the question. . .
>>> Templates are represented as an XML file, so if you “type everything
>>> correctly” you’ll get a working template, but it would be simpler to use
>>> the NiFi UI as a design tool to do the same.
>>> Could you please clarify what exactly you are trying to
>>> accomplish?
>>>
>>> Cheers
>>> Oleg
>>>
>>> On Sep 27, 2016, at 3:06 AM, Ashish Agarwal 10 
>>> wrote:
>>>
>>> Hello,
>>>
>>> Is there a way to create a template of a process group other than the button
>>> present on UI ?
>>>
>>> Regards,
>>> Ashish Agarwal
>>>
>>>
>


Re: State management not captured in cluster mode

2016-09-28 Thread Bryan Bende
Hi Selvam,

It depends on which processor you are using. For example, ListFile using a
local file path will always store state locally, even when clustered, because
no other node can take over that state: the directory to list only
exists on that node. Each processor has an annotation at the top of it
which specifies what type of state it stores, local or clustered. We should
consider adding this to the docs if it's not included already.

-Bryan

On Wednesday, September 28, 2016, Selvam Raman  wrote:

> Hi,
>
> This is my state-management.xml configuration:
>
> 
> local-provider
> org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider
> ./state/local
> 
>
>
> 
> zk-provider
> org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider
> hostname:2181,hostname:2181
> /opt/nifiroot
> 10 seconds
> Open
> 
>
>
> These are the relevant settings in my nifi.properties file:
>
> 
> # State Management #
> 
> nifi.state.management.configuration.file=./conf/state-management.xml
> # The ID of the local state provider
> nifi.state.management.provider.local=local-provider
> # The ID of the cluster-wide state provider. This will be ignored if NiFi
> # is not clustered but must be populated if running in a cluster.
> nifi.state.management.provider.cluster=zk-provider
> # Specifies whether or not this instance of NiFi should run an embedded
> # ZooKeeper server
> nifi.state.management.embedded.zookeeper.start=false
> # Properties file that provides the ZooKeeper properties to use if
>  is set to true
> nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties
>
> # zookeeper properties, used for cluster management #
> nifi.zookeeper.connect.string=hostname:2181,hostname:2181
> nifi.zookeeper.connect.timeout=3 secs
> nifi.zookeeper.session.timeout=3 secs
> nifi.zookeeper.root.node=/opt/nifiroot
>
> My question: I am running NiFi in cluster mode and expect state to be
> stored in the zk-provider, but the state is being stored in the
> local-provider.
>
> local-state provider:
> /home/nifi/nifi-1.0.0/state/local/partition-*
>
> zk-provider:(empty directory)
> /opt/nifiroot
>
> Any help on this would be appreciated.
>
>
>
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>




Re: Create NiFi Templates

2016-09-28 Thread Oleg Zhurakousky
Ashish

As Joe pointed out, while “possible”, at the moment we don’t expose a direct
public API to accomplish that. When I say “possible” I am of course referring
to the internal API that is used by NiFi (the NiFi UI, that is) to wire up
flows, but at the moment it is neither public nor the preferred approach.
That said, I am still wondering what issue you are experiencing. I mean, you
are clearly describing what you want to do, but I am missing the “why” part.
Please don’t take it the wrong way. You may very well have legitimate
reasons to do what you need to do. All we are trying to do is understand
those reasons so we in the NiFi community can determine if it is indeed a
missing yet valuable feature we can/should put on the roadmap.

Cheers
Oleg


> On Sep 28, 2016, at 7:49 AM, Ashish Agarwal 10  wrote:
> 
> Hello,
> 
> I am using NiFi 0.7.0.
> I want to create a template of a process group.
> Instead of creating it from the UI, I want to design a flow that creates a
> process group template based on its ID, downloads it, and stores it
> somewhere locally.
> Is this possible? If yes, is there a method/API available for it?
>
> Thank you
> Ashish Agarwal
> 
> -Original Message-
> From: Joe Witt [mailto:joe.w...@gmail.com] 
> Sent: Tuesday, September 27, 2016 6:06 PM
> To: users@nifi.apache.org
> Subject: Re: Create NiFi Templates
> 
> Ashish,
> 
> If the question is 'at runtime what are the ways I could trigger the
> creation of a NiFi template?'
> 
>  You could call the REST API endpoint using some mechanism other than
> the NiFi UI or you could use the NiFi UI.
> 
> If the question is 'can I programmatically create a template during
> development time?'
> 
>  We don't have any direct public APIs to accomplish this that I am
> aware of, though it is an interesting idea.
> 
> Can you help direct us to where you're more interested right now so we
> can help most effectively?
> 
> Thanks
> Joe
> 
> On Tue, Sep 27, 2016 at 8:31 AM, Oleg Zhurakousky
>  wrote:
>> Ashish
>> 
>> I am not sure I fully understand the question. . .
>> Templates are represented as an XML file, so if you “type everything
>> correctly” you’ll get a working template, but it would be simpler to use
>> the NiFi UI as a design tool to do the same.
>> Could you please clarify what exactly you are trying to
>> accomplish?
>> 
>> Cheers
>> Oleg
>> 
>> On Sep 27, 2016, at 3:06 AM, Ashish Agarwal 10 
>> wrote:
>> 
>> Hello,
>> 
>> Is there a way to create a template of a process group other than the button
>> present on UI ?
>> 
>> Regards,
>> Ashish Agarwal
>> 
>> 



RE: Create NiFi Templates

2016-09-28 Thread Ashish Agarwal 10
Hello,

I am using NiFi 0.7.0.
I want to create a template of a process group.
Instead of creating it from the UI, I want to design a flow that creates a
process group template based on its ID, downloads it, and stores it
somewhere locally.
Is this possible? If yes, is there a method/API available for it?

Thank you
Ashish Agarwal

-Original Message-
From: Joe Witt [mailto:joe.w...@gmail.com] 
Sent: Tuesday, September 27, 2016 6:06 PM
To: users@nifi.apache.org
Subject: Re: Create NiFi Templates

Ashish,

If the question is 'at runtime what are the ways I could trigger the
creation of a NiFi template?'

  You could call the REST API endpoint using some mechanism other than
the NiFi UI or you could use the NiFi UI.

If the question is 'can I programmatically create a template during
development time?'

  We don't have any direct public APIs to accomplish this that I am
aware of, though it is an interesting idea.

Can you help direct us to where you're more interested right now so we
can help most effectively?

Thanks
Joe

On Tue, Sep 27, 2016 at 8:31 AM, Oleg Zhurakousky
 wrote:
> Ashish
>
> I am not sure I fully understand the question. . .
> Templates are represented as an XML file, so if you “type everything
> correctly” you’ll get a working template, but it would be simpler to use
> the NiFi UI as a design tool to do the same.
> Could you please clarify what exactly you are trying to
> accomplish?
>
> Cheers
> Oleg
>
> On Sep 27, 2016, at 3:06 AM, Ashish Agarwal 10 
> wrote:
>
> Hello,
>
> Is there a way to create a template of a process group other than the button
> present on UI ?
>
> Regards,
> Ashish Agarwal
>
>


RE: Remove top N lines from a text file

2016-09-28 Thread Carlos Manuel Fernandes (DSI)
Hi Peter, the simplest way I see has to traverse the whole file and uses a
processor property, e.g. N_lines (the number of lines to strip from the file):

import java.nio.charset.StandardCharsets
import org.apache.nifi.processor.io.StreamCallback

// Get the incoming flow file; nothing to do if there is none
def flowFile = session.get()
if (!flowFile) return

// Dynamic processor property N_lines: number of leading lines to strip
def nLinesToStrip = N_lines.value.toInteger()

// Rewrite the content, skipping the first nLinesToStrip lines
def counter = 0
try {
  flowFile = session.write(flowFile, { inputStream, outputStream ->
    inputStream.eachLine('UTF-8') { line ->
      counter++
      if (counter > nLinesToStrip) {
        outputStream.write("${line}\n".getBytes(StandardCharsets.UTF_8))
      }
    }
  } as StreamCallback)

  session.transfer(flowFile, REL_SUCCESS)
} catch (Exception e) {
  // Log a message together with the exception, then route to failure
  log.error('Failed to strip the leading lines', e)
  session.transfer(flowFile, REL_FAILURE)
}

For a better understanding of how to build ExecuteScript scripts in Groovy, I
recommend Matt Burgess's blog, funnifi.blogspot.com. Thanks, Matt.

Carlos



From: Pierre Villard [mailto:pierre.villard...@gmail.com]
Sent: quarta-feira, 28 de Setembro de 2016 12:04
To: users@nifi.apache.org
Subject: Re: Remove top N lines from a text file

Hi Peter,
I would recommend the following blog post by Matt:
http://funnifi.blogspot.fr/2016/02/executescript-processor-replacing-flow.html
Pierre

2016-09-28 13:01 GMT+02:00 Andrew Grande:

A Groovy script or a simple sed command invoked via ExecuteStreamCommand
should do the job.

Andrew

On Wed, Sep 28, 2016, 12:18 AM Peter Wicks (pwicks) wrote:
I have a CSV file where the first few lines are a summary of the report
parameters that were used to generate it. I want to strip these off in NiFi.
I’ve considered using a RegEx to match the top {N} lines, but am wondering if a
Groovy script might be a better option. I want to keep the file intact, so
splitting it by line ending and routing all of the lines through a
RouteOnAttribute seems excessive.

I’ve never built a Groovy script. Are there any examples of how I might go
about this?

Thanks,
  Peter



State management not captured in cluster mode

2016-09-28 Thread Selvam Raman
Hi,

This is my state-management.xml configuration:


local-provider

org.apache.nifi.controller.state.providers.local.WriteAheadLocalStateProvider
./state/local




zk-provider

org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider
hostname:2181,hostname:2181
/opt/nifiroot
10 seconds
Open



These are the relevant settings in my nifi.properties file:


# State Management #

nifi.state.management.configuration.file=./conf/state-management.xml
# The ID of the local state provider
nifi.state.management.provider.local=local-provider
# The ID of the cluster-wide state provider. This will be ignored if NiFi
# is not clustered but must be populated if running in a cluster.
nifi.state.management.provider.cluster=zk-provider
# Specifies whether or not this instance of NiFi should run an embedded
# ZooKeeper server
nifi.state.management.embedded.zookeeper.start=false
# Properties file that provides the ZooKeeper properties to use if
 is set to true
nifi.state.management.embedded.zookeeper.properties=./conf/zookeeper.properties

# zookeeper properties, used for cluster management #
nifi.zookeeper.connect.string=hostname:2181,hostname:2181
nifi.zookeeper.connect.timeout=3 secs
nifi.zookeeper.session.timeout=3 secs
nifi.zookeeper.root.node=/opt/nifiroot

My question: I am running NiFi in cluster mode and expect state to be
stored in the zk-provider, but the state is being stored in the
local-provider.

local-state provider:
/home/nifi/nifi-1.0.0/state/local/partition-*

zk-provider:(empty directory)
/opt/nifiroot

Any help on this would be appreciated.



-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"


Re: Remove top N lines from a text file

2016-09-28 Thread Pierre Villard
Hi Peter,

I would recommend the following blog post by Matt:
http://funnifi.blogspot.fr/2016/02/executescript-processor-replacing-flow.html

Pierre

2016-09-28 13:01 GMT+02:00 Andrew Grande :

> A Groovy script or a simple sed command invoked via ExecuteStreamCommand
> should do the job.
>
> Andrew
>
> On Wed, Sep 28, 2016, 12:18 AM Peter Wicks (pwicks) 
> wrote:
>
>> I have a CSV file where the first few lines are a summary of the report
>> parameters that were used to generate it. I want to strip these off in NiFi.
>>
>> I’ve considered using a RegEx to match the top {N} lines, but am
>> wondering if a Groovy script might be a better option. I want to keep the
>> file intact, so splitting it by line ending and routing all of the lines
>> through a RouteOnAttribute seems excessive.
>>
>>
>>
>> I’ve never built a Groovy script. Are there any examples of how I might go
>> about this?
>>
>>
>>
>> Thanks,
>>
>>   Peter
>>
>


Re: Remove top N lines from a text file

2016-09-28 Thread Andrew Grande
A Groovy script or a simple sed command invoked via ExecuteStreamCommand
should do the job.
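
To make the idea concrete: with sed, deleting the first N lines is typically
written as sed '1,Nd' with N replaced by a number. The same logic as a minimal
Python sketch is shown below. The function name is made up for illustration
and the code is not NiFi-specific; it only shows the stream-in, stream-out
shape such a command would have.

```python
import io
from itertools import islice


def strip_top_lines(src, dst, n):
    """Copy src to dst, skipping the first n lines and leaving the rest untouched."""
    # islice consumes the first n lines lazily, so the file is never loaded whole.
    for line in islice(src, n, None):
        dst.write(line)


# Small demonstration with an in-memory "CSV" that has two summary lines.
demo_in = io.StringIO("report: sales\nperiod: Q3\nid,amount\n1,10\n")
demo_out = io.StringIO()
strip_top_lines(demo_in, demo_out, 2)
print(demo_out.getvalue(), end="")
```

Because the lines are streamed one at a time, the file stays intact apart from
the removed header lines, which is what Peter asked for.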

Andrew

On Wed, Sep 28, 2016, 12:18 AM Peter Wicks (pwicks) 
wrote:

> I have a CSV file where the first few lines are a summary of the report
> parameters that were used to generate it. I want to strip these off in NiFi.
>
> I’ve considered using a RegEx to match the top {N} lines, but am wondering
> if a Groovy script might be a better option. I want to keep the file
> intact, so splitting it by line ending and routing all of the lines through
> a RouteOnAttribute seems excessive.
>
>
>
> I’ve never built a Groovy script. Are there any examples of how I might go
> about this?
>
>
>
> Thanks,
>
>   Peter
>


Re: nifi Rest API to get full details of the flow.

2016-09-28 Thread Andrew Grande
This isn't an ideal approach, IMO. There is a standard API to get a summary
of the flow and the status of every processor; check which URL the Summary
tab is invoking. You can then drill into any specific component by ID.
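
As a sketch of the "collect IDs, then filter on a condition" part, the Python
below walks an already-parsed JSON response and makes no HTTP calls. It assumes
the response is shaped like the flow of a process group in NiFi 1.x (a
"processGroupFlow" object containing "flow" -> "processors", each with an "id"
and a "component" holding "name" and "state"); that layout and the helper names
are assumptions, so check them against the REST API docs for your version.

```python
def list_processors(flow_entity):
    """Return (id, name, state) for each processor directly inside the group."""
    flow = flow_entity.get("processGroupFlow", {}).get("flow", {})
    rows = []
    for entity in flow.get("processors", []):
        component = entity.get("component", {})
        rows.append((entity.get("id"), component.get("name"), component.get("state")))
    return rows


def ids_in_state(flow_entity, state):
    """Return the IDs of processors whose state equals the given value."""
    return [pid for pid, _name, current in list_processors(flow_entity)
            if current == state]
```

The IDs returned this way are the ones you would then use to drill into a
specific component.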

Andrew

On Wed, Sep 28, 2016, 6:42 AM Sandeep Khurana  wrote:

> I just looked at the flow.xml.gz file. It serves the purpose. Thanks.
>
> On Wed, Sep 28, 2016 at 4:02 PM, Sandeep Khurana 
> wrote:
>
>> Hello
>>
>> Is there a way to get the full details of the flow that I created in the
>> NiFi UI?
>>
>> I want to get the IDs of processors programmatically (without looking them
>> up in the NiFi UI) and then, based on some conditions, see the status of
>> one or more processors.
>>
>>  Is there any way?
>>
>>
>>
>>
>
>
> --
> Thanks and regards
> Sandeep Khurana
>


Re: logging all transformed flowfiles

2016-09-28 Thread Andrew Grande
Have you seen the provenance query API? It wasn't designed for bulk dumps
of data, though; it is more for interactive poking around.

If you want to proactively store everything in an external system, the S2S
provenance reporting task is the way to go, but it's then up to you to filter
the events and make sense of them as lineage. Maybe peek into how NiFi
visualizes the graph for ideas?
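
To give a feel for the "make sense of all events as lineage" step, here is a
minimal Python sketch that groups received provenance events per flow file and
orders them in time. The field names ("entityId", "timestampMillis",
"eventType", "componentName") are assumptions about the JSON emitted by the
reporting task, and the function name is hypothetical, so adjust both to what
your receiver actually sees.

```python
from collections import defaultdict


def lineage_by_flowfile(events):
    """Map each flow file ID to its time-ordered (eventType, componentName) steps."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["entityId"]].append(event)
    return {
        entity_id: [(e["eventType"], e["componentName"])
                    for e in sorted(items, key=lambda e: e["timestampMillis"])]
        for entity_id, items in grouped.items()
    }
```

This only orders events per flow file; following parent/child relationships
across splits and merges would need extra handling.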

Andrew

On Wed, Sep 28, 2016, 2:23 AM  wrote:

> Hello Manish,
> Thanks for the very helpful answer, but I was thinking that this functional
> scope (i.e. logging, storing transformations of data, data lineage) was
> built into NiFi and available through the REST API, or through internal
> calls.
> The point is that I am not ready to hook dedicated logging processors onto
> every processor of my DF, or onto DFs developed by others:
> - firstly, it is intrusive in the DF
> - secondly, it cannot easily be hooked in with a template approach,
> because it depends heavily on the processors chosen in the DF
>
> Ideally (as a very simple/naive requirement) I would like to run my DF,
> taking my example again:
> (File1 (in) --> Processor1 --> flow1 --> Processor2 --> flow2 --> File2
> (out))
> and then store everything in a database and query it with something like:
>
> getTrace(Processor1, beforeProcessing) -> returning (attributes,
> flowfile)
> getTrace(Processor2, afterProcessing)
>
> phil
> best regards
>
>
> -Original Message-
> From: Manish Gupta 8 [mailto:mgupt...@sapient.com]
> Sent: mardi 27 septembre 2016 16:46
> To: users@nifi.apache.org
> Subject: RE: logging all transformed flowfiles
>
> Hi Phil,
>
> We are also doing a similar thing, but we are not keeping all the content
> externally after each transformation. Instead, we only send the flow file
> attributes to external storage (such as a file, Event Hub, database or
> NoSQL store): we convert them with the AttributesToJSON processor and send
> the result for logging after every logical step where we want to log
> (after adding a couple of additional details such as step name, number of
> rows in the file, hash code, etc.).
>
> For your scenario, I think you can simply clone the output relationship
> from each of your processors and send it to one or more logging/sink
> processors. For keeping the lineage, you have a couple of options:
> 1. Use a different sink/folder/table for each step (with a corresponding
> name).
> 2. Keep the file name consistent to track the lineage.
> 3. Modify the flow file content to make sure you can track the lineage
> from the metadata content.
>
>
> Regards,
> Manish
>
> -Original Message-
> From: philippe.gib...@orange.com [mailto:philippe.gib...@orange.com]
> Sent: Tuesday, September 27, 2016 7:33 PM
> To: users@nifi.apache.org
> Subject: logging all transformed flowfiles
>
> Hello,
> My software context: standalone NiFi 1.0.0
>
> My problem: I would like to log all the different transformations
> applied to an initial file (input) until it exits the DF (output).
> Imagine this simple DF:
> File1 (in) --> Processor1 --> flow1 --> Processor2 --> flow2 --> File2
> (out)
> I would like to store outside of NiFi (in my own external DB) ->
>  File1, flow1, flow2, File2
> Is there a simple REST API to help accomplish this? (I looked at
> data provenance and SiteToSiteProvenanceReportingTask but did not find a
> clear way to implement this.) Any ideas?
>
> Phil
> Best regards
>


Re: nifi Rest API to get full details of the flow.

2016-09-28 Thread Sandeep Khurana
I just looked at the flow.xml.gz file. It serves the purpose. Thanks.

On Wed, Sep 28, 2016 at 4:02 PM, Sandeep Khurana 
wrote:

> Hello
>
> Is there a way to get the full details of the flow that I created in the
> NiFi UI?
>
> I want to get the IDs of processors programmatically (without looking them
> up in the NiFi UI) and then, based on some conditions, see the status of
> one or more processors.
>
>  Is there any way?
>
>
>
>


-- 
Thanks and regards
Sandeep Khurana


Fwd: nifi Rest API to get full details of the flow.

2016-09-28 Thread Sandeep Khurana
Hello

Is there a way to get the full details of the flow that I created in the
NiFi UI?

I want to get the IDs of processors programmatically (without looking them
up in the NiFi UI) and then, based on some conditions, see the status of
one or more processors.

 Is there any way?


Re: ExecuteSQL & BigInt fieds

2016-09-28 Thread Yari Marchetti
Just recompiled with pull/1053, testing with two different tables, with
both signed and unsigned BIGINT and it works!

Thanks, Pierre & Matt
Yari

On 27 September 2016 at 16:52, Matt Burgess  wrote:

> All,
>
> I just reviewed and merged this fix. If you need a workaround in the
> meantime, if you can change your table such that the 'code' column is
> an unsigned bigint, then I think it works. That's what I tested for a
> related issue NIFI-2531, but forgot the signed bigint case :(
>
> Regards,
> Matt
>
> On Tue, Sep 27, 2016 at 4:54 AM, Pierre Villard
>  wrote:
> > Hi Yari,
> >
> > I think there is a JIRA on this problem at the moment:
> >
> > https://issues.apache.org/jira/browse/NIFI-2811
> > https://github.com/apache/nifi/pull/1053
> >
> > If you have the chance you can try the PR and give us your feedback.
> >
> > Pierre
> >
> >
> > 2016-09-27 10:23 GMT+02:00 Yari Marchetti:
> >>
> >> Hello,
> >> I'm trying to use an ExecuteSQL processor to run a query on a table
> >> with a BIGINT field, and it's failing with:
> >>
> >> org.apache.avro.file.DataFileWriter$AppendWriteException: org.apache.avro.UnresolvedUnionException: Not in union ["null","string"]: 10
> >>     at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:296) ~[na:na]
> >>     at org.apache.nifi.processors.standard.util.JdbcCommon.convertToAvroStream(JdbcCommon.java:160) ~[na:na]
> >>     at org.apache.nifi.processors.standard.util.JdbcCommon.convertToAvroStream(JdbcCommon.java:83) ~[na:na]
> >>     at org.apache.nifi.processors.standard.util.JdbcCommon.convertToAvroStream(JdbcCommon.java:74) ~[na:na]
> >>     at org.apache.nifi.processors.standard.ExecuteSQL$2.process(ExecuteSQL.java:193) ~[na:na]
> >>     at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2123) ~[nifi-framework-core-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> >>     at org.apache.nifi.processors.standard.ExecuteSQL.onTrigger(ExecuteSQL.java:187) ~[na:na]
> >>     at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27) ~[nifi-api-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> >>     at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1064) ~[nifi-framework-core-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> >>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:136) [nifi-framework-core-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> >>     at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47) [nifi-framework-core-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> >>     at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132) [nifi-framework-core-1.1.0-SNAPSHOT.jar:1.1.0-SNAPSHOT]
> >>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
> >>
> >>
> >> I've found this JIRA (https://issues.apache.org/jira/browse/NIFI-2531)
> >> and it looks like the very same issue, but it's marked as resolved in
> >> 1.0.0 (which is the version I'm using). Do you have any idea?
> >>
> >> The table is on a MySQL 5.7.15 and the schema is like this:
> >>
> >> CREATE TABLE `test_table` (
> >>   `code` bigint(20) DEFAULT NULL,
> >>   `name` varchar(20) DEFAULT NULL
> >> ) ENGINE=InnoDB DEFAULT CHARSET=latin1
> >>
> >> Thanks,
> >> Yari
> >
> >
>