Re: NIFI RabbitMQ Processor

2015-10-02 Thread Mark Payne
Dave,

In general, I would recommend mocking the client rather than the AMQP broker.
That way you don't have to make any sort of network connection (even back to
localhost), and it becomes much easier to mock exceptions being thrown, etc.

Generally, when I write a processor that is going to reach out to some external 
service, I will create a method
in my processor:

protected SomeTypeOfClient getClient() {
    return client;
}

This allows me to easily override this method in a subclass that I use for unit
tests. I can then mock out the client however I need to.
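
A minimal sketch of the pattern described above, not taken from the NiFi codebase. MessageClient, PublishingProcessor and TestablePublishingProcessor are hypothetical names, and the stubs are hand-written lambdas rather than a mocking library, so nothing here needs a broker or a network connection:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class ClientOverrideSketch {

    /** Hypothetical stand-in for the real client (e.g. an AMQP or Kafka client). */
    interface MessageClient {
        void publish(String message) throws IOException;
    }

    /** Hypothetical processor-like class; it reaches the client only via getClient(). */
    static class PublishingProcessor {
        private final MessageClient client;

        PublishingProcessor(MessageClient client) {
            this.client = client;
        }

        // The overridable seam: production code returns the real client.
        protected MessageClient getClient() {
            return client;
        }

        /** Returns "success" or "failure", loosely mimicking relationship routing. */
        String process(String message) {
            try {
                getClient().publish(message);
                return "success";
            } catch (IOException e) {
                return "failure";
            }
        }
    }

    /** Test subclass: swaps in a stub, so no network connection is made. */
    static class TestablePublishingProcessor extends PublishingProcessor {
        private final MessageClient stub;

        TestablePublishingProcessor(MessageClient stub) {
            super(null); // the real client is never used in tests
            this.stub = stub;
        }

        @Override
        protected MessageClient getClient() {
            return stub;
        }
    }

    public static void main(String[] args) {
        // Stub that records what was published.
        List<String> sent = new ArrayList<>();
        PublishingProcessor happy = new TestablePublishingProcessor(sent::add);
        System.out.println(happy.process("hello") + " " + sent);

        // Stub that simulates the broker being unreachable.
        PublishingProcessor broken = new TestablePublishingProcessor(message -> {
            throw new IOException("simulated broker failure");
        });
        System.out.println(broken.process("hello"));
    }
}
```

Because the failure case is just a stub that throws, the error path can be exercised as easily as the success path.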

Thanks
-Mark


> On Oct 2, 2015, at 1:39 AM, DAVID SMITH  wrote:
> 
> Hi Chris
> 
> I have produced a set of processors which work with Java and C++ brokers (but 
> not RabbitMQ), but like you I haven't raised a ticket yet because
> I am unsure how to mock an AMQP broker.
> I would certainly be interested in seeing how you would do this.
> 
> Many thanks
> Dave
> 
> Sent from Yahoo! Mail on Android
> 



Re: Source code for Version 0.3.0

2015-10-02 Thread Adam Taft
Just bumping this conversation.  Did we end up addressing this?  Are we
going for a signed release tag?  If so, does it make sense for the 0.3.0
tag to be signed by the release manager (I believe Matt Gilman)?  Or maybe
just an unsigned tag?

Thanks,

Adam


On Mon, Sep 21, 2015 at 2:28 PM, Joe Witt  wrote:

> Looks fairly straightforward to sign a release [1].
>
> What is the workflow you'd suggest?  Can we keep our current process
> and once the vote is done just add a step to make a new identical (but
> signed) tag with a name that doesn't include '-RC#'?
>
> I'm good with that.  I understand why the RC# throws folks off so
> happy to sort this out.
>
> [1] http://gitready.com/advanced/2014/11/02/gpg-sign-releases.html
>
> On Mon, Sep 21, 2015 at 12:42 PM, Ryan Blue  wrote:
> > +1 for a nifi-0.3.0 release tag. Signed is even better, but I don't think
> > I'd mind if it weren't signed.
> >
> > rb
> >
> >
> > On 09/21/2015 06:35 AM, Sean Busbey wrote:
> >>
> >> The pattern I've liked the most on other projects is to create a
> >> proper release tag, signed by the RM on passage of the release vote. I
> >> don't recall off-hand what the phrasing was in the VOTE thread (if
> >> any).
> >>
> >> On Mon, Sep 21, 2015 at 8:13 AM, Adam Taft  wrote:
> >>>
> >>> What are the thoughts on creating a proper 0.3.0 tag, as would be
> >>> traditional for a final release?  It is arguably a little confusing to
> >>> only have the RC tags when looking for the final release.  I found this
> >>> head-scratching for 0.2.0 as well.
> >>>
> >>> Adam
> >
> >
> >
> >
> > --
> > Ryan Blue
> > Software Engineer
> > Cloudera, Inc.
>


Re: Source code for Version 0.3.0

2015-10-02 Thread Sean Busbey
If we're going with tags, I'd love one for each previous release.




-- 
Sean


Re: Source code for Version 0.3.0

2015-10-02 Thread Dan Bress
I think a tag for each release signed by the person who originally released it 
would make the most sense to anyone looking at our codebase.

Dan Bress
Software Engineer
ONYX Consulting Services


From: Sean Busbey 
Sent: Friday, October 2, 2015 11:35 AM
To: dev@nifi.apache.org
Subject: Re: Source code for Version 0.3.0

If we're going with tags, I'd love one for each previous release.




--
Sean


flow.tar.stale

2015-10-02 Thread Corey Flowers
I have a cluster that is running in production and the nodes within the
cluster keep falling out. The flow.tar is stuck in flow.tar.stale but I
can't tell which server is causing the timing issue. In Apache NiFi, which
conf property actually increases the time allowed for a node to respond? I see:

nifi.cluster.manager.flow.retrieval.delay
description: the delay before the cluster manager retrieves the latest flow
configuration.
But I thought this pulled the flow.xml out of memory to save to disk.

What I need is to increase the time before a node drops out of the cluster
because the flow.tar is stale.

Also, it would be great if in the logs it said which node had not responded
and times of each successful response. This would greatly help to identify
systems that are the slow pokes or the ones that are potentially too busy
to respond fast enough.

Thanks!

-- 
Corey Flowers
Vice President, Onyx Point, Inc
(410) 541-6699
cflow...@onyxpoint.com

-- This account not approved for unencrypted proprietary information --


Re: Interactive Queue management

2015-10-02 Thread Matt Gilman
Joe,

Yes, as Mark mentioned, it is definitely awesome that you're interested in
digging in here. Most of the discussions regarding this feature have been
really high level at this point, so we're happy to work through some of the
details as Mark has begun. A couple of points come to mind right now.

- I don't think we want to support manual prioritization. The connections
can be configured with prioritizers and we'd like to use those to manage
the ordering of the enqueued FlowFiles. The listing of FlowFiles will be
ordered by priority by default, though it will likely support sorting by
any of the fields.

- The current thought process is that we'll want to require source and
destination components to be stopped. This is in line with the existing
functionality throughout the application.

- The number of enqueued FlowFiles is technically unbounded. Because of
this, the endpoint may need to support some sort of pagination, since we
may not want (or be able) to return the entire queue in a single response.
There is some concern about Java heap, since many of the FlowFiles may be
swapped out to disk. Additionally, there are some concerns about HTTP
response size and the amount of data we store client side.

- What we do with the FlowFiles that are swapped out to disk is still
undecided. Not sure whether we want to load them from disk in order to
include them in the response or if we just show that X number of FlowFiles
are currently swapped out.
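
To make the pagination and swapped-out points concrete, here is a rough sketch of one possible shape for a paged listing. It is only an illustration under stated assumptions: Page, listPage and the offset/limit parameters are hypothetical names, not part of any existing NiFi API, and it takes the "just report the swapped-out count" option, which is still undecided.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class QueueListingSketch {

    /** Hypothetical response model: one page of entries plus overall counts. */
    static final class Page<T> {
        final List<T> items;       // entries in this page, in queue order
        final int offset;          // start index actually used (clamped to the queue size)
        final int activeCount;     // entries currently held in memory
        final int swappedOutCount; // entries on disk, reported but not loaded

        Page(List<T> items, int offset, int activeCount, int swappedOutCount) {
            this.items = items;
            this.offset = offset;
            this.activeCount = activeCount;
            this.swappedOutCount = swappedOutCount;
        }
    }

    /** Returns at most 'limit' entries starting at 'offset', never the whole queue. */
    static <T> Page<T> listPage(List<T> activeQueue, int swappedOutCount, int offset, int limit) {
        if (offset < 0 || limit <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and limit must be > 0");
        }
        int from = Math.min(offset, activeQueue.size());
        // Widen to long so a very large limit cannot overflow the end index.
        int to = (int) Math.min((long) from + limit, activeQueue.size());
        List<T> items = new ArrayList<>(activeQueue.subList(from, to));
        return new Page<>(items, from, activeQueue.size(), swappedOutCount);
    }

    public static void main(String[] args) {
        List<String> queue = Arrays.asList("ff-0", "ff-1", "ff-2", "ff-3", "ff-4");
        Page<String> page = listPage(queue, 20000, 2, 2);
        System.out.println(page.items + " of " + page.activeCount
                + " active, " + page.swappedOutCount + " swapped out");
    }
}
```

Bounding each response by a limit keeps both the heap usage on the server and the response size sent to the client independent of the queue depth.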

Some of these items will need to be hashed out but we're happy to work
through them with you. We should keep the Feature Proposal up to date as
well [1].

Thanks!

Matt

[1]
https://cwiki.apache.org/confluence/display/NIFI/Interactive+Queue+Management

On Thu, Oct 1, 2015 at 12:34 PM, Mark Payne  wrote:

> Joe,
>
> First of all, it is awesome that you're interested in jumping on this! And
> I think you're off to a great
> start and have a really good understanding of exactly where we all want to
> go with this.
>
> I'm sure there will be a lot of questions that will come up in working
> through a lot of the
> stuff here. Just from reading through the email here I have a couple of
> comments/thoughts that
> may help to shape the way forward. This is a bit of a stream of
> consciousness, so I hope all
> makes sense :)
>
> The connectionQueueItem model that you lay out here, I think is really
> just a FlowFile.
> I think it will make sense to just use the name flowFile.
>
> When you bring up the contents of a queue in the UI, I would imagine that
> it would be shown
> as something similar to the Data Provenance table. From there I'd want to
> click on the FlowFile
> in the table to see more details. So I'm envisioning two separate data
> models really. The first
> would be maybe a FlowFileSummary. It would look very similar to what
> you've laid out below,
> but perhaps contain information about how long the FlowFile has been
> queued up, perhaps
> how many times it has been re-queued on this particular queue (for
> example, if a FlowFile keeps
> failing to process, we could use this information to remove that
> particular FlowFile from the
> queue, etc.)
>
> When we get more info for the FlowFile, I would expect it to contain all
> FlowFile Attributes. This
> I think is a different data model because if we pull back all attributes
> for every FlowFile when
> we render the table, the amount of data brought back could be huge.
>
> Another consideration here, is that when a connection has a lot of
> FlowFiles on it, the framework
> may swap those FlowFiles out to disk in order to remove them from the Java
> heap. We will
> want to ensure that we include info about how much is in the queue (# of
> FlowFiles and size of those
> FlowFiles), how much is swapped out (# of FlowFiles + size), and how much
> is currently being
> processed by Processors (in the FlowFileQueue this is referenced as
> Unacknowledged FlowFiles).
>
> In the UI table, we should also make sure that by default we are showing
> the FlowFiles in the order
> in which they exist in the queue right now.
>
> From a RESTful perspective, we may want to also consider that in order to
> purge a queue, we are not
> really deleting the queue itself, but rather its contents. So perhaps we
> should use a URI like
>
> http://your-host/nifi-api/controller/process-groups/{process-group-id}/connections/{connection-id}/queue/contents
> but I'll be the first to admit that REST is not really my forte. So if
> that doesn't make sense then ignore that.
>
> Very excited to see you jumping in here!
>
> Thanks
> -Mark
>
>
>
> > On Oct 1, 2015, at 12:13 PM, József Mészáros 
> wrote:
> >
> > Hey NiFi experts :-)
> >
> > I have started to work on the backend part of interactive queue
> management,
> > which has several related issues: NIFI-99 (Review in flight flow file
> > details) 

Re: flow.tar.stale

2015-10-02 Thread Mark Payne
Corey,

I think the properties you're looking for are:

nifi.cluster.manager.node.api.read.timeout - the amount of time to wait between 
each successful transfer of data before considering it an error. I.e., if we go 
this amount of time (30 secs by default) without receiving any data from the 
node, it will timeout.

nifi.cluster.manager.node.api.connection.timeout - the amount of time to wait 
for a connection to be established before timing out.
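
For illustration, the two settings would look roughly like this in conf/nifi.properties. The 60 sec values and the "sec" unit spelling are assumptions made for the example, not recommendations; only the 30-second read-timeout default is stated above:

```
# Illustrative values only (assumed); tune to your own cluster.
nifi.cluster.manager.node.api.read.timeout=60 sec
nifi.cluster.manager.node.api.connection.timeout=60 sec
```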

If you are seeing timeouts without any indication of which node is the problem, 
I certainly agree that is a problem. Can you provide the actual error message 
that you are seeing, so that it's easier to understand where in the code the 
timeout is actually occurring?

In the meantime, you should see timing info if you add the following line to 
your conf/logback.xml file:


That will provide some pretty verbose logging, though, as it logs timing info 
for each request to each node, as well as min, max, average.

Thanks
-Mark






Re: NIFI RabbitMQ Processor

2015-10-02 Thread DAVID SMITH
Mark
Thanks very much for the advice, that sounds a bit easier. Do you have an
example where you have done this sort of thing that I could have a look at?
Many thanks
Dave



Re: NIFI RabbitMQ Processor

2015-10-02 Thread Mark Payne
Dave,

Absolutely.

The PutKafka processor (and associated TestPutKafka unit test class), which are
found in the nar-bundles/nifi-kafka-bundle/nifi-kafka-processors module, follow
this pattern.

The YandexTranslate processor (in the nifi-language-translation-bundle) also
follows this pattern.

Thanks
-Mark





[GitHub] nifi pull request: NIFI-1020

2015-10-02 Thread randerzander
GitHub user randerzander opened a pull request:

https://github.com/apache/nifi/pull/99

NIFI-1020

Bumping Kafka API version to prevent PutKafka from writing to only one 
partition.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/randerzander/nifi master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nifi/pull/99.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #99


commit da907d3d94b594e59859f1906a967fac5b0eba0c
Author: Randy Gelhausen 
Date:   2015-10-02T22:09:02Z

Bumped nifi-kafka-processors Kafka version




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] nifi pull request: NIFI-991: Add "upsert" verb support for Convert...

2015-10-02 Thread randerzander
Github user randerzander commented on the pull request:

https://github.com/apache/nifi/pull/93#issuecomment-145168859
  
Closing after a suggestion to use the existing processor in input mode with a
downstream ReplaceText processor (replace "INSERT" with "UPSERT").




[GitHub] nifi pull request: NIFI-991: Add "upsert" verb support for Convert...

2015-10-02 Thread randerzander
Github user randerzander closed the pull request at:

https://github.com/apache/nifi/pull/93

