ExecuteStreamCommand - stream closed

2016-12-20 Thread Selvam Raman
Hi,

My flow is like below:

ListDatabaseTables -> ExecuteSQL -> ExecuteStreamCommand

In ExecuteStreamCommand I am using the "ne.sh" script to write the count and table name.

ne.sh"
#!/bin/sh
echo $1" : "$2 >> /home/nifi/neurocount.txt
exit 0


The data is written to the txt file, but I always get a "stream closed"
error.

Exact exception:
write FlowFile to stdin due to -> java.io.IOException: Broken pipe
java.lang.UNIXProcess -> java.io.IOException: Stream closed

Can you please help me solve this?
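
For reference, a minimal sketch of ne.sh that also drains stdin (assuming the
broken pipe comes from ExecuteStreamCommand writing the FlowFile content to the
script's stdin while the script exits without reading it; "Ignore STDIN" is
false by default):

#!/bin/sh
# Consume whatever ExecuteStreamCommand writes to stdin so the processor's
# write does not fail with a broken pipe.
cat > /dev/null

# $1 and $2 are the command arguments configured on the processor
# (assumed here to be the table name and the row count).
echo "$1 : $2" >> /home/nifi/neurocount.txt
exit 0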



-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"


Re: Making FlowFiles environment independent

2016-12-20 Thread ddewaele
Coming back to PutTCP, is there a reason why the hostname property doesn't
support EL?

In both cases you mentioned you would like the option to externalise it or
make it dynamic.

Are there other ways of injecting a hostname in the PutTCP processor?



--
View this message in context: 
http://apache-nifi-users-list.2361937.n4.nabble.com/Making-FlowFiles-environment-independent-tp409p498.html
Sent from the Apache NiFi Users List mailing list archive at Nabble.com.


Re: Making FlowFiles environment independent

2016-12-20 Thread Bryan Bende
There is no reason it can't support EL without a flow file, so I created this
JIRA: https://issues.apache.org/jira/browse/NIFI-3231

EL per flow file, meaning the flow file specifies the host and port, would
require a fairly significant change to how the processor works internally,
but it doesn't sound like that is what we are talking about here.
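
Once that is in place, the Hostname property could be resolved from the
file-based variable registry, for example (a sketch; the property file path and
variable name below are made up, and this assumes nifi.variable.registry.properties
is available in your release):

# Register a custom properties file with the variable registry in nifi.properties:
#   nifi.variable.registry.properties=./conf/custom.properties
echo "tcp.target.host=collector.example.com" >> ./conf/custom.properties

# PutTCP's Hostname property could then be set to ${tcp.target.host} and
# resolved per environment without editing the flow.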

On Tue, Dec 20, 2016 at 7:05 AM, ddewaele  wrote:

> Coming back to PutTCP, is there a reason why the hostname property doesn't
> support EL ?
>
> In both cases you mentioned you would like the option to externalise it or
> make it dynamic.
>
> Are there other ways of injecting a hostname in the PutTCP processor ?


Re: Load-balancing web api in cluster

2016-12-20 Thread Jeff
Hello Greg,

You can use the REST API on any of the nodes in the cluster.  Could you
provide more details on what you're trying to accomplish?  If, for
instance, you are posting data to a ListenHTTP processor and you want to
balance POSTs across the instances of ListenHTTP on your cluster, then
haproxy would probably be a good idea.  If you're trying to distribute the
processing load once the data is received, you can use a Remote Process
Group to distribute the data across the cluster.  Pierre Villard has
written a nice blog about setting up a cluster and configuring a flow using
a Remote Process Group to distribute the processing load [1].  It details
creating a Remote Process Group to send data back to an Input Port in the
same NiFi cluster, and allows NiFi to distribute the processing load across
all the nodes in your cluster.

You can use a combination of haproxy and Remote Process Group to load
balance connections to the REST API on each NiFi node and to balance the
processing load across the cluster.

[1] https://pierrevillard.com/2016/08/13/apache-nifi-1-0-0-cluster-setup/
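
As a concrete illustration, any live node can answer REST calls directly (a
sketch assuming a default, unsecured install on port 8080; hostnames are
placeholders):

# Flow status from any node
curl http://nifi-node1.example.com:8080/nifi-api/flow/status

# Cluster membership (address and API port of each connected node)
curl http://nifi-node1.example.com:8080/nifi-api/controller/cluster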

- Jeff

On Mon, Dec 19, 2016 at 9:25 PM Hart, Greg 
wrote:

> Hi all,
>
> What's the recommended way for communicating with the NiFi REST API in a
> cluster? I see that NiFi uses ZooKeeper so is it possible to get the
> Cluster Coordinator hostname and API port from ZooKeeper, or should I use
> something like haproxy?
>
> Thanks!
> -Greg
>
>


Re: Load-balancing web api in cluster

2016-12-20 Thread Hart, Greg
Hi Jeff,

My application communicates with the NiFi REST API to import templates, 
instantiate flows from templates, edit processor properties, and a few other 
things. I’m currently using Jersey to send calls to one NiFi node but if that 
node goes down then my application has to be manually reconfigured with the 
hostname and port of another NiFi node. HAProxy would handle failover but it 
still must be manually reconfigured when a NiFi node is added or removed from 
the cluster.
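
For context, the kind of call involved looks roughly like this (a sketch using
curl instead of Jersey; the node address, port, and process group id are
placeholders):

# Upload a template to the root process group of whichever node is reachable
curl -F template=@my-flow-template.xml \
  http://nifi-node1.example.com:8080/nifi-api/process-groups/root/templates/upload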

I was hoping that NiFi would use ZooKeeper similarly to other applications 
(Hive or HBase) where a client can easily get the hostname and port of the 
cluster coordinator (or active master). Unfortunately, the information in 
ZooKeeper does not include the value of nifi.web.http.host and 
nifi.web.http.port of any NiFi nodes.

It sounds like HAProxy might be the better solution for now. Luckily, adding or 
removing nodes from a cluster shouldn’t be a daily occurrence. If you have any 
other ideas please let me know.

Thanks!
-Greg




Re: Load-balancing web api in cluster

2016-12-20 Thread Jeff
Greg,

NiFi does store which nodes are the primary and coordinator.  Relevant
nodes in ZK are (for instance, in a cluster I'm running locally):
/nifi/leaders/Primary Node/_c_c94f1eb8-e5ac-443c-9643-2668b6f685b2-lock-000553,
/nifi/leaders/Primary Node/_c_7cd14bd5-85f5-4ea9-b849-121496269ef4-lock-000554,
/nifi/leaders/Primary Node/_c_99b79311-495f-4619-b316-9e842d445a8d-lock-000552,
/nifi/leaders/Cluster Coordinator/_c_dc449a75-1a14-42d6-98ab-2cef3e74d616-lock-005967,
/nifi/leaders/Cluster Coordinator/_c_2fbc68df-c9cd-4ecd-99d2-234b7b801110-lock-005966,
/nifi/leaders/Cluster Coordinator/_c_a2b9c2be-c0fd-4bf7-a479-e011a7792fc3-lock-005968

The data on each of these nodes should have the host:port.  These are the
candidate nodes for being elected the Primary or Cluster Coordinator.  I
don't think that the current active Primary and Cluster Coordinator is
stored in ZK, just the nodes that are candidates to fulfill those roles.
I'll have to get back to you on that for sure, though.
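
If you want to inspect those yourself, a quick sketch with the stock ZooKeeper
CLI (the connect string is a placeholder, and this assumes the default /nifi
root path; quoting of paths containing spaces can vary by ZooKeeper version):

$ ./bin/zkCli.sh -server zk-host:2181
[zk: zk-host:2181(CONNECTED) 0] ls "/nifi/leaders"
[zk: zk-host:2181(CONNECTED) 1] ls "/nifi/leaders/Cluster Coordinator"
[zk: zk-host:2181(CONNECTED) 2] get "/nifi/leaders/Cluster Coordinator/_c_dc449a75-1a14-42d6-98ab-2cef3e74d616-lock-005967"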

- Jeff



Re: Load-balancing web api in cluster

2016-12-20 Thread Jeff
Greg,

That first statement in my previous email should read "which nodes can be
the primary or cluster coordinator".  I apologize for any confusion!

- Jeff



Re: Load-balancing web api in cluster

2016-12-20 Thread Hart, Greg
Hi Jeff,

I saw this and looked into it. The data in those nodes are the 
nifi.cluster.node.address and nifi.cluster.node.protocol.port values. In order 
to get the nifi.web.http.host and nifi.web.http.port values, it seems I would 
have to connect first using the cluster node protocol and pretend to be a NiFi 
node so that I can query the cluster coordinator for the list of NodeIdentifier 
objects. Is this cluster node protocol stable enough to use in a production 
application? It doesn’t seem to be documented anywhere so I was assuming it may 
change in a minor release without much notice.

Thanks!
-Greg


Failing to Start NiFi 1.1.0, OverlappingFileLockException

2016-12-20 Thread Peter Wicks (pwicks)
I've successfully upgraded my DEV and TEST environments from NiFi 1.0.0 to NiFi 
1.1.0. So I felt comfortable upgrading PROD until...
I completed all the work and went to start the server, but am receiving the 
below stack trace.  I dug through the code a bit and found that it's locking a 
file named wali.lock, so I tried deleting that file and starting NiFi up again 
but got the same stack dump.  The lock file did get recreated on the next run.

Our version of NiFi 1.1.0 is a couple of commits newer than official due to a 
merged branch (rather than a cherry pick). I don't have the exact commit we are 
running, but we haven't had this issue in our other environments using the same 
code.
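
A couple of quick checks that can help narrow this down (a sketch; an
OverlappingFileLockException means a lock on the same file is already held
within the same JVM, so overlapping repository or partition paths in
nifi.properties are worth ruling out, as is a stray NiFi process left over
from the upgrade):

# Any NiFi processes still running against this install?
ps aux | grep -i [n]ifi

# Do any repository/partition properties resolve to the same directory?
grep -E 'repository|partition' ./conf/nifi.properties | sort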

---BEGIN STACK TRACE---
org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'flowService': FactoryBean threw exception on object creation; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'flowController': FactoryBean threw exception on object creation; nested exception is java.lang.RuntimeException: java.nio.channels.OverlappingFileLockException
    at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.doGetObjectFromFactoryBean(FactoryBeanRegistrySupport.java:175) ~[na:na]
    at org.springframework.beans.factory.support.FactoryBeanRegistrySupport.getObjectFromFactoryBean(FactoryBeanRegistrySupport.java:103) ~[na:na]
    at org.springframework.beans.factory.support.AbstractBeanFactory.getObjectForBeanInstance(AbstractBeanFactory.java:1585) ~[na:na]
    at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:254) ~[na:na]
    at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:202) ~[na:na]
    at org.springframework.context.support.AbstractApplicationContext.getBean(AbstractApplicationContext.java:1060) ~[na:na]
    at org.apache.nifi.web.contextlistener.ApplicationStartupContextListener.contextDestroyed(ApplicationStartupContextListener.java:103) ~[na:na]
    at org.eclipse.jetty.server.handler.ContextHandler.callContextDestroyed(ContextHandler.java:845) ~[na:na]
    at org.eclipse.jetty.servlet.ServletContextHandler.callContextDestroyed(ServletContextHandler.java:546) ~[na:na]
    at org.eclipse.jetty.server.handler.ContextHandler.stopContext(ContextHandler.java:826) ~[na:na]
    at org.eclipse.jetty.servlet.ServletContextHandler.stopContext(ServletContextHandler.java:356) ~[na:na]
    at org.eclipse.jetty.webapp.WebAppContext.stopWebapp(WebAppContext.java:1410) ~[na:na]
    at org.eclipse.jetty.webapp.WebAppContext.stopContext(WebAppContext.java:1374) ~[na:na]
    at org.eclipse.jetty.server.handler.ContextHandler.doStop(ContextHandler.java:874) ~[na:na]
    at org.eclipse.jetty.servlet.ServletContextHandler.doStop(ServletContextHandler.java:272) ~[na:na]
    at org.eclipse.jetty.webapp.WebAppContext.doStop(WebAppContext.java:544) ~[na:na]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) ~[na:na]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:143) ~[na:na]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:161) ~[na:na]
    at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:73) ~[na:na]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) ~[na:na]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:143) ~[na:na]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:161) ~[na:na]
    at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:73) ~[na:na]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) ~[na:na]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.stop(ContainerLifeCycle.java:143) ~[na:na]
    at org.eclipse.jetty.util.component.ContainerLifeCycle.doStop(ContainerLifeCycle.java:161) ~[na:na]
    at org.eclipse.jetty.server.handler.AbstractHandler.doStop(AbstractHandler.java:73) ~[na:na]
    at org.eclipse.jetty.server.Server.doStop(Server.java:482) ~[na:na]
    at org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89) ~[na:na]
    at org.apache.nifi.web.server.JettyServer.stop(JettyServer.java:854) ~[na:na]
    at org.apache.nifi.NiFi.shutdownHook(NiFi.java:187) [nifi-runtime-1.1.0.jar:1.1.0]
    at org.apache.nifi.NiFi$2.run(NiFi.java:88) [nifi-runtime-1.1.0.jar:1.1.0]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'flowController': FactoryBean thr

Re: Load-balancing web api in cluster

2016-12-20 Thread Jeff
Greg,

Again, I have to apologize.  You're right, the host:port in ZK are for the
cluster, not the NiFi UI.  Also, I was told those znodes are created by
Curator rather than explicitly by NiFi, so I'd be hesitant to rely on that
information.  The cluster node protocol is production-stable, in my opinion,
but it's not part of the public API.

I created a feature request JIRA to expose the hostnames and UI ports of the
nodes in a NiFi cluster [1].

[1] https://issues.apache.org/jira/browse/NIFI-3237


Re: Load-balancing web api in cluster

2016-12-20 Thread Hart, Greg
Hi Jeff,

I appreciate the help!

Thanks!
-Greg
