Chakri,

Glad you got site-to-site working.

Regarding the data distribution, I'm not sure why it is behaving that way.
I just did a similar test running ncm, node1, and node2 all on my local
machine, with GenerateFlowFile running every 10 seconds, and Input Port
going to a LogAttribute, and I see it alternating between node1 and node2
logs every 10 seconds.
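
As a rough illustration of the alternation described above, here is a toy Python sketch. It is not NiFi code and does not claim to show how site-to-site actually selects a peer; strict round-robin is an assumption made for the example, and the function name `distribute_round_robin` and the flowfile labels are hypothetical.

```python
from itertools import cycle


def distribute_round_robin(flowfiles, nodes):
    """Assign each flowfile to exactly one node, rotating through the nodes.

    Illustrative assumption only: strict round-robin, no load weighting.
    """
    assignment = {node: [] for node in nodes}
    for flowfile, node in zip(flowfiles, cycle(nodes)):
        assignment[node].append(flowfile)
    return assignment


# One flowfile generated every 10 seconds for one minute (hypothetical labels).
flowfiles = [f"flowfile-at-{t}s" for t in range(0, 60, 10)]
result = distribute_round_robin(flowfiles, ["node1", "node2"])

for node, files in result.items():
    print(node, files)
```

Each flowfile lands on exactly one node, so with two nodes the logs should show entries alternating between them rather than every file appearing on both.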

Is there anything in your primary node logs
(primary_node/logs/nifi-app.log) when you see the data on the other node?

-Bryan


On Sun, Jan 10, 2016 at 3:44 PM, Joe Witt <joe.w...@gmail.com> wrote:

> Chakri,
>
> Would love to hear what you've learned and how that differed from the
> docs themselves.  Site-to-site has proven difficult to set up so we're
> clearly not there yet in having the right operator/admin experience.
>
> Thanks
> Joe
>
> On Sun, Jan 10, 2016 at 3:41 PM, Chakrader Dewaragatla
> <chakrader.dewaraga...@lifelock.com> wrote:
> > I was able to get site-to-site working.
> > I tried to follow your instructions to distribute data across the
> > nodes.
> >
> > GenerateFlowFile (on primary) -> RPG
> > RPG -> Input Port -> PutFile (timer-driven scheduling)
> >
> > However, data is only written to one slave (the secondary slave). The
> > primary slave has no data.
> >
> > Image screenshot :
> > http://tinyurl.com/jjvjtmq
> >
> > From: Chakrader Dewaragatla <chakrader.dewaraga...@lifelock.com>
> > Date: Sunday, January 10, 2016 at 11:26 AM
> >
> > To: "users@nifi.apache.org" <users@nifi.apache.org>
> > Subject: Re: Nifi cluster features - Questions
> >
> > Bryan – Thanks – I am trying to set up site-to-site.
> > I have two slaves and one NCM.
> >
> > My properties are as follows:
> >
> > On both Slaves:
> >
> > nifi.remote.input.socket.port=10880
> > nifi.remote.input.secure=false
> >
> > On NCM:
> > nifi.remote.input.socket.port=10880
> > nifi.remote.input.secure=false
> >
> > When I try to drop a remote process group (with http://<NCM IP>:8080/nifi),
> > I see the following error for the two nodes.
> >
> > [<Slave1 ip>:8080] - Remote instance is not allowed for Site to Site communication
> > [<Slave2 ip>:8080] - Remote instance is not allowed for Site to Site communication
> >
> > Do you have any insight into why it is trying to connect to 8080 on the
> > slaves? When does port 10880 come into the picture? I remember setting up
> > site-to-site a few months back and succeeding.
> >
> > Thanks,
> > -Chakri
> >
> >
> >
> > From: Bryan Bende <bbe...@gmail.com>
> > Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> > Date: Saturday, January 9, 2016 at 11:22 AM
> > To: "users@nifi.apache.org" <users@nifi.apache.org>
> > Subject: Re: Nifi cluster features - Questions
> >
> > The sending node (where the remote process group is) will distribute the
> > data evenly across the two nodes, so an individual file will only be sent
> > to one of the nodes. You could think of it as if a separate NiFi instance
> > were sending directly to a two-node cluster: it would be evenly
> > distributing the data across the two nodes. In this case it just so happens
> > to all be within the same cluster.
> >
> > The most common use case for this scenario is the List and Fetch
> > processors, such as those for HDFS. You can perform the listing on the
> > primary node and then distribute the results so the fetching takes place
> > on all nodes.
> >
> > On Saturday, January 9, 2016, Chakrader Dewaragatla
> > <chakrader.dewaraga...@lifelock.com> wrote:
> >>
> >> Bryan – Thanks. How do the nodes distribute the load for an input port?
> >> As the port is open and listening on both nodes, does it copy the same
> >> files to both nodes?
> >> I need to try this setup to see the results. I appreciate your help.
> >>
> >> Thanks,
> >> -Chakri
> >>
> >> From: Bryan Bende <bbe...@gmail.com>
> >> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> >> Date: Friday, January 8, 2016 at 3:44 PM
> >> To: "users@nifi.apache.org" <users@nifi.apache.org>
> >> Subject: Re: Nifi cluster features - Questions
> >>
> >> Hi Chakri,
> >>
> >> I believe the DistributeLoad processor is more for load balancing when
> >> sending to downstream systems. For example, if you had two HTTP
> >> endpoints, you could have the first relationship from DistributeLoad
> >> going to a PostHTTP that posts to endpoint #1, and the second
> >> relationship going to a second PostHTTP that posts to endpoint #2.
> >>
> >> If you want to distribute the data within the cluster, then you need to
> >> use site-to-site. The way you do this is the following...
> >>
> >> - Add an Input Port connected to your PutFile.
> >> - Add GenerateFlowFile scheduled on primary node only, connected to a
> >> Remote Process Group. The Remote Process Group should be connected to
> >> the Input Port from the previous step.
> >>
> >> So both nodes have an input port listening for data, but only the
> >> primary node produces a FlowFile and sends it to the RPG, which then
> >> re-distributes it back to one of the Input Ports.
> >>
> >> In order for this to work you need to set nifi.remote.input.socket.port
> >> in nifi.properties to some available port, and you probably want
> >> nifi.remote.input.secure=false for testing.
> >>
> >> -Bryan
> >>
> >>
> >> On Fri, Jan 8, 2016 at 6:27 PM, Chakrader Dewaragatla
> >> <chakrader.dewaraga...@lifelock.com> wrote:
> >>>
> >>> Mark – I have set up a two-node cluster and tried the following:
> >>> GenerateFlowFile processor (run only on primary node) -> DistributeLoad
> >>> processor (RoundRobin) -> PutFile
> >>>
> >>> >> The GetFile/PutFile will run on all nodes (unless you schedule it to
> >>> >> run on primary node only).
> >>> From your comment above, it should put files on both nodes, but it puts
> >>> files on the primary node only. Any thoughts?
> >>>
> >>> Thanks,
> >>> -Chakri
> >>>
> >>> From: Mark Payne <marka...@hotmail.com>
> >>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> >>> Date: Wednesday, October 7, 2015 at 11:28 AM
> >>>
> >>> To: "users@nifi.apache.org" <users@nifi.apache.org>
> >>> Subject: Re: Nifi cluster features - Questions
> >>>
> >>> Chakri,
> >>>
> >>> Correct - when NiFi instances are clustered, they do not transfer data
> >>> between the nodes. This is very different
> >>> than you might expect from something like Storm or Spark, as the key
> >>> goals and design are quite different.
> >>> We have discussed providing the ability to allow the user to indicate
> >>> that they want to have the framework
> >>> do load balancing for specific connections in the background, but it's
> >>> still in more of a discussion phase.
> >>>
> >>> Site-to-Site is simply the capability that we have developed to
> >>> transfer data between one instance of NiFi and another instance of
> >>> NiFi. So currently, if we want to do load balancing across the cluster,
> >>> we would create a site-to-site connection (by dragging a Remote Process
> >>> Group onto the graph) and give that site-to-site connection the URL of
> >>> our cluster. That way, you can push data to your own cluster,
> >>> effectively providing a load balancing capability.
> >>>
> >>> If you were to just run ListenHTTP without setting it to Primary Node,
> >>> then every node in the cluster will be listening
> >>> for incoming HTTP connections. So you could then use a simple load
> >>> balancer in front of NiFi to distribute the load
> >>> across your cluster.
> >>>
> >>> Does this help? If you have any more questions we're happy to help!
> >>>
> >>> Thanks
> >>> -Mark
> >>>
> >>>
> >>> On Oct 7, 2015, at 2:32 PM, Chakrader Dewaragatla
> >>> <chakrader.dewaraga...@lifelock.com> wrote:
> >>>
> >>> Mark - Thanks for the notes.
> >>>
> >>> >> The other option would be to have a ListenHTTP processor run on
> >>> >> Primary Node only and then use Site-to-Site to distribute the data to
> >>> >> other nodes.
> >>> Let's say I have a 5-node cluster and a ListenHTTP processor on the
> >>> primary node. Is the data collected on the primary node not transferred
> >>> to the other nodes by default for processing, even though all nodes are
> >>> part of one cluster?
> >>> If the ListenHTTP processor is running with the default settings
> >>> (without explicitly setting it to run on the primary node), how is the
> >>> data transferred to the rest of the nodes? Does site-to-site come into
> >>> play when I set one processor to run on the primary node?
> >>>
> >>> Thanks,
> >>> -Chakri
> >>>
> >>> From: Mark Payne <marka...@hotmail.com>
> >>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
> >>> Date: Wednesday, October 7, 2015 at 7:00 AM
> >>> To: "users@nifi.apache.org" <users@nifi.apache.org>
> >>> Subject: Re: Nifi cluster features - Questions
> >>>
> >>> Hello Chakri,
> >>>
> >>> When you create a cluster of NiFi instances, each node in the cluster
> >>> is acting independently and in exactly the same way. I.e., if you have
> >>> 5 nodes, all 5 nodes will run exactly the same flow. However, they will
> >>> be pulling in different data and therefore operating on different data.
> >>>
> >>> So if you pull in 10 1-gig files from S3, each of those files will be
> >>> processed on the node that pulled the data
> >>> in. NiFi does not currently shuffle data around between nodes in the
> >>> cluster (you can use site-to-site to do
> >>> this if you want to, but it won't happen automatically). If you set the
> >>> number of Concurrent Tasks to 5, then
> >>> you will have up to 5 threads running for that processor on each node.
> >>>
> >>> The only exception to this is the Primary Node. You can schedule a
> >>> Processor to run only on the Primary Node
> >>> by right-clicking on the Processor, and going to the Configure menu. In
> >>> the Scheduling tab, you can change
> >>> the Scheduling Strategy to Primary Node Only. In this case, that
> >>> Processor will only be triggered to run on
> >>> whichever node is elected the Primary Node (this can be changed in the
> >>> Cluster management screen by clicking
> >>> the appropriate icon in the top-right corner of the UI).
> >>>
> >>> The GetFile/PutFile will run on all nodes (unless you schedule it to
> >>> run on primary node only).
> >>>
> >>> If you are attempting to have a single input running HTTP and then push
> >>> that out across the entire cluster to process the data, you would have
> >>> a few options. First, you could just use an HTTP Load Balancer in front
> >>> of NiFi. The other option would be to have a ListenHTTP processor run
> >>> on Primary Node only and then use Site-to-Site to distribute the data
> >>> to other nodes.
> >>>
> >>> For more info on site-to-site, you can see the Site-to-Site section of
> >>> the User Guide at
> >>>
> >>> http://nifi.apache.org/docs/nifi-docs/html/user-guide.html#site-to-site
> >>>
> >>> If you have any more questions, let us know!
> >>>
> >>> Thanks
> >>> -Mark
> >>>
> >>> On Oct 7, 2015, at 2:33 AM, Chakrader Dewaragatla
> >>> <chakrader.dewaraga...@lifelock.com> wrote:
> >>>
> >>> NiFi Team – I would like to understand the advantages of a NiFi
> >>> clustering setup.
> >>>
> >>> Questions:
> >>>
> >>>  - How does a workflow run on multiple nodes? Does it share resources
> >>> between nodes?
> >>> Let's say I need to pull ten 1 GB files from S3: how is the workload
> >>> distributed? If I set concurrent tasks to 5, does it spawn 5 tasks per
> >>> node?
> >>>
> >>>  - How do I “isolate” a processor to the master node (or one node)?
> >>>
> >>> - With GetFile/PutFile processors in a cluster setup, do they get/put on
> >>> the primary node? How do I force a processor to look at one of the slave
> >>> nodes?
> >>>
> >>> - How can we have a workflow where the input side receives requests
> >>> (HTTP) and the rest of the pipeline runs in parallel on all the nodes?
> >>>
> >>> Thanks,
> >>> -Chakri
> >>>
> >>> ________________________________
> >>> The information contained in this transmission may contain privileged
> >>> and confidential information. It is intended only for the use of the
> >>> person(s) named above. If you are not the intended recipient, you are
> >>> hereby notified that any review, dissemination, distribution or
> >>> duplication of this communication is strictly prohibited. If you are
> >>> not the intended recipient, please contact the sender by reply email
> >>> and destroy all copies of the original message.
> >>> ________________________________
> >>>
> >>>
> >>
> >>
> >
> >
> >
>
