Hi Mark,
Thanks for your valuable suggestion; it helped a lot. I now understand that 
there is no point in load balancing between FetchSFTP and CompressContent.

After making all the changes, the flow worked, but some of the FlowFiles are 
stuck between CompressContent and PutHDFS: https://i.imgur.com/oSYkYuA.png

The second issue is that 10 FlowFiles have been sitting between ListSFTP and 
FetchSFTP for a long time:
https://i.imgur.com/Q44VDW6.png

Please suggest where I can start debugging these two issues.
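
One possible starting point, given Mark's note below that the queue listing 
only shows FlowFiles queued for processing on the node that holds them, is to 
compare the per-node counts for each stuck connection. Below is a minimal 
Python sketch that summarizes such a per-node status. It assumes the JSON shape 
returned by NiFi's REST endpoint 
GET /nifi-api/flow/connections/{id}/status?nodewise=true; the field names 
(nodeSnapshots, statusSnapshot, flowFilesQueued, bytesQueued) should be checked 
against the running version, and the sample payload is made up for 
illustration.

```python
def per_node_queue(status_entity):
    """Map 'address:port' of each node to (FlowFiles queued, bytes queued)."""
    # Assumed response shape; verify the field names against your NiFi version.
    snapshots = status_entity["connectionStatus"].get("nodeSnapshots") or []
    result = {}
    for node in snapshots:
        snap = node["statusSnapshot"]
        key = f'{node["address"]}:{node["apiPort"]}'
        result[key] = (snap["flowFilesQueued"], snap["bytesQueued"])
    return result


# Hypothetical payload: host names, ports, and sizes are invented.
sample = {
    "connectionStatus": {
        "nodeSnapshots": [
            {"address": "node-a", "apiPort": 8080,
             "statusSnapshot": {"flowFilesQueued": 4, "bytesQueued": 64_000_000_000}},
            {"address": "node-b", "apiPort": 8080,
             "statusSnapshot": {"flowFilesQueued": 2, "bytesQueued": 32_000_000_000}},
        ]
    }
}

summary = per_node_queue(sample)
for node, (count, size) in sorted(summary.items()):
    print(f"{node}: {count} FlowFiles, {size / 1e9:.1f} GB")
```

If one node reports queued FlowFiles while the listing requested from the other 
node is empty, the data is probably sitting on (or in transit to) the other 
node rather than lost.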

Meanwhile, we are migrating to 1.10.0. This time we are installing through HDF, 
whose latest bundled version is NiFi 1.9.0. We are planning to replace the 
libraries and content of 1.9.0 with those of 1.10.0. Can we go ahead with this 
approach, or is there another way?

Currently 1.9.2 is an independent cluster. 



On 2019/12/03 14:30:43, Mark Payne <marka...@hotmail.com> wrote: 
> Nayan,
> 
> Looking at the screenshot, I can see two different connections there that are 
> load balanced. One of them holds the nearly 100 GB of data.
> 
> There are a handful of bugs related to load-balanced connections in 1.9.2 
> that were addressed in 1.10.0. If you're relying on load-balanced connections 
> to spread data across the cluster (and this particular flow clearly is), then 
> I would strongly encourage you to upgrade to 1.10.0 because at least one of 
> these bugs does cause the flow to appear to stop flowing.
> 
> That being said, there are two other things that you may want to consider:
> 
> 1. You're trying to load balance 100 GB of data spread across 6 files. So 
> each file is nearly 20 GB of data. It may take a little while to push that 
> from Node A to Node B. If the data is queued up, waiting to go to another 
> node, or is on the way to another node, it will not be shown in the FlowFile 
> listing. That will only show FlowFiles that are queued up to be processed on 
> the node that they currently live on.
> 
> 2. You should not be using a load balanced connection between FetchSFTP and 
> CompressContent. The way that these processors are designed, the listing 
> should be performed, and then the connection between ListSFTP and FetchSFTP 
> should be load balanced. Once that has happened, the listing has been 
> federated across the cluster, so whichever node receives the listing for File 
> A should be responsible for fetching and processing it. Since the listing has 
> already been spread across the cluster, there is no benefit to fetching the 
> data, and then re-spreading it across the cluster. This will be very 
> expensive with little to no benefit. Similarly, you don't want to load 
> balance between CompressContent and PutHDFS. Simply load balance the listing 
> itself (which is very cheap because the FlowFiles have no content) and the 
> data will automatically be balanced across the cluster.
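
To make the point above concrete, here is a toy Python sketch (not NiFi code) 
of a round-robin distribution of listing entries across two nodes. The file 
names, sizes, node names, and the round_robin helper are all hypothetical; the 
sizes are chosen only to roughly match the 96 GB in the queue.

```python
from itertools import cycle


def round_robin(listing, nodes):
    """Assign each listed file to a node in turn (toy round-robin model)."""
    assignment = {node: [] for node in nodes}
    # zip stops when the listing is exhausted; cycle repeats the node names.
    for node, entry in zip(cycle(nodes), listing):
        assignment[node].append(entry)
    return assignment


# Hypothetical listing: six files of 16 GB each, as (name, size in GB).
listing = [(f"file{i}.dat", 16) for i in range(1, 7)]
assignment = round_robin(listing, ["node-a", "node-b"])

# Only the content-free listing entries are redistributed; each node then
# fetches, compresses, and writes its own share of the data locally.
fetched_gb = {node: sum(size for _, size in files)
              for node, files in assignment.items()}
print(fetched_gb)
```

In this model the bulk data never moves between nodes after the fetch, which is 
why load balancing again after FetchSFTP or CompressContent adds cost without 
improving the spread.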
> 
> Thanks
> -Mark
> 
> 
> > On Dec 3, 2019, at 9:18 AM, nayan sharma <nayansharm...@gmail.com> wrote:
> > 
> > Hi,
> > Thanks for your reply.
> > Please find the attachment. The FlowFiles have been queued for the last 7 
> > days, and when I list them it says "The queue has no Flow Files".
> > Let me know your thoughts.
> > 
> > Thanks & Regards,
> > Nayan Sharma
> >  +91-8095382952
> > 
> >  <https://www.linkedin.com/in/nayan-sharma> 
> > <http://stackoverflow.com/users/3687426/nayan-sharma?tab=profile>
> > 
> > On Tue, Dec 3, 2019 at 7:34 PM Bryan Bende <bbe...@gmail.com> wrote:
> > Hello,
> > 
> > It would be helpful if you could upload a screenshot of your flow
> > somewhere and send a link.
> > 
> > Thanks,
> > 
> > Bryan
> > 
> > On Tue, Dec 3, 2019 at 6:06 AM nayan sharma <nayansharm...@gmail.com> wrote:
> > >
> > > Hi,
> > > I am using a 2-node cluster.
> > > Node config: max heap 48 GB, 64-core machines.
> > > Processor flow:
> > > ListSFTP--->FetchSFTP(all nodes with 10 threads)--->CompressContent(all 
> > > nodes,10 threads)-->PutHDFS
> > >
> > > The queue shows 96 GB of data, but when I list it, it shows no 
> > > FlowFiles.
> > >
> > > Everything seems stuck, nothing is moving.
> > >
> > > Even with such powerful machines, I am wondering what I am doing wrong, 
> > > or which config parameter is misconfigured.
> > >
> > > I couldn't find a solution by myself, so I am reaching out here. Any 
> > > help or suggestion will be highly appreciated.
> > >
> > > Thanks,
> > > Nayan
> > <Screenshot 2019-12-03 at 7.44.25 PM.png>
> 
> 
