We have been using NiFi for over a year and just brought up a new cluster.
We move around 6 TB a day, in files ranging from small to large. We are
having an issue with ListSFTP missing files. I know this can happen when a
file with an older date is moved into the directory, because the lister
maintains state. However, it also seems to hang when there are 10k or more
files. I am running NiFi 1.6 on Ubuntu 18. The cluster has plenty of
memory, CPU, and disk space. I am also using the distributed cache because
we haven't migrated to 1.8 yet.
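
To illustrate the older-date case, here is a toy Python model of a lister
that keeps only the newest timestamp it has seen. This is a simplified
sketch of the general idea, not NiFi's actual code; the class and field
names are made up, and the real processor handles more cases than this.

```python
from dataclasses import dataclass


@dataclass
class RemoteFile:
    name: str
    mtime: int  # modification time, seconds since epoch


class TimestampLister:
    """Toy lister that remembers only the newest mtime it has seen."""

    def __init__(self):
        # Stands in for the state the real processor persists.
        self.last_seen = -1

    def list_new(self, directory):
        # Only files strictly newer than the stored timestamp are emitted.
        fresh = sorted(
            (f for f in directory if f.mtime > self.last_seen),
            key=lambda f: f.mtime,
        )
        if fresh:
            self.last_seen = fresh[-1].mtime
        return [f.name for f in fresh]


lister = TimestampLister()
directory = [RemoteFile("a.dat", 100), RemoteFile("b.dat", 200)]
first = lister.list_new(directory)

# A file with an older date is moved in after the first listing ran.
directory.append(RemoteFile("late.dat", 150))
directory.append(RemoteFile("c.dat", 300))
second = lister.list_new(directory)

print(first)   # both original files are listed
print(second)  # late.dat is never listed: its mtime is below the stored state
```

In this model late.dat stays in the directory forever, which matches the
kind of miss I expect. It does not model the hang.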

We have 20 different data flows, each with its own logic. In each one we
connect the lister to a remote port on a remote process group, which
distributes the listing across the cluster to a FetchSFTP that deletes the
files after they are loaded.

We move files into the input directory ourselves so that the NiFi fetch has
permission to delete them. The move uses a find that orders the files, to
make sure we don't grab old files. That could still be an issue and cause
us to miss a few files, but it doesn't explain why nothing gets pulled when
the lister is running and there are files waiting.
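
For reference, the ordered move can be sketched in Python roughly as below.
This is only an approximation: it assumes the ordering is oldest
modification time first, and the function name and directory arguments are
hypothetical stand-ins for what the find-based step does.

```python
import os
import shutil


def stage_oldest_first(src_dir, input_dir):
    """Move regular files from src_dir into input_dir, oldest mtime first.

    Returns the file names in the order they were moved.
    """
    entries = [e for e in os.scandir(src_dir) if e.is_file()]
    # Assumption: "ordering" means sorting by modification time, ascending.
    entries.sort(key=lambda e: e.stat().st_mtime)

    moved = []
    for entry in entries:
        shutil.move(entry.path, os.path.join(input_dir, entry.name))
        moved.append(entry.name)
    return moved
```

Note that a move like this keeps each file's original modification time, so
a file that arrives late can still look older than files the lister has
already seen.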

Any suggestions or ideas would be appreciated.

Dave


