Hi Lee,

The List+Fetch model in a cluster is one of the trickier configurations to set up.
This article has a good description, with a diagram under the "pulling section", that shows ListHDFS+FetchHDFS, but it should be the same for ListFile+FetchFile:

https://community.hortonworks.com/articles/16120/how-do-i-distribute-data-across-a-nifi-cluster.html

The short answer is that you would connect ListFile to a Remote Process Group that points back to the same cluster, and then an Input Port goes to FetchFile. It is the Remote Process Group that distributes the data across the cluster.

Hopefully this helps.

-Bryan

On Thu, Mar 24, 2016 at 4:55 PM, Lee Laim <[email protected]> wrote:

> I'm using the ListFile/FetchFile combination in cluster mode.
>
> When *ListFile is set to run on primary node* and *Fetch File is set
> to default*, The generated flow files only run on the primary node,
> other nodes sit out.
>
> When *ListFile and FetchFile is set to run on default* (timer driven),
> They generate flow files which are then consumed by all downstream nodes.
>
> Is this expected behavior? Or is something off with my deployment?
>
> What I am seeing appears to be contrary to the usage description; ListFile
> (primary) generates one list of flow files to organize and distribute work
> to the rest of the cluster.
>
> I'm running 0.5.1 on 3 nodes.
>
> Thanks,
> Lee
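
P.S. In case it helps to picture why the Remote Process Group matters, here is a rough toy model in Python. It is not NiFi code and uses no NiFi API: the node names, the helper functions, and the round-robin spreading are all assumptions made for illustration (real site-to-site distribution is decided by NiFi itself and need not be strictly round-robin). It only contrasts a listing that stays on the primary node with one that is sent through something that spreads it across the cluster.

```python
from collections import Counter
from itertools import cycle

# Hypothetical 3-node cluster; names are made up for this sketch.
NODES = ["node1", "node2", "node3"]
PRIMARY = "node1"


def list_on_primary(filenames):
    """Model ListFile scheduled on the primary node only:
    every listing flow file starts out on the primary."""
    return [(PRIMARY, name) for name in filenames]


def through_remote_process_group(flow_files, nodes=NODES):
    """Toy stand-in for a Remote Process Group that points back at the
    same cluster. Round-robin is a simplifying assumption."""
    targets = cycle(nodes)
    return [(next(targets), name) for _, name in flow_files]


def fetch_counts(flow_files):
    """Count how many flow files each node would fetch."""
    return Counter(node for node, _ in flow_files)


files = [f"file{i}.txt" for i in range(9)]
listed = list_on_primary(files)

# ListFile (primary) wired straight to FetchFile: work stays on the primary.
direct = fetch_counts(listed)

# ListFile (primary) -> Remote Process Group -> Input Port -> FetchFile.
distributed = fetch_counts(through_remote_process_group(listed))

print("direct:     ", dict(direct))
print("distributed:", dict(distributed))
```

In this toy model the direct wiring leaves all nine files on the primary node, which matches what you observed, while the version that passes through the distributing step gives each node a share.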
