Thank you both very much for responding, Denes and Joe. I looked at your code and see that you built a custom ListFile processor that leverages the state info to identify and save new directories that have appeared since the prior run (please do correct me if I don't have that quite right after my initial review). I'm going to try this too. And I think this would be a very useful feature.
Most of the things I get tasked with here are of the "get it done yesterday" variety, and so on Friday evening I rolled my own to get something up and running. You asked about other solutions, and so I'll tell you what I did very briefly.

1- I use a GenerateFlowFile processor to generate a small 1KB trigger file. That processor is configured as CRON-driven. I can adjust that easily to whatever periodicity the customer requires.
2- That trigger FlowFile then causes an ExecuteScript processor to run. It executes a very simple Python script.
3- The script first reads into a list the subdirectories we've already seen, which I persist in a small configuration file.
4- The code then uses os commands to generate a list of current subdirectories in the snapshot.
5- It compares the two lists using Python list functions, and the difference is the set of directories for which my flow issues alerts.
6- The trigger file payload is replaced by the product of the list difference, and I create a few attributes too.
7- As a final step I append to the config file the new subdirs identified this cycle.

I am happy to share the script here if anyone wants to see it. In truth it is pretty underwhelming, with just a few interesting list operations.

Jim

On Mon, Mar 11, 2019 at 12:51 PM Denes Arvay <[email protected]> wrote:

> Hi Jim,
>
> I suppose you want to monitor the newly created but still empty
> subdirectories, right? ListFile doesn't list those and I'm not aware of any
> processor for this purpose.
> I created a quick patch for ListFile, feel free to use it as is or as a
> starting point:
> https://github.com/apache/nifi/compare/master...adenes:listfile-list-dirs?expand=1
> I'm curious if the community knows any other solution, if not and this
> seems to be useful feature I'm happy to file a Jira and open a pull request.
>
> Best,
> Denes
>
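
The list comparison described in steps 3 to 7 above could be sketched roughly as follows. This is a minimal standalone Python illustration, not the script actually used in the flow: the function names and the one-name-per-line config file format are assumptions made for the example, and it is not wired into ExecuteScript (no FlowFile payload or attributes are written).

```python
import os


def load_seen(config_path):
    """Return the subdirectory names recorded by earlier runs (empty on first run)."""
    if not os.path.exists(config_path):
        return []
    with open(config_path) as f:
        return [line.strip() for line in f if line.strip()]


def list_subdirs(root):
    """Return the names of the immediate subdirectories of root, sorted."""
    return sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )


def find_new_subdirs(root, config_path):
    """Return subdirectories not seen before and append them to the config file.

    Assumed (hypothetical) config format: one directory name per line.
    """
    seen = set(load_seen(config_path))
    # The "list difference": current subdirectories minus those already recorded.
    new_dirs = [d for d in list_subdirs(root) if d not in seen]
    if new_dirs:
        # Append only this cycle's new names, so the file grows over time.
        with open(config_path, "a") as f:
            for d in new_dirs:
                f.write(d + "\n")
    return new_dirs
```

Calling find_new_subdirs(root, config_path) once per trigger would return only the directories that appeared since the previous call; in the flow described above, that result is what would replace the trigger file payload.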
