[ https://issues.apache.org/jira/browse/NIFI-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sivaprasanna Sethuraman resolved NIFI-2705.
-------------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.1.0

Fixed in 1.1.0 release. See the related issue NIFI-2831

> ListHDFS Cannot Be Re-run
> -------------------------
>
>                 Key: NIFI-2705
>                 URL: https://issues.apache.org/jira/browse/NIFI-2705
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework, Documentation & Website
>    Affects Versions: 1.0.0
>            Reporter: Alan Jackoway
>            Priority: Major
>             Fix For: 1.1.0
>
>
> I have a use case where every day I want to go through a directory in HDFS
> and do something to the files more than a month old.
> I was trying to do this with a flow like ListHDFS -> RouteOnAttribute
> (hdfs.lastModified) -> FetchHDFS -> Processing.
> However, after I ran it once, old files were not pulled any more. I turned on
> debug logging and got this:
> {noformat}
> 2016-08-30 06:15:17,473 DEBUG [Timer-Driven Process Thread-9]
> o.apache.nifi.processors.hadoop.ListHDFS
> ListHDFS[id=d80a1ceb-0156-1000-595d-978dcf53ecb6] Found a total of 3 files in
> HDFS
> 2016-08-30 06:15:17,473 DEBUG [Timer-Driven Process Thread-9]
> o.apache.nifi.processors.hadoop.ListHDFS
> ListHDFS[id=d80a1ceb-0156-1000-595d-978dcf53ecb6] Of the 3 files found in
> HDFS, 0 are listable
> 2016-08-30 06:15:17,473 DEBUG [Timer-Driven Process Thread-9]
> o.apache.nifi.processors.hadoop.ListHDFS
> ListHDFS[id=d80a1ceb-0156-1000-595d-978dcf53ecb6] There is no data to list.
> Yielding.
> {noformat}
> It turns out that ListHDFS maintains state called {{latestTimestampListed}}
> that prevents it from re-listing files unless you change the directory being
> listed. At a minimum, that should be mentioned in the docs on ListHDFS.
> Better would be to make it configurable more like GetHDFS.
> In my case I think I can change to using GetHDFS without causing trouble, but
> the behavior of ListHDFS was surprising to me, and as far as I can tell is
> not documented anywhere.

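
For readers trying to picture the behavior described above, here is a rough sketch of how a stored "latest timestamp listed" value can suppress re-listing. It is a simplified model, not NiFi's actual ListHDFS code: the names FileStatus, TimestampListing, list_new, and reset are hypothetical, and the real processor's comparison and state handling may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileStatus:
    """Hypothetical record for one file returned by a directory listing."""
    path: str
    modified_ms: int  # last-modified time, epoch milliseconds


class TimestampListing:
    """Simplified model of a stateful lister (an assumption, not NiFi code)."""

    def __init__(self):
        # Stands in for the persisted latestTimestampListed state.
        self.latest_timestamp_listed = -1

    def list_new(self, files):
        # Only files strictly newer than the stored timestamp are listable.
        listable = [f for f in files
                    if f.modified_ms > self.latest_timestamp_listed]
        if listable:
            # Advance the stored timestamp so these files are skipped next run.
            self.latest_timestamp_listed = max(f.modified_ms for f in listable)
        return listable

    def reset(self):
        # Hypothetical: models clearing the state so files can be listed again.
        self.latest_timestamp_listed = -1


files = [
    FileStatus("/data/a", 100),
    FileStatus("/data/b", 200),
    FileStatus("/data/c", 300),
]
lister = TimestampListing()
first_run = lister.list_new(files)    # all 3 files are listable
second_run = lister.list_new(files)   # same 3 files found, 0 listable
```

Under this model the second run finds the same 3 files but lists none of them, which is consistent with the "Of the 3 files found in HDFS, 0 are listable" debug message. Files only become listable again once the stored timestamp is cleared.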
-- This message was sent by Atlassian JIRA (v7.6.3#76005)