Neil, I'm not aware of this problem for ListS3. I am not suggesting there are no issues, only that many users might not have noticed, or have come to accept, some variance in the accuracy of ListS3. If you can persuade ListS3 to do it again, that would be great :).
We did recently hear a report of similar behavior in the similarly implemented ListGCSBucket processor, which performs the same list operation for Google Cloud Storage. In my brief experience troubleshooting ListGCSBucket, the issue appeared to be that GCS reported different last-modified timestamps in different list API responses, despite what I believed to be a single write. I rationalized that as a product of eventual consistency when write and list operations took place within a few seconds of each other. That explanation would not make sense for a 10-week-old file.

One outcome of the ListGCSBucket episode was that placing a DetectDuplicate processor after the list processor to check for unique keys can be an effective workaround.

Thanks,
James

On Wed, Dec 6, 2017 at 11:24 AM, Neil Derraugh <neil.derra...@intellifylearning.com> wrote:

> I have a slowly changing S3 bucket. It has about 10 files in it.
>
> Prior to today the bucket's most recently modified file was modified
> on September 15, 2017 2:54:40 PM.
>
> One of the files just got updated today (December 6, 2017 4:58:22 PM) and
> ListS3 emitted it properly. But it also (re-)emitted that file last
> modified on September 15, 2017 2:54:40 PM. I checked the etags from
> September and today on the spurious file and they match. Confusing
> behavior.
>
> Anybody seen anything like this before, or know why it happened?
>
> Thanks,
> Neil
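
P.S. To make the duplicate-check workaround concrete, here is a minimal Python sketch of the idea. It is not NiFi code and not how any NiFi processor is implemented; the names `ListedObject` and `drop_duplicates` are hypothetical. It also assumes the identity of an object is its key plus its etag (my assumption, not something established in this thread), so that a genuinely updated file with a new etag still passes through while a re-listed, unchanged file is dropped.

```python
from typing import Iterable, Iterator, NamedTuple, Set, Tuple


class ListedObject(NamedTuple):
    """One entry from a bucket listing (hypothetical shape for this sketch)."""
    key: str
    etag: str
    last_modified: str


def drop_duplicates(
    listing: Iterable[ListedObject],
    seen: Set[Tuple[str, str]],
) -> Iterator[ListedObject]:
    """Yield only entries whose (key, etag) pair has not been seen before.

    `seen` is kept by the caller so it persists across listing runs.
    """
    for obj in listing:
        identity = (obj.key, obj.etag)
        if identity in seen:
            # Same key and same content hash: a spurious re-emission.
            continue
        seen.add(identity)
        yield obj
```

With a `seen` set shared across runs, a second listing that repeats an old file with an unchanged etag alongside a newly updated file would pass through only the updated file. In a real flow the set would need to live in some persistent or shared cache rather than in process memory.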