[ 
https://issues.apache.org/jira/browse/NIFI-12594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17805142#comment-17805142
 ] 

Peter Kimberley commented on NIFI-12594:
----------------------------------------

Objects are listed without regard to the age filter because of this line of 
code: 
[https://github.com/apache/nifi/blob/e07bb19233de81513a5218ed782c782b26591bb3/nifi-nar-bundles/nifi-aws-bundle/nifi-aws-processors/src/main/java/org/apache/nifi/processors/aws/s3/ListS3.java#L591]

The filtering logic fails to account for the configured minimum and maximum 
object age, unlike other processors that use 
{{ListedEntityTracker::trackEntities}}, such as ListFile.

As I found in my original analysis, ListS3::verify() is inconsistent with the 
behaviour of the processor when triggered. I've therefore extracted the object 
timestamp comparison logic into a common function, so that verify() and the 
triggered behaviour are consistent.
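
A minimal sketch of what such a shared comparison could look like, assuming ages are expressed in milliseconds and that an unset maximum age is represented by null. The class and method names are hypothetical and are not taken from ListS3; the point is only that verify() and the triggered listing would both call the same predicate.

```java
/** Hypothetical helper; names are illustrative and not taken from ListS3. */
final class ObjectAgeFilter {
    private ObjectAgeFilter() {}

    /**
     * Returns true when an object's age falls inside the configured window.
     *
     * @param lastModifiedMillis object's last-modified time (epoch millis)
     * @param nowMillis          time at which the listing is evaluated (epoch millis)
     * @param minAgeMillis       configured minimum object age
     * @param maxAgeMillis       configured maximum object age, or null for no upper bound (assumption)
     */
    static boolean isWithinAgeWindow(long lastModifiedMillis, long nowMillis,
                                     long minAgeMillis, Long maxAgeMillis) {
        long ageMillis = nowMillis - lastModifiedMillis;
        if (ageMillis < minAgeMillis) {
            return false; // too new: minimum age has not yet elapsed
        }
        // Old enough; accept unless a maximum age is set and has been exceeded.
        return maxAgeMillis == null || ageMillis <= maxAgeMillis;
    }
}
```

With a minimum age of 1 hour, an object modified 5 minutes ago would be rejected by this predicate on both code paths, which matches what Verify already reports in the issue description.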

I also found a related problem: {{Entity Tracking Time Window}} incorrectly 
depends on {{Entity Tracking Initial Listing Target}} being set to 
{{Tracking Time Window}}. As the entity tracking time window is used to prune 
the entity cache list (regardless of listing target), it should only be hidden 
if the {{Listing Strategy}} is set to {{Tracking Timestamps}}.
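
The two visibility rules can be contrasted in a small self-contained model. This is only a sketch: it does not use the NiFi property descriptor API, the enums are hypothetical, and only the values named in this comment are modelled (any other initial listing target is lumped into OTHER).

```java
/** Hypothetical model of the property visibility rules discussed above. */
final class TimeWindowVisibility {
    private TimeWindowVisibility() {}

    /** Only the two listing strategies named in this comment are modelled. */
    enum ListingStrategy { TRACKING_TIMESTAMPS, TRACKING_ENTITIES }

    /** OTHER stands in for any target that is not "Tracking Time Window" (assumption). */
    enum InitialListingTarget { TRACKING_TIME_WINDOW, OTHER }

    /** Rule as described for the current code: shown only for one initial listing target. */
    static boolean currentRule(InitialListingTarget target) {
        return target == InitialListingTarget.TRACKING_TIME_WINDOW;
    }

    /** Rule as proposed: hidden only when the strategy is Tracking Timestamps. */
    static boolean proposedRule(ListingStrategy strategy) {
        return strategy != ListingStrategy.TRACKING_TIMESTAMPS;
    }
}
```

Under the current rule the time window disappears for entity tracking with a different initial listing target, even though the window is still used to prune the entity cache; the proposed rule keeps it visible in that case.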

> ListS3 minimum object age filter not observed when entity state tracking is 
> used
> --------------------------------------------------------------------------------
>
>                 Key: NIFI-12594
>                 URL: https://issues.apache.org/jira/browse/NIFI-12594
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.24.0
>         Environment: Docker, on-prem S3
>            Reporter: Peter Kimberley
>            Priority: Major
>
> When ListS3 is configured to use the {{Tracking Entities}} listing strategy, 
> the following is observed:
>  # Configure ListS3 with a Minimum Object Age of {{1 hour}}. Ensure the 
> processor is stopped.
>  # Create a new FlowFile with GenerateFlowFile and run once
>  # Put the FlowFile to an S3 bucket with PutS3
>  # Open ListS3 configuration
>  # Click Verify. UI reports: ??Successfully listed contents of bucket <bucket 
> name>, finding 0 objects matching the filter.??
>  # Run ListS3 once. The FlowFile is retrieved, even though the 1 hour interval 
> has not yet elapsed.
> The issue is that the ListS3 {{Minimum Object Age}} property is not being 
> observed when using the Tracking Entities listing strategy. When using 
> {{Tracking Timestamps}}, the processor behaves as expected.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
