[ 
https://issues.apache.org/jira/browse/NIFI-12841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820514#comment-17820514
 ] 

Mark Payne commented on NIFI-12841:
-----------------------------------

Hey [~EndzeitBegins] thanks for reaching out about this.

In general, the naming convention used in NiFi for such a thing is DeleteXYZ, 
rather than RemoveXYZ. Several of these components exist. For example: 
DeleteMongo, DeleteHDFS, DeleteDynamoDB, DeleteSQS, DeleteS3Object, 
DeleteGCSObject.

It is important to note that these Processors should not share a base class. 
There is no significant code reuse that would be gained by sharing a base 
class, but doing so would constrain the extensibility of the Processors. For 
example, DeleteS3Object is likely to extend from an AbstractS3Processor, etc.

Typically, the Processor will have both a "success" and a "failure" 
relationship. As for a "does not exist" relationship, it depends on the 
Processor. Some Processors may provide such a relationship while others do not. 
It should be documented how each Processor behaves in such a condition - 
whether it's a specific relationship, or the FlowFile goes to failure (because 
it failed to delete the file), or the FlowFile goes to success (because the 
file no longer exists), etc. It would be good to ensure that we are consistent, 
but given that several Delete* Processors already exist, it may not make sense 
to start changing the behavior. It would take some investigation there.
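
To make the three possible outcomes concrete, here is a minimal sketch for the 
local-file case. It is plain java.nio code, not NiFi API: the class name, the 
Outcome enum and the deleteAndRoute method are all hypothetical names made up 
for illustration. In an actual Processor, each outcome would presumably map to 
a session.transfer(...) call with the corresponding Relationship.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class DeleteRoutingSketch {

    /** Hypothetical outcomes, mirroring the relationships discussed above. */
    enum Outcome { SUCCESS, NOT_FOUND, FAILURE }

    /** Attempts the delete and reports which relationship would apply. */
    static Outcome deleteAndRoute(Path target) {
        try {
            // Files.delete (unlike Files.deleteIfExists) throws when the
            // target is missing, so no separate existence check is needed.
            Files.delete(target);
            return Outcome.SUCCESS;
        } catch (NoSuchFileException e) {
            // Assumption: a dedicated "not found" route. A Processor could
            // just as well map this case to SUCCESS or FAILURE, as long as
            // the behavior is documented.
            return Outcome.NOT_FOUND;
        } catch (IOException e) {
            // Any other I/O problem, e.g. a non-empty directory.
            return Outcome.FAILURE;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("delete-sketch", ".txt");
        System.out.println(deleteAndRoute(file)); // SUCCESS: the file existed
        System.out.println(deleteAndRoute(file)); // NOT_FOUND: already deleted
    }
}
```

Note that the missing-file case is detected from the failed delete itself 
rather than from an up-front existence check, so it costs no extra request. 
Whether a remote service reports that case as cheaply is a separate question.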

It is also important to note that detecting whether or not a given file exists 
can itself carry a significant performance cost. For something like 
an SQS Processor it may be expensive to make the request for every single 
message to detect whether or not it exists.

> Introduce RemoveXYZ type of processors
> --------------------------------------
>
>                 Key: NIFI-12841
>                 URL: https://issues.apache.org/jira/browse/NIFI-12841
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: endzeit
>            Priority: Minor
>
> There is the notion of "families" or "types" of processors in the standard 
> distribution of NiFi. 
> Among others, these are {{ListXYZ}}, {{GetXYZ}}, {{FetchXYZ}}, {{UpdateXYZ}}, 
> and {{PutXYZ}}. 
> The following examples will be based on files on the local filesystem. 
> However, the same principle applies to other types of resources, e.g. files 
> on a SFTP server.
> The existing {{GetFile}} and {{FetchFile}} processors support the removal of 
> the resource from the source after successful transfer into the content of a 
> FlowFile. 
> However, in some scenarios it might be undesired to remove the resource until 
> it has been processed successfully and the transformation result has been stored, 
> e.g. to a secure network storage.
> This cannot be achieved with a {{GetXYZ}} or {{FetchXYZ}} processor on its 
> own. 
> As of now, one of the scripting processors or even a full-fledged custom 
> processor can be used to achieve this. 
> However, these might get relatively involved due to session handling or other 
> concerns.
> This issue proposes the introduction of an additional such processor "type", 
> namely {{RemoveXYZ}} which removes a resource.
> The base processor should have two properties, namely {{path}} and 
> {{filename}}, by default retrieving their values from the respective core 
> FlowFile attributes. Implementations may add protocol-specific properties, 
> e.g. for authentication. 
> There should be three outgoing relationships at least:
> - "success" for FlowFiles, where the resource was removed from the source,
> - "not exists" for FlowFiles, where the resource did not (or no longer) exist on the 
> source,
> - "failure" for FlowFiles, where the resource couldn't be removed from the 
> source, e.g. due to network errors or missing permissions.
> An initial implementation should provide {{RemoveXYZ}} for one of the 
> existing resource types, e.g. File, FTP, SFTP...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
