Hi everyone,

I would like to raise a scenario that recently came to my attention regarding the use of the PutAzureBlobStorage processor with the FileResourceService.

When the PutAzureBlobStorage processor is used with the FileResourceService, it uploads a file from the local filesystem to Azure but does not create a FlowFile for it. Instead, it uses the incoming FlowFile solely to send a provenance event. As a result, the size reported in the provenance event is the incoming FlowFile's size rather than the size of the uploaded file.

Two main options have been proposed to address this:

* Create a mock FlowFile: Generate a mock FlowFile whose size matches that of the local file being uploaded. This mock FlowFile would serve as the basis for the provenance event, even though it would not hold the uploaded content itself.
* Modify the ProvenanceReporter interface: Alternatively, introduce a new method in the ProvenanceReporter interface that does not require a FlowFile but instead accepts a "size" parameter. This would eliminate the need for a mock FlowFile.

The lack of a FlowFile operation in this situation is a distinct challenge because provenance events are typically tied to FlowFiles. Still, it is important to record the data transmission for monitoring and tracking.

While the "size" parameter approach seems preferable, we need to carefully consider its feasibility, potential complexities, and community acceptance. The FileResourceService already deviates from NiFi's concept of using FlowFiles to hold payload data, and we should avoid complicating the framework further unless absolutely necessary.

If you have any insights or suggestions, please reply to this email or join the discussion.

Best Regards,
Lehel
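
P.S. To make the two options easier to compare, here is a rough, self-contained sketch. It does not use the real NiFi classes: SketchFlowFile, SketchProvenanceReporter, RecordingReporter, the send(String, long) overload, and the example URI are all hypothetical names invented for illustration, and the actual ProvenanceReporter interface would need more care (for example, which FlowFile the event is attached to).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a FlowFile; only its size matters in this sketch.
final class SketchFlowFile {
    private final long size;
    SketchFlowFile(long size) { this.size = size; }
    long getSize() { return size; }
}

// One recorded SEND event: destination URI and the number of bytes reported.
final class SendEvent {
    private final String transitUri;
    private final long size;
    SendEvent(String transitUri, long size) {
        this.transitUri = transitUri;
        this.size = size;
    }
    String getTransitUri() { return transitUri; }
    long getSize() { return size; }
}

// Hypothetical, simplified reporter. This is NOT the real
// org.apache.nifi.provenance.ProvenanceReporter interface.
interface SketchProvenanceReporter {
    // Existing style: the reported size is taken from the FlowFile.
    void send(SketchFlowFile flowFile, String transitUri);

    // Option 2 (assumed signature): the size is passed explicitly.
    void send(String transitUri, long size);
}

// Keeps events in memory so the reported sizes can be inspected.
final class RecordingReporter implements SketchProvenanceReporter {
    private final List<SendEvent> events = new ArrayList<>();

    @Override
    public void send(SketchFlowFile flowFile, String transitUri) {
        events.add(new SendEvent(transitUri, flowFile.getSize()));
    }

    @Override
    public void send(String transitUri, long size) {
        events.add(new SendEvent(transitUri, size));
    }

    List<SendEvent> getEvents() { return events; }
}

class ProvenanceSizeSketch {
    public static void main(String[] args) {
        long incomingFlowFileSize = 0L;   // assumed: an empty trigger FlowFile
        long localFileSize = 5_000_000L;  // assumed size of the uploaded local file
        String uri = "https://example.invalid/container/blob"; // placeholder URI

        RecordingReporter reporter = new RecordingReporter();

        // Current behavior: the event carries the incoming FlowFile's size.
        reporter.send(new SketchFlowFile(incomingFlowFileSize), uri);

        // Option 1: a mock FlowFile whose size matches the local file.
        reporter.send(new SketchFlowFile(localFileSize), uri);

        // Option 2: no FlowFile at all, just the explicit size.
        reporter.send(uri, localFileSize);

        for (SendEvent event : reporter.getEvents()) {
            System.out.println(event.getTransitUri() + " size=" + event.getSize());
        }
    }
}
```

In this toy model both options end up reporting the same size; the difference is only whether a placeholder FlowFile has to exist to carry it, which is the trade-off described above.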