[ 
https://issues.apache.org/jira/browse/NIFI-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896179#comment-16896179
 ] 

Malthe Borch commented on NIFI-6496:
------------------------------------

Adding compression support to {{FlowFile}} itself seems ideal, but it should be 
exposed only through interfaces such as {{CompressContent}}.


In this model, {{CompressContent}} would have an optional streaming (or "lazy") 
mode such that the unpacked file contents would not have to be written to disk. 
Running the processor would simply set an internal flag that enables 
transparent decompression in a subsequent step. The {{fileSize}} should not 
need to be updated because the size has not, in effect, changed (this should 
be mostly of interest in the context of provenance).
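
To make the idea concrete, here is a minimal sketch of the lazy mode. It 
assumes gzip, and {{LazyContentSketch}} and all of its methods are hypothetical 
names standing in for the content of a {{FlowFile}}; none of this is existing 
NiFi API. The point is only that marking the content flips a flag, leaves the 
stored bytes and their size alone, and defers decompression to the next read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for FlowFile content; this is NOT the NiFi API.
final class LazyContentSketch {
    private final byte[] stored;             // bytes exactly as stored
    private final boolean decompressOnRead;  // the assumed "internal flag"

    LazyContentSketch(byte[] stored, boolean decompressOnRead) {
        this.stored = stored;
        this.decompressOnRead = decompressOnRead;
    }

    // What a lazy CompressContent might do: set the flag, copy nothing.
    LazyContentSketch markForLazyDecompression() {
        return new LazyContentSketch(stored, true);
    }

    // Size of the stored (still compressed) bytes; unchanged by the flag.
    long storedSize() {
        return stored.length;
    }

    // Later steps read through this and get decompressed bytes as a stream.
    InputStream read() {
        InputStream raw = new ByteArrayInputStream(stored);
        if (!decompressOnRead) {
            return raw;
        }
        try {
            return new GZIPInputStream(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demo: gzip a string into a byte array.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Helper for the demo: drain a stream chunk by chunk into a UTF-8 string.
    static String readAll(InputStream in) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try (InputStream stream = in) {
            int n;
            while ((n = stream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        LazyContentSketch packed = new LazyContentSketch(gzip("id,name\n1,a\n"), false);
        LazyContentSketch lazy = packed.markForLazyDecompression();
        System.out.println(lazy.storedSize() == packed.storedSize());
        System.out.print(readAll(lazy.read()));
    }
}
```

In a real implementation the flag would presumably live with the 
{{FlowFile}} and the wrapping would happen wherever the session hands out the 
content stream, so a record reader would never see the compressed bytes.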



If in some cases content is not streamed (but loaded entirely into memory), then 
I would think that is an issue that can be fixed separately?

> Add compression support to record reader processor
> --------------------------------------------------
>
>                 Key: NIFI-6496
>                 URL: https://issues.apache.org/jira/browse/NIFI-6496
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Malthe Borch
>            Priority: Minor
>              Labels: easyfix, usability
>
> Text-based record formats such as CSV, JSON and XML compress well and will 
> often be transmitted in a compressed format. If compression support is added 
> to the relevant processors, users will not need to explicitly unpack files 
> before processing (which may not be feasible or practical due to space 
> requirements).
> There are at least two ways of implementing this, using either a generic 
> approach where a {{CompressedRecordReaderFactory}} is the basis for a new 
> controller service that wraps the underlying record reader controller service 
> (e.g. {{CSVReader}}); or adding the functionality at the relevant record 
> reader implementations.
> The latter option may provide a better UX because no additional 
> {{ControllerService}} has to be configured.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
