[ 
https://issues.apache.org/jira/browse/NIFI-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896082#comment-16896082
 ] 

Edward Armes edited comment on NIFI-6496 at 7/30/19 4:00 PM:
-------------------------------------------------------------

This change doesn't sit right with me, mainly for 2 reasons:
 # NiFi processors in general currently follow the Unix philosophy (do one 
thing and do it well). This gives every processor 2 advantages: first, each 
processor is very efficient; second, it is clear to a user which processor 
needs to be used, and where, to achieve their goal.
 # -As I understand it, Java non-blocking IO is currently not used everywhere, 
and in some cases flow-file content is not streamed but loaded entirely into 
memory. I think that should be fixed first instead of bolting a specific bit 
of functionality onto a specific processor.- *Edit/Update: This is not quite 
true.* The content of a FlowFile is exposed as an InputStream. However, there 
are existing processors that do load the entire content of a FlowFile into 
memory (for valid reasons) and don't always work well with large amounts of 
content.
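
To illustrate the difference between loading content entirely into memory and 
streaming it, here is a minimal sketch using only the JDK. It is not NiFi 
code: the class and method names (StreamingSketch, countNewlinesBuffered, 
countNewlinesStreaming, gzip) are made up for this example, and gzip is 
assumed as the compression format. Both methods give the same answer, but the 
first holds the whole decompressed content on the heap while the second only 
ever holds one 8 KiB buffer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class StreamingSketch {

    // Decompresses everything into one byte array first, so heap use grows
    // with the size of the decompressed content. Requires Java 9+.
    static long countNewlinesBuffered(InputStream compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(compressed)) {
            byte[] all = in.readAllBytes();
            long count = 0;
            for (byte b : all) {
                if (b == '\n') {
                    count++;
                }
            }
            return count;
        }
    }

    // Decompresses through a fixed-size buffer, so heap use stays constant
    // no matter how large the content is.
    static long countNewlinesStreaming(InputStream compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(compressed)) {
            byte[] buffer = new byte[8192];
            long count = 0;
            int read;
            while ((read = in.read(buffer)) != -1) {
                for (int i = 0; i < read; i++) {
                    if (buffer[i] == '\n') {
                        count++;
                    }
                }
            }
            return count;
        }
    }

    // Helper for the demo: gzip a string into a byte array.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = gzip("a,b\n1,2\n3,4\n");
        System.out.println(countNewlinesBuffered(new ByteArrayInputStream(data)));
        System.out.println(countNewlinesStreaming(new ByteArrayInputStream(data)));
    }
}
```

Both calls print 3 for the sample input. A processor written in the second 
style can wrap the FlowFile's InputStream in a decompressing stream without 
needing space for the unpacked content.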

In general, the existing processor suite already allows the same 
functionality to be achieved in various ways, letting users choose the 
method that is most suitable for their data (e.g. altering sections of XML 
via the XML processors or the ReplaceText processor).

I think we should instead look at adding support for compression on the 
FlowFile store itself as an alternative, if storage and not memory is the 
constraint.


> Add compression support to record reader processor
> --------------------------------------------------
>
>                 Key: NIFI-6496
>                 URL: https://issues.apache.org/jira/browse/NIFI-6496
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Extensions
>            Reporter: Malthe Borch
>            Priority: Minor
>              Labels: easyfix, usability
>
> Text-based record formats such as CSV, JSON and XML compress well and will 
> often be transmitted in a compressed format. If compression support is added 
> to the relevant processors, users will not need to explicitly unpack files 
> before processing (which may not be feasible or practical due to space 
> requirements).
> There are at least two ways of implementing this, using either a generic 
> approach where a {{CompressedRecordReaderFactory}} is the basis for a new 
> controller service that wraps the underlying record reader controller service 
> (e.g. {{CSVReader}}); or adding the functionality at the relevant record 
> reader implementations.
> The latter option may provide a better UX because no additional 
> {{ControllerService}} has to be configured.
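
The generic wrapping approach described in the issue can be sketched roughly 
as below. This is only an illustration of the decorator idea, not NiFi's 
actual record reader API: LineReaderFactory, PlainFactory and GzipFactory are 
hypothetical names, a BufferedReader stands in for a record reader, and gzip 
is assumed as the compression format.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class CompressedReaderSketch {

    // Hypothetical stand-in for a record reader factory (not NiFi's interface).
    interface LineReaderFactory {
        BufferedReader createReader(InputStream in) throws IOException;
    }

    // Plays the role of an existing reader service such as a CSV reader.
    static class PlainFactory implements LineReaderFactory {
        @Override
        public BufferedReader createReader(InputStream in) {
            return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        }
    }

    // The wrapper: decompresses the stream, then delegates to the wrapped factory.
    static class GzipFactory implements LineReaderFactory {
        private final LineReaderFactory delegate;

        GzipFactory(LineReaderFactory delegate) {
            this.delegate = delegate;
        }

        @Override
        public BufferedReader createReader(InputStream in) throws IOException {
            return delegate.createReader(new GZIPInputStream(in));
        }
    }

    // Helper for the demo: gzip a string into a byte array.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    // Reads the first line of the content through the given factory.
    static String firstLine(LineReaderFactory factory, byte[] content) throws IOException {
        try (BufferedReader reader = factory.createReader(new ByteArrayInputStream(content))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        LineReaderFactory factory = new GzipFactory(new PlainFactory());
        System.out.println(firstLine(factory, gzip("id,name\n1,a\n")));
    }
}
```

The wrapped factory is unchanged and unaware of compression, which is what 
makes this approach generic; the cost, as the issue notes, is one more 
service for the user to configure.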



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
