[ 
https://issues.apache.org/jira/browse/AVRO-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16010427#comment-16010427
 ] 

Niels Basjes commented on AVRO-1862:
------------------------------------

If I create a tar archive and then compress it with gzip I get a name like 
{{example.tar.gz}}. 
If I gunzip that file I actually get a {{example.tar}} which is a tar archive.
Avro files are Avro files.
You cannot 'unavro' a {{example.gz.avro}} file and get a {{example.gz}} file.

The other way around is also wrong: using a name like {{example.avro.gz}} would 
lead to the expectation that it is a gzipped file and that if you gunzip it you 
get an {{example.avro}}.

Based on the explanation, I interpret the reason behind this change as a 
workaround for a scripting need to avoid certain situations.
An alternative solution for the described use case: run Camus from a script 
that, after the task finishes, simply renames the output file {{example.avro}} 
to {{example.camus.avro}}.
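
To illustrate the rename step, here is a minimal sketch. It assumes the output file is on a local filesystem; for output on HDFS the same idea would need {{hdfs dfs -mv}} or the Hadoop {{FileSystem}} API instead. The class and method names ({{RenameCamusOutput}}, {{markedName}}, {{renameWithMarker}}) are hypothetical and not part of Avro or Camus.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical post-processing helper; not part of Avro or Camus.
public class RenameCamusOutput {

    // Turns "example.avro" into "example.<marker>.avro".
    static String markedName(String fileName, String marker) {
        String suffix = ".avro";
        if (!fileName.endsWith(suffix)) {
            throw new IllegalArgumentException("Not an .avro file: " + fileName);
        }
        String base = fileName.substring(0, fileName.length() - suffix.length());
        return base + "." + marker + suffix;
    }

    // Renames the file in place (same directory) and returns the new path.
    // Assumption: the file is on a local filesystem reachable via java.nio.
    static Path renameWithMarker(Path file, String marker) throws IOException {
        String newName = markedName(file.getFileName().toString(), marker);
        return Files.move(file, file.resolveSibling(newName));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: RenameCamusOutput <file.avro>");
            return;
        }
        System.out.println(renameWithMarker(Paths.get(args[0]), "camus"));
    }
}
```

Run after the Camus task finishes, for example as {{java RenameCamusOutput example.avro}}. The file keeps {{.avro}} as its final extension, so it is still recognisable as an Avro file.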

I see this as a problem that does not belong in the Avro code base.
So based on what I see here, I think this should not be committed.



> AvroOutputFormat saves compressed avro files without respecting codec's 
> default extension
> -----------------------------------------------------------------------------------------
>
>                 Key: AVRO-1862
>                 URL: https://issues.apache.org/jira/browse/AVRO-1862
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.8.1
>            Reporter: Piotr Wikieł
>            Priority: Minor
>              Labels: patch
>             Fix For: 1.8.3
>
>         Attachments: AVRO-1862-1.patch, AVRO-1862.patch
>
>
> A common pattern in naming compressed files is to give them an extension 
> derived from the compression codec, for example: {{.gz}}, {{.zip}}, {{.bz2}}. 
> {{AvroOutputFormat}} currently does not respect this convention. 
> I've adapted some code from Hadoop's {{TextOutputFormat}} in a 
> backward-compatible manner, adding the following {{JobConf}} property:
> {{avro.mapred.output.extension.from-codec}} ({{boolean}}, default: {{false}}) 
> - when set to {{true}}, the extension will be changed according to the above rule.
> EDIT: Please take a look at the first comment for an update. The file 
> extension will be {{.gz.avro}} or {{.snappy.avro}} when the above property is 
> set to true.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
