[ 
https://issues.apache.org/jira/browse/SQOOP-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13196822#comment-13196822
 ] 

[email protected] commented on SQOOP-428:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3600/
-----------------------------------------------------------

(Updated 2012-01-31 09:50:52.475912)


Review request for Sqoop.


Changes
-------

Adds the option of providing a Codec class name as well.
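
To illustrate the change described above, here is a minimal sketch (not taken 
from the patch) of how an option value could be accepted either as a short 
codec name or as a fully qualified codec class name. The class 
CodecNameSketch, the method toClassName, and the short-name table are 
hypothetical names chosen for this example; only the Hadoop codec class names 
in the table are real.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; this is not the CodecMap code from the patch. */
final class CodecNameSketch {

  // Assumed short-name table. The keys are an assumption made for this
  // example; the values are existing Hadoop codec class names.
  private static final Map<String, String> SHORT_NAMES =
      new HashMap<String, String>();

  static {
    SHORT_NAMES.put("deflate", "org.apache.hadoop.io.compress.DefaultCodec");
    SHORT_NAMES.put("gzip", "org.apache.hadoop.io.compress.GzipCodec");
    SHORT_NAMES.put("bzip2", "org.apache.hadoop.io.compress.BZip2Codec");
  }

  private CodecNameSketch() { }

  /**
   * Resolves a codec argument to a class name. Known short names are looked
   * up case-insensitively; anything containing a dot is passed through
   * unchanged as a class name; everything else is rejected.
   */
  static String toClassName(String arg) {
    if (arg == null || arg.trim().isEmpty()) {
      throw new IllegalArgumentException("No codec given");
    }
    String value = arg.trim();
    String known = SHORT_NAMES.get(value.toLowerCase(Locale.ROOT));
    if (known != null) {
      return known;
    }
    if (value.indexOf('.') > 0) {
      // Treated as a fully qualified class name; not checked against the
      // classpath in this sketch.
      return value;
    }
    throw new IllegalArgumentException("Unknown codec: " + value);
  }
}
```

In this sketch a value such as "gzip" resolves through the table, while a 
value such as "com.example.MyCodec" (a made-up class name) is returned 
unchanged, leaving it to the caller to load the class and report an error if 
it does not exist.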


Summary
-------

This essentially ports the code from Avro's (1.5.4) AvroOutputFormat to the 
new MR API.

I've refactored the test to extract the common functionality into a helper 
method, because the test cases are identical apart from the two command line 
arguments.

I could have deleted AvroJob completely, but since I was told last time that 
binary compatibility needs to be maintained, I left it in. As far as I can 
tell, it's no longer needed, because all the necessary functionality is 
available from Avro's own version of that file. So if it's okay to delete that 
redundant file (two actually, one in the cloudera package and one in the 
apache package), let me know and I'll provide a new patch.


This addresses bug SQOOP-428.
    https://issues.apache.org/jira/browse/SQOOP-428


Diffs (updated)
-----

  src/java/com/cloudera/sqoop/io/CodecMap.java ffe949b 
  src/java/org/apache/sqoop/io/CodecMap.java 5b67206 
  src/java/org/apache/sqoop/mapreduce/AvroJob.java a57aaf1 
  src/java/org/apache/sqoop/mapreduce/AvroOutputFormat.java 96befd7 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java ed6954a 
  src/test/com/cloudera/sqoop/TestAvroImport.java 1b8b046 
  src/test/com/cloudera/sqoop/io/TestCodecMap.java f2f4039 

Diff: https://reviews.apache.org/r/3600/diff


Testing
-------

All tests pass with hadoopversion=20, but TestColumnTypes fails for me with 
hadoopversion=23. I can't see how that failure is related to this change, 
though.


Thanks,

Lars


                
> AvroOutputFormat doesn't support compression even though documentation claims 
> it does
> -------------------------------------------------------------------------------------
>
>                 Key: SQOOP-428
>                 URL: https://issues.apache.org/jira/browse/SQOOP-428
>             Project: Sqoop
>          Issue Type: Bug
>          Components: docs
>    Affects Versions: 1.4.0-incubating
>            Reporter: Lars Francke
>            Assignee: Lars Francke
>            Priority: Minor
>              Labels: avro, document
>             Fix For: 1.4.1-incubating
>
>         Attachments: SQOOP-428.1.patch, SQOOP-428.2.patch
>
>
> The documentation claims that Avro files can be compressed as well:
> {quote}
> By default, data is not compressed. You can compress your data by using the 
> deflate (gzip) algorithm with the -z or --compress argument, or specify any 
> Hadoop compression codec using the --compression-codec argument. This applies 
> to SequenceFile, text, and Avro files.
> {quote}
> This is not true as the AvroOutputFormat currently doesn't support 
> compression.


        
