[ 
https://issues.apache.org/jira/browse/PIG-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13498579#comment-13498579
 ] 

Cheolsoo Park commented on PIG-3015:
------------------------------------

Hi Joseph,

First of all, thank you so much!

Secondly, considering the size of the patch, would you mind uploading it to 
Review Board (RB)? That will encourage more people to review it.
https://reviews.apache.org/

You can choose pig-git as the repository to upload a diff file from the GitHub repository.

Thirdly, I haven't fully read the patch yet and will do so once it's uploaded to 
RB. In the meantime, I have a few minor comments:
- Can you please add the Apache license header to every new file?
- Can you please remove @author tags?
- Can you please replace {{System.err.println()}} with a Commons Logging logger ({{org.apache.commons.logging.Log}})?
- Our indentation convention is 4 spaces and no tabs. You used 2 spaces, and I 
see 2 tabs in {{directory_test.pig}}.
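
To illustrate the logging comment, here is a minimal sketch of swapping a {{System.err.println()}} call for a logger call. It is only an illustration, not code from the patch: the class {{LoggingSketch}} and the method {{parsePort}} are hypothetical, and it uses {{java.util.logging}} as a stand-in so that it runs on a plain JDK. With Commons Logging, the equivalent would presumably be a {{Log}} obtained from {{LogFactory.getLog(...)}} and a call such as {{log.warn(message, e)}}.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example class; not part of the PIG-3015 patch.
public class LoggingSketch {
    // One logger per class, named after the class.
    // Assumption: java.util.logging stands in for Commons Logging here.
    private static final Logger LOG =
            Logger.getLogger(LoggingSketch.class.getName());

    /** Parses a port number and logs, rather than prints, on bad input. */
    public static int parsePort(String value, int fallback) {
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            // Before: System.err.println("Bad port: " + value);
            // After: a leveled log record that also carries the exception.
            LOG.log(Level.WARNING,
                    "Bad port '" + value + "', using " + fallback, e);
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePort("8080", 80));
        System.out.println(parsePort("oops", 80));
    }
}
```

The point of the change is that log output can then be filtered by level and routed through whatever logging configuration the project uses, instead of always going to standard error.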

Lastly, your bash script should probably be replaced by a Python script (or 
another cross-platform script) because there is an ongoing effort to port 
Pig to Windows (PIG-2793). In particular, since TestAvroStorage is added to the 
unit test suite, the bash dependency will be an issue there. Please feel free to 
open a sub-task for converting it to Python if you'd like to get help.
                
> Rewrite of AvroStorage
> ----------------------
>
>                 Key: PIG-3015
>                 URL: https://issues.apache.org/jira/browse/PIG-3015
>             Project: Pig
>          Issue Type: Improvement
>          Components: piggybank
>            Reporter: Joseph Adler
>            Assignee: Joseph Adler
>         Attachments: PIG-3015.patch
>
>
> The current AvroStorage implementation has a lot of issues: it requires old 
> versions of Avro, it copies data much more than needed, and it's verbose and 
> complicated. (One pet peeve of mine is that old versions of Avro don't 
> support Snappy compression.)
> I rewrote AvroStorage from scratch to fix these issues. In early tests, the 
> new implementation is significantly faster, and the code is a lot simpler. 
> Rewriting AvroStorage also enabled me to implement support for Trevni.
> I'm opening this ticket to facilitate discussion while I figure out the best 
> way to contribute the changes back to Apache.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
