[jira] [Commented] (HIVE-2604) Add UberCompressor Serde/Codec to contrib which allows per-column compression strategies

Phabricator (Commented) (JIRA) Fri, 13 Jan 2012 11:01:06 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185756#comment-13185756
 ]


Phabricator commented on HIVE-2604:
-----------------------------------

heyongqiang has commented on the revision "HIVE-2604 [jira] Add UberCompressor 
Serde/Codec to contrib which allows per-column compression strategies".

INLINE COMMENTS
  contrib/src/test/queries/clientpositive/ubercompressor.q:4 Setting a bunch of 
compression configs here is fine for a single insert, but what about 
multi-insert queries?

  Can you put these configs on the table/partition object? That would make 
things easier to debug. (If you want to do this in a follow-up, please open a 
follow-up JIRA.)

  
contrib/src/java/org/apache/hadoop/hive/contrib/ubercompressor/dsalg/Tuple.java:1
 What does the package name "dsalg" stand for?
  
contrib/src/java/org/apache/hadoop/hive/contrib/ubercompressor/UberCompressorUtils.java:38
 Just curious: can WritableUtils be used here?
  
contrib/src/java/org/apache/hadoop/hive/contrib/ubercompressor/UberCompressionCodec.java:33
 How is this class used? Can it be defined as an interface? The DummyCompressor 
inside it does not do anything.
  
contrib/src/java/org/apache/hadoop/hive/contrib/ubercompressor/UberCompressionInputStream.java:70
 Can you add more comments here? If I understand correctly, it is reading and 
decompressing here, but the name used is readFromCompressor. Should it be 
readFromDecompressor()?

  There is also some byte transfer and copying involved here. Can that be 
avoided?
  
contrib/src/java/org/apache/hadoop/hive/contrib/ubercompressor/UberCompressionInputStream.java:101
 Why is the serde involved here? It is deserializing and then serializing 
again here...

REVISION DETAIL
  https://reviews.facebook.net/D1011

                
> Add UberCompressor Serde/Codec to contrib which allows per-column compression 
> strategies
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-2604
>                 URL: https://issues.apache.org/jira/browse/HIVE-2604
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Contrib
>            Reporter: Krishna Kumar
>            Assignee: Krishna Kumar
>         Attachments: HIVE-2604.D1011.1.patch, HIVE-2604.v0.patch, 
> HIVE-2604.v1.patch, HIVE-2604.v2.patch
>
>
> The strategies supported are
> 1. using a specified codec on the column
> 2. using a specific codec on the column which is serialized via a specific 
> serde
> 3. using a specific "TypeSpecificCompressor" instance
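
As a rough illustration of the three strategies listed above, the following 
Java sketch maps a column index to a compression strategy. It is not taken from 
the patch: the class name, the ColumnStrategy interface, and all helper methods 
are hypothetical, DEFLATE from java.util.zip stands in for a Hadoop codec, a 
text-to-binary integer conversion stands in for a serde, and delta encoding 
stands in for a "TypeSpecificCompressor".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.DeflaterOutputStream;

public class PerColumnCompressionSketch {

    /** Hypothetical interface: compresses the serialized bytes of one column. */
    interface ColumnStrategy {
        byte[] apply(byte[] columnBytes);
    }

    /** Strategy 1 stand-in: a general-purpose codec (DEFLATE) over the bytes as stored. */
    static byte[] deflate(byte[] input) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink)) {
            out.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Serde stand-in: re-serializes newline-separated decimal text as 4-byte ints. */
    static byte[] textIntsToBinary(byte[] text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            for (String line : new String(text, StandardCharsets.UTF_8).split("\n")) {
                out.writeInt(Integer.parseInt(line.trim()));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Type-specific stand-in: delta-encodes a run of 4-byte big-endian ints. */
    static byte[] deltaEncodeInts(byte[] binaryInts) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(binaryInts));
             DataOutputStream out = new DataOutputStream(sink)) {
            int previous = 0;
            for (int i = 0; i < binaryInts.length / 4; i++) {
                int value = in.readInt();
                out.writeInt(value - previous);
                previous = value;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Builds a column-index to strategy table mirroring the three listed cases. */
    static Map<Integer, ColumnStrategy> exampleStrategies() {
        Map<Integer, ColumnStrategy> byColumn = new HashMap<>();
        // 1. a codec applied directly to the column
        byColumn.put(0, PerColumnCompressionSketch::deflate);
        // 2. a codec applied after re-serializing the column via a "serde"
        byColumn.put(1, bytes -> deflate(textIntsToBinary(bytes)));
        // 3. a type-specific compressor (delta encoding, then the codec)
        byColumn.put(2, bytes -> deflate(deltaEncodeInts(bytes)));
        return byColumn;
    }

    public static void main(String[] args) {
        byte[] ids = "100\n101\n102\n103".getBytes(StandardCharsets.UTF_8);
        Map<Integer, ColumnStrategy> strategies = exampleStrategies();
        System.out.println("column 0: " + strategies.get(0).apply(ids).length + " bytes");
        System.out.println("column 1: " + strategies.get(1).apply(ids).length + " bytes");
        // Column 2 expects binary ints, so the text is converted first.
        byte[] binaryIds = textIntsToBinary(ids);
        System.out.println("column 2: " + strategies.get(2).apply(binaryIds).length + " bytes");
    }
}
```

Keeping the strategy behind one small interface means each column can be 
configured independently, which is the property the issue description asks for; 
how the actual patch wires this into the serde and codec is not shown here.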

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
