[ 
https://issues.apache.org/jira/browse/PIG-1426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12870899#action_12870899
 ] 

Alan Gates commented on PIG-1426:
---------------------------------

This looks cool.  Eventually we could extend it to the lengths of strings, 
bags, DataByteArrays, etc.  

One question.  Zebra and BinStorage use the code in DataReaderWriter to read 
this data off disk.  Can WritableUtils.readVInt also read a size that was 
written as a regular integer?  If not, it seems we're introducing an 
incompatibility with data already stored using these formats.
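
To make the compatibility concern concrete, here is a minimal sketch using only java.io. It assumes the old format writes the tuple size with DataOutput.writeInt (4 bytes, big-endian) and that a VInt reader treats a first byte in [-112, 127] as the complete value. The class and method names are hypothetical, and readSingleByteVInt is a simplified stand-in for illustration, not Hadoop's WritableUtils.readVInt itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class VIntCompat {
    // Old format (assumed): tuple size written as a fixed 4-byte big-endian int.
    static byte[] writeSizeAsInt(int size) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            new DataOutputStream(buf).writeInt(size);
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simplified stand-in for a VInt reader: handles only the one-byte case,
    // where a first byte in [-112, 127] is the value itself.
    static int readSingleByteVInt(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte first = in.readByte();
            if (first < -112) {
                throw new IllegalArgumentException("multi-byte VInt not handled in this sketch");
            }
            return first;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] old = writeSizeAsInt(3);         // bytes: 00 00 00 03
        int decoded = readSingleByteVInt(old);  // consumes only the leading 00
        System.out.println("bytes written: " + old.length + ", size read back: " + decoded);
    }
}
```

Under these assumptions, a size of 3 written as a regular int reads back as 0, and the remaining three bytes are left in the stream to be misread as field data. Files written before the change would therefore need a format marker or a conversion step.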

> Change the size of Tuple from Int to VInt when Serialize Tuple
> --------------------------------------------------------------
>
>                 Key: PIG-1426
>                 URL: https://issues.apache.org/jira/browse/PIG-1426
>             Project: Pig
>          Issue Type: Improvement
>          Components: data
>    Affects Versions: 0.8.0
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>             Fix For: 0.8.0
>
>         Attachments: PIG_1426.patch
>
>
> Most of the time, the size of a tuple is not very large, so one byte is 
> enough to store it. I suggest using VInt instead of Int for the size of the 
> tuple during serialization. Because the key type of the map output is Tuple, 
> this can reduce the amount of data transferred from mapper to reducer. 
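
The size argument in the description can be sketched as follows. The encoder below follows my understanding of the variable-length scheme used by Hadoop's WritableUtils.writeVLong; treat it as an assumption rather than a copy of that code. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

class TupleSizeEncoding {
    // Variable-length encoding: values in [-112, 127] take one byte; larger
    // values take a marker byte followed by 1 to 8 big-endian payload bytes.
    static byte[] encodeVInt(long i) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (i >= -112 && i <= 127) {
            out.write((int) i);          // the value is its own single byte
            return out.toByteArray();
        }
        int len = -112;
        if (i < 0) {
            i ^= -1L;                    // one's complement for negative values
            len = -120;
        }
        for (long tmp = i; tmp != 0; tmp >>= 8) {
            len--;                       // count the payload bytes needed
        }
        out.write(len);                  // marker byte encodes sign and length
        len = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = len; idx != 0; idx--) {
            int shift = (idx - 1) * 8;
            out.write((int) ((i >> shift) & 0xFF));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        long[] sizes = {5, 200, 70000};
        for (long size : sizes) {
            System.out.println("tuple size " + size + ": fixed int = 4 bytes, VInt = "
                    + encodeVInt(size).length + " bytes");
        }
    }
}
```

For tuples with at most 127 fields this saves three bytes per serialized tuple. The encoding only reaches the fixed-width cost once the size needs three payload bytes, which is far larger than a typical tuple.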

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
