[ 
https://issues.apache.org/jira/browse/PIG-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Raghu Angadi updated PIG-2187:
------------------------------

    Attachment: PIG-2187.patch

The patch adds a utility method in StorageUtils to convert a Tuple to Text. Now 
PigTextOutputFormat is an alias for TextOutputFormat.

This looks clean; the only drawback is a couple of extra copies to make the Text 
from the Tuple. These copies didn't exist before. But PigTextInputFormat makes 
the same compromise, even though it handles an order of magnitude more data.
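
To illustrate where the extra copies come from, here is a minimal sketch in 
plain Java. It is not the code in the patch: the class and method names are 
hypothetical, a List<Object> stands in for a Pig Tuple, a byte[] stands in for 
Hadoop's Text, and writing null fields as empty strings is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Tuple-to-Text utility; not the actual StorageUtils code.
public class TupleToTextSketch {

    // Serializes the fields into one delimited UTF-8 line.
    // Assumes a single-byte (ASCII) delimiter such as tab.
    static byte[] tupleToLine(List<Object> fields, char delimiter) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                buf.write(delimiter);
            }
            Object field = fields.get(i);
            if (field != null) { // assumption: null becomes an empty field
                // Extra copy 1: field bytes go into an intermediate buffer
                // instead of straight to the output stream.
                byte[] bytes = field.toString().getBytes(StandardCharsets.UTF_8);
                buf.write(bytes, 0, bytes.length);
            }
        }
        // Extra copy 2: the buffer is copied out into the value object
        // that a generic text output format would then write.
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        List<Object> tuple = Arrays.<Object>asList("a", 1, null, "c");
        byte[] line = tupleToLine(tuple, '\t');
        System.out.println(new String(line, StandardCharsets.UTF_8));
    }
}
```

Once the line exists as a self-contained value, any output format that accepts 
a line of text can write it, which is the point of the change.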

> PigStorage should handle converting Tuple to text
> -------------------------------------------------
>
>                 Key: PIG-2187
>                 URL: https://issues.apache.org/jira/browse/PIG-2187
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>         Attachments: PIG-2187.patch
>
>
> Currently it is simple for users to use a different TextInputFormat with 
> PigStorage, since the PigStorage loader just expects one line at a time and 
> takes care of parsing the text into a tuple.
> This is not the case on the storage side. PigTextOutputFormat handles the 
> conversion to Text (it actually writes the UTF-8 bytes of each field from the 
> tuple directly to the output). This implies that a different TextOutputFormat 
> cannot be used with PigStorage.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
