[
https://issues.apache.org/jira/browse/PIG-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy V. Ryaboy updated PIG-760:
----------------------------------
Fix Version/s: (was: 0.6.0)
0.7.0
Status: Patch Available (was: Open)
The updated patch moves PigStorageSchema to the piggybank (I feel it needs
proper handling of complex structures before it can be considered a
builtin). It also updates the implementations of the various interfaces from
the Load/Store redesign to match the latest spec.
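
For illustration only, here is a rough sketch of how such a loader/storer could
hook schema persistence into the redesigned interfaces. It assumes the
LoadMetadata/StoreMetadata signatures from the 0.7 spec; the class name and the
placeholder method bodies are hypothetical and are not the attached patch.

import java.io.IOException;
import org.apache.hadoop.mapreduce.Job;
import org.apache.pig.Expression;
import org.apache.pig.LoadMetadata;
import org.apache.pig.ResourceSchema;
import org.apache.pig.ResourceStatistics;
import org.apache.pig.StoreMetadata;
import org.apache.pig.builtin.PigStorage;

// Illustrative skeleton: tuple I/O is inherited from PigStorage, while the
// metadata interfaces carry the schema to and from a sidecar file.
public class SchemaAwarePigStorage extends PigStorage
    implements LoadMetadata, StoreMetadata {

  public ResourceSchema getSchema(String location, Job job) throws IOException {
    // Read the ".schema" sidecar stored next to the part files and turn it
    // into a ResourceSchema (see the sidecar helper sketched further below).
    return null; // placeholder
  }

  public void storeSchema(ResourceSchema schema, String location, Job job)
      throws IOException {
    // Serialize the schema into <location>/.schema on the output file system.
  }

  // Statistics and partition support are out of scope for this sketch.
  public ResourceStatistics getStatistics(String location, Job job) {
    return null;
  }

  public String[] getPartitionKeys(String location, Job job) {
    return null;
  }

  public void setPartitionFilter(Expression partitionFilter) {
  }

  public void storeStatistics(ResourceStatistics stats, String location, Job job) {
  }
}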
> Serialize schemas for PigStorage() and other storage types.
> -----------------------------------------------------------
>
> Key: PIG-760
> URL: https://issues.apache.org/jira/browse/PIG-760
> Project: Pig
> Issue Type: New Feature
> Reporter: David Ciemiewicz
> Assignee: Dmitriy V. Ryaboy
> Fix For: 0.7.0
>
> Attachments: pigstorageschema-2.patch, pigstorageschema.patch,
> pigstorageschema_3.patch
>
>
> I'm finding PigStorage() really convenient for storage and data interchange
> because it compresses well and imports cleanly into Excel and other
> analysis environments.
> However, it is a pain when it comes to maintenance because the columns are in
> fixed locations and I'd like to add columns in some cases.
> It would be great if PigStorage() could read a default schema from a .schema
> file stored with the data on load, and write a .schema file alongside the
> data on store.
> I have tested this out and both Hadoop HDFS and Pig in -exectype local mode
> will ignore a file called .schema in a directory of part files.
> So, for example, if I have a chain of Pig scripts I execute such as:
> A = load 'data-1' using PigStorage() as ( a: int , b: int );
> store A into 'data-2' using PigStorage();
> B = load 'data-2' using PigStorage();
> describe B;
> describe B should output something like { a: int, b: int }
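
Not part of the patch, just for the record: a minimal sketch of the sidecar
mechanism the description proposes, written against Hadoop's FileSystem API.
The ".schema" file name and the plain-text schema form ("a: int, b: int") come
from the proposal above; the SchemaSidecar class and its methods are
hypothetical.

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sidecar helper: keeps a one-line schema string in <dir>/.schema.
// MapReduce treats files whose names start with "." or "_" as hidden, so the
// sidecar is skipped when the directory of part files is loaded.
public class SchemaSidecar {

  private static final String SCHEMA_FILE = ".schema";

  public static void write(String dir, String schema, Configuration conf)
      throws IOException {
    FileSystem fs = FileSystem.get(URI.create(dir), conf);
    Writer out = new OutputStreamWriter(
        fs.create(new Path(dir, SCHEMA_FILE), true), "UTF-8");
    try {
      out.write(schema); // e.g. "a: int, b: int"
    } finally {
      out.close();
    }
  }

  public static String read(String dir, Configuration conf) throws IOException {
    FileSystem fs = FileSystem.get(URI.create(dir), conf);
    Path schemaPath = new Path(dir, SCHEMA_FILE);
    if (!fs.exists(schemaPath)) {
      return null; // no sidecar: caller falls back to untyped columns
    }
    BufferedReader in = new BufferedReader(
        new InputStreamReader(fs.open(schemaPath), "UTF-8"));
    try {
      return in.readLine(); // one-line schema string
    } finally {
      in.close();
    }
  }
}

With a sidecar like this in place, the second load in the script above could
recover { a: int, b: int } instead of untyped bytearrays.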