[ 
https://issues.apache.org/jira/browse/PIG-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Costin Leau updated PIG-3646:
-----------------------------

    Description: 
Described on the mailing list here: 
http://www.mail-archive.com/user%40pig.apache.org/msg09009.html

A Pig {{LoadFunc}} cannot get a hold of its associated schema. For example, in 
the following script:
{code}
A = LOAD 'pig/tupleartists' USING MyStorage() AS (name: chararray, links 
(url:chararray, picture:chararray));
B = FOREACH A GENERATE name, links.url;
DUMP B;
{code}

{{MyStorage}} cannot obtain the declared schema {{(name:chararray, links ...}}, 
even when it implements {{LoadPushDown#pushProjection()}}, which is called only 
when a transformation occurs (through PlanOptimizer/ColumnMapKeyPrune).

One can look into a {{POStore}}, but even then the information obtained is 
incomplete: the schema is partial, and the fields mentioned in {{FOREACH}} are 
dereferenced, so {{links.url}} is returned as {{url}}.
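
To make the information loss concrete, here is a toy Java model of the two 
views. It is only a sketch and does not use Pig's API: {{SchemaLossSketch}}, 
{{Field}} and their methods are hypothetical names invented for this 
illustration. It contrasts the full paths declared in the {{AS}} clause above 
with a dereferenced projection that keeps only the last path segment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: these types are hypothetical and are NOT Pig classes.
final class SchemaLossSketch {

    /** A named, typed field; tuples carry nested child fields. */
    static final class Field {
        final String name;
        final String type;
        final List<Field> children;

        Field(String name, String type, Field... children) {
            this.name = name;
            this.type = type;
            this.children = Collections.unmodifiableList(Arrays.asList(children));
        }
    }

    /** The schema from the AS clause of the script in this issue. */
    static List<Field> declaredSchema() {
        return Arrays.asList(
            new Field("name", "chararray"),
            new Field("links", "tuple",
                new Field("url", "chararray"),
                new Field("picture", "chararray")));
    }

    /** Full dotted paths of all leaf fields: what the loader would like to see. */
    static List<String> declaredPaths(List<Field> schema) {
        List<String> out = new ArrayList<>();
        for (Field f : schema) {
            collect("", f, out);
        }
        return out;
    }

    private static void collect(String prefix, Field f, List<String> out) {
        String path = prefix.isEmpty() ? f.name : prefix + "." + f.name;
        if (f.children.isEmpty()) {
            out.add(path + ":" + f.type);
            return;
        }
        for (Field child : f.children) {
            collect(path, child, out);
        }
    }

    /** Dereferenced view: only the last segment of each projected path survives. */
    static List<String> dereferenced(List<String> projection) {
        List<String> out = new ArrayList<>();
        for (String p : projection) {
            out.add(p.substring(p.lastIndexOf('.') + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Declared: name:chararray, links.url:chararray, links.picture:chararray
        System.out.println(declaredPaths(declaredSchema()));
        // Projection from FOREACH A GENERATE name, links.url
        System.out.println(dereferenced(Arrays.asList("name", "links.url")));
    }
}
```

In this model the dereferenced view yields {{name}} and {{url}}, so neither the 
parent {{links}} nor the unused {{picture}} field can be recovered from it, 
which is the gap this issue asks to close.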

The purpose of this issue is to allow a {{LoadFunc}} implementation to get 
access to its schema declaration as specified in the script.

Thanks!

  was:
Described on the mailing list here: 
http://www.mail-archive.com/user%40pig.apache.org/msg09009.html

A Pig LoadFunc cannot get a hold of its associated schema. For example, in the 
following script:



> LoadFunc cannot get a hold of the associated user defined schema
> ----------------------------------------------------------------
>
>                 Key: PIG-3646
>                 URL: https://issues.apache.org/jira/browse/PIG-3646
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>    Affects Versions: 0.12.0
>            Reporter: Costin Leau
>
> Described on the mailing list here: 
> http://www.mail-archive.com/user%40pig.apache.org/msg09009.html
> A Pig {{LoadFunc}} cannot get a hold of its associated schema. For example, 
> in the following script:
> {code}
> A = LOAD 'pig/tupleartists' USING MyStorage() AS (name: chararray, links 
> (url:chararray, picture:chararray));
> B = FOREACH A GENERATE name, links.url;
> DUMP B;
> {code}
> {{MyStorage}} cannot obtain the declared schema {{(name:chararray, links ...}}, 
> even when it implements {{LoadPushDown#pushProjection()}}, which is called only 
> when a transformation occurs (through PlanOptimizer/ColumnMapKeyPrune).
> One can look into a {{POStore}}, but even then the information obtained is 
> incomplete: the schema is partial, and the fields mentioned in {{FOREACH}} are 
> dereferenced, so {{links.url}} is returned as {{url}}.
> The purpose of this issue is to allow a {{LoadFunc}} implementation to get 
> access to its schema declaration as specified in the script.
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
