[ 
https://issues.apache.org/jira/browse/PIG-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Richard Ding updated PIG-2069:
------------------------------

    Attachment: PIG-2069.patch

This happens when the original MapReduce DAG (before optimization) contains a 
diamond node.

Users can work around this by explicitly registering the LoadFunc jar in the 
script.
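
A minimal sketch of that workaround, applied to the script from the issue 
description. The jar path is a hypothetical placeholder; substitute the jar 
that actually contains SomeLoadFunc.
{code}
-- Hypothetical path: point this at the jar that contains SomeLoadFunc.
REGISTER /path/to/someloadfunc.jar;

A = load '1.txt' using SomeLoadFunc();
B = filter A by $0==0;
C = filter A by $1==1;
D = join B by $0, C by $0;
dump D;
{code}
With the explicit REGISTER, the jar should ship to the backend regardless of 
whether automatic detection picks it up.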

The attached patch provides a fix. It has been verified with a manual test.

> LoadFunc jar does not ship to backend in MultiQuery case
> --------------------------------------------------------
>
>                 Key: PIG-2069
>                 URL: https://issues.apache.org/jira/browse/PIG-2069
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1, 0.9.0
>            Reporter: Daniel Dai
>            Assignee: Richard Ding
>             Fix For: 0.9.0
>
>         Attachments: PIG-2069.patch
>
>
> Pig can automatically determine which jar contains the LoadFunc and ship it 
> to the backend. However, it fails to do so for the following script:
> {code}
> A = load '1.txt' using SomeLoadFunc();
> B = filter A by $0==0;
> C = filter A by $1==1;
> D = join B by $0, C by $0;
> dump D;
> {code}
> The reason is that this query is a multiquery: A is reused, which creates an 
> implicit split. When we merge the multiquery into one job, we do not merge 
> the udfs list properly.

