[ https://issues.apache.org/jira/browse/PIG-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Richard Ding updated PIG-2069:
------------------------------

    Attachment: PIG-2069.patch

This happens when the original MapReduce DAG (before optimization) contains a diamond node. Users can work around this by explicitly registering the LoadFunc jar in the script.

The attached patch provides a fix. It has been verified with a manual test.

> LoadFunc jar does not ship to backend in MultiQuery case
> --------------------------------------------------------
>
>                 Key: PIG-2069
>                 URL: https://issues.apache.org/jira/browse/PIG-2069
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.1, 0.9.0
>            Reporter: Daniel Dai
>            Assignee: Richard Ding
>             Fix For: 0.9.0
>
>         Attachments: PIG-2069.patch
>
>
> Pig is able to automatically figure out which jar contains the LoadFunc and
> ship it to the backend. However, it did not do so for the following script:
> {code}
> A = load '1.txt' using SomeLoadFunc();
> B = filter A by $0==0;
> C = filter A by $1==1;
> D = join B by $0, C by $0;
> dump D;
> {code}
> The reason is that this query is a multiquery (A is reused, which creates an
> implicit split). When we merged the multiquery into one job, we didn't merge
> the UDF lists properly.
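
For reference, a minimal sketch of the workaround mentioned in the comment (registering the LoadFunc jar explicitly). The jar name someloadfunc.jar is a hypothetical placeholder, since the issue does not name the jar; substitute the path of the jar that actually contains SomeLoadFunc.

{code}
-- Hypothetical jar name: replace with the jar that contains SomeLoadFunc.
-- Registering it explicitly ships it to the backend without relying on
-- Pig's automatic jar detection.
register someloadfunc.jar;

A = load '1.txt' using SomeLoadFunc();
B = filter A by $0==0;
C = filter A by $1==1;
D = join B by $0, C by $0;
dump D;
{code}

The rest of the script is unchanged from the one in the issue description; only the register statement is added.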