[ 
https://issues.apache.org/jira/browse/PIG-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772095#action_12772095
 ] 

Pradeep Kamath commented on PIG-1035:
-------------------------------------

The unit test does not seem to check the results of the outer join. It would be 
good to add a check of the actual results. 
In fact, there are already outer join tests in TestJoin.java. You can just 
update those to also test skewed join, since those tests
already check output correctness.
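
As an illustration of what checking the actual results could look like, here is a minimal sketch. It is not code from TestJoin.java: the class JoinResultCheck and its methods are hypothetical, and rows are modeled as plain lists of strings rather than Pig tuples. It builds the expected left outer join in memory and compares it with the actual rows without depending on row order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for illustration; not part of Pig or TestJoin.java. */
class JoinResultCheck {

    /**
     * Reference left outer join on the first field of each row.
     * Unmatched left rows are padded with rightWidth nulls.
     */
    static List<List<String>> leftOuterJoin(List<List<String>> left,
                                            List<List<String>> right,
                                            int rightWidth) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> l : left) {
            boolean matched = false;
            for (List<String> r : right) {
                // A null key never matches, as in SQL-style join semantics.
                if (l.get(0) != null && l.get(0).equals(r.get(0))) {
                    List<String> row = new ArrayList<>(l);
                    row.addAll(r);
                    out.add(row);
                    matched = true;
                }
            }
            if (!matched) {
                List<String> row = new ArrayList<>(l);
                row.addAll(Collections.nCopies(rightWidth, (String) null));
                out.add(row);
            }
        }
        return out;
    }

    /** Compares two result sets as multisets, ignoring row order. */
    static boolean sameRows(List<List<String>> expected, List<List<String>> actual) {
        List<String> e = new ArrayList<>();
        for (List<String> row : expected) {
            e.add(row.toString());
        }
        List<String> a = new ArrayList<>();
        for (List<String> row : actual) {
            a.add(row.toString());
        }
        Collections.sort(e);
        Collections.sort(a);
        return e.equals(a);
    }
}
```

A test could run the same query with and without skewed join and pass both outputs through sameRows against the reference rows.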

In LogToPhyTranslationVisitor.java, in the following code, the return value of 
op.getSchema() should be checked for null. If it is null,
the same exception should be thrown:
{code}
try {
    skj.addSchema(op.getSchema());
} catch (FrontendException e) {
    int errCode = 2015;
    String msg = "Couldn't set the schema for outer join";
    throw new LogicalToPhysicalTranslatorException(msg, errCode, PigException.BUG, e);
}
{code}
With the above code, a schema is required for both inputs to the join. Strictly, 
for left and right outer joins, only the
schema of the side where nulls need to be projected is needed. Only a full 
outer join needs schemas on both inputs. If possible,
left and right outer joins should require a schema 
only on the relevant input. For reference, left and right outer
joins in the regular join already do this.
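
A rough sketch of how the two suggestions could combine, under stated assumptions: the types below (OuterJoinSchemaCheck, JoinKind, TranslationException) are hypothetical stand-ins rather than Pig's real classes, and schemas are modeled as plain objects. Only the error message and error code 2015 come from the code quoted above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-ins for illustration; not Pig's actual classes. */
class OuterJoinSchemaCheck {

    enum JoinKind { LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    /** Stand-in for LogicalToPhysicalTranslatorException. */
    static class TranslationException extends Exception {
        final int errCode;

        TranslationException(String msg, int errCode) {
            super(msg);
            this.errCode = errCode;
        }
    }

    /**
     * Returns the input indexes (0 = left, 1 = right) that need a schema.
     * Nulls are projected for the side that may have no match, so a left
     * outer join needs the right schema and a right outer join the left one.
     */
    static List<Integer> inputsNeedingSchema(JoinKind kind) {
        switch (kind) {
            case LEFT_OUTER:
                return Arrays.asList(1);
            case RIGHT_OUTER:
                return Arrays.asList(0);
            default:
                return Arrays.asList(0, 1);
        }
    }

    /** Throws the same error for a null schema, but only on the inputs that need one. */
    static void checkSchemas(JoinKind kind, Object leftSchema, Object rightSchema)
            throws TranslationException {
        Object[] schemas = { leftSchema, rightSchema };
        for (int idx : inputsNeedingSchema(kind)) {
            if (schemas[idx] == null) {
                throw new TranslationException("Couldn't set the schema for outer join", 2015);
            }
        }
    }
}
```

With this shape, a left outer join whose left input has no schema passes the check, while a full outer join with either schema missing fails with error code 2015.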

> support for skewed outer join
> -----------------------------
>
>                 Key: PIG-1035
>                 URL: https://issues.apache.org/jira/browse/PIG-1035
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Olga Natkovich
>            Assignee: Sriranjan Manjunath
>         Attachments: 1035.patch
>
>
> Similarly to skewed inner join, skewed outer join will help to scale in the 
> presence of join keys that don't fit into memory

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
