[ https://issues.apache.org/jira/browse/PIG-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772095#action_12772095 ]
Pradeep Kamath commented on PIG-1035:
-------------------------------------

The unit test does not check the results of the outer join; it would be good to add a check of the actual results. In fact, there are already outer join tests in TestJoin.java. You can update those to also test skewed join, since they already check output correctness.

In LogToPhyTranslationVisitor.java, in the following code, the return value of op.getSchema() should be checked for null, in which case the same exception should be thrown:

{code}
try {
    skj.addSchema(op.getSchema());
} catch (FrontendException e) {
    int errCode = 2015;
    String msg = "Couldn't set the schema for outer join";
    throw new LogicalToPhysicalTranslatorException(msg, errCode, PigException.BUG, e);
}
{code}

With the above code, a schema is required for both inputs to the join. Strictly, for left and right outer joins, only the schema of the side where nulls need to be projected is needed; only a full outer join needs schemas on both inputs. If possible, left and right outer joins should require a schema only on the relevant input. For reference, left and right outer joins in regular join already do this.

> support for skewed outer join
> -----------------------------
>
>                 Key: PIG-1035
>                 URL: https://issues.apache.org/jira/browse/PIG-1035
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Olga Natkovich
>            Assignee: Sriranjan Manjunath
>         Attachments: 1035.patch
>
>
> Similarly to skewed inner join, skewed outer join will help to scale in the presence of join keys that don't fit into memory
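
The two review points about schemas (treat a null schema as an error, and require a schema only on the side that gets padded with nulls) could be sketched roughly as follows. This is a standalone illustration, not code from the patch or from Pig: the class OuterJoinSchemaCheck, the JoinType enum, the TranslationException stand-in, and the requireSchemas method are all hypothetical names, and schemas are modelled as plain Objects. Only the error code 2015 and the message text are taken from the snippet quoted in the comment.

```java
// Hypothetical, self-contained sketch; none of these types exist in Pig.
final class OuterJoinSchemaCheck {

    /** The three outer join flavours discussed in the comment. */
    enum JoinType { LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    /** Stand-in for LogicalToPhysicalTranslatorException (assumption). */
    static final class TranslationException extends RuntimeException {
        final int errCode;

        TranslationException(String msg, int errCode) {
            super(msg);
            this.errCode = errCode;
        }
    }

    /**
     * Throws if a schema that the join actually needs is null.
     * A left outer join pads unmatched rows with nulls for the right
     * input, so only the right schema is needed; a right outer join is
     * the mirror image; a full outer join needs both.
     */
    static void requireSchemas(JoinType type, Object leftSchema, Object rightSchema) {
        boolean needLeft = type != JoinType.LEFT_OUTER;
        boolean needRight = type != JoinType.RIGHT_OUTER;
        if ((needLeft && leftSchema == null) || (needRight && rightSchema == null)) {
            // Same error code and message as the quoted catch block.
            throw new TranslationException("Couldn't set the schema for outer join", 2015);
        }
    }

    public static void main(String[] args) {
        // Left outer join with no schema on the left input: accepted.
        requireSchemas(JoinType.LEFT_OUTER, null, "rightSchema");
        System.out.println("left outer join without a left schema: accepted");

        // Full outer join with a missing schema: rejected.
        try {
            requireSchemas(JoinType.FULL_OUTER, "leftSchema", null);
        } catch (TranslationException e) {
            System.out.println("full outer join rejected, error code " + e.errCode);
        }
    }
}
```

In the real visitor the check would presumably sit next to the skj.addSchema(...) call and throw LogicalToPhysicalTranslatorException instead; how the join type is exposed there is not shown in the comment, so that part is left as an assumption.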