William Watson created PIG-4458:
-----------------------------------

             Summary: Support UDFs in a FOREACH Before a Merge Join
                 Key: PIG-4458
                 URL: https://issues.apache.org/jira/browse/PIG-4458
             Project: Pig
          Issue Type: New Feature
            Reporter: William Watson


Right now, MapSideMergeValidator rejects any FOREACH that contains a UDF,
regardless of which fields the UDF operates on:

{code}
private boolean isAcceptableForEachOp(Operator lo) throws LogicalToPhysicalTranslatorException {
    if (lo instanceof LOForEach) {
        OperatorPlan innerPlan = ((LOForEach) lo).getInnerPlan();
        validateMapSideMerge(innerPlan.getSinks(), innerPlan);
        return !containsUDFs((LOForEach) lo);
    } else {
        return false;
    }
}
{code}


There is a TODO for this later on in that same class (inside containsUDFs):

{code}
// TODO (dvryaboy): in the future we could relax this rule by tracing what fields
// are being passed into the UDF, and only refusing if the UDF is working on the
// join key. Transforms of other fields should be ok.
{code}

We should implement the TODO and relax this restriction so that a FOREACH is
rejected only when a UDF operates on the join key, or remove the restriction
altogether.
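
To make the relaxed rule concrete, here is a rough sketch of the check the TODO describes. It is a simplified stand-alone model, not Pig code: GeneratedExpr, udfTouchesJoinKey and isAcceptableForEach are hypothetical names, and it assumes each generated expression can report which input columns feed into it. A real change would probably also need to confirm that the join key itself passes through the FOREACH unchanged, which this sketch does not attempt.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of the relaxed rule; none of these types are Pig classes.
public class MergeJoinUdfCheck {

    // One expression generated by a FOREACH: whether it calls a UDF,
    // and which input column positions feed into it (assumed to be known).
    static final class GeneratedExpr {
        final boolean callsUdf;
        final Set<Integer> inputColumns;

        GeneratedExpr(boolean callsUdf, Integer... inputColumns) {
            this.callsUdf = callsUdf;
            this.inputColumns = new HashSet<>(Arrays.asList(inputColumns));
        }
    }

    // Returns true only if some UDF reads at least one join key column.
    static boolean udfTouchesJoinKey(List<GeneratedExpr> exprs, Set<Integer> joinKeyColumns) {
        for (GeneratedExpr expr : exprs) {
            if (!expr.callsUdf) {
                continue; // plain projections are not restricted by this rule
            }
            for (Integer col : expr.inputColumns) {
                if (joinKeyColumns.contains(col)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Relaxed counterpart of "return !containsUDFs(...)": UDFs on non-key fields are allowed.
    static boolean isAcceptableForEach(List<GeneratedExpr> exprs, Set<Integer> joinKeyColumns) {
        return !udfTouchesJoinKey(exprs, joinKeyColumns);
    }

    public static void main(String[] args) {
        Set<Integer> joinKey = new HashSet<>(Arrays.asList(0)); // join on column 0

        // UDF applied to column 1 only; the join key is projected untouched.
        List<GeneratedExpr> udfOnOtherField = Arrays.asList(
                new GeneratedExpr(false, 0), new GeneratedExpr(true, 1));

        // UDF applied to the join key itself.
        List<GeneratedExpr> udfOnJoinKey = Arrays.asList(
                new GeneratedExpr(true, 0), new GeneratedExpr(false, 1));

        System.out.println(isAcceptableForEach(udfOnOtherField, joinKey)); // true
        System.out.println(isAcceptableForEach(udfOnJoinKey, joinKey));    // false
    }
}
```

With a check like this, a FOREACH that only transforms non-key fields would still be eligible for a merge join, while one that runs a UDF over the join key would be refused as it is today.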



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
