William Watson created PIG-4458:
-----------------------------------

Summary: Support UDFs in a FOREACH Before a Merge Join
Key: PIG-4458
URL: https://issues.apache.org/jira/browse/PIG-4458
Project: Pig
Issue Type: New Feature
Reporter: William Watson

Right now, MapSideMergeValidator outright rejects any FOREACH that contains a UDF:

{code}
private boolean isAcceptableForEachOp(Operator lo) throws LogicalToPhysicalTranslatorException {
    if (lo instanceof LOForEach) {
        OperatorPlan innerPlan = ((LOForEach) lo).getInnerPlan();
        validateMapSideMerge(innerPlan.getSinks(), innerPlan);
        return !containsUDFs((LOForEach) lo);
    } else {
        return false;
    }
}
{code}

There is a TODO for this later in the same class (inside containsUDFs):

{code}
// TODO (dvryaboy): in the future we could relax this rule by tracing what fields
// are being passed into the UDF, and only refusing if the UDF is working on the
// join key. Transforms of other fields should be ok.
{code}

We should either implement the TODO and relax this requirement, or remove the requirement altogether.
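
The relaxed rule described in the TODO could look roughly like the sketch below. It is a simplified, self-contained model and not Pig's actual logical-plan API: `MergeJoinUdfCheck`, `Projection`, `isAcceptableForEach`, and the representation of fields as column indexes are all hypothetical names introduced for illustration. Each generated expression records which input columns it reads and whether it calls a UDF, and the FOREACH is refused only when a UDF reads a join-key column.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model; these are NOT Pig's real classes. */
final class MergeJoinUdfCheck {

    /** One expression generated by a FOREACH (assumed representation). */
    static final class Projection {
        final boolean usesUdf;            // true if the expression calls a UDF
        final Set<Integer> inputColumns;  // input column indexes the expression reads

        Projection(boolean usesUdf, Integer... inputColumns) {
            this.usesUdf = usesUdf;
            this.inputColumns = Collections.unmodifiableSet(
                    new HashSet<>(Arrays.asList(inputColumns)));
        }
    }

    /**
     * Relaxed rule from the TODO: refuse the FOREACH only if some UDF
     * reads a join-key column. UDFs over other fields are accepted.
     */
    static boolean isAcceptableForEach(List<Projection> projections,
                                       Set<Integer> joinKeyColumns) {
        for (Projection p : projections) {
            if (p.usesUdf && !Collections.disjoint(p.inputColumns, joinKeyColumns)) {
                return false; // a UDF is working on the join key
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<Integer> joinKey = Collections.singleton(0);

        // Models: FOREACH a GENERATE $0, SomeUdf($1);  (UDF avoids the key)
        List<Projection> udfOnOtherField = Arrays.asList(
                new Projection(false, 0), new Projection(true, 1));

        // Models: FOREACH a GENERATE SomeUdf($0), $1;  (UDF reads the key)
        List<Projection> udfOnJoinKey = Arrays.asList(
                new Projection(true, 0), new Projection(false, 1));

        System.out.println(isAcceptableForEach(udfOnOtherField, joinKey)); // true
        System.out.println(isAcceptableForEach(udfOnJoinKey, joinKey));    // false
    }
}
```

In the real validator, the set of columns feeding each UDF would presumably have to be derived by walking the FOREACH inner plan, which this sketch does not attempt.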