[ 
https://issues.apache.org/jira/browse/PIG-4421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531168#comment-14531168
 ] 

Mohit Sabharwal commented on PIG-4421:
--------------------------------------

[~kellyzly], we should disable this test only for Spark, and add a TODO note in 
the comment to re-enable it once the Spark engine implements the Skew Join 
algorithm.

In {{test/org/apache/pig/test/Util.java}}, add:
{code}
public static boolean isSparkExecType(ExecType execType) {
    return execType.name().toLowerCase().startsWith("spark");
}
{code}
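
To illustrate what the prefix check accepts, here is a minimal self-contained sketch. The nested {{ExecType}} interface is a hypothetical stand-in that models only {{name()}}; it is not Pig's real type. The exec type names in {{main}} are illustrative assumptions as well. The sketch also passes {{Locale.ROOT}} to {{toLowerCase}}, which is a suggestion and not part of the proposal above.

```java
import java.util.Locale;

public class SparkExecTypeSketch {
    /** Hypothetical stand-in for Pig's ExecType; only name() is modeled. */
    public interface ExecType {
        String name();
    }

    /** Same prefix check as the proposed helper, with an explicit locale. */
    public static boolean isSparkExecType(ExecType execType) {
        return execType.name().toLowerCase(Locale.ROOT).startsWith("spark");
    }

    public static void main(String[] args) {
        // Illustrative names only; the real exec types are defined by Pig.
        String[] names = {"SPARK", "spark_local", "MAPREDUCE", "LOCAL"};
        for (String n : names) {
            ExecType t = () -> n;
            System.out.println(n + " -> " + isSparkExecType(t));
        }
    }
}
```

Because the check is a case-insensitive prefix match, any exec type whose name begins with "spark" would be treated as Spark, which may or may not be desirable if other engines adopt similar names.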

And then, in {{TestSkewedJoin#testSkewedJoinKeyPartition}}:
{code}
        // This test relies on how the keys are distributed in the Skew Join
        // implementation. The Spark engine currently implements skew join as a
        // regular join, and hence does not control key distribution.
        // TODO: Enable this test when the Spark engine implements the Skew Join
        // algorithm.
        if (Util.isSparkExecType(cluster.getExecType())) {
            return;
        }
{code}


> implement visitSkewedJoin in SparkCompiler
> ------------------------------------------
>
>                 Key: PIG-4421
>                 URL: https://issues.apache.org/jira/browse/PIG-4421
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>         Attachments: PIG-4421.patch, PIG-4421_2.patch, PIG-4421_3.patch, 
> PIG-4421_4.patch, PIG-4421_5.patch, PIG-4421_6.patch
>
>
> If visitSkewedJoin is not implemented, the following unit tests will fail:
> org.apache.pig.test.TestSkewedJoin.testSkewedJoinWithGroup
> org.apache.pig.test.TestSkewedJoin.testSkewedJoinMapKey
> org.apache.pig.test.TestSkewedJoin.testSkewedJoinManyReducers
> org.apache.pig.test.TestSkewedJoin.testNonExistingInputPathInSkewJoin
> org.apache.pig.test.TestSkewedJoin.testSkewedJoinOneValue
> org.apache.pig.test.TestSkewedJoin.testSkewedJoinWithNoProperties
> org.apache.pig.test.TestSkewedJoin.testSkewedJoinEmptyInput
> org.apache.pig.test.TestSkewedJoin.testSkewedJoinNullKeys
> org.apache.pig.test.TestSkewedJoin.testSkewedJoinOuter
> org.apache.pig.test.TestSkewedJoin.testRecursiveFileListing
> org.apache.pig.test.TestSkewedJoin.testSkewedJoinReducers
> org.apache.pig.test.TestJoinSmoke.testSkewedJoinWithGroup
> org.apache.pig.test.TestJoinSmoke.testSkewedJoinOuter



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
