[ 
https://issues.apache.org/jira/browse/PIG-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803582#comment-15803582
 ] 

liyunzhang_intel edited comment on PIG-4858 at 1/6/17 5:07 AM:
---------------------------------------------------------------

[~nkollar]: I guess you mean the following spot, which I marked with "Here apply
the patch from PIG-3417"? I have updated the patch in PIG-5044, and you can view
the whole code on the review board for that patch.
SparkCompiler#getSamplingJob
{code}
    private SparkOperator getSamplingJob(POSort sort, SparkOperator sampleOperator,
                                         List<PhysicalPlan> transformPlans, int rp,
                                         String udfClassName, String[] udfArgs)
            throws PlanException, VisitorException, ExecException {
        addSampleOperatorForSkewedJoin(sampleOperator);
        List<Boolean> flat1 = new ArrayList<Boolean>();
        List<PhysicalPlan> eps1 = new ArrayList<PhysicalPlan>();

        // if transform plans are not specified, project the columns of sorting keys
        if (transformPlans == null) {
            ......
        } else {
            for (int i = 0; i < transformPlans.size(); i++) {
                eps1.add(transformPlans.get(i));
                // Here apply the patch from PIG-3417
                flat1.add(i == transformPlans.size() - 1 ? true : false);
            }
        }
{code}
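
To make the effect of the marked line easier to see in isolation, here is a minimal standalone sketch. It is not the actual Pig code: the class name FlattenFlagsSketch and the helper buildFlattenFlags are hypothetical, and a plain int stands in for transformPlans.size(). It only shows that the flag list ends up true for the last plan and false for all others.

```java
import java.util.ArrayList;
import java.util.List;

public class FlattenFlagsSketch {
    // Hypothetical helper: one flag per transform plan, true only for the last plan.
    // planCount stands in for transformPlans.size() from the snippet above.
    static List<Boolean> buildFlattenFlags(int planCount) {
        List<Boolean> flags = new ArrayList<Boolean>();
        for (int i = 0; i < planCount; i++) {
            // Same condition as the marked line, without the redundant "? true : false".
            flags.add(i == planCount - 1);
        }
        return flags;
    }

    public static void main(String[] args) {
        // With three plans, only the third flag is set.
        System.out.println(buildFlattenFlags(3));
    }
}
```

With no plans the list is simply empty, so the sketch needs no special case for that.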



> Implement Skewed join for spark engine
> --------------------------------------
>
>                 Key: PIG-4858
>                 URL: https://issues.apache.org/jira/browse/PIG-4858
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: Xianda Ke
>             Fix For: spark-branch
>
>         Attachments: PIG-4858.patch, PIG-4858_2.patch, PIG-4858_3.patch, 
> SkewedJoinInSparkMode.pdf
>
>
> Currently a regular join is used in place of skewed join. Skewed join needs to be implemented.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
