[ 
https://issues.apache.org/jira/browse/PIG-5163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15946683#comment-15946683
 ] 

liyunzhang_intel commented on PIG-5163:
---------------------------------------

[~nkollar]: This is a bug in POJoinGroupSpark#setPredecessors. On my cluster,
the plan before multiquery optimization is:
{code}
before multiquery optimization:
scope-74->scope-77 scope-82 
scope-77
scope-82
#--------------------------------------------------
# Spark Plan                                  
#--------------------------------------------------

Spark node scope-74
Store(hdfs://bdpe42:8020/tmp/temp1378261290/tmp-519681347:org.apache.pig.impl.io.InterStorage) - scope-75
|
|---B: POStream[perl -ne 'print $_;' (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)] - scope-52
    |
    |---A: New For Each(false,false,false)[bag] - scope-51
        |   |
        |   Project[bytearray][0] - scope-45
        |   |
        |   Project[bytearray][1] - scope-47
        |   |
        |   Project[bytearray][2] - scope-49
        |
        |---A: Load(/user/pig/tests/data/singlefile/studenttab10k:org.apache.pig.builtin.PigStorage) - scope-44--------

Spark node scope-77
B: Store(hdfs://bdpe42:8020/user/root/ms_1.out.1:org.apache.pig.builtin.PigStorage) - scope-56
|
|---Load(hdfs://bdpe42:8020/tmp/temp1378261290/tmp-519681347:org.apache.pig.impl.io.InterStorage) - scope-76--------

Spark node scope-82
D: Store(hdfs://bdpe42:8020/user/root/ms_1.out.2:org.apache.pig.builtin.PigStorage) - scope-73
|
|---D: New For Each(true,true)[tuple] - scope-72
    |   |
    |   Project[bag][1] - scope-70
    |   |
    |   Project[bag][2] - scope-71
    |
    |---D: Package(Packager)[tuple]{bytearray} - scope-65
        |
        |---D: Global Rearrange[tuple] - scope-64
            |
            |---D: Local Rearrange[tuple]{bytearray}(false) - scope-66
            |   |   |
            |   |   Project[bytearray][0] - scope-67
            |   |
            |   |---Load(hdfs://bdpe42:8020/tmp/temp1378261290/tmp-519681347:org.apache.pig.impl.io.InterStorage) - scope-78
            |
            |---D: Local Rearrange[tuple]{bytearray}(false) - scope-68
                |   |
                |   Project[bytearray][0] - scope-69
                |
                |---C: POStream[perl -ne 'print $_;' (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)] - scope-61
                    |
                    |---Load(hdfs://bdpe42:8020/tmp/temp1378261290/tmp-519681347:org.apache.pig.impl.io.InterStorage) - scope-80--------

{code}

The plan after multiquery optimization is:
{code}
after multiquery optimization:
scope-74
#--------------------------------------------------
# Spark Plan                                  
#--------------------------------------------------

Spark node scope-74
Split - scope-86
|   |
|   B: Store(hdfs://bdpe42:8020/user/root/ms_1.out.1:org.apache.pig.builtin.PigStorage) - scope-56
|   |
|   D: Store(hdfs://bdpe42:8020/user/root/ms_1.out.2:org.apache.pig.builtin.PigStorage) - scope-73
|   |
|   |---D: New For Each(true,true)[tuple] - scope-72
|       |   |
|       |   Project[bag][1] - scope-70
|       |   |
|       |   Project[bag][2] - scope-71
|       |
|       |---POJoinGroupSpark[tuple] - scope-64
|           |
|           |---C: POStream[perl -ne 'print $_;' (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)] - scope-61
|
|---B: POStream[perl -ne 'print $_;' (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)] - scope-52
    |
    |---A: New For Each(false,false,false)[bag] - scope-51
        |   |
        |   Project[bytearray][0] - scope-45
        |   |
        |   Project[bytearray][1] - scope-47
        |   |
        |   Project[bytearray][2] - scope-49
        |
        |---A: Load(/user/pig/tests/data/singlefile/studenttab10k:org.apache.pig.builtin.PigStorage) - scope-44--------
{code}

The predecessors of scope-64 should be scope-52 and scope-61, but in the
current code the only predecessor of scope-64 is scope-61.
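
To make the mismatch concrete, here is a minimal, self-contained sketch that
models the two operator graphs and reports which predecessor is missing. It
does not use any Pig classes: Plan, connect, predecessors, expectedPlan,
observedPlan and missing are hypothetical names made up for illustration, and
the edges are my reading of the plans above (scope-52 feeding both scope-61
and scope-64 is an assumption derived from the intermediate Load operators in
the plan before optimization).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PredecessorCheck {
    // Hypothetical stand-in for a physical plan: each operator maps to its direct inputs.
    static final class Plan {
        private final Map<String, List<String>> preds = new LinkedHashMap<>();

        // Record that the output of 'from' is an input of 'to'.
        void connect(String from, String to) {
            preds.computeIfAbsent(from, k -> new ArrayList<>());
            preds.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
        }

        List<String> predecessors(String op) {
            return preds.getOrDefault(op, Collections.<String>emptyList());
        }
    }

    // Assumed correct wiring after the merge: B (scope-52) feeds C (scope-61),
    // and both B and C feed POJoinGroupSpark (scope-64).
    static Plan expectedPlan() {
        Plan p = new Plan();
        p.connect("scope-52", "scope-61");
        p.connect("scope-52", "scope-64");
        p.connect("scope-61", "scope-64");
        return p;
    }

    // Wiring as printed after multiquery optimization: scope-64 only sees scope-61.
    static Plan observedPlan() {
        Plan p = new Plan();
        p.connect("scope-52", "scope-61");
        p.connect("scope-61", "scope-64");
        return p;
    }

    // Predecessors present in the expected plan but absent from the observed one.
    static List<String> missing(Plan expected, Plan observed, String op) {
        List<String> result = new ArrayList<>(expected.predecessors(op));
        result.removeAll(observed.predecessors(op));
        return result;
    }

    public static void main(String[] args) {
        // prints [scope-52]
        System.out.println(missing(expectedPlan(), observedPlan(), "scope-64"));
    }
}
```

With only scope-61 wired in, the join/group operator receives one of its two
inputs, which would be consistent with the empty second output reported in
this issue.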

> MultiQuery_Streaming_1 is failing with spark exec type
> ------------------------------------------------------
>
>                 Key: PIG-5163
>                 URL: https://issues.apache.org/jira/browse/PIG-5163
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Nandor Kollar
>            Assignee: liyunzhang_intel
>             Fix For: spark-branch
>
>
> 2nd output was empty, looks like pig on spark didn't generate any data.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
