[ 
https://issues.apache.org/jira/browse/SPARK-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruhui Wang updated SPARK-20295:
-------------------------------
    Description: 
When spark.sql.exchange.reuse is enabled and a query with a self join (such as
TPC-DS q95) is run, the physical plan intermittently becomes the following:

WholeStageCodegen
:  +- Project [id#0L]
:     +- BroadcastHashJoin [id#0L], [id#2L], Inner, BuildRight, None
:        :- Project [id#0L]
:        :  +- BroadcastHashJoin [id#0L], [id#1L], Inner, BuildRight, None
:        :     :- Range 0, 1, 4, 1024, [id#0L]
:        :     +- INPUT
:        +- INPUT
:- BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))
:  +- WholeStageCodegen
:     :  +- Range 0, 1, 4, 1024, [id#1L]
+- ReusedExchange [id#2L], BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))

If spark.sql.adaptive.enabled = true, the call path is:
ShuffleExchange#doExecute --> postShuffleRDD --> doEstimationIfNecessary.
In doEstimationIfNecessary, assert(exchanges.length == numExchanges) fails,
because the left side has only one element while the right side equals 2.
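
The mismatch described above can be illustrated with a small toy model. This
is only a sketch in plain Python, not Spark's code: ToyCoordinator,
plan_exchanges and run are hypothetical names, and the assumption that a
reused exchange never registers itself is one plausible reading of the
report, not something the report confirms.

```python
class ToyCoordinator:
    """Toy stand-in for a coordinator that expects a fixed number of exchanges."""

    def __init__(self, num_exchanges):
        self.num_exchanges = num_exchanges  # count fixed when the plan is built
        self.exchanges = []                 # exchanges that actually registered

    def register_exchange(self, exchange):
        self.exchanges.append(exchange)

    def do_estimation_if_necessary(self):
        # Same shape as the check quoted in the report:
        # assert(exchanges.length == numExchanges)
        if len(self.exchanges) != self.num_exchanges:
            raise AssertionError(
                "registered %d exchange(s), expected %d"
                % (len(self.exchanges), self.num_exchanges)
            )
        return len(self.exchanges)


def plan_exchanges(coordinator, exchange_names, reuse):
    """Register exchanges; with reuse on, a duplicate is skipped (assumption)."""
    seen = set()
    for name in exchange_names:
        if reuse and name in seen:
            continue  # stands in for a reused-exchange node that never registers
        seen.add(name)
        coordinator.register_exchange(name)


def run(reuse):
    # A self join yields two identical exchanges, so the coordinator expects 2.
    coordinator = ToyCoordinator(num_exchanges=2)
    plan_exchanges(coordinator, ["exchange(id)", "exchange(id)"], reuse)
    try:
        coordinator.do_estimation_if_necessary()
        return "ok"
    except AssertionError as err:
        return "assertion failed: %s" % err


if __name__ == "__main__":
    print("reuse off:", run(reuse=False))
    print("reuse on: ", run(reuse=True))
```

In this model the expected count is fixed before deduplication, so skipping
the duplicate leaves one registered exchange against an expected two, which
is the same shape as the failure reported here.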

Is this a bug in the interaction between spark.sql.adaptive.enabled and
exchange reuse?


> When spark.sql.adaptive.enabled is enabled, it conflicts with Exchange Reuse
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-20295
>                 URL: https://issues.apache.org/jira/browse/SPARK-20295
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, SQL
>    Affects Versions: 2.1.0
>            Reporter: Ruhui Wang
>


