Ruhui Wang created SPARK-20295:
----------------------------------

             Summary: when spark.sql.adaptive.enabled is enabled, have conflict with Exchange Reuse
                 Key: SPARK-20295
                 URL: https://issues.apache.org/jira/browse/SPARK-20295
             Project: Spark
          Issue Type: Bug
          Components: Shuffle, SQL
    Affects Versions: 2.1.0
            Reporter: Ruhui Wang

When spark.sql.exchange.reuse is enabled and a query with a self join (such as tpcds-q95) is run, the physical plan intermittently looks like this:

    WholeStageCodegen
    :  +- Project [id#0L]
    :     +- BroadcastHashJoin [id#0L], [id#2L], Inner, BuildRight, None
    :        :- Project [id#0L]
    :        :  +- BroadcastHashJoin [id#0L], [id#1L], Inner, BuildRight, None
    :        :     :- Range 0, 1, 4, 1024, [id#0L]
    :        :     +- INPUT
    :        +- INPUT
    :- BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))
    :  +- WholeStageCodegen
    :     :  +- Range 0, 1, 4, 1024, [id#1L]
    +- ReusedExchange [id#2L], BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))

If spark.sql.adaptive.enabled = true, the call path is:

    ShuffleExchange#doExecute --> postShuffleRDD --> doEstimationIfNecessary

In doEstimationIfNecessary, assert(exchanges.length == numExchanges) fails, because the left side (exchanges.length) is 1 while the right side (numExchanges) is 2.

Is this a bug in how spark.sql.adaptive.enabled interacts with exchange reuse?
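
To make the reported mismatch concrete, here is a small toy model in plain Python. It is not Spark code: ToyCoordinator, register_exchange and run_self_join are hypothetical names chosen for illustration. It also assumes that the count is off because the reused side of the self join never registers its own exchange. The report itself only states that exchanges.length is 1 while numExchanges is 2.

```python
# Toy model only: the names mirror the report, not Spark's real classes.
class ToyCoordinator:
    def __init__(self, num_exchanges):
        # Number of exchanges the planner expected to register.
        self.num_exchanges = num_exchanges
        self.exchanges = []

    def register_exchange(self, exchange):
        self.exchanges.append(exchange)

    def do_estimation_if_necessary(self):
        # Mirrors assert(exchanges.length == numExchanges) from the report.
        assert len(self.exchanges) == self.num_exchanges, (
            f"registered {len(self.exchanges)} exchange(s), "
            f"expected {self.num_exchanges}"
        )
        return True


def run_self_join(reuse_enabled):
    # The planner is assumed to have counted two exchanges for the join.
    coordinator = ToyCoordinator(num_exchanges=2)
    coordinator.register_exchange("exchange-left")
    if not reuse_enabled:
        # Without reuse, the second side keeps and registers its own exchange.
        coordinator.register_exchange("exchange-right")
    # With reuse (assumption), the second side only references the first
    # exchange, so nothing further is registered before estimation runs.
    return coordinator.do_estimation_if_necessary()
```

In this model run_self_join(False) passes the check, while run_self_join(True) raises AssertionError with one registered exchange against an expected two, which matches the 1 versus 2 described above.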