[jira] [Updated] (SPARK-20295) when spark.sql.adaptive.enabled is enabled, have conflict with Exchange Resue
[ https://issues.apache.org/jira/browse/SPARK-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ruhui Wang updated SPARK-20295:
---
Description:

When running tpcds-q95 with spark.sql.adaptive.enabled = true, the physical plan is initially:

Sort
: +- Exchange(coordinator id: 1)
: +- Project***
::-Sort **
:: +- Exchange(coordinator id: 2)
:: :- Project ***
:+- Sort
:: +- Exchange(coordinator id: 3)

With spark.sql.exchange.reuse enabled, the physical plan then becomes:

Sort
: +- Exchange(coordinator id: 1)
: +- Project***
::-Sort **
:: +- Exchange(coordinator id: 2)
:: :- Project ***
:+- Sort
:: +- ReusedExchange Exchange(coordinator id: 2)

With spark.sql.adaptive.enabled = true, the call path is:
ShuffleExchange#doExecute --> postShuffleRDD --> doEstimationIfNecessary.
In doEstimationIfNecessary, assert(exchanges.length == numExchanges) fails, because the left side has only one element while the right side equals 2.

Is this a bug in the interaction between spark.sql.adaptive.enabled and exchange reuse?

was:

When spark.sql.exchange.reuse is enabled and a query with a self join (such as tpcds-q95) is run, the physical plan intermittently becomes the following:

WholeStageCodegen
: +- Project [id#0L]
: +- BroadcastHashJoin [id#0L], [id#2L], Inner, BuildRight, None
::- Project [id#0L]
:: +- BroadcastHashJoin [id#0L], [id#1L], Inner, BuildRight, None
:: :- Range 0, 1, 4, 1024, [id#0L]
:: +- INPUT
:+- INPUT
:- BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))
: +- WholeStageCodegen
: : +- Range 0, 1, 4, 1024, [id#1L]
+- ReusedExchange [id#2L], BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))

With spark.sql.adaptive.enabled = true, the call path is:
ShuffleExchange#doExecute --> postShuffleRDD --> doEstimationIfNecessary.
In doEstimationIfNecessary, assert(exchanges.length == numExchanges) fails, because the left side has only one element while the right side equals 2.

Is this a bug in the interaction between spark.sql.adaptive.enabled and exchange reuse?
> when spark.sql.adaptive.enabled is enabled, have conflict with Exchange Resue
> --
>
> Key: SPARK-20295
> URL: https://issues.apache.org/jira/browse/SPARK-20295
> Project: Spark
> Issue Type: Bug
> Components: Shuffle, SQL
> Affects Versions: 2.1.0
> Reporter: Ruhui Wang

--
This message was sent by Atlassian JIRA (v6.3.15#6346)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
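The reported assertion can be illustrated with a toy model. The sketch below is not Spark code: `ToyCoordinator`, `run_plan`, and the "exchange"/"reused" labels are hypothetical names made up for illustration. It assumes one possible reading of the report, namely that the coordinator's expected count is fixed while both children are still exchanges, and that a child later replaced by a reused exchange never registers. The report itself only states that exchanges.length is 1 while numExchanges is 2.

```python
class ToyCoordinator:
    """Toy stand-in for a shuffle exchange coordinator (hypothetical, not Spark code)."""

    def __init__(self, num_exchanges):
        # Number of exchanges expected to register, fixed up front.
        self.num_exchanges = num_exchanges
        self.exchanges = []

    def register(self, exchange_name):
        # In this model only a real exchange registers; a reused one never does.
        self.exchanges.append(exchange_name)

    def do_estimation_if_necessary(self):
        # Mirrors the check quoted in the report:
        # assert(exchanges.length == numExchanges)
        assert len(self.exchanges) == self.num_exchanges, (
            f"registered {len(self.exchanges)} exchange(s), "
            f"expected {self.num_exchanges}"
        )
        return len(self.exchanges)


def run_plan(children):
    """children: list of (kind, name) pairs; kind is 'exchange' or 'reused'."""
    # Assumption: the expected count is taken before any child is swapped
    # for a reused exchange, so it equals the number of children.
    coordinator = ToyCoordinator(num_exchanges=len(children))
    for kind, name in children:
        if kind == "exchange":
            coordinator.register(name)
    try:
        coordinator.do_estimation_if_necessary()
        return "ok"
    except AssertionError as err:
        return f"assertion failed: {err}"


without_reuse = run_plan([("exchange", "left"), ("exchange", "right")])
with_reuse = run_plan([("exchange", "left"), ("reused", "right")])
print(without_reuse)
print(with_reuse)
```

In this model the plan with two real exchanges passes the check, while the plan in which one child is a reused exchange trips it with one registered exchange against an expected two, which matches the counts given in the report.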
[jira] [Updated] (SPARK-20295) when spark.sql.adaptive.enabled is enabled, have conflict with Exchange Resue
[ https://issues.apache.org/jira/browse/SPARK-20295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ruhui Wang updated SPARK-20295:
---
Description:

When spark.sql.exchange.reuse is enabled and a query with a self join (such as tpcds-q95) is run, the physical plan intermittently becomes the following:

WholeStageCodegen
: +- Project [id#0L]
: +- BroadcastHashJoin [id#0L], [id#2L], Inner, BuildRight, None
::- Project [id#0L]
:: +- BroadcastHashJoin [id#0L], [id#1L], Inner, BuildRight, None
:: :- Range 0, 1, 4, 1024, [id#0L]
:: +- INPUT
:+- INPUT
:- BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))
: +- WholeStageCodegen
: : +- Range 0, 1, 4, 1024, [id#1L]
+- ReusedExchange [id#2L], BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))

With spark.sql.adaptive.enabled = true, the call path is:
ShuffleExchange#doExecute --> postShuffleRDD --> doEstimationIfNecessary.
In doEstimationIfNecessary, assert(exchanges.length == numExchanges) fails, because the left side has only one element while the right side equals 2.

Is this a bug in the interaction between spark.sql.adaptive.enabled and exchange reuse?
[jira] [Created] (SPARK-20295) when spark.sql.adaptive.enabled is enabled, have conflict with Exchange Resue
Ruhui Wang created SPARK-20295:
--
Summary: when spark.sql.adaptive.enabled is enabled, have conflict with Exchange Resue
Key: SPARK-20295
URL: https://issues.apache.org/jira/browse/SPARK-20295
Project: Spark
Issue Type: Bug
Components: Shuffle, SQL
Affects Versions: 2.1.0
Reporter: Ruhui Wang

When spark.sql.exchange.reuse is enabled and a query with a self join (such as tpcds-q95) is run, the physical plan intermittently becomes the following:

WholeStageCodegen
: +- Project [id#0L]
: +- BroadcastHashJoin [id#0L], [id#2L], Inner, BuildRight, None
::- Project [id#0L]
:: +- BroadcastHashJoin [id#0L], [id#1L], Inner, BuildRight, None
:: :- Range 0, 1, 4, 1024, [id#0L]
:: +- INPUT
:+- INPUT
:- BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))
: +- WholeStageCodegen
: : +- Range 0, 1, 4, 1024, [id#1L]
+- ReusedExchange [id#2L], BroadcastExchange HashedRelationBroadcastMode(true,List(id#1L),List(id#1L))

With spark.sql.adaptive.enabled = true, the call path is:
ShuffleExchange#doExecute --> postShuffleRDD --> doEstimationIfNecessary.
In doEstimationIfNecessary, assert(exchanges.length == numExchanges) fails, because the left side has only one element while the right side equals 2.

Is this a bug in the interaction between spark.sql.adaptive.enabled and exchange reuse?
[jira] [Created] (SPARK-20137) In Spark1.5 I can 'cache table as select' for many times. In Spark2.1 it will show error TempTableAlreadyExistsException
Ruhui Wang created SPARK-20137:
--
Summary: In Spark1.5 I can 'cache table as select' for many times. In Spark2.1 it will show error TempTableAlreadyExistsException
Key: SPARK-20137
URL: https://issues.apache.org/jira/browse/SPARK-20137
Project: Spark
Issue Type: Question
Components: SQL
Affects Versions: 2.1.0
Reporter: Ruhui Wang

This question is about 'cache table as select'. In Spark 1.5, I can run these statements one after another, reusing the same table name:

cache table t as select * from t1;
cache table t as select * from t2;

In Spark 2.1, running the second statement shows "Error in query: Temporary table 't1' already exists;"

Why does Spark 2.1 not support this?
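The difference described in this question can be pictured as create-only versus create-or-replace semantics for a temporary table name. The sketch below is a toy model, not Spark's catalog: `ToyCatalog`, `cache_table_as_select`, and the `replace` flag are hypothetical names, and the mapping of `replace=True` to the Spark 1.5 behaviour and `replace=False` to the Spark 2.1 behaviour is an assumption drawn only from what the reporter observed.

```python
class ToyCatalog:
    """Toy temporary-table registry (hypothetical, not Spark's catalog)."""

    def __init__(self):
        self.temp_tables = {}

    def create_temp_table(self, name, query, replace):
        # Create-only semantics refuse to overwrite an existing name.
        if name in self.temp_tables and not replace:
            raise ValueError(f"Temporary table '{name}' already exists")
        self.temp_tables[name] = query


def cache_table_as_select(catalog, name, query, replace):
    """Model of 'cache table <name> as <query>' under the chosen semantics."""
    try:
        catalog.create_temp_table(name, query, replace)
        return "cached"
    except ValueError as err:
        return f"Error in query: {err}"


# Assumed 1.5-like behaviour: the second statement silently replaces 't'.
old_style = ToyCatalog()
old_results = [
    cache_table_as_select(old_style, "t", "select * from t1", replace=True),
    cache_table_as_select(old_style, "t", "select * from t2", replace=True),
]

# Assumed 2.1-like behaviour: the second statement is rejected.
new_style = ToyCatalog()
new_results = [
    cache_table_as_select(new_style, "t", "select * from t1", replace=False),
    cache_table_as_select(new_style, "t", "select * from t2", replace=False),
]

print(old_results)
print(new_results)
```

Under the create-only semantics, the name has to be released before it can be reused, so in this model the second statement succeeds only after the first entry for `t` has been removed from the registry.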