[jira] [Updated] (SPARK-37099) Impl a rank-based filter to optimize top-k computation
[ https://issues.apache.org/jira/browse/SPARK-37099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-37099: - Description: At JD, we found that more than 90% of window function usage follows this pattern: {code:java} select (... (row_number|rank|dense_rank) () over( [partition by ...] order by ... ) as rn) where rn (==|<|<=) k and other conditions{code} However, the existing physical plan is not optimal: 1. We should select the local top-k records within each partition, and then compute the global top-k; this helps reduce the shuffle amount. For these three rank functions (row_number|rank|dense_rank), the rank of a key computed on a partial dataset is always <= its final rank computed on the whole dataset, so we can safely discard rows with partial rank > k, anywhere. 2. Skewed window: some partitions are skewed and take a long time to finish computation. A real-world skewed-window case from our system is attached. was: in JD, we found that more than 90% usage of window function follows this pattern: {code:java} select (... (row_number|rank|dense_rank) () over( [partition by ...] order by ... ) as rn) where rn (==|<|<=) k and other conditions{code} However, existing physical plan is not optimum: 1, we should select local top-k records within each partitions, and then compute the global top-k. this can help reduce the shuffle amount; For these three rank functions (row_number|rank|dense_rank), the rank of a key computed on partitial dataset is always <= its final rank computed on the whole dataset. so we can safely discard rows with partitial rank > rn, anywhere. 2, skewed-window: some partition is skewed and take a long time to finish computation. A real-world skewed-window case in our system is attached.
> Impl a rank-based filter to optimize top-k computation > -- > > Key: SPARK-37099 > URL: https://issues.apache.org/jira/browse/SPARK-37099 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: zhengruifeng >Priority: Major > Attachments: q67.png, q67_optimized.png, skewed_window.png > > > At JD, we found that more than 90% of window function usage follows this > pattern: > {code:java} > select (... (row_number|rank|dense_rank) () over( [partition by ...] order > by ... ) as rn) > where rn (==|<|<=) k and other conditions{code} > > However, the existing physical plan is not optimal: > > 1. We should select the local top-k records within each partition, and then > compute the global top-k; this helps reduce the shuffle amount. > > For these three rank functions (row_number|rank|dense_rank), the rank of a > key computed on a partial dataset is always <= its final rank computed on > the whole dataset, so we can safely discard rows with partial rank > k, > anywhere. > > > 2. Skewed window: some partitions are skewed and take a long time to finish > computation. > > A real-world skewed-window case from our system is attached. > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
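The monotonicity argument in the description above can be sketched with plain Scala collections (an illustration only — `localTopK`, the sample splits, and the object name below are made up, not Spark's physical operators): because a row's rank on any subset of its partition never exceeds its final rank, pre-filtering each map-side split to its local top-k before combining yields exactly the same result as ranking the full dataset.

```scala
object RankFilterSketch {
  // row_number-style ranking: keep at most the k smallest order-values
  // per partition key. A row's rank on a subset is <= its final rank,
  // so rows with local rank > k can be discarded before the shuffle.
  def localTopK(rows: Seq[(String, Int)], k: Int): Seq[(String, Int)] =
    rows.groupBy(_._1).valuesIterator.flatMap(_.sortBy(_._2).take(k)).toSeq

  // Two "map-side" splits of the same logical dataset (partition keys a, b).
  val split1 = Seq(("a", 5), ("a", 1), ("b", 9))
  val split2 = Seq(("a", 3), ("b", 2), ("b", 7))

  // Naive plan: shuffle all rows, then rank globally.
  def globalTopK(k: Int): Set[(String, Int)] =
    localTopK(split1 ++ split2, k).toSet

  // Optimized plan: pre-filter each split locally, then rank the survivors.
  def optimizedTopK(k: Int): Set[(String, Int)] =
    localTopK(localTopK(split1, k) ++ localTopK(split2, k), k).toSet
}
```

For k = 2 both plans keep (a,1), (a,3), (b,2), (b,7), but the optimized plan ships only the local survivors across the shuffle — which is the safety property the proposed filter relies on.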
[jira] [Updated] (SPARK-37099) Introduce a rank-based filter to optimize top-k computation
[ https://issues.apache.org/jira/browse/SPARK-37099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-37099: - Summary: Introduce a rank-based filter to optimize top-k computation (was: Impl a rank-based filter to optimize top-k computation) > Introduce a rank-based filter to optimize top-k computation > --- > > Key: SPARK-37099 > URL: https://issues.apache.org/jira/browse/SPARK-37099 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: zhengruifeng >Priority: Major > Attachments: q67.png, q67_optimized.png, skewed_window.png > > > At JD, we found that more than 90% of window function usage follows this > pattern: > {code:java} > select (... (row_number|rank|dense_rank) () over( [partition by ...] order > by ... ) as rn) > where rn (==|<|<=) k and other conditions{code} > > However, the existing physical plan is not optimal: > > 1. We should select the local top-k records within each partition, and then > compute the global top-k; this helps reduce the shuffle amount. > > For these three rank functions (row_number|rank|dense_rank), the rank of a > key computed on a partial dataset is always <= its final rank computed on > the whole dataset, so we can safely discard rows with partial rank > k, > anywhere. > > > 2. Skewed window: some partitions are skewed and take a long time to finish > computation. > > A real-world skewed-window case from our system is attached. > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-37099) Introduce a rank-based filter to optimize top-k computation
[ https://issues.apache.org/jira/browse/SPARK-37099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-37099: - Affects Version/s: 3.4.0 (was: 3.3.0) > Introduce a rank-based filter to optimize top-k computation > --- > > Key: SPARK-37099 > URL: https://issues.apache.org/jira/browse/SPARK-37099 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: zhengruifeng >Priority: Major > Attachments: q67.png, q67_optimized.png, skewed_window.png > > > At JD, we found that more than 90% of window function usage follows this > pattern: > {code:java} > select (... (row_number|rank|dense_rank) () over( [partition by ...] order > by ... ) as rn) > where rn (==|<|<=) k and other conditions{code} > > However, the existing physical plan is not optimal: > > 1. We should select the local top-k records within each partition, and then > compute the global top-k; this helps reduce the shuffle amount. > > For these three rank functions (row_number|rank|dense_rank), the rank of a > key computed on a partial dataset is always <= its final rank computed on > the whole dataset, so we can safely discard rows with partial rank > k, > anywhere. > > > 2. Skewed window: some partitions are skewed and take a long time to finish > computation. > > A real-world skewed-window case from our system is attached. > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36638) Generalize OptimizeSkewedJoin
[ https://issues.apache.org/jira/browse/SPARK-36638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengruifeng updated SPARK-36638: - Affects Version/s: 3.4.0 (was: 3.3.0) > Generalize OptimizeSkewedJoin > - > > Key: SPARK-36638 > URL: https://issues.apache.org/jira/browse/SPARK-36638 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: zhengruifeng >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38710) use SparkArithmeticException in Arithmetic overflow runtime errors
Gengliang Wang created SPARK-38710: -- Summary: use SparkArithmeticException in Arithmetic overflow runtime errors Key: SPARK-38710 URL: https://issues.apache.org/jira/browse/SPARK-38710 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang use SparkArithmeticException in Arithmetic overflow runtime errors, instead of java.lang.ArithmeticException -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
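The intent of this change can be sketched as follows (a hypothetical, simplified stand-in — the real SparkArithmeticException lives in Spark's error framework, and `SparkArithmeticExceptionSketch`, `OverflowSketch`, and the error-class string below are made up for illustration): a Spark-flavored subclass of ArithmeticException that carries a structured error class is thrown in place of the bare JVM exception.

```scala
// Hypothetical sketch: surface overflow as a Spark-flavored exception
// carrying an error class, instead of a bare java.lang.ArithmeticException.
class SparkArithmeticExceptionSketch(val errorClass: String, message: String)
    extends ArithmeticException(message)

object OverflowSketch {
  def checkedAdd(a: Int, b: Int): Int =
    try Math.addExact(a, b) // throws java.lang.ArithmeticException on overflow
    catch {
      case _: ArithmeticException =>
        throw new SparkArithmeticExceptionSketch(
          "ARITHMETIC_OVERFLOW", s"integer overflow computing $a + $b")
    }
}
```

Callers (and error-message tooling) can then dispatch on the error class rather than pattern-matching on a generic JVM exception message.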
[jira] [Updated] (SPARK-38710) use SparkArithmeticException for Arithmetic overflow runtime errors
[ https://issues.apache.org/jira/browse/SPARK-38710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-38710: --- Summary: use SparkArithmeticException for Arithmetic overflow runtime errors (was: use SparkArithmeticException in Arithmetic overflow runtime errors) > use SparkArithmeticException for Arithmetic overflow runtime errors > --- > > Key: SPARK-38710 > URL: https://issues.apache.org/jira/browse/SPARK-38710 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > > use SparkArithmeticException in Arithmetic overflow runtime errors, instead > of > java.lang.ArithmeticException -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38710) use SparkArithmeticException for Arithmetic overflow runtime errors
[ https://issues.apache.org/jira/browse/SPARK-38710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515142#comment-17515142 ] Apache Spark commented on SPARK-38710: -- User 'gengliangwang' has created a pull request for this issue: https://github.com/apache/spark/pull/36022 > use SparkArithmeticException for Arithmetic overflow runtime errors > --- > > Key: SPARK-38710 > URL: https://issues.apache.org/jira/browse/SPARK-38710 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > > use SparkArithmeticException in Arithmetic overflow runtime errors, instead > of > java.lang.ArithmeticException -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38710) use SparkArithmeticException for Arithmetic overflow runtime errors
[ https://issues.apache.org/jira/browse/SPARK-38710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38710: Assignee: Apache Spark (was: Gengliang Wang) > use SparkArithmeticException for Arithmetic overflow runtime errors > --- > > Key: SPARK-38710 > URL: https://issues.apache.org/jira/browse/SPARK-38710 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Apache Spark >Priority: Minor > > use SparkArithmeticException in Arithmetic overflow runtime errors, instead > of > java.lang.ArithmeticException -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38710) use SparkArithmeticException for Arithmetic overflow runtime errors
[ https://issues.apache.org/jira/browse/SPARK-38710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38710: Assignee: Gengliang Wang (was: Apache Spark) > use SparkArithmeticException for Arithmetic overflow runtime errors > --- > > Key: SPARK-38710 > URL: https://issues.apache.org/jira/browse/SPARK-38710 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > > use SparkArithmeticException in Arithmetic overflow runtime errors, instead > of > java.lang.ArithmeticException -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38709) remove trailing $ from function class name in sql-expression-schema.md
[ https://issues.apache.org/jira/browse/SPARK-38709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-38709. --- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36021 [https://github.com/apache/spark/pull/36021] > remove trailing $ from function class name in sql-expression-schema.md > -- > > Key: SPARK-38709 > URL: https://issues.apache.org/jira/browse/SPARK-38709 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38709) remove trailing $ from function class name in sql-expression-schema.md
[ https://issues.apache.org/jira/browse/SPARK-38709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-38709: - Assignee: Wenchen Fan > remove trailing $ from function class name in sql-expression-schema.md > -- > > Key: SPARK-38709 > URL: https://issues.apache.org/jira/browse/SPARK-38709 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38709) remove trailing $ from function class name in sql-expression-schema.md
[ https://issues.apache.org/jira/browse/SPARK-38709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-38709: Affects Version/s: 3.3.0 (was: 3.4.0) > remove trailing $ from function class name in sql-expression-schema.md > -- > > Key: SPARK-38709 > URL: https://issues.apache.org/jira/browse/SPARK-38709 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Fix For: 3.3.0 > > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38711) Refactor pyspark.sql.streaming module
Hyukjin Kwon created SPARK-38711: Summary: Refactor pyspark.sql.streaming module Key: SPARK-38711 URL: https://issues.apache.org/jira/browse/SPARK-38711 Project: Spark Issue Type: Improvement Components: PySpark, Structured Streaming Affects Versions: 3.4.0 Reporter: Hyukjin Kwon pyspark.sql.streaming is a single file that has both I/O and streaming query instances. We should separate them like we do in dataframe.py and readwriter.py -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38711) Refactor pyspark.sql.streaming module
[ https://issues.apache.org/jira/browse/SPARK-38711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515209#comment-17515209 ] Apache Spark commented on SPARK-38711: -- User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/36023 > Refactor pyspark.sql.streaming module > - > > Key: SPARK-38711 > URL: https://issues.apache.org/jira/browse/SPARK-38711 > Project: Spark > Issue Type: Improvement > Components: PySpark, Structured Streaming >Affects Versions: 3.4.0 >Reporter: Hyukjin Kwon >Priority: Major > > pyspark.sql.streaming is a single file that has both I/O and streaming query > instances. We should separate them like we do in dataframe.py and > readwriter.py -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38711) Refactor pyspark.sql.streaming module
[ https://issues.apache.org/jira/browse/SPARK-38711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38711: Assignee: Apache Spark > Refactor pyspark.sql.streaming module > - > > Key: SPARK-38711 > URL: https://issues.apache.org/jira/browse/SPARK-38711 > Project: Spark > Issue Type: Improvement > Components: PySpark, Structured Streaming >Affects Versions: 3.4.0 >Reporter: Hyukjin Kwon >Assignee: Apache Spark >Priority: Major > > pyspark.sql.streaming is a single file that has both I/O and streaming query > instances. We should separate them like we do in dataframe.py and > readwriter.py -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38711) Refactor pyspark.sql.streaming module
[ https://issues.apache.org/jira/browse/SPARK-38711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38711: Assignee: (was: Apache Spark) > Refactor pyspark.sql.streaming module > - > > Key: SPARK-38711 > URL: https://issues.apache.org/jira/browse/SPARK-38711 > Project: Spark > Issue Type: Improvement > Components: PySpark, Structured Streaming >Affects Versions: 3.4.0 >Reporter: Hyukjin Kwon >Priority: Major > > pyspark.sql.streaming is a single file that has both I/O and streaming query > instances. We should separate them like we do in dataframe.py and > readwriter.py -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38695) ORC cannot support data types such as char or varchar
[ https://issues.apache.org/jira/browse/SPARK-38695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515222#comment-17515222 ] Kent Yao commented on SPARK-38695: -- Do you see this error in Spark 3.2.1 too? It should have been resolved since 3.1.3/3.2.0 by https://issues.apache.org/jira/browse/SPARK-35700 > ORC cannot support data types such as char or varchar > > > Key: SPARK-38695 > URL: https://issues.apache.org/jira/browse/SPARK-38695 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.2, 3.2.1 >Reporter: jacky >Priority: Major > > When testing Spark performance with TPC-DS and running SQL such as q1, I found this error: > java.lang.UnsupportedOperationException: DataType: char(2) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.getPredicateLeafType(OrcFilters.scala:150) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.getType$1(OrcFilters.scala:222) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:266) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:132) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:135) > at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245) > at scala.collection.immutable.List.foreach(List.scala:392) > at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245) > at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242) > at scala.collection.immutable.List.flatMap(List.scala:355) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:134) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:73) > at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189) > at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4$adapted(OrcFileFormat.scala > > I used the following SQL to create the tables: > create table customer > stored as orc > as select * from tpdc_text.customer > CLUSTER BY c_customer_sk > > create table store > stored as orc > as select * from tpdc_text.store > CLUSTER BY s_store_sk > > create table date_dim > stored as orc > as select * from tpdc_text.date_dim; > > create table store_returns > ( > sr_return_time_sk bigint > , sr_item_sk bigint > , sr_customer_sk bigint > , sr_cdemo_sk bigint > , sr_hdemo_sk bigint > , sr_addr_sk bigint > , sr_store_sk bigint > , sr_reason_sk bigint > , sr_ticket_number bigint > , sr_return_quantity int > , sr_return_amt decimal(7,2) > , sr_return_tax decimal(7,2) > , sr_return_amt_inc_tax decimal(7,2) > , sr_fee decimal(7,2) > , sr_return_ship_cost decimal(7,2) > , sr_refunded_cash decimal(7,2) > , sr_reversed_charge decimal(7,2) > , sr_store_credit decimal(7,2) > , sr_net_loss decimal(7,2) > ) > partitioned by (sr_returned_date_sk bigint) > stored as orc; > > When I modify this code in the class OrcFilters, it runs successfully: > /** > * Get the PredicateLeaf.Type corresponding to the given DataType. > */ > def getPredicateLeafType(dataType: DataType): PredicateLeaf.Type = dataType match { > case BooleanType => PredicateLeaf.Type.BOOLEAN > case ByteType | ShortType | IntegerType | LongType => PredicateLeaf.Type.LONG > case FloatType | DoubleType => PredicateLeaf.Type.FLOAT > case StringType => PredicateLeaf.Type.STRING > case CharType(length) => PredicateLeaf.Type.STRING > case VarcharType(length) => PredicateLeaf.Type.STRING > case DateType => PredicateLeaf.Type.DATE > case TimestampType => PredicateLeaf.Type.TIMESTAMP > case _: DecimalType => PredicateLeaf.Type.DECIMAL > case _ => throw new UnsupportedOperationException(s"DataType: ${dataType.catalogString}") > } -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
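The mapping proposed in the comment above can be sketched standalone with simplified stand-in types (`DataTypeSketch`, `LeafType`, and all the case names below are hypothetical — Spark's real DataType hierarchy and ORC's PredicateLeaf.Type are much richer): char and varchar map to the same predicate leaf type as string, instead of falling through to the unsupported-type branch.

```scala
// Simplified stand-ins for Spark's DataType hierarchy and ORC's PredicateLeaf.Type.
sealed trait DataTypeSketch
case object StringTypeS extends DataTypeSketch
case class CharTypeS(length: Int) extends DataTypeSketch
case class VarcharTypeS(length: Int) extends DataTypeSketch
case object IntegerTypeS extends DataTypeSketch

object OrcFilterSketch {
  sealed trait LeafType
  case object STRING extends LeafType
  case object LONG extends LeafType

  // Core of the proposed fix: char/varchar take the same branch as string,
  // so filter pushdown no longer throws UnsupportedOperationException.
  def getPredicateLeafType(dt: DataTypeSketch): LeafType = dt match {
    case IntegerTypeS => LONG
    case StringTypeS | CharTypeS(_) | VarcharTypeS(_) => STRING
    case other => throw new UnsupportedOperationException(s"DataType: $other")
  }
}
```

This matches the report's intuition that char(n)/varchar(n) values are stored as strings in ORC, so a string predicate leaf is the natural target.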
[jira] [Resolved] (SPARK-38695) ORC cannot support data types such as char or varchar
[ https://issues.apache.org/jira/browse/SPARK-38695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kent Yao resolved SPARK-38695. -- Resolution: Duplicate This issue duplicates one that has already been resolved, so I will close it. Feel free to reopen it if the issue still exists after you upgrade to the corresponding releases or above. > ORC cannot support data types such as char or varchar > > > Key: SPARK-38695 > URL: https://issues.apache.org/jira/browse/SPARK-38695 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.2, 3.2.1 >Reporter: jacky >Priority: Major > > When testing Spark performance with TPC-DS and running SQL such as q1, I found this error: > java.lang.UnsupportedOperationException: DataType: char(2) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.getPredicateLeafType(OrcFilters.scala:150) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.getType$1(OrcFilters.scala:222) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:266) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:132) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:135) > at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245) > at scala.collection.immutable.List.foreach(List.scala:392) > at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245) > at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242) > at scala.collection.immutable.List.flatMap(List.scala:355) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:134) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:73) > at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189) > at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4$adapted(OrcFileFormat.scala > > I used the following SQL to create the tables: > create table customer > stored as orc > as select * from tpdc_text.customer > CLUSTER BY c_customer_sk > > create table store > stored as orc > as select * from tpdc_text.store > CLUSTER BY s_store_sk > > create table date_dim > stored as orc > as select * from tpdc_text.date_dim; > > create table store_returns > ( > sr_return_time_sk bigint > , sr_item_sk bigint > , sr_customer_sk bigint > , sr_cdemo_sk bigint > , sr_hdemo_sk bigint > , sr_addr_sk bigint > , sr_store_sk bigint > , sr_reason_sk bigint > , sr_ticket_number bigint > , sr_return_quantity int > , sr_return_amt decimal(7,2) > , sr_return_tax decimal(7,2) > , sr_return_amt_inc_tax decimal(7,2) > , sr_fee decimal(7,2) > , sr_return_ship_cost decimal(7,2) > , sr_refunded_cash decimal(7,2) > , sr_reversed_charge decimal(7,2) > , sr_store_credit decimal(7,2) > , sr_net_loss decimal(7,2) > ) > partitioned by (sr_returned_date_sk bigint) > stored as orc; > > When I modify this code in the class OrcFilters, it runs successfully: > /** > * Get the PredicateLeaf.Type corresponding to the given DataType. > */ > def getPredicateLeafType(dataType: DataType): PredicateLeaf.Type = dataType match { > case BooleanType => PredicateLeaf.Type.BOOLEAN > case ByteType | ShortType | IntegerType | LongType => PredicateLeaf.Type.LONG > case FloatType | DoubleType => PredicateLeaf.Type.FLOAT > case StringType => PredicateLeaf.Type.STRING > case CharType(length) => PredicateLeaf.Type.STRING > case VarcharType(length) => PredicateLeaf.Type.STRING > case DateType => PredicateLeaf.Type.DATE > case TimestampType => PredicateLeaf.Type.TIMESTAMP > case _: DecimalType => PredicateLeaf.Type.DECIMAL > case _ => throw new UnsupportedOperationException(s"DataType: ${dataType.catalogString}") > } -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38712) Fix perf regression in ScanOperation
Wenchen Fan created SPARK-38712: --- Summary: Fix perf regression in ScanOperation Key: SPARK-38712 URL: https://issues.apache.org/jira/browse/SPARK-38712 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.3.0 Reporter: Wenchen Fan -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38713) Change spark.sessionstate.conf.getConf/setConf operation to spark.conf.get/set
Jackey Lee created SPARK-38713: -- Summary: Change spark.sessionstate.conf.getConf/setConf operation to spark.conf.get/set Key: SPARK-38713 URL: https://issues.apache.org/jira/browse/SPARK-38713 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.4.0 Reporter: Jackey Lee In the sql module, we provide {{SparkSession.conf}} as a unified entry for {{SQLConf.set/get}}, which can prevent users or logic from modifying StaticSQLConf and Spark configs. However, I found that {{SparkSession.sessionstate.conf}} is used in some code to call getConf or setConf, which can skip the checks of {{RuntimeConfig}}. In this PR, we want to unify the behavior of {{SQLConf.getConf/setConf}} to {{SparkSession.conf}}. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
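The guard the ticket wants every caller routed through can be sketched like this (a hypothetical, simplified stand-in — `RuntimeConfigSketch` and the key names below are illustrative, not Spark's actual RuntimeConfig implementation): going through the session-level conf entry rejects writes to static configs, whereas poking the underlying conf object directly would silently allow them.

```scala
import scala.collection.mutable

// Hypothetical sketch of the check that SparkSession.conf (RuntimeConfig)
// provides and that direct SQLConf-style access bypasses.
class RuntimeConfigSketch(staticKeys: Set[String]) {
  private val entries = mutable.Map.empty[String, String]

  def set(key: String, value: String): Unit = {
    // The check that is skipped when code calls the session state's
    // conf directly: static configs must not change after session start.
    require(!staticKeys.contains(key),
      s"Cannot modify the value of a static config: $key")
    entries(key) = value
  }

  def get(key: String): Option[String] = entries.get(key)
}
```

Routing all getConf/setConf calls through the one guarded entry point is what makes the check enforceable.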
[jira] [Commented] (SPARK-37938) Use error classes in the parsing errors of partitions
[ https://issues.apache.org/jira/browse/SPARK-37938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515237#comment-17515237 ] leesf commented on SPARK-37938: --- [~maxgekk] I am not working on it yet, but would love to take the ticket and work on it. > Use error classes in the parsing errors of partitions > - > > Key: SPARK-37938 > URL: https://issues.apache.org/jira/browse/SPARK-37938 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Max Gekk >Priority: Major > > Migrate the following errors in QueryParsingErrors: > * emptyPartitionKeyError > * partitionTransformNotExpectedError > * descColumnForPartitionUnsupportedError > * incompletePartitionSpecificationError > onto use error classes. Throw an implementation of SparkThrowable. Also write > a test for every error in QueryParsingErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38695) ORC cannot support data types such as char or varchar
[ https://issues.apache.org/jira/browse/SPARK-38695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jacky updated SPARK-38695: -- Affects Version/s: (was: 3.2.1) > ORC cannot support data types such as char or varchar > > > Key: SPARK-38695 > URL: https://issues.apache.org/jira/browse/SPARK-38695 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.2 >Reporter: jacky >Priority: Major > > When testing Spark performance with TPC-DS and running SQL such as q1, I found this error: > java.lang.UnsupportedOperationException: DataType: char(2) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.getPredicateLeafType(OrcFilters.scala:150) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.getType$1(OrcFilters.scala:222) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:266) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:132) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:135) > at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245) > at scala.collection.immutable.List.foreach(List.scala:392) > at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245) > at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242) > at scala.collection.immutable.List.flatMap(List.scala:355) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:134) > at org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:73) > at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189) > at org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4$adapted(OrcFileFormat.scala > > I used the following SQL to create the tables: > create table customer > stored as orc > as select * from tpdc_text.customer > CLUSTER BY c_customer_sk > > create table store > stored as orc > as select * from tpdc_text.store > CLUSTER BY s_store_sk > > create table date_dim > stored as orc > as select * from tpdc_text.date_dim; > > create table store_returns > ( > sr_return_time_sk bigint > , sr_item_sk bigint > , sr_customer_sk bigint > , sr_cdemo_sk bigint > , sr_hdemo_sk bigint > , sr_addr_sk bigint > , sr_store_sk bigint > , sr_reason_sk bigint > , sr_ticket_number bigint > , sr_return_quantity int > , sr_return_amt decimal(7,2) > , sr_return_tax decimal(7,2) > , sr_return_amt_inc_tax decimal(7,2) > , sr_fee decimal(7,2) > , sr_return_ship_cost decimal(7,2) > , sr_refunded_cash decimal(7,2) > , sr_reversed_charge decimal(7,2) > , sr_store_credit decimal(7,2) > , sr_net_loss decimal(7,2) > ) > partitioned by (sr_returned_date_sk bigint) > stored as orc; > > When I modify this code in the class OrcFilters, it runs successfully: > /** > * Get the PredicateLeaf.Type corresponding to the given DataType. > */ > def getPredicateLeafType(dataType: DataType): PredicateLeaf.Type = dataType match { > case BooleanType => PredicateLeaf.Type.BOOLEAN > case ByteType | ShortType | IntegerType | LongType => PredicateLeaf.Type.LONG > case FloatType | DoubleType => PredicateLeaf.Type.FLOAT > case StringType => PredicateLeaf.Type.STRING > case CharType(length) => PredicateLeaf.Type.STRING > case VarcharType(length) => PredicateLeaf.Type.STRING > case DateType => PredicateLeaf.Type.DATE > case TimestampType => PredicateLeaf.Type.TIMESTAMP > case _: DecimalType => PredicateLeaf.Type.DECIMAL > case _ => throw new UnsupportedOperationException(s"DataType: ${dataType.catalogString}") > } -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38713) Change spark.sessionstate.conf.getConf/setConf operation to spark.conf.get/set
[ https://issues.apache.org/jira/browse/SPARK-38713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-38713. - Fix Version/s: 3.4.0 Assignee: Jackey Lee Resolution: Fixed > Change spark.sessionstate.conf.getConf/setConf operation to spark.conf.get/set > -- > > Key: SPARK-38713 > URL: https://issues.apache.org/jira/browse/SPARK-38713 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Jackey Lee >Assignee: Jackey Lee >Priority: Major > Fix For: 3.4.0 > > > In the sql module, we provide {{SparkSession.conf}} as a unified entry for > {{{}SQLConf.set/get{}}}, which can prevent users or logic from modifying > StaticSQLConf and Spark configs. However, I found > {{SparkSession.sessionstate.conf}} is used in some code to getConf or > setConf, which can skip the check of {{{}RuntimeConfig{}}}. > In this PR, we want to unify the behavior of {{SQLConf.getConf/setConf}} to > {{{}SparkSession.conf{}}}. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
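The value of routing everything through the unified entry point is that `RuntimeConfig` can reject writes to static configs, a check that direct `sessionState.conf.setConf` calls bypass. A minimal sketch of that guard, as illustrative Python rather than Spark's actual implementation (the class, the set of static keys, and the error message are my assumptions):

```python
# Illustrative sketch of a RuntimeConfig-style wrapper: reads and writes go
# through one entry point so that writes to static configs can be rejected.
class RuntimeConfig:
    # Hypothetical example of a static (session-immutable) config key.
    STATIC_KEYS = {"spark.sql.warehouse.dir"}

    def __init__(self):
        self._conf = {}

    def set(self, key, value):
        if key in self.STATIC_KEYS:
            raise ValueError(f"Cannot modify the value of a static config: {key}")
        self._conf[key] = value

    def get(self, key, default=None):
        return self._conf.get(key, default)
```

Calling `set` on a regular key succeeds, while a static key raises, which is exactly the check skipped when code goes straight to the underlying conf object.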
[jira] [Created] (SPARK-38714) Interval multiplication error
chong created SPARK-38714: - Summary: Interval multiplication error Key: SPARK-38714 URL: https://issues.apache.org/jira/browse/SPARK-38714 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.3.0 Environment: branch-3.3, Java 8 Reporter: chong
Codegen fails when multiplying an interval by a decimal:
$SPARK_HOME/bin/spark-shell
import org.apache.spark.sql.Row
import java.time.Duration
import java.time.Period
import org.apache.spark.sql.types._
val data = Seq(Row(new java.math.BigDecimal("123456789.11")))
val schema = StructType(Seq(
  StructField("c1", DecimalType(9, 2)),
))
val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
df.selectExpr("interval '100' second * c1").show(false)
The errors are:
java.lang.AssertionError: assertion failed: Decimal$DecimalIsFractional while compiling: during phase: globalPhase=terminal, enteringPhase=jvm library version: version 2.12.15 compiler version: version 2.12.15 reconstructed args: -classpath -Yrepl-class-based -Yrepl-outdir /tmp/spark-83a0cda4-dd0b-472e-ad8b-a4b33b85f613/repl-06489815-5366-4aa0-9419-f01abda8d041 last tree to typer: TypeTree(class Byte) tree position: line 6 of tree tpe: Byte symbol: (final abstract) class Byte in package scala symbol definition: final abstract class Byte extends (a ClassSymbol) symbol package: scala symbol owners: class Byte call site: constructor $eval in object $eval in package $line21 == Source file context for tree position == 3 4 object $eval { 5 lazy val $result = $line21.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.res0 6 lazy val $print: _root_.java.lang.String = { 7 $line21.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw 8 9 "" at scala.reflect.internal.SymbolTable.throwAssertionError(SymbolTable.scala:185) at scala.reflect.internal.Symbols$Symbol.completeInfo(Symbols.scala:1525) at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1514) at scala.reflect.internal.Symbols$Symbol.flatOwnerInfo(Symbols.scala:2353) at
scala.reflect.internal.Symbols$ClassSymbol.companionModule0(Symbols.scala:3346) at scala.reflect.internal.Symbols$ClassSymbol.companionModule(Symbols.scala:3348) at scala.reflect.internal.Symbols$ModuleClassSymbol.sourceModule(Symbols.scala:3487) at scala.reflect.internal.Symbols.$anonfun$forEachRelevantSymbols$1$adapted(Symbols.scala:3802) at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36) at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33) at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38) at scala.reflect.internal.Symbols.markFlagsCompleted(Symbols.scala:3799) at scala.reflect.internal.Symbols.markFlagsCompleted$(Symbols.scala:3805) at scala.reflect.internal.SymbolTable.markFlagsCompleted(SymbolTable.scala:28) at scala.reflect.internal.pickling.UnPickler$Scan.finishSym$1(UnPickler.scala:324) at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:342) at scala.reflect.internal.pickling.UnPickler$Scan.readSymbolRef(UnPickler.scala:645) at scala.reflect.internal.pickling.UnPickler$Scan.readType(UnPickler.scala:413) at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$readSymbol$10(UnPickler.scala:357) at scala.reflect.internal.pickling.UnPickler$Scan.at(UnPickler.scala:188) at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:357) at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$run$1(UnPickler.scala:96) at scala.reflect.internal.pickling.UnPickler$Scan.run(UnPickler.scala:88) at scala.reflect.internal.pickling.UnPickler.unpickle(UnPickler.scala:47) at scala.tools.nsc.symtab.classfile.ClassfileParser.unpickleOrParseInnerClasses(ClassfileParser.scala:1186) at scala.tools.nsc.symtab.classfile.ClassfileParser.parseClass(ClassfileParser.scala:468) at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$2(ClassfileParser.scala:161) at 
scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$1(ClassfileParser.scala:147) at scala.tools.nsc.symtab.classfile.ClassfileParser.parse(ClassfileParser.scala:130) at scala.tools.nsc.symtab.SymbolLoaders$ClassfileLoader.doComplete(SymbolLoaders.scala:343) at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:250) at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.load(SymbolLoaders.scala:269) at scala.reflect.internal.Symbols$Symbol.exists(Symbols.scala:1104) at scala.reflect.internal.Symbols$Symbol.toOption(Symbols.scala:2609) at scala.tools.nsc.interpreter.IMain.translateSimpleResource(IMain.scala:340) at scala.tools.nsc.interpreter.IMain$TranslatingClassLoader.findAbstractFile(IMain.scala:354) at scala.reflect.internal.util.AbstractFileClassLoader.findResource(AbstractFileClassLoader.scala:76) at java.lang.ClassLoader.getResource(ClassLoader.java:1089) at java.lang.Cl
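Setting the compiler crash aside, the semantics the failing query is after — an interval scaled by a decimal factor — are easy to state outside Spark. This is an illustrative Python sketch using plain `Decimal` arithmetic, not Spark's codegen:

```python
from decimal import Decimal
from datetime import timedelta

# interval '100' second * 123456789.11, computed with exact decimal math.
seconds = Decimal("100") * Decimal("123456789.11")
interval = timedelta(seconds=float(seconds))
```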
[jira] [Commented] (SPARK-38713) Change spark.sessionstate.conf.getConf/setConf operation to spark.conf.get/set
[ https://issues.apache.org/jira/browse/SPARK-38713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515247#comment-17515247 ] Wenchen Fan commented on SPARK-38713: - Resolved by https://github.com/apache/spark/pull/35950 > Change spark.sessionstate.conf.getConf/setConf operation to spark.conf.get/set > -- > > Key: SPARK-38713 > URL: https://issues.apache.org/jira/browse/SPARK-38713 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Jackey Lee >Priority: Major > > In the sql module, we provide {{SparkSession.conf}} as a unified entry for > {{{}SQLConf.set/get{}}}, which can prevent users or logic from modifying > StaticSQLConf and Spark configs. However, I found > {{SparkSession.sessionstate.conf}} is used in some code to getConf or > setConf, which can skip the check of {{{}RuntimeConfig{}}}. > In this PR, we want to unify the behavior of {{SQLConf.getConf/setConf}} to > {{{}SparkSession.conf{}}}. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38695) ORC does not support data types such as char or varchar
[ https://issues.apache.org/jira/browse/SPARK-38695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515248#comment-17515248 ] jacky commented on SPARK-38695: --- I see this error in spark 3.1.2 > ORC can not surport the dataType,such as char or varchar > > > Key: SPARK-38695 > URL: https://issues.apache.org/jira/browse/SPARK-38695 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.1.2 >Reporter: jacky >Priority: Major > > When testing Spark performance with TPCDS,run some sql,such as:q1,I found > this error > java.lang.UnsupportedOperationException: DataType: char(2) > at > org.apache.spark.sql.execution.datasources.orc.OrcFilters$.getPredicateLeafType(OrcFilters.scala:150) > at > org.apache.spark.sql.execution.datasources.orc.OrcFilters$.getType$1(OrcFilters.scala:222) > at > org.apache.spark.sql.execution.datasources.orc.OrcFilters$.buildLeafSearchArgument(OrcFilters.scala:266) > at > org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFiltersHelper$1(OrcFilters.scala:132) > at > org.apache.spark.sql.execution.datasources.orc.OrcFilters$.$anonfun$convertibleFilters$4(OrcFilters.scala:135) > at > scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245) > at scala.collection.immutable.List.foreach(List.scala:392) > at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245) > at > scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242) > at scala.collection.immutable.List.flatMap(List.scala:355) > at > org.apache.spark.sql.execution.datasources.orc.OrcFilters$.convertibleFilters(OrcFilters.scala:134) > at > org.apache.spark.sql.execution.datasources.orc.OrcFilters$.createFilter(OrcFilters.scala:73) > at > org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4(OrcFileFormat.scala:189) > at > 
org.apache.spark.sql.execution.datasources.orc.OrcFileFormat.$anonfun$buildReaderWithPartitionValues$4$adapted(OrcFileFormat.scala > > I used the sql to create table,such as > create table customer > stored as orc > as select * from tpdc_text.customer > CLUSTER BY c_customer_sk > > create table store > stored as orc > as select * from tpdc_text.store > CLUSTER BY s_store_sk > > create table date_dim > stored as orc > as select * from tpdc_text.date_dim; > > create table store_returns > ( > sr_return_time_sk bigint > , sr_item_sk bigint > , sr_customer_sk bigint > , sr_cdemo_sk bigint > , sr_hdemo_sk bigint > , sr_addr_sk bigint > , sr_store_sk bigint > , sr_reason_sk bigint > , sr_ticket_number bigint > , sr_return_quantity int > , sr_return_amt decimal(7,2) > , sr_return_tax decimal(7,2) > , sr_return_amt_inc_tax decimal(7,2) > , sr_fee decimal(7,2) > , sr_return_ship_cost decimal(7,2) > , sr_refunded_cash decimal(7,2) > , sr_reversed_charge decimal(7,2) > , sr_store_credit decimal(7,2) > , sr_net_loss decimal(7,2) > ) > partitioned by (sr_returned_date_sk bigint) > stored as orc; > > when I modify this code in the classOrcFilters,I can run succeed > /** > * Get PredicateLeafType which is corresponding to the given DataType. 
> */ > def getPredicateLeafType(dataType: DataType): PredicateLeaf.Type = dataType > match { > case BooleanType => PredicateLeaf.Type.BOOLEAN > case ByteType | ShortType | IntegerType | LongType => PredicateLeaf.Type.LONG > case FloatType | DoubleType => PredicateLeaf.Type.FLOAT > case StringType => PredicateLeaf.Type.STRING > {color:#ff}case CharType(length) => PredicateLeaf.Type.STRING{color} > {color:#ff} case VarcharType(length) => PredicateLeaf.Type.STRING{color} > case DateType => PredicateLeaf.Type.DATE > case TimestampType => PredicateLeaf.Type.TIMESTAMP > case _: DecimalType => PredicateLeaf.Type.DECIMAL > case _ => throw new UnsupportedOperationException(s"DataType: > ${dataType.catalogString}") > } -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
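The type mapping in the proposed fix can be sketched outside Spark as a small lookup. This is an illustrative Python reimplementation, not Spark's or ORC's actual API (the dict and function names are mine); the point is that char(n)/varchar(n) can share the STRING predicate-leaf type instead of raising:

```python
# Illustrative sketch of the ORC predicate-leaf type mapping discussed above.
PREDICATE_LEAF_TYPE = {
    "boolean": "BOOLEAN",
    "byte": "LONG", "short": "LONG", "int": "LONG", "long": "LONG",
    "float": "FLOAT", "double": "FLOAT",
    "string": "STRING",
    # The proposed fix: treat char(n)/varchar(n) as STRING instead of raising.
    "char": "STRING", "varchar": "STRING",
    "date": "DATE", "timestamp": "TIMESTAMP", "decimal": "DECIMAL",
}

def leaf_type(catalog_type: str) -> str:
    # Strip a length/precision suffix such as "char(2)" or "decimal(7,2)".
    base = catalog_type.split("(", 1)[0]
    try:
        return PREDICATE_LEAF_TYPE[base]
    except KeyError:
        raise NotImplementedError(f"DataType: {catalog_type}")
```

With the two extra entries, `leaf_type("char(2)")` resolves instead of reproducing the `UnsupportedOperationException: DataType: char(2)` failure mode.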
[jira] [Updated] (SPARK-38714) Interval multiplication error
[ https://issues.apache.org/jira/browse/SPARK-38714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chong updated SPARK-38714: -- Description:
Codegen fails when multiplying an interval by a decimal:
$SPARK_HOME/bin/spark-shell
import org.apache.spark.sql.Row
import java.time.Duration
import java.time.Period
import org.apache.spark.sql.types._
val data = Seq(Row(new java.math.BigDecimal("123456789.11")))
val schema = StructType(Seq(
  StructField("c1", DecimalType(9, 2)),
))
val df = spark.createDataFrame(spark.sparkContext.parallelize(data), schema)
df.selectExpr("interval '100' second * c1").show(false)
The errors are:
*{color:#FF}java.lang.AssertionError: assertion failed:{color}* Decimal$DecimalIsFractional while compiling: during phase: globalPhase=terminal, enteringPhase=jvm library version: version 2.12.15 compiler version: version 2.12.15 reconstructed args: -classpath -Yrepl-class-based -Yrepl-outdir /tmp/spark-83a0cda4-dd0b-472e-ad8b-a4b33b85f613/repl-06489815-5366-4aa0-9419-f01abda8d041 last tree to typer: TypeTree(class Byte) tree position: line 6 of tree tpe: Byte symbol: (final abstract) class Byte in package scala symbol definition: final abstract class Byte extends (a ClassSymbol) symbol package: scala symbol owners: class Byte call site: constructor $eval in object $eval in package $line21 == Source file context for tree position == 3 4 object $eval { 5 lazy val $result = $line21.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.res0 6 lazy val $print: {_}root{_}.java.lang.String = { 7 $line21.$read.INSTANCE.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw.$iw 8 9 "" at scala.reflect.internal.SymbolTable.throwAssertionError(SymbolTable.scala:185) at scala.reflect.internal.Symbols$Symbol.completeInfo(Symbols.scala:1525) at scala.reflect.internal.Symbols$Symbol.info(Symbols.scala:1514) at scala.reflect.internal.Symbols$Symbol.flatOwnerInfo(Symbols.scala:2353) at scala.reflect.internal.Symbols$ClassSymbol.companionModule0(Symbols.scala:3346)
at scala.reflect.internal.Symbols$ClassSymbol.companionModule(Symbols.scala:3348) at scala.reflect.internal.Symbols$ModuleClassSymbol.sourceModule(Symbols.scala:3487) at scala.reflect.internal.Symbols.$anonfun$forEachRelevantSymbols$1$adapted(Symbols.scala:3802) at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36) at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33) at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38) at scala.reflect.internal.Symbols.markFlagsCompleted(Symbols.scala:3799) at scala.reflect.internal.Symbols.markFlagsCompleted$(Symbols.scala:3805) at scala.reflect.internal.SymbolTable.markFlagsCompleted(SymbolTable.scala:28) at scala.reflect.internal.pickling.UnPickler$Scan.finishSym$1(UnPickler.scala:324) at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:342) at scala.reflect.internal.pickling.UnPickler$Scan.readSymbolRef(UnPickler.scala:645) at scala.reflect.internal.pickling.UnPickler$Scan.readType(UnPickler.scala:413) at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$readSymbol$10(UnPickler.scala:357) at scala.reflect.internal.pickling.UnPickler$Scan.at(UnPickler.scala:188) at scala.reflect.internal.pickling.UnPickler$Scan.readSymbol(UnPickler.scala:357) at scala.reflect.internal.pickling.UnPickler$Scan.$anonfun$run$1(UnPickler.scala:96) at scala.reflect.internal.pickling.UnPickler$Scan.run(UnPickler.scala:88) at scala.reflect.internal.pickling.UnPickler.unpickle(UnPickler.scala:47) at scala.tools.nsc.symtab.classfile.ClassfileParser.unpickleOrParseInnerClasses(ClassfileParser.scala:1186) at scala.tools.nsc.symtab.classfile.ClassfileParser.parseClass(ClassfileParser.scala:468) at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$2(ClassfileParser.scala:161) at scala.tools.nsc.symtab.classfile.ClassfileParser.$anonfun$parse$1(ClassfileParser.scala:147) at 
scala.tools.nsc.symtab.classfile.ClassfileParser.parse(ClassfileParser.scala:130) at scala.tools.nsc.symtab.SymbolLoaders$ClassfileLoader.doComplete(SymbolLoaders.scala:343) at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.complete(SymbolLoaders.scala:250) at scala.tools.nsc.symtab.SymbolLoaders$SymbolLoader.load(SymbolLoaders.scala:269) at scala.reflect.internal.Symbols$Symbol.exists(Symbols.scala:1104) at scala.reflect.internal.Symbols$Symbol.toOption(Symbols.scala:2609) at scala.tools.nsc.interpreter.IMain.translateSimpleResource(IMain.scala:340) at scala.tools.nsc.interpreter.IMain$TranslatingClassLoader.findAbstractFile(IMain.scala:354) at scala.reflect.internal.util.AbstractFileClassLoader.findResource(AbstractFileClassLoader.scala:76) at java.lang.ClassLoader.getResource(ClassLoader.java:1089) at java.lang.ClassLoader.getResourceAsStream(ClassLoader.java:1300) at scala.reflect.internal.util.RichClassLoader$.classAsStream$extension(ScalaClassLoader.scala:89) at scala.reflect
[jira] [Commented] (SPARK-38713) Change spark.sessionstate.conf.getConf/setConf operation to spark.conf.get/set
[ https://issues.apache.org/jira/browse/SPARK-38713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515275#comment-17515275 ] Apache Spark commented on SPARK-38713: -- User 'jackylee-ch' has created a pull request for this issue: https://github.com/apache/spark/pull/35950 > Change spark.sessionstate.conf.getConf/setConf operation to spark.conf.get/set > -- > > Key: SPARK-38713 > URL: https://issues.apache.org/jira/browse/SPARK-38713 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.4.0 >Reporter: Jackey Lee >Assignee: Jackey Lee >Priority: Major > Fix For: 3.4.0 > > > In the sql module, we provide {{SparkSession.conf}} as a unified entry for > {{{}SQLConf.set/get{}}}, which can prevent users or logic from modifying > StaticSQLConf and Spark configs. However, I found > {{SparkSession.sessionstate.conf}} is used in some code to getConf or > setConf, which can skip the check of {{{}RuntimeConfig{}}}. > In this PR, we want to unify the behavior of {{SQLConf.getConf/setConf}} to > {{{}SparkSession.conf{}}}. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38712) Fix perf regression in ScanOperation
[ https://issues.apache.org/jira/browse/SPARK-38712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515276#comment-17515276 ] Apache Spark commented on SPARK-38712: -- User 'cloud-fan' has created a pull request for this issue: https://github.com/apache/spark/pull/36024 > Fix perf regression in ScanOperation > > > Key: SPARK-38712 > URL: https://issues.apache.org/jira/browse/SPARK-38712 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Wenchen Fan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38712) Fix perf regression in ScanOperation
[ https://issues.apache.org/jira/browse/SPARK-38712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38712: Assignee: Apache Spark > Fix perf regression in ScanOperation > > > Key: SPARK-38712 > URL: https://issues.apache.org/jira/browse/SPARK-38712 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Wenchen Fan >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38712) Fix perf regression in ScanOperation
[ https://issues.apache.org/jira/browse/SPARK-38712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38712: Assignee: (was: Apache Spark) > Fix perf regression in ScanOperation > > > Key: SPARK-38712 > URL: https://issues.apache.org/jira/browse/SPARK-38712 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Wenchen Fan >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38702) Is there a spark (spark-sql_2.12) using log4j 2.17.X ?
[ https://issues.apache.org/jira/browse/SPARK-38702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515305#comment-17515305 ] Itay Erlichman commented on SPARK-38702: Hi, thanks. Do you have a version with log4j 2.17 that is compatible with 3.1.0? Can you also tell us how we should set the repositories in the POM so it picks up snapshots? For some reason I couldn't pull this dependency. Thanks, Itay
> Is there a spark (spark-sql_2.12) using log4j 2.17.X ?
> ---
>
> Key: SPARK-38702
> URL: https://issues.apache.org/jira/browse/SPARK-38702
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.1.0
> Reporter: Itay Erlichman
> Priority: Major
>
> Hello, our system is alerting that our Spark dependency brings in log4j 1.2.17, and I could not find any updated release using log4j 2.17 or above.
> The dependency I'm using is:
> org.apache.spark
> spark-sql_2.12
> 3.1.0
> Is there a plan to fix this? Any way to avoid it?
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38333) DPP causes DataSourceScanExec java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/SPARK-38333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-38333: --- Assignee: jiahong.li
> DPP causes DataSourceScanExec java.lang.NullPointerException
> ---
>
> Key: SPARK-38333
> URL: https://issues.apache.org/jira/browse/SPARK-38333
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: jiahong.li
> Assignee: jiahong.li
> Priority: Major
>
> In DPP, we trigger an NPE, like below:
> Caused by: java.lang.NullPointerException
> at org.apache.spark.sql.execution.DataSourceScanExec.$init$(DataSourceScanExec.scala:57)
> at org.apache.spark.sql.execution.FileSourceScanExec.(DataSourceScanExec.scala:172)
> ...
> at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:56)
> at org.apache.spark.sql.catalyst.expressions.Predicate$.create(predicates.scala:101)
> at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$2(basicPhysicalOperators.scala:246)
> at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$2$adapted(basicPhysicalOperators.scala:245)
> at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:885)
>
> The root cause is the addExprTree function in EquivalentExpressions:
> ```
> def addExprTree(
>     expr: Expression,
>     addFunc: Expression => Boolean = addExpr): Unit = {
>   val skip = expr.isInstanceOf[LeafExpression] ||
>     // `LambdaVariable` is usually used as a loop variable, which can't be evaluated ahead of the
>     // loop. So we can't evaluate sub-expressions containing `LambdaVariable` at the beginning.
>     expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
>     // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
>     // can cause error like NPE.
>     (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
>   if (!skip && !addFunc(expr)) {
>     childrenToRecurse(expr).foreach(addExprTree(_, addFunc))
>     commonChildrenToRecurse(expr).filter(_.nonEmpty).foreach(addCommonExprs(_, addFunc))
> ```
> Maybe we should change it like this:
> ```
> (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)
> ```
> because in DPP the filter expression looks like this:
> DynamicPruningExpression(InSubqueryExec(value, broadcastValues, exprId))
> So we should iterate over the children; if a PlanExpression such as InSubqueryExec is found, we should skip addExprTree, and then the NPE will not appear.
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38333) DPP causes DataSourceScanExec java.lang.NullPointerException
[ https://issues.apache.org/jira/browse/SPARK-38333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-38333. - Fix Version/s: 3.3.0 3.2.2 3.1.3 Resolution: Fixed Issue resolved by pull request 36012 [https://github.com/apache/spark/pull/36012]
> DPP causes DataSourceScanExec java.lang.NullPointerException
> ---
>
> Key: SPARK-38333
> URL: https://issues.apache.org/jira/browse/SPARK-38333
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.1.2
> Reporter: jiahong.li
> Assignee: jiahong.li
> Priority: Major
> Fix For: 3.3.0, 3.2.2, 3.1.3
>
> In DPP, we trigger an NPE, like below:
> Caused by: java.lang.NullPointerException
> at org.apache.spark.sql.execution.DataSourceScanExec.$init$(DataSourceScanExec.scala:57)
> at org.apache.spark.sql.execution.FileSourceScanExec.(DataSourceScanExec.scala:172)
> ...
> at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:56)
> at org.apache.spark.sql.catalyst.expressions.Predicate$.create(predicates.scala:101)
> at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$2(basicPhysicalOperators.scala:246)
> at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$2$adapted(basicPhysicalOperators.scala:245)
> at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:885)
>
> The root cause is the addExprTree function in EquivalentExpressions:
> ```
> def addExprTree(
>     expr: Expression,
>     addFunc: Expression => Boolean = addExpr): Unit = {
>   val skip = expr.isInstanceOf[LeafExpression] ||
>     // `LambdaVariable` is usually used as a loop variable, which can't be evaluated ahead of the
>     // loop. So we can't evaluate sub-expressions containing `LambdaVariable` at the beginning.
>     expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
>     // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
>     // can cause error like NPE.
>     (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
>   if (!skip && !addFunc(expr)) {
>     childrenToRecurse(expr).foreach(addExprTree(_, addFunc))
>     commonChildrenToRecurse(expr).filter(_.nonEmpty).foreach(addCommonExprs(_, addFunc))
> ```
> Maybe we should change it like this:
> ```
> (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)
> ```
> because in DPP the filter expression looks like this:
> DynamicPruningExpression(InSubqueryExec(value, broadcastValues, exprId))
> So we should iterate over the children; if a PlanExpression such as InSubqueryExec is found, we should skip addExprTree, and then the NPE will not appear.
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
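The difference between checking `isInstanceOf` on the root expression and calling `find` over the whole tree can be shown with a toy expression tree. This is an illustrative Python sketch, not Spark's `Expression` API; the class names only stand in for the Catalyst types mentioned above:

```python
# Toy expression tree: isinstance on the root misses a nested PlanExpression,
# while a pre-order `find` over the whole tree catches it.
class Expr:
    def __init__(self, *children):
        self.children = list(children)

    def find(self, pred):
        # Pre-order search, mirroring Expression.find in spirit.
        if pred(self):
            return self
        for child in self.children:
            hit = child.find(pred)
            if hit is not None:
                return hit
        return None

class PlanExpr(Expr): pass          # stands in for PlanExpression/InSubqueryExec
class DynamicPruning(Expr): pass    # stands in for DynamicPruningExpression

tree = DynamicPruning(PlanExpr())
root_only = isinstance(tree, PlanExpr)                            # misses it
deep = tree.find(lambda e: isinstance(e, PlanExpr)) is not None   # finds it
```

This is the shape of the proposed fix: because DPP wraps the subquery in `DynamicPruningExpression`, only the deep search detects the nested plan expression and triggers the skip.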
[jira] [Assigned] (SPARK-37831) Add task partition id in metrics
[ https://issues.apache.org/jira/browse/SPARK-37831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-37831: --- Assignee: Jackey Lee
> Add task partition id in metrics
>
>
> Key: SPARK-37831
> URL: https://issues.apache.org/jira/browse/SPARK-37831
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.3.0
> Reporter: Jackey Lee
> Assignee: Jackey Lee
> Priority: Major
>
> There is no partition id in the current metrics, which makes it difficult to trace stage metrics, such as stage shuffle read, especially when there are stage retries. It is also impossible to compare task metrics between different applications.
> {code:java}
> class TaskData private[spark](
>     val taskId: Long,
>     val index: Int,
>     val attempt: Int,
>     val launchTime: Date,
>     val resultFetchStart: Option[Date],
>     @JsonDeserialize(contentAs = classOf[JLong])
>     val duration: Option[Long],
>     val executorId: String,
>     val host: String,
>     val status: String,
>     val taskLocality: String,
>     val speculative: Boolean,
>     val accumulatorUpdates: Seq[AccumulableInfo],
>     val errorMessage: Option[String] = None,
>     val taskMetrics: Option[TaskMetrics] = None,
>     val executorLogs: Map[String, String],
>     val schedulerDelay: Long,
>     val gettingResultTime: Long) {code}
> Adding partitionId to TaskData not only makes it easy to trace task metrics, but also makes it possible to collect metrics for the actual stage outputs, especially when a stage retries.
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-37831) Add task partition id in metrics
[ https://issues.apache.org/jira/browse/SPARK-37831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-37831. - Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 35185 [https://github.com/apache/spark/pull/35185]
> Add task partition id in metrics
>
>
> Key: SPARK-37831
> URL: https://issues.apache.org/jira/browse/SPARK-37831
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.3.0
> Reporter: Jackey Lee
> Assignee: Jackey Lee
> Priority: Major
> Fix For: 3.3.0
>
> There is no partition id in the current metrics, which makes it difficult to trace stage metrics, such as stage shuffle read, especially when there are stage retries. It is also impossible to compare task metrics between different applications.
> {code:java}
> class TaskData private[spark](
>     val taskId: Long,
>     val index: Int,
>     val attempt: Int,
>     val launchTime: Date,
>     val resultFetchStart: Option[Date],
>     @JsonDeserialize(contentAs = classOf[JLong])
>     val duration: Option[Long],
>     val executorId: String,
>     val host: String,
>     val status: String,
>     val taskLocality: String,
>     val speculative: Boolean,
>     val accumulatorUpdates: Seq[AccumulableInfo],
>     val errorMessage: Option[String] = None,
>     val taskMetrics: Option[TaskMetrics] = None,
>     val executorLogs: Map[String, String],
>     val schedulerDelay: Long,
>     val gettingResultTime: Long) {code}
> Adding partitionId to TaskData not only makes it easy to trace task metrics, but also makes it possible to collect metrics for the actual stage outputs, especially when a stage retries.
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38679) Expose the number of partitions in a stage to TaskContext
[ https://issues.apache.org/jira/browse/SPARK-38679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-38679: --- Assignee: Venki Korukanti
> Expose the number of partitions in a stage to TaskContext
> --
>
> Key: SPARK-38679
> URL: https://issues.apache.org/jira/browse/SPARK-38679
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.2.1
> Reporter: Venki Korukanti
> Assignee: Venki Korukanti
> Priority: Major
>
> Add a new API that exposes in TaskContext the total partition count of the stage the task belongs to, so that the task knows what fraction of the computation it is doing.
> With this extra information, users can also generate 32-bit unique int ids as below, rather than using `monotonically_increasing_id`, which generates 64-bit long ids.
>
> {code:java}
> rdd.mapPartitions { rowsIter =>
>   val partitionId = TaskContext.get().partitionId()
>   val numPartitions = TaskContext.get().numPartitions()
>   var i = 0
>   rowsIter.map { row =>
>     val rowId = partitionId + i * numPartitions
>     i += 1
>     (rowId, row)
>   }
> }{code}
>
-- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38679) Expose the number partitions in a stage to TaskContext
[ https://issues.apache.org/jira/browse/SPARK-38679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-38679. - Fix Version/s: 3.4.0 Resolution: Fixed Issue resolved by pull request 35995 [https://github.com/apache/spark/pull/35995] > Expose the number partitions in a stage to TaskContext > -- > > Key: SPARK-38679 > URL: https://issues.apache.org/jira/browse/SPARK-38679 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 3.2.1 >Reporter: Venki Korukanti >Assignee: Venki Korukanti >Priority: Major > Fix For: 3.4.0 > > > Add a new API to TaskContext that exposes the total partition count of the > stage the task belongs to, so that the task knows what fraction of the computation > it is doing. > With this extra information, users can also generate 32-bit unique int ids as > below rather than using `monotonically_increasing_id`, which generates 64-bit > long ids. > > {code:java} > rdd.mapPartitions { rowsIter => > val partitionId = TaskContext.get().partitionId() > val numPartitions = TaskContext.get().numPartitions() > var i = 0 > rowsIter.map { row => > val rowId = partitionId + i * numPartitions > i += 1 > (rowId, row) > } > }{code} > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
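The id scheme quoted in the issue works because `partitionId` ranges over `0 until numPartitions` while `i * numPartitions` strides past every other partition's range, so no two (partition, row-index) pairs collide. A quick Spark-free sketch of the same arithmetic:

```scala
// Spark-free check of the rowId scheme from the issue:
// rowId = partitionId + i * numPartitions assigns each (partition, row) pair
// a distinct int, and the ids are dense in [0, totalRows).
val numPartitions = 4
val rowsPerPartition = 5

val ids = for {
  partitionId <- 0 until numPartitions
  i <- 0 until rowsPerPartition
} yield partitionId + i * numPartitions

println(ids.distinct.size == ids.size)  // true: no collisions
println((ids.min, ids.max))             // (0,19): dense range
```

This is exactly why `numPartitions` must be exposed to the task: without it, each partition cannot compute a collision-free stride on its own.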
[jira] [Created] (SPARK-38715) Would be nice to be able to configure a client ID pattern in Kafka integration
Cédric Chantepie created SPARK-38715: Summary: Would be nice to be able to configure a client ID pattern in Kafka integration Key: SPARK-38715 URL: https://issues.apache.org/jira/browse/SPARK-38715 Project: Spark Issue Type: Bug Components: Structured Streaming Affects Versions: 3.0.0 Reporter: Cédric Chantepie By default the Kafka client automatically generates a unique client ID. The client ID is used by many data lineage tools to track consumers/producers (for a consumer the consumer group is also used, but for a producer only the client ID is available). Setting [client.id](https://kafka.apache.org/documentation/#producerconfigs_client.id) in the options passed to a Spark Kafka read or write is not possible, as it would force the same client.id on at least both the driver and the executors. What could be done is to allow passing a Spark-specific option, maybe named `clientIdPrefix`. e.g. ```scala val df = spark .read .format("kafka") .option("kafka.bootstrap.servers", "host1:port1,host2:port2") .option("subscribePattern", "topic.*") .option("startingOffsets", "earliest") .option("endingOffsets", "latest") .option("clientIdPrefix", "my-workflow-") .load() ``` A possible implementation would be to update [InternalKafkaProducerPool](https://github.com/apache/spark/blob/master/connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/producer/InternalKafkaProducerPool.scala#L75), or maybe Spark's `KafkaConfigUpdater`? -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
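A rough sketch of how such a prefix option could be resolved (the option name `clientIdPrefix` and the helper below are hypothetical, not existing Spark API): each JVM derives its own unique `client.id` from the shared prefix, so lineage tools still see the configured prefix while Kafka still gets distinct client ids per driver/executor.

```scala
import java.util.UUID

// Hypothetical helper: turn a user-supplied clientIdPrefix option into a
// per-JVM-unique Kafka client.id, leaving all other options untouched.
def resolveClientId(params: Map[String, String]): Map[String, String] =
  params.get("clientIdPrefix") match {
    case Some(prefix) =>
      (params - "clientIdPrefix") + ("client.id" -> s"$prefix${UUID.randomUUID()}")
    case None => params
  }

val opts = Map(
  "bootstrap.servers" -> "host1:port1,host2:port2",
  "clientIdPrefix"    -> "my-workflow-")
val resolved = resolveClientId(opts)
println(resolved("client.id").startsWith("my-workflow-"))  // true
```

Something of this shape could run once per Kafka client creation site (e.g. inside a config-updating step), which is what keeps the prefix shared but the full id unique.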
[jira] [Created] (SPARK-38716) Provide query context in runtime error of map key not exists
Gengliang Wang created SPARK-38716: -- Summary: Provide query context in runtime error of map key not exists Key: SPARK-38716 URL: https://issues.apache.org/jira/browse/SPARK-38716 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.3.0 Reporter: Gengliang Wang Assignee: Gengliang Wang -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38716) Provide query context in runtime error of map key not exists
[ https://issues.apache.org/jira/browse/SPARK-38716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515348#comment-17515348 ] Apache Spark commented on SPARK-38716: -- User 'gengliangwang' has created a pull request for this issue: https://github.com/apache/spark/pull/36025 > Provide query context in runtime error of map key not exists > > > Key: SPARK-38716 > URL: https://issues.apache.org/jira/browse/SPARK-38716 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38716) Provide query context in runtime error of map key not exists
[ https://issues.apache.org/jira/browse/SPARK-38716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38716: Assignee: Gengliang Wang (was: Apache Spark) > Provide query context in runtime error of map key not exists > > > Key: SPARK-38716 > URL: https://issues.apache.org/jira/browse/SPARK-38716 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38716) Provide query context in runtime error of map key not exists
[ https://issues.apache.org/jira/browse/SPARK-38716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38716: Assignee: Apache Spark (was: Gengliang Wang) > Provide query context in runtime error of map key not exists > > > Key: SPARK-38716 > URL: https://issues.apache.org/jira/browse/SPARK-38716 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-38710) use SparkArithmeticException for Arithmetic overflow runtime errors
[ https://issues.apache.org/jira/browse/SPARK-38710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang resolved SPARK-38710. Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 36022 [https://github.com/apache/spark/pull/36022] > use SparkArithmeticException for Arithmetic overflow runtime errors > --- > > Key: SPARK-38710 > URL: https://issues.apache.org/jira/browse/SPARK-38710 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.3.0 >Reporter: Gengliang Wang >Assignee: Gengliang Wang >Priority: Minor > Fix For: 3.3.0 > > > use SparkArithmeticException in Arithmetic overflow runtime errors, instead > of > java.lang.ArithmeticException -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38588) Validate input dataset of ml.classification
[ https://issues.apache.org/jira/browse/SPARK-38588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515418#comment-17515418 ] Apache Spark commented on SPARK-38588: -- User 'jackylee-ch' has created a pull request for this issue: https://github.com/apache/spark/pull/36026 > Validate input dataset of ml.classification > --- > > Key: SPARK-38588 > URL: https://issues.apache.org/jira/browse/SPARK-38588 > Project: Spark > Issue Type: Sub-task > Components: ML >Affects Versions: 3.4.0 >Reporter: zhengruifeng >Priority: Major > Fix For: 3.4.0 > > > LinearSVC should fail fast if the input dataset contains invalid values. > > {code:java} > import org.apache.spark.ml.feature._ > import org.apache.spark.ml.linalg._ > import org.apache.spark.ml.classification._ > import org.apache.spark.ml.clustering._ > val df = sc.parallelize(Seq(LabeledPoint(1.0, Vectors.dense(1.0, > Double.NaN)), LabeledPoint(0.0, Vectors.dense(Double.PositiveInfinity, > 2.0)))).toDF() > val svc = new LinearSVC() > val model = svc.fit(df) > scala> model.intercept > res0: Double = NaN > scala> model.coefficients > res1: org.apache.spark.ml.linalg.Vector = [NaN,NaN] {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
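The fail-fast check the ticket asks for boils down to rejecting feature vectors containing NaN or infinity before fitting. A plain-Scala sketch of that validation (the helper name is made up; Spark's actual fix may hook this in differently):

```scala
// Sketch of the validation a classifier could run before fitting:
// reject any feature vector containing NaN or +/-Infinity up front,
// instead of silently producing NaN coefficients as in the issue.
def validateFeatures(rows: Seq[Array[Double]]): Unit =
  rows.zipWithIndex.foreach { case (vec, idx) =>
    require(vec.forall(v => !v.isNaN && !v.isInfinite),
      s"Row $idx contains NaN or Infinity: ${vec.mkString("[", ",", "]")}")
  }

// The same invalid data as in the issue's reproduction:
val bad = Seq(Array(1.0, Double.NaN), Array(Double.PositiveInfinity, 2.0))
val failedFast =
  try { validateFeatures(bad); false }
  catch { case _: IllegalArgumentException => true }
println(failedFast)  // true
```

Failing at validation time points the user at the offending row, whereas the current behaviour only surfaces as NaN in the fitted model.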
[jira] [Created] (SPARK-38717) Handle Hive's bucket spec case preserving behaviour
Peter Toth created SPARK-38717: -- Summary: Handle Hive's bucket spec case preserving behaviour Key: SPARK-38717 URL: https://issues.apache.org/jira/browse/SPARK-38717 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 3.3.0 Reporter: Peter Toth {code} CREATE TABLE t( c STRING, B_C STRING ) PARTITIONED BY (p_c STRING) CLUSTERED BY (B_C) INTO 4 BUCKETS STORED AS PARQUET {code} then {code} SELECT * FROM t {code} fails with: {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Bucket columns B_C is not part of the table columns ([FieldSchema(name:c, type:string, comment:null), FieldSchema(name:b_c, type:string, comment:null)] at org.apache.hadoop.hive.ql.metadata.Table.setBucketCols(Table.java:552) at org.apache.spark.sql.hive.client.HiveClientImpl$.toHiveTable(HiveClientImpl.scala:1098) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getPartitionsByFilter$1(HiveClientImpl.scala:764) at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:294) at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:225) at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:224) at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:274) at org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:763) at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$listPartitionsByFilter$1(HiveExternalCatalog.scala:1287) at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:101) ... 110 more {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
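The failure above comes from comparing the upper-case bucket column `B_C` from the DDL against the column names Hive stores lower-cased. The kind of case-insensitive resolution a fix needs can be sketched without Hive (the variable names here are illustrative):

```scala
// Hive lower-cases column names in its metastore, while Spark preserves the
// case written in the DDL. Matching bucket columns against the Hive schema
// must therefore be case-insensitive; a plain `contains` check fails.
val hiveColumns = Seq("c", "b_c")   // field names as stored by Hive
val bucketCols  = Seq("B_C")        // bucket spec as written in the DDL

val naive = bucketCols.forall(hiveColumns.contains)  // false: the reported bug
val caseInsensitive = bucketCols.forall(c =>
  hiveColumns.exists(_.equalsIgnoreCase(c)))         // true: tolerant lookup
println((naive, caseInsensitive))  // (false,true)
```

Equivalently, the bucket spec could be normalized to the Hive-stored casing before `Table.setBucketCols` is called, which is where the exception in the stack trace originates.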
[jira] [Created] (SPARK-38718) Test the error class: AMBIGUOUS_FIELD_NAME
Max Gekk created SPARK-38718: Summary: Test the error class: AMBIGUOUS_FIELD_NAME Key: SPARK-38718 URL: https://issues.apache.org/jira/browse/SPARK-38718 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class AMBIGUOUS_FIELD_NAME to QueryCompilationErrorsSuite. The test should cover the exception thrown in QueryCompilationErrors: {code:scala} def ambiguousFieldNameError( fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { new AnalysisException( errorClass = "AMBIGUOUS_FIELD_NAME", messageParameters = Array(fieldName.quoted, numMatches.toString), origin = context) } {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38718) Test the error class: AMBIGUOUS_FIELD_NAME
[ https://issues.apache.org/jira/browse/SPARK-38718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38718: - Labels: starter (was: ) > Test the error class: AMBIGUOUS_FIELD_NAME > -- > > Key: SPARK-38718 > URL: https://issues.apache.org/jira/browse/SPARK-38718 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class AMBIGUOUS_FIELD_NAME to > QueryCompilationErrorsSuite. The test should cover the exception throw in > QueryCompilationErrors: > {code:scala} > def ambiguousFieldNameError( > fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { > new AnalysisException( > errorClass = "AMBIGUOUS_FIELD_NAME", > messageParameters = Array(fieldName.quoted, numMatches.toString), > origin = context) > } > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38718) Test the error class: AMBIGUOUS_FIELD_NAME
[ https://issues.apache.org/jira/browse/SPARK-38718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38718: - Description: Add at least one test for the error class AMBIGUOUS_FIELD_NAME to QueryCompilationErrorsSuite. The test should cover the exception throw in QueryCompilationErrors: {code:scala} def ambiguousFieldNameError( fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { new AnalysisException( errorClass = "AMBIGUOUS_FIELD_NAME", messageParameters = Array(fieldName.quoted, numMatches.toString), origin = context) } {code} For example, here is a test for the error class UNSUPPORTED_FEATURE: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 was: Add at least one test for the error class AMBIGUOUS_FIELD_NAME to QueryCompilationErrorsSuite. The test should cover the exception throw in QueryCompilationErrors: {code:scala} def ambiguousFieldNameError( fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { new AnalysisException( errorClass = "AMBIGUOUS_FIELD_NAME", messageParameters = Array(fieldName.quoted, numMatches.toString), origin = context) } {code} > Test the error class: AMBIGUOUS_FIELD_NAME > -- > > Key: SPARK-38718 > URL: https://issues.apache.org/jira/browse/SPARK-38718 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class AMBIGUOUS_FIELD_NAME to > QueryCompilationErrorsSuite. 
The test should cover the exception thrown in > QueryCompilationErrors: > {code:scala} > def ambiguousFieldNameError( > fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { > new AnalysisException( > errorClass = "AMBIGUOUS_FIELD_NAME", > messageParameters = Array(fieldName.quoted, numMatches.toString), > origin = context) > } > {code} > For example, here is a test for the error class UNSUPPORTED_FEATURE: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
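Each of these starter tickets follows the same pattern: trigger the error and assert on its error class and message parameters. A Spark-free sketch of that pattern (the exception class below is a simplified stand-in for Spark's `AnalysisException`, with `Origin` omitted):

```scala
// Minimal stand-in for an exception carrying an error class, mimicking the
// shape of Spark's AnalysisException for illustration only.
class AnalysisLikeException(
    val errorClass: String,
    val messageParameters: Array[String])
  extends Exception(s"[$errorClass] ${messageParameters.mkString(", ")}")

def ambiguousFieldNameError(fieldName: String, numMatches: Int): Throwable =
  new AnalysisLikeException("AMBIGUOUS_FIELD_NAME",
    Array(fieldName, numMatches.toString))

// The test pattern: intercept the exception, then assert on the error class
// and the message parameters rather than on a free-form message string.
val e = try { throw ambiguousFieldNameError("a.b", 2) }
        catch { case ex: AnalysisLikeException => ex }
println(e.errorClass)                       // AMBIGUOUS_FIELD_NAME
println(e.messageParameters.mkString(","))  // a.b,2
```

In the real suite the interception would be done with ScalaTest's `intercept[AnalysisException]` around a query that hits the ambiguous-field path, as in the linked UNSUPPORTED_FEATURE test.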
[jira] [Created] (SPARK-38719) Test the error class: CANNOT_CAST_DATATYPE
Max Gekk created SPARK-38719: Summary: Test the error class: CANNOT_CAST_DATATYPE Key: SPARK-38719 URL: https://issues.apache.org/jira/browse/SPARK-38719 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class AMBIGUOUS_FIELD_NAME to QueryCompilationErrorsSuite. The test should cover the exception thrown in QueryCompilationErrors: {code:scala} def ambiguousFieldNameError( fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { new AnalysisException( errorClass = "AMBIGUOUS_FIELD_NAME", messageParameters = Array(fieldName.quoted, numMatches.toString), origin = context) } {code} For example, here is a test for the error class UNSUPPORTED_FEATURE: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38719) Test the error class: CANNOT_CAST_DATATYPE
[ https://issues.apache.org/jira/browse/SPARK-38719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38719: - Description: Add at least one test for the error class *CANNOT_CAST_DATATYPE* to QueryExecutionErrorsSuite. The test should cover the exception throw in QueryExecutionErrors: {code:scala} def cannotCastFromNullTypeError(to: DataType): Throwable = { new SparkException(errorClass = "CANNOT_CAST_DATATYPE", messageParameters = Array(NullType.typeName, to.typeName), null) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 was: Add at least one test for the error class AMBIGUOUS_FIELD_NAME to QueryCompilationErrorsSuite. The test should cover the exception throw in QueryCompilationErrors: {code:scala} def ambiguousFieldNameError( fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { new AnalysisException( errorClass = "AMBIGUOUS_FIELD_NAME", messageParameters = Array(fieldName.quoted, numMatches.toString), origin = context) } {code} For example, here is a test for the error class UNSUPPORTED_FEATURE: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CANNOT_CAST_DATATYPE > -- > > Key: SPARK-38719 > URL: https://issues.apache.org/jira/browse/SPARK-38719 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CANNOT_CAST_DATATYPE* to > QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def cannotCastFromNullTypeError(to: DataType): Throwable = { > new SparkException(errorClass = "CANNOT_CAST_DATATYPE", > messageParameters = Array(NullType.typeName, to.typeName), null) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38718) Test the error class: AMBIGUOUS_FIELD_NAME
[ https://issues.apache.org/jira/browse/SPARK-38718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38718: - Description: Add at least one test for the error class *AMBIGUOUS_FIELD_NAME* to QueryCompilationErrorsSuite. The test should cover the exception throw in QueryCompilationErrors: {code:scala} def ambiguousFieldNameError( fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { new AnalysisException( errorClass = "AMBIGUOUS_FIELD_NAME", messageParameters = Array(fieldName.quoted, numMatches.toString), origin = context) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 was: Add at least one test for the error class AMBIGUOUS_FIELD_NAME to QueryCompilationErrorsSuite. The test should cover the exception throw in QueryCompilationErrors: {code:scala} def ambiguousFieldNameError( fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { new AnalysisException( errorClass = "AMBIGUOUS_FIELD_NAME", messageParameters = Array(fieldName.quoted, numMatches.toString), origin = context) } {code} For example, here is a test for the error class UNSUPPORTED_FEATURE: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: AMBIGUOUS_FIELD_NAME > -- > > Key: SPARK-38718 > URL: https://issues.apache.org/jira/browse/SPARK-38718 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *AMBIGUOUS_FIELD_NAME* to > QueryCompilationErrorsSuite. 
The test should cover the exception thrown in > QueryCompilationErrors: > {code:scala} > def ambiguousFieldNameError( > fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { > new AnalysisException( > errorClass = "AMBIGUOUS_FIELD_NAME", > messageParameters = Array(fieldName.quoted, numMatches.toString), > origin = context) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38720) Test the error class: CANNOT_CHANGE_DECIMAL_PRECISION
Max Gekk created SPARK-38720: Summary: Test the error class: CANNOT_CHANGE_DECIMAL_PRECISION Key: SPARK-38720 URL: https://issues.apache.org/jira/browse/SPARK-38720 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *CANNOT_CAST_DATATYPE* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotCastFromNullTypeError(to: DataType): Throwable = { new SparkException(errorClass = "CANNOT_CAST_DATATYPE", messageParameters = Array(NullType.typeName, to.typeName), null) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38720) Test the error class: CANNOT_CHANGE_DECIMAL_PRECISION
[ https://issues.apache.org/jira/browse/SPARK-38720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38720: - Description: Add at least one test for the error class *CANNOT_CHANGE_DECIMAL_PRECISION* to QueryExecutionErrorsSuite. The test should cover the exception throw in QueryExecutionErrors: {code:scala} def cannotChangeDecimalPrecisionError( value: Decimal, decimalPrecision: Int, decimalScale: Int): ArithmeticException = { new SparkArithmeticException(errorClass = "CANNOT_CHANGE_DECIMAL_PRECISION", messageParameters = Array(value.toDebugString, decimalPrecision.toString, decimalScale.toString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 was: Add at least one test for the error class *CANNOT_CAST_DATATYPE* to QueryExecutionErrorsSuite. The test should cover the exception throw in QueryExecutionErrors: {code:scala} def cannotCastFromNullTypeError(to: DataType): Throwable = { new SparkException(errorClass = "CANNOT_CAST_DATATYPE", messageParameters = Array(NullType.typeName, to.typeName), null) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CANNOT_CHANGE_DECIMAL_PRECISION > - > > Key: SPARK-38720 > URL: https://issues.apache.org/jira/browse/SPARK-38720 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CANNOT_CHANGE_DECIMAL_PRECISION* > to QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def cannotChangeDecimalPrecisionError( > value: Decimal, decimalPrecision: Int, decimalScale: Int): > ArithmeticException = { > new SparkArithmeticException(errorClass = > "CANNOT_CHANGE_DECIMAL_PRECISION", > messageParameters = Array(value.toDebugString, > decimalPrecision.toString, decimalScale.toString, > SQLConf.ANSI_ENABLED.key)) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
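Writing this test requires a value whose digits exceed the target precision after rescaling. The precision/scale arithmetic behind the error can be sketched without Spark using plain `BigDecimal` (the helper name is illustrative):

```scala
// A DECIMAL(precision, scale) holds at most `precision` total digits, with
// `scale` of them after the point. 12345.6789 cannot fit DECIMAL(7, 3):
// rescaling to 3 fraction digits still needs 5 + 3 = 8 digits > 7, which is
// the condition that makes Spark raise CANNOT_CHANGE_DECIMAL_PRECISION.
def fits(value: BigDecimal, precision: Int, scale: Int): Boolean = {
  val rescaled = value.setScale(scale, BigDecimal.RoundingMode.HALF_UP)
  rescaled.precision <= precision
}

println(fits(BigDecimal("12345.6789"), precision = 7, scale = 3))  // false
println(fits(BigDecimal("1234.6789"),  precision = 7, scale = 3))  // true
```

So a query like casting `12345.6789` to `DECIMAL(7, 3)` under ANSI mode is the kind of input the suite's test would intercept.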
[jira] [Updated] (SPARK-38721) Test the error class: CANNOT_PARSE_DECIMAL
[ https://issues.apache.org/jira/browse/SPARK-38721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38721: - Description: Add at least one test for the error class *CANNOT_PARSE_DECIMAL* to QueryExecutionErrorsSuite. The test should cover the exception throw in QueryExecutionErrors: {code:scala} def cannotParseDecimalError(): Throwable = { new SparkIllegalStateException(errorClass = "CANNOT_PARSE_DECIMAL", messageParameters = Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 was: Add at least one test for the error class *CANNOT_CHANGE_DECIMAL_PRECISION* to QueryExecutionErrorsSuite. The test should cover the exception throw in QueryExecutionErrors: {code:scala} def cannotChangeDecimalPrecisionError( value: Decimal, decimalPrecision: Int, decimalScale: Int): ArithmeticException = { new SparkArithmeticException(errorClass = "CANNOT_CHANGE_DECIMAL_PRECISION", messageParameters = Array(value.toDebugString, decimalPrecision.toString, decimalScale.toString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CANNOT_PARSE_DECIMAL > -- > > Key: SPARK-38721 > URL: https://issues.apache.org/jira/browse/SPARK-38721 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CANNOT_PARSE_DECIMAL* to > QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def cannotParseDecimalError(): Throwable = { > new SparkIllegalStateException(errorClass = "CANNOT_PARSE_DECIMAL", > messageParameters = Array.empty) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38721) Test the error class: CANNOT_PARSE_DECIMAL
Max Gekk created SPARK-38721: Summary: Test the error class: CANNOT_PARSE_DECIMAL Key: SPARK-38721 URL: https://issues.apache.org/jira/browse/SPARK-38721 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *CANNOT_CHANGE_DECIMAL_PRECISION* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotChangeDecimalPrecisionError( value: Decimal, decimalPrecision: Int, decimalScale: Int): ArithmeticException = { new SparkArithmeticException(errorClass = "CANNOT_CHANGE_DECIMAL_PRECISION", messageParameters = Array(value.toDebugString, decimalPrecision.toString, decimalScale.toString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38722) Test the error class: CAST_CAUSES_OVERFLOW
Max Gekk created SPARK-38722: Summary: Test the error class: CAST_CAUSES_OVERFLOW Key: SPARK-38722 URL: https://issues.apache.org/jira/browse/SPARK-38722 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *CANNOT_PARSE_DECIMAL* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotParseDecimalError(): Throwable = { new SparkIllegalStateException(errorClass = "CANNOT_PARSE_DECIMAL", messageParameters = Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38722) Test the error class: CAST_CAUSES_OVERFLOW
[ https://issues.apache.org/jira/browse/SPARK-38722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38722: - Description: Add at least one test for the error class *CAST_CAUSES_OVERFLOW* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def castingCauseOverflowError(t: Any, dataType: DataType): ArithmeticException = { new SparkArithmeticException(errorClass = "CAST_CAUSES_OVERFLOW", messageParameters = Array(t.toString, dataType.catalogString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 was: Add at least one test for the error class *CANNOT_PARSE_DECIMAL* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotParseDecimalError(): Throwable = { new SparkIllegalStateException(errorClass = "CANNOT_PARSE_DECIMAL", messageParameters = Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CAST_CAUSES_OVERFLOW > -- > > Key: SPARK-38722 > URL: https://issues.apache.org/jira/browse/SPARK-38722 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CAST_CAUSES_OVERFLOW* to > QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def castingCauseOverflowError(t: Any, dataType: DataType): > ArithmeticException = { > new SparkArithmeticException(errorClass = "CAST_CAUSES_OVERFLOW", > messageParameters = Array(t.toString, dataType.catalogString, > SQLConf.ANSI_ENABLED.key)) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
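A test of the kind requested here might look roughly like the sketch below. This is an illustration only: the overflowing cast and the message check are assumptions, not taken from the ticket; `sql`, `withSQLConf` and `intercept` are assumed from Spark's existing test helpers, and the sqlState (if any) should be read from error-classes.json.
{code:scala}
// Hypothetical sketch for QueryExecutionErrorsSuite. The triggering query is
// an assumed example: casting Long.MaxValue to INT overflows under ANSI mode.
test("CAST_CAUSES_OVERFLOW: casting a too-large value under ANSI mode") {
  withSQLConf(SQLConf.ANSI_ENABLED.key -> "true") {
    val e = intercept[SparkArithmeticException] {
      sql("SELECT CAST(9223372036854775807L AS INT)").collect()
    }
    // Check the error class, the entire message, and (if defined) the sqlState.
    assert(e.getErrorClass === "CAST_CAUSES_OVERFLOW")
    // The ANSI flag is passed as a message parameter, so it should appear in the text.
    assert(e.getMessage.contains(SQLConf.ANSI_ENABLED.key))
  }
}
{code}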
[jira] [Created] (SPARK-38723) Test the error class: CONCURRENT_QUERY
Max Gekk created SPARK-38723: Summary: Test the error class: CONCURRENT_QUERY Key: SPARK-38723 URL: https://issues.apache.org/jira/browse/SPARK-38723 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *CAST_CAUSES_OVERFLOW* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def castingCauseOverflowError(t: Any, dataType: DataType): ArithmeticException = { new SparkArithmeticException(errorClass = "CAST_CAUSES_OVERFLOW", messageParameters = Array(t.toString, dataType.catalogString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
[jira] [Updated] (SPARK-38723) Test the error class: CONCURRENT_QUERY
[ https://issues.apache.org/jira/browse/SPARK-38723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38723: - Description: Add at least one test for the error class *CONCURRENT_QUERY* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def concurrentQueryInstanceError(): Throwable = { new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 was: Add at least one test for the error class *CAST_CAUSES_OVERFLOW* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def castingCauseOverflowError(t: Any, dataType: DataType): ArithmeticException = { new SparkArithmeticException(errorClass = "CAST_CAUSES_OVERFLOW", messageParameters = Array(t.toString, dataType.catalogString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CONCURRENT_QUERY > -- > > Key: SPARK-38723 > URL: https://issues.apache.org/jira/browse/SPARK-38723 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CONCURRENT_QUERY* to > QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def concurrentQueryInstanceError(): Throwable = { > new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
[jira] [Assigned] (SPARK-38717) Handle Hive's bucket spec case preserving behaviour
[ https://issues.apache.org/jira/browse/SPARK-38717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38717: Assignee: Apache Spark > Handle Hive's bucket spec case preserving behaviour > --- > > Key: SPARK-38717 > URL: https://issues.apache.org/jira/browse/SPARK-38717 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Peter Toth >Assignee: Apache Spark >Priority: Major > > {code} > CREATE TABLE t( > c STRING, > B_C STRING > ) > PARTITIONED BY (p_c STRING) > CLUSTERED BY (B_C) INTO 4 BUCKETS > STORED AS PARQUET > {code} > then > {code} > SELECT * FROM t > {code} > fails with: > {code} > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Bucket columns > B_C is not part of the table columns ([FieldSchema(name:c, type:string, > comment:null), FieldSchema(name:b_c, type:string, comment:null)] > at > org.apache.hadoop.hive.ql.metadata.Table.setBucketCols(Table.java:552) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.toHiveTable(HiveClientImpl.scala:1098) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getPartitionsByFilter$1(HiveClientImpl.scala:764) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:294) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:225) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:224) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:274) > at > org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:763) > at > org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$listPartitionsByFilter$1(HiveExternalCatalog.scala:1287) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:101) > ... 
110 more > {code}
[jira] [Commented] (SPARK-38717) Handle Hive's bucket spec case preserving behaviour
[ https://issues.apache.org/jira/browse/SPARK-38717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515435#comment-17515435 ] Apache Spark commented on SPARK-38717: -- User 'peter-toth' has created a pull request for this issue: https://github.com/apache/spark/pull/36027 > Handle Hive's bucket spec case preserving behaviour > --- > > Key: SPARK-38717 > URL: https://issues.apache.org/jira/browse/SPARK-38717 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Peter Toth >Priority: Major > > {code} > CREATE TABLE t( > c STRING, > B_C STRING > ) > PARTITIONED BY (p_c STRING) > CLUSTERED BY (B_C) INTO 4 BUCKETS > STORED AS PARQUET > {code} > then > {code} > SELECT * FROM t > {code} > fails with: > {code} > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Bucket columns > B_C is not part of the table columns ([FieldSchema(name:c, type:string, > comment:null), FieldSchema(name:b_c, type:string, comment:null)] > at > org.apache.hadoop.hive.ql.metadata.Table.setBucketCols(Table.java:552) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.toHiveTable(HiveClientImpl.scala:1098) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getPartitionsByFilter$1(HiveClientImpl.scala:764) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:294) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:225) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:224) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:274) > at > org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:763) > at > org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$listPartitionsByFilter$1(HiveExternalCatalog.scala:1287) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:101) 
> ... 110 more > {code}
[jira] [Assigned] (SPARK-38717) Handle Hive's bucket spec case preserving behaviour
[ https://issues.apache.org/jira/browse/SPARK-38717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38717: Assignee: (was: Apache Spark) > Handle Hive's bucket spec case preserving behaviour > --- > > Key: SPARK-38717 > URL: https://issues.apache.org/jira/browse/SPARK-38717 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.3.0 >Reporter: Peter Toth >Priority: Major > > {code} > CREATE TABLE t( > c STRING, > B_C STRING > ) > PARTITIONED BY (p_c STRING) > CLUSTERED BY (B_C) INTO 4 BUCKETS > STORED AS PARQUET > {code} > then > {code} > SELECT * FROM t > {code} > fails with: > {code} > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Bucket columns > B_C is not part of the table columns ([FieldSchema(name:c, type:string, > comment:null), FieldSchema(name:b_c, type:string, comment:null)] > at > org.apache.hadoop.hive.ql.metadata.Table.setBucketCols(Table.java:552) > at > org.apache.spark.sql.hive.client.HiveClientImpl$.toHiveTable(HiveClientImpl.scala:1098) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getPartitionsByFilter$1(HiveClientImpl.scala:764) > at > org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:294) > at > org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:225) > at > org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:224) > at > org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:274) > at > org.apache.spark.sql.hive.client.HiveClientImpl.getPartitionsByFilter(HiveClientImpl.scala:763) > at > org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$listPartitionsByFilter$1(HiveExternalCatalog.scala:1287) > at > org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:101) > ... 
110 more > {code}
[jira] [Updated] (SPARK-38723) Test the error class: CONCURRENT_QUERY
[ https://issues.apache.org/jira/browse/SPARK-38723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38723: - Description: Add at least one test for the error class *CONCURRENT_QUERY* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def concurrentQueryInstanceError(): Throwable = { new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 The test must check: # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *CONCURRENT_QUERY* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def concurrentQueryInstanceError(): Throwable = { new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CONCURRENT_QUERY > -- > > Key: SPARK-38723 > URL: https://issues.apache.org/jira/browse/SPARK-38723 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CONCURRENT_QUERY* to > QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def concurrentQueryInstanceError(): Throwable = { > new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > The test must check: > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class
[jira] [Updated] (SPARK-38722) Test the error class: CAST_CAUSES_OVERFLOW
[ https://issues.apache.org/jira/browse/SPARK-38722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38722: - Description: Add at least one test for the error class *CAST_CAUSES_OVERFLOW* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def castingCauseOverflowError(t: Any, dataType: DataType): ArithmeticException = { new SparkArithmeticException(errorClass = "CAST_CAUSES_OVERFLOW", messageParameters = Array(t.toString, dataType.catalogString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *CAST_CAUSES_OVERFLOW* to QueryExecutionErrorsSuite. 
The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def castingCauseOverflowError(t: Any, dataType: DataType): ArithmeticException = { new SparkArithmeticException(errorClass = "CAST_CAUSES_OVERFLOW", messageParameters = Array(t.toString, dataType.catalogString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CAST_CAUSES_OVERFLOW > -- > > Key: SPARK-38722 > URL: https://issues.apache.org/jira/browse/SPARK-38722 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CAST_CAUSES_OVERFLOW* to > QueryExecutionErrorsSuite. The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def castingCauseOverflowError(t: Any, dataType: DataType): > ArithmeticException = { > new SparkArithmeticException(errorClass = "CAST_CAUSES_OVERFLOW", > messageParameters = Array(t.toString, dataType.catalogString, > SQLConf.ANSI_ENABLED.key)) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class
[jira] [Updated] (SPARK-38723) Test the error class: CONCURRENT_QUERY
[ https://issues.apache.org/jira/browse/SPARK-38723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38723: - Description: Add at least one test for the error class *CONCURRENT_QUERY* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def concurrentQueryInstanceError(): Throwable = { new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *CONCURRENT_QUERY* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def concurrentQueryInstanceError(): Throwable = { new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 The test must check: # the entire error message # sqlState if it is defined in the error-classes.json file # the error class > Test the error class: CONCURRENT_QUERY > -- > > Key: SPARK-38723 > URL: https://issues.apache.org/jira/browse/SPARK-38723 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CONCURRENT_QUERY* to > QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def concurrentQueryInstanceError(): Throwable = { > new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class
[jira] [Updated] (SPARK-38721) Test the error class: CANNOT_PARSE_DECIMAL
[ https://issues.apache.org/jira/browse/SPARK-38721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38721: - Description: Add at least one test for the error class *CANNOT_PARSE_DECIMAL* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotParseDecimalError(): Throwable = { new SparkIllegalStateException(errorClass = "CANNOT_PARSE_DECIMAL", messageParameters = Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *CANNOT_PARSE_DECIMAL* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotParseDecimalError(): Throwable = { new SparkIllegalStateException(errorClass = "CANNOT_PARSE_DECIMAL", messageParameters = Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CANNOT_PARSE_DECIMAL > -- > > Key: SPARK-38721 > URL: https://issues.apache.org/jira/browse/SPARK-38721 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CANNOT_PARSE_DECIMAL* to > QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def cannotParseDecimalError(): Throwable = { > new SparkIllegalStateException(errorClass = "CANNOT_PARSE_DECIMAL", > messageParameters = Array.empty) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class
[jira] [Updated] (SPARK-38719) Test the error class: CANNOT_CAST_DATATYPE
[ https://issues.apache.org/jira/browse/SPARK-38719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38719: - Description: Add at least one test for the error class *CANNOT_CAST_DATATYPE* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotCastFromNullTypeError(to: DataType): Throwable = { new SparkException(errorClass = "CANNOT_CAST_DATATYPE", messageParameters = Array(NullType.typeName, to.typeName), null) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *CANNOT_CAST_DATATYPE* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotCastFromNullTypeError(to: DataType): Throwable = { new SparkException(errorClass = "CANNOT_CAST_DATATYPE", messageParameters = Array(NullType.typeName, to.typeName), null) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CANNOT_CAST_DATATYPE > -- > > Key: SPARK-38719 > URL: https://issues.apache.org/jira/browse/SPARK-38719 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CANNOT_CAST_DATATYPE* to > QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def cannotCastFromNullTypeError(to: DataType): Throwable = { > new SparkException(errorClass = "CANNOT_CAST_DATATYPE", > messageParameters = Array(NullType.typeName, to.typeName), null) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class
[jira] [Updated] (SPARK-38718) Test the error class: AMBIGUOUS_FIELD_NAME
[ https://issues.apache.org/jira/browse/SPARK-38718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38718: - Description: Add at least one test for the error class *AMBIGUOUS_FIELD_NAME* to QueryCompilationErrorsSuite. The test should cover the exception thrown in QueryCompilationErrors: {code:scala} def ambiguousFieldNameError( fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { new AnalysisException( errorClass = "AMBIGUOUS_FIELD_NAME", messageParameters = Array(fieldName.quoted, numMatches.toString), origin = context) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *AMBIGUOUS_FIELD_NAME* to QueryCompilationErrorsSuite. 
The test should cover the exception thrown in QueryCompilationErrors: {code:scala} def ambiguousFieldNameError( fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { new AnalysisException( errorClass = "AMBIGUOUS_FIELD_NAME", messageParameters = Array(fieldName.quoted, numMatches.toString), origin = context) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: AMBIGUOUS_FIELD_NAME > -- > > Key: SPARK-38718 > URL: https://issues.apache.org/jira/browse/SPARK-38718 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *AMBIGUOUS_FIELD_NAME* to > QueryCompilationErrorsSuite. The test should cover the exception thrown in > QueryCompilationErrors: > {code:scala} > def ambiguousFieldNameError( > fieldName: Seq[String], numMatches: Int, context: Origin): Throwable = { > new AnalysisException( > errorClass = "AMBIGUOUS_FIELD_NAME", > messageParameters = Array(fieldName.quoted, numMatches.toString), > origin = context) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class
[jira] [Updated] (SPARK-38720) Test the error class: CANNOT_CHANGE_DECIMAL_PRECISION
[ https://issues.apache.org/jira/browse/SPARK-38720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38720: - Description: Add at least one test for the error class *CANNOT_CHANGE_DECIMAL_PRECISION* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotChangeDecimalPrecisionError( value: Decimal, decimalPrecision: Int, decimalScale: Int): ArithmeticException = { new SparkArithmeticException(errorClass = "CANNOT_CHANGE_DECIMAL_PRECISION", messageParameters = Array(value.toDebugString, decimalPrecision.toString, decimalScale.toString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *CANNOT_CHANGE_DECIMAL_PRECISION* to QueryExecutionErrorsSuite. 
The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def cannotChangeDecimalPrecisionError( value: Decimal, decimalPrecision: Int, decimalScale: Int): ArithmeticException = { new SparkArithmeticException(errorClass = "CANNOT_CHANGE_DECIMAL_PRECISION", messageParameters = Array(value.toDebugString, decimalPrecision.toString, decimalScale.toString, SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > Test the error class: CANNOT_CHANGE_DECIMAL_PRECISION > - > > Key: SPARK-38720 > URL: https://issues.apache.org/jira/browse/SPARK-38720 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *CANNOT_CHANGE_DECIMAL_PRECISION* > to QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def cannotChangeDecimalPrecisionError( > value: Decimal, decimalPrecision: Int, decimalScale: Int): > ArithmeticException = { > new SparkArithmeticException(errorClass = > "CANNOT_CHANGE_DECIMAL_PRECISION", > messageParameters = Array(value.toDebugString, > decimalPrecision.toString, decimalScale.toString, > SQLConf.ANSI_ENABLED.key)) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class
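A sketch of a possible test for this ticket is below. The trigger is an assumption: casting a decimal literal to a narrower target precision under ANSI mode should fail in `Decimal.toPrecision` and reach `cannotChangeDecimalPrecisionError`; the `BD` literal suffix and the suite helpers are assumed from Spark's existing test code.
{code:scala}
// Hypothetical sketch for QueryExecutionErrorsSuite. 123.45 needs
// DECIMAL(5, 2), so it cannot be represented as DECIMAL(3, 2).
test("CANNOT_CHANGE_DECIMAL_PRECISION: value does not fit target precision") {
  withSQLConf(SQLConf.ANSI_ENABLED.key -> "true") {
    val e = intercept[SparkArithmeticException] {
      sql("SELECT CAST(123.45BD AS DECIMAL(3, 2))").collect()
    }
    // Check the error class, the entire message, and (if defined) the sqlState.
    assert(e.getErrorClass === "CANNOT_CHANGE_DECIMAL_PRECISION")
  }
}
{code}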
[jira] [Created] (SPARK-38724) Test the error class: DIVIDE_BY_ZERO
Max Gekk created SPARK-38724: Summary: Test the error class: DIVIDE_BY_ZERO Key: SPARK-38724 URL: https://issues.apache.org/jira/browse/SPARK-38724 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *CONCURRENT_QUERY* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def concurrentQueryInstanceError(): Throwable = { new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class
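For the DIVIDE_BY_ZERO case (SPARK-38724), a plausible test sketch follows. The triggering query is an assumption: an integral remainder by zero under ANSI mode should route through `divideByZeroError`; the helpers `sql`, `withSQLConf` and `intercept` are assumed from Spark's existing test infrastructure.
{code:scala}
// Hypothetical sketch for QueryExecutionErrorsSuite; the modulo-by-zero
// query is an assumed trigger, not taken from the ticket.
test("DIVIDE_BY_ZERO: division by zero under ANSI mode") {
  withSQLConf(SQLConf.ANSI_ENABLED.key -> "true") {
    val e = intercept[SparkArithmeticException] {
      sql("SELECT 6 % 0").collect()
    }
    assert(e.getErrorClass === "DIVIDE_BY_ZERO")
    // The ANSI flag is the only message parameter, so it should appear in the text.
    assert(e.getMessage.contains(SQLConf.ANSI_ENABLED.key))
  }
}
{code}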
[jira] [Updated] (SPARK-38724) Test the error class: DIVIDE_BY_ZERO
[ https://issues.apache.org/jira/browse/SPARK-38724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38724: - Description: Add at least one test for the error class *DIVIDE_BY_ZERO* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def divideByZeroError(): ArithmeticException = { new SparkArithmeticException( errorClass = "DIVIDE_BY_ZERO", messageParameters = Array(SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *CONCURRENT_QUERY* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def concurrentQueryInstanceError(): Throwable = { new SparkConcurrentModificationException("CONCURRENT_QUERY", Array.empty) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class > Test the error class: DIVIDE_BY_ZERO > > > Key: SPARK-38724 > URL: https://issues.apache.org/jira/browse/SPARK-38724 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *DIVIDE_BY_ZERO* to > QueryExecutionErrorsSuite. 
The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def divideByZeroError(): ArithmeticException = { > new SparkArithmeticException( > errorClass = "DIVIDE_BY_ZERO", messageParameters = > Array(SQLConf.ANSI_ENABLED.key)) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
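Taken together, the three required checks for *DIVIDE_BY_ZERO* can sit in a single test case. The following is only a hypothetical sketch of what such a test in QueryExecutionErrorsSuite might look like, not the actual suite code: it assumes SparkThrowable-style accessors (getErrorClass, getSqlState), assumes that division by zero under ANSI mode surfaces as SparkArithmeticException, and the sqlState value 22012 (the standard SQLSTATE for division by zero) should be confirmed against error-classes.json before use.

{code:scala}
// Hypothetical sketch; accessor names, message text and sqlState are assumptions.
test("DIVIDE_BY_ZERO: error class, sqlState and entire message") {
  withSQLConf(SQLConf.ANSI_ENABLED.key -> "true") {
    val e = intercept[SparkArithmeticException] {
      sql("SELECT 6/0").collect()
    }
    // 1. the error class
    assert(e.getErrorClass === "DIVIDE_BY_ZERO")
    // 2. sqlState, if it is defined in error-classes.json
    assert(e.getSqlState === "22012")
    // 3. the entire error message, including the ANSI config hint
    //    embedded via messageParameters = Array(SQLConf.ANSI_ENABLED.key)
    assert(e.getMessage.contains(SQLConf.ANSI_ENABLED.key))
  }
}
{code}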
[jira] [Created] (SPARK-38725) Test the error class: DUPLICATE_KEY
Max Gekk created SPARK-38725: Summary: Test the error class: DUPLICATE_KEY Key: SPARK-38725 URL: https://issues.apache.org/jira/browse/SPARK-38725 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *DIVIDE_BY_ZERO* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def divideByZeroError(): ArithmeticException = { new SparkArithmeticException( errorClass = "DIVIDE_BY_ZERO", messageParameters = Array(SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38725) Test the error class: DUPLICATE_KEY
[ https://issues.apache.org/jira/browse/SPARK-38725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38725: - Description: Add at least one test for the error class *DUPLICATE_KEY* to QueryParsingErrorsSuite. The test should cover the exception thrown in QueryParsingErrors: {code:scala} def duplicateKeysError(key: String, ctx: ParserRuleContext): Throwable = { // Found duplicate keys '$key' new ParseException(errorClass = "DUPLICATE_KEY", messageParameters = Array(key), ctx) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *DIVIDE_BY_ZERO* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def divideByZeroError(): ArithmeticException = { new SparkArithmeticException( errorClass = "DIVIDE_BY_ZERO", messageParameters = Array(SQLConf.ANSI_ENABLED.key)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class > Test the error class: DUPLICATE_KEY > --- > > Key: SPARK-38725 > URL: https://issues.apache.org/jira/browse/SPARK-38725 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *DUPLICATE_KEY* to > QueryParsingErrorsSuite. 
The test should cover the exception thrown in > QueryParsingErrors: > {code:scala} > def duplicateKeysError(key: String, ctx: ParserRuleContext): Throwable = { > // Found duplicate keys '$key' > new ParseException(errorClass = "DUPLICATE_KEY", messageParameters = > Array(key), ctx) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
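A parser-side error class like *DUPLICATE_KEY* follows the same test pattern, except the error surfaces as a ParseException. The following hypothetical sketch for QueryParsingErrorsSuite assumes that duplicate keys in an OPTIONS clause reach duplicateKeysError; the triggering SQL statement and the getErrorClass accessor are illustrative assumptions, not confirmed suite code:

{code:scala}
// Hypothetical sketch; the triggering statement is an assumption.
test("DUPLICATE_KEY: duplicate keys in an OPTIONS clause") {
  val e = intercept[ParseException] {
    sql("CREATE TABLE t (i INT) USING parquet OPTIONS ('k' = 'v1', 'k' = 'v2')")
  }
  // the error class
  assert(e.getErrorClass === "DUPLICATE_KEY")
  // the entire message should embed the offending key,
  // per messageParameters = Array(key) in duplicateKeysError
  assert(e.getMessage.contains("'k'"))
}
{code}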
[jira] [Created] (SPARK-38726) Support `how` parameter of `MultiIndex.dropna`
Xinrong Meng created SPARK-38726: Summary: Support `how` parameter of `MultiIndex.dropna` Key: SPARK-38726 URL: https://issues.apache.org/jira/browse/SPARK-38726 Project: Spark Issue Type: Improvement Components: PySpark Affects Versions: 3.4.0 Reporter: Xinrong Meng Support `how` parameter of `MultiIndex.dropna` -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38727) Test the error class: FAILED_EXECUTE_UDF
Max Gekk created SPARK-38727: Summary: Test the error class: FAILED_EXECUTE_UDF Key: SPARK-38727 URL: https://issues.apache.org/jira/browse/SPARK-38727 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *DUPLICATE_KEY* to QueryParsingErrorsSuite. The test should cover the exception thrown in QueryParsingErrors: {code:scala} def duplicateKeysError(key: String, ctx: ParserRuleContext): Throwable = { // Found duplicate keys '$key' new ParseException(errorClass = "DUPLICATE_KEY", messageParameters = Array(key), ctx) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38727) Test the error class: FAILED_EXECUTE_UDF
[ https://issues.apache.org/jira/browse/SPARK-38727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38727: - Description: Add at least one test for the error class *FAILED_EXECUTE_UDF* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def failedExecuteUserDefinedFunctionError(funcCls: String, inputTypes: String, outputType: String, e: Throwable): Throwable = { new SparkException(errorClass = "FAILED_EXECUTE_UDF", messageParameters = Array(funcCls, inputTypes, outputType), e) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *DUPLICATE_KEY* to QueryParsingErrorsSuite. 
The test should cover the exception thrown in QueryParsingErrors: {code:scala} def duplicateKeysError(key: String, ctx: ParserRuleContext): Throwable = { // Found duplicate keys '$key' new ParseException(errorClass = "DUPLICATE_KEY", messageParameters = Array(key), ctx) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class > Test the error class: FAILED_EXECUTE_UDF > > > Key: SPARK-38727 > URL: https://issues.apache.org/jira/browse/SPARK-38727 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *FAILED_EXECUTE_UDF* to > QueryExecutionErrorsSuite. The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def failedExecuteUserDefinedFunctionError(funcCls: String, inputTypes: > String, > outputType: String, e: Throwable): Throwable = { > new SparkException(errorClass = "FAILED_EXECUTE_UDF", > messageParameters = Array(funcCls, inputTypes, outputType), e) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
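Because failedExecuteUserDefinedFunctionError wraps the user's own exception, a test for *FAILED_EXECUTE_UDF* can trigger it with a deliberately failing UDF. The sketch below is hypothetical, not the actual suite: the failing UDF, the exact exception layering, and the getErrorClass accessor are assumptions:

{code:scala}
// Hypothetical sketch; a deliberately failing UDF triggers the error class.
test("FAILED_EXECUTE_UDF: a UDF that throws") {
  val boom = udf((i: Long) => {
    if (i >= 0) throw new RuntimeException("boom")
    i
  })
  val e = intercept[SparkException] {
    spark.range(1).select(boom(col("id"))).collect()
  }
  // the error class
  assert(e.getErrorClass === "FAILED_EXECUTE_UDF")
  // the cause chain should retain the original user exception,
  // since the constructor passes `e` through as the cause
  assert(e.getCause.getMessage.contains("boom"))
}
{code}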
[jira] [Created] (SPARK-38728) Test the error class: FAILED_RENAME_PATH
Max Gekk created SPARK-38728: Summary: Test the error class: FAILED_RENAME_PATH Key: SPARK-38728 URL: https://issues.apache.org/jira/browse/SPARK-38728 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *FAILED_EXECUTE_UDF* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def failedExecuteUserDefinedFunctionError(funcCls: String, inputTypes: String, outputType: String, e: Throwable): Throwable = { new SparkException(errorClass = "FAILED_EXECUTE_UDF", messageParameters = Array(funcCls, inputTypes, outputType), e) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38728) Test the error class: FAILED_RENAME_PATH
[ https://issues.apache.org/jira/browse/SPARK-38728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38728: - Description: Add at least one test for the error class *FAILED_RENAME_PATH* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def renamePathAsExistsPathError(srcPath: Path, dstPath: Path): Throwable = { new SparkFileAlreadyExistsException(errorClass = "FAILED_RENAME_PATH", Array(srcPath.toString, dstPath.toString)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *FAILED_EXECUTE_UDF* to QueryExecutionErrorsSuite. 
The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def failedExecuteUserDefinedFunctionError(funcCls: String, inputTypes: String, outputType: String, e: Throwable): Throwable = { new SparkException(errorClass = "FAILED_EXECUTE_UDF", messageParameters = Array(funcCls, inputTypes, outputType), e) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class > Test the error class: FAILED_RENAME_PATH > > > Key: SPARK-38728 > URL: https://issues.apache.org/jira/browse/SPARK-38728 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *FAILED_RENAME_PATH* to > QueryExecutionErrorsSuite. The test should cover the exception thrown in > QueryExecutionErrors: > {code:scala} > def renamePathAsExistsPathError(srcPath: Path, dstPath: Path): Throwable = { > new SparkFileAlreadyExistsException(errorClass = "FAILED_RENAME_PATH", > Array(srcPath.toString, dstPath.toString)) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38729) Test the error class: FAILED_SET_ORIGINAL_PERMISSION_BACK
Max Gekk created SPARK-38729: Summary: Test the error class: FAILED_SET_ORIGINAL_PERMISSION_BACK Key: SPARK-38729 URL: https://issues.apache.org/jira/browse/SPARK-38729 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *FAILED_RENAME_PATH* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def renamePathAsExistsPathError(srcPath: Path, dstPath: Path): Throwable = { new SparkFileAlreadyExistsException(errorClass = "FAILED_RENAME_PATH", Array(srcPath.toString, dstPath.toString)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38729) Test the error class: FAILED_SET_ORIGINAL_PERMISSION_BACK
[ https://issues.apache.org/jira/browse/SPARK-38729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38729: - Description: Add at least one test for the error class *FAILED_SET_ORIGINAL_PERMISSION_BACK* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def failToSetOriginalPermissionBackError( permission: FsPermission, path: Path, e: Throwable): Throwable = { new SparkSecurityException(errorClass = "FAILED_SET_ORIGINAL_PERMISSION_BACK", Array(permission.toString, path.toString, e.getMessage)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *FAILED_RENAME_PATH* to QueryExecutionErrorsSuite. 
The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def renamePathAsExistsPathError(srcPath: Path, dstPath: Path): Throwable = { new SparkFileAlreadyExistsException(errorClass = "FAILED_RENAME_PATH", Array(srcPath.toString, dstPath.toString)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class > Test the error class: FAILED_SET_ORIGINAL_PERMISSION_BACK > - > > Key: SPARK-38729 > URL: https://issues.apache.org/jira/browse/SPARK-38729 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class > *FAILED_SET_ORIGINAL_PERMISSION_BACK* to QueryExecutionErrorsSuite. The test > should cover the exception thrown in QueryExecutionErrors: > {code:scala} > def failToSetOriginalPermissionBackError( > permission: FsPermission, > path: Path, > e: Throwable): Throwable = { > new SparkSecurityException(errorClass = > "FAILED_SET_ORIGINAL_PERMISSION_BACK", > Array(permission.toString, path.toString, e.getMessage)) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38730) Move tests for the grouping error classes to Query*ErrorsSuite
Max Gekk created SPARK-38730: Summary: Move tests for the grouping error classes to Query*ErrorsSuite Key: SPARK-38730 URL: https://issues.apache.org/jira/browse/SPARK-38730 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Move tests for the error classes GROUPING_COLUMN_MISMATCH and GROUPING_ID_COLUMN_MISMATCH from DataFrameAggregateSuite to QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38726) Support `how` parameter of `MultiIndex.dropna`
[ https://issues.apache.org/jira/browse/SPARK-38726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515485#comment-17515485 ] Apache Spark commented on SPARK-38726: -- User 'xinrong-databricks' has created a pull request for this issue: https://github.com/apache/spark/pull/36028 > Support `how` parameter of `MultiIndex.dropna` > -- > > Key: SPARK-38726 > URL: https://issues.apache.org/jira/browse/SPARK-38726 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.4.0 >Reporter: Xinrong Meng >Priority: Major > > Support `how` parameter of `MultiIndex.dropna` -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38726) Support `how` parameter of `MultiIndex.dropna`
[ https://issues.apache.org/jira/browse/SPARK-38726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38726: Assignee: Apache Spark > Support `how` parameter of `MultiIndex.dropna` > -- > > Key: SPARK-38726 > URL: https://issues.apache.org/jira/browse/SPARK-38726 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.4.0 >Reporter: Xinrong Meng >Assignee: Apache Spark >Priority: Major > > Support `how` parameter of `MultiIndex.dropna` -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-38726) Support `how` parameter of `MultiIndex.dropna`
[ https://issues.apache.org/jira/browse/SPARK-38726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-38726: Assignee: (was: Apache Spark) > Support `how` parameter of `MultiIndex.dropna` > -- > > Key: SPARK-38726 > URL: https://issues.apache.org/jira/browse/SPARK-38726 > Project: Spark > Issue Type: Improvement > Components: PySpark >Affects Versions: 3.4.0 >Reporter: Xinrong Meng >Priority: Major > > Support `how` parameter of `MultiIndex.dropna` -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38731) Move tests for the `GROUPING_SIZE_LIMIT_EXCEEDED` error class to QueryCompilationErrorsSuite
Max Gekk created SPARK-38731: Summary: Move tests for the `GROUPING_SIZE_LIMIT_EXCEEDED` error class to QueryCompilationErrorsSuite Key: SPARK-38731 URL: https://issues.apache.org/jira/browse/SPARK-38731 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Move tests for the error classes GROUPING_COLUMN_MISMATCH and GROUPING_ID_COLUMN_MISMATCH from DataFrameAggregateSuite to QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38731) Move the tests `GROUPING_SIZE_LIMIT_EXCEEDED` to QueryCompilationErrorsSuite
[ https://issues.apache.org/jira/browse/SPARK-38731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38731: - Summary: Move the tests `GROUPING_SIZE_LIMIT_EXCEEDED` to QueryCompilationErrorsSuite (was: Move tests for the `GROUPING_SIZE_LIMIT_EXCEEDED` error class to QueryCompilationErrorsSuite) > Move the tests `GROUPING_SIZE_LIMIT_EXCEEDED` to QueryCompilationErrorsSuite > > > Key: SPARK-38731 > URL: https://issues.apache.org/jira/browse/SPARK-38731 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Move tests for the error class GROUPING_SIZE_LIMIT_EXCEEDED from > SQLQuerySuite to QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38731) Move tests for the `GROUPING_SIZE_LIMIT_EXCEEDED` error class to QueryCompilationErrorsSuite
[ https://issues.apache.org/jira/browse/SPARK-38731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38731: - Description: Move tests for the error class GROUPING_SIZE_LIMIT_EXCEEDED from SQLQuerySuite to QueryCompilationErrorsSuite. (was: Move tests for the error classes GROUPING_COLUMN_MISMATCH and GROUPING_ID_COLUMN_MISMATCH from DataFrameAggregateSuite to QueryCompilationErrorsSuite.) > Move tests for the `GROUPING_SIZE_LIMIT_EXCEEDED` error class to > QueryCompilationErrorsSuite > --- > > Key: SPARK-38731 > URL: https://issues.apache.org/jira/browse/SPARK-38731 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Major > > Move tests for the error class GROUPING_SIZE_LIMIT_EXCEEDED from > SQLQuerySuite to QueryCompilationErrorsSuite. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-38732) Test the error class: INCOMPARABLE_PIVOT_COLUMN
Max Gekk created SPARK-38732: Summary: Test the error class: INCOMPARABLE_PIVOT_COLUMN Key: SPARK-38732 URL: https://issues.apache.org/jira/browse/SPARK-38732 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *FAILED_RENAME_PATH* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def renamePathAsExistsPathError(srcPath: Path, dstPath: Path): Throwable = { new SparkFileAlreadyExistsException(errorClass = "FAILED_RENAME_PATH", Array(srcPath.toString, dstPath.toString)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38732) Test the error class: INCOMPARABLE_PIVOT_COLUMN
[ https://issues.apache.org/jira/browse/SPARK-38732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Max Gekk updated SPARK-38732: - Description: Add at least one test for the error class *INCOMPARABLE_PIVOT_COLUMN* to QueryCompilationErrorsSuite. The test should cover the exception thrown in QueryCompilationErrors: {code:scala} def unorderablePivotColError(pivotCol: Expression): Throwable = { new AnalysisException( errorClass = "INCOMPARABLE_PIVOT_COLUMN", messageParameters = Array(pivotCol.toString)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class was: Add at least one test for the error class *FAILED_RENAME_PATH* to QueryExecutionErrorsSuite. 
The test should cover the exception thrown in QueryExecutionErrors: {code:scala} def renamePathAsExistsPathError(srcPath: Path, dstPath: Path): Throwable = { new SparkFileAlreadyExistsException(errorClass = "FAILED_RENAME_PATH", Array(srcPath.toString, dstPath.toString)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class > Test the error class: INCOMPARABLE_PIVOT_COLUMN > --- > > Key: SPARK-38732 > URL: https://issues.apache.org/jira/browse/SPARK-38732 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.4.0 >Reporter: Max Gekk >Priority: Minor > Labels: starter > > Add at least one test for the error class *INCOMPARABLE_PIVOT_COLUMN* to > QueryCompilationErrorsSuite. The test should cover the exception thrown in > QueryCompilationErrors: > {code:scala} > def unorderablePivotColError(pivotCol: Expression): Throwable = { > new AnalysisException( > errorClass = "INCOMPARABLE_PIVOT_COLUMN", > messageParameters = Array(pivotCol.toString)) > } > {code} > For example, here is a test for the error class *UNSUPPORTED_FEATURE*: > https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 > +The test must check:+ > # the entire error message > # sqlState if it is defined in the error-classes.json file > # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
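unorderablePivotColError fires when the pivot column's type has no ordering, so a map-typed pivot column is a natural way to trigger *INCOMPARABLE_PIVOT_COLUMN*. The sketch below is only a hypothetical illustration for QueryCompilationErrorsSuite; the triggering query and the getErrorClass accessor are assumptions:

{code:scala}
// Hypothetical sketch; a MapType pivot column has no defined ordering.
test("INCOMPARABLE_PIVOT_COLUMN: pivot on a map column") {
  val df = sql("SELECT map('k', 1) AS m, 1 AS v")
  val e = intercept[AnalysisException] {
    df.groupBy().pivot(df("m")).agg(sum("v")).collect()
  }
  // the error class
  assert(e.getErrorClass === "INCOMPARABLE_PIVOT_COLUMN")
  // the message should name the unorderable pivot column,
  // per messageParameters = Array(pivotCol.toString)
  assert(e.getMessage.contains("m"))
}
{code}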
[jira] [Created] (SPARK-38733) Test the error class: INCOMPATIBLE_DATASOURCE_REGISTER
Max Gekk created SPARK-38733: Summary: Test the error class: INCOMPATIBLE_DATASOURCE_REGISTER Key: SPARK-38733 URL: https://issues.apache.org/jira/browse/SPARK-38733 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.4.0 Reporter: Max Gekk Add at least one test for the error class *INCOMPARABLE_PIVOT_COLUMN* to QueryCompilationErrorsSuite. The test should cover the exception thrown in QueryCompilationErrors: {code:scala} def unorderablePivotColError(pivotCol: Expression): Throwable = { new AnalysisException( errorClass = "INCOMPARABLE_PIVOT_COLUMN", messageParameters = Array(pivotCol.toString)) } {code} For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170 +The test must check:+ # the entire error message # sqlState if it is defined in the error-classes.json file # the error class -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-38733) Test the error class: INCOMPATIBLE_DATASOURCE_REGISTER
[ https://issues.apache.org/jira/browse/SPARK-38733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-38733:
Description:
Add at least one test for the error class *INCOMPATIBLE_DATASOURCE_REGISTER* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors:
{code:scala}
def incompatibleDataSourceRegisterError(e: Throwable): Throwable = {
  new SparkClassNotFoundException("INCOMPATIBLE_DATASOURCE_REGISTER",
    Array(e.getMessage), e)
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

was:
Add at least one test for the error class *INCOMPARABLE_PIVOT_COLUMN* to QueryCompilationErrorsSuite. The test should cover the exception thrown in QueryCompilationErrors:
{code:scala}
def unorderablePivotColError(pivotCol: Expression): Throwable = {
  new AnalysisException(
    errorClass = "INCOMPARABLE_PIVOT_COLUMN",
    messageParameters = Array(pivotCol.toString))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

> Test the error class: INCOMPATIBLE_DATASOURCE_REGISTER
> --
>
> Key: SPARK-38733
> URL: https://issues.apache.org/jira/browse/SPARK-38733
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Max Gekk
> Priority: Minor
> Labels: starter
>
> Add at least one test for the error class *INCOMPATIBLE_DATASOURCE_REGISTER* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors:
> {code:scala}
> def incompatibleDataSourceRegisterError(e: Throwable): Throwable = {
>   new SparkClassNotFoundException("INCOMPATIBLE_DATASOURCE_REGISTER",
>     Array(e.getMessage), e)
> }
> {code}
> For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
> +The test must check:+
> # the entire error message
> # sqlState if it is defined in the error-classes.json file
> # the error class
[jira] [Created] (SPARK-38734) Test the error class: INDEX_OUT_OF_BOUNDS
Max Gekk created SPARK-38734:

Summary: Test the error class: INDEX_OUT_OF_BOUNDS
Key: SPARK-38734
URL: https://issues.apache.org/jira/browse/SPARK-38734
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Max Gekk

Add at least one test for the error class *INCOMPATIBLE_DATASOURCE_REGISTER* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors:
{code:scala}
def incompatibleDataSourceRegisterError(e: Throwable): Throwable = {
  new SparkClassNotFoundException("INCOMPATIBLE_DATASOURCE_REGISTER",
    Array(e.getMessage), e)
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class
[jira] [Updated] (SPARK-38734) Test the error class: INDEX_OUT_OF_BOUNDS
[ https://issues.apache.org/jira/browse/SPARK-38734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-38734:
Description:
Add at least one test for the error class *INDEX_OUT_OF_BOUNDS* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors:
{code:scala}
def indexOutOfBoundsOfArrayDataError(idx: Int): Throwable = {
  new SparkIndexOutOfBoundsException(errorClass = "INDEX_OUT_OF_BOUNDS",
    Array(idx.toString))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

was:
Add at least one test for the error class *INCOMPATIBLE_DATASOURCE_REGISTER* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors:
{code:scala}
def incompatibleDataSourceRegisterError(e: Throwable): Throwable = {
  new SparkClassNotFoundException("INCOMPATIBLE_DATASOURCE_REGISTER",
    Array(e.getMessage), e)
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

> Test the error class: INDEX_OUT_OF_BOUNDS
> -
>
> Key: SPARK-38734
> URL: https://issues.apache.org/jira/browse/SPARK-38734
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Max Gekk
> Priority: Minor
> Labels: starter
>
> Add at least one test for the error class *INDEX_OUT_OF_BOUNDS* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors:
> {code:scala}
> def indexOutOfBoundsOfArrayDataError(idx: Int): Throwable = {
>   new SparkIndexOutOfBoundsException(errorClass = "INDEX_OUT_OF_BOUNDS",
>     Array(idx.toString))
> }
> {code}
> For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
> +The test must check:+
> # the entire error message
> # sqlState if it is defined in the error-classes.json file
> # the error class
[jira] [Created] (SPARK-38735) Test the error class: INTERNAL_ERROR
Max Gekk created SPARK-38735:

Summary: Test the error class: INTERNAL_ERROR
Key: SPARK-38735
URL: https://issues.apache.org/jira/browse/SPARK-38735
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Max Gekk

Add at least one test for the error class *INDEX_OUT_OF_BOUNDS* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors:
{code:scala}
def indexOutOfBoundsOfArrayDataError(idx: Int): Throwable = {
  new SparkIndexOutOfBoundsException(errorClass = "INDEX_OUT_OF_BOUNDS",
    Array(idx.toString))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class
[jira] [Updated] (SPARK-38735) Test the error class: INTERNAL_ERROR
[ https://issues.apache.org/jira/browse/SPARK-38735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-38735:
Description:
Add tests for the error class *INTERNAL_ERROR* to QueryExecutionErrorsSuite. The tests should cover the exceptions thrown in QueryExecutionErrors:
{code:scala}
def logicalHintOperatorNotRemovedDuringAnalysisError(): Throwable = {
  new SparkIllegalStateException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(
      "Internal error: logical hint operator should have been removed during analysis"))
}

def cannotEvaluateExpressionError(expression: Expression): Throwable = {
  new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(s"Cannot evaluate expression: $expression"))
}

def cannotGenerateCodeForExpressionError(expression: Expression): Throwable = {
  new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(s"Cannot generate code for expression: $expression"))
}

def cannotTerminateGeneratorError(generator: UnresolvedGenerator): Throwable = {
  new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(s"Cannot terminate expression: $generator"))
}

def methodNotDeclaredError(name: String): Throwable = {
  new SparkNoSuchMethodException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(
      s"""A method named "$name" is not declared in any enclosing class nor any supertype"""))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

was:
Add at least one test for the error class *INDEX_OUT_OF_BOUNDS* to QueryExecutionErrorsSuite. The test should cover the exception thrown in QueryExecutionErrors:
{code:scala}
def indexOutOfBoundsOfArrayDataError(idx: Int): Throwable = {
  new SparkIndexOutOfBoundsException(errorClass = "INDEX_OUT_OF_BOUNDS",
    Array(idx.toString))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

> Test the error class: INTERNAL_ERROR
> --
>
> Key: SPARK-38735
> URL: https://issues.apache.org/jira/browse/SPARK-38735
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Max Gekk
> Priority: Minor
> Labels: starter
>
> Add tests for the error class *INTERNAL_ERROR* to QueryExecutionErrorsSuite. The tests should cover the exceptions thrown in QueryExecutionErrors:
> {code:scala}
> def logicalHintOperatorNotRemovedDuringAnalysisError(): Throwable = {
>   new SparkIllegalStateException(errorClass = "INTERNAL_ERROR",
>     messageParameters = Array(
>       "Internal error: logical hint operator should have been removed during analysis"))
> }
>
> def cannotEvaluateExpressionError(expression: Expression): Throwable = {
>   new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
>     messageParameters = Array(s"Cannot evaluate expression: $expression"))
> }
>
> def cannotGenerateCodeForExpressionError(expression: Expression): Throwable = {
>   new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
>     messageParameters = Array(s"Cannot generate code for expression: $expression"))
> }
>
> def cannotTerminateGeneratorError(generator: UnresolvedGenerator): Throwable = {
>   new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
>     messageParameters = Array(s"Cannot terminate expression: $generator"))
> }
>
> def methodNotDeclaredError(name: String): Throwable = {
>   new SparkNoSuchMethodException(errorClass = "INTERNAL_ERROR",
>     messageParameters = Array(
>       s"""A method named "$name" is not declared in any enclosing class nor any supertype"""))
> }
> {code}
> For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
> +The test must check:+
> # the entire error message
> # sqlState if it is defined in the error-classes.json file
> # the error class
[jira] [Updated] (SPARK-38736) Test the error classes: INVALID_ARRAY_INDEX*
[ https://issues.apache.org/jira/browse/SPARK-38736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-38736:
Description:
Add tests for the error classes *INVALID_ARRAY_INDEX* to QueryExecutionErrorsSuite. The tests should cover the exceptions thrown in QueryExecutionErrors:
{code:scala}
private def invalidArrayIndexErrorInternal(
    index: Int,
    numElements: Int,
    key: String): ArrayIndexOutOfBoundsException = {
  new SparkArrayIndexOutOfBoundsException(errorClass = "INVALID_ARRAY_INDEX",
    messageParameters = Array(index.toString, numElements.toString, key))
}

def invalidElementAtIndexError(
    index: Int,
    numElements: Int): ArrayIndexOutOfBoundsException = {
  new SparkArrayIndexOutOfBoundsException(errorClass = "INVALID_ARRAY_INDEX_IN_ELEMENT_AT",
    messageParameters = Array(index.toString, numElements.toString, SQLConf.ANSI_ENABLED.key))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

was:
Add tests for the error class *INTERNAL_ERROR* to QueryExecutionErrorsSuite. The tests should cover the exceptions thrown in QueryExecutionErrors:
{code:scala}
def logicalHintOperatorNotRemovedDuringAnalysisError(): Throwable = {
  new SparkIllegalStateException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(
      "Internal error: logical hint operator should have been removed during analysis"))
}

def cannotEvaluateExpressionError(expression: Expression): Throwable = {
  new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(s"Cannot evaluate expression: $expression"))
}

def cannotGenerateCodeForExpressionError(expression: Expression): Throwable = {
  new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(s"Cannot generate code for expression: $expression"))
}

def cannotTerminateGeneratorError(generator: UnresolvedGenerator): Throwable = {
  new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(s"Cannot terminate expression: $generator"))
}

def methodNotDeclaredError(name: String): Throwable = {
  new SparkNoSuchMethodException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(
      s"""A method named "$name" is not declared in any enclosing class nor any supertype"""))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

> Test the error classes: INVALID_ARRAY_INDEX*
> --
>
> Key: SPARK-38736
> URL: https://issues.apache.org/jira/browse/SPARK-38736
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Max Gekk
> Priority: Minor
> Labels: starter
>
> Add tests for the error classes *INVALID_ARRAY_INDEX* to QueryExecutionErrorsSuite. The tests should cover the exceptions thrown in QueryExecutionErrors:
> {code:scala}
> private def invalidArrayIndexErrorInternal(
>     index: Int,
>     numElements: Int,
>     key: String): ArrayIndexOutOfBoundsException = {
>   new SparkArrayIndexOutOfBoundsException(errorClass = "INVALID_ARRAY_INDEX",
>     messageParameters = Array(index.toString, numElements.toString, key))
> }
>
> def invalidElementAtIndexError(
>     index: Int,
>     numElements: Int): ArrayIndexOutOfBoundsException = {
>   new SparkArrayIndexOutOfBoundsException(errorClass = "INVALID_ARRAY_INDEX_IN_ELEMENT_AT",
>     messageParameters = Array(index.toString, numElements.toString, SQLConf.ANSI_ENABLED.key))
> }
> {code}
> For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
> +The test must check:+
> # the entire error message
> # sqlState if it is defined in the error-classes.json file
> # the error class
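Both factories above pass three message parameters (index, element count, and a config key), so a test that pins the entire message has to reconstruct the rendered text from those parameters. The sketch below shows that reconstruction; the message template is an assumption for illustration, not the exact wording in Spark's error-classes.json.

```scala
// Assumed rendering of the INVALID_ARRAY_INDEX template from its three
// parameters; verify the real wording against error-classes.json before
// pinning it in a test.
def invalidArrayIndexMessage(index: Int, numElements: Int, key: String): String =
  s"The index $index is out of bounds. The array has $numElements elements. " +
    s"To return NULL instead, set $key to false."

// Build the expected message from the same values the code under test
// would pass as messageParameters.
val expected = invalidArrayIndexMessage(8, 5, "spark.sql.ansi.enabled")

// Every parameter must appear in the rendered message exactly once each.
assert(expected.contains("index 8"))
assert(expected.contains("5 elements"))
assert(expected.contains("spark.sql.ansi.enabled"))
println(expected)
```

Comparing the captured exception's message against a string built this way (rather than against `contains` checks alone) is what makes the "entire error message" requirement meaningful.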
[jira] [Created] (SPARK-38736) Test the error classes: INVALID_ARRAY_INDEX*
Max Gekk created SPARK-38736:

Summary: Test the error classes: INVALID_ARRAY_INDEX*
Key: SPARK-38736
URL: https://issues.apache.org/jira/browse/SPARK-38736
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Max Gekk

Add tests for the error class *INTERNAL_ERROR* to QueryExecutionErrorsSuite. The tests should cover the exceptions thrown in QueryExecutionErrors:
{code:scala}
def logicalHintOperatorNotRemovedDuringAnalysisError(): Throwable = {
  new SparkIllegalStateException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(
      "Internal error: logical hint operator should have been removed during analysis"))
}

def cannotEvaluateExpressionError(expression: Expression): Throwable = {
  new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(s"Cannot evaluate expression: $expression"))
}

def cannotGenerateCodeForExpressionError(expression: Expression): Throwable = {
  new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(s"Cannot generate code for expression: $expression"))
}

def cannotTerminateGeneratorError(generator: UnresolvedGenerator): Throwable = {
  new SparkUnsupportedOperationException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(s"Cannot terminate expression: $generator"))
}

def methodNotDeclaredError(name: String): Throwable = {
  new SparkNoSuchMethodException(errorClass = "INTERNAL_ERROR",
    messageParameters = Array(
      s"""A method named "$name" is not declared in any enclosing class nor any supertype"""))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class
[jira] [Created] (SPARK-38737) Test the error classes: INVALID_FIELD_NAME
Max Gekk created SPARK-38737:

Summary: Test the error classes: INVALID_FIELD_NAME
Key: SPARK-38737
URL: https://issues.apache.org/jira/browse/SPARK-38737
Project: Spark
Issue Type: Sub-task
Components: SQL
Affects Versions: 3.4.0
Reporter: Max Gekk

Add tests for the error classes *INVALID_ARRAY_INDEX* to QueryExecutionErrorsSuite. The tests should cover the exceptions thrown in QueryExecutionErrors:
{code:scala}
private def invalidArrayIndexErrorInternal(
    index: Int,
    numElements: Int,
    key: String): ArrayIndexOutOfBoundsException = {
  new SparkArrayIndexOutOfBoundsException(errorClass = "INVALID_ARRAY_INDEX",
    messageParameters = Array(index.toString, numElements.toString, key))
}

def invalidElementAtIndexError(
    index: Int,
    numElements: Int): ArrayIndexOutOfBoundsException = {
  new SparkArrayIndexOutOfBoundsException(errorClass = "INVALID_ARRAY_INDEX_IN_ELEMENT_AT",
    messageParameters = Array(index.toString, numElements.toString, SQLConf.ANSI_ENABLED.key))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class
[jira] [Updated] (SPARK-38737) Test the error classes: INVALID_FIELD_NAME
[ https://issues.apache.org/jira/browse/SPARK-38737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-38737:
Description:
Add tests for the error class *INVALID_FIELD_NAME* to QueryCompilationErrorsSuite. The tests should cover the exception thrown in QueryCompilationErrors:
{code:scala}
def invalidFieldName(fieldName: Seq[String], path: Seq[String], context: Origin): Throwable = {
  new AnalysisException(
    errorClass = "INVALID_FIELD_NAME",
    messageParameters = Array(fieldName.quoted, path.quoted),
    origin = context)
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

was:
Add tests for the error classes *INVALID_ARRAY_INDEX* to QueryExecutionErrorsSuite. The tests should cover the exceptions thrown in QueryExecutionErrors:
{code:scala}
private def invalidArrayIndexErrorInternal(
    index: Int,
    numElements: Int,
    key: String): ArrayIndexOutOfBoundsException = {
  new SparkArrayIndexOutOfBoundsException(errorClass = "INVALID_ARRAY_INDEX",
    messageParameters = Array(index.toString, numElements.toString, key))
}

def invalidElementAtIndexError(
    index: Int,
    numElements: Int): ArrayIndexOutOfBoundsException = {
  new SparkArrayIndexOutOfBoundsException(errorClass = "INVALID_ARRAY_INDEX_IN_ELEMENT_AT",
    messageParameters = Array(index.toString, numElements.toString, SQLConf.ANSI_ENABLED.key))
}
{code}
For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
+The test must check:+
# the entire error message
# sqlState if it is defined in the error-classes.json file
# the error class

> Test the error classes: INVALID_FIELD_NAME
> --
>
> Key: SPARK-38737
> URL: https://issues.apache.org/jira/browse/SPARK-38737
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.4.0
> Reporter: Max Gekk
> Priority: Minor
> Labels: starter
>
> Add tests for the error class *INVALID_FIELD_NAME* to QueryCompilationErrorsSuite. The tests should cover the exception thrown in QueryCompilationErrors:
> {code:scala}
> def invalidFieldName(fieldName: Seq[String], path: Seq[String], context: Origin): Throwable = {
>   new AnalysisException(
>     errorClass = "INVALID_FIELD_NAME",
>     messageParameters = Array(fieldName.quoted, path.quoted),
>     origin = context)
> }
> {code}
> For example, here is a test for the error class *UNSUPPORTED_FEATURE*: https://github.com/apache/spark/blob/34e3029a43d2a8241f70f2343be8285cb7f231b9/sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala#L151-L170
> +The test must check:+
> # the entire error message
> # sqlState if it is defined in the error-classes.json file
> # the error class
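One subtlety for an INVALID_FIELD_NAME test: the message parameters go through `fieldName.quoted` and `path.quoted`, so the expected message depends on how multipart names are quoted. The helper below approximates that behavior (backtick-quote any part that is not purely alphanumeric/underscore, join with dots); it is a sketch of the assumed quoting rule, not Spark's actual helper, so verify against real output before pinning the full message.

```scala
// Approximation of Spark's identifier quoting: leave simple parts bare,
// backtick-quote anything else, doubling embedded backticks.
def quoteIfNeeded(part: String): String =
  if (part.nonEmpty && part.forall(c => c.isLetterOrDigit || c == '_')) part
  else "`" + part.replace("`", "``") + "`"

// Approximation of the Seq[String].quoted extension used by invalidFieldName.
def quoted(parts: Seq[String]): String = parts.map(quoteIfNeeded).mkString(".")

// Simple parts join with dots...
assert(quoted(Seq("a", "b")) == "a.b")
// ...while a part containing a dot must itself be backtick-quoted,
// otherwise the expected message in the test would be ambiguous.
assert(quoted(Seq("a", "dotted.name")) == "a.`dotted.name`")
println(quoted(Seq("a", "dotted.name")))
```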