[jira] [Resolved] (SPARK-36247) check string length for char/varchar and apply type coercion in UPDATE/MERGE command
[ https://issues.apache.org/jira/browse/SPARK-36247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-36247. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33468 [https://github.com/apache/spark/pull/33468] > check string length for char/varchar and apply type coercion in UPDATE/MERGE > command > > > Key: SPARK-36247 > URL: https://issues.apache.org/jira/browse/SPARK-36247 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Fix For: 3.2.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
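For context, a minimal sketch of the length check this fix extends to UPDATE/MERGE, assuming Spark 3.1+ char/varchar semantics (SPARK-33480); the table name and the commented-out v2 UPDATE are hypothetical, since UPDATE/MERGE only run against catalogs that support them:
{code:python}
# INSERT has enforced char/varchar length limits since Spark 3.1;
# SPARK-36247 applies the same check (plus type coercion) to UPDATE/MERGE.
spark.sql("CREATE TABLE chars_demo (c CHAR(5), v VARCHAR(5)) USING parquet")
spark.sql("INSERT INTO chars_demo VALUES ('spark', 'spark')")  # fits: exactly 5 chars
spark.sql("INSERT INTO chars_demo VALUES ('toolong', 'x')")    # fails the length check
# hypothetical v2 table; plain v1 tables do not support UPDATE at all
# spark.sql("UPDATE cat.db.chars_demo SET c = 'toolong' WHERE v = 'x'")  # now also checked
{code}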
[jira] [Assigned] (SPARK-36247) check string length for char/varchar and apply type coercion in UPDATE/MERGE command
[ https://issues.apache.org/jira/browse/SPARK-36247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-36247: --- Assignee: Wenchen Fan > check string length for char/varchar and apply type coercion in UPDATE/MERGE > command > > > Key: SPARK-36247 > URL: https://issues.apache.org/jira/browse/SPARK-36247 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36136) Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive
[ https://issues.apache.org/jira/browse/SPARK-36136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387778#comment-17387778 ] Apache Spark commented on SPARK-36136: -- User 'viirya' has created a pull request for this issue: https://github.com/apache/spark/pull/33533 > Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive > > > Key: SPARK-36136 > URL: https://issues.apache.org/jira/browse/SPARK-36136 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.2.0 >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Minor > Fix For: 3.2.0 > > > Currently both {{PruneFileSourcePartitionsSuite}} and > {{PrunePartitionSuiteBase}} are in {{org.apache.spark.sql.hive.execution}} > which doesn't look right. They should belong to > {{org.apache.spark.sql.execution.datasources}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387766#comment-17387766 ] Fu Chen commented on SPARK-36277: - This bug also persists when running on the latest master branch. The user-defined schema is pruned by the rule `ColumnPruning`: {noformat} === Applying Rule org.apache.spark.sql.catalyst.optimizer.ColumnPruning === Aggregate [count(1) AS count#29L] Aggregate [count(1) AS count#29L] !+- Relation [firstname#0,middlename#1,lastname#2,id#3,gender#4,salary#5] csv +- Project ! +- Relation [firstname#0,middlename#1,lastname#2,id#3,gender#4,salary#5] csv {noformat} {noformat} *(2) HashAggregate(keys=[], functions=[count(1)], output=[count#29L]) +- ShuffleQueryStage 0 +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [id=#22] +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#32L]) +- FileScan csv [] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/tmp/sample.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<> {noformat} But note that when we read CSV files with DROPMALFORMED mode, the `UnivocityParser` needs the schema to judge whether a record is corrupt or not, so `FileScan` scans all records in the CSV file including corrupted records (salary = 'NA' is corrupt because the user-defined schema declares salary as Integer). When we disable the rule `ColumnPruning`, the result is what we want (in other words, `UnivocityParser` needs the schema when we scan CSV files in DROPMALFORMED mode): {code:java} spark.sql("set spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ColumnPruning") {code} Any suggestions or ideas to fix this bug? [~hyukjin.kwon] > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . 
> here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ > StructField("lastname",StringType(),True), \ > StructField("id", StringType(), True), \ > StructField("gender", StringType(), True), \ > StructField("salary", IntegerType(), True) \ > ]) > with_schema_df = > spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") > print("The dataframe with schema") > with_schema_df.show() > print("The dataframe without schema") > without_schema_df.show() > cnt_with_schema=with_schema_df.count() > print("The records count from with schema df :"+str(cnt_with_schema)) > cnt_without_schema=without_schema_df.count() > print("The records count from without schema df: "+str(cnt_without_schema)) > {code} > here is the outputs screen shot 111.PNG is the outputs of the code and > inputfile.csv is the input to the code > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
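Since the attached sample.csv and S3 path are not reproduced here, a self-contained local sketch of the reported mismatch, with hypothetical data mirroring the description (a salary of 'NA' that violates the Integer field); see Fu Chen's comment above for the ColumnPruning root cause:
{code:python}
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

path = "/tmp/sample_csv"  # hypothetical local stand-in for the reporter's S3 path
spark.createDataFrame(
    [("James", "", "Smith", "36636", "M", "3000"),
     ("Maria", "Anne", "Jones", "39192", "F", "NA")],  # malformed: salary is not an int
    ["firstname", "middlename", "lastname", "id", "gender", "salary"],
).write.mode("overwrite").option("header", True).csv(path)

schema = StructType([
    StructField("firstname", StringType(), True),
    StructField("middlename", StringType(), True),
    StructField("lastname", StringType(), True),
    StructField("id", StringType(), True),
    StructField("gender", StringType(), True),
    StructField("salary", IntegerType(), True),
])
df = spark.read.csv(path, header=True, schema=schema, mode="DROPMALFORMED")
df.show()          # the malformed row is dropped from the display
print(df.count())  # can still report 2: ColumnPruning empties the read schema
{code}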
[jira] [Assigned] (SPARK-36285) Skip MiMa in PySpark GHA job
[ https://issues.apache.org/jira/browse/SPARK-36285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36285: Assignee: Apache Spark > Skip MiMa in PySpark GHA job > > > Key: SPARK-36285 > URL: https://issues.apache.org/jira/browse/SPARK-36285 > Project: Spark > Issue Type: Improvement > Components: Project Infra, Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36285) Skip MiMa in PySpark GHA job
[ https://issues.apache.org/jira/browse/SPARK-36285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36285: Assignee: (was: Apache Spark) > Skip MiMa in PySpark GHA job > > > Key: SPARK-36285 > URL: https://issues.apache.org/jira/browse/SPARK-36285 > Project: Spark > Issue Type: Improvement > Components: Project Infra, Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36285) Skip MiMa in PySpark GHA job
[ https://issues.apache.org/jira/browse/SPARK-36285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387765#comment-17387765 ] Apache Spark commented on SPARK-36285: -- User 'williamhyun' has created a pull request for this issue: https://github.com/apache/spark/pull/33532 > Skip MiMa in PySpark GHA job > > > Key: SPARK-36285 > URL: https://issues.apache.org/jira/browse/SPARK-36285 > Project: Spark > Issue Type: Improvement > Components: Project Infra, Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36312) ParquetWriter should check inner field
[ https://issues.apache.org/jira/browse/SPARK-36312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36312: Assignee: Apache Spark > ParquetWriter should check inner field > --- > > Key: SPARK-36312 > URL: https://issues.apache.org/jira/browse/SPARK-36312 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36312) ParquetWriter should check inner field
[ https://issues.apache.org/jira/browse/SPARK-36312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36312: Assignee: (was: Apache Spark) > ParquetWriter should check inner field > --- > > Key: SPARK-36312 > URL: https://issues.apache.org/jira/browse/SPARK-36312 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36312) ParquetWriter should check inner field
[ https://issues.apache.org/jira/browse/SPARK-36312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387755#comment-17387755 ] Apache Spark commented on SPARK-36312: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/33531 > ParquetWriter should check inner field > --- > > Key: SPARK-36312 > URL: https://issues.apache.org/jira/browse/SPARK-36312 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36312) ParquetWriter should check inner field
angerszhu created SPARK-36312: - Summary: ParquetWriter should check inner field Key: SPARK-36312 URL: https://issues.apache.org/jira/browse/SPARK-36312 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: angerszhu -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387733#comment-17387733 ] anju edited comment on SPARK-36277 at 7/27/21, 3:33 AM: [~hyukjin.kwon] Sure, let me check and update. Which version would you suggest? was (Author: datumgirl): Sure let me check and update > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . > here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ > StructField("lastname",StringType(),True), \ > StructField("id", StringType(), True), \ > StructField("gender", StringType(), True), \ > StructField("salary", IntegerType(), True) \ > ]) > with_schema_df = > spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") > print("The dataframe with schema") > with_schema_df.show() > print("The dataframe without schema") > without_schema_df.show() > cnt_with_schema=with_schema_df.count() > print("The records count from with schema df :"+str(cnt_with_schema)) > cnt_without_schema=without_schema_df.count() > print("The records count from without schema df: "+str(cnt_without_schema)) > {code} > here is the outputs screen shot 111.PNG is the outputs of the code and > inputfile.csv is the input to the code > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387733#comment-17387733 ] anju commented on SPARK-36277: -- Sure let me check and update > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . > here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ > StructField("lastname",StringType(),True), \ > StructField("id", StringType(), True), \ > StructField("gender", StringType(), True), \ > StructField("salary", IntegerType(), True) \ > ]) > with_schema_df = > spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") > print("The dataframe with schema") > with_schema_df.show() > print("The dataframe without schema") > without_schema_df.show() > cnt_with_schema=with_schema_df.count() > print("The records count from with schema df :"+str(cnt_with_schema)) > cnt_without_schema=without_schema_df.count() > print("The records count from without schema df: "+str(cnt_without_schema)) > {code} > here is the outputs screen shot 111.PNG is the outputs of the code and > inputfile.csv is the input to the code > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36288) Update API usage on pyspark pandas documents
[ https://issues.apache.org/jira/browse/SPARK-36288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36288: Assignee: Leona Yoda > Update API usage on pyspark pandas documents > > > Key: SPARK-36288 > URL: https://issues.apache.org/jira/browse/SPARK-36288 > Project: Spark > Issue Type: Improvement > Components: Documentation, PySpark >Affects Versions: 3.2.0 >Reporter: Leona Yoda >Assignee: Leona Yoda >Priority: Minor > > I found several warning messages when I tested around ported Spark Pandas API > Documents (https://issues.apache.org/jira/browse/SPARK-34885). > 1. `spark.sql.execution.arrow.enabled` on Best Practice document > {code:java} > 21/07/26 05:42:02 WARN SQLConf: The SQL config > 'spark.sql.execution.arrow.enabled' has been deprecated in Spark v3.0 and may > be removed in the future. Use 'spark.sql.execution.arrow.pyspark.enabled' > instead of it. > {code} > > 2. `DataFrame.to_spark_io` on From/to other DBMSes document > {code:java} > /opt/spark/python/lib/pyspark.zip/pyspark/pandas/frame.py:4811: > FutureWarning: Deprecated in 3.2, Use spark.to_spark_io instead. > warnings.warn("Deprecated in 3.2, Use spark.to_spark_io instead.", > FutureWarning) > {code} > > At this time it worked but I think it's better to update API usage on those > documents. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36288) Update API usage on pyspark pandas documents
[ https://issues.apache.org/jira/browse/SPARK-36288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36288. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33519 [https://github.com/apache/spark/pull/33519] > Update API usage on pyspark pandas documents > > > Key: SPARK-36288 > URL: https://issues.apache.org/jira/browse/SPARK-36288 > Project: Spark > Issue Type: Improvement > Components: Documentation, PySpark >Affects Versions: 3.2.0 >Reporter: Leona Yoda >Assignee: Leona Yoda >Priority: Minor > Fix For: 3.2.0 > > > I found several warning messages when I tested around ported Spark Pandas API > Documents (https://issues.apache.org/jira/browse/SPARK-34885). > 1. `spark.sql.execution.arrow.enabled` on Best Practice document > {code:java} > 21/07/26 05:42:02 WARN SQLConf: The SQL config > 'spark.sql.execution.arrow.enabled' has been deprecated in Spark v3.0 and may > be removed in the future. Use 'spark.sql.execution.arrow.pyspark.enabled' > instead of it. > {code} > > 2. `DataFrame.to_spark_io` on From/to other DBMSes document > {code:java} > /opt/spark/python/lib/pyspark.zip/pyspark/pandas/frame.py:4811: > FutureWarning: Deprecated in 3.2, Use spark.to_spark_io instead. > warnings.warn("Deprecated in 3.2, Use spark.to_spark_io instead.", > FutureWarning) > {code} > > At this time it worked but I think it's better to update API usage on those > documents. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
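A short sketch of the replacements those documents now point to, taken from the deprecation warnings quoted above (the output path is hypothetical):
{code:python}
import pyspark.pandas as ps

# the deprecated 'spark.sql.execution.arrow.enabled' becomes:
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")

psdf = ps.DataFrame({"a": [1, 2, 3]})
# DataFrame.to_spark_io is deprecated in 3.2; the spark accessor replaces it
psdf.spark.to_spark_io("/tmp/out_orc", format="orc", mode="overwrite")
{code}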
[jira] [Resolved] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36267. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33528 [https://github.com/apache/spark/pull/33528] > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Assignee: Takuya Ueshin >Priority: Major > Fix For: 3.2.0 > > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387729#comment-17387729 ] Hyukjin Kwon commented on SPARK-36277: -- [~datumgirl], Spark 2.4 is EOL so the bug won't likely be fixed. Can you try higher versions of Spark and see if the bug persists? > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . > here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ > StructField("lastname",StringType(),True), \ > StructField("id", StringType(), True), \ > StructField("gender", StringType(), True), \ > StructField("salary", IntegerType(), True) \ > ]) > with_schema_df = > spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") > print("The dataframe with schema") > with_schema_df.show() > print("The dataframe without schema") > without_schema_df.show() > cnt_with_schema=with_schema_df.count() > print("The records count from with schema df :"+str(cnt_with_schema)) > cnt_without_schema=without_schema_df.count() > print("The records count from without schema df: "+str(cnt_without_schema)) > {code} > here is the outputs screen shot 111.PNG is the outputs of the code and > inputfile.csv is the input to the code > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36277: - Description: I am writing the steps to reproduce the issue for "count" pyspark api while using mode as dropmalformed. I have a csv sample file in s3 bucket . I am reading the file using pyspark api for csv . I am reading the csv "without schema" and "with schema using mode 'dropmalformed' options in two different dataframes . While displaying the "with schema using mode 'dropmalformed'" dataframe , the display looks good ,it is not showing the malformed records .But when we apply count api on the dataframe it gives the record count of actual file. I am expecting it should give me valid record count . here is the code used:- {code} without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) schema = StructType([ \ StructField("firstname",StringType(),True), \ StructField("middlename",StringType(),True), \ StructField("lastname",StringType(),True), \ StructField("id", StringType(), True), \ StructField("gender", StringType(), True), \ StructField("salary", IntegerType(), True) \ ]) with_schema_df = spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") print("The dataframe with schema") with_schema_df.show() print("The dataframe without schema") without_schema_df.show() cnt_with_schema=with_schema_df.count() print("The records count from with schema df :"+str(cnt_with_schema)) cnt_without_schema=without_schema_df.count() print("The records count from without schema df: "+str(cnt_without_schema)) {code} here is the outputs screen shot 111.PNG is the outputs of the code and inputfile.csv is the input to the code was: I am writing the steps to reproduce the issue for "count" pyspark api while using mode as dropmalformed. I have a csv sample file in s3 bucket . I am reading the file using pyspark api for csv . I am reading the csv "without schema" and "with schema using mode 'dropmalformed' options in two different dataframes . While displaying the "with schema using mode 'dropmalformed'" dataframe , the display looks good ,it is not showing the malformed records .But when we apply count api on the dataframe it gives the record count of actual file. I am expecting it should give me valid record count . 
here is the code used:- ``` without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) schema = StructType([ \ StructField("firstname",StringType(),True), \ StructField("middlename",StringType(),True), \ StructField("lastname",StringType(),True), \ StructField("id", StringType(), True), \ StructField("gender", StringType(), True), \ StructField("salary", IntegerType(), True) \ ]) with_schema_df = spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") print("The dataframe with schema") with_schema_df.show() print("The dataframe without schema") without_schema_df.show() cnt_with_schema=with_schema_df.count() print("The records count from with schema df :"+str(cnt_with_schema)) cnt_without_schema=without_schema_df.count() print("The records count from without schema df: "+str(cnt_without_schema)) ``` here is the outputs screen shot 111.PNG is the outputs of the code and inputfile.csv is the input to the code > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . > here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ >
[jira] [Comment Edited] (SPARK-36285) Skip MiMa in PySpark GHA job
[ https://issues.apache.org/jira/browse/SPARK-36285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387073#comment-17387073 ] Hyukjin Kwon edited comment on SPARK-36285 at 7/27/21, 3:21 AM: Like SPARK-36198, could you make a PR for this, [~williamhyun]? cc [~hyukjin.kwon] was (Author: dongjoon): Like SPARK-36198, could you make a PR for this, [~williamhyun]? cc [~dongjoon] > Skip MiMa in PySpark GHA job > > > Key: SPARK-36285 > URL: https://issues.apache.org/jira/browse/SPARK-36285 > Project: Spark > Issue Type: Improvement > Components: Project Infra, Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36267: Assignee: Takuya Ueshin > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Assignee: Takuya Ueshin >Priority: Major > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36097) Group exception messages in core/scheduler
[ https://issues.apache.org/jira/browse/SPARK-36097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387724#comment-17387724 ] Apache Spark commented on SPARK-36097: -- User 'dgd-contributor' has created a pull request for this issue: https://github.com/apache/spark/pull/33529 > Group exception messages in core/scheduler > -- > > Key: SPARK-36097 > URL: https://issues.apache.org/jira/browse/SPARK-36097 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Priority: Major > > 'core/src/main/scala/org/apache/spark/scheduler' > || Filename|| Count || > | DAGScheduler.scala | 4 | > | LiveListenerBus.scala | 1 | > | TaskInfo.scala | 1 | > | TaskSchedulerImpl.scala | 4 | > | TaskSetManager.scala| 1 | > 'core/src/main/scala/org/apache/spark/scheduler/cluster' > || Filename|| Count || > | CoarseGrainedSchedulerBackend.scala | 2 | > 'core/src/main/scala/org/apache/spark/scheduler/dynalloc' > || Filename || Count || > | ExecutorMonitor.scala | 1 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36098) Group exception messages in core/storage
[ https://issues.apache.org/jira/browse/SPARK-36098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36098: Assignee: Apache Spark > Group exception messages in core/storage > > > Key: SPARK-36098 > URL: https://issues.apache.org/jira/browse/SPARK-36098 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Assignee: Apache Spark >Priority: Major > > 'core/src/main/scala/org/apache/spark/storage' > || Filename || Count || > | BlockId.scala | 1 | > | BlockInfoManager.scala| 2 | > | BlockManager.scala| 9 | > | BlockManagerDecommissioner.scala | 1 | > | BlockManagerMaster.scala | 2 | > | DiskBlockManager.scala| 1 | > | DiskBlockObjectWriter.scala | 1 | > | ShuffleBlockFetcherIterator.scala | 4 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36098) Group exception messages in core/storage
[ https://issues.apache.org/jira/browse/SPARK-36098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36098: Assignee: (was: Apache Spark) > Group exception messages in core/storage > > > Key: SPARK-36098 > URL: https://issues.apache.org/jira/browse/SPARK-36098 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Priority: Major > > 'core/src/main/scala/org/apache/spark/storage' > || Filename || Count || > | BlockId.scala | 1 | > | BlockInfoManager.scala| 2 | > | BlockManager.scala| 9 | > | BlockManagerDecommissioner.scala | 1 | > | BlockManagerMaster.scala | 2 | > | DiskBlockManager.scala| 1 | > | DiskBlockObjectWriter.scala | 1 | > | ShuffleBlockFetcherIterator.scala | 4 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36097) Group exception messages in core/scheduler
[ https://issues.apache.org/jira/browse/SPARK-36097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36097: Assignee: (was: Apache Spark) > Group exception messages in core/scheduler > -- > > Key: SPARK-36097 > URL: https://issues.apache.org/jira/browse/SPARK-36097 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Priority: Major > > 'core/src/main/scala/org/apache/spark/scheduler' > || Filename|| Count || > | DAGScheduler.scala | 4 | > | LiveListenerBus.scala | 1 | > | TaskInfo.scala | 1 | > | TaskSchedulerImpl.scala | 4 | > | TaskSetManager.scala| 1 | > 'core/src/main/scala/org/apache/spark/scheduler/cluster' > || Filename|| Count || > | CoarseGrainedSchedulerBackend.scala | 2 | > 'core/src/main/scala/org/apache/spark/scheduler/dynalloc' > || Filename || Count || > | ExecutorMonitor.scala | 1 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36097) Group exception messages in core/scheduler
[ https://issues.apache.org/jira/browse/SPARK-36097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36097: Assignee: Apache Spark > Group exception messages in core/scheduler > -- > > Key: SPARK-36097 > URL: https://issues.apache.org/jira/browse/SPARK-36097 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Assignee: Apache Spark >Priority: Major > > 'core/src/main/scala/org/apache/spark/scheduler' > || Filename|| Count || > | DAGScheduler.scala | 4 | > | LiveListenerBus.scala | 1 | > | TaskInfo.scala | 1 | > | TaskSchedulerImpl.scala | 4 | > | TaskSetManager.scala| 1 | > 'core/src/main/scala/org/apache/spark/scheduler/cluster' > || Filename|| Count || > | CoarseGrainedSchedulerBackend.scala | 2 | > 'core/src/main/scala/org/apache/spark/scheduler/dynalloc' > || Filename || Count || > | ExecutorMonitor.scala | 1 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36098) Group exception messages in core/storage
[ https://issues.apache.org/jira/browse/SPARK-36098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36098: Assignee: Apache Spark > Group exception messages in core/storage > > > Key: SPARK-36098 > URL: https://issues.apache.org/jira/browse/SPARK-36098 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Assignee: Apache Spark >Priority: Major > > 'core/src/main/scala/org/apache/spark/storage' > || Filename || Count || > | BlockId.scala | 1 | > | BlockInfoManager.scala| 2 | > | BlockManager.scala| 9 | > | BlockManagerDecommissioner.scala | 1 | > | BlockManagerMaster.scala | 2 | > | DiskBlockManager.scala| 1 | > | DiskBlockObjectWriter.scala | 1 | > | ShuffleBlockFetcherIterator.scala | 4 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36098) Group exception messages in core/storage
[ https://issues.apache.org/jira/browse/SPARK-36098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387723#comment-17387723 ] Apache Spark commented on SPARK-36098: -- User 'dgd-contributor' has created a pull request for this issue: https://github.com/apache/spark/pull/33530 > Group exception messages in core/storage > > > Key: SPARK-36098 > URL: https://issues.apache.org/jira/browse/SPARK-36098 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Priority: Major > > 'core/src/main/scala/org/apache/spark/storage' > || Filename || Count || > | BlockId.scala | 1 | > | BlockInfoManager.scala| 2 | > | BlockManager.scala| 9 | > | BlockManagerDecommissioner.scala | 1 | > | BlockManagerMaster.scala | 2 | > | DiskBlockManager.scala| 1 | > | DiskBlockObjectWriter.scala | 1 | > | ShuffleBlockFetcherIterator.scala | 4 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36291) Refactor second set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387720#comment-17387720 ] dgd_contributor commented on SPARK-36291: - working on this > Refactor second set of 20 query execution errors to use error classes > - > > Key: SPARK-36291 > URL: https://issues.apache.org/jira/browse/SPARK-36291 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the second set of 20. > {code:java} > inputTypeUnsupportedError > invalidFractionOfSecondError > overflowInSumOfDecimalError > overflowInIntegralDivideError > mapSizeExceedArraySizeWhenZipMapError > copyNullFieldNotAllowedError > literalTypeUnsupportedError > noDefaultForDataTypeError > doGenCodeOfAliasShouldNotBeCalledError > orderedOperationUnsupportedByDataTypeError > regexGroupIndexLessThanZeroError > regexGroupIndexExceedGroupCountError > invalidUrlError > dataTypeOperationUnsupportedError > mergeUnsupportedByWindowFunctionError > dataTypeUnexpectedError > typeUnsupportedError > negativeValueUnexpectedError > addNewFunctionMismatchedWithFunctionError > cannotGenerateCodeForUncomparableTypeError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36142) Adjust exponentiation between Series with missing values and bool literal to follow pandas
[ https://issues.apache.org/jira/browse/SPARK-36142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36142: Assignee: Yikun Jiang > Adjust exponentiation between Series with missing values and bool literal to > follow pandas > -- > > Key: SPARK-36142 > URL: https://issues.apache.org/jira/browse/SPARK-36142 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Assignee: Yikun Jiang >Priority: Major > > Currently, exponentiation between ExtensionDtypes and bools is not consistent > with pandas' behavior. > > {code:java} > >>> pser = pd.Series([1, 2, np.nan], dtype=float) > >>> psser = ps.from_pandas(pser) > >>> pser ** False > 0 1.0 > 1 1.0 > 2 1.0 > dtype: float64 > >>> psser ** False > 0 1.0 > 1 1.0 > 2 NaN > dtype: float64 > {code} > We ought to adjust that. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36142) Adjust exponentiation between Series with missing values and bool literal to follow pandas
[ https://issues.apache.org/jira/browse/SPARK-36142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36142. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33521 [https://github.com/apache/spark/pull/33521] > Adjust exponentiation between Series with missing values and bool literal to > follow pandas > -- > > Key: SPARK-36142 > URL: https://issues.apache.org/jira/browse/SPARK-36142 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Assignee: Yikun Jiang >Priority: Major > Fix For: 3.2.0 > > > Currently, exponentiation between ExtensionDtypes and bools is not consistent > with pandas' behavior. > > {code:java} > >>> pser = pd.Series([1, 2, np.nan], dtype=float) > >>> psser = ps.from_pandas(pser) > >>> pser ** False > 0 1.0 > 1 1.0 > 2 1.0 > dtype: float64 > >>> psser ** False > 0 1.0 > 1 1.0 > 2 NaN > dtype: float64 > {code} > We ought to adjust that. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
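For reference, the pandas semantics the fix follows: raising to the power False (0) returns 1 without evaluating the base, so even NaN ** False is 1.0. A quick check, assuming Spark 3.2 with this fix applied:
{code:python}
import numpy as np
import pandas as pd
import pyspark.pandas as ps

pser = pd.Series([1, 2, np.nan], dtype=float)
print(pser ** False)                  # 1.0, 1.0, 1.0 in pandas
print(ps.from_pandas(pser) ** False)  # now matches pandas: 1.0, 1.0, 1.0
{code}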
[jira] [Commented] (SPARK-36107) Refactor first set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387715#comment-17387715 ] PengLei commented on SPARK-36107: - Working on this. > Refactor first set of 20 query execution errors to use error classes > > > Key: SPARK-36107 > URL: https://issues.apache.org/jira/browse/SPARK-36107 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the first set of 20. > {code:java} > columnChangeUnsupportedError > logicalHintOperatorNotRemovedDuringAnalysisError > cannotEvaluateExpressionError > cannotGenerateCodeForExpressionError > cannotTerminateGeneratorError > castingCauseOverflowError > cannotChangeDecimalPrecisionError > invalidInputSyntaxForNumericError > cannotCastFromNullTypeError > cannotCastError > cannotParseDecimalError > simpleStringWithNodeIdUnsupportedError > evaluateUnevaluableAggregateUnsupportedError > dataTypeUnsupportedError > dataTypeUnsupportedError > failedExecuteUserDefinedFunctionError > divideByZeroError > invalidArrayIndexError > mapKeyNotExistError > rowFromCSVParserNotExpectedError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36266) Rename classes in shuffle RPC used for block push operations
[ https://issues.apache.org/jira/browse/SPARK-36266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan resolved SPARK-36266. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33340 [https://github.com/apache/spark/pull/33340] > Rename classes in shuffle RPC used for block push operations > > > Key: SPARK-36266 > URL: https://issues.apache.org/jira/browse/SPARK-36266 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.1.0 >Reporter: Min Shen >Priority: Major > Fix For: 3.2.0 > > > In the current implementation of push-based shuffle, we are reusing certain > code between both block fetch and block push. > This is generally good except that certain classes that are meant to be used > for both block fetch and block push now have names that indicate they are > only for block fetches, which is confusing. > This ticket renames these classes to be more generic to be reused across both > block fetch and block push. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36311) DESC TABLE in v2 should show more detailed information
angerszhu created SPARK-36311: - Summary: DESC TABLE in v2 should show more detailed information Key: SPARK-36311 URL: https://issues.apache.org/jira/browse/SPARK-36311 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: angerszhu According to https://issues.apache.org/jira/browse/SPARK-36086?focusedCommentId=17383195=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17383195 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36267: Assignee: (was: Apache Spark) > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Priority: Major > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36267: Assignee: Apache Spark > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Assignee: Apache Spark >Priority: Major > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387665#comment-17387665 ] Apache Spark commented on SPARK-36267: -- User 'ueshin' has created a pull request for this issue: https://github.com/apache/spark/pull/33528 > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Priority: Major > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36260) Add set_categories to CategoricalAccessor and CategoricalIndex
[ https://issues.apache.org/jira/browse/SPARK-36260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-36260. --- Fix Version/s: 3.2.0 Assignee: Xinrong Meng Resolution: Fixed Issue resolved by pull request 33506 https://github.com/apache/spark/pull/33506 > Add set_categories to CategoricalAccessor and CategoricalIndex > -- > > Key: SPARK-36260 > URL: https://issues.apache.org/jira/browse/SPARK-36260 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Fix For: 3.2.0 > > > Add set_categories to CategoricalAccessor and CategoricalIndex -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
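A usage sketch of the new API, mirroring pandas' CategoricalAccessor.set_categories (the data is hypothetical):
{code:python}
import pyspark.pandas as ps

psser = ps.Series(["a", "b", "c", "a"], dtype="category")
# values not in the new category list become missing, as in pandas
print(psser.cat.set_categories(["b", "c"]))
{code}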
[jira] [Updated] (SPARK-36310) Fix hasnan(), any(), and all() window function in IndexOpsMixin
[ https://issues.apache.org/jira/browse/SPARK-36310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36310: - Summary: Fix hasnan(), any(), and all() window function in IndexOpsMixin (was: Fix hasnan(), any(), and all() window function) > Fix hasnan(), any(), and all() window function in IndexOpsMixin > --- > > Key: SPARK-36310 > URL: https://issues.apache.org/jira/browse/SPARK-36310 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > > {code:java} > File "/__w/spark/spark/python/pyspark/pandas/groupby.py", line 1497, in > pyspark.pandas.groupby.GroupBy.rank > Failed example: > df.groupby("a").rank().sort_index() > Exception raised: > ... > pyspark.sql.utils.AnalysisException: It is not allowed to use a window > function inside an aggregate function. Please use the inner window function > in a sub-query. > {code} > As shown above, hasnans() used in "rank" causes "It is not allowed to use a > window function inside an aggregate function" exception. > any() and all() have the same issue. > We shall adjust that. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36310) Fix hasnan(), any(), and all() window function
Xinrong Meng created SPARK-36310: Summary: Fix hasnan(), any(), and all() window function Key: SPARK-36310 URL: https://issues.apache.org/jira/browse/SPARK-36310 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 3.2.0 Reporter: Xinrong Meng {code:java} File "/__w/spark/spark/python/pyspark/pandas/groupby.py", line 1497, in pyspark.pandas.groupby.GroupBy.rank Failed example: df.groupby("a").rank().sort_index() Exception raised: ... pyspark.sql.utils.AnalysisException: It is not allowed to use a window function inside an aggregate function. Please use the inner window function in a sub-query. {code} As shown above, hasnans() used in "rank" causes "It is not allowed to use a window function inside an aggregate function" exception. any() and all() have the same issue. We shall adjust that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
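For background, the restriction behind the quoted exception is general Spark SQL behavior: a window expression cannot sit inside an aggregate and has to be computed in a separate step first. A minimal sketch with hypothetical column names:
{code:python}
from pyspark.sql import Window, functions as F

df = spark.createDataFrame([("a", 1), ("a", 2), ("b", 3)], ["k", "v"])
w = Window.partitionBy("k").orderBy("v")

# df.groupBy("k").agg(F.max(F.rank().over(w)))  # AnalysisException: window inside aggregate

# workaround: materialize the window column first, then aggregate over it
df.withColumn("r", F.rank().over(w)).groupBy("k").agg(F.max("r")).show()
{code}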
[jira] [Commented] (SPARK-36028) Allow Project to host outer references in scalar subqueries
[ https://issues.apache.org/jira/browse/SPARK-36028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387637#comment-17387637 ] Apache Spark commented on SPARK-36028: -- User 'allisonwang-db' has created a pull request for this issue: https://github.com/apache/spark/pull/33527 > Allow Project to host outer references in scalar subqueries > --- > > Key: SPARK-36028 > URL: https://issues.apache.org/jira/browse/SPARK-36028 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Fix For: 3.3.0 > > > Support Project to host outer references in subqueries, for example: > {code:sql} > SELECT (SELECT c1) FROM t > {code} > Currently, it will throw AnalysisException: > {code} > org.apache.spark.sql.AnalysisException: Expressions referencing the outer > query are not supported outside of WHERE/HAVING clauses > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
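A sketch of the now-supported query shape from the ticket, using a hypothetical temp view; the scalar subquery's Project hosts the outer reference c1, and the query is semantically just SELECT c1 FROM t:
{code:python}
spark.range(3).selectExpr("id AS c1").createOrReplaceTempView("t")
# raised the quoted AnalysisException before this change; works in 3.3.0
spark.sql("SELECT (SELECT c1) FROM t").show()
{code}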
[jira] [Commented] (SPARK-35211) Support UDT for Pandas with Arrow Disabled
[ https://issues.apache.org/jira/browse/SPARK-35211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387630#comment-17387630 ] L. C. Hsieh commented on SPARK-35211: - The Jira title looks more like a new feature/improvement. But from the description and the PR, looks like it is a bug? Could you also update it with a proper Jira title? Thanks. > Support UDT for Pandas with Arrow Disabled > -- > > Key: SPARK-35211 > URL: https://issues.apache.org/jira/browse/SPARK-35211 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.1.1 >Reporter: Darcy Shen >Priority: Major > Labels: correctness > > {code:java} > $ pip freeze > certifi==2020.12.5 > coverage==5.5 > flake8==3.9.0 > mccabe==0.6.1 > mypy==0.812 > mypy-extensions==0.4.3 > numpy==1.20.1 > pandas==1.2.3 > pyarrow==2.0.0 > pycodestyle==2.7.0 > pyflakes==2.3.0 > python-dateutil==2.8.1 > pytz==2021.1 > scipy==1.6.1 > six==1.15.0 > typed-ast==1.4.2 > typing-extensions==3.7.4.3 > xmlrunner==1.7.7 > {code} > {code} > (spark) ➜ spark git:(master) bin/pyspark > Python 3.8.8 (default, Feb 24 2021, 13:46:16) > [Clang 10.0.0 ] :: Anaconda, Inc. on darwin > Type "help", "copyright", "credits" or "license" for more information. > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > Setting default log level to "WARN". > To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use > setLogLevel(newLevel). > 21/04/24 15:51:29 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... using builtin-java classes where applicable > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/__ / .__/\_,_/_/ /_/\_\ version 3.2.0-SNAPSHOT > /_/ > Using Python version 3.8.8 (default, Feb 24 2021 13:46:16) > Spark context Web UI available at http://172.30.0.12:4040 > Spark context available as 'sc' (master = local[*], app id = > local-1619250689842). > SparkSession available as 'spark'. > >>> spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false") > >>> from pyspark.testing.sqlutils import ExamplePoint > >>> > >>> import pandas as pd > >>> > >>> pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1, 1), > >>> ExamplePoint(2, 2)])}) > >>> > >>> df = spark.createDataFrame(pdf) > >>> > >>> df.show() > +--+ > | point| > +--+ > |(0.0, 0.0)| > |(0.0, 0.0)| > +--+ > >>> df.toPandas() >point > 0 (0.0,0.0) > 1 (0.0,0.0) > >>> > >>> > {code} > The correct result should be: > {code} >point > 0 (1.0,1.0) > 1 (2.0,2.0) > {code} > The following code snippet works fine: > {code} > (spark) ➜ spark git:(sadhen/SPARK-35211) ✗ bin/pyspark > Python 3.8.8 (default, Feb 24 2021, 13:46:16) > [Clang 10.0.0 ] :: Anaconda, Inc. on darwin > Type "help", "copyright", "credits" or "license" for more information. > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > Setting default log level to "WARN". > To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use > setLogLevel(newLevel). > 21/04/24 17:08:09 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... using builtin-java classes where applicable > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/__ / .__/\_,_/_/ /_/\_\ version 3.2.0-SNAPSHOT > /_/ > Using Python version 3.8.8 (default, Feb 24 2021 13:46:16) > Spark context Web UI available at http://172.30.0.12:4040 > Spark context available as 'sc' (master = local[*], app id = > local-1619255290637). > SparkSession available as 'spark'. 
> >>> spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false") > >>> from pyspark.testing.sqlutils import ExamplePoint > >>> import pandas as pd > >>> pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1.0, 1.0), > >>> ExamplePoint(2.0, 2.0)])}) > >>> df = spark.createDataFrame(pdf) > >>> df.show() > +--+ > | point| > +--+ > |(1.0, 1.0)| > |(2.0, 2.0)| > +--+ > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
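For context on the conversion path this report exercises: with Arrow disabled, createDataFrame has to run each pandas value through the UDT's serialize() method on the way in, and toPandas() has to run the stored rows back through deserialize() on the way out; the (0.0, 0.0) output above suggests that step is not being applied correctly. Below is a minimal sketch of the round-trip contract, reusing the same ExamplePoint fixture (it assumes ExamplePointUDT is importable from pyspark.testing.sqlutils alongside ExamplePoint, and that ExamplePoint defines equality).

{code:python}
# Sketch of the lossless round-trip a UDT must provide; SPARK-35211 is
# about createDataFrame/toPandas honoring it when Arrow is disabled.
from pyspark.testing.sqlutils import ExamplePoint, ExamplePointUDT

udt = ExamplePointUDT()
point = ExamplePoint(1.0, 2.0)

internal = udt.serialize(point)       # Python object -> sqlType() value
restored = udt.deserialize(internal)  # sqlType() value -> Python object
assert restored == point              # must get (1.0, 2.0) back, not (0.0, 0.0)
{code}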
[jira] [Updated] (SPARK-34952) DS V2 Aggregate push down
[ https://issues.apache.org/jira/browse/SPARK-34952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-34952: --- Description: Push down aggregates to the data source for better performance. This will be done in two steps: 1. add aggregate push down APIs and the implementation in JDBC 2. add the implementation in Parquet. was: Push down Max/Min/Count to Parquet for better performance. > DS V2 Aggregate push down > - > > Key: SPARK-34952 > URL: https://issues.apache.org/jira/browse/SPARK-34952 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Fix For: 3.2.0 > > > Push down aggregates to the data source for better performance. This will be done > in two steps: > 1. add aggregate push down APIs and the implementation in JDBC > 2. add the implementation in Parquet. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
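As a rough illustration of what step 1 buys (a hedged sketch, not the tracked implementation: it assumes a DS v2 JDBC catalog has already been registered under the placeholder name `h2` via spark.sql.catalog.h2.* settings, and the table name is made up):

{code:python}
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# With aggregate push down, MAX(SALARY) is computed by the database and
# only the single result row crosses the wire; without it, Spark scans
# every row and aggregates on its own side.
df = spark.table("h2.test.people").agg({"salary": "max"})

# If the aggregate was pushed, the scan node of the physical plan reports
# it (e.g. a "PushedAggregates: [MAX(SALARY)]" entry in the explain output).
df.explain()
{code}

Step 2 would let Parquet answer MIN/MAX/COUNT the same way from footer statistics instead of scanning row groups.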
[jira] [Updated] (SPARK-34952) DS V2 Aggregate push down
[ https://issues.apache.org/jira/browse/SPARK-34952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-34952: --- Summary: DS V2 Aggregate push down (was: Aggregate (Min/Max/Count) push down for Parquet) > DS V2 Aggregate push down > - > > Key: SPARK-34952 > URL: https://issues.apache.org/jira/browse/SPARK-34952 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Fix For: 3.2.0 > > > Push down Max/Min/Count to Parquet for better performance. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34952) Aggregate (Min/Max/Count) push down for Parquet
[ https://issues.apache.org/jira/browse/SPARK-34952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387624#comment-17387624 ] Apache Spark commented on SPARK-34952: -- User 'huaxingao' has created a pull request for this issue: https://github.com/apache/spark/pull/33526 > Aggregate (Min/Max/Count) push down for Parquet > > > Key: SPARK-34952 > URL: https://issues.apache.org/jira/browse/SPARK-34952 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Fix For: 3.2.0 > > > Push down Max/Min/Count to Parquet for better performance. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36136) Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive
[ https://issues.apache.org/jira/browse/SPARK-36136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-36136. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33350 [https://github.com/apache/spark/pull/33350] > Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive > > > Key: SPARK-36136 > URL: https://issues.apache.org/jira/browse/SPARK-36136 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.2.0 >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Minor > Fix For: 3.2.0 > > > Currently both {{PruneFileSourcePartitionsSuite}} and > {{PrunePartitionSuiteBase}} are in {{org.apache.spark.sql.hive.execution}} > which doesn't look right. They should belong to > {{org.apache.spark.sql.execution.datasources}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36136) Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive
[ https://issues.apache.org/jira/browse/SPARK-36136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh reassigned SPARK-36136: --- Assignee: Chao Sun > Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive > > > Key: SPARK-36136 > URL: https://issues.apache.org/jira/browse/SPARK-36136 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.2.0 >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Minor > > Currently both {{PruneFileSourcePartitionsSuite}} and > {{PrunePartitionSuiteBase}} are in {{org.apache.spark.sql.hive.execution}} > which doesn't look right. They should belong to > {{org.apache.spark.sql.execution.datasources}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36143) Adjust `astype` of fractional Series with missing values to follow pandas
[ https://issues.apache.org/jira/browse/SPARK-36143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36143: - Description: {code:java} >>> pser = pd.Series([1, 2, np.nan], dtype=float) >>> psser = ps.from_pandas(pser) >>> pser.astype(int) ... ValueError: Cannot convert non-finite values (NA or inf) to integer >>> psser.astype(int) 0 1.0 1 2.0 2 NaN dtype: float64 {code} As shown above, astype of a fractional Series with missing values doesn't behave the same as in pandas, so we ought to adjust that. was: {code:java} >>> pser = pd.Series([1, 2, np.nan], dtype=float) >>> psser = ps.from_pandas(pser) >>> pser.astype(int) ... ValueError: Cannot convert non-finite values (NA or inf) to integer >>> psser.astype(int) 0 1.0 1 2.0 2 NaN dtype: float64 {code} As shown above, astype of a Series with missing values doesn't behave the same as in pandas, so we ought to adjust that. > Adjust `astype` of fractional Series with missing values to follow pandas > - > > Key: SPARK-36143 > URL: https://issues.apache.org/jira/browse/SPARK-36143 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > {code:java} > >>> pser = pd.Series([1, 2, np.nan], dtype=float) > >>> psser = ps.from_pandas(pser) > >>> pser.astype(int) > ... > ValueError: Cannot convert non-finite values (NA or inf) to integer > >>> psser.astype(int) > 0 1.0 > 1 2.0 > 2 NaN > dtype: float64 > {code} > As shown above, astype of a fractional Series with missing values doesn't > behave the same as in pandas, so we ought to adjust that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
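One plausible shape of the adjustment, sketched outside of pandas-on-Spark (this is only an illustration of the target behavior, not the tracked fix, and the helper name is made up): detect missing values before a float-to-integer cast and raise the same ValueError pandas raises.

{code:python}
import numpy as np
import pandas as pd

def astype_like_pandas(pser: pd.Series, dtype) -> pd.Series:
    # pandas refuses to cast NaN/inf to an integer dtype; mirror that
    # instead of silently leaving the result as float64.
    to_int = np.issubdtype(np.dtype(dtype), np.integer)
    if pd.api.types.is_float_dtype(pser.dtype) and to_int:
        if not np.isfinite(pser).all():
            raise ValueError(
                "Cannot convert non-finite values (NA or inf) to integer")
    return pser.astype(dtype)

astype_like_pandas(pd.Series([1.0, 2.0]), int)     # casts normally
astype_like_pandas(pd.Series([1.0, np.nan]), int)  # raises ValueError
{code}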
[jira] [Updated] (SPARK-36143) Adjust `astype` of fractional Series with missing values to follow pandas
[ https://issues.apache.org/jira/browse/SPARK-36143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36143: - Summary: Adjust `astype` of fractional Series with missing values to follow pandas (was: Adjust astype of Series with missing values to follow pandas) > Adjust `astype` of fractional Series with missing values to follow pandas > - > > Key: SPARK-36143 > URL: https://issues.apache.org/jira/browse/SPARK-36143 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > {code:java} > >>> pser = pd.Series([1, 2, np.nan], dtype=float) > >>> psser = ps.from_pandas(pser) > >>> pser.astype(int) > ... > ValueError: Cannot convert non-finite values (NA or inf) to integer > >>> psser.astype(int) > 0 1.0 > 1 2.0 > 2 NaN > dtype: float64 > {code} > As shown above, astype of a Series with missing values doesn't behave the same > as in pandas, so we ought to adjust that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36281) Re-enable the test of external listeners in SparkSQLEnvSuite
[ https://issues.apache.org/jira/browse/SPARK-36281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-36281. - Resolution: Invalid > Re-enable the test of external listeners in SparkSQLEnvSuite > - > > Key: SPARK-36281 > URL: https://issues.apache.org/jira/browse/SPARK-36281 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.0.3 >Reporter: L. C. Hsieh >Priority: Major > > While we're trying to recover GA in branch 3.0 > ([https://github.com/apache/spark/pull/33502]), > {{org.apache.spark.sql.hive.thriftserver.SparkSQLEnvSuite}} continues to fail > in GA: > > [info] - SPARK-29604 external listeners should be initialized with Spark > classloader *** FAILED *** (34 seconds, 465 milliseconds) > [info] > scala.Predef.refArrayOps[org.apache.spark.sql.util.QueryExecutionListener](session.get.listenerManager.listListeners()).exists(((x$1: > org.apache.spark.sql.util.QueryExecutionListener) => > x$1.isInstanceOf[test.custom.listener.DummyQueryExecutionListener])) was > false (SparkSQLEnvSuite.scala:57) > > It does not look like a memory issue. I also ran the test locally and it passed; it is > not clear why it fails only in GA. > > Ignored the test for now to recover GA. We need to investigate why the test > fails and then re-enable it. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
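For reference, the behavior the ignored test covers: classes listed in spark.sql.queryExecutionListeners must be loaded with the Spark classloader and registered when the session starts. A hedged PySpark rendering of the same check follows (DummyQueryExecutionListener is the suite's own test fixture, so a jar providing it, or any other QueryExecutionListener implementation, is assumed to be on the classpath; the listener-manager peek goes through py4j internals):

{code:python}
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.sql.queryExecutionListeners",
                 "test.custom.listener.DummyQueryExecutionListener")
         .getOrCreate())

# The suite asserts the configured listener actually got instantiated and
# registered; the same registry is reachable from Python via the JVM handle.
listeners = spark._jsparkSession.listenerManager().listListeners()
print([listener.getClass().getName() for listener in listeners])
{code}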
[jira] [Updated] (SPARK-36094) Group SQL component error messages in Spark error class JSON file
[ https://issues.apache.org/jira/browse/SPARK-36094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36094: --- Description: To improve auditing, reduce duplication, and improve quality of error messages thrown from Spark, we should group them in a single JSON file (as discussed in the [mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] and introduced in [SPARK-34920|https://issues.apache.org/jira/browse/SPARK-34920]). In this file, the error messages should be labeled according to a consistent error class and with a SQLSTATE. We will start with the SQL component first. As a starting point, we can build off the exception grouping done in [SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. In total, there are ~1000 error messages to group, split across three files (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). In this ticket, each of these files is split into chunks of ~20 errors for refactoring. As a guideline, the error classes should be de-duplicated as much as possible to improve auditing. We will improve error message quality as a follow-up. Here is an example PR that groups a few error messages in the QueryCompilationErrors class: [PR 33309|https://github.com/apache/spark/pull/33309]. was: To improve auditing, reduce duplication, and improve quality of error messages thrown from Spark, we should group them in a single JSON file (as discussed in the [mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] and introduced in [SPARK-34920|https://issues.apache.org/jira/browse/SPARK-34920]). In this file, the error messages should be labeled according to a consistent error class and with a SQLSTATE. We will start with the SQL component first. As a starting point, we can build off the exception grouping done in [SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. In total, there are ~1000 error messages to group, split across three files (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). If you work on this ticket, please create a subtask to improve ease of reviewing. As a guideline, the error classes should be de-duplicated as much as possible to improve auditing. We will improve error message quality as a follow-up. Here is an example PR that groups a few error messages in the QueryCompilationErrors class: [PR 33309|https://github.com/apache/spark/pull/33309]. > Group SQL component error messages in Spark error class JSON file > - > > Key: SPARK-36094 > URL: https://issues.apache.org/jira/browse/SPARK-36094 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > To improve auditing, reduce duplication, and improve quality of error > messages thrown from Spark, we should group them in a single JSON file (as > discussed in the [mailing > list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] > and introduced in > [SPARK-34920|https://issues.apache.org/jira/browse/SPARK-34920]). > In this file, the error messages should be labeled according to a consistent > error class and with a SQLSTATE. > We will start with the SQL component first. > As a starting point, we can build off the exception grouping done in > [SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. 
In total, > there are ~1000 error messages to group, split across three files > (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). In > this ticket, each of these files is split into chunks of ~20 errors for > refactoring. > As a guideline, the error classes should be de-duplicated as much as possible > to improve auditing. > We will improve error message quality as a follow-up. > Here is an example PR that groups a few error messages in the > QueryCompilationErrors class: [PR > 33309|https://github.com/apache/spark/pull/33309]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
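The mechanics being rolled out in the subtasks below can be sketched in miniature (the entry is modeled on the error-classes JSON format introduced by SPARK-34920; the exact schema and the helper function are illustrative, not Spark's code): each exception site names an error class plus parameters, while the message template and SQLSTATE live once in the JSON file, which is what makes auditing and de-duplication possible.

{code:python}
import json

# Illustrative entry in the spirit of Spark's error-classes JSON file.
ERROR_CLASSES = json.loads("""
{
  "DIVIDE_BY_ZERO": { "message": ["divide by zero"], "sqlState": "22012" }
}
""")

def error_message(error_class: str, *params: str) -> str:
    # Templates are stored as a list of message lines; positional
    # parameters are spliced into the joined template.
    template = " ".join(ERROR_CLASSES[error_class]["message"])
    return template.format(*params)

print(error_message("DIVIDE_BY_ZERO"))  # -> divide by zero
{code}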
[jira] [Issue Comment Deleted] (SPARK-36230) hasnans for Series of Decimal(`NaN`)
[ https://issues.apache.org/jira/browse/SPARK-36230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36230: - Comment: was deleted (was: https://issues.apache.org/jira/browse/SPARK-36231) > hasnans for Series of Decimal(`NaN`) > > > Key: SPARK-36230 > URL: https://issues.apache.org/jira/browse/SPARK-36230 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > {code:java} > >>> import pandas as pd > >>> pser = pd.Series([Decimal('0.1'), Decimal('NaN')]) > >>> pser > 00.1 > 1NaN > dtype: object > >>> psser = ps.from_pandas(pser) > >>> psser > 0 0.1 > 1None > dtype: object > >>> psser.hasnans > False > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36309) Refactor fourth set of 20 query parsing errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36309: --- Description: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the fourth set of 20. {code} showFunctionsUnsupportedError duplicateCteDefinitionNamesError sqlStatementUnsupportedError unquotedIdentifierError duplicateClausesError duplicateKeysError unexpectedFomatForSetConfigurationError invalidPropertyKeyForSetQuotedConfigurationError invalidPropertyValueForSetQuotedConfigurationError unexpectedFormatForResetConfigurationError intervalValueOutOfRangeError invalidTimeZoneDisplacementValueError createTempTableNotSpecifyProviderError rowFormatNotUsedWithStoredAsError useDefinedRecordReaderOrWriterClassesError directoryPathAndOptionsPathBothSpecifiedError unsupportedLocalFileSchemeError invalidGroupingSetError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the third set of 20. {code} fromToIntervalUnsupportedError mixedIntervalUnitsError dataTypeUnsupportedError partitionTransformNotExpectedError tooManyArgumentsForTransformError notEnoughArgumentsForTransformError invalidBucketsNumberError invalidTransformArgumentError cannotCleanReservedNamespacePropertyError propertiesAndDbPropertiesBothSpecifiedError fromOrInNotAllowedInShowDatabasesError cannotCleanReservedTablePropertyError duplicatedTablePathsFoundError storedAsAndStoredByBothSpecifiedError operationInHiveStyleCommandUnsupportedError operationNotAllowedError descColumnForPartitionUnsupportedError incompletePartitionSpecificationError computeStatisticsNotExpectedError addCatalogInCacheTableAsSelectNotAllowedError {code} For more detail, see the parent ticket SPARK-36094. > Refactor fourth set of 20 query parsing errors to use error classes > --- > > Key: SPARK-36309 > URL: https://issues.apache.org/jira/browse/SPARK-36309 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] > to use error classes. > There are currently ~100 exceptions in this file; so this PR only focuses on > the fourth set of 20. > {code} > showFunctionsUnsupportedError > duplicateCteDefinitionNamesError > sqlStatementUnsupportedError > unquotedIdentifierError > duplicateClausesError > duplicateKeysError > unexpectedFomatForSetConfigurationError > invalidPropertyKeyForSetQuotedConfigurationError > invalidPropertyValueForSetQuotedConfigurationError > unexpectedFormatForResetConfigurationError > intervalValueOutOfRangeError > invalidTimeZoneDisplacementValueError > createTempTableNotSpecifyProviderError > rowFormatNotUsedWithStoredAsError > useDefinedRecordReaderOrWriterClassesError > directoryPathAndOptionsPathBothSpecifiedError > unsupportedLocalFileSchemeError > invalidGroupingSetError > {code} > For more detail, see the parent ticket SPARK-36094. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36230) hasnans for Series of Decimal(`NaN`)
[ https://issues.apache.org/jira/browse/SPARK-36230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387596#comment-17387596 ] Xinrong Meng commented on SPARK-36230: -- https://issues.apache.org/jira/browse/SPARK-36231 > hasnans for Series of Decimal(`NaN`) > > > Key: SPARK-36230 > URL: https://issues.apache.org/jira/browse/SPARK-36230 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > {code:java} > >>> import pandas as pd > >>> pser = pd.Series([Decimal('0.1'), Decimal('NaN')]) > >>> pser > 00.1 > 1NaN > dtype: object > >>> psser = ps.from_pandas(pser) > >>> psser > 0 0.1 > 1None > dtype: object > >>> psser.hasnans > False > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
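Part of why this case slips through, illustrated outside pandas-on-Spark: Decimal('NaN') lives in an object-dtype column and is not a float NaN, so float-oriented null checks never flag it; decimal.Decimal has its own is_nan() predicate for exactly this value.

{code:python}
from decimal import Decimal

values = [Decimal("0.1"), Decimal("NaN")]

# An explicit Decimal-aware check finds the NaN that float-based
# detection misses.
has_nan = any(isinstance(v, Decimal) and v.is_nan() for v in values)
print(has_nan)  # True
{code}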
[jira] [Updated] (SPARK-36107) Refactor first set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36107: --- Description: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the second set of 20. {code:java} columnChangeUnsupportedError logicalHintOperatorNotRemovedDuringAnalysisError cannotEvaluateExpressionError cannotGenerateCodeForExpressionError cannotTerminateGeneratorError castingCauseOverflowError cannotChangeDecimalPrecisionError invalidInputSyntaxForNumericError cannotCastFromNullTypeError cannotCastError cannotParseDecimalError simpleStringWithNodeIdUnsupportedError evaluateUnevaluableAggregateUnsupportedError dataTypeUnsupportedError dataTypeUnsupportedError failedExecuteUserDefinedFunctionError divideByZeroError invalidArrayIndexError mapKeyNotExistError rowFromCSVParserNotExpectedError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the first set of 20. {code:java} columnChangeUnsupportedError logicalHintOperatorNotRemovedDuringAnalysisError cannotEvaluateExpressionError cannotGenerateCodeForExpressionError cannotTerminateGeneratorError castingCauseOverflowError cannotChangeDecimalPrecisionError invalidInputSyntaxForNumericError cannotCastFromNullTypeError cannotCastError cannotParseDecimalError simpleStringWithNodeIdUnsupportedError evaluateUnevaluableAggregateUnsupportedError dataTypeUnsupportedError dataTypeUnsupportedError failedExecuteUserDefinedFunctionError divideByZeroError invalidArrayIndexError mapKeyNotExistError rowFromCSVParserNotExpectedError {code} For more detail, see the parent ticket SPARK-36094. > Refactor first set of 20 query execution errors to use error classes > > > Key: SPARK-36107 > URL: https://issues.apache.org/jira/browse/SPARK-36107 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the second set of 20. > {code:java} > columnChangeUnsupportedError > logicalHintOperatorNotRemovedDuringAnalysisError > cannotEvaluateExpressionError > cannotGenerateCodeForExpressionError > cannotTerminateGeneratorError > castingCauseOverflowError > cannotChangeDecimalPrecisionError > invalidInputSyntaxForNumericError > cannotCastFromNullTypeError > cannotCastError > cannotParseDecimalError > simpleStringWithNodeIdUnsupportedError > evaluateUnevaluableAggregateUnsupportedError > dataTypeUnsupportedError > dataTypeUnsupportedError > failedExecuteUserDefinedFunctionError > divideByZeroError > invalidArrayIndexError > mapKeyNotExistError > rowFromCSVParserNotExpectedError > {code} > For more detail, see the parent ticket SPARK-36094. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36309) Refactor fourth set of 20 query parsing errors to use error classes
Karen Feng created SPARK-36309: -- Summary: Refactor fourth set of 20 query parsing errors to use error classes Key: SPARK-36309 URL: https://issues.apache.org/jira/browse/SPARK-36309 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the third set of 20. {code} fromToIntervalUnsupportedError mixedIntervalUnitsError dataTypeUnsupportedError partitionTransformNotExpectedError tooManyArgumentsForTransformError notEnoughArgumentsForTransformError invalidBucketsNumberError invalidTransformArgumentError cannotCleanReservedNamespacePropertyError propertiesAndDbPropertiesBothSpecifiedError fromOrInNotAllowedInShowDatabasesError cannotCleanReservedTablePropertyError duplicatedTablePathsFoundError storedAsAndStoredByBothSpecifiedError operationInHiveStyleCommandUnsupportedError operationNotAllowedError descColumnForPartitionUnsupportedError incompletePartitionSpecificationError computeStatisticsNotExpectedError addCatalogInCacheTableAsSelectNotAllowedError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36308) Refactor third set of 20 query parsing errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36308: --- Description: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the third set of 20. {code} fromToIntervalUnsupportedError mixedIntervalUnitsError dataTypeUnsupportedError partitionTransformNotExpectedError tooManyArgumentsForTransformError notEnoughArgumentsForTransformError invalidBucketsNumberError invalidTransformArgumentError cannotCleanReservedNamespacePropertyError propertiesAndDbPropertiesBothSpecifiedError fromOrInNotAllowedInShowDatabasesError cannotCleanReservedTablePropertyError duplicatedTablePathsFoundError storedAsAndStoredByBothSpecifiedError operationInHiveStyleCommandUnsupportedError operationNotAllowedError descColumnForPartitionUnsupportedError incompletePartitionSpecificationError computeStatisticsNotExpectedError addCatalogInCacheTableAsSelectNotAllowedError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} repetitiveWindowDefinitionError invalidWindowReferenceError cannotResolveWindowReferenceError joinCriteriaUnimplementedError naturalCrossJoinUnsupportedError emptyInputForTableSampleError tableSampleByBytesUnsupportedError invalidByteLengthLiteralError invalidEscapeStringError trimOptionUnsupportedError functionNameUnsupportedError cannotParseValueTypeError cannotParseIntervalValueError literalValueTypeUnsupportedError parsingValueTypeError invalidNumericLiteralRangeError moreThanOneFromToUnitInIntervalLiteralError invalidIntervalLiteralError invalidIntervalFormError invalidFromToUnitValueError {code} For more detail, see the parent ticket SPARK-36094. > Refactor third set of 20 query parsing errors to use error classes > -- > > Key: SPARK-36308 > URL: https://issues.apache.org/jira/browse/SPARK-36308 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] > to use error classes. > There are currently ~100 exceptions in this file; so this PR only focuses on > the third set of 20. 
> {code} > fromToIntervalUnsupportedError > mixedIntervalUnitsError > dataTypeUnsupportedError > partitionTransformNotExpectedError > tooManyArgumentsForTransformError > notEnoughArgumentsForTransformError > invalidBucketsNumberError > invalidTransformArgumentError > cannotCleanReservedNamespacePropertyError > propertiesAndDbPropertiesBothSpecifiedError > fromOrInNotAllowedInShowDatabasesError > cannotCleanReservedTablePropertyError > duplicatedTablePathsFoundError > storedAsAndStoredByBothSpecifiedError > operationInHiveStyleCommandUnsupportedError > operationNotAllowedError > descColumnForPartitionUnsupportedError > incompletePartitionSpecificationError > computeStatisticsNotExpectedError > addCatalogInCacheTableAsSelectNotAllowedError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36308) Refactor third set of 20 query parsing errors to use error classes
Karen Feng created SPARK-36308: -- Summary: Refactor third set of 20 query parsing errors to use error classes Key: SPARK-36308 URL: https://issues.apache.org/jira/browse/SPARK-36308 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} repetitiveWindowDefinitionError invalidWindowReferenceError cannotResolveWindowReferenceError joinCriteriaUnimplementedError naturalCrossJoinUnsupportedError emptyInputForTableSampleError tableSampleByBytesUnsupportedError invalidByteLengthLiteralError invalidEscapeStringError trimOptionUnsupportedError functionNameUnsupportedError cannotParseValueTypeError cannotParseIntervalValueError literalValueTypeUnsupportedError parsingValueTypeError invalidNumericLiteralRangeError moreThanOneFromToUnitInIntervalLiteralError invalidIntervalLiteralError invalidIntervalFormError invalidFromToUnitValueError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36307) Refactor second set of 20 query parsing errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36307: --- Description: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} repetitiveWindowDefinitionError invalidWindowReferenceError cannotResolveWindowReferenceError joinCriteriaUnimplementedError naturalCrossJoinUnsupportedError emptyInputForTableSampleError tableSampleByBytesUnsupportedError invalidByteLengthLiteralError invalidEscapeStringError trimOptionUnsupportedError functionNameUnsupportedError cannotParseValueTypeError cannotParseIntervalValueError literalValueTypeUnsupportedError parsingValueTypeError invalidNumericLiteralRangeError moreThanOneFromToUnitInIntervalLiteralError invalidIntervalLiteralError invalidIntervalFormError invalidFromToUnitValueError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} invalidInsertIntoError insertOverwriteDirectoryUnsupportedError columnAliasInOperationNotAllowedError emptySourceForMergeError unrecognizedMatchedActionError insertedValueNumberNotMatchFieldNumberError unrecognizedNotMatchedActionError mergeStatementWithoutWhenClauseError nonLastMatchedClauseOmitConditionError nonLastNotMatchedClauseOmitConditionError emptyPartitionKeyError combinationQueryResultClausesUnsupportedError distributeByUnsupportedError transformNotSupportQuantifierError transformWithSerdeUnsupportedError lateralWithPivotInFromClauseNotAllowedError lateralJoinWithNaturalJoinUnsupportedError lateralJoinWithUsingJoinUnsupportedError unsupportedLateralJoinTypeError invalidLateralJoinRelationError {code} For more detail, see the parent ticket SPARK-36094. > Refactor second set of 20 query parsing errors to use error classes > --- > > Key: SPARK-36307 > URL: https://issues.apache.org/jira/browse/SPARK-36307 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] > to use error classes. > There are currently ~100 exceptions in this file; so this PR only focuses on > the first set of 20. > {code} > repetitiveWindowDefinitionError > invalidWindowReferenceError > cannotResolveWindowReferenceError > joinCriteriaUnimplementedError > naturalCrossJoinUnsupportedError > emptyInputForTableSampleError > tableSampleByBytesUnsupportedError > invalidByteLengthLiteralError > invalidEscapeStringError > trimOptionUnsupportedError > functionNameUnsupportedError > cannotParseValueTypeError > cannotParseIntervalValueError > literalValueTypeUnsupportedError > parsingValueTypeError > invalidNumericLiteralRangeError > moreThanOneFromToUnitInIntervalLiteralError > invalidIntervalLiteralError > invalidIntervalFormError > invalidFromToUnitValueError > {code} > For more detail, see the parent ticket SPARK-36094. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36307) Refactor second set of 20 query parsing errors to use error classes
Karen Feng created SPARK-36307: -- Summary: Refactor second set of 20 query parsing errors to use error classes Key: SPARK-36307 URL: https://issues.apache.org/jira/browse/SPARK-36307 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} invalidInsertIntoError insertOverwriteDirectoryUnsupportedError columnAliasInOperationNotAllowedError emptySourceForMergeError unrecognizedMatchedActionError insertedValueNumberNotMatchFieldNumberError unrecognizedNotMatchedActionError mergeStatementWithoutWhenClauseError nonLastMatchedClauseOmitConditionError nonLastNotMatchedClauseOmitConditionError emptyPartitionKeyError combinationQueryResultClausesUnsupportedError distributeByUnsupportedError transformNotSupportQuantifierError transformWithSerdeUnsupportedError lateralWithPivotInFromClauseNotAllowedError lateralJoinWithNaturalJoinUnsupportedError lateralJoinWithUsingJoinUnsupportedError unsupportedLateralJoinTypeError invalidLateralJoinRelationError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36108) Refactor first set of 20 query parsing errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36108: --- Description: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} invalidInsertIntoError insertOverwriteDirectoryUnsupportedError columnAliasInOperationNotAllowedError emptySourceForMergeError unrecognizedMatchedActionError insertedValueNumberNotMatchFieldNumberError unrecognizedNotMatchedActionError mergeStatementWithoutWhenClauseError nonLastMatchedClauseOmitConditionError nonLastNotMatchedClauseOmitConditionError emptyPartitionKeyError combinationQueryResultClausesUnsupportedError distributeByUnsupportedError transformNotSupportQuantifierError transformWithSerdeUnsupportedError lateralWithPivotInFromClauseNotAllowedError lateralJoinWithNaturalJoinUnsupportedError lateralJoinWithUsingJoinUnsupportedError unsupportedLateralJoinTypeError invalidLateralJoinRelationError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on a few. For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094]. Summary: Refactor first set of 20 query parsing errors to use error classes (was: Refactor a few query parsing errors to use error classes) > Refactor first set of 20 query parsing errors to use error classes > -- > > Key: SPARK-36108 > URL: https://issues.apache.org/jira/browse/SPARK-36108 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] > to use error classes. > There are currently ~100 exceptions in this file; so this PR only focuses on > the first set of 20. > {code} > invalidInsertIntoError > insertOverwriteDirectoryUnsupportedError > columnAliasInOperationNotAllowedError > emptySourceForMergeError > unrecognizedMatchedActionError > insertedValueNumberNotMatchFieldNumberError > unrecognizedNotMatchedActionError > mergeStatementWithoutWhenClauseError > nonLastMatchedClauseOmitConditionError > nonLastNotMatchedClauseOmitConditionError > emptyPartitionKeyError > combinationQueryResultClausesUnsupportedError > distributeByUnsupportedError > transformNotSupportQuantifierError > transformWithSerdeUnsupportedError > lateralWithPivotInFromClauseNotAllowedError > lateralJoinWithNaturalJoinUnsupportedError > lateralJoinWithUsingJoinUnsupportedError > unsupportedLateralJoinTypeError > invalidLateralJoinRelationError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36306) Refactor seventeenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36306: --- Description: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the seventeenth set of 20. {code:java} legacyCheckpointDirectoryExistsError subprocessExitedError outputDataTypeUnsupportedByNodeWithoutSerdeError invalidStartIndexError concurrentModificationOnExternalAppendOnlyUnsafeRowArrayError doExecuteBroadcastNotImplementedError databaseNameConflictWithSystemPreservedDatabaseError commentOnTableUnsupportedError unsupportedUpdateColumnNullabilityError renameColumnUnsupportedForOlderMySQLError failedToExecuteQueryError nestedFieldUnsupportedError transformationsAndActionsNotInvokedByDriverError repeatedPivotsUnsupportedError pivotNotAfterGroupByUnsupportedError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the sixteenth set of 20. {code:java} cannotDropMultiPartitionsOnNonatomicPartitionTableError truncateMultiPartitionUnsupportedError overwriteTableByUnsupportedExpressionError dynamicPartitionOverwriteUnsupportedByTableError failedMergingSchemaError cannotBroadcastTableOverMaxTableRowsError cannotBroadcastTableOverMaxTableBytesError notEnoughMemoryToBuildAndBroadcastTableError executeCodePathUnsupportedError cannotMergeClassWithOtherClassError continuousProcessingUnsupportedByDataSourceError failedToReadDataError failedToGenerateEpochMarkerError foreachWriterAbortedDueToTaskFailureError integerOverflowError failedToReadDeltaFileError failedToReadSnapshotFileError cannotPurgeAsBreakInternalStateError cleanUpSourceFilesUnsupportedError latestOffsetNotCalledError {code} For more detail, see the parent ticket SPARK-36094. > Refactor seventeenth set of 20 query execution errors to use error classes > -- > > Key: SPARK-36306 > URL: https://issues.apache.org/jira/browse/SPARK-36306 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the seventeenth set of 20. > {code:java} > legacyCheckpointDirectoryExistsError > subprocessExitedError > outputDataTypeUnsupportedByNodeWithoutSerdeError > invalidStartIndexError > concurrentModificationOnExternalAppendOnlyUnsafeRowArrayError > doExecuteBroadcastNotImplementedError > databaseNameConflictWithSystemPreservedDatabaseError > commentOnTableUnsupportedError > unsupportedUpdateColumnNullabilityError > renameColumnUnsupportedForOlderMySQLError > failedToExecuteQueryError > nestedFieldUnsupportedError > transformationsAndActionsNotInvokedByDriverError > repeatedPivotsUnsupportedError > pivotNotAfterGroupByUnsupportedError > {code} > For more detail, see the parent ticket SPARK-36094. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36304) Refactor fifteenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36304: --- Description: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the fifteenth set of 20. {code:java} unsupportedOperationExceptionError nullLiteralsCannotBeCastedError notUserDefinedTypeError cannotLoadUserDefinedTypeError timeZoneIdNotSpecifiedForTimestampTypeError notPublicClassError primitiveTypesNotSupportedError fieldIndexOnRowWithoutSchemaError valueIsNullError onlySupportDataSourcesProvidingFileFormatError failToSetOriginalPermissionBackError failToSetOriginalACLBackError multiFailuresInStageMaterializationError unrecognizedCompressionSchemaTypeIDError getParentLoggerNotImplementedError cannotCreateParquetConverterForTypeError cannotCreateParquetConverterForDecimalTypeError cannotCreateParquetConverterForDataTypeError cannotAddMultiPartitionsOnNonatomicPartitionTableError userSpecifiedSchemaUnsupportedByDataSourceError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the fourteenth set of 20. {code:java} cannotGetEventTimeWatermarkError cannotSetTimeoutTimestampError batchMetadataFileNotFoundError multiStreamingQueriesUsingPathConcurrentlyError addFilesWithAbsolutePathUnsupportedError microBatchUnsupportedByDataSourceError cannotExecuteStreamingRelationExecError invalidStreamingOutputModeError catalogPluginClassNotFoundError catalogPluginClassNotImplementedError catalogPluginClassNotFoundForCatalogError catalogFailToFindPublicNoArgConstructorError catalogFailToCallPublicNoArgConstructorError cannotInstantiateAbstractCatalogPluginClassError failedToInstantiateConstructorForCatalogError noSuchElementExceptionError noSuchElementExceptionError cannotMutateReadOnlySQLConfError cannotCloneOrCopyReadOnlySQLConfError cannotGetSQLConfInSchedulerEventLoopThreadError {code} For more detail, see the parent ticket SPARK-36094. > Refactor fifteenth set of 20 query execution errors to use error classes > > > Key: SPARK-36304 > URL: https://issues.apache.org/jira/browse/SPARK-36304 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the fifteenth set of 20. 
> {code:java} > unsupportedOperationExceptionError > nullLiteralsCannotBeCastedError > notUserDefinedTypeError > cannotLoadUserDefinedTypeError > timeZoneIdNotSpecifiedForTimestampTypeError > notPublicClassError > primitiveTypesNotSupportedError > fieldIndexOnRowWithoutSchemaError > valueIsNullError > onlySupportDataSourcesProvidingFileFormatError > failToSetOriginalPermissionBackError > failToSetOriginalACLBackError > multiFailuresInStageMaterializationError > unrecognizedCompressionSchemaTypeIDError > getParentLoggerNotImplementedError > cannotCreateParquetConverterForTypeError > cannotCreateParquetConverterForDecimalTypeError > cannotCreateParquetConverterForDataTypeError > cannotAddMultiPartitionsOnNonatomicPartitionTableError > userSpecifiedSchemaUnsupportedByDataSourceError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36306) Refactor seventeenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36306: -- Summary: Refactor seventeenth set of 20 query execution errors to use error classes Key: SPARK-36306 URL: https://issues.apache.org/jira/browse/SPARK-36306 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the sixteenth set of 20. {code:java} cannotDropMultiPartitionsOnNonatomicPartitionTableError truncateMultiPartitionUnsupportedError overwriteTableByUnsupportedExpressionError dynamicPartitionOverwriteUnsupportedByTableError failedMergingSchemaError cannotBroadcastTableOverMaxTableRowsError cannotBroadcastTableOverMaxTableBytesError notEnoughMemoryToBuildAndBroadcastTableError executeCodePathUnsupportedError cannotMergeClassWithOtherClassError continuousProcessingUnsupportedByDataSourceError failedToReadDataError failedToGenerateEpochMarkerError foreachWriterAbortedDueToTaskFailureError integerOverflowError failedToReadDeltaFileError failedToReadSnapshotFileError cannotPurgeAsBreakInternalStateError cleanUpSourceFilesUnsupportedError latestOffsetNotCalledError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36305) Refactor sixteenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36305: --- Description: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the sixteenth set of 20. {code:java} cannotDropMultiPartitionsOnNonatomicPartitionTableError truncateMultiPartitionUnsupportedError overwriteTableByUnsupportedExpressionError dynamicPartitionOverwriteUnsupportedByTableError failedMergingSchemaError cannotBroadcastTableOverMaxTableRowsError cannotBroadcastTableOverMaxTableBytesError notEnoughMemoryToBuildAndBroadcastTableError executeCodePathUnsupportedError cannotMergeClassWithOtherClassError continuousProcessingUnsupportedByDataSourceError failedToReadDataError failedToGenerateEpochMarkerError foreachWriterAbortedDueToTaskFailureError integerOverflowError failedToReadDeltaFileError failedToReadSnapshotFileError cannotPurgeAsBreakInternalStateError cleanUpSourceFilesUnsupportedError latestOffsetNotCalledError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the fifteenth set of 20. {code:java} unsupportedOperationExceptionError nullLiteralsCannotBeCastedError notUserDefinedTypeError cannotLoadUserDefinedTypeError timeZoneIdNotSpecifiedForTimestampTypeError notPublicClassError primitiveTypesNotSupportedError fieldIndexOnRowWithoutSchemaError valueIsNullError onlySupportDataSourcesProvidingFileFormatError failToSetOriginalPermissionBackError failToSetOriginalACLBackError multiFailuresInStageMaterializationError unrecognizedCompressionSchemaTypeIDError getParentLoggerNotImplementedError cannotCreateParquetConverterForTypeError cannotCreateParquetConverterForDecimalTypeError cannotCreateParquetConverterForDataTypeError cannotAddMultiPartitionsOnNonatomicPartitionTableError userSpecifiedSchemaUnsupportedByDataSourceError {code} For more detail, see the parent ticket SPARK-36094. > Refactor sixteenth set of 20 query execution errors to use error classes > > > Key: SPARK-36305 > URL: https://issues.apache.org/jira/browse/SPARK-36305 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the sixteenth set of 20. 
> {code:java} > cannotDropMultiPartitionsOnNonatomicPartitionTableError > truncateMultiPartitionUnsupportedError > overwriteTableByUnsupportedExpressionError > dynamicPartitionOverwriteUnsupportedByTableError > failedMergingSchemaError > cannotBroadcastTableOverMaxTableRowsError > cannotBroadcastTableOverMaxTableBytesError > notEnoughMemoryToBuildAndBroadcastTableError > executeCodePathUnsupportedError > cannotMergeClassWithOtherClassError > continuousProcessingUnsupportedByDataSourceError > failedToReadDataError > failedToGenerateEpochMarkerError > foreachWriterAbortedDueToTaskFailureError > integerOverflowError > failedToReadDeltaFileError > failedToReadSnapshotFileError > cannotPurgeAsBreakInternalStateError > cleanUpSourceFilesUnsupportedError > latestOffsetNotCalledError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36305) Refactor sixteenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36305: -- Summary: Refactor sixteenth set of 20 query execution errors to use error classes Key: SPARK-36305 URL: https://issues.apache.org/jira/browse/SPARK-36305 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the fifteenth set of 20. {code:java} unsupportedOperationExceptionError nullLiteralsCannotBeCastedError notUserDefinedTypeError cannotLoadUserDefinedTypeError timeZoneIdNotSpecifiedForTimestampTypeError notPublicClassError primitiveTypesNotSupportedError fieldIndexOnRowWithoutSchemaError valueIsNullError onlySupportDataSourcesProvidingFileFormatError failToSetOriginalPermissionBackError failToSetOriginalACLBackError multiFailuresInStageMaterializationError unrecognizedCompressionSchemaTypeIDError getParentLoggerNotImplementedError cannotCreateParquetConverterForTypeError cannotCreateParquetConverterForDecimalTypeError cannotCreateParquetConverterForDataTypeError cannotAddMultiPartitionsOnNonatomicPartitionTableError userSpecifiedSchemaUnsupportedByDataSourceError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36303) Refactor fourteenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36303: -- Summary: Refactor fourteenth set of 20 query execution errors to use error classes Key: SPARK-36303 URL: https://issues.apache.org/jira/browse/SPARK-36303 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the thirteenth set of 20. {code:java} serDeInterfaceNotFoundError convertHiveTableToCatalogTableError cannotRecognizeHiveTypeError getTablesByTypeUnsupportedByHiveVersionError dropTableWithPurgeUnsupportedError alterTableWithDropPartitionAndPurgeUnsupportedError invalidPartitionFilterError getPartitionMetadataByFilterError unsupportedHiveMetastoreVersionError loadHiveClientCausesNoClassDefFoundError cannotFetchTablesOfDatabaseError illegalLocationClauseForViewPartitionError renamePathAsExistsPathError renameAsExistsPathError renameSrcPathNotFoundError failedRenameTempFileError legacyMetadataPathExistsError partitionColumnNotFoundInSchemaError stateNotDefinedOrAlreadyRemovedError cannotSetTimeoutDurationError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36303) Refactor fourteenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36303:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourteenth set of 20.

{code:java}
cannotGetEventTimeWatermarkError
cannotSetTimeoutTimestampError
batchMetadataFileNotFoundError
multiStreamingQueriesUsingPathConcurrentlyError
addFilesWithAbsolutePathUnsupportedError
microBatchUnsupportedByDataSourceError
cannotExecuteStreamingRelationExecError
invalidStreamingOutputModeError
catalogPluginClassNotFoundError
catalogPluginClassNotImplementedError
catalogPluginClassNotFoundForCatalogError
catalogFailToFindPublicNoArgConstructorError
catalogFailToCallPublicNoArgConstructorError
cannotInstantiateAbstractCatalogPluginClassError
failedToInstantiateConstructorForCatalogError
noSuchElementExceptionError
noSuchElementExceptionError
cannotMutateReadOnlySQLConfError
cannotCloneOrCopyReadOnlySQLConfError
cannotGetSQLConfInSchedulerEventLoopThreadError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the thirteenth set of 20.

{code:java}
serDeInterfaceNotFoundError
convertHiveTableToCatalogTableError
cannotRecognizeHiveTypeError
getTablesByTypeUnsupportedByHiveVersionError
dropTableWithPurgeUnsupportedError
alterTableWithDropPartitionAndPurgeUnsupportedError
invalidPartitionFilterError
getPartitionMetadataByFilterError
unsupportedHiveMetastoreVersionError
loadHiveClientCausesNoClassDefFoundError
cannotFetchTablesOfDatabaseError
illegalLocationClauseForViewPartitionError
renamePathAsExistsPathError
renameAsExistsPathError
renameSrcPathNotFoundError
failedRenameTempFileError
legacyMetadataPathExistsError
partitionColumnNotFoundInSchemaError
stateNotDefinedOrAlreadyRemovedError
cannotSetTimeoutDurationError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor fourteenth set of 20 query execution errors to use error classes
> --------------------------------------------------------------------------
>
> Key: SPARK-36303
> URL: https://issues.apache.org/jira/browse/SPARK-36303
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
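For orientation, the sketch below shows the machine-readable surface each refactored error gains: a stable error class plus an optional SQLSTATE. It mirrors the shape of the interface introduced under SPARK-34920 without reproducing Spark's actual code, and the JSON entry shown in the comment is hypothetical.

{code:scala}
// Shape of the contract a refactored exception exposes (illustrative only;
// Spark's real interface is SparkThrowable).
trait HasErrorClass {
  def getErrorClass: String // stable identifier, e.g. "CANNOT_SET_TIMEOUT_TIMESTAMP" (hypothetical)
  def getSqlState: String   // standard SQLSTATE, e.g. "0A000" ("feature not supported")
}

// A matching, hypothetical entry in the central error-classes.json resource:
//   "CANNOT_SET_TIMEOUT_TIMESTAMP" : {
//     "message" : [ "Cannot set timeout timestamp without enabling event time timeout." ],
//     "sqlState" : "0A000"
//   }
{code}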
[jira] [Created] (SPARK-36304) Refactor fifteenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36304:
-----------------------------------

Summary: Refactor fifteenth set of 20 query execution errors to use error classes
Key: SPARK-36304
URL: https://issues.apache.org/jira/browse/SPARK-36304
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourteenth set of 20.

{code:java}
cannotGetEventTimeWatermarkError
cannotSetTimeoutTimestampError
batchMetadataFileNotFoundError
multiStreamingQueriesUsingPathConcurrentlyError
addFilesWithAbsolutePathUnsupportedError
microBatchUnsupportedByDataSourceError
cannotExecuteStreamingRelationExecError
invalidStreamingOutputModeError
catalogPluginClassNotFoundError
catalogPluginClassNotImplementedError
catalogPluginClassNotFoundForCatalogError
catalogFailToFindPublicNoArgConstructorError
catalogFailToCallPublicNoArgConstructorError
cannotInstantiateAbstractCatalogPluginClassError
failedToInstantiateConstructorForCatalogError
noSuchElementExceptionError
noSuchElementExceptionError
cannotMutateReadOnlySQLConfError
cannotCloneOrCopyReadOnlySQLConfError
cannotGetSQLConfInSchedulerEventLoopThreadError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36301) Refactor twelfth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36301:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the twelfth set of 20.

{code:java}
cannotRewriteDomainJoinWithConditionsError
decorrelateInnerQueryThroughPlanUnsupportedError
methodCalledInAnalyzerNotAllowedError
cannotSafelyMergeSerdePropertiesError
pairUnsupportedAtFunctionError
onceStrategyIdempotenceIsBrokenForBatchError
structuralIntegrityOfInputPlanIsBrokenInClassError
structuralIntegrityIsBrokenAfterApplyingRuleError
ruleIdNotFoundForRuleError
cannotCreateArrayWithElementsExceedLimitError
indexOutOfBoundsOfArrayDataError
malformedRecordsDetectedInRecordParsingError
remoteOperationsUnsupportedError
invalidKerberosConfigForHiveServer2Error
parentSparkUIToAttachTabNotFoundError
inferSchemaUnsupportedForHiveError
requestedPartitionsMismatchTablePartitionsError
dynamicPartitionKeyNotAmongWrittenPartitionPathsError
cannotRemovePartitionDirError
cannotCreateStagingDirError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eleventh set of 20.

{code:java}
expressionDecodingError
expressionEncodingError
classHasUnexpectedSerializerError
cannotGetOuterPointerForInnerClassError
userDefinedTypeNotAnnotatedAndRegisteredError
invalidInputSyntaxForBooleanError
unsupportedOperandTypeForSizeFunctionError
unexpectedValueForStartInFunctionError
unexpectedValueForLengthInFunctionError
sqlArrayIndexNotStartAtOneError
concatArraysWithElementsExceedLimitError
flattenArraysWithElementsExceedLimitError
createArrayWithElementsExceedLimitError
unionArrayWithElementsExceedLimitError
initialTypeNotTargetDataTypeError
initialTypeNotTargetDataTypesError
cannotConvertColumnToJSONError
malformedRecordsDetectedInSchemaInferenceError
malformedJSONError
malformedRecordsDetectedInSchemaInferenceError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor twelfth set of 20 query execution errors to use error classes
> -----------------------------------------------------------------------
>
> Key: SPARK-36301
> URL: https://issues.apache.org/jira/browse/SPARK-36301
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Updated] (SPARK-36302) Refactor thirteenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36302:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the thirteenth set of 20.

{code:java}
serDeInterfaceNotFoundError
convertHiveTableToCatalogTableError
cannotRecognizeHiveTypeError
getTablesByTypeUnsupportedByHiveVersionError
dropTableWithPurgeUnsupportedError
alterTableWithDropPartitionAndPurgeUnsupportedError
invalidPartitionFilterError
getPartitionMetadataByFilterError
unsupportedHiveMetastoreVersionError
loadHiveClientCausesNoClassDefFoundError
cannotFetchTablesOfDatabaseError
illegalLocationClauseForViewPartitionError
renamePathAsExistsPathError
renameAsExistsPathError
renameSrcPathNotFoundError
failedRenameTempFileError
legacyMetadataPathExistsError
partitionColumnNotFoundInSchemaError
stateNotDefinedOrAlreadyRemovedError
cannotSetTimeoutDurationError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the twelfth set of 20.

{code:java}
cannotRewriteDomainJoinWithConditionsError
decorrelateInnerQueryThroughPlanUnsupportedError
methodCalledInAnalyzerNotAllowedError
cannotSafelyMergeSerdePropertiesError
pairUnsupportedAtFunctionError
onceStrategyIdempotenceIsBrokenForBatchError
structuralIntegrityOfInputPlanIsBrokenInClassError
structuralIntegrityIsBrokenAfterApplyingRuleError
ruleIdNotFoundForRuleError
cannotCreateArrayWithElementsExceedLimitError
indexOutOfBoundsOfArrayDataError
malformedRecordsDetectedInRecordParsingError
remoteOperationsUnsupportedError
invalidKerberosConfigForHiveServer2Error
parentSparkUIToAttachTabNotFoundError
inferSchemaUnsupportedForHiveError
requestedPartitionsMismatchTablePartitionsError
dynamicPartitionKeyNotAmongWrittenPartitionPathsError
cannotRemovePartitionDirError
cannotCreateStagingDirError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor thirteenth set of 20 query execution errors to use error classes
> --------------------------------------------------------------------------
>
> Key: SPARK-36302
> URL: https://issues.apache.org/jira/browse/SPARK-36302
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36302) Refactor thirteenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36302:
-----------------------------------

Summary: Refactor thirteenth set of 20 query execution errors to use error classes
Key: SPARK-36302
URL: https://issues.apache.org/jira/browse/SPARK-36302
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the twelfth set of 20.

{code:java}
cannotRewriteDomainJoinWithConditionsError
decorrelateInnerQueryThroughPlanUnsupportedError
methodCalledInAnalyzerNotAllowedError
cannotSafelyMergeSerdePropertiesError
pairUnsupportedAtFunctionError
onceStrategyIdempotenceIsBrokenForBatchError
structuralIntegrityOfInputPlanIsBrokenInClassError
structuralIntegrityIsBrokenAfterApplyingRuleError
ruleIdNotFoundForRuleError
cannotCreateArrayWithElementsExceedLimitError
indexOutOfBoundsOfArrayDataError
malformedRecordsDetectedInRecordParsingError
remoteOperationsUnsupportedError
invalidKerberosConfigForHiveServer2Error
parentSparkUIToAttachTabNotFoundError
inferSchemaUnsupportedForHiveError
requestedPartitionsMismatchTablePartitionsError
dynamicPartitionKeyNotAmongWrittenPartitionPathsError
cannotRemovePartitionDirError
cannotCreateStagingDirError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Created] (SPARK-36301) Refactor twelfth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36301:
-----------------------------------

Summary: Refactor twelfth set of 20 query execution errors to use error classes
Key: SPARK-36301
URL: https://issues.apache.org/jira/browse/SPARK-36301
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eleventh set of 20.

{code:java}
expressionDecodingError
expressionEncodingError
classHasUnexpectedSerializerError
cannotGetOuterPointerForInnerClassError
userDefinedTypeNotAnnotatedAndRegisteredError
invalidInputSyntaxForBooleanError
unsupportedOperandTypeForSizeFunctionError
unexpectedValueForStartInFunctionError
unexpectedValueForLengthInFunctionError
sqlArrayIndexNotStartAtOneError
concatArraysWithElementsExceedLimitError
flattenArraysWithElementsExceedLimitError
createArrayWithElementsExceedLimitError
unionArrayWithElementsExceedLimitError
initialTypeNotTargetDataTypeError
initialTypeNotTargetDataTypesError
cannotConvertColumnToJSONError
malformedRecordsDetectedInSchemaInferenceError
malformedJSONError
malformedRecordsDetectedInSchemaInferenceError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36300) Refactor eleventh set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36300:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eleventh set of 20.

{code:java}
expressionDecodingError
expressionEncodingError
classHasUnexpectedSerializerError
cannotGetOuterPointerForInnerClassError
userDefinedTypeNotAnnotatedAndRegisteredError
invalidInputSyntaxForBooleanError
unsupportedOperandTypeForSizeFunctionError
unexpectedValueForStartInFunctionError
unexpectedValueForLengthInFunctionError
sqlArrayIndexNotStartAtOneError
concatArraysWithElementsExceedLimitError
flattenArraysWithElementsExceedLimitError
createArrayWithElementsExceedLimitError
unionArrayWithElementsExceedLimitError
initialTypeNotTargetDataTypeError
initialTypeNotTargetDataTypesError
cannotConvertColumnToJSONError
malformedRecordsDetectedInSchemaInferenceError
malformedJSONError
malformedRecordsDetectedInSchemaInferenceError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the tenth set of 20.

{code:java}
registeringStreamingQueryListenerError
concurrentQueryInstanceError
cannotParseJsonArraysAsStructsError
cannotParseStringAsDataTypeError
failToParseEmptyStringForDataTypeError
failToParseValueForDataTypeError
rootConverterReturnNullError
cannotHaveCircularReferencesInBeanClassError
cannotHaveCircularReferencesInClassError
cannotUseInvalidJavaIdentifierAsFieldNameError
cannotFindEncoderForTypeError
attributesForTypeUnsupportedError
schemaForTypeUnsupportedError
cannotFindConstructorForTypeError
paramExceedOneCharError
paramIsNotIntegerError
paramIsNotBooleanValueError
foundNullValueForNotNullableFieldError
malformedCSVRecordError
elementsOfTupleExceedLimitError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor eleventh set of 20 query execution errors to use error classes
> ------------------------------------------------------------------------
>
> Key: SPARK-36300
> URL: https://issues.apache.org/jira/browse/SPARK-36300
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Updated] (SPARK-36298) Refactor ninth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36298:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the ninth set of 20.

{code:java}
unscaledValueTooLargeForPrecisionError
decimalPrecisionExceedsMaxPrecisionError
outOfDecimalTypeRangeError
unsupportedArrayTypeError
unsupportedJavaTypeError
failedParsingStructTypeError
failedMergingFieldsError
cannotMergeDecimalTypesWithIncompatiblePrecisionAndScaleError
cannotMergeDecimalTypesWithIncompatiblePrecisionError
cannotMergeDecimalTypesWithIncompatibleScaleError
cannotMergeIncompatibleDataTypesError
exceedMapSizeLimitError
duplicateMapKeyFoundError
mapDataKeyArrayLengthDiffersFromValueArrayLengthError
fieldDiffersFromDerivedLocalDateError
failToParseDateTimeInNewParserError
failToFormatDateTimeInNewFormatterError
failToRecognizePatternAfterUpgradeError
failToRecognizePatternError
cannotCastUTF8StringToDataTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eighth set of 20.

{code:java}
executeBroadcastTimeoutError
cannotCompareCostWithTargetCostError
unsupportedDataTypeError
notSupportTypeError
notSupportNonPrimitiveTypeError
unsupportedTypeError
useDictionaryEncodingWhenDictionaryOverflowError
endOfIteratorError
cannotAllocateMemoryToGrowBytesToBytesMapError
cannotAcquireMemoryToBuildLongHashedRelationError
cannotAcquireMemoryToBuildUnsafeHashedRelationError
rowLargerThan256MUnsupportedError
cannotBuildHashedRelationWithUniqueKeysExceededError
cannotBuildHashedRelationLargerThan8GError
failedToPushRowIntoRowQueueError
unexpectedWindowFunctionFrameError
cannotParseStatisticAsPercentileError
statisticNotRecognizedError
unknownColumnError
unexpectedAccumulableUpdateValueError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor ninth set of 20 query execution errors to use error classes
> ---------------------------------------------------------------------
>
> Key: SPARK-36298
> URL: https://issues.apache.org/jira/browse/SPARK-36298
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Updated] (SPARK-36299) Refactor tenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36299:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the tenth set of 20.

{code:java}
registeringStreamingQueryListenerError
concurrentQueryInstanceError
cannotParseJsonArraysAsStructsError
cannotParseStringAsDataTypeError
failToParseEmptyStringForDataTypeError
failToParseValueForDataTypeError
rootConverterReturnNullError
cannotHaveCircularReferencesInBeanClassError
cannotHaveCircularReferencesInClassError
cannotUseInvalidJavaIdentifierAsFieldNameError
cannotFindEncoderForTypeError
attributesForTypeUnsupportedError
schemaForTypeUnsupportedError
cannotFindConstructorForTypeError
paramExceedOneCharError
paramIsNotIntegerError
paramIsNotBooleanValueError
foundNullValueForNotNullableFieldError
malformedCSVRecordError
elementsOfTupleExceedLimitError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the ninth set of 20.

{code:java}
unscaledValueTooLargeForPrecisionError
decimalPrecisionExceedsMaxPrecisionError
outOfDecimalTypeRangeError
unsupportedArrayTypeError
unsupportedJavaTypeError
failedParsingStructTypeError
failedMergingFieldsError
cannotMergeDecimalTypesWithIncompatiblePrecisionAndScaleError
cannotMergeDecimalTypesWithIncompatiblePrecisionError
cannotMergeDecimalTypesWithIncompatibleScaleError
cannotMergeIncompatibleDataTypesError
exceedMapSizeLimitError
duplicateMapKeyFoundError
mapDataKeyArrayLengthDiffersFromValueArrayLengthError
fieldDiffersFromDerivedLocalDateError
failToParseDateTimeInNewParserError
failToFormatDateTimeInNewFormatterError
failToRecognizePatternAfterUpgradeError
failToRecognizePatternError
cannotCastUTF8StringToDataTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor tenth set of 20 query execution errors to use error classes
> ---------------------------------------------------------------------
>
> Key: SPARK-36299
> URL: https://issues.apache.org/jira/browse/SPARK-36299
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36299) Refactor tenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36299:
-----------------------------------

Summary: Refactor tenth set of 20 query execution errors to use error classes
Key: SPARK-36299
URL: https://issues.apache.org/jira/browse/SPARK-36299
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the ninth set of 20.

{code:java}
unscaledValueTooLargeForPrecisionError
decimalPrecisionExceedsMaxPrecisionError
outOfDecimalTypeRangeError
unsupportedArrayTypeError
unsupportedJavaTypeError
failedParsingStructTypeError
failedMergingFieldsError
cannotMergeDecimalTypesWithIncompatiblePrecisionAndScaleError
cannotMergeDecimalTypesWithIncompatiblePrecisionError
cannotMergeDecimalTypesWithIncompatibleScaleError
cannotMergeIncompatibleDataTypesError
exceedMapSizeLimitError
duplicateMapKeyFoundError
mapDataKeyArrayLengthDiffersFromValueArrayLengthError
fieldDiffersFromDerivedLocalDateError
failToParseDateTimeInNewParserError
failToFormatDateTimeInNewFormatterError
failToRecognizePatternAfterUpgradeError
failToRecognizePatternError
cannotCastUTF8StringToDataTypeError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Created] (SPARK-36300) Refactor eleventh set of 20 query execution errors to use error classes
Karen Feng created SPARK-36300:
-----------------------------------

Summary: Refactor eleventh set of 20 query execution errors to use error classes
Key: SPARK-36300
URL: https://issues.apache.org/jira/browse/SPARK-36300
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the tenth set of 20.

{code:java}
registeringStreamingQueryListenerError
concurrentQueryInstanceError
cannotParseJsonArraysAsStructsError
cannotParseStringAsDataTypeError
failToParseEmptyStringForDataTypeError
failToParseValueForDataTypeError
rootConverterReturnNullError
cannotHaveCircularReferencesInBeanClassError
cannotHaveCircularReferencesInClassError
cannotUseInvalidJavaIdentifierAsFieldNameError
cannotFindEncoderForTypeError
attributesForTypeUnsupportedError
schemaForTypeUnsupportedError
cannotFindConstructorForTypeError
paramExceedOneCharError
paramIsNotIntegerError
paramIsNotBooleanValueError
foundNullValueForNotNullableFieldError
malformedCSVRecordError
elementsOfTupleExceedLimitError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36297) Refactor eighth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36297:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eighth set of 20.

{code:java}
executeBroadcastTimeoutError
cannotCompareCostWithTargetCostError
unsupportedDataTypeError
notSupportTypeError
notSupportNonPrimitiveTypeError
unsupportedTypeError
useDictionaryEncodingWhenDictionaryOverflowError
endOfIteratorError
cannotAllocateMemoryToGrowBytesToBytesMapError
cannotAcquireMemoryToBuildLongHashedRelationError
cannotAcquireMemoryToBuildUnsafeHashedRelationError
rowLargerThan256MUnsupportedError
cannotBuildHashedRelationWithUniqueKeysExceededError
cannotBuildHashedRelationLargerThan8GError
failedToPushRowIntoRowQueueError
unexpectedWindowFunctionFrameError
cannotParseStatisticAsPercentileError
statisticNotRecognizedError
unknownColumnError
unexpectedAccumulableUpdateValueError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the seventh set of 20.

{code:java}
missingJdbcTableNameAndQueryError
emptyOptionError
invalidJdbcTxnIsolationLevelError
cannotGetJdbcTypeError
unrecognizedSqlTypeError
unsupportedJdbcTypeError
unsupportedArrayElementTypeBasedOnBinaryError
nestedArraysUnsupportedError
cannotTranslateNonNullValueForFieldError
invalidJdbcNumPartitionsError
transactionUnsupportedByJdbcServerError
dataTypeUnsupportedYetError
unsupportedOperationForDataTypeError
inputFilterNotFullyConvertibleError
cannotReadFooterForFileError
cannotReadFooterForFileError
foundDuplicateFieldInCaseInsensitiveModeError
failedToMergeIncompatibleSchemasError
ddlUnsupportedTemporarilyError
operatingOnCanonicalizationPlanError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor eighth set of 20 query execution errors to use error classes
> ----------------------------------------------------------------------
>
> Key: SPARK-36297
> URL: https://issues.apache.org/jira/browse/SPARK-36297
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36298) Refactor ninth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36298:
-----------------------------------

Summary: Refactor ninth set of 20 query execution errors to use error classes
Key: SPARK-36298
URL: https://issues.apache.org/jira/browse/SPARK-36298
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eighth set of 20.

{code:java}
executeBroadcastTimeoutError
cannotCompareCostWithTargetCostError
unsupportedDataTypeError
notSupportTypeError
notSupportNonPrimitiveTypeError
unsupportedTypeError
useDictionaryEncodingWhenDictionaryOverflowError
endOfIteratorError
cannotAllocateMemoryToGrowBytesToBytesMapError
cannotAcquireMemoryToBuildLongHashedRelationError
cannotAcquireMemoryToBuildUnsafeHashedRelationError
rowLargerThan256MUnsupportedError
cannotBuildHashedRelationWithUniqueKeysExceededError
cannotBuildHashedRelationLargerThan8GError
failedToPushRowIntoRowQueueError
unexpectedWindowFunctionFrameError
cannotParseStatisticAsPercentileError
statisticNotRecognizedError
unknownColumnError
unexpectedAccumulableUpdateValueError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36296) Refactor seventh set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36296:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the seventh set of 20.

{code:java}
missingJdbcTableNameAndQueryError
emptyOptionError
invalidJdbcTxnIsolationLevelError
cannotGetJdbcTypeError
unrecognizedSqlTypeError
unsupportedJdbcTypeError
unsupportedArrayElementTypeBasedOnBinaryError
nestedArraysUnsupportedError
cannotTranslateNonNullValueForFieldError
invalidJdbcNumPartitionsError
transactionUnsupportedByJdbcServerError
dataTypeUnsupportedYetError
unsupportedOperationForDataTypeError
inputFilterNotFullyConvertibleError
cannotReadFooterForFileError
cannotReadFooterForFileError
foundDuplicateFieldInCaseInsensitiveModeError
failedToMergeIncompatibleSchemasError
ddlUnsupportedTemporarilyError
operatingOnCanonicalizationPlanError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the sixth set of 20.

{code:java}
noRecordsFromEmptyDataReaderError
fileNotFoundError
unsupportedSchemaColumnConvertError
cannotReadParquetFilesError
cannotCreateColumnarReaderError
invalidNamespaceNameError
unsupportedPartitionTransformError
missingDatabaseLocationError
cannotRemoveReservedPropertyError
namespaceNotEmptyError
writingJobFailedError
writingJobAbortedError
commitDeniedError
unsupportedTableWritesError
cannotCreateJDBCTableWithPartitionsError
unsupportedUserSpecifiedSchemaError
writeUnsupportedForBinaryFileDataSourceError
fileLengthExceedsMaxLengthError
unsupportedFieldNameError
cannotSpecifyBothJdbcTableNameAndQueryError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor seventh set of 20 query execution errors to use error classes
> -----------------------------------------------------------------------
>
> Key: SPARK-36296
> URL: https://issues.apache.org/jira/browse/SPARK-36296
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
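Many errors in these JDBC- and data-source-heavy sets map naturally onto standard SQLSTATEs, which is part of the motivation for error classes: clients can branch on the five-character code instead of parsing Spark's message text. A small sketch, assuming such an error reaches the client with its SQLSTATE set; the helper name is ours, not a Spark API:

{code:scala}
import java.sql.SQLException

object SqlStateSketch {
  // "0A" is the standard SQLSTATE class for "feature not supported"; whether a
  // given Spark error surfaces that code over JDBC is an assumption here.
  def isFeatureNotSupported(e: SQLException): Boolean =
    e.getSQLState != null && e.getSQLState.startsWith("0A")
}
{code}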
[jira] [Created] (SPARK-36297) Refactor eighth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36297:
-----------------------------------

Summary: Refactor eighth set of 20 query execution errors to use error classes
Key: SPARK-36297
URL: https://issues.apache.org/jira/browse/SPARK-36297
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the seventh set of 20.

{code:java}
missingJdbcTableNameAndQueryError
emptyOptionError
invalidJdbcTxnIsolationLevelError
cannotGetJdbcTypeError
unrecognizedSqlTypeError
unsupportedJdbcTypeError
unsupportedArrayElementTypeBasedOnBinaryError
nestedArraysUnsupportedError
cannotTranslateNonNullValueForFieldError
invalidJdbcNumPartitionsError
transactionUnsupportedByJdbcServerError
dataTypeUnsupportedYetError
unsupportedOperationForDataTypeError
inputFilterNotFullyConvertibleError
cannotReadFooterForFileError
cannotReadFooterForFileError
foundDuplicateFieldInCaseInsensitiveModeError
failedToMergeIncompatibleSchemasError
ddlUnsupportedTemporarilyError
operatingOnCanonicalizationPlanError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Created] (SPARK-36296) Refactor seventh set of 20 query execution errors to use error classes
Karen Feng created SPARK-36296:
-----------------------------------

Summary: Refactor seventh set of 20 query execution errors to use error classes
Key: SPARK-36296
URL: https://issues.apache.org/jira/browse/SPARK-36296
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the sixth set of 20.

{code:java}
noRecordsFromEmptyDataReaderError
fileNotFoundError
unsupportedSchemaColumnConvertError
cannotReadParquetFilesError
cannotCreateColumnarReaderError
invalidNamespaceNameError
unsupportedPartitionTransformError
missingDatabaseLocationError
cannotRemoveReservedPropertyError
namespaceNotEmptyError
writingJobFailedError
writingJobAbortedError
commitDeniedError
unsupportedTableWritesError
cannotCreateJDBCTableWithPartitionsError
unsupportedUserSpecifiedSchemaError
writeUnsupportedForBinaryFileDataSourceError
fileLengthExceedsMaxLengthError
unsupportedFieldNameError
cannotSpecifyBothJdbcTableNameAndQueryError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36295) Refactor sixth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36295:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the sixth set of 20.

{code:java}
noRecordsFromEmptyDataReaderError
fileNotFoundError
unsupportedSchemaColumnConvertError
cannotReadParquetFilesError
cannotCreateColumnarReaderError
invalidNamespaceNameError
unsupportedPartitionTransformError
missingDatabaseLocationError
cannotRemoveReservedPropertyError
namespaceNotEmptyError
writingJobFailedError
writingJobAbortedError
commitDeniedError
unsupportedTableWritesError
cannotCreateJDBCTableWithPartitionsError
unsupportedUserSpecifiedSchemaError
writeUnsupportedForBinaryFileDataSourceError
fileLengthExceedsMaxLengthError
unsupportedFieldNameError
cannotSpecifyBothJdbcTableNameAndQueryError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fifth set of 20.

{code:java}
createStreamingSourceNotSpecifySchemaError
streamedOperatorUnsupportedByDataSourceError
multiplePathsSpecifiedError
failedToFindDataSourceError
removedClassInSpark2Error
incompatibleDataSourceRegisterError
unrecognizedFileFormatError
sparkUpgradeInReadingDatesError
sparkUpgradeInWritingDatesError
buildReaderUnsupportedForFileFormatError
jobAbortedError
taskFailedWhileWritingRowsError
readCurrentFileNotFoundError
unsupportedSaveModeError
cannotClearOutputDirectoryError
cannotClearPartitionDirectoryError
failedToCastValueToDataTypeForPartitionColumnError
endOfStreamError
fallbackV1RelationReportsInconsistentSchemaError
cannotDropNonemptyNamespaceError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor sixth set of 20 query execution errors to use error classes
> ---------------------------------------------------------------------
>
> Key: SPARK-36295
> URL: https://issues.apache.org/jira/browse/SPARK-36295
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36295) Refactor sixth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36295:
-----------------------------------

Summary: Refactor sixth set of 20 query execution errors to use error classes
Key: SPARK-36295
URL: https://issues.apache.org/jira/browse/SPARK-36295
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fifth set of 20.

{code:java}
createStreamingSourceNotSpecifySchemaError
streamedOperatorUnsupportedByDataSourceError
multiplePathsSpecifiedError
failedToFindDataSourceError
removedClassInSpark2Error
incompatibleDataSourceRegisterError
unrecognizedFileFormatError
sparkUpgradeInReadingDatesError
sparkUpgradeInWritingDatesError
buildReaderUnsupportedForFileFormatError
jobAbortedError
taskFailedWhileWritingRowsError
readCurrentFileNotFoundError
unsupportedSaveModeError
cannotClearOutputDirectoryError
cannotClearPartitionDirectoryError
failedToCastValueToDataTypeForPartitionColumnError
endOfStreamError
fallbackV1RelationReportsInconsistentSchemaError
cannotDropNonemptyNamespaceError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36294) Refactor fifth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36294:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fifth set of 20.

{code:java}
createStreamingSourceNotSpecifySchemaError
streamedOperatorUnsupportedByDataSourceError
multiplePathsSpecifiedError
failedToFindDataSourceError
removedClassInSpark2Error
incompatibleDataSourceRegisterError
unrecognizedFileFormatError
sparkUpgradeInReadingDatesError
sparkUpgradeInWritingDatesError
buildReaderUnsupportedForFileFormatError
jobAbortedError
taskFailedWhileWritingRowsError
readCurrentFileNotFoundError
unsupportedSaveModeError
cannotClearOutputDirectoryError
cannotClearPartitionDirectoryError
failedToCastValueToDataTypeForPartitionColumnError
endOfStreamError
fallbackV1RelationReportsInconsistentSchemaError
cannotDropNonemptyNamespaceError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourth set of 20.

{code:java}
unableToCreateDatabaseAsFailedToCreateDirectoryError
unableToDropDatabaseAsFailedToDeleteDirectoryError
unableToCreateTableAsFailedToCreateDirectoryError
unableToDeletePartitionPathError
unableToDropTableAsFailedToDeleteDirectoryError
unableToRenameTableAsFailedToRenameDirectoryError
unableToCreatePartitionPathError
unableToRenamePartitionPathError
methodNotImplementedError
tableStatsNotSpecifiedError
unaryMinusCauseOverflowError
binaryArithmeticCauseOverflowError
failedSplitSubExpressionMsg
failedSplitSubExpressionError
failedToCompileMsg
internalCompilerError
compilerError
unsupportedTableChangeError
notADatasourceRDDPartitionError
dataPathNotSpecifiedError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor fifth set of 20 query execution errors to use error classes
> ---------------------------------------------------------------------
>
> Key: SPARK-36294
> URL: https://issues.apache.org/jira/browse/SPARK-36294
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36294) Refactor fifth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36294:
-----------------------------------

Summary: Refactor fifth set of 20 query execution errors to use error classes
Key: SPARK-36294
URL: https://issues.apache.org/jira/browse/SPARK-36294
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourth set of 20.

{code:java}
unableToCreateDatabaseAsFailedToCreateDirectoryError
unableToDropDatabaseAsFailedToDeleteDirectoryError
unableToCreateTableAsFailedToCreateDirectoryError
unableToDeletePartitionPathError
unableToDropTableAsFailedToDeleteDirectoryError
unableToRenameTableAsFailedToRenameDirectoryError
unableToCreatePartitionPathError
unableToRenamePartitionPathError
methodNotImplementedError
tableStatsNotSpecifiedError
unaryMinusCauseOverflowError
binaryArithmeticCauseOverflowError
failedSplitSubExpressionMsg
failedSplitSubExpressionError
failedToCompileMsg
internalCompilerError
compilerError
unsupportedTableChangeError
notADatasourceRDDPartitionError
dataPathNotSpecifiedError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36293) Refactor fourth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36293:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourth set of 20.

{code:java}
unableToCreateDatabaseAsFailedToCreateDirectoryError
unableToDropDatabaseAsFailedToDeleteDirectoryError
unableToCreateTableAsFailedToCreateDirectoryError
unableToDeletePartitionPathError
unableToDropTableAsFailedToDeleteDirectoryError
unableToRenameTableAsFailedToRenameDirectoryError
unableToCreatePartitionPathError
unableToRenamePartitionPathError
methodNotImplementedError
tableStatsNotSpecifiedError
unaryMinusCauseOverflowError
binaryArithmeticCauseOverflowError
failedSplitSubExpressionMsg
failedSplitSubExpressionError
failedToCompileMsg
internalCompilerError
compilerError
unsupportedTableChangeError
notADatasourceRDDPartitionError
dataPathNotSpecifiedError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second set of 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor fourth set of 20 query execution errors to use error classes
> ----------------------------------------------------------------------
>
> Key: SPARK-36293
> URL: https://issues.apache.org/jira/browse/SPARK-36293
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36293) Refactor fourth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36293:
----------------------------------

             Summary: Refactor fourth set of 20 query execution errors to use error classes
                 Key: SPARK-36293
                 URL: https://issues.apache.org/jira/browse/SPARK-36293
             Project: Spark
          Issue Type: Sub-task
          Components: Spark Core, SQL
    Affects Versions: 3.2.0
            Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second set of 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36107) Refactor first set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36107:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first set of 20.

{code:java}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first 20.

{code}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].
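Among this first set, divideByZeroError shows the target shape most compactly. Roughly, under the 3.2-era framework (a sketch; the exact constructor signature may differ):

{code:scala}
import org.apache.spark.SparkArithmeticException

object DivideByZeroSketch {
  // The "DIVIDE_BY_ZERO" entry in error-classes.json supplies the message
  // template and the standard SQLSTATE for division by zero (22012).
  def divideByZeroError(): ArithmeticException = {
    new SparkArithmeticException(errorClass = "DIVIDE_BY_ZERO", messageParameters = Array.empty)
  }
}
{code}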
[jira] [Updated] (SPARK-36291) Refactor second set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36291:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second set of 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36292) Refactor third set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36292:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the third set of 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36291) Refactor second set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36291:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Created] (SPARK-36292) Refactor third set of 20 query execution errors to use error classes
Karen Feng created SPARK-36292:
----------------------------------

             Summary: Refactor third set of 20 query execution errors to use error classes
                 Key: SPARK-36292
                 URL: https://issues.apache.org/jira/browse/SPARK-36292
             Project: Spark
          Issue Type: Sub-task
          Components: Spark Core, SQL
    Affects Versions: 3.2.0
            Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.
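Note that the lists mix exception factories (names ending in Error) with plain message builders (names ending in Msg, such as fieldCannotBeNullMsg above), which return strings for use in generated code rather than throwables; the refactor has to preserve that split. A sketch with hypothetical shapes, not Spark's actual signatures:

{code:scala}
object MsgVsErrorSketch {
  // A "...Msg" helper returns only the message text (usable from codegen
  // templates); the paired "...Error" factory wraps it in a throwable.
  def fieldCannotBeNullMsg(index: Int, fieldName: String): String =
    s"The ${index}th field '$fieldName' of input row cannot be null."

  def fieldCannotBeNullError(index: Int, fieldName: String): Throwable =
    new RuntimeException(fieldCannotBeNullMsg(index, fieldName))
}
{code}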
[jira] [Updated] (SPARK-36291) Refactor second set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36291:
-------------------------------
        Summary: Refactor second set of 20 query execution errors to use error classes  (was: Refactor second 20 query execution errors to use error classes)
[jira] [Updated] (SPARK-36107) Refactor first set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36107:
-------------------------------
        Summary: Refactor first set of 20 query execution errors to use error classes  (was: Refactor first 20 query execution errors to use error classes)
[jira] [Updated] (SPARK-36291) Refactor second 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36291:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first 20.

{code}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].
[jira] [Updated] (SPARK-36107) Refactor first 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36107:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first 20.

{code}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on a few.

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].

        Summary: Refactor first 20 query execution errors to use error classes  (was: Refactor a few query execution errors to use error classes)
[jira] [Created] (SPARK-36291) Refactor second 20 query execution errors to use error classes
Karen Feng created SPARK-36291:
----------------------------------

             Summary: Refactor second 20 query execution errors to use error classes
                 Key: SPARK-36291
                 URL: https://issues.apache.org/jira/browse/SPARK-36291
             Project: Spark
          Issue Type: Sub-task
          Components: Spark Core, SQL
    Affects Versions: 3.2.0
            Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first 20.

{code}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].
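Mechanically, every subtask in this series reduces to the same lookup: resolve an error class to a message template and interpolate the parameters. A toy, self-contained sketch of that mechanism (not Spark's actual implementation, which reads the registry from error-classes.json):

{code:scala}
object ErrorClassLookupSketch {
  // Toy registry mapping an error class to a printf-style message template.
  private val errorClasses: Map[String, String] = Map(
    "DIVIDE_BY_ZERO" -> "divide by zero",
    "UNSUPPORTED_OPERATION" -> "The operation is not supported: %s")

  // Look up the template for an error class and fill in its parameters.
  def getMessage(errorClass: String, parameters: Array[String]): String = {
    val template = errorClasses.getOrElse(
      errorClass,
      throw new IllegalArgumentException(s"Cannot find error class '$errorClass'"))
    String.format(template, parameters: _*)
  }

  def main(args: Array[String]): Unit = {
    // Prints: The operation is not supported: DISTINCT aggregates
    println(getMessage("UNSUPPORTED_OPERATION", Array("DISTINCT aggregates")))
  }
}
{code}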