[jira] [Resolved] (SPARK-36247) check string length for char/varchar and apply type coercion in UPDATE/MERGE command
[ https://issues.apache.org/jira/browse/SPARK-36247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-36247. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33468 [https://github.com/apache/spark/pull/33468] > check string length for char/varchar and apply type coercion in UPDATE/MERGE > command > > > Key: SPARK-36247 > URL: https://issues.apache.org/jira/browse/SPARK-36247 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > Fix For: 3.2.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
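For context, a minimal sketch of the length check this fix extends to UPDATE/MERGE, assuming Spark 3.1+ char/varchar semantics (SPARK-33480); the table name and the commented-out v2 UPDATE are hypothetical, since UPDATE/MERGE only run against catalogs that support them:
{code:python}
# INSERT has enforced char/varchar length limits since Spark 3.1;
# SPARK-36247 applies the same check (plus type coercion) to UPDATE/MERGE.
spark.sql("CREATE TABLE chars_demo (c CHAR(5), v VARCHAR(5)) USING parquet")
spark.sql("INSERT INTO chars_demo VALUES ('spark', 'spark')")  # fits: exactly 5 chars
spark.sql("INSERT INTO chars_demo VALUES ('toolong', 'x')")    # fails the length check
# hypothetical v2 table; plain v1 tables do not support UPDATE at all
# spark.sql("UPDATE cat.db.chars_demo SET c = 'toolong' WHERE v = 'x'")  # now also checked
{code}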
[jira] [Assigned] (SPARK-36247) check string length for char/varchar and apply type coercion in UPDATE/MERGE command
[ https://issues.apache.org/jira/browse/SPARK-36247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-36247: --- Assignee: Wenchen Fan > check string length for char/varchar and apply type coercion in UPDATE/MERGE > command > > > Key: SPARK-36247 > URL: https://issues.apache.org/jira/browse/SPARK-36247 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.2.0 >Reporter: Wenchen Fan >Assignee: Wenchen Fan >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36136) Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive
[ https://issues.apache.org/jira/browse/SPARK-36136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387778#comment-17387778 ] Apache Spark commented on SPARK-36136: -- User 'viirya' has created a pull request for this issue: https://github.com/apache/spark/pull/33533 > Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive > > > Key: SPARK-36136 > URL: https://issues.apache.org/jira/browse/SPARK-36136 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.2.0 >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Minor > Fix For: 3.2.0 > > > Currently both {{PruneFileSourcePartitionsSuite}} and > {{PrunePartitionSuiteBase}} are in {{org.apache.spark.sql.hive.execution}} > which doesn't look right. They should belong to > {{org.apache.spark.sql.execution.datasources}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387766#comment-17387766 ] Fu Chen commented on SPARK-36277: - This bug also persists when running on the latest master branch. The user-defined schema is pruned by the rule `ColumnPruning`: {noformat} === Applying Rule org.apache.spark.sql.catalyst.optimizer.ColumnPruning === Aggregate [count(1) AS count#29L] Aggregate [count(1) AS count#29L] !+- Relation [firstname#0,middlename#1,lastname#2,id#3,gender#4,salary#5] csv +- Project ! +- Relation [firstname#0,middlename#1,lastname#2,id#3,gender#4,salary#5] csv {noformat} {noformat} *(2) HashAggregate(keys=[], functions=[count(1)], output=[count#29L]) +- ShuffleQueryStage 0 +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [id=#22] +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#32L]) +- FileScan csv [] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/tmp/sample.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<> {noformat} But note that when we read CSV files with DROPMALFORMED mode, the `UnivocityParser` needs the schema to judge whether a record is corrupt or not, so `FileScan` scans all records in the CSV file including corrupted records (salary = 'NA' is corrupt because the user-defined schema declares salary as Integer). When we disable the rule `ColumnPruning`, the result is what we want (in other words, `UnivocityParser` needs the schema when we scan CSV files in DROPMALFORMED mode): {code:java} spark.sql("set spark.sql.optimizer.excludedRules=org.apache.spark.sql.catalyst.optimizer.ColumnPruning") {code} Any suggestions or ideas to fix this bug? [~hyukjin.kwon] > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . 
> here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ > StructField("lastname",StringType(),True), \ > StructField("id", StringType(), True), \ > StructField("gender", StringType(), True), \ > StructField("salary", IntegerType(), True) \ > ]) > with_schema_df = > spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") > print("The dataframe with schema") > with_schema_df.show() > print("The dataframe without schema") > without_schema_df.show() > cnt_with_schema=with_schema_df.count() > print("The records count from with schema df :"+str(cnt_with_schema)) > cnt_without_schema=without_schema_df.count() > print("The records count from without schema df: "+str(cnt_without_schema)) > {code} > here is the outputs screen shot 111.PNG is the outputs of the code and > inputfile.csv is the input to the code > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
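Since the attached sample.csv and S3 path are not reproduced here, a self-contained local sketch of the reported mismatch, with hypothetical data mirroring the description (a salary of 'NA' that violates the Integer field); see Fu Chen's comment above for the ColumnPruning root cause:
{code:python}
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

path = "/tmp/sample_csv"  # hypothetical local stand-in for the reporter's S3 path
spark.createDataFrame(
    [("James", "", "Smith", "36636", "M", "3000"),
     ("Maria", "Anne", "Jones", "39192", "F", "NA")],  # malformed: salary is not an int
    ["firstname", "middlename", "lastname", "id", "gender", "salary"],
).write.mode("overwrite").option("header", True).csv(path)

schema = StructType([
    StructField("firstname", StringType(), True),
    StructField("middlename", StringType(), True),
    StructField("lastname", StringType(), True),
    StructField("id", StringType(), True),
    StructField("gender", StringType(), True),
    StructField("salary", IntegerType(), True),
])
df = spark.read.csv(path, header=True, schema=schema, mode="DROPMALFORMED")
df.show()          # the malformed row is dropped from the display
print(df.count())  # can still report 2: ColumnPruning empties the read schema
{code}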
[jira] [Assigned] (SPARK-36285) Skip MiMa in PySpark GHA job
[ https://issues.apache.org/jira/browse/SPARK-36285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36285: Assignee: Apache Spark > Skip MiMa in PySpark GHA job > > > Key: SPARK-36285 > URL: https://issues.apache.org/jira/browse/SPARK-36285 > Project: Spark > Issue Type: Improvement > Components: Project Infra, Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36285) Skip MiMa in PySpark GHA job
[ https://issues.apache.org/jira/browse/SPARK-36285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36285: Assignee: (was: Apache Spark) > Skip MiMa in PySpark GHA job > > > Key: SPARK-36285 > URL: https://issues.apache.org/jira/browse/SPARK-36285 > Project: Spark > Issue Type: Improvement > Components: Project Infra, Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36285) Skip MiMa in PySpark GHA job
[ https://issues.apache.org/jira/browse/SPARK-36285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387765#comment-17387765 ] Apache Spark commented on SPARK-36285: -- User 'williamhyun' has created a pull request for this issue: https://github.com/apache/spark/pull/33532 > Skip MiMa in PySpark GHA job > > > Key: SPARK-36285 > URL: https://issues.apache.org/jira/browse/SPARK-36285 > Project: Spark > Issue Type: Improvement > Components: Project Infra, Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36312) ParquetWriter should check inner field
[ https://issues.apache.org/jira/browse/SPARK-36312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36312: Assignee: Apache Spark > ParquetWriter should check inner field > --- > > Key: SPARK-36312 > URL: https://issues.apache.org/jira/browse/SPARK-36312 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Assignee: Apache Spark >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36312) ParquetWriter should check inner field
[ https://issues.apache.org/jira/browse/SPARK-36312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36312: Assignee: (was: Apache Spark) > ParquetWriter should check inner field > --- > > Key: SPARK-36312 > URL: https://issues.apache.org/jira/browse/SPARK-36312 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36312) ParquetWriter should check inner field
[ https://issues.apache.org/jira/browse/SPARK-36312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387755#comment-17387755 ] Apache Spark commented on SPARK-36312: -- User 'AngersZh' has created a pull request for this issue: https://github.com/apache/spark/pull/33531 > ParquetWriter should check inner field > --- > > Key: SPARK-36312 > URL: https://issues.apache.org/jira/browse/SPARK-36312 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: angerszhu >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36312) ParquetWriter should check inner field
angerszhu created SPARK-36312: - Summary: ParquetWriter should check inner field Key: SPARK-36312 URL: https://issues.apache.org/jira/browse/SPARK-36312 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 3.2.0 Reporter: angerszhu -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387733#comment-17387733 ] anju edited comment on SPARK-36277 at 7/27/21, 3:33 AM: [~hyukjin.kwon] Sure, let me check and update. Which version would you suggest? was (Author: datumgirl): Sure let me check and update > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . > here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ > StructField("lastname",StringType(),True), \ > StructField("id", StringType(), True), \ > StructField("gender", StringType(), True), \ > StructField("salary", IntegerType(), True) \ > ]) > with_schema_df = > spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") > print("The dataframe with schema") > with_schema_df.show() > print("The dataframe without schema") > without_schema_df.show() > cnt_with_schema=with_schema_df.count() > print("The records count from with schema df :"+str(cnt_with_schema)) > cnt_without_schema=without_schema_df.count() > print("The records count from without schema df: "+str(cnt_without_schema)) > {code} > here is the outputs screen shot 111.PNG is the outputs of the code and > inputfile.csv is the input to the code > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387733#comment-17387733 ] anju commented on SPARK-36277: -- Sure let me check and update > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . > here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ > StructField("lastname",StringType(),True), \ > StructField("id", StringType(), True), \ > StructField("gender", StringType(), True), \ > StructField("salary", IntegerType(), True) \ > ]) > with_schema_df = > spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") > print("The dataframe with schema") > with_schema_df.show() > print("The dataframe without schema") > without_schema_df.show() > cnt_with_schema=with_schema_df.count() > print("The records count from with schema df :"+str(cnt_with_schema)) > cnt_without_schema=without_schema_df.count() > print("The records count from without schema df: "+str(cnt_without_schema)) > {code} > here is the outputs screen shot 111.PNG is the outputs of the code and > inputfile.csv is the input to the code > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36288) Update API usage on pyspark pandas documents
[ https://issues.apache.org/jira/browse/SPARK-36288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36288: Assignee: Leona Yoda > Update API usage on pyspark pandas documents > > > Key: SPARK-36288 > URL: https://issues.apache.org/jira/browse/SPARK-36288 > Project: Spark > Issue Type: Improvement > Components: Documentation, PySpark >Affects Versions: 3.2.0 >Reporter: Leona Yoda >Assignee: Leona Yoda >Priority: Minor > > I found several warning messages when I tested around ported Spark Pandas API > Documents (https://issues.apache.org/jira/browse/SPARK-34885). > 1. `spark.sql.execution.arrow.enabled` on Best Practice document > {code:java} > 21/07/26 05:42:02 WARN SQLConf: The SQL config > 'spark.sql.execution.arrow.enabled' has been deprecated in Spark v3.0 and may > be removed in the future. Use 'spark.sql.execution.arrow.pyspark.enabled' > instead of it. > {code} > > 2. `DataFrame.to_spark_io` on From/to other DBMSes document > {code:java} > /opt/spark/python/lib/pyspark.zip/pyspark/pandas/frame.py:4811: > FutureWarning: Deprecated in 3.2, Use spark.to_spark_io instead. > warnings.warn("Deprecated in 3.2, Use spark.to_spark_io instead.", > FutureWarning) > {code} > > At this time it worked but I think it's better to update API usage on those > documents. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36288) Update API usage on pyspark pandas documents
[ https://issues.apache.org/jira/browse/SPARK-36288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36288. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33519 [https://github.com/apache/spark/pull/33519] > Update API usage on pyspark pandas documents > > > Key: SPARK-36288 > URL: https://issues.apache.org/jira/browse/SPARK-36288 > Project: Spark > Issue Type: Improvement > Components: Documentation, PySpark >Affects Versions: 3.2.0 >Reporter: Leona Yoda >Assignee: Leona Yoda >Priority: Minor > Fix For: 3.2.0 > > > I found several warning messages when I tested around ported Spark Pandas API > Documents (https://issues.apache.org/jira/browse/SPARK-34885). > 1. `spark.sql.execution.arrow.enabled` on Best Practice document > {code:java} > 21/07/26 05:42:02 WARN SQLConf: The SQL config > 'spark.sql.execution.arrow.enabled' has been deprecated in Spark v3.0 and may > be removed in the future. Use 'spark.sql.execution.arrow.pyspark.enabled' > instead of it. > {code} > > 2. `DataFrame.to_spark_io` on From/to other DBMSes document > {code:java} > /opt/spark/python/lib/pyspark.zip/pyspark/pandas/frame.py:4811: > FutureWarning: Deprecated in 3.2, Use spark.to_spark_io instead. > warnings.warn("Deprecated in 3.2, Use spark.to_spark_io instead.", > FutureWarning) > {code} > > At this time it worked but I think it's better to update API usage on those > documents. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
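A short sketch of the replacements those documents now point to, taken from the deprecation warnings quoted above (the output path is hypothetical):
{code:python}
import pyspark.pandas as ps

# the deprecated 'spark.sql.execution.arrow.enabled' becomes:
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")

psdf = ps.DataFrame({"a": [1, 2, 3]})
# DataFrame.to_spark_io is deprecated in 3.2; the spark accessor replaces it
psdf.spark.to_spark_io("/tmp/out_orc", format="orc", mode="overwrite")
{code}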
[jira] [Resolved] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36267. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33528 [https://github.com/apache/spark/pull/33528] > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Assignee: Takuya Ueshin >Priority: Major > Fix For: 3.2.0 > > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387729#comment-17387729 ] Hyukjin Kwon commented on SPARK-36277: -- [~datumgirl], Spark 2.4 is EOL so the bug won't likely be fixed. Can you try higher versions of Spark and see if the bug persists? > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . > here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ > StructField("lastname",StringType(),True), \ > StructField("id", StringType(), True), \ > StructField("gender", StringType(), True), \ > StructField("salary", IntegerType(), True) \ > ]) > with_schema_df = > spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") > print("The dataframe with schema") > with_schema_df.show() > print("The dataframe without schema") > without_schema_df.show() > cnt_with_schema=with_schema_df.count() > print("The records count from with schema df :"+str(cnt_with_schema)) > cnt_without_schema=without_schema_df.count() > print("The records count from without schema df: "+str(cnt_without_schema)) > {code} > here is the outputs screen shot 111.PNG is the outputs of the code and > inputfile.csv is the input to the code > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36277) Issue with record count of data frame while reading in DropMalformed mode
[ https://issues.apache.org/jira/browse/SPARK-36277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-36277: - Description: I am writing the steps to reproduce the issue for "count" pyspark api while using mode as dropmalformed. I have a csv sample file in s3 bucket . I am reading the file using pyspark api for csv . I am reading the csv "without schema" and "with schema using mode 'dropmalformed' options in two different dataframes . While displaying the "with schema using mode 'dropmalformed'" dataframe , the display looks good ,it is not showing the malformed records .But when we apply count api on the dataframe it gives the record count of actual file. I am expecting it should give me valid record count . here is the code used:- {code} without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) schema = StructType([ \ StructField("firstname",StringType(),True), \ StructField("middlename",StringType(),True), \ StructField("lastname",StringType(),True), \ StructField("id", StringType(), True), \ StructField("gender", StringType(), True), \ StructField("salary", IntegerType(), True) \ ]) with_schema_df = spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") print("The dataframe with schema") with_schema_df.show() print("The dataframe without schema") without_schema_df.show() cnt_with_schema=with_schema_df.count() print("The records count from with schema df :"+str(cnt_with_schema)) cnt_without_schema=without_schema_df.count() print("The records count from without schema df: "+str(cnt_without_schema)) {code} here is the outputs screen shot 111.PNG is the outputs of the code and inputfile.csv is the input to the code was: I am writing the steps to reproduce the issue for "count" pyspark api while using mode as dropmalformed. I have a csv sample file in s3 bucket . I am reading the file using pyspark api for csv . I am reading the csv "without schema" and "with schema using mode 'dropmalformed' options in two different dataframes . While displaying the "with schema using mode 'dropmalformed'" dataframe , the display looks good ,it is not showing the malformed records .But when we apply count api on the dataframe it gives the record count of actual file. I am expecting it should give me valid record count . 
here is the code used:- ``` without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) schema = StructType([ \ StructField("firstname",StringType(),True), \ StructField("middlename",StringType(),True), \ StructField("lastname",StringType(),True), \ StructField("id", StringType(), True), \ StructField("gender", StringType(), True), \ StructField("salary", IntegerType(), True) \ ]) with_schema_df = spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True,schema=schema,mode="DROPMALFORMED") print("The dataframe with schema") with_schema_df.show() print("The dataframe without schema") without_schema_df.show() cnt_with_schema=with_schema_df.count() print("The records count from with schema df :"+str(cnt_with_schema)) cnt_without_schema=without_schema_df.count() print("The records count from without schema df: "+str(cnt_without_schema)) ``` here is the outputs screen shot 111.PNG is the outputs of the code and inputfile.csv is the input to the code > Issue with record count of data frame while reading in DropMalformed mode > - > > Key: SPARK-36277 > URL: https://issues.apache.org/jira/browse/SPARK-36277 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.3 >Reporter: anju >Priority: Major > Attachments: 111.PNG, Inputfile.PNG, sample.csv > > > I am writing the steps to reproduce the issue for "count" pyspark api while > using mode as dropmalformed. > I have a csv sample file in s3 bucket . I am reading the file using pyspark > api for csv . I am reading the csv "without schema" and "with schema using > mode 'dropmalformed' options in two different dataframes . While displaying > the "with schema using mode 'dropmalformed'" dataframe , the display looks > good ,it is not showing the malformed records .But when we apply count api on > the dataframe it gives the record count of actual file. I am expecting it > should give me valid record count . > here is the code used:- > {code} > without_schema_df=spark.read.csv("s3://noa-poc-lakeformation/data/test_files/sample.csv",header=True) > schema = StructType([ \ > StructField("firstname",StringType(),True), \ > StructField("middlename",StringType(),True), \ >
[jira] [Comment Edited] (SPARK-36285) Skip MiMa in PySpark GHA job
[ https://issues.apache.org/jira/browse/SPARK-36285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387073#comment-17387073 ] Hyukjin Kwon edited comment on SPARK-36285 at 7/27/21, 3:21 AM: Like SPARK-36198, could you make a PR for this, [~williamhyun]? cc [~hyukjin.kwon] was (Author: dongjoon): Like SPARK-36198, could you make a PR for this, [~williamhyun]? cc [~dongjoon] > Skip MiMa in PySpark GHA job > > > Key: SPARK-36285 > URL: https://issues.apache.org/jira/browse/SPARK-36285 > Project: Spark > Issue Type: Improvement > Components: Project Infra, Tests >Affects Versions: 3.3.0 >Reporter: Dongjoon Hyun >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36267: Assignee: Takuya Ueshin > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Assignee: Takuya Ueshin >Priority: Major > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36097) Group exception messages in core/scheduler
[ https://issues.apache.org/jira/browse/SPARK-36097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387724#comment-17387724 ] Apache Spark commented on SPARK-36097: -- User 'dgd-contributor' has created a pull request for this issue: https://github.com/apache/spark/pull/33529 > Group exception messages in core/scheduler > -- > > Key: SPARK-36097 > URL: https://issues.apache.org/jira/browse/SPARK-36097 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Priority: Major > > 'core/src/main/scala/org/apache/spark/scheduler' > || Filename|| Count || > | DAGScheduler.scala | 4 | > | LiveListenerBus.scala | 1 | > | TaskInfo.scala | 1 | > | TaskSchedulerImpl.scala | 4 | > | TaskSetManager.scala| 1 | > 'core/src/main/scala/org/apache/spark/scheduler/cluster' > || Filename|| Count || > | CoarseGrainedSchedulerBackend.scala | 2 | > 'core/src/main/scala/org/apache/spark/scheduler/dynalloc' > || Filename || Count || > | ExecutorMonitor.scala | 1 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36098) Group exception messages in core/storage
[ https://issues.apache.org/jira/browse/SPARK-36098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36098: Assignee: Apache Spark > Group exception messages in core/storage > > > Key: SPARK-36098 > URL: https://issues.apache.org/jira/browse/SPARK-36098 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Assignee: Apache Spark >Priority: Major > > 'core/src/main/scala/org/apache/spark/storage' > || Filename || Count || > | BlockId.scala | 1 | > | BlockInfoManager.scala| 2 | > | BlockManager.scala| 9 | > | BlockManagerDecommissioner.scala | 1 | > | BlockManagerMaster.scala | 2 | > | DiskBlockManager.scala| 1 | > | DiskBlockObjectWriter.scala | 1 | > | ShuffleBlockFetcherIterator.scala | 4 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36098) Group exception messages in core/storage
[ https://issues.apache.org/jira/browse/SPARK-36098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36098: Assignee: (was: Apache Spark) > Group exception messages in core/storage > > > Key: SPARK-36098 > URL: https://issues.apache.org/jira/browse/SPARK-36098 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Priority: Major > > 'core/src/main/scala/org/apache/spark/storage' > || Filename || Count || > | BlockId.scala | 1 | > | BlockInfoManager.scala| 2 | > | BlockManager.scala| 9 | > | BlockManagerDecommissioner.scala | 1 | > | BlockManagerMaster.scala | 2 | > | DiskBlockManager.scala| 1 | > | DiskBlockObjectWriter.scala | 1 | > | ShuffleBlockFetcherIterator.scala | 4 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36097) Group exception messages in core/scheduler
[ https://issues.apache.org/jira/browse/SPARK-36097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36097: Assignee: (was: Apache Spark) > Group exception messages in core/scheduler > -- > > Key: SPARK-36097 > URL: https://issues.apache.org/jira/browse/SPARK-36097 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Priority: Major > > 'core/src/main/scala/org/apache/spark/scheduler' > || Filename|| Count || > | DAGScheduler.scala | 4 | > | LiveListenerBus.scala | 1 | > | TaskInfo.scala | 1 | > | TaskSchedulerImpl.scala | 4 | > | TaskSetManager.scala| 1 | > 'core/src/main/scala/org/apache/spark/scheduler/cluster' > || Filename|| Count || > | CoarseGrainedSchedulerBackend.scala | 2 | > 'core/src/main/scala/org/apache/spark/scheduler/dynalloc' > || Filename || Count || > | ExecutorMonitor.scala | 1 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36097) Group exception messages in core/scheduler
[ https://issues.apache.org/jira/browse/SPARK-36097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36097: Assignee: Apache Spark > Group exception messages in core/scheduler > -- > > Key: SPARK-36097 > URL: https://issues.apache.org/jira/browse/SPARK-36097 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Assignee: Apache Spark >Priority: Major > > 'core/src/main/scala/org/apache/spark/scheduler' > || Filename|| Count || > | DAGScheduler.scala | 4 | > | LiveListenerBus.scala | 1 | > | TaskInfo.scala | 1 | > | TaskSchedulerImpl.scala | 4 | > | TaskSetManager.scala| 1 | > 'core/src/main/scala/org/apache/spark/scheduler/cluster' > || Filename|| Count || > | CoarseGrainedSchedulerBackend.scala | 2 | > 'core/src/main/scala/org/apache/spark/scheduler/dynalloc' > || Filename || Count || > | ExecutorMonitor.scala | 1 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36098) Group exception messages in core/storage
[ https://issues.apache.org/jira/browse/SPARK-36098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36098: Assignee: Apache Spark > Group exception messages in core/storage > > > Key: SPARK-36098 > URL: https://issues.apache.org/jira/browse/SPARK-36098 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Assignee: Apache Spark >Priority: Major > > 'core/src/main/scala/org/apache/spark/storage' > || Filename || Count || > | BlockId.scala | 1 | > | BlockInfoManager.scala| 2 | > | BlockManager.scala| 9 | > | BlockManagerDecommissioner.scala | 1 | > | BlockManagerMaster.scala | 2 | > | DiskBlockManager.scala| 1 | > | DiskBlockObjectWriter.scala | 1 | > | ShuffleBlockFetcherIterator.scala | 4 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36098) Group exception messages in core/storage
[ https://issues.apache.org/jira/browse/SPARK-36098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387723#comment-17387723 ] Apache Spark commented on SPARK-36098: -- User 'dgd-contributor' has created a pull request for this issue: https://github.com/apache/spark/pull/33530 > Group exception messages in core/storage > > > Key: SPARK-36098 > URL: https://issues.apache.org/jira/browse/SPARK-36098 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.3.0 >Reporter: Allison Wang >Priority: Major > > 'core/src/main/scala/org/apache/spark/storage' > || Filename || Count || > | BlockId.scala | 1 | > | BlockInfoManager.scala| 2 | > | BlockManager.scala| 9 | > | BlockManagerDecommissioner.scala | 1 | > | BlockManagerMaster.scala | 2 | > | DiskBlockManager.scala| 1 | > | DiskBlockObjectWriter.scala | 1 | > | ShuffleBlockFetcherIterator.scala | 4 | -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36291) Refactor second set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387720#comment-17387720 ] dgd_contributor commented on SPARK-36291: - working on this > Refactor second set of 20 query execution errors to use error classes > - > > Key: SPARK-36291 > URL: https://issues.apache.org/jira/browse/SPARK-36291 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the second set of 20. > {code:java} > inputTypeUnsupportedError > invalidFractionOfSecondError > overflowInSumOfDecimalError > overflowInIntegralDivideError > mapSizeExceedArraySizeWhenZipMapError > copyNullFieldNotAllowedError > literalTypeUnsupportedError > noDefaultForDataTypeError > doGenCodeOfAliasShouldNotBeCalledError > orderedOperationUnsupportedByDataTypeError > regexGroupIndexLessThanZeroError > regexGroupIndexExceedGroupCountError > invalidUrlError > dataTypeOperationUnsupportedError > mergeUnsupportedByWindowFunctionError > dataTypeUnexpectedError > typeUnsupportedError > negativeValueUnexpectedError > addNewFunctionMismatchedWithFunctionError > cannotGenerateCodeForUncomparableTypeError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36142) Adjust exponentiation between Series with missing values and bool literal to follow pandas
[ https://issues.apache.org/jira/browse/SPARK-36142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-36142: Assignee: Yikun Jiang > Adjust exponentiation between Series with missing values and bool literal to > follow pandas > -- > > Key: SPARK-36142 > URL: https://issues.apache.org/jira/browse/SPARK-36142 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Assignee: Yikun Jiang >Priority: Major > > Currently, exponentiation between ExtensionDtypes and bools is not consistent > with pandas' behavior. > > {code:java} > >>> pser = pd.Series([1, 2, np.nan], dtype=float) > >>> psser = ps.from_pandas(pser) > >>> pser ** False > 0 1.0 > 1 1.0 > 2 1.0 > dtype: float64 > >>> psser ** False > 0 1.0 > 1 1.0 > 2 NaN > dtype: float64 > {code} > We ought to adjust that. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36142) Adjust exponentiation between Series with missing values and bool literal to follow pandas
[ https://issues.apache.org/jira/browse/SPARK-36142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-36142. -- Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33521 [https://github.com/apache/spark/pull/33521] > Adjust exponentiation between Series with missing values and bool literal to > follow pandas > -- > > Key: SPARK-36142 > URL: https://issues.apache.org/jira/browse/SPARK-36142 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Assignee: Yikun Jiang >Priority: Major > Fix For: 3.2.0 > > > Currently, exponentiation between ExtensionDtypes and bools is not consistent > with pandas' behavior. > > {code:java} > >>> pser = pd.Series([1, 2, np.nan], dtype=float) > >>> psser = ps.from_pandas(pser) > >>> pser ** False > 0 1.0 > 1 1.0 > 2 1.0 > dtype: float64 > >>> psser ** False > 0 1.0 > 1 1.0 > 2 NaN > dtype: float64 > {code} > We ought to adjust that. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
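For reference, the pandas semantics the fix follows: raising to the power False (0) returns 1 without evaluating the base, so even NaN ** False is 1.0. A quick check, assuming Spark 3.2 with this fix applied:
{code:python}
import numpy as np
import pandas as pd
import pyspark.pandas as ps

pser = pd.Series([1, 2, np.nan], dtype=float)
print(pser ** False)                  # 1.0, 1.0, 1.0 in pandas
print(ps.from_pandas(pser) ** False)  # now matches pandas: 1.0, 1.0, 1.0
{code}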
[jira] [Commented] (SPARK-36107) Refactor first set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387715#comment-17387715 ] PengLei commented on SPARK-36107: - Working on this. > Refactor first set of 20 query execution errors to use error classes > > > Key: SPARK-36107 > URL: https://issues.apache.org/jira/browse/SPARK-36107 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the first set of 20. > {code:java} > columnChangeUnsupportedError > logicalHintOperatorNotRemovedDuringAnalysisError > cannotEvaluateExpressionError > cannotGenerateCodeForExpressionError > cannotTerminateGeneratorError > castingCauseOverflowError > cannotChangeDecimalPrecisionError > invalidInputSyntaxForNumericError > cannotCastFromNullTypeError > cannotCastError > cannotParseDecimalError > simpleStringWithNodeIdUnsupportedError > evaluateUnevaluableAggregateUnsupportedError > dataTypeUnsupportedError > dataTypeUnsupportedError > failedExecuteUserDefinedFunctionError > divideByZeroError > invalidArrayIndexError > mapKeyNotExistError > rowFromCSVParserNotExpectedError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36266) Rename classes in shuffle RPC used for block push operations
[ https://issues.apache.org/jira/browse/SPARK-36266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mridul Muralidharan resolved SPARK-36266. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33340 [https://github.com/apache/spark/pull/33340] > Rename classes in shuffle RPC used for block push operations > > > Key: SPARK-36266 > URL: https://issues.apache.org/jira/browse/SPARK-36266 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 3.1.0 >Reporter: Min Shen >Priority: Major > Fix For: 3.2.0 > > > In the current implementation of push-based shuffle, we are reusing certain > code between both block fetch and block push. > This is generally good except that certain classes that are meant to be used > for both block fetch and block push now have names that indicate they are > only for block fetches, which is confusing. > This ticket renames these classes to be more generic to be reused across both > block fetch and block push. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36311) DESC TABLE in v2 should show more detailed information
angerszhu created SPARK-36311: - Summary: DESC TABLE in v2 should show more detailed information Key: SPARK-36311 URL: https://issues.apache.org/jira/browse/SPARK-36311 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.2.0 Reporter: angerszhu According to https://issues.apache.org/jira/browse/SPARK-36086?focusedCommentId=17383195=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17383195 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36267: Assignee: (was: Apache Spark) > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Priority: Major > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36267: Assignee: Apache Spark > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Assignee: Apache Spark >Priority: Major > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36267) Clean up CategoricalAccessor and CategoricalIndex.
[ https://issues.apache.org/jira/browse/SPARK-36267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387665#comment-17387665 ] Apache Spark commented on SPARK-36267: -- User 'ueshin' has created a pull request for this issue: https://github.com/apache/spark/pull/33528 > Clean up CategoricalAccessor and CategoricalIndex. > -- > > Key: SPARK-36267 > URL: https://issues.apache.org/jira/browse/SPARK-36267 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Takuya Ueshin >Priority: Major > > - Clean up the classes > - Add deprecation warnings -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36260) Add set_categories to CategoricalAccessor and CategoricalIndex
[ https://issues.apache.org/jira/browse/SPARK-36260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-36260. --- Fix Version/s: 3.2.0 Assignee: Xinrong Meng Resolution: Fixed Issue resolved by pull request 33506 https://github.com/apache/spark/pull/33506 > Add set_categories to CategoricalAccessor and CategoricalIndex > -- > > Key: SPARK-36260 > URL: https://issues.apache.org/jira/browse/SPARK-36260 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Assignee: Xinrong Meng >Priority: Major > Fix For: 3.2.0 > > > Add set_categories to CategoricalAccessor and CategoricalIndex -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
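A usage sketch of the new API, mirroring pandas' CategoricalAccessor.set_categories (the data is hypothetical):
{code:python}
import pyspark.pandas as ps

psser = ps.Series(["a", "b", "c", "a"], dtype="category")
# values not in the new category list become missing, as in pandas
print(psser.cat.set_categories(["b", "c"]))
{code}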
[jira] [Updated] (SPARK-36310) Fix hasnan(), any(), and all() window function in IndexOpsMixin
[ https://issues.apache.org/jira/browse/SPARK-36310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36310: - Summary: Fix hasnan(), any(), and all() window function in IndexOpsMixin (was: Fix hasnan(), any(), and all() window function) > Fix hasnan(), any(), and all() window function in IndexOpsMixin > --- > > Key: SPARK-36310 > URL: https://issues.apache.org/jira/browse/SPARK-36310 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > > {code:java} > File "/__w/spark/spark/python/pyspark/pandas/groupby.py", line 1497, in > pyspark.pandas.groupby.GroupBy.rank > Failed example: > df.groupby("a").rank().sort_index() > Exception raised: > ... > pyspark.sql.utils.AnalysisException: It is not allowed to use a window > function inside an aggregate function. Please use the inner window function > in a sub-query. > {code} > As shown above, hasnans() used in "rank" causes "It is not allowed to use a > window function inside an aggregate function" exception. > any() and all() have the same issue. > We shall adjust that. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36310) Fix hasnan(), any(), and all() window function
Xinrong Meng created SPARK-36310: Summary: Fix hasnan(), any(), and all() window function Key: SPARK-36310 URL: https://issues.apache.org/jira/browse/SPARK-36310 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 3.2.0 Reporter: Xinrong Meng {code:java} File "/__w/spark/spark/python/pyspark/pandas/groupby.py", line 1497, in pyspark.pandas.groupby.GroupBy.rank Failed example: df.groupby("a").rank().sort_index() Exception raised: ... pyspark.sql.utils.AnalysisException: It is not allowed to use a window function inside an aggregate function. Please use the inner window function in a sub-query. {code} As shown above, hasnans() used in "rank" causes "It is not allowed to use a window function inside an aggregate function" exception. any() and all() have the same issue. We shall adjust that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
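For background, the restriction behind the quoted exception is general Spark SQL behavior: a window expression cannot sit inside an aggregate and has to be computed in a separate step first. A minimal sketch with hypothetical column names:
{code:python}
from pyspark.sql import Window, functions as F

df = spark.createDataFrame([("a", 1), ("a", 2), ("b", 3)], ["k", "v"])
w = Window.partitionBy("k").orderBy("v")

# df.groupBy("k").agg(F.max(F.rank().over(w)))  # AnalysisException: window inside aggregate

# workaround: materialize the window column first, then aggregate over it
df.withColumn("r", F.rank().over(w)).groupBy("k").agg(F.max("r")).show()
{code}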
[jira] [Commented] (SPARK-36028) Allow Project to host outer references in scalar subqueries
[ https://issues.apache.org/jira/browse/SPARK-36028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387637#comment-17387637 ] Apache Spark commented on SPARK-36028: -- User 'allisonwang-db' has created a pull request for this issue: https://github.com/apache/spark/pull/33527 > Allow Project to host outer references in scalar subqueries > --- > > Key: SPARK-36028 > URL: https://issues.apache.org/jira/browse/SPARK-36028 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Allison Wang >Assignee: Allison Wang >Priority: Major > Fix For: 3.3.0 > > > Support Project to host outer references in subqueries, for example: > {code:sql} > SELECT (SELECT c1) FROM t > {code} > Currently, it will throw AnalysisException: > {code} > org.apache.spark.sql.AnalysisException: Expressions referencing the outer > query are not supported outside of WHERE/HAVING clauses > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
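A sketch of the now-supported query shape from the ticket, using a hypothetical temp view; the scalar subquery's Project hosts the outer reference c1, and the query is semantically just SELECT c1 FROM t:
{code:python}
spark.range(3).selectExpr("id AS c1").createOrReplaceTempView("t")
# raised the quoted AnalysisException before this change; works in 3.3.0
spark.sql("SELECT (SELECT c1) FROM t").show()
{code}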
[jira] [Commented] (SPARK-35211) Support UDT for Pandas with Arrow Disabled
[ https://issues.apache.org/jira/browse/SPARK-35211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387630#comment-17387630 ] L. C. Hsieh commented on SPARK-35211: - The Jira title looks more like a new feature/improvement. But from the description and the PR, looks like it is a bug? Could you also update it with a proper Jira title? Thanks. > Support UDT for Pandas with Arrow Disabled > -- > > Key: SPARK-35211 > URL: https://issues.apache.org/jira/browse/SPARK-35211 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.1.1 >Reporter: Darcy Shen >Priority: Major > Labels: correctness > > {code:java} > $ pip freeze > certifi==2020.12.5 > coverage==5.5 > flake8==3.9.0 > mccabe==0.6.1 > mypy==0.812 > mypy-extensions==0.4.3 > numpy==1.20.1 > pandas==1.2.3 > pyarrow==2.0.0 > pycodestyle==2.7.0 > pyflakes==2.3.0 > python-dateutil==2.8.1 > pytz==2021.1 > scipy==1.6.1 > six==1.15.0 > typed-ast==1.4.2 > typing-extensions==3.7.4.3 > xmlrunner==1.7.7 > {code} > {code} > (spark) ➜ spark git:(master) bin/pyspark > Python 3.8.8 (default, Feb 24 2021, 13:46:16) > [Clang 10.0.0 ] :: Anaconda, Inc. on darwin > Type "help", "copyright", "credits" or "license" for more information. > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > Setting default log level to "WARN". > To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use > setLogLevel(newLevel). > 21/04/24 15:51:29 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... using builtin-java classes where applicable > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/__ / .__/\_,_/_/ /_/\_\ version 3.2.0-SNAPSHOT > /_/ > Using Python version 3.8.8 (default, Feb 24 2021 13:46:16) > Spark context Web UI available at http://172.30.0.12:4040 > Spark context available as 'sc' (master = local[*], app id = > local-1619250689842). > SparkSession available as 'spark'. > >>> spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false") > >>> from pyspark.testing.sqlutils import ExamplePoint > >>> > >>> import pandas as pd > >>> > >>> pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1, 1), > >>> ExamplePoint(2, 2)])}) > >>> > >>> df = spark.createDataFrame(pdf) > >>> > >>> df.show() > +--+ > | point| > +--+ > |(0.0, 0.0)| > |(0.0, 0.0)| > +--+ > >>> df.toPandas() >point > 0 (0.0,0.0) > 1 (0.0,0.0) > >>> > >>> > {code} > The correct result should be: > {code} >point > 0 (1.0,1.0) > 1 (2.0,2.0) > {code} > The following code snippet works fine: > {code} > (spark) ➜ spark git:(sadhen/SPARK-35211) ✗ bin/pyspark > Python 3.8.8 (default, Feb 24 2021, 13:46:16) > [Clang 10.0.0 ] :: Anaconda, Inc. on darwin > Type "help", "copyright", "credits" or "license" for more information. > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > Setting default log level to "WARN". > To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use > setLogLevel(newLevel). > 21/04/24 17:08:09 WARN NativeCodeLoader: Unable to load native-hadoop library > for your platform... using builtin-java classes where applicable > Welcome to > __ > / __/__ ___ _/ /__ > _\ \/ _ \/ _ `/ __/ '_/ >/__ / .__/\_,_/_/ /_/\_\ version 3.2.0-SNAPSHOT > /_/ > Using Python version 3.8.8 (default, Feb 24 2021 13:46:16) > Spark context Web UI available at http://172.30.0.12:4040 > Spark context available as 'sc' (master = local[*], app id = > local-1619255290637). > SparkSession available as 'spark'. 
> >>> spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false") > >>> from pyspark.testing.sqlutils import ExamplePoint > >>> import pandas as pd > >>> pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1.0, 1.0), > >>> ExamplePoint(2.0, 2.0)])}) > >>> df = spark.createDataFrame(pdf) > >>> df.show() > +--+ > | point| > +--+ > |(1.0, 1.0)| > |(2.0, 2.0)| > +--+ > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
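For context on the conversion path this report exercises: with Arrow disabled, createDataFrame has to run each pandas value through the UDT's serialize() method on the way in, and toPandas() has to run the stored rows back through deserialize() on the way out; the (0.0, 0.0) output above suggests that step is not being applied correctly. Below is a minimal sketch of the round-trip contract, reusing the same ExamplePoint fixture (it assumes ExamplePointUDT is importable from pyspark.testing.sqlutils alongside ExamplePoint, and that ExamplePoint defines equality).

{code:python}
# Sketch of the lossless round-trip a UDT must provide; SPARK-35211 is
# about createDataFrame/toPandas honoring it when Arrow is disabled.
from pyspark.testing.sqlutils import ExamplePoint, ExamplePointUDT

udt = ExamplePointUDT()
point = ExamplePoint(1.0, 2.0)

internal = udt.serialize(point)       # Python object -> sqlType() value
restored = udt.deserialize(internal)  # sqlType() value -> Python object
assert restored == point              # must get (1.0, 2.0) back, not (0.0, 0.0)
{code}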
[jira] [Updated] (SPARK-34952) DS V2 Aggregate push down
[ https://issues.apache.org/jira/browse/SPARK-34952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-34952: --- Description: Push down aggregates to the data source for better performance. This will be done in two steps: 1. add aggregate push down APIs and the implementation in JDBC 2. add the implementation in Parquet. was: Push down Max/Min/Count to Parquet for better performance. > DS V2 Aggregate push down > - > > Key: SPARK-34952 > URL: https://issues.apache.org/jira/browse/SPARK-34952 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Fix For: 3.2.0 > > > Push down aggregates to the data source for better performance. This will be done > in two steps: > 1. add aggregate push down APIs and the implementation in JDBC > 2. add the implementation in Parquet. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
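As a rough illustration of what step 1 buys (a hedged sketch, not the tracked implementation: it assumes a DS v2 JDBC catalog has already been registered under the placeholder name `h2` via spark.sql.catalog.h2.* settings, and the table name is made up):

{code:python}
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# With aggregate push down, MAX(SALARY) is computed by the database and
# only the single result row crosses the wire; without it, Spark scans
# every row and aggregates on its own side.
df = spark.table("h2.test.people").agg({"salary": "max"})

# If the aggregate was pushed, the scan node of the physical plan reports
# it (e.g. a "PushedAggregates: [MAX(SALARY)]" entry in the explain output).
df.explain()
{code}

Step 2 would let Parquet answer MIN/MAX/COUNT the same way from footer statistics instead of scanning row groups.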
[jira] [Updated] (SPARK-34952) DS V2 Aggregate push down
[ https://issues.apache.org/jira/browse/SPARK-34952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Huaxin Gao updated SPARK-34952: --- Summary: DS V2 Aggregate push down (was: Aggregate (Min/Max/Count) push down for Parquet) > DS V2 Aggregate push down > - > > Key: SPARK-34952 > URL: https://issues.apache.org/jira/browse/SPARK-34952 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Fix For: 3.2.0 > > > Push down Max/Min/Count to Parquet for better performance. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-34952) Aggregate (Min/Max/Count) push down for Parquet
[ https://issues.apache.org/jira/browse/SPARK-34952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387624#comment-17387624 ] Apache Spark commented on SPARK-34952: -- User 'huaxingao' has created a pull request for this issue: https://github.com/apache/spark/pull/33526 > Aggregate (Min/Max/Count) push down for Parquet > > > Key: SPARK-34952 > URL: https://issues.apache.org/jira/browse/SPARK-34952 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.2.0 >Reporter: Huaxin Gao >Assignee: Huaxin Gao >Priority: Major > Fix For: 3.2.0 > > > Push down Max/Min/Count to Parquet for better performance. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36136) Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive
[ https://issues.apache.org/jira/browse/SPARK-36136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-36136. - Fix Version/s: 3.2.0 Resolution: Fixed Issue resolved by pull request 33350 [https://github.com/apache/spark/pull/33350] > Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive > > > Key: SPARK-36136 > URL: https://issues.apache.org/jira/browse/SPARK-36136 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.2.0 >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Minor > Fix For: 3.2.0 > > > Currently both {{PruneFileSourcePartitionsSuite}} and > {{PrunePartitionSuiteBase}} are in {{org.apache.spark.sql.hive.execution}} > which doesn't look right. They should belong to > {{org.apache.spark.sql.execution.datasources}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-36136) Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive
[ https://issues.apache.org/jira/browse/SPARK-36136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh reassigned SPARK-36136: --- Assignee: Chao Sun > Move PruneFileSourcePartitionsSuite out of org.apache.spark.sql.hive > > > Key: SPARK-36136 > URL: https://issues.apache.org/jira/browse/SPARK-36136 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.2.0 >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Minor > > Currently both {{PruneFileSourcePartitionsSuite}} and > {{PrunePartitionSuiteBase}} are in {{org.apache.spark.sql.hive.execution}} > which doesn't look right. They should belong to > {{org.apache.spark.sql.execution.datasources}}. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36143) Adjust `astype` of fractional Series with missing values to follow pandas
[ https://issues.apache.org/jira/browse/SPARK-36143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36143: - Description: {code:java} >>> pser = pd.Series([1, 2, np.nan], dtype=float) >>> psser = ps.from_pandas(pser) >>> pser.astype(int) ... ValueError: Cannot convert non-finite values (NA or inf) to integer >>> psser.astype(int) 0 1.0 1 2.0 2 NaN dtype: float64 {code} As shown above, astype of a fractional Series with missing values doesn't behave the same as in pandas, so we ought to adjust that. was: {code:java} >>> pser = pd.Series([1, 2, np.nan], dtype=float) >>> psser = ps.from_pandas(pser) >>> pser.astype(int) ... ValueError: Cannot convert non-finite values (NA or inf) to integer >>> psser.astype(int) 0 1.0 1 2.0 2 NaN dtype: float64 {code} As shown above, astype of a Series with missing values doesn't behave the same as in pandas, so we ought to adjust that. > Adjust `astype` of fractional Series with missing values to follow pandas > - > > Key: SPARK-36143 > URL: https://issues.apache.org/jira/browse/SPARK-36143 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > {code:java} > >>> pser = pd.Series([1, 2, np.nan], dtype=float) > >>> psser = ps.from_pandas(pser) > >>> pser.astype(int) > ... > ValueError: Cannot convert non-finite values (NA or inf) to integer > >>> psser.astype(int) > 0 1.0 > 1 2.0 > 2 NaN > dtype: float64 > {code} > As shown above, astype of a fractional Series with missing values doesn't > behave the same as in pandas, so we ought to adjust that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
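One plausible shape of the adjustment, sketched outside of pandas-on-Spark (this is only an illustration of the target behavior, not the tracked fix, and the helper name is made up): detect missing values before a float-to-integer cast and raise the same ValueError pandas raises.

{code:python}
import numpy as np
import pandas as pd

def astype_like_pandas(pser: pd.Series, dtype) -> pd.Series:
    # pandas refuses to cast NaN/inf to an integer dtype; mirror that
    # instead of silently leaving the result as float64.
    to_int = np.issubdtype(np.dtype(dtype), np.integer)
    if pd.api.types.is_float_dtype(pser.dtype) and to_int:
        if not np.isfinite(pser).all():
            raise ValueError(
                "Cannot convert non-finite values (NA or inf) to integer")
    return pser.astype(dtype)

astype_like_pandas(pd.Series([1.0, 2.0]), int)     # casts normally
astype_like_pandas(pd.Series([1.0, np.nan]), int)  # raises ValueError
{code}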
[jira] [Updated] (SPARK-36143) Adjust `astype` of fractional Series with missing values to follow pandas
[ https://issues.apache.org/jira/browse/SPARK-36143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36143: - Summary: Adjust `astype` of fractional Series with missing values to follow pandas (was: Adjust astype of Series with missing values to follow pandas) > Adjust `astype` of fractional Series with missing values to follow pandas > - > > Key: SPARK-36143 > URL: https://issues.apache.org/jira/browse/SPARK-36143 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > {code:java} > >>> pser = pd.Series([1, 2, np.nan], dtype=float) > >>> psser = ps.from_pandas(pser) > >>> pser.astype(int) > ... > ValueError: Cannot convert non-finite values (NA or inf) to integer > >>> psser.astype(int) > 0 1.0 > 1 2.0 > 2 NaN > dtype: float64 > {code} > As shown above, astype of a Series with missing values doesn't behave the same > as in pandas, so we ought to adjust that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-36281) Re-enable the test of external listeners in SparkSQLEnvSuite
[ https://issues.apache.org/jira/browse/SPARK-36281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] L. C. Hsieh resolved SPARK-36281. - Resolution: Invalid > Re-enable the test of external listeners in SparkSQLEnvSuite > - > > Key: SPARK-36281 > URL: https://issues.apache.org/jira/browse/SPARK-36281 > Project: Spark > Issue Type: Test > Components: SQL >Affects Versions: 3.0.3 >Reporter: L. C. Hsieh >Priority: Major > > While we're trying to recover GA in branch 3.0 > ([https://github.com/apache/spark/pull/33502]), > {{org.apache.spark.sql.hive.thriftserver.SparkSQLEnvSuite}} continues to fail > in GA: > > [info] - SPARK-29604 external listeners should be initialized with Spark > classloader *** FAILED *** (34 seconds, 465 milliseconds) > [info] > scala.Predef.refArrayOps[org.apache.spark.sql.util.QueryExecutionListener](session.get.listenerManager.listListeners()).exists(((x$1: > org.apache.spark.sql.util.QueryExecutionListener) => > x$1.isInstanceOf[test.custom.listener.DummyQueryExecutionListener])) was > false (SparkSQLEnvSuite.scala:57) > > It does not look like a memory issue. I also ran the test locally and it passed; it is > not clear why it fails only in GA. > > Ignored the test for now to recover GA. We need to investigate why the test > fails and then re-enable it. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
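For reference, the behavior the ignored test covers: classes listed in spark.sql.queryExecutionListeners must be loaded with the Spark classloader and registered when the session starts. A hedged PySpark rendering of the same check follows (DummyQueryExecutionListener is the suite's own test fixture, so a jar providing it, or any other QueryExecutionListener implementation, is assumed to be on the classpath; the listener-manager peek goes through py4j internals):

{code:python}
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .config("spark.sql.queryExecutionListeners",
                 "test.custom.listener.DummyQueryExecutionListener")
         .getOrCreate())

# The suite asserts the configured listener actually got instantiated and
# registered; the same registry is reachable from Python via the JVM handle.
listeners = spark._jsparkSession.listenerManager().listListeners()
print([listener.getClass().getName() for listener in listeners])
{code}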
[jira] [Updated] (SPARK-36094) Group SQL component error messages in Spark error class JSON file
[ https://issues.apache.org/jira/browse/SPARK-36094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36094: --- Description: To improve auditing, reduce duplication, and improve quality of error messages thrown from Spark, we should group them in a single JSON file (as discussed in the [mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] and introduced in [SPARK-34920|https://issues.apache.org/jira/browse/SPARK-34920]). In this file, the error messages should be labeled according to a consistent error class and with a SQLSTATE. We will start with the SQL component first. As a starting point, we can build off the exception grouping done in [SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. In total, there are ~1000 error messages to group, split across three files (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). In this ticket, each of these files is split into chunks of ~20 errors for refactoring. As a guideline, the error classes should be de-duplicated as much as possible to improve auditing. We will improve error message quality as a follow-up. Here is an example PR that groups a few error messages in the QueryCompilationErrors class: [PR 33309|https://github.com/apache/spark/pull/33309]. was: To improve auditing, reduce duplication, and improve quality of error messages thrown from Spark, we should group them in a single JSON file (as discussed in the [mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] and introduced in [SPARK-34920|https://issues.apache.org/jira/browse/SPARK-34920]). In this file, the error messages should be labeled according to a consistent error class and with a SQLSTATE. We will start with the SQL component first. As a starting point, we can build off the exception grouping done in [SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. In total, there are ~1000 error messages to group, split across three files (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). If you work on this ticket, please create a subtask to improve ease of reviewing. As a guideline, the error classes should be de-duplicated as much as possible to improve auditing. We will improve error message quality as a follow-up. Here is an example PR that groups a few error messages in the QueryCompilationErrors class: [PR 33309|https://github.com/apache/spark/pull/33309]. > Group SQL component error messages in Spark error class JSON file > - > > Key: SPARK-36094 > URL: https://issues.apache.org/jira/browse/SPARK-36094 > Project: Spark > Issue Type: Improvement > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > To improve auditing, reduce duplication, and improve quality of error > messages thrown from Spark, we should group them in a single JSON file (as > discussed in the [mailing > list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] > and introduced in > [SPARK-34920|https://issues.apache.org/jira/browse/SPARK-34920]). > In this file, the error messages should be labeled according to a consistent > error class and with a SQLSTATE. > We will start with the SQL component first. > As a starting point, we can build off the exception grouping done in > [SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. 
In total, > there are ~1000 error messages to group, split across three files > (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). In > this ticket, each of these files is split into chunks of ~20 errors for > refactoring. > As a guideline, the error classes should be de-duplicated as much as possible > to improve auditing. > We will improve error message quality as a follow-up. > Here is an example PR that groups a few error messages in the > QueryCompilationErrors class: [PR > 33309|https://github.com/apache/spark/pull/33309]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
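The mechanics being rolled out in the subtasks below can be sketched in miniature (the entry is modeled on the error-classes JSON format introduced by SPARK-34920; the exact schema and the helper function are illustrative, not Spark's code): each exception site names an error class plus parameters, while the message template and SQLSTATE live once in the JSON file, which is what makes auditing and de-duplication possible.

{code:python}
import json

# Illustrative entry in the spirit of Spark's error-classes JSON file.
ERROR_CLASSES = json.loads("""
{
  "DIVIDE_BY_ZERO": { "message": ["divide by zero"], "sqlState": "22012" }
}
""")

def error_message(error_class: str, *params: str) -> str:
    # Templates are stored as a list of message lines; positional
    # parameters are spliced into the joined template.
    template = " ".join(ERROR_CLASSES[error_class]["message"])
    return template.format(*params)

print(error_message("DIVIDE_BY_ZERO"))  # -> divide by zero
{code}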
[jira] [Issue Comment Deleted] (SPARK-36230) hasnans for Series of Decimal(`NaN`)
[ https://issues.apache.org/jira/browse/SPARK-36230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xinrong Meng updated SPARK-36230: - Comment: was deleted (was: https://issues.apache.org/jira/browse/SPARK-36231) > hasnans for Series of Decimal(`NaN`) > > > Key: SPARK-36230 > URL: https://issues.apache.org/jira/browse/SPARK-36230 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > {code:java} > >>> import pandas as pd > >>> pser = pd.Series([Decimal('0.1'), Decimal('NaN')]) > >>> pser > 00.1 > 1NaN > dtype: object > >>> psser = ps.from_pandas(pser) > >>> psser > 0 0.1 > 1None > dtype: object > >>> psser.hasnans > False > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36309) Refactor fourth set of 20 query parsing errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36309: --- Description: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the fourth set of 20. {code} showFunctionsUnsupportedError duplicateCteDefinitionNamesError sqlStatementUnsupportedError unquotedIdentifierError duplicateClausesError duplicateKeysError unexpectedFomatForSetConfigurationError invalidPropertyKeyForSetQuotedConfigurationError invalidPropertyValueForSetQuotedConfigurationError unexpectedFormatForResetConfigurationError intervalValueOutOfRangeError invalidTimeZoneDisplacementValueError createTempTableNotSpecifyProviderError rowFormatNotUsedWithStoredAsError useDefinedRecordReaderOrWriterClassesError directoryPathAndOptionsPathBothSpecifiedError unsupportedLocalFileSchemeError invalidGroupingSetError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the third set of 20. {code} fromToIntervalUnsupportedError mixedIntervalUnitsError dataTypeUnsupportedError partitionTransformNotExpectedError tooManyArgumentsForTransformError notEnoughArgumentsForTransformError invalidBucketsNumberError invalidTransformArgumentError cannotCleanReservedNamespacePropertyError propertiesAndDbPropertiesBothSpecifiedError fromOrInNotAllowedInShowDatabasesError cannotCleanReservedTablePropertyError duplicatedTablePathsFoundError storedAsAndStoredByBothSpecifiedError operationInHiveStyleCommandUnsupportedError operationNotAllowedError descColumnForPartitionUnsupportedError incompletePartitionSpecificationError computeStatisticsNotExpectedError addCatalogInCacheTableAsSelectNotAllowedError {code} For more detail, see the parent ticket SPARK-36094. > Refactor fourth set of 20 query parsing errors to use error classes > --- > > Key: SPARK-36309 > URL: https://issues.apache.org/jira/browse/SPARK-36309 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] > to use error classes. > There are currently ~100 exceptions in this file; so this PR only focuses on > the fourth set of 20. > {code} > showFunctionsUnsupportedError > duplicateCteDefinitionNamesError > sqlStatementUnsupportedError > unquotedIdentifierError > duplicateClausesError > duplicateKeysError > unexpectedFomatForSetConfigurationError > invalidPropertyKeyForSetQuotedConfigurationError > invalidPropertyValueForSetQuotedConfigurationError > unexpectedFormatForResetConfigurationError > intervalValueOutOfRangeError > invalidTimeZoneDisplacementValueError > createTempTableNotSpecifyProviderError > rowFormatNotUsedWithStoredAsError > useDefinedRecordReaderOrWriterClassesError > directoryPathAndOptionsPathBothSpecifiedError > unsupportedLocalFileSchemeError > invalidGroupingSetError > {code} > For more detail, see the parent ticket SPARK-36094. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-36230) hasnans for Series of Decimal(`NaN`)
[ https://issues.apache.org/jira/browse/SPARK-36230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17387596#comment-17387596 ] Xinrong Meng commented on SPARK-36230: -- https://issues.apache.org/jira/browse/SPARK-36231 > hasnans for Series of Decimal(`NaN`) > > > Key: SPARK-36230 > URL: https://issues.apache.org/jira/browse/SPARK-36230 > Project: Spark > Issue Type: Sub-task > Components: PySpark >Affects Versions: 3.2.0 >Reporter: Xinrong Meng >Priority: Major > > {code:java} > >>> import pandas as pd > >>> pser = pd.Series([Decimal('0.1'), Decimal('NaN')]) > >>> pser > 00.1 > 1NaN > dtype: object > >>> psser = ps.from_pandas(pser) > >>> psser > 0 0.1 > 1None > dtype: object > >>> psser.hasnans > False > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
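Part of why this case slips through, illustrated outside pandas-on-Spark: Decimal('NaN') lives in an object-dtype column and is not a float NaN, so float-oriented null checks never flag it; decimal.Decimal has its own is_nan() predicate for exactly this value.

{code:python}
from decimal import Decimal

values = [Decimal("0.1"), Decimal("NaN")]

# An explicit Decimal-aware check finds the NaN that float-based
# detection misses.
has_nan = any(isinstance(v, Decimal) and v.is_nan() for v in values)
print(has_nan)  # True
{code}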
[jira] [Updated] (SPARK-36107) Refactor first set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36107: --- Description: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the second set of 20. {code:java} columnChangeUnsupportedError logicalHintOperatorNotRemovedDuringAnalysisError cannotEvaluateExpressionError cannotGenerateCodeForExpressionError cannotTerminateGeneratorError castingCauseOverflowError cannotChangeDecimalPrecisionError invalidInputSyntaxForNumericError cannotCastFromNullTypeError cannotCastError cannotParseDecimalError simpleStringWithNodeIdUnsupportedError evaluateUnevaluableAggregateUnsupportedError dataTypeUnsupportedError dataTypeUnsupportedError failedExecuteUserDefinedFunctionError divideByZeroError invalidArrayIndexError mapKeyNotExistError rowFromCSVParserNotExpectedError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the first set of 20. {code:java} columnChangeUnsupportedError logicalHintOperatorNotRemovedDuringAnalysisError cannotEvaluateExpressionError cannotGenerateCodeForExpressionError cannotTerminateGeneratorError castingCauseOverflowError cannotChangeDecimalPrecisionError invalidInputSyntaxForNumericError cannotCastFromNullTypeError cannotCastError cannotParseDecimalError simpleStringWithNodeIdUnsupportedError evaluateUnevaluableAggregateUnsupportedError dataTypeUnsupportedError dataTypeUnsupportedError failedExecuteUserDefinedFunctionError divideByZeroError invalidArrayIndexError mapKeyNotExistError rowFromCSVParserNotExpectedError {code} For more detail, see the parent ticket SPARK-36094. > Refactor first set of 20 query execution errors to use error classes > > > Key: SPARK-36107 > URL: https://issues.apache.org/jira/browse/SPARK-36107 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the second set of 20. > {code:java} > columnChangeUnsupportedError > logicalHintOperatorNotRemovedDuringAnalysisError > cannotEvaluateExpressionError > cannotGenerateCodeForExpressionError > cannotTerminateGeneratorError > castingCauseOverflowError > cannotChangeDecimalPrecisionError > invalidInputSyntaxForNumericError > cannotCastFromNullTypeError > cannotCastError > cannotParseDecimalError > simpleStringWithNodeIdUnsupportedError > evaluateUnevaluableAggregateUnsupportedError > dataTypeUnsupportedError > dataTypeUnsupportedError > failedExecuteUserDefinedFunctionError > divideByZeroError > invalidArrayIndexError > mapKeyNotExistError > rowFromCSVParserNotExpectedError > {code} > For more detail, see the parent ticket SPARK-36094. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36309) Refactor fourth set of 20 query parsing errors to use error classes
Karen Feng created SPARK-36309: -- Summary: Refactor fourth set of 20 query parsing errors to use error classes Key: SPARK-36309 URL: https://issues.apache.org/jira/browse/SPARK-36309 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the third set of 20. {code} fromToIntervalUnsupportedError mixedIntervalUnitsError dataTypeUnsupportedError partitionTransformNotExpectedError tooManyArgumentsForTransformError notEnoughArgumentsForTransformError invalidBucketsNumberError invalidTransformArgumentError cannotCleanReservedNamespacePropertyError propertiesAndDbPropertiesBothSpecifiedError fromOrInNotAllowedInShowDatabasesError cannotCleanReservedTablePropertyError duplicatedTablePathsFoundError storedAsAndStoredByBothSpecifiedError operationInHiveStyleCommandUnsupportedError operationNotAllowedError descColumnForPartitionUnsupportedError incompletePartitionSpecificationError computeStatisticsNotExpectedError addCatalogInCacheTableAsSelectNotAllowedError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36308) Refactor third set of 20 query parsing errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36308: --- Description: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the third set of 20. {code} fromToIntervalUnsupportedError mixedIntervalUnitsError dataTypeUnsupportedError partitionTransformNotExpectedError tooManyArgumentsForTransformError notEnoughArgumentsForTransformError invalidBucketsNumberError invalidTransformArgumentError cannotCleanReservedNamespacePropertyError propertiesAndDbPropertiesBothSpecifiedError fromOrInNotAllowedInShowDatabasesError cannotCleanReservedTablePropertyError duplicatedTablePathsFoundError storedAsAndStoredByBothSpecifiedError operationInHiveStyleCommandUnsupportedError operationNotAllowedError descColumnForPartitionUnsupportedError incompletePartitionSpecificationError computeStatisticsNotExpectedError addCatalogInCacheTableAsSelectNotAllowedError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} repetitiveWindowDefinitionError invalidWindowReferenceError cannotResolveWindowReferenceError joinCriteriaUnimplementedError naturalCrossJoinUnsupportedError emptyInputForTableSampleError tableSampleByBytesUnsupportedError invalidByteLengthLiteralError invalidEscapeStringError trimOptionUnsupportedError functionNameUnsupportedError cannotParseValueTypeError cannotParseIntervalValueError literalValueTypeUnsupportedError parsingValueTypeError invalidNumericLiteralRangeError moreThanOneFromToUnitInIntervalLiteralError invalidIntervalLiteralError invalidIntervalFormError invalidFromToUnitValueError {code} For more detail, see the parent ticket SPARK-36094. > Refactor third set of 20 query parsing errors to use error classes > -- > > Key: SPARK-36308 > URL: https://issues.apache.org/jira/browse/SPARK-36308 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] > to use error classes. > There are currently ~100 exceptions in this file; so this PR only focuses on > the third set of 20. 
> {code} > fromToIntervalUnsupportedError > mixedIntervalUnitsError > dataTypeUnsupportedError > partitionTransformNotExpectedError > tooManyArgumentsForTransformError > notEnoughArgumentsForTransformError > invalidBucketsNumberError > invalidTransformArgumentError > cannotCleanReservedNamespacePropertyError > propertiesAndDbPropertiesBothSpecifiedError > fromOrInNotAllowedInShowDatabasesError > cannotCleanReservedTablePropertyError > duplicatedTablePathsFoundError > storedAsAndStoredByBothSpecifiedError > operationInHiveStyleCommandUnsupportedError > operationNotAllowedError > descColumnForPartitionUnsupportedError > incompletePartitionSpecificationError > computeStatisticsNotExpectedError > addCatalogInCacheTableAsSelectNotAllowedError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36308) Refactor third set of 20 query parsing errors to use error classes
Karen Feng created SPARK-36308: -- Summary: Refactor third set of 20 query parsing errors to use error classes Key: SPARK-36308 URL: https://issues.apache.org/jira/browse/SPARK-36308 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} repetitiveWindowDefinitionError invalidWindowReferenceError cannotResolveWindowReferenceError joinCriteriaUnimplementedError naturalCrossJoinUnsupportedError emptyInputForTableSampleError tableSampleByBytesUnsupportedError invalidByteLengthLiteralError invalidEscapeStringError trimOptionUnsupportedError functionNameUnsupportedError cannotParseValueTypeError cannotParseIntervalValueError literalValueTypeUnsupportedError parsingValueTypeError invalidNumericLiteralRangeError moreThanOneFromToUnitInIntervalLiteralError invalidIntervalLiteralError invalidIntervalFormError invalidFromToUnitValueError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36307) Refactor second set of 20 query parsing errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36307: --- Description: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} repetitiveWindowDefinitionError invalidWindowReferenceError cannotResolveWindowReferenceError joinCriteriaUnimplementedError naturalCrossJoinUnsupportedError emptyInputForTableSampleError tableSampleByBytesUnsupportedError invalidByteLengthLiteralError invalidEscapeStringError trimOptionUnsupportedError functionNameUnsupportedError cannotParseValueTypeError cannotParseIntervalValueError literalValueTypeUnsupportedError parsingValueTypeError invalidNumericLiteralRangeError moreThanOneFromToUnitInIntervalLiteralError invalidIntervalLiteralError invalidIntervalFormError invalidFromToUnitValueError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} invalidInsertIntoError insertOverwriteDirectoryUnsupportedError columnAliasInOperationNotAllowedError emptySourceForMergeError unrecognizedMatchedActionError insertedValueNumberNotMatchFieldNumberError unrecognizedNotMatchedActionError mergeStatementWithoutWhenClauseError nonLastMatchedClauseOmitConditionError nonLastNotMatchedClauseOmitConditionError emptyPartitionKeyError combinationQueryResultClausesUnsupportedError distributeByUnsupportedError transformNotSupportQuantifierError transformWithSerdeUnsupportedError lateralWithPivotInFromClauseNotAllowedError lateralJoinWithNaturalJoinUnsupportedError lateralJoinWithUsingJoinUnsupportedError unsupportedLateralJoinTypeError invalidLateralJoinRelationError {code} For more detail, see the parent ticket SPARK-36094. > Refactor second set of 20 query parsing errors to use error classes > --- > > Key: SPARK-36307 > URL: https://issues.apache.org/jira/browse/SPARK-36307 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] > to use error classes. > There are currently ~100 exceptions in this file; so this PR only focuses on > the first set of 20. > {code} > repetitiveWindowDefinitionError > invalidWindowReferenceError > cannotResolveWindowReferenceError > joinCriteriaUnimplementedError > naturalCrossJoinUnsupportedError > emptyInputForTableSampleError > tableSampleByBytesUnsupportedError > invalidByteLengthLiteralError > invalidEscapeStringError > trimOptionUnsupportedError > functionNameUnsupportedError > cannotParseValueTypeError > cannotParseIntervalValueError > literalValueTypeUnsupportedError > parsingValueTypeError > invalidNumericLiteralRangeError > moreThanOneFromToUnitInIntervalLiteralError > invalidIntervalLiteralError > invalidIntervalFormError > invalidFromToUnitValueError > {code} > For more detail, see the parent ticket SPARK-36094. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36307) Refactor second set of 20 query parsing errors to use error classes
Karen Feng created SPARK-36307: -- Summary: Refactor second set of 20 query parsing errors to use error classes Key: SPARK-36307 URL: https://issues.apache.org/jira/browse/SPARK-36307 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} invalidInsertIntoError insertOverwriteDirectoryUnsupportedError columnAliasInOperationNotAllowedError emptySourceForMergeError unrecognizedMatchedActionError insertedValueNumberNotMatchFieldNumberError unrecognizedNotMatchedActionError mergeStatementWithoutWhenClauseError nonLastMatchedClauseOmitConditionError nonLastNotMatchedClauseOmitConditionError emptyPartitionKeyError combinationQueryResultClausesUnsupportedError distributeByUnsupportedError transformNotSupportQuantifierError transformWithSerdeUnsupportedError lateralWithPivotInFromClauseNotAllowedError lateralJoinWithNaturalJoinUnsupportedError lateralJoinWithUsingJoinUnsupportedError unsupportedLateralJoinTypeError invalidLateralJoinRelationError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36108) Refactor first set of 20 query parsing errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36108: --- Description: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on the first set of 20. {code} invalidInsertIntoError insertOverwriteDirectoryUnsupportedError columnAliasInOperationNotAllowedError emptySourceForMergeError unrecognizedMatchedActionError insertedValueNumberNotMatchFieldNumberError unrecognizedNotMatchedActionError mergeStatementWithoutWhenClauseError nonLastMatchedClauseOmitConditionError nonLastNotMatchedClauseOmitConditionError emptyPartitionKeyError combinationQueryResultClausesUnsupportedError distributeByUnsupportedError transformNotSupportQuantifierError transformWithSerdeUnsupportedError lateralWithPivotInFromClauseNotAllowedError lateralJoinWithNaturalJoinUnsupportedError lateralJoinWithUsingJoinUnsupportedError unsupportedLateralJoinTypeError invalidLateralJoinRelationError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] to use error classes. There are currently ~100 exceptions in this file; so this PR only focuses on a few. For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094]. Summary: Refactor first set of 20 query parsing errors to use error classes (was: Refactor a few query parsing errors to use error classes) > Refactor first set of 20 query parsing errors to use error classes > -- > > Key: SPARK-36108 > URL: https://issues.apache.org/jira/browse/SPARK-36108 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryParsingErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryParsingErrors.scala] > to use error classes. > There are currently ~100 exceptions in this file; so this PR only focuses on > the first set of 20. > {code} > invalidInsertIntoError > insertOverwriteDirectoryUnsupportedError > columnAliasInOperationNotAllowedError > emptySourceForMergeError > unrecognizedMatchedActionError > insertedValueNumberNotMatchFieldNumberError > unrecognizedNotMatchedActionError > mergeStatementWithoutWhenClauseError > nonLastMatchedClauseOmitConditionError > nonLastNotMatchedClauseOmitConditionError > emptyPartitionKeyError > combinationQueryResultClausesUnsupportedError > distributeByUnsupportedError > transformNotSupportQuantifierError > transformWithSerdeUnsupportedError > lateralWithPivotInFromClauseNotAllowedError > lateralJoinWithNaturalJoinUnsupportedError > lateralJoinWithUsingJoinUnsupportedError > unsupportedLateralJoinTypeError > invalidLateralJoinRelationError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36306) Refactor seventeenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36306: --- Description: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the seventeenth set of 20. {code:java} legacyCheckpointDirectoryExistsError subprocessExitedError outputDataTypeUnsupportedByNodeWithoutSerdeError invalidStartIndexError concurrentModificationOnExternalAppendOnlyUnsafeRowArrayError doExecuteBroadcastNotImplementedError databaseNameConflictWithSystemPreservedDatabaseError commentOnTableUnsupportedError unsupportedUpdateColumnNullabilityError renameColumnUnsupportedForOlderMySQLError failedToExecuteQueryError nestedFieldUnsupportedError transformationsAndActionsNotInvokedByDriverError repeatedPivotsUnsupportedError pivotNotAfterGroupByUnsupportedError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the sixteenth set of 20. {code:java} cannotDropMultiPartitionsOnNonatomicPartitionTableError truncateMultiPartitionUnsupportedError overwriteTableByUnsupportedExpressionError dynamicPartitionOverwriteUnsupportedByTableError failedMergingSchemaError cannotBroadcastTableOverMaxTableRowsError cannotBroadcastTableOverMaxTableBytesError notEnoughMemoryToBuildAndBroadcastTableError executeCodePathUnsupportedError cannotMergeClassWithOtherClassError continuousProcessingUnsupportedByDataSourceError failedToReadDataError failedToGenerateEpochMarkerError foreachWriterAbortedDueToTaskFailureError integerOverflowError failedToReadDeltaFileError failedToReadSnapshotFileError cannotPurgeAsBreakInternalStateError cleanUpSourceFilesUnsupportedError latestOffsetNotCalledError {code} For more detail, see the parent ticket SPARK-36094. > Refactor seventeenth set of 20 query execution errors to use error classes > -- > > Key: SPARK-36306 > URL: https://issues.apache.org/jira/browse/SPARK-36306 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the seventeenth set of 20. > {code:java} > legacyCheckpointDirectoryExistsError > subprocessExitedError > outputDataTypeUnsupportedByNodeWithoutSerdeError > invalidStartIndexError > concurrentModificationOnExternalAppendOnlyUnsafeRowArrayError > doExecuteBroadcastNotImplementedError > databaseNameConflictWithSystemPreservedDatabaseError > commentOnTableUnsupportedError > unsupportedUpdateColumnNullabilityError > renameColumnUnsupportedForOlderMySQLError > failedToExecuteQueryError > nestedFieldUnsupportedError > transformationsAndActionsNotInvokedByDriverError > repeatedPivotsUnsupportedError > pivotNotAfterGroupByUnsupportedError > {code} > For more detail, see the parent ticket SPARK-36094. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36304) Refactor fifteenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36304: --- Description: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the fifteenth set of 20. {code:java} unsupportedOperationExceptionError nullLiteralsCannotBeCastedError notUserDefinedTypeError cannotLoadUserDefinedTypeError timeZoneIdNotSpecifiedForTimestampTypeError notPublicClassError primitiveTypesNotSupportedError fieldIndexOnRowWithoutSchemaError valueIsNullError onlySupportDataSourcesProvidingFileFormatError failToSetOriginalPermissionBackError failToSetOriginalACLBackError multiFailuresInStageMaterializationError unrecognizedCompressionSchemaTypeIDError getParentLoggerNotImplementedError cannotCreateParquetConverterForTypeError cannotCreateParquetConverterForDecimalTypeError cannotCreateParquetConverterForDataTypeError cannotAddMultiPartitionsOnNonatomicPartitionTableError userSpecifiedSchemaUnsupportedByDataSourceError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the fourteenth set of 20. {code:java} cannotGetEventTimeWatermarkError cannotSetTimeoutTimestampError batchMetadataFileNotFoundError multiStreamingQueriesUsingPathConcurrentlyError addFilesWithAbsolutePathUnsupportedError microBatchUnsupportedByDataSourceError cannotExecuteStreamingRelationExecError invalidStreamingOutputModeError catalogPluginClassNotFoundError catalogPluginClassNotImplementedError catalogPluginClassNotFoundForCatalogError catalogFailToFindPublicNoArgConstructorError catalogFailToCallPublicNoArgConstructorError cannotInstantiateAbstractCatalogPluginClassError failedToInstantiateConstructorForCatalogError noSuchElementExceptionError noSuchElementExceptionError cannotMutateReadOnlySQLConfError cannotCloneOrCopyReadOnlySQLConfError cannotGetSQLConfInSchedulerEventLoopThreadError {code} For more detail, see the parent ticket SPARK-36094. > Refactor fifteenth set of 20 query execution errors to use error classes > > > Key: SPARK-36304 > URL: https://issues.apache.org/jira/browse/SPARK-36304 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the fifteenth set of 20. 
> {code:java} > unsupportedOperationExceptionError > nullLiteralsCannotBeCastedError > notUserDefinedTypeError > cannotLoadUserDefinedTypeError > timeZoneIdNotSpecifiedForTimestampTypeError > notPublicClassError > primitiveTypesNotSupportedError > fieldIndexOnRowWithoutSchemaError > valueIsNullError > onlySupportDataSourcesProvidingFileFormatError > failToSetOriginalPermissionBackError > failToSetOriginalACLBackError > multiFailuresInStageMaterializationError > unrecognizedCompressionSchemaTypeIDError > getParentLoggerNotImplementedError > cannotCreateParquetConverterForTypeError > cannotCreateParquetConverterForDecimalTypeError > cannotCreateParquetConverterForDataTypeError > cannotAddMultiPartitionsOnNonatomicPartitionTableError > userSpecifiedSchemaUnsupportedByDataSourceError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36306) Refactor seventeenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36306: -- Summary: Refactor seventeenth set of 20 query execution errors to use error classes Key: SPARK-36306 URL: https://issues.apache.org/jira/browse/SPARK-36306 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the sixteenth set of 20. {code:java} cannotDropMultiPartitionsOnNonatomicPartitionTableError truncateMultiPartitionUnsupportedError overwriteTableByUnsupportedExpressionError dynamicPartitionOverwriteUnsupportedByTableError failedMergingSchemaError cannotBroadcastTableOverMaxTableRowsError cannotBroadcastTableOverMaxTableBytesError notEnoughMemoryToBuildAndBroadcastTableError executeCodePathUnsupportedError cannotMergeClassWithOtherClassError continuousProcessingUnsupportedByDataSourceError failedToReadDataError failedToGenerateEpochMarkerError foreachWriterAbortedDueToTaskFailureError integerOverflowError failedToReadDeltaFileError failedToReadSnapshotFileError cannotPurgeAsBreakInternalStateError cleanUpSourceFilesUnsupportedError latestOffsetNotCalledError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36305) Refactor sixteenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karen Feng updated SPARK-36305: --- Description: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the sixteenth set of 20. {code:java} cannotDropMultiPartitionsOnNonatomicPartitionTableError truncateMultiPartitionUnsupportedError overwriteTableByUnsupportedExpressionError dynamicPartitionOverwriteUnsupportedByTableError failedMergingSchemaError cannotBroadcastTableOverMaxTableRowsError cannotBroadcastTableOverMaxTableBytesError notEnoughMemoryToBuildAndBroadcastTableError executeCodePathUnsupportedError cannotMergeClassWithOtherClassError continuousProcessingUnsupportedByDataSourceError failedToReadDataError failedToGenerateEpochMarkerError foreachWriterAbortedDueToTaskFailureError integerOverflowError failedToReadDeltaFileError failedToReadSnapshotFileError cannotPurgeAsBreakInternalStateError cleanUpSourceFilesUnsupportedError latestOffsetNotCalledError {code} For more detail, see the parent ticket SPARK-36094. was: Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the fifteenth set of 20. {code:java} unsupportedOperationExceptionError nullLiteralsCannotBeCastedError notUserDefinedTypeError cannotLoadUserDefinedTypeError timeZoneIdNotSpecifiedForTimestampTypeError notPublicClassError primitiveTypesNotSupportedError fieldIndexOnRowWithoutSchemaError valueIsNullError onlySupportDataSourcesProvidingFileFormatError failToSetOriginalPermissionBackError failToSetOriginalACLBackError multiFailuresInStageMaterializationError unrecognizedCompressionSchemaTypeIDError getParentLoggerNotImplementedError cannotCreateParquetConverterForTypeError cannotCreateParquetConverterForDecimalTypeError cannotCreateParquetConverterForDataTypeError cannotAddMultiPartitionsOnNonatomicPartitionTableError userSpecifiedSchemaUnsupportedByDataSourceError {code} For more detail, see the parent ticket SPARK-36094. > Refactor sixteenth set of 20 query execution errors to use error classes > > > Key: SPARK-36305 > URL: https://issues.apache.org/jira/browse/SPARK-36305 > Project: Spark > Issue Type: Sub-task > Components: Spark Core, SQL >Affects Versions: 3.2.0 >Reporter: Karen Feng >Priority: Major > > Refactor some exceptions in > [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] > to use error classes. > There are currently ~350 exceptions in this file; so this PR only focuses on > the sixteenth set of 20. 
> {code:java} > cannotDropMultiPartitionsOnNonatomicPartitionTableError > truncateMultiPartitionUnsupportedError > overwriteTableByUnsupportedExpressionError > dynamicPartitionOverwriteUnsupportedByTableError > failedMergingSchemaError > cannotBroadcastTableOverMaxTableRowsError > cannotBroadcastTableOverMaxTableBytesError > notEnoughMemoryToBuildAndBroadcastTableError > executeCodePathUnsupportedError > cannotMergeClassWithOtherClassError > continuousProcessingUnsupportedByDataSourceError > failedToReadDataError > failedToGenerateEpochMarkerError > foreachWriterAbortedDueToTaskFailureError > integerOverflowError > failedToReadDeltaFileError > failedToReadSnapshotFileError > cannotPurgeAsBreakInternalStateError > cleanUpSourceFilesUnsupportedError > latestOffsetNotCalledError > {code} > For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36305) Refactor sixteenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36305: -- Summary: Refactor sixteenth set of 20 query execution errors to use error classes Key: SPARK-36305 URL: https://issues.apache.org/jira/browse/SPARK-36305 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the fifteenth set of 20. {code:java} unsupportedOperationExceptionError nullLiteralsCannotBeCastedError notUserDefinedTypeError cannotLoadUserDefinedTypeError timeZoneIdNotSpecifiedForTimestampTypeError notPublicClassError primitiveTypesNotSupportedError fieldIndexOnRowWithoutSchemaError valueIsNullError onlySupportDataSourcesProvidingFileFormatError failToSetOriginalPermissionBackError failToSetOriginalACLBackError multiFailuresInStageMaterializationError unrecognizedCompressionSchemaTypeIDError getParentLoggerNotImplementedError cannotCreateParquetConverterForTypeError cannotCreateParquetConverterForDecimalTypeError cannotCreateParquetConverterForDataTypeError cannotAddMultiPartitionsOnNonatomicPartitionTableError userSpecifiedSchemaUnsupportedByDataSourceError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36303) Refactor fourteenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36303: -- Summary: Refactor fourteenth set of 20 query execution errors to use error classes Key: SPARK-36303 URL: https://issues.apache.org/jira/browse/SPARK-36303 Project: Spark Issue Type: Sub-task Components: Spark Core, SQL Affects Versions: 3.2.0 Reporter: Karen Feng Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes. There are currently ~350 exceptions in this file; so this PR only focuses on the thirteenth set of 20. {code:java} serDeInterfaceNotFoundError convertHiveTableToCatalogTableError cannotRecognizeHiveTypeError getTablesByTypeUnsupportedByHiveVersionError dropTableWithPurgeUnsupportedError alterTableWithDropPartitionAndPurgeUnsupportedError invalidPartitionFilterError getPartitionMetadataByFilterError unsupportedHiveMetastoreVersionError loadHiveClientCausesNoClassDefFoundError cannotFetchTablesOfDatabaseError illegalLocationClauseForViewPartitionError renamePathAsExistsPathError renameAsExistsPathError renameSrcPathNotFoundError failedRenameTempFileError legacyMetadataPathExistsError partitionColumnNotFoundInSchemaError stateNotDefinedOrAlreadyRemovedError cannotSetTimeoutDurationError {code} For more detail, see the parent ticket SPARK-36094. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-36303) Refactor fourteenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36303:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourteenth set of 20.

{code:java}
cannotGetEventTimeWatermarkError
cannotSetTimeoutTimestampError
batchMetadataFileNotFoundError
multiStreamingQueriesUsingPathConcurrentlyError
addFilesWithAbsolutePathUnsupportedError
microBatchUnsupportedByDataSourceError
cannotExecuteStreamingRelationExecError
invalidStreamingOutputModeError
catalogPluginClassNotFoundError
catalogPluginClassNotImplementedError
catalogPluginClassNotFoundForCatalogError
catalogFailToFindPublicNoArgConstructorError
catalogFailToCallPublicNoArgConstructorError
cannotInstantiateAbstractCatalogPluginClassError
failedToInstantiateConstructorForCatalogError
noSuchElementExceptionError
noSuchElementExceptionError
cannotMutateReadOnlySQLConfError
cannotCloneOrCopyReadOnlySQLConfError
cannotGetSQLConfInSchedulerEventLoopThreadError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the thirteenth set of 20.

{code:java}
serDeInterfaceNotFoundError
convertHiveTableToCatalogTableError
cannotRecognizeHiveTypeError
getTablesByTypeUnsupportedByHiveVersionError
dropTableWithPurgeUnsupportedError
alterTableWithDropPartitionAndPurgeUnsupportedError
invalidPartitionFilterError
getPartitionMetadataByFilterError
unsupportedHiveMetastoreVersionError
loadHiveClientCausesNoClassDefFoundError
cannotFetchTablesOfDatabaseError
illegalLocationClauseForViewPartitionError
renamePathAsExistsPathError
renameAsExistsPathError
renameSrcPathNotFoundError
failedRenameTempFileError
legacyMetadataPathExistsError
partitionColumnNotFoundInSchemaError
stateNotDefinedOrAlreadyRemovedError
cannotSetTimeoutDurationError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor fourteenth set of 20 query execution errors to use error classes
> --------------------------------------------------------------------------
>
> Key: SPARK-36303
> URL: https://issues.apache.org/jira/browse/SPARK-36303
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
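For orientation, the sketch below shows the machine-readable surface each refactored error gains: a stable error class plus an optional SQLSTATE. It mirrors the shape of the interface introduced under SPARK-34920 without reproducing Spark's actual code, and the JSON entry shown in the comment is hypothetical.

{code:scala}
// Shape of the contract a refactored exception exposes (illustrative only;
// Spark's real interface is SparkThrowable).
trait HasErrorClass {
  def getErrorClass: String // stable identifier, e.g. "CANNOT_SET_TIMEOUT_TIMESTAMP" (hypothetical)
  def getSqlState: String   // standard SQLSTATE, e.g. "0A000" ("feature not supported")
}

// A matching, hypothetical entry in the central error-classes.json resource:
//   "CANNOT_SET_TIMEOUT_TIMESTAMP" : {
//     "message" : [ "Cannot set timeout timestamp without enabling event time timeout." ],
//     "sqlState" : "0A000"
//   }
{code}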
[jira] [Created] (SPARK-36304) Refactor fifteenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36304:
-----------------------------------

Summary: Refactor fifteenth set of 20 query execution errors to use error classes
Key: SPARK-36304
URL: https://issues.apache.org/jira/browse/SPARK-36304
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourteenth set of 20.

{code:java}
cannotGetEventTimeWatermarkError
cannotSetTimeoutTimestampError
batchMetadataFileNotFoundError
multiStreamingQueriesUsingPathConcurrentlyError
addFilesWithAbsolutePathUnsupportedError
microBatchUnsupportedByDataSourceError
cannotExecuteStreamingRelationExecError
invalidStreamingOutputModeError
catalogPluginClassNotFoundError
catalogPluginClassNotImplementedError
catalogPluginClassNotFoundForCatalogError
catalogFailToFindPublicNoArgConstructorError
catalogFailToCallPublicNoArgConstructorError
cannotInstantiateAbstractCatalogPluginClassError
failedToInstantiateConstructorForCatalogError
noSuchElementExceptionError
noSuchElementExceptionError
cannotMutateReadOnlySQLConfError
cannotCloneOrCopyReadOnlySQLConfError
cannotGetSQLConfInSchedulerEventLoopThreadError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36301) Refactor twelfth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36301:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the twelfth set of 20.

{code:java}
cannotRewriteDomainJoinWithConditionsError
decorrelateInnerQueryThroughPlanUnsupportedError
methodCalledInAnalyzerNotAllowedError
cannotSafelyMergeSerdePropertiesError
pairUnsupportedAtFunctionError
onceStrategyIdempotenceIsBrokenForBatchError
structuralIntegrityOfInputPlanIsBrokenInClassError
structuralIntegrityIsBrokenAfterApplyingRuleError
ruleIdNotFoundForRuleError
cannotCreateArrayWithElementsExceedLimitError
indexOutOfBoundsOfArrayDataError
malformedRecordsDetectedInRecordParsingError
remoteOperationsUnsupportedError
invalidKerberosConfigForHiveServer2Error
parentSparkUIToAttachTabNotFoundError
inferSchemaUnsupportedForHiveError
requestedPartitionsMismatchTablePartitionsError
dynamicPartitionKeyNotAmongWrittenPartitionPathsError
cannotRemovePartitionDirError
cannotCreateStagingDirError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eleventh set of 20.

{code:java}
expressionDecodingError
expressionEncodingError
classHasUnexpectedSerializerError
cannotGetOuterPointerForInnerClassError
userDefinedTypeNotAnnotatedAndRegisteredError
invalidInputSyntaxForBooleanError
unsupportedOperandTypeForSizeFunctionError
unexpectedValueForStartInFunctionError
unexpectedValueForLengthInFunctionError
sqlArrayIndexNotStartAtOneError
concatArraysWithElementsExceedLimitError
flattenArraysWithElementsExceedLimitError
createArrayWithElementsExceedLimitError
unionArrayWithElementsExceedLimitError
initialTypeNotTargetDataTypeError
initialTypeNotTargetDataTypesError
cannotConvertColumnToJSONError
malformedRecordsDetectedInSchemaInferenceError
malformedJSONError
malformedRecordsDetectedInSchemaInferenceError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor twelfth set of 20 query execution errors to use error classes
> -----------------------------------------------------------------------
>
> Key: SPARK-36301
> URL: https://issues.apache.org/jira/browse/SPARK-36301
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Updated] (SPARK-36302) Refactor thirteenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36302:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the thirteenth set of 20.

{code:java}
serDeInterfaceNotFoundError
convertHiveTableToCatalogTableError
cannotRecognizeHiveTypeError
getTablesByTypeUnsupportedByHiveVersionError
dropTableWithPurgeUnsupportedError
alterTableWithDropPartitionAndPurgeUnsupportedError
invalidPartitionFilterError
getPartitionMetadataByFilterError
unsupportedHiveMetastoreVersionError
loadHiveClientCausesNoClassDefFoundError
cannotFetchTablesOfDatabaseError
illegalLocationClauseForViewPartitionError
renamePathAsExistsPathError
renameAsExistsPathError
renameSrcPathNotFoundError
failedRenameTempFileError
legacyMetadataPathExistsError
partitionColumnNotFoundInSchemaError
stateNotDefinedOrAlreadyRemovedError
cannotSetTimeoutDurationError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the twelfth set of 20.

{code:java}
cannotRewriteDomainJoinWithConditionsError
decorrelateInnerQueryThroughPlanUnsupportedError
methodCalledInAnalyzerNotAllowedError
cannotSafelyMergeSerdePropertiesError
pairUnsupportedAtFunctionError
onceStrategyIdempotenceIsBrokenForBatchError
structuralIntegrityOfInputPlanIsBrokenInClassError
structuralIntegrityIsBrokenAfterApplyingRuleError
ruleIdNotFoundForRuleError
cannotCreateArrayWithElementsExceedLimitError
indexOutOfBoundsOfArrayDataError
malformedRecordsDetectedInRecordParsingError
remoteOperationsUnsupportedError
invalidKerberosConfigForHiveServer2Error
parentSparkUIToAttachTabNotFoundError
inferSchemaUnsupportedForHiveError
requestedPartitionsMismatchTablePartitionsError
dynamicPartitionKeyNotAmongWrittenPartitionPathsError
cannotRemovePartitionDirError
cannotCreateStagingDirError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor thirteenth set of 20 query execution errors to use error classes
> --------------------------------------------------------------------------
>
> Key: SPARK-36302
> URL: https://issues.apache.org/jira/browse/SPARK-36302
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36302) Refactor thirteenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36302:
-----------------------------------

Summary: Refactor thirteenth set of 20 query execution errors to use error classes
Key: SPARK-36302
URL: https://issues.apache.org/jira/browse/SPARK-36302
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the twelfth set of 20.

{code:java}
cannotRewriteDomainJoinWithConditionsError
decorrelateInnerQueryThroughPlanUnsupportedError
methodCalledInAnalyzerNotAllowedError
cannotSafelyMergeSerdePropertiesError
pairUnsupportedAtFunctionError
onceStrategyIdempotenceIsBrokenForBatchError
structuralIntegrityOfInputPlanIsBrokenInClassError
structuralIntegrityIsBrokenAfterApplyingRuleError
ruleIdNotFoundForRuleError
cannotCreateArrayWithElementsExceedLimitError
indexOutOfBoundsOfArrayDataError
malformedRecordsDetectedInRecordParsingError
remoteOperationsUnsupportedError
invalidKerberosConfigForHiveServer2Error
parentSparkUIToAttachTabNotFoundError
inferSchemaUnsupportedForHiveError
requestedPartitionsMismatchTablePartitionsError
dynamicPartitionKeyNotAmongWrittenPartitionPathsError
cannotRemovePartitionDirError
cannotCreateStagingDirError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Created] (SPARK-36301) Refactor twelfth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36301:
-----------------------------------

Summary: Refactor twelfth set of 20 query execution errors to use error classes
Key: SPARK-36301
URL: https://issues.apache.org/jira/browse/SPARK-36301
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eleventh set of 20.

{code:java}
expressionDecodingError
expressionEncodingError
classHasUnexpectedSerializerError
cannotGetOuterPointerForInnerClassError
userDefinedTypeNotAnnotatedAndRegisteredError
invalidInputSyntaxForBooleanError
unsupportedOperandTypeForSizeFunctionError
unexpectedValueForStartInFunctionError
unexpectedValueForLengthInFunctionError
sqlArrayIndexNotStartAtOneError
concatArraysWithElementsExceedLimitError
flattenArraysWithElementsExceedLimitError
createArrayWithElementsExceedLimitError
unionArrayWithElementsExceedLimitError
initialTypeNotTargetDataTypeError
initialTypeNotTargetDataTypesError
cannotConvertColumnToJSONError
malformedRecordsDetectedInSchemaInferenceError
malformedJSONError
malformedRecordsDetectedInSchemaInferenceError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36300) Refactor eleventh set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36300:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eleventh set of 20.

{code:java}
expressionDecodingError
expressionEncodingError
classHasUnexpectedSerializerError
cannotGetOuterPointerForInnerClassError
userDefinedTypeNotAnnotatedAndRegisteredError
invalidInputSyntaxForBooleanError
unsupportedOperandTypeForSizeFunctionError
unexpectedValueForStartInFunctionError
unexpectedValueForLengthInFunctionError
sqlArrayIndexNotStartAtOneError
concatArraysWithElementsExceedLimitError
flattenArraysWithElementsExceedLimitError
createArrayWithElementsExceedLimitError
unionArrayWithElementsExceedLimitError
initialTypeNotTargetDataTypeError
initialTypeNotTargetDataTypesError
cannotConvertColumnToJSONError
malformedRecordsDetectedInSchemaInferenceError
malformedJSONError
malformedRecordsDetectedInSchemaInferenceError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the tenth set of 20.

{code:java}
registeringStreamingQueryListenerError
concurrentQueryInstanceError
cannotParseJsonArraysAsStructsError
cannotParseStringAsDataTypeError
failToParseEmptyStringForDataTypeError
failToParseValueForDataTypeError
rootConverterReturnNullError
cannotHaveCircularReferencesInBeanClassError
cannotHaveCircularReferencesInClassError
cannotUseInvalidJavaIdentifierAsFieldNameError
cannotFindEncoderForTypeError
attributesForTypeUnsupportedError
schemaForTypeUnsupportedError
cannotFindConstructorForTypeError
paramExceedOneCharError
paramIsNotIntegerError
paramIsNotBooleanValueError
foundNullValueForNotNullableFieldError
malformedCSVRecordError
elementsOfTupleExceedLimitError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor eleventh set of 20 query execution errors to use error classes
> ------------------------------------------------------------------------
>
> Key: SPARK-36300
> URL: https://issues.apache.org/jira/browse/SPARK-36300
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Updated] (SPARK-36298) Refactor ninth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36298:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the ninth set of 20.

{code:java}
unscaledValueTooLargeForPrecisionError
decimalPrecisionExceedsMaxPrecisionError
outOfDecimalTypeRangeError
unsupportedArrayTypeError
unsupportedJavaTypeError
failedParsingStructTypeError
failedMergingFieldsError
cannotMergeDecimalTypesWithIncompatiblePrecisionAndScaleError
cannotMergeDecimalTypesWithIncompatiblePrecisionError
cannotMergeDecimalTypesWithIncompatibleScaleError
cannotMergeIncompatibleDataTypesError
exceedMapSizeLimitError
duplicateMapKeyFoundError
mapDataKeyArrayLengthDiffersFromValueArrayLengthError
fieldDiffersFromDerivedLocalDateError
failToParseDateTimeInNewParserError
failToFormatDateTimeInNewFormatterError
failToRecognizePatternAfterUpgradeError
failToRecognizePatternError
cannotCastUTF8StringToDataTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eighth set of 20.

{code:java}
executeBroadcastTimeoutError
cannotCompareCostWithTargetCostError
unsupportedDataTypeError
notSupportTypeError
notSupportNonPrimitiveTypeError
unsupportedTypeError
useDictionaryEncodingWhenDictionaryOverflowError
endOfIteratorError
cannotAllocateMemoryToGrowBytesToBytesMapError
cannotAcquireMemoryToBuildLongHashedRelationError
cannotAcquireMemoryToBuildUnsafeHashedRelationError
rowLargerThan256MUnsupportedError
cannotBuildHashedRelationWithUniqueKeysExceededError
cannotBuildHashedRelationLargerThan8GError
failedToPushRowIntoRowQueueError
unexpectedWindowFunctionFrameError
cannotParseStatisticAsPercentileError
statisticNotRecognizedError
unknownColumnError
unexpectedAccumulableUpdateValueError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor ninth set of 20 query execution errors to use error classes
> ---------------------------------------------------------------------
>
> Key: SPARK-36298
> URL: https://issues.apache.org/jira/browse/SPARK-36298
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Updated] (SPARK-36299) Refactor tenth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36299:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the tenth set of 20.

{code:java}
registeringStreamingQueryListenerError
concurrentQueryInstanceError
cannotParseJsonArraysAsStructsError
cannotParseStringAsDataTypeError
failToParseEmptyStringForDataTypeError
failToParseValueForDataTypeError
rootConverterReturnNullError
cannotHaveCircularReferencesInBeanClassError
cannotHaveCircularReferencesInClassError
cannotUseInvalidJavaIdentifierAsFieldNameError
cannotFindEncoderForTypeError
attributesForTypeUnsupportedError
schemaForTypeUnsupportedError
cannotFindConstructorForTypeError
paramExceedOneCharError
paramIsNotIntegerError
paramIsNotBooleanValueError
foundNullValueForNotNullableFieldError
malformedCSVRecordError
elementsOfTupleExceedLimitError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the ninth set of 20.

{code:java}
unscaledValueTooLargeForPrecisionError
decimalPrecisionExceedsMaxPrecisionError
outOfDecimalTypeRangeError
unsupportedArrayTypeError
unsupportedJavaTypeError
failedParsingStructTypeError
failedMergingFieldsError
cannotMergeDecimalTypesWithIncompatiblePrecisionAndScaleError
cannotMergeDecimalTypesWithIncompatiblePrecisionError
cannotMergeDecimalTypesWithIncompatibleScaleError
cannotMergeIncompatibleDataTypesError
exceedMapSizeLimitError
duplicateMapKeyFoundError
mapDataKeyArrayLengthDiffersFromValueArrayLengthError
fieldDiffersFromDerivedLocalDateError
failToParseDateTimeInNewParserError
failToFormatDateTimeInNewFormatterError
failToRecognizePatternAfterUpgradeError
failToRecognizePatternError
cannotCastUTF8StringToDataTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor tenth set of 20 query execution errors to use error classes
> ---------------------------------------------------------------------
>
> Key: SPARK-36299
> URL: https://issues.apache.org/jira/browse/SPARK-36299
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36299) Refactor tenth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36299:
-----------------------------------

Summary: Refactor tenth set of 20 query execution errors to use error classes
Key: SPARK-36299
URL: https://issues.apache.org/jira/browse/SPARK-36299
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the ninth set of 20.

{code:java}
unscaledValueTooLargeForPrecisionError
decimalPrecisionExceedsMaxPrecisionError
outOfDecimalTypeRangeError
unsupportedArrayTypeError
unsupportedJavaTypeError
failedParsingStructTypeError
failedMergingFieldsError
cannotMergeDecimalTypesWithIncompatiblePrecisionAndScaleError
cannotMergeDecimalTypesWithIncompatiblePrecisionError
cannotMergeDecimalTypesWithIncompatibleScaleError
cannotMergeIncompatibleDataTypesError
exceedMapSizeLimitError
duplicateMapKeyFoundError
mapDataKeyArrayLengthDiffersFromValueArrayLengthError
fieldDiffersFromDerivedLocalDateError
failToParseDateTimeInNewParserError
failToFormatDateTimeInNewFormatterError
failToRecognizePatternAfterUpgradeError
failToRecognizePatternError
cannotCastUTF8StringToDataTypeError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Created] (SPARK-36300) Refactor eleventh set of 20 query execution errors to use error classes
Karen Feng created SPARK-36300:
-----------------------------------

Summary: Refactor eleventh set of 20 query execution errors to use error classes
Key: SPARK-36300
URL: https://issues.apache.org/jira/browse/SPARK-36300
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the tenth set of 20.

{code:java}
registeringStreamingQueryListenerError
concurrentQueryInstanceError
cannotParseJsonArraysAsStructsError
cannotParseStringAsDataTypeError
failToParseEmptyStringForDataTypeError
failToParseValueForDataTypeError
rootConverterReturnNullError
cannotHaveCircularReferencesInBeanClassError
cannotHaveCircularReferencesInClassError
cannotUseInvalidJavaIdentifierAsFieldNameError
cannotFindEncoderForTypeError
attributesForTypeUnsupportedError
schemaForTypeUnsupportedError
cannotFindConstructorForTypeError
paramExceedOneCharError
paramIsNotIntegerError
paramIsNotBooleanValueError
foundNullValueForNotNullableFieldError
malformedCSVRecordError
elementsOfTupleExceedLimitError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36297) Refactor eighth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36297:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eighth set of 20.

{code:java}
executeBroadcastTimeoutError
cannotCompareCostWithTargetCostError
unsupportedDataTypeError
notSupportTypeError
notSupportNonPrimitiveTypeError
unsupportedTypeError
useDictionaryEncodingWhenDictionaryOverflowError
endOfIteratorError
cannotAllocateMemoryToGrowBytesToBytesMapError
cannotAcquireMemoryToBuildLongHashedRelationError
cannotAcquireMemoryToBuildUnsafeHashedRelationError
rowLargerThan256MUnsupportedError
cannotBuildHashedRelationWithUniqueKeysExceededError
cannotBuildHashedRelationLargerThan8GError
failedToPushRowIntoRowQueueError
unexpectedWindowFunctionFrameError
cannotParseStatisticAsPercentileError
statisticNotRecognizedError
unknownColumnError
unexpectedAccumulableUpdateValueError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the seventh set of 20.

{code:java}
missingJdbcTableNameAndQueryError
emptyOptionError
invalidJdbcTxnIsolationLevelError
cannotGetJdbcTypeError
unrecognizedSqlTypeError
unsupportedJdbcTypeError
unsupportedArrayElementTypeBasedOnBinaryError
nestedArraysUnsupportedError
cannotTranslateNonNullValueForFieldError
invalidJdbcNumPartitionsError
transactionUnsupportedByJdbcServerError
dataTypeUnsupportedYetError
unsupportedOperationForDataTypeError
inputFilterNotFullyConvertibleError
cannotReadFooterForFileError
cannotReadFooterForFileError
foundDuplicateFieldInCaseInsensitiveModeError
failedToMergeIncompatibleSchemasError
ddlUnsupportedTemporarilyError
operatingOnCanonicalizationPlanError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor eighth set of 20 query execution errors to use error classes
> ----------------------------------------------------------------------
>
> Key: SPARK-36297
> URL: https://issues.apache.org/jira/browse/SPARK-36297
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36298) Refactor ninth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36298:
-----------------------------------

Summary: Refactor ninth set of 20 query execution errors to use error classes
Key: SPARK-36298
URL: https://issues.apache.org/jira/browse/SPARK-36298
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the eighth set of 20.

{code:java}
executeBroadcastTimeoutError
cannotCompareCostWithTargetCostError
unsupportedDataTypeError
notSupportTypeError
notSupportNonPrimitiveTypeError
unsupportedTypeError
useDictionaryEncodingWhenDictionaryOverflowError
endOfIteratorError
cannotAllocateMemoryToGrowBytesToBytesMapError
cannotAcquireMemoryToBuildLongHashedRelationError
cannotAcquireMemoryToBuildUnsafeHashedRelationError
rowLargerThan256MUnsupportedError
cannotBuildHashedRelationWithUniqueKeysExceededError
cannotBuildHashedRelationLargerThan8GError
failedToPushRowIntoRowQueueError
unexpectedWindowFunctionFrameError
cannotParseStatisticAsPercentileError
statisticNotRecognizedError
unknownColumnError
unexpectedAccumulableUpdateValueError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36296) Refactor seventh set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36296:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the seventh set of 20.

{code:java}
missingJdbcTableNameAndQueryError
emptyOptionError
invalidJdbcTxnIsolationLevelError
cannotGetJdbcTypeError
unrecognizedSqlTypeError
unsupportedJdbcTypeError
unsupportedArrayElementTypeBasedOnBinaryError
nestedArraysUnsupportedError
cannotTranslateNonNullValueForFieldError
invalidJdbcNumPartitionsError
transactionUnsupportedByJdbcServerError
dataTypeUnsupportedYetError
unsupportedOperationForDataTypeError
inputFilterNotFullyConvertibleError
cannotReadFooterForFileError
cannotReadFooterForFileError
foundDuplicateFieldInCaseInsensitiveModeError
failedToMergeIncompatibleSchemasError
ddlUnsupportedTemporarilyError
operatingOnCanonicalizationPlanError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the sixth set of 20.

{code:java}
noRecordsFromEmptyDataReaderError
fileNotFoundError
unsupportedSchemaColumnConvertError
cannotReadParquetFilesError
cannotCreateColumnarReaderError
invalidNamespaceNameError
unsupportedPartitionTransformError
missingDatabaseLocationError
cannotRemoveReservedPropertyError
namespaceNotEmptyError
writingJobFailedError
writingJobAbortedError
commitDeniedError
unsupportedTableWritesError
cannotCreateJDBCTableWithPartitionsError
unsupportedUserSpecifiedSchemaError
writeUnsupportedForBinaryFileDataSourceError
fileLengthExceedsMaxLengthError
unsupportedFieldNameError
cannotSpecifyBothJdbcTableNameAndQueryError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor seventh set of 20 query execution errors to use error classes
> -----------------------------------------------------------------------
>
> Key: SPARK-36296
> URL: https://issues.apache.org/jira/browse/SPARK-36296
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
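Many errors in these JDBC- and data-source-heavy sets map naturally onto standard SQLSTATEs, which is part of the motivation for error classes: clients can branch on the five-character code instead of parsing Spark's message text. A small sketch, assuming such an error reaches the client with its SQLSTATE set; the helper name is ours, not a Spark API:

{code:scala}
import java.sql.SQLException

object SqlStateSketch {
  // "0A" is the standard SQLSTATE class for "feature not supported"; whether a
  // given Spark error surfaces that code over JDBC is an assumption here.
  def isFeatureNotSupported(e: SQLException): Boolean =
    e.getSQLState != null && e.getSQLState.startsWith("0A")
}
{code}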
[jira] [Created] (SPARK-36297) Refactor eighth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36297:
-----------------------------------

Summary: Refactor eighth set of 20 query execution errors to use error classes
Key: SPARK-36297
URL: https://issues.apache.org/jira/browse/SPARK-36297
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the seventh set of 20.

{code:java}
missingJdbcTableNameAndQueryError
emptyOptionError
invalidJdbcTxnIsolationLevelError
cannotGetJdbcTypeError
unrecognizedSqlTypeError
unsupportedJdbcTypeError
unsupportedArrayElementTypeBasedOnBinaryError
nestedArraysUnsupportedError
cannotTranslateNonNullValueForFieldError
invalidJdbcNumPartitionsError
transactionUnsupportedByJdbcServerError
dataTypeUnsupportedYetError
unsupportedOperationForDataTypeError
inputFilterNotFullyConvertibleError
cannotReadFooterForFileError
cannotReadFooterForFileError
foundDuplicateFieldInCaseInsensitiveModeError
failedToMergeIncompatibleSchemasError
ddlUnsupportedTemporarilyError
operatingOnCanonicalizationPlanError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Created] (SPARK-36296) Refactor seventh set of 20 query execution errors to use error classes
Karen Feng created SPARK-36296:
-----------------------------------

Summary: Refactor seventh set of 20 query execution errors to use error classes
Key: SPARK-36296
URL: https://issues.apache.org/jira/browse/SPARK-36296
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the sixth set of 20.

{code:java}
noRecordsFromEmptyDataReaderError
fileNotFoundError
unsupportedSchemaColumnConvertError
cannotReadParquetFilesError
cannotCreateColumnarReaderError
invalidNamespaceNameError
unsupportedPartitionTransformError
missingDatabaseLocationError
cannotRemoveReservedPropertyError
namespaceNotEmptyError
writingJobFailedError
writingJobAbortedError
commitDeniedError
unsupportedTableWritesError
cannotCreateJDBCTableWithPartitionsError
unsupportedUserSpecifiedSchemaError
writeUnsupportedForBinaryFileDataSourceError
fileLengthExceedsMaxLengthError
unsupportedFieldNameError
cannotSpecifyBothJdbcTableNameAndQueryError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36295) Refactor sixth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36295:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the sixth set of 20.

{code:java}
noRecordsFromEmptyDataReaderError
fileNotFoundError
unsupportedSchemaColumnConvertError
cannotReadParquetFilesError
cannotCreateColumnarReaderError
invalidNamespaceNameError
unsupportedPartitionTransformError
missingDatabaseLocationError
cannotRemoveReservedPropertyError
namespaceNotEmptyError
writingJobFailedError
writingJobAbortedError
commitDeniedError
unsupportedTableWritesError
cannotCreateJDBCTableWithPartitionsError
unsupportedUserSpecifiedSchemaError
writeUnsupportedForBinaryFileDataSourceError
fileLengthExceedsMaxLengthError
unsupportedFieldNameError
cannotSpecifyBothJdbcTableNameAndQueryError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fifth set of 20.

{code:java}
createStreamingSourceNotSpecifySchemaError
streamedOperatorUnsupportedByDataSourceError
multiplePathsSpecifiedError
failedToFindDataSourceError
removedClassInSpark2Error
incompatibleDataSourceRegisterError
unrecognizedFileFormatError
sparkUpgradeInReadingDatesError
sparkUpgradeInWritingDatesError
buildReaderUnsupportedForFileFormatError
jobAbortedError
taskFailedWhileWritingRowsError
readCurrentFileNotFoundError
unsupportedSaveModeError
cannotClearOutputDirectoryError
cannotClearPartitionDirectoryError
failedToCastValueToDataTypeForPartitionColumnError
endOfStreamError
fallbackV1RelationReportsInconsistentSchemaError
cannotDropNonemptyNamespaceError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor sixth set of 20 query execution errors to use error classes
> ---------------------------------------------------------------------
>
> Key: SPARK-36295
> URL: https://issues.apache.org/jira/browse/SPARK-36295
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36295) Refactor sixth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36295:
-----------------------------------

Summary: Refactor sixth set of 20 query execution errors to use error classes
Key: SPARK-36295
URL: https://issues.apache.org/jira/browse/SPARK-36295
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fifth set of 20.

{code:java}
createStreamingSourceNotSpecifySchemaError
streamedOperatorUnsupportedByDataSourceError
multiplePathsSpecifiedError
failedToFindDataSourceError
removedClassInSpark2Error
incompatibleDataSourceRegisterError
unrecognizedFileFormatError
sparkUpgradeInReadingDatesError
sparkUpgradeInWritingDatesError
buildReaderUnsupportedForFileFormatError
jobAbortedError
taskFailedWhileWritingRowsError
readCurrentFileNotFoundError
unsupportedSaveModeError
cannotClearOutputDirectoryError
cannotClearPartitionDirectoryError
failedToCastValueToDataTypeForPartitionColumnError
endOfStreamError
fallbackV1RelationReportsInconsistentSchemaError
cannotDropNonemptyNamespaceError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36294) Refactor fifth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36294:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fifth set of 20.

{code:java}
createStreamingSourceNotSpecifySchemaError
streamedOperatorUnsupportedByDataSourceError
multiplePathsSpecifiedError
failedToFindDataSourceError
removedClassInSpark2Error
incompatibleDataSourceRegisterError
unrecognizedFileFormatError
sparkUpgradeInReadingDatesError
sparkUpgradeInWritingDatesError
buildReaderUnsupportedForFileFormatError
jobAbortedError
taskFailedWhileWritingRowsError
readCurrentFileNotFoundError
unsupportedSaveModeError
cannotClearOutputDirectoryError
cannotClearPartitionDirectoryError
failedToCastValueToDataTypeForPartitionColumnError
endOfStreamError
fallbackV1RelationReportsInconsistentSchemaError
cannotDropNonemptyNamespaceError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourth set of 20.

{code:java}
unableToCreateDatabaseAsFailedToCreateDirectoryError
unableToDropDatabaseAsFailedToDeleteDirectoryError
unableToCreateTableAsFailedToCreateDirectoryError
unableToDeletePartitionPathError
unableToDropTableAsFailedToDeleteDirectoryError
unableToRenameTableAsFailedToRenameDirectoryError
unableToCreatePartitionPathError
unableToRenamePartitionPathError
methodNotImplementedError
tableStatsNotSpecifiedError
unaryMinusCauseOverflowError
binaryArithmeticCauseOverflowError
failedSplitSubExpressionMsg
failedSplitSubExpressionError
failedToCompileMsg
internalCompilerError
compilerError
unsupportedTableChangeError
notADatasourceRDDPartitionError
dataPathNotSpecifiedError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor fifth set of 20 query execution errors to use error classes
> ---------------------------------------------------------------------
>
> Key: SPARK-36294
> URL: https://issues.apache.org/jira/browse/SPARK-36294
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36294) Refactor fifth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36294:
-----------------------------------

Summary: Refactor fifth set of 20 query execution errors to use error classes
Key: SPARK-36294
URL: https://issues.apache.org/jira/browse/SPARK-36294
Project: Spark
Issue Type: Sub-task
Components: Spark Core, SQL
Affects Versions: 3.2.0
Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourth set of 20.

{code:java}
unableToCreateDatabaseAsFailedToCreateDirectoryError
unableToDropDatabaseAsFailedToDeleteDirectoryError
unableToCreateTableAsFailedToCreateDirectoryError
unableToDeletePartitionPathError
unableToDropTableAsFailedToDeleteDirectoryError
unableToRenameTableAsFailedToRenameDirectoryError
unableToCreatePartitionPathError
unableToRenamePartitionPathError
methodNotImplementedError
tableStatsNotSpecifiedError
unaryMinusCauseOverflowError
binaryArithmeticCauseOverflowError
failedSplitSubExpressionMsg
failedSplitSubExpressionError
failedToCompileMsg
internalCompilerError
compilerError
unsupportedTableChangeError
notADatasourceRDDPartitionError
dataPathNotSpecifiedError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36293) Refactor fourth set of 20 query execution errors to use error classes
[ https://issues.apache.org/jira/browse/SPARK-36293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36293:
-------------------------------

Description:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the fourth set of 20.

{code:java}
unableToCreateDatabaseAsFailedToCreateDirectoryError
unableToDropDatabaseAsFailedToDeleteDirectoryError
unableToCreateTableAsFailedToCreateDirectoryError
unableToDeletePartitionPathError
unableToDropTableAsFailedToDeleteDirectoryError
unableToRenameTableAsFailedToRenameDirectoryError
unableToCreatePartitionPathError
unableToRenamePartitionPathError
methodNotImplementedError
tableStatsNotSpecifiedError
unaryMinusCauseOverflowError
binaryArithmeticCauseOverflowError
failedSplitSubExpressionMsg
failedSplitSubExpressionError
failedToCompileMsg
internalCompilerError
compilerError
unsupportedTableChangeError
notADatasourceRDDPartitionError
dataPathNotSpecifiedError
{code}

For more detail, see the parent ticket SPARK-36094.

was:

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second set of 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

> Refactor fourth set of 20 query execution errors to use error classes
> ----------------------------------------------------------------------
>
> Key: SPARK-36293
> URL: https://issues.apache.org/jira/browse/SPARK-36293
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core, SQL
> Affects Versions: 3.2.0
> Reporter: Karen Feng
> Priority: Major
[jira] [Created] (SPARK-36293) Refactor fourth set of 20 query execution errors to use error classes
Karen Feng created SPARK-36293:
----------------------------------

             Summary: Refactor fourth set of 20 query execution errors to use error classes
                 Key: SPARK-36293
                 URL: https://issues.apache.org/jira/browse/SPARK-36293
             Project: Spark
          Issue Type: Sub-task
          Components: Spark Core, SQL
    Affects Versions: 3.2.0
            Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second set of 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36107) Refactor first set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36107:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first set of 20.

{code:java}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first 20.

{code}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].
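Among this first set, divideByZeroError shows the target shape most compactly. Roughly, under the 3.2-era framework (a sketch; the exact constructor signature may differ):

{code:scala}
import org.apache.spark.SparkArithmeticException

object DivideByZeroSketch {
  // The "DIVIDE_BY_ZERO" entry in error-classes.json supplies the message
  // template and the standard SQLSTATE for division by zero (22012).
  def divideByZeroError(): ArithmeticException = {
    new SparkArithmeticException(errorClass = "DIVIDE_BY_ZERO", messageParameters = Array.empty)
  }
}
{code}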
[jira] [Updated] (SPARK-36291) Refactor second set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36291:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second set of 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36292) Refactor third set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36292:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the third set of 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Updated] (SPARK-36291) Refactor second set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36291:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
inputTypeUnsupportedError
invalidFractionOfSecondError
overflowInSumOfDecimalError
overflowInIntegralDivideError
mapSizeExceedArraySizeWhenZipMapError
copyNullFieldNotAllowedError
literalTypeUnsupportedError
noDefaultForDataTypeError
doGenCodeOfAliasShouldNotBeCalledError
orderedOperationUnsupportedByDataTypeError
regexGroupIndexLessThanZeroError
regexGroupIndexExceedGroupCountError
invalidUrlError
dataTypeOperationUnsupportedError
mergeUnsupportedByWindowFunctionError
dataTypeUnexpectedError
typeUnsupportedError
negativeValueUnexpectedError
addNewFunctionMismatchedWithFunctionError
cannotGenerateCodeForUncomparableTypeError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.
[jira] [Created] (SPARK-36292) Refactor third set of 20 query execution errors to use error classes
Karen Feng created SPARK-36292:
----------------------------------

             Summary: Refactor third set of 20 query execution errors to use error classes
                 Key: SPARK-36292
                 URL: https://issues.apache.org/jira/browse/SPARK-36292
             Project: Spark
          Issue Type: Sub-task
          Components: Spark Core, SQL
    Affects Versions: 3.2.0
            Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.
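Note that the lists mix exception factories (names ending in Error) with plain message builders (names ending in Msg, such as fieldCannotBeNullMsg above), which return strings for use in generated code rather than throwables; the refactor has to preserve that split. A sketch with hypothetical shapes, not Spark's actual signatures:

{code:scala}
object MsgVsErrorSketch {
  // A "...Msg" helper returns only the message text (usable from codegen
  // templates); the paired "...Error" factory wraps it in a throwable.
  def fieldCannotBeNullMsg(index: Int, fieldName: String): String =
    s"The ${index}th field '$fieldName' of input row cannot be null."

  def fieldCannotBeNullError(index: Int, fieldName: String): Throwable =
    new RuntimeException(fieldCannotBeNullMsg(index, fieldName))
}
{code}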
[jira] [Updated] (SPARK-36291) Refactor second set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36291:
-------------------------------
        Summary: Refactor second set of 20 query execution errors to use error classes  (was: Refactor second 20 query execution errors to use error classes)
[jira] [Updated] (SPARK-36107) Refactor first set of 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36107:
-------------------------------
        Summary: Refactor first set of 20 query execution errors to use error classes  (was: Refactor first 20 query execution errors to use error classes)
[jira] [Updated] (SPARK-36291) Refactor second 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36291:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the second 20.

{code:java}
cannotGenerateCodeForUnsupportedTypeError
cannotInterpolateClassIntoCodeBlockError
customCollectionClsNotResolvedError
classUnsupportedByMapObjectsError
nullAsMapKeyNotAllowedError
methodNotDeclaredError
constructorNotFoundError
primaryConstructorNotFoundError
unsupportedNaturalJoinTypeError
notExpectedUnresolvedEncoderError
unsupportedEncoderError
notOverrideExpectedMethodsError
failToConvertValueToJsonError
unexpectedOperatorInCorrelatedSubquery
unreachableError
unsupportedRoundingMode
resolveCannotHandleNestedSchema
inputExternalRowCannotBeNullError
fieldCannotBeNullMsg
fieldCannotBeNullError
{code}

For more detail, see the parent ticket SPARK-36094.

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first 20.

{code}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].
[jira] [Updated] (SPARK-36107) Refactor first 20 query execution errors to use error classes
     [ https://issues.apache.org/jira/browse/SPARK-36107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karen Feng updated SPARK-36107:
-------------------------------
    Description: 
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first 20.

{code}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].

  was:
Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on a few.

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].

        Summary: Refactor first 20 query execution errors to use error classes  (was: Refactor a few query execution errors to use error classes)
[jira] [Created] (SPARK-36291) Refactor second 20 query execution errors to use error classes
Karen Feng created SPARK-36291:
----------------------------------

             Summary: Refactor second 20 query execution errors to use error classes
                 Key: SPARK-36291
                 URL: https://issues.apache.org/jira/browse/SPARK-36291
             Project: Spark
          Issue Type: Sub-task
          Components: Spark Core, SQL
    Affects Versions: 3.2.0
            Reporter: Karen Feng

Refactor some exceptions in [QueryExecutionErrors|https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryExecutionErrors.scala] to use error classes.

There are currently ~350 exceptions in this file; so this PR only focuses on the first 20.

{code}
columnChangeUnsupportedError
logicalHintOperatorNotRemovedDuringAnalysisError
cannotEvaluateExpressionError
cannotGenerateCodeForExpressionError
cannotTerminateGeneratorError
castingCauseOverflowError
cannotChangeDecimalPrecisionError
invalidInputSyntaxForNumericError
cannotCastFromNullTypeError
cannotCastError
cannotParseDecimalError
simpleStringWithNodeIdUnsupportedError
evaluateUnevaluableAggregateUnsupportedError
dataTypeUnsupportedError
dataTypeUnsupportedError
failedExecuteUserDefinedFunctionError
divideByZeroError
invalidArrayIndexError
mapKeyNotExistError
rowFromCSVParserNotExpectedError
{code}

For more detail, see the parent ticket [SPARK-36094|https://issues.apache.org/jira/browse/SPARK-36094].
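Mechanically, every subtask in this series reduces to the same lookup: resolve an error class to a message template and interpolate the parameters. A toy, self-contained sketch of that mechanism (not Spark's actual implementation, which reads the registry from error-classes.json):

{code:scala}
object ErrorClassLookupSketch {
  // Toy registry mapping an error class to a printf-style message template.
  private val errorClasses: Map[String, String] = Map(
    "DIVIDE_BY_ZERO" -> "divide by zero",
    "UNSUPPORTED_OPERATION" -> "The operation is not supported: %s")

  // Look up the template for an error class and fill in its parameters.
  def getMessage(errorClass: String, parameters: Array[String]): String = {
    val template = errorClasses.getOrElse(
      errorClass,
      throw new IllegalArgumentException(s"Cannot find error class '$errorClass'"))
    String.format(template, parameters: _*)
  }

  def main(args: Array[String]): Unit = {
    // Prints: The operation is not supported: DISTINCT aggregates
    println(getMessage("UNSUPPORTED_OPERATION", Array("DISTINCT aggregates")))
  }
}
{code}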