[ 
https://issues.apache.org/jira/browse/SPARK-36094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Feng updated SPARK-36094:
-------------------------------
    Description: 
To improve auditing, reduce duplication, and improve the quality of error messages 
thrown from Spark, we should group them in a single JSON file (as discussed in 
the [mailing 
list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html]
 and introduced in 
[SPARK-34920|#diff-d41e24da75af19647fadd76ad0b63ecb22b08c0004b07091e4603a30ec0fe013]).
 In this file, each error message should be labeled with a consistent error 
class and a SQLSTATE.
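
As a rough illustration of the layout described above, the following Python sketch parses a small JSON document keyed by error class and renders a parameterized message. The error class names, the {{message}} and {{sqlState}} field names, and the {{%s}} placeholder style are assumptions made for this sketch, not a confirmed schema of the Spark file.

```python
import json

# Hypothetical error-class file content; names and fields are illustrative only.
# 22012 is the standard SQLSTATE for division by zero; 42000 is the class-level
# code for "syntax error or access rule violation".
ERROR_CLASSES_JSON = """
{
  "DIVIDE_BY_ZERO": {
    "message": ["divide by zero"],
    "sqlState": "22012"
  },
  "MISSING_COLUMN": {
    "message": ["cannot resolve '%s' given input columns: [%s]"],
    "sqlState": "42000"
  }
}
"""


def get_message(error_classes, error_class, params=()):
    """Look up an error class and fill its message template with params."""
    entry = error_classes[error_class]
    template = "\n".join(entry["message"])
    return template % tuple(params)


error_classes = json.loads(ERROR_CLASSES_JSON)
print(get_message(error_classes, "MISSING_COLUMN", ["age", "id, name"]))
```

Keeping the template and SQLSTATE next to the error class means two call sites that differ only in their parameters can share one entry.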

We will start with the SQL component.
As a starting point, we can build on the exception grouping done in 
SPARK-33539. In total, there are ~1000 error messages to group, split across 
three files (QueryCompilationErrors, QueryExecutionErrors, and 
QueryParsingErrors). In this ticket, each of these files is split into chunks 
of ~20 errors for refactoring.

Here is an example PR that groups a few error messages in the 
QueryCompilationErrors class: [PR 
33309|https://github.com/apache/spark/pull/33309].

[Guidelines|https://github.com/apache/spark/blob/master/core/src/main/resources/error/README.md]:
 - Error classes should be unique and sorted in alphabetical order.
 - Error classes should be unified as much as possible to improve auditing. If 
error messages are similar, group them into a single error class and add 
parameters to the error message.
 - SQLSTATE should match the ANSI/ISO standard, without introducing new classes 
or subclasses.
 - The Throwable should extend 
[SparkThrowable|https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/SparkThrowable.java];
 see 
[SparkArithmeticException|https://github.com/apache/spark/blob/f90eb6a5db0778fd18b0b544f93eac3103bbf03b/core/src/main/scala/org/apache/spark/SparkException.scala#L75]
 as an example of how to mix SparkThrowable into a base Exception type.
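
The guidelines above can be sketched as a few mechanical checks plus a mixin-style exception. This Python sketch is not Spark's implementation (which is in Scala and Java); {{check_error_classes}}, {{SketchThrowable}}, and {{SketchArithmeticError}} are hypothetical names, {{sqlState}} is treated as optional by assumption, and the SQLSTATE check only validates the five-character shape, not membership in the ANSI/ISO standard.

```python
import json
import re


def check_error_classes(json_text):
    """Return a list of guideline violations found in an error-class JSON document."""
    # Returning the raw pair list (instead of a dict) preserves key order and
    # keeps duplicate keys visible; nested objects become pair lists as well.
    top_level = json.loads(json_text, object_pairs_hook=lambda pairs: pairs)
    names = [name for name, _ in top_level]

    problems = []
    if len(set(names)) != len(names):
        problems.append("error classes are not unique")
    if names != sorted(names):
        problems.append("error classes are not sorted alphabetically")
    for name, fields in top_level:
        sql_state = dict(fields).get("sqlState")
        if sql_state is not None and not re.fullmatch(r"[0-9A-Z]{5}", sql_state):
            problems.append(f"{name}: malformed SQLSTATE {sql_state!r}")
    return problems


class SketchThrowable:
    """Hypothetical stand-in for an interface exposing error class and SQLSTATE."""

    error_class = None
    sql_state = None


class SketchArithmeticError(ArithmeticError, SketchThrowable):
    """A base exception type with the throwable interface mixed in."""

    def __init__(self, error_class, sql_state, message):
        super().__init__(message)
        self.error_class = error_class
        self.sql_state = sql_state


err = SketchArithmeticError("DIVIDE_BY_ZERO", "22012", "divide by zero")
print(err.error_class, err.sql_state, err)
```

Because the exception still derives from the base arithmetic type, existing handlers that catch that base type keep working while new callers can read the error class and SQLSTATE.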

We will improve error message quality as a follow-up.

  was:
To improve auditing, reduce duplication, and improve quality of error messages 
thrown from Spark, we should group them in a single JSON file (as discussed in 
the [mailing 
list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html]
 and introduced in 
[SPARK-34920|#diff-d41e24da75af19647fadd76ad0b63ecb22b08c0004b07091e4603a30ec0fe013]).
 In this file, the error messages should be labeled according to a consistent 
error class and with a SQLSTATE.

We will start with the SQL component first.
As a starting point, we can build off the exception grouping done in 
[SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. In total, 
there are ~1000 error messages to group split across three files 
(QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). In this 
ticket, each of these files is split into chunks of ~20 errors for refactoring.

Here is an example PR that groups a few error messages in the 
QueryCompilationErrors class: [PR 
33309|https://github.com/apache/spark/pull/33309].

[Guidelines|https://github.com/apache/spark/blob/master/core/src/main/resources/error/README.md]:

- Error classes should be de-duplicated as much as possible to improve 
auditing. If error messages are similar, group them into a single error class 
and add parameters to the error message.
- SQLSTATE should match the ANSI/ISO standard, without introducing new classes 
or subclasses.
- The Throwable should extend 
[SparkThrowable|https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/SparkThrowable.java];
 see 
[SparkArithmeticException|https://github.com/apache/spark/blob/f90eb6a5db0778fd18b0b544f93eac3103bbf03b/core/src/main/scala/org/apache/spark/SparkException.scala#L75]
 as an example of how to mix SparkThrowable into a base Exception type.

We will improve error message quality as a follow-up.


> Group SQL component error messages in Spark error class JSON file
> -----------------------------------------------------------------
>
>                 Key: SPARK-36094
>                 URL: https://issues.apache.org/jira/browse/SPARK-36094
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, SQL
>    Affects Versions: 3.2.0
>            Reporter: Karen Feng
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
