[ https://issues.apache.org/jira/browse/SPARK-36094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karen Feng updated SPARK-36094:
-------------------------------
    Description:
To improve auditing, reduce duplication, and improve quality of error messages thrown from Spark, we should group them in a single JSON file (as discussed in the [mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] and introduced in [SPARK-34920|#diff-d41e24da75af19647fadd76ad0b63ecb22b08c0004b07091e4603a30ec0fe013]).
In this file, the error messages should be labeled according to a consistent error class and with a SQLSTATE.
We will start with the SQL component first.
As a starting point, we can build off the exception grouping done in SPARK-33539. In total, there are ~1000 error messages to group split across three files (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). In this ticket, each of these files is split into chunks of ~20 errors for refactoring.
Here is an example PR that groups a few error messages in the QueryCompilationErrors class: [PR 33309|https://github.com/apache/spark/pull/33309].
[Guidelines|https://github.com/apache/spark/blob/master/core/src/main/resources/error/README.md]:
- Error classes should be unique and sorted in alphabetical order.
- Error classes should be unified as much as possible to improve auditing. If error messages are similar, group them into a single error class and add parameters to the error message.
- SQLSTATE should match the ANSI/ISO standard, without introducing new classes or subclasses.
- The Throwable should extend [SparkThrowable|https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/SparkThrowable.java]; see [SparkArithmeticException|https://github.com/apache/spark/blob/f90eb6a5db0778fd18b0b544f93eac3103bbf03b/core/src/main/scala/org/apache/spark/SparkException.scala#L75] as an example of how to mix SparkThrowable into a base Exception type.
We will improve error message quality as a follow-up.
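To make the guidelines in the description concrete, here is a minimal, self-contained Java sketch of the pattern they describe: one registry keyed by error class, parameterized messages, a SQLSTATE per class, and an exception type that mixes a classification interface into a base exception. It is an illustration under stated assumptions, not Spark's implementation. `ClassifiedThrowable`, `ErrorInfo`, `REGISTRY`, `formatMessage`, and `ClassifiedArithmeticException` are hypothetical names that only stand in for the role of SparkThrowable and the JSON file, and the two entries and their message templates are examples chosen for the sketch. The in-memory map replaces the JSON file so the example runs without a parser.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: every name here is hypothetical, not Spark's API.
class ErrorClassSketch {

    /** Hypothetical stand-in for the role SparkThrowable plays. */
    public interface ClassifiedThrowable {
        String getErrorClass();
        String getSqlState();
    }

    /** One registry entry: a parameterized message plus its SQLSTATE. */
    public static final class ErrorInfo {
        final String messageTemplate;
        final String sqlState;

        ErrorInfo(String messageTemplate, String sqlState) {
            this.messageTemplate = messageTemplate;
            this.sqlState = sqlState;
        }
    }

    // A TreeMap keeps error classes unique and iterates them alphabetically,
    // mirroring the "unique and sorted" guideline. The entries are examples.
    public static final Map<String, ErrorInfo> REGISTRY = new TreeMap<>();
    static {
        REGISTRY.put("DIVIDE_BY_ZERO",
            new ErrorInfo("divide by zero", "22012"));
        REGISTRY.put("MISSING_COLUMN",
            new ErrorInfo("cannot resolve '%s' given input columns: [%s]", "42000"));
    }

    /** Looks up the template for an error class and fills in its parameters. */
    public static String formatMessage(String errorClass, Object... params) {
        ErrorInfo info = REGISTRY.get(errorClass);
        if (info == null) {
            throw new IllegalArgumentException("Unknown error class: " + errorClass);
        }
        return String.format(info.messageTemplate, params);
    }

    /** Mixes the classification interface into a standard base exception. */
    public static final class ClassifiedArithmeticException
            extends ArithmeticException implements ClassifiedThrowable {
        private final String errorClass;

        public ClassifiedArithmeticException(String errorClass, Object... params) {
            super(formatMessage(errorClass, params));
            this.errorClass = errorClass;
        }

        @Override
        public String getErrorClass() {
            return errorClass;
        }

        @Override
        public String getSqlState() {
            return REGISTRY.get(errorClass).sqlState;
        }
    }

    public static void main(String[] args) {
        try {
            throw new ClassifiedArithmeticException("DIVIDE_BY_ZERO");
        } catch (ArithmeticException e) {
            // Callers that catch the base type can still read the classification.
            ClassifiedThrowable t = (ClassifiedThrowable) e;
            System.out.println(t.getErrorClass() + " [" + t.getSqlState() + "]: " + e.getMessage());
        }
    }
}
```

Because the exception still extends ArithmeticException, existing handlers keep working, while the error class and SQLSTATE are available for auditing. Similar messages share one entry and differ only in the parameters passed to the template.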
  was:
To improve auditing, reduce duplication, and improve quality of error messages thrown from Spark, we should group them in a single JSON file (as discussed in the [mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] and introduced in [SPARK-34920|#diff-d41e24da75af19647fadd76ad0b63ecb22b08c0004b07091e4603a30ec0fe013]).
In this file, the error messages should be labeled according to a consistent error class and with a SQLSTATE.
We will start with the SQL component first.
As a starting point, we can build off the exception grouping done in [SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. In total, there are ~1000 error messages to group split across three files (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). In this ticket, each of these files is split into chunks of ~20 errors for refactoring.
Here is an example PR that groups a few error messages in the QueryCompilationErrors class: [PR 33309|https://github.com/apache/spark/pull/33309].
[Guidelines|https://github.com/apache/spark/blob/master/core/src/main/resources/error/README.md]:
- Error classes should be de-duplicated as much as possible to improve auditing. If error messages are similar, group them into a single error class and add parameters to the error message.
- SQLSTATE should match the ANSI/ISO standard, without introducing new classes or subclasses.
- The Throwable should extend [SparkThrowable|https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/SparkThrowable.java]; see [SparkArithmeticException|https://github.com/apache/spark/blob/f90eb6a5db0778fd18b0b544f93eac3103bbf03b/core/src/main/scala/org/apache/spark/SparkException.scala#L75] as an example of how to mix SparkThrowable into a base Exception type.
We will improve error message quality as a follow-up.
> Group SQL component error messages in Spark error class JSON file
> -----------------------------------------------------------------
>
>                 Key: SPARK-36094
>                 URL: https://issues.apache.org/jira/browse/SPARK-36094
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, SQL
>    Affects Versions: 3.2.0
>            Reporter: Karen Feng
>            Priority: Major
>
> To improve auditing, reduce duplication, and improve quality of error
> messages thrown from Spark, we should group them in a single JSON file (as
> discussed in the [mailing
> list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html]
> and introduced in
> [SPARK-34920|#diff-d41e24da75af19647fadd76ad0b63ecb22b08c0004b07091e4603a30ec0fe013]).
> In this file, the error messages should be labeled according to a consistent
> error class and with a SQLSTATE.
> We will start with the SQL component first.
> As a starting point, we can build off the exception grouping done in
> SPARK-33539. In total, there are ~1000 error messages to group split across
> three files (QueryCompilationErrors, QueryExecutionErrors, and
> QueryParsingErrors). In this ticket, each of these files is split into chunks
> of ~20 errors for refactoring.
> Here is an example PR that groups a few error messages in the
> QueryCompilationErrors class: [PR
> 33309|https://github.com/apache/spark/pull/33309].
> [Guidelines|https://github.com/apache/spark/blob/master/core/src/main/resources/error/README.md]:
> - Error classes should be unique and sorted in alphabetical order.
> - Error classes should be unified as much as possible to improve auditing.
> If error messages are similar, group them into a single error class and add
> parameters to the error message.
> - SQLSTATE should match the ANSI/ISO standard, without introducing new
> classes or subclasses.
> - The Throwable should extend
> [SparkThrowable|https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/SparkThrowable.java];
> see
> [SparkArithmeticException|https://github.com/apache/spark/blob/f90eb6a5db0778fd18b0b544f93eac3103bbf03b/core/src/main/scala/org/apache/spark/SparkException.scala#L75]
> as an example of how to mix SparkThrowable into a base Exception type.
> We will improve error message quality as a follow-up.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)