[ https://issues.apache.org/jira/browse/SPARK-36094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karen Feng updated SPARK-36094:
-------------------------------
    Description:
To improve auditing, reduce duplication, and improve quality of error messages thrown from Spark, we should group them in a single JSON file (as discussed in the [mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] and introduced in [SPARK-34920|#diff-d41e24da75af19647fadd76ad0b63ecb22b08c0004b07091e4603a30ec0fe013]). In this file, the error messages should be labeled according to a consistent error class and with a SQLSTATE.

We will start with the SQL component first, building off the exception grouping done in [SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. In total, there are ~1000 error messages to group, split across three files (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). As a result, the work on this has been broken up into three subtasks, each of which involves grouping error messages across one of these files.

This work should be done across multiple PRs per subtask to improve ease of reviewing. For each subtask, comment to place a lock and minimize merge conflicts down the line.

As a guideline, the error classes should be de-duplicated as much as possible to improve auditing.

We will improve error message quality as a follow-up.

Here is an example PR that groups a few error messages in the QueryCompilationErrors class: [PR 33309|https://github.com/apache/spark/pull/33309].

  was:
To improve auditing, reduce duplication, and improve quality of error messages thrown from Spark, we should group them in a single JSON file (as discussed in the [mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html] and introduced in [SPARK-34920).|#diff-d41e24da75af19647fadd76ad0b63ecb22b08c0004b07091e4603a30ec0fe013]).] In this file, the error messages should be labeled according to a consistent error class and with a SQLSTATE.
We will start with the SQL component first,

> Group SQL component error messages in Spark error class JSON file
> -----------------------------------------------------------------
>
>                 Key: SPARK-36094
>                 URL: https://issues.apache.org/jira/browse/SPARK-36094
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, SQL
>    Affects Versions: 3.2.0
>            Reporter: Karen Feng
>            Priority: Major
>
> To improve auditing, reduce duplication, and improve quality of error
> messages thrown from Spark, we should group them in a single JSON file (as
> discussed in the
> [mailing list|http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Add-error-IDs-td31126.html]
> and introduced in
> [SPARK-34920|#diff-d41e24da75af19647fadd76ad0b63ecb22b08c0004b07091e4603a30ec0fe013]).
> In this file, the error messages should be labeled according to a consistent
> error class and with a SQLSTATE.
>
> We will start with the SQL component first, building off the exception
> grouping done in
> [SPARK-33539|https://issues.apache.org/jira/browse/SPARK-33539]. In total,
> there are ~1000 error messages to group, split across three files
> (QueryCompilationErrors, QueryExecutionErrors, and QueryParsingErrors). As a
> result, the work on this has been broken up into three subtasks, each of
> which involves grouping error messages across one of these files.
>
> This work should be done across multiple PRs per subtask to improve ease of
> reviewing. For each subtask, comment to place a lock and minimize merge
> conflicts down the line.
>
> As a guideline, the error classes should be de-duplicated as much as possible
> to improve auditing.
>
> We will improve error message quality as a follow-up.
>
> Here is an example PR that groups a few error messages in the
> QueryCompilationErrors class:
> [PR 33309|https://github.com/apache/spark/pull/33309].
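
For readers unfamiliar with the idea, the grouping described above can be sketched roughly as follows. This is only an illustration in Python, not Spark's implementation: the class names, the "message" and "sqlState" keys, the %s placeholders, and the format_error helper are all assumptions made for the sketch. The point is that each message lives once in a JSON document, keyed by an error class and tagged with a SQLSTATE, and call sites refer to the class instead of repeating the text.

```python
import json

# Hypothetical contents of the error-class file. The class names, the
# "message"/"sqlState" keys and the %s placeholders are assumptions made
# for this sketch; they are not copied from the Spark repository.
ERROR_CLASSES_JSON = """
{
  "DIVIDE_BY_ZERO": {
    "message": ["divide by zero"],
    "sqlState": "22012"
  },
  "MISSING_COLUMN": {
    "message": ["cannot resolve '%s' given input columns: [%s]"],
    "sqlState": "42000"
  }
}
"""

ERROR_CLASSES = json.loads(ERROR_CLASSES_JSON)


def format_error(error_class, params=()):
    """Look up an error class and fill in its message template."""
    entry = ERROR_CLASSES[error_class]
    # A message may span several lines; join them before substituting.
    template = "\n".join(entry["message"])
    message = template % tuple(params)
    return "[{}] {} (SQLSTATE {})".format(
        error_class, message, entry.get("sqlState", "none")
    )


if __name__ == "__main__":
    print(format_error("DIVIDE_BY_ZERO"))
    print(format_error("MISSING_COLUMN", ["a", "b, c"]))
```

Because every call site goes through one lookup, two exceptions that mean the same thing can share a single class, which is the de-duplication the guideline above asks for.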