Nicholas Chammas created SPARK-46810:
----------------------------------------

             Summary: Clarify error class terminology
                 Key: SPARK-46810
                 URL: https://issues.apache.org/jira/browse/SPARK-46810
             Project: Spark
          Issue Type: Improvement
          Components: Documentation, SQL
    Affects Versions: 4.0.0
            Reporter: Nicholas Chammas


We use inconsistent terminology when talking about error classes. I'd like to 
get some clarity on that before contributing any potential improvements to this 
part of the documentation.

Consider 
[INCOMPLETE_TYPE_DEFINITION|https://spark.apache.org/docs/3.5.0/sql-error-conditions-incomplete-type-definition-error-class.html].
 It has several key pieces of hierarchical information that have inconsistent 
names throughout our documentation and codebase:
 * 42
 ** K01
 *** INCOMPLETE_TYPE_DEFINITION
 **** ARRAY
 **** MAP
 **** STRUCT

What are the names of these different levels of information?

Some examples of inconsistent terminology:
 * [Over 
here|https://spark.apache.org/docs/latest/sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation]
 we call 42 the "class". Yet on the main page for INCOMPLETE_TYPE_DEFINITION we 
call that an "error class". So what exactly is a class, the 42 or the 
INCOMPLETE_TYPE_DEFINITION?
 * [Over 
here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/README.md#L122]
 we call K01 the "subclass". But [over 
here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/error-classes.json#L1452-L1467]
 we call the ARRAY, MAP, and STRUCT the subclasses. And on the main page for 
INCOMPLETE_TYPE_DEFINITION we call those same things "derived error classes". 
So what exactly is a subclass?

I propose the following terminology, which we should use consistently 
throughout our code and documentation:
 * Error class: 42
 * Error subclass: K01
 * Error state: 42K01
 * Error condition: INCOMPLETE_TYPE_DEFINITION
 * Error sub-conditions: ARRAY, MAP, STRUCT

Side note: With this terminology, I believe talking about error classes and 
subclasses in front of users is not helpful. I don't think anybody cares about 
what 42 by itself means, or what K01 by itself means. Accordingly, we should 
limit how much we talk about these concepts.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to