Nicholas Chammas created SPARK-46810:
----------------------------------------
Summary: Clarify error class terminology
Key: SPARK-46810
URL: https://issues.apache.org/jira/browse/SPARK-46810
Project: Spark
Issue Type: Improvement
Components: Documentation, SQL
Affects Versions: 4.0.0
Reporter: Nicholas Chammas

We use inconsistent terminology when talking about error classes. I'd like to get some clarity on that before contributing any potential improvements to this part of the documentation.

Consider [INCOMPLETE_TYPE_DEFINITION|https://spark.apache.org/docs/3.5.0/sql-error-conditions-incomplete-type-definition-error-class.html]. It has several key pieces of hierarchical information that have inconsistent names throughout our documentation and codebase:

* 42
** K01
*** INCOMPLETE_TYPE_DEFINITION
**** ARRAY
**** MAP
**** STRUCT

What are the names of these different levels of information?

Some examples of inconsistent terminology:

* [Over here|https://spark.apache.org/docs/latest/sql-error-conditions-sqlstates.html#class-42-syntax-error-or-access-rule-violation] we call 42 the "class". Yet on the main page for INCOMPLETE_TYPE_DEFINITION we call that an "error class". So what exactly is a class, the 42 or the INCOMPLETE_TYPE_DEFINITION?
* [Over here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/README.md#L122] we call K01 the "subclass". But [over here|https://github.com/apache/spark/blob/26d3eca0a8d3303d0bb9450feb6575ed145bbd7e/common/utils/src/main/resources/error/error-classes.json#L1452-L1467] we call ARRAY, MAP, and STRUCT the subclasses. And on the main page for INCOMPLETE_TYPE_DEFINITION we call those same things "derived error classes". So what exactly is a subclass?
I propose the following terminology, which we should use consistently throughout our code and documentation:

* Error class: 42
* Error subclass: K01
* Error state: 42K01
* Error condition: INCOMPLETE_TYPE_DEFINITION
* Error sub-conditions: ARRAY, MAP, STRUCT

Side note: With this terminology, I believe talking about error classes and subclasses in front of users is not helpful. I don't think anybody cares about what 42 by itself means, or what K01 by itself means. Accordingly, we should limit how much we talk about these concepts.
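
To make the proposed terminology concrete, here is a minimal Python sketch of how the five terms relate to one another. The names `ErrorState` and `ErrorCondition` are hypothetical and exist only for illustration; they are not part of Spark's codebase or API. The sketch assumes that an error state is always a five-character string whose first two characters are the error class and whose last three are the error subclass.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ErrorState:
    """Hypothetical model of an 'error state' such as '42K01'."""
    value: str

    def __post_init__(self) -> None:
        # Assumption: a state is exactly five characters (class + subclass).
        if len(self.value) != 5:
            raise ValueError(f"error state must be 5 characters, got {self.value!r}")

    @property
    def error_class(self) -> str:
        # First two characters, e.g. '42'.
        return self.value[:2]

    @property
    def error_subclass(self) -> str:
        # Remaining three characters, e.g. 'K01'.
        return self.value[2:]


@dataclass(frozen=True)
class ErrorCondition:
    """Hypothetical model of an 'error condition' and its sub-conditions."""
    name: str
    state: ErrorState
    sub_conditions: Tuple[str, ...] = ()


# The example from this issue, expressed in the proposed terms.
condition = ErrorCondition(
    name="INCOMPLETE_TYPE_DEFINITION",
    state=ErrorState("42K01"),
    sub_conditions=("ARRAY", "MAP", "STRUCT"),
)

print(condition.state.error_class)     # 42
print(condition.state.error_subclass)  # K01
print(condition.state.value)           # 42K01
print(condition.name)                  # INCOMPLETE_TYPE_DEFINITION
print(condition.sub_conditions)        # ('ARRAY', 'MAP', 'STRUCT')
```

In this sketch the class and subclass are only derived properties of the state, which matches the side note above: users would mostly deal with the state and the condition, not with 42 or K01 on their own.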