Hi all,

We would like to kick off a discussion on adding error IDs to Spark.

Proposal:

1. Add error IDs to provide a language-agnostic, locale-agnostic, specific, and succinct answer to which class a problem falls under. When paired with a text-based error class (e.g. 12345 TABLE_OR_VIEW_NOT_FOUND), error IDs provide meaningful categorization. They are useful to all Spark personas: users, support engineers, and developers.

2. Add SQLSTATEs. As discussed in #32013 <https://github.com/apache/spark/pull/32013>, SQLSTATEs <https://docs.teradata.com/r/EClCkxtGMW6hxXXtL8sBfA/ZDOZe5cOpMSSNnWOg8iLyw> are portable error codes that are part of the ANSI/ISO SQL-99 standard <https://github.com/apache/spark/files/6236838/ANSI.pdf>, and are especially useful for JDBC/ODBC users. They are not mutually exclusive with product-specific error IDs, which can be more specific; for example, MySQL uses a many-to-one (N-1) mapping from error IDs to SQLSTATEs: https://dev.mysql.com/doc/refman/8.0/en/error-message-elements.html.

3. Uniquely link error IDs to error messages (1-1). This simplifies the auditing process and ensures that we uphold quality standards, as outlined in SPIP: Standardize Error Message in Spark ( https://docs.google.com/document/d/1XGj1o3xAFh8BA7RCn3DtwIPC6--hIFOaNUNSlpaOIZs/edit ).

Requirements:

1. Changes are backwards compatible; developers should still be able to throw exceptions in the existing style (e.g. throw new AnalysisException("Arbitrary error message.")).

2. Adding error IDs will be a gradual process, as there are thousands of exceptions thrown across the code base.

Optional:

1. Label errors as user-facing or internal. Internal errors should be logged, and end users should be made aware that they likely cannot fix the error themselves.

End result:

Before:
AnalysisException: Cannot find column 'fakeColumn'; line 1 pos 14;

After:
AnalysisException: SPK-12345 COLUMN_NOT_FOUND: Cannot find column 'fakeColumn'; line 1 pos 14; (SQLSTATE 42704)

Please let us know what you think about this proposal!

Best,
Karen Feng
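
For illustration only, the "After" format in the proposal could be assembled from a small registry that links each error ID to exactly one message template (1-1) while letting several IDs share one SQLSTATE (N-1). This is a minimal sketch and not Spark code: the class `ErrorIdSketch`, its `ErrorInfo` type, the `format` helper, the ID 12346, and the SQLSTATE assigned to it are all hypothetical placeholders.

```java
import java.util.Map;

// Hypothetical sketch: none of these names exist in Spark.
final class ErrorIdSketch {
    // Metadata tied to exactly one error ID (the 1-1 link to a message).
    static final class ErrorInfo {
        final String errorClass;
        final String sqlState;
        final String template;

        ErrorInfo(String errorClass, String sqlState, String template) {
            this.errorClass = errorClass;
            this.sqlState = sqlState;
            this.template = template;
        }
    }

    // Placeholder IDs; two IDs share one SQLSTATE to show an N-1 mapping.
    static final Map<Integer, ErrorInfo> ERRORS = Map.of(
        12345, new ErrorInfo("COLUMN_NOT_FOUND", "42704",
            "Cannot find column '%s'"),
        12346, new ErrorInfo("TABLE_OR_VIEW_NOT_FOUND", "42704",
            "Cannot find table or view '%s'"));

    // Builds "SPK-<id> <CLASS>: <message>; <context>; (SQLSTATE <state>)".
    static String format(int errorId, String name, String context) {
        ErrorInfo info = ERRORS.get(errorId);
        if (info == null) {
            throw new IllegalArgumentException("Unknown error ID: " + errorId);
        }
        String message = String.format(info.template, name);
        return String.format("SPK-%d %s: %s; %s; (SQLSTATE %s)",
            errorId, info.errorClass, message, context, info.sqlState);
    }

    public static void main(String[] args) {
        // Reproduces the "After" example from the proposal.
        System.out.println("AnalysisException: "
            + format(12345, "fakeColumn", "line 1 pos 14"));
    }
}
```

Keeping the ID, error class, SQLSTATE, and template in one table is one way to make the 1-1 audit straightforward; exceptions thrown in the existing style would simply bypass the registry, which keeps the change backwards compatible.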