+1 for this proposal.

On Fri, Apr 16, 2021 at 5:15 AM Karen <karenfeng...@gmail.com> wrote:
> We could leave space in the numbering system, but a more flexible method
> may be to have the severity as a field associated with the error class -
> the same way we would associate error ID with SQLSTATE, or with whether an
> error is user-facing or internal. As you noted, I don't believe there is a
> standard framework for hints/warnings in Spark today. I propose that we
> leave out severity as a field until there is sufficient demand. We will
> leave room in the format for other fields.
>
> On Thu, Apr 15, 2021 at 3:18 AM Steve Loughran <ste...@cloudera.com.invalid>
> wrote:
>
>> Machine readable logs are always good, especially if you can read the
>> entire logs into an SQL query.
>>
>> It might be good to use some specific differentiation between
>> hint/warn/fatal error in the numbering so that any automated analysis of
>> the logs can identify the class of an error even if it's an error not
>> actually recognised. See VMS docs for an example of this; that in Windows
>> is apparently based on their work:
>> https://www.stsci.edu/ftp/documents/system-docs/vms-guide/html/VUG_19.html
>> Even if things are only errors for now, leaving room in the format for
>> other levels is wise.
>>
>> The trend in cloud infras is always to have some string "NoSuchBucket"
>> which is (a) guaranteed to be maintained over time and (b) searchable in
>> Google.
>>
>> (That said, AWS has every service not just making up their own values but
>> not even giving consistent responses for the same problem. S3 throttling:
>> 503. DynamoDB: 500 + one of two different messages. See
>> com.amazonaws.retry.RetryUtils for the details.)
>>
>> On Wed, 14 Apr 2021 at 20:04, Karen <karenfeng...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> We would like to kick off a discussion on adding error IDs to Spark.
>>>
>>> Proposal:
>>>
>>> Add error IDs to provide a language-agnostic, locale-agnostic, specific,
>>> and succinct answer for which class the problem falls under. When partnered
>>> with a text-based error class (e.g. 12345 TABLE_OR_VIEW_NOT_FOUND), error
>>> IDs can provide meaningful categorization. They are useful for all Spark
>>> personas: from users, to support engineers, to developers.
>>>
>>> Add SQLSTATEs. As discussed in #32013
>>> <https://github.com/apache/spark/pull/32013>, SQLSTATEs
>>> <https://docs.teradata.com/r/EClCkxtGMW6hxXXtL8sBfA/ZDOZe5cOpMSSNnWOg8iLyw>
>>> are portable error codes that are part of the ANSI/ISO SQL-99 standard
>>> <https://github.com/apache/spark/files/6236838/ANSI.pdf>, and
>>> especially useful for JDBC/ODBC users. They are not mutually exclusive with
>>> adding product-specific error IDs, which can be more specific; for example,
>>> MySQL uses an N-1 mapping from error IDs to SQLSTATEs:
>>> https://dev.mysql.com/doc/refman/8.0/en/error-message-elements.html.
>>>
>>> Uniquely link error IDs to error messages (1-1). This simplifies the
>>> auditing process and ensures that we uphold quality standards, as outlined
>>> in SPIP: Standardize Error Message in Spark
>>> (https://docs.google.com/document/d/1XGj1o3xAFh8BA7RCn3DtwIPC6--hIFOaNUNSlpaOIZs/edit).
>>>
>>> Requirements:
>>>
>>> Changes are backwards compatible; developers should still be able to
>>> throw exceptions in the existing style (e.g. throw new
>>> AnalysisException("Arbitrary error message.")). Adding error IDs will be a
>>> gradual process, as there are thousands of exceptions thrown across the
>>> code base.
>>>
>>> Optional:
>>>
>>> Label errors as user-facing or internal. Internal errors should be
>>> logged, and end-users should be aware that they likely cannot fix the error
>>> themselves.
>>>
>>> End result:
>>>
>>> Before:
>>>
>>> AnalysisException: Cannot find column ‘fakeColumn’; line 1 pos 14;
>>>
>>> After:
>>>
>>> AnalysisException: SPK-12345 COLUMN_NOT_FOUND: Cannot find column
>>> ‘fakeColumn’; line 1 pos 14; (SQLSTATE 42704)
>>>
>>> Please let us know what you think about this proposal!
>>>
>>> Best,
>>>
>>> Karen Feng
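
To make the quoted "Before"/"After" concrete, here is a rough sketch of how an error class could carry its ID and SQLSTATE while the existing message-only style keeps working. Everything in it is an assumption for illustration, not Spark's actual API: `ErrorIdSketch`, `ErrorClass`, `SketchAnalysisException` and `format` are made-up names, and the ID 12346 for TABLE_OR_VIEW_NOT_FOUND (and its SQLSTATE) is invented only to show two IDs sharing one SQLSTATE (the N-1 mapping). Only SPK-12345 COLUMN_NOT_FOUND / 42704 comes from the example in the proposal.

```java
public class ErrorIdSketch {

    /** Hypothetical registry entry: one numeric ID, one text name, one SQLSTATE. */
    enum ErrorClass {
        COLUMN_NOT_FOUND(12345, "42704"),        // values from the proposal's "After" example
        TABLE_OR_VIEW_NOT_FOUND(12346, "42704"); // invented ID and SQLSTATE, to show N-1 sharing

        final int id;
        final String sqlState;

        ErrorClass(int id, String sqlState) {
            this.id = id;
            this.sqlState = sqlState;
        }
    }

    /** Stand-in for Spark's AnalysisException; this is not the real class. */
    static class SketchAnalysisException extends RuntimeException {
        final ErrorClass errorClass; // null when thrown in the existing, ID-less style

        // Existing style keeps working: arbitrary message, no error ID.
        SketchAnalysisException(String message) {
            super(message);
            this.errorClass = null;
        }

        // New style: the error class supplies the prefix and the SQLSTATE suffix.
        SketchAnalysisException(ErrorClass errorClass, String message) {
            super(format(errorClass, message));
            this.errorClass = errorClass;
        }
    }

    /** Builds "SPK-<id> <NAME>: <message> (SQLSTATE <state>)". */
    static String format(ErrorClass ec, String message) {
        return String.format("SPK-%d %s: %s (SQLSTATE %s)",
                ec.id, ec.name(), message, ec.sqlState);
    }

    public static void main(String[] args) {
        String detail = "Cannot find column 'fakeColumn'; line 1 pos 14;";
        // Before: message only.
        System.out.println(new SketchAnalysisException(detail).getMessage());
        // After: ID, error class and SQLSTATE attached.
        System.out.println(
                new SketchAnalysisException(ErrorClass.COLUMN_NOT_FOUND, detail).getMessage());
    }
}
```

Keeping the SQLSTATE as a field on the error class, rather than encoding it in the number, is the same approach Karen suggests for severity: more fields can be added later without renumbering.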