Re: [HACKERS] Upgrading the backend's error-message infrastructure
On Thursday 13 March 2003 20:51, Tom Lane wrote:
> (Or, protocol upgrade phase 1...)
>
> After digging through our many past discussions of what to do with error
> messages, I have put together the following first-cut proposal. Fire at
> will...
>
> Objective
> ---------
>
> The basic objective here is to divide error reports into multiple fields,
> and in particular to include an error code field that gives applications
> a stable value to test against when they're trying to find out what went
> wrong. (I am not spending much space in this proposal on the question of
> exactly what set of error codes we ought to have, but that comes soon.)
> Peter Eisentraut argued cogently awhile back that the error codes ought
> not be hard-wired to specific error message texts, so this proposal
> treats them as separate entities.

What about user messages? If I remember correctly, MSSQL had a system
catalog table with formatted error messages, and it was possible to raise
an error with an error number and its parameters. This can be very useful
when you must raise the same error from different places in the code. It
is also very useful when you need to translate error messages into another
language, for example. I think there was a range of error numbers reserved
for user error messages. Maybe even system messages could be stored in the
same way.

OK, there is a problem: how to raise an error if you can sp_connect and
get the error message (because an error is in sp_connect) ???

Just an idea (from M$)!

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Upgrading the backend's error-message infrastructure
Darko Prenosil [EMAIL PROTECTED] writes:
> What about user messages? If I remember correctly, MSSQL had a system
> catalog table with formatted error messages, and it was possible to
> raise an error with an error number and its parameters. This can be very
> useful when you must raise the same error from different places in the
> code.

But that's exactly the direction we are *not* going in. We had that
discussion a long time ago when we first started internationalizing our
error messages. Peter Eisentraut convinced everybody that we did not want
to tie error codes to unique error messages. [digs in archives ...] See
for example
http://fts.postgresql.org/db/mw/msg.html?mid=1279991

I have no desire to revisit that choice. There is nothing to stop you
from creating your own user-defined messages, and even adding them to the
.po files in your installation if the need strikes. We aren't going to
store them in any system table, however.

regards, tom lane
Re: [HACKERS] Upgrading the backend's error-message infrastructure
-*- Tom Lane [EMAIL PROTECTED] [ 2003-03-14 15:33 ]:
> Darko Prenosil [EMAIL PROTECTED] writes:
> > What about user messages? If I remember correctly, MSSQL had a system
> > catalog table with formatted error messages, and it was possible to
> > raise an error with an error number and its parameters. This can be
> > very useful when you must raise the same error from different places
> > in the code.
>
> But that's exactly the direction we are *not* going in. We had that
> discussion a long time ago when we first started internationalizing our
> error messages. Peter Eisentraut convinced everybody that we did not
> want to tie error codes to unique error messages. [digs in archives ...]
> See for example
> http://fts.postgresql.org/db/mw/msg.html?mid=1279991
>
> I have no desire to revisit that choice. There is nothing to stop you
> from creating your own user-defined messages, and even adding them to
> the .po files in your installation if the need strikes. We aren't going
> to store them in any system table, however.

What about the option of keeping error numbers unique, but linking error
numbers to error messages (unique in code, but sharing strings)?

Just my .02 ISK.

--
Regards, Tolli
[EMAIL PROTECTED]
Re: [HACKERS] Upgrading the backend's error-message infrastructure
On Thu, Mar 13, 2003 at 03:51:00PM -0500, Tom Lane wrote:
> Wire-protocol changes
> ---------------------
>
> Error and Notice (maybe also Notify?) msgs will have this structure:
>
>     E x string \0 x string \0 x string \0 \0
>
> where the x's are single-character field identifiers. A frontend should
> simply ignore any unrecognized fields. Initially defined fields for
> Error and Notice are:
> ...
> S,C,M fields will always appear (at least in Error messages; perhaps
> Notices might omit C?). The rest are optional.

It strikes me that this error response could be made slimmer by removing
the text fields. It makes sense for P, F, L, and R to be returned when
available, as they're specific to the instance of the error. C is clearly
necessary, as well. S is questionable, though, depending on whether for
every C there is one, and only one, S. But the others are going to be the
same for every instance of a given C.

It would seem to make more sense to me to provide a different function(s)
which allows the lookup of Messages, Details, and Hints based on the
SQLSTATE. The benefits that i see would be:

- Less clutter and wasted space on the wire. If we are concerned enough
  about space to reduce the SQLSTATE to an integer mapping, removing all
  the extra text should be a big win. Couple this with the libraries'
  ability to now do things like cache messages, or not bother to retrieve
  messages for certain SQLSTATEs, and the benefit gets larger.

- Removal of localization from error/notice generation libraries. This
  should make that section of code simpler and more fault-tolerant. It
  also allows libraries to do potentially weird stuff like using multiple
  different locales per connection, so long as they can specify a locale
  for the lookup functions.

Does that make sense, or am i missing something?

-johnn
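For readers following the wire-format discussion, here is a minimal sketch in C of how a frontend might walk the proposed layout (the payload after the leading E or N type byte) and skip field identifiers it does not recognize. It is only an illustration of the format as quoted above: the names ErrorFields and parse_error_fields are invented here and are not part of libpq or of the proposal, and only a subset of the listed fields is kept.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the fields this sketch cares about. */
typedef struct
{
    const char *severity;       /* 'S' */
    const char *code;           /* 'C' */
    const char *message;        /* 'M' */
    const char *detail;         /* 'D' */
    const char *hint;           /* 'H' */
    const char *position;       /* 'P' */
} ErrorFields;

/*
 * Walk "x string \0 x string \0 ... \0" in buf[0..len).
 * Returns 0 when the final terminator is found, -1 if the data is malformed.
 * The pointers stored in *out point into buf; nothing is copied.
 */
int
parse_error_fields(const char *buf, size_t len, ErrorFields *out)
{
    size_t      i = 0;

    assert(buf != NULL && out != NULL);
    *out = (ErrorFields) {0};   /* every field starts out absent (NULL) */

    while (i < len)
    {
        char        id = buf[i++];
        const char *value;
        const char *end;

        if (id == '\0')
            return 0;           /* lone \0: end of the field list */

        value = buf + i;
        end = memchr(value, '\0', len - i);
        if (end == NULL)
            return -1;          /* field string is not terminated */

        switch (id)
        {
            case 'S': out->severity = value; break;
            case 'C': out->code = value; break;
            case 'M': out->message = value; break;
            case 'D': out->detail = value; break;
            case 'H': out->hint = value; break;
            case 'P': out->position = value; break;
            default:  break;    /* unrecognized field: simply ignore it */
        }
        i = (size_t) (end - buf) + 1;   /* step past the field's \0 */
    }
    return -1;                  /* ran out of data before the terminator */
}
```

The default branch is the point of the design: a frontend written against this field list keeps working if later servers send additional identifiers.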
Re: [HACKERS] Upgrading the backend's error-message infrastructure
johnn [EMAIL PROTECTED] writes:
> It would seem to make more sense to me to provide a different
> function(s) which allows the lookup Messages, Details, and Hints based
> on the SQLSTATE.

This would constrain us to have a different SQLSTATE for every error
message, which we aren't going to do. See elsewhere in thread. It's also
unclear how you insert parameters into error strings if you do this.

> - Less clutter and wasted space on the wire.

I am not really concerned about shaving bytes transmitted for an error
condition. If that's a performance-critical path for your app, you need
to rewrite the app ;-)

> - Removal of localization from error/notice generation libraries. This
>   should make that section of code simpler and more fault-tolerant.

And you put it where, instead?

The existing scheme for localization works fine AFAICT. I don't have any
interest in reinventing it (nor any chance of getting this done for 7.4,
if I were to try...)

regards, tom lane
Re: [HACKERS] Upgrading the backend's error-message infrastructure
On Fri, Mar 14, 2003 at 12:23:04PM -0500, Tom Lane wrote:
> > It would seem to make more sense to me to provide a different
> > function(s) which allows the lookup Messages, Details, and Hints
> > based on the SQLSTATE.
>
> This would constrain us to have a different SQLSTATE for every error
> message, which we aren't going to do.

That makes sense -- i was assuming a one-to-one mapping (or, at least,
many-to-one in the other direction: many SQLSTATEs for the same Unknown
error message).

I'm not sure i follow the reasoning behind allowing multiple messages for
a single SQLSTATE, though. I would think that having the machine-readable
portion of the error be the most granular would make sense. I can't
imagine the SQLSTATE space being too small for us at this point. If it's
different enough to warrant a different message, then, in my mind, it's
different enough to warrant a different SQLSTATE.

> It's also unclear how you insert parameters into error strings if you
> do this.

That's valid, but there are other ways of dealing with it. The position
in the SQL statement has been moved out to another item in the response,
so why not move the table, column, index, or whatnot into another item(s)
as well?

> > - Removal of localization from error/notice generation libraries.
> >   This should make that section of code simpler and more
> >   fault-tolerant.
>
> And you put it where, instead?

Sorry, i think i phrased that poorly. What i meant was that the functions
which provide lookups would need to be aware of locale because they're
referencing localized strings. The functions which are specifically
generating and transmitting the errors, on the other hand, would be free
of localized strings, so would not have to rely on any of the locale
infrastructure at all.

I'm not suggesting any change in the scheme for localization or anything
like that, just saying that limiting the internal access points might
make things cleaner. The usual other benefits should result as well:
simpler unit tests, easier maintenance, etc.
-joh
Re: [HACKERS] Upgrading the backend's error-message infrastructure
johnn [EMAIL PROTECTED] writes:
> If it's different enough to warrant a different message, then, in my
> mind, it's different enough to warrant a different SQLSTATE.

Unfortunately, you're at odds with the SQL spec authors, who have made
their intentions pretty clear by defining only about 130 standard
SQLSTATEs: the granularity is supposed to be coarse. To take one example,
there's just a single SQLSTATE for division by zero. One might or might
not want different messages for float vs integer zero divide, but they're
going to have the same SQLSTATE.

My feeling is that the spec authors knew what they were doing, at least
for the purposes they intended SQLSTATE to be used for. Applications want
to detect errors at a granularity corresponding to what their recovery
choices might be. For example, apps want to distinguish unique key
violation from zero divide because they probably have something they can
do about a unique-key problem. They *don't* want unique key violation to
be broken down into forty-seven subvariants (especially not
implementation-specific subvariants) because that just makes it difficult
to detect the condition reliably --- it's almost as bad as having to look
at an error message text.

We could possibly invent a unique code for each message that is separate
from SQLSTATE, but that idea was considered and rejected some time ago
for what seem to me good reasons: it adds a lot of bookkeeping/maintenance
effort for far too little return. Ultimately, the source code is the
authoritative database for the set of possible errors, and trying to put
that authority someplace else is just not worth the effort. (Besides, we
already have tools that can extract information from the source code at
need --- gettext does exactly this to prepare the NLS files.)

> > It's also unclear how you insert parameters into error strings if you
> > do this.
>
> That's valid, but there are other ways of dealing with it. The position
> in the SQL statement has been moved out to another item in the response,
> so why not move the table, column, index, or whatnot into another
> item(s) as well?

Because then the reassembly becomes the front-end's problem. This was in
fact an approach I proposed a year or two back, and it was (correctly, in
hindsight) shot down. We have multiple frontend libraries and only one
backend, so it's better to do this sort of thing once in the backend.
There is not enough payback from making each frontend have to implement
it.

There is a good reason for separating out position --- different
frontends are going to want to handle syntax-error marking differently
(consider psql vs some kind of windowed GUI). But there's no corresponding
bang for the buck in making every frontend handle localization issues.

regards, tom lane
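As an illustration of the "granularity matches recovery choices" argument, here is a small sketch in C of how an application might branch on SQLSTATE rather than on message text. The specific values are assumptions that do not come from this thread: "23505" (unique violation) and "22012" (division by zero) are the codes found in later PostgreSQL releases, and the first two characters of a SQLSTATE form its class. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical recovery choices an application might distinguish. */
typedef enum
{
    ACTION_PICK_ANOTHER_KEY,    /* caller can retry with a different key */
    ACTION_FIX_INPUT,           /* caller should reject the offending data */
    ACTION_GIVE_UP              /* nothing specific the caller can do */
} RecoveryAction;

/* True if the five-character SQLSTATE belongs to the two-character class. */
static int
sqlstate_in_class(const char *sqlstate, const char *class2)
{
    return strncmp(sqlstate, class2, 2) == 0;
}

/* Map a SQLSTATE onto one of the coarse recovery choices above. */
RecoveryAction
choose_recovery(const char *sqlstate)
{
    assert(strlen(sqlstate) == 5);

    if (strcmp(sqlstate, "23505") == 0)     /* assumed: unique violation */
        return ACTION_PICK_ANOTHER_KEY;
    if (sqlstate_in_class(sqlstate, "22"))  /* data exceptions, e.g. 22012 */
        return ACTION_FIX_INPUT;
    return ACTION_GIVE_UP;
}
```

Note that the test for division by zero never looks at whether the message says "integer" or "float": one code, possibly several message texts, which is the separation the thread is arguing for.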
[HACKERS] Upgrading the backend's error-message infrastructure
(Or, protocol upgrade phase 1...)

After digging through our many past discussions of what to do with error
messages, I have put together the following first-cut proposal. Fire at
will...


Objective
---------

The basic objective here is to divide error reports into multiple fields,
and in particular to include an error code field that gives applications
a stable value to test against when they're trying to find out what went
wrong. (I am not spending much space in this proposal on the question of
exactly what set of error codes we ought to have, but that comes soon.)
Peter Eisentraut argued cogently awhile back that the error codes ought
not be hard-wired to specific error message texts, so this proposal
treats them as separate entities.


Wire-protocol changes
---------------------

Error and Notice (maybe also Notify?) msgs will have this structure:

    E x string \0 x string \0 x string \0 \0

where the x's are single-character field identifiers. A frontend should
simply ignore any unrecognized fields. Initially defined fields for Error
and Notice are:

S   Severity --- the string is ERROR, FATAL, or PANIC (if E msg) or
    WARNING, NOTICE, DEBUG, INFO, or LOG (if N msg). (Should this string
    be localizable? Probably, assuming that the E/N distinction is all
    the client library really cares about.)
C   Code --- SQLSTATE code for error (a 5-character string per SQL spec).
    Not localizable.
M   Message --- the string is the primary error message (localized).
D   Detail --- secondary error message, carrying more detail about the
    problem (localized).
H   Hint --- a suggestion what to do about the error (localized).
P   Position --- the string is a decimal ASCII integer, indicating an
    error cursor position as an index into the original query string.
    First character is index 1. Q: measure index in bytes, or characters?
    Latter seems preferable considering that an encoding conversion may
    have occurred.
F   File --- file name of source-code location where error was reported
    (__FILE__)
L   Line # --- line number of source-code location (__LINE__)
R   Routine --- source code routine name reporting error (__func__ or
    __FUNCTION__)

S,C,M fields will always appear (at least in Error messages; perhaps
Notices might omit C?). The rest are optional.

Why three textual message fields? 'M' should always appear, 'D' and 'H'
are optional (and relatively rare). The convention is that the primary
'M' message should be accurate but terse (normally one line); if more
info is needed than can reasonably fit on a line, use the detail message
to carry additional lines. A hint is something that doesn't directly
describe the error, but is a suggestion what to do to get around it. 'M'
and 'D' should be factual, whereas 'H' may contain some guesswork, or
advice that might not always apply. Client interfaces are expected to
report 'M', but might suppress 'D' and/or 'H' depending on factors such
as screen space. (Preferably they should have a verbose mode that shows
all available info, though.)


Error codes
-----------

The SQL spec defines a set of 5-character status codes (called SQLSTATE
values). We'll use these as the language-independent identifiers for
error conditions. There is code space reserved by the spec for
implementation-defined error conditions, which we'll surely need.

Per spec, each of the five characters in a SQLSTATE code must be a digit
'0'-'9' or an upper-case Latin letter 'A'-'Z'. So it's possible to fit a
SQLSTATE code into a 32-bit integer with some simple encoding
conventions. I propose that we use such a representation in the backend;
that is, instead of passing around strings like "1200D" we pass around
integers formed like

    ((('1' - '0') << 6) + '2' - '0') << 6 ...

This should save a useful amount of space per elog call site, and it
won't obscure the code noticeably since all the common values will be
represented as macro names anyway, something like

    #define ERRCODE_DIVISION_BY_ZERO  MAKE_SQLSTATE('2','2', '0','1','2')

We need to do some legwork to figure out what set of
implementation-defined error codes we want. It might make sense to look
and see what other DBMSes are using.


Backend source-code representation for extended error messages
--------------------------------------------------------------

How do we generalize the elog() interface to cope with all this stuff? I
don't think I want a function with a fixed parameter list --- some sort
of open-ended API would be a lot more forward-looking. After some fooling
around I've come up with the following proposal.

A typical elog() call might be replaced by

    ereport(ERROR, ERRCODE_INTERNAL,
            errmsg("Big trouble with table %s", name),
            errhint("Bail out now, boss"));

ERROR is the severity level, same as before, and ERRCODE_xxx is (a macro
for) the
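The six-bit packing proposed under "Error codes" earlier in this message can be sketched as follows. This is an illustration under stated assumptions, not the backend's eventual macro: the names SIXBIT, MAKE_SQLSTATE_SKETCH and unpack_sqlstate are made up here, and an ASCII-compatible character set is assumed so that '0'-'9' and 'A'-'Z' map to values below 64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Map one SQLSTATE character to a 6-bit value ('0' -> 0, 'Z' -> 42). */
#define SIXBIT(ch)  ((uint32_t) ((ch) - '0') & 0x3F)

/* Pack five characters, first character in the highest bits (30 bits used). */
#define MAKE_SQLSTATE_SKETCH(c1, c2, c3, c4, c5) \
    ((SIXBIT(c1) << 24) + (SIXBIT(c2) << 18) + (SIXBIT(c3) << 12) + \
     (SIXBIT(c4) << 6) + SIXBIT(c5))

/* Hypothetical macro name in the style the proposal suggests. */
#define ERRCODE_DIVISION_BY_ZERO_SKETCH \
    MAKE_SQLSTATE_SKETCH('2', '2', '0', '1', '2')

/* Turn a packed code back into its five-character string form. */
void
unpack_sqlstate(uint32_t code, char out[6])
{
    int         i;

    assert(out != NULL);
    for (i = 4; i >= 0; i--)
    {
        out[i] = (char) ((code & 0x3F) + '0');  /* lowest 6 bits = last char */
        code >>= 6;
    }
    out[5] = '\0';
}
```

Because the packed value is a compile-time constant, such macros can be used in switch statements and cost four bytes per call site, whereas a string constant would need six bytes plus a pointer.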
Re: [HACKERS] Upgrading the backend's error-message infrastructure
On Thu, 2003-03-13 at 15:51, Tom Lane wrote:
> After digging through our many past discussions of what to do with error
> messages, I have put together the following first-cut proposal.

Great work, Tom!

While we're effectively changing every elog call site in the backend,
would it also be a good idea to adopt a standard for the format of error
messages? (e.g. capitalization, grammar, etc.)

> extern int errmsg_internal(const char *fmt, ...);
>
> Like errmsg() except that the first parameter is not subject to
> gettext-ification. My thought is that this would be used for internal
> can't-happen conditions; there's no need to make translators labor over
> translating stuff like "eval_const_expressions: unexpected boolop %d",
> nor even to make them think about whether they need to.

If we wanted to get fancy, we could make use of the glibc ability to
generate a back trace programmatically:

http://www.gnu.org/manual/glibc-2.2.5/html_node/Backtraces.html#Backtraces

> In gcc-compiled backends, the function name will be provided
> automatically by errstart, but there will be some places where we need
> the name to be available even in a non-gcc build.

To be honest, I'd be sceptical whether there are enough platforms without
*either* gcc or a C99 compiler that it's worthwhile worrying about them
that much (all that is at stake is some backward compatibility, anyway).

Cheers,

Neil

--
Neil Conway [EMAIL PROTECTED] || PGP Key ID: DB3C29FC
Re: [HACKERS] Upgrading the backend's error-message infrastructure
Neil Conway [EMAIL PROTECTED] writes:
> While we're effectively changing every elog call site in the backend,
> would it also be a good idea to adopt a standard for the format of error
> messages? (e.g. capitalization, grammar, etc.)

Yup. I was planning to bring that up as a separate thread. I think Peter
has already put some thought into it, but I couldn't find anything in the
archives...

> If we wanted to get fancy, we could make use of the glibc ability to
> generate a back trace programmatically:

Hmm ... maybe. Certainly we all too often ask people to get this info by
hand ... too bad it only works in glibc though.

> > In gcc-compiled backends, the function name will be provided
> > automatically by errstart, but there will be some places where we need
> > the name to be available even in a non-gcc build.
>
> To be honest, I'd be sceptical whether there are enough platforms
> without *either* gcc or a C99 compiler that it's worthwhile worrying
> about them that much (all that is at stake is some backward
> compatibility, anyway).

I'm only planning to bother with the errfunction hack for messages that I
know are being specifically tested for by existing frontends. ecpg looks
for "PerformPortalFetch" messages, for example. If we don't keep that
name in the (old version of the) error message then we have a
compatibility problem. But I do want to move away from having function
names in the primary error message text.

regards, tom lane
Re: [HACKERS] Upgrading the backend's error-message infrastructure
> Comments?

All the error stuff sounds really neat. I volunteer for doing lots of
elog changes when the time comes.

Would it be possible to do a command line app?

    bash$ pg_error 1200D
    Severity: ERROR
    Message: Division by zero
    Detail:
    Hint: Modify statement to prevent zeros appearing in denominators.

So people can look up errors offline (oracle-style).

Chris
Re: [HACKERS] Upgrading the backend's error-message infrastructure
> Great work, Tom!
>
> While we're effectively changing every elog call site in the backend,
> would it also be a good idea to adopt a standard for the format of error
> messages? (e.g. capitalization, grammar, etc.)

I 100% agree with this - a style guide!

Chris
Re: [HACKERS] Upgrading the backend's error-message infrastructure
On Thu, 2003-03-13 at 21:16, Christopher Kings-Lynne wrote:
> Would it be possible to do a command line app?
>
>     bash$ pg_error 1200D
>     Severity: ERROR
>     Message: Division by zero
>     Detail:
>     Hint: Modify statement to prevent zeros appearing in denominators.

Is there any benefit to having this over just including an index of error
codes in the documentation?

Cheers,

Neil

--
Neil Conway [EMAIL PROTECTED] || PGP Key ID: DB3C29FC
Re: [HACKERS] Upgrading the backend's error-message infrastructure
> On Thu, 2003-03-13 at 21:16, Christopher Kings-Lynne wrote:
> > Would it be possible to do a command line app?
> >
> >     bash$ pg_error 1200D
> >     Severity: ERROR
> >     Message: Division by zero
> >     Detail:
> >     Hint: Modify statement to prevent zeros appearing in denominators.
>
> Is there any benefit to having this over just including an index of
> error codes in the documentation?

It's quick and easy, especially when there's thousands of error codes.
Ideally, the pg_error app and the error code documentation should be
automatically generated...

You could have a built-in function, pg_print_error(text) returns text;
the pg_error command line program could then just call that, and the user
could look up errors from within postgresql as well...

Chris
Re: [HACKERS] Upgrading the backend's error-message infrastructure
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
> Would it be possible to do a command line app?
>
>     bash$ pg_error 1200D
>     Severity: ERROR
>     Message: Division by zero
>     Detail:
>     Hint: Modify statement to prevent zeros appearing in denominators.

You're assuming that there's a one-to-one mapping of error codes to
messages, which is not likely to be the case --- for example, all the
"can't happen" errors will probably get lumped together under a single
"internal error" error code. You could provide a lookup of the
spec-defined meaning of each error code, maybe.

> > Is there any benefit to having this over just including an index of
> > error codes in the documentation?
>
> It's quick and easy, especially when there's thousands of error codes.

But there aren't. I count about 130 SQLSTATEs defined by the spec.
Undoubtedly we'll make more for Postgres-specific errors, but not hundreds
more. There's just not value to applications in distinguishing errors at
such a fine grain.

regards, tom lane
Re: [HACKERS] Upgrading the backend's error-message infrastructure
--On Thursday, March 13, 2003 15:51:00 -0500 Tom Lane [EMAIL PROTECTED] wrote:
> (__FUNCTION__ is only used if we are compiling in gcc). errstart()
> pushes an empty entry onto an error-data-collection stack and fills in
> the behind-the-scenes file/line entries. errmsg() and friends stash
> values into the top-level stack entry. Finally errfinish() assembles and
> emits the completed message, then pops the stack. By using a stack, we
> can be assured that things will work correctly if a message is logged by
> some subroutine called in the parameters to ereport (not too unlikely
> when you think about formatting functions like format_type_be()).

__FUNCTION__ or an equivalent is MANDATED by C99, and available on
UnixWare's native cc. You might want to make a configure test for it.

I believe __func__ is the C99 spelling (that's what's available on
UnixWare):

    $ cc -O -o testfunc testfunc.c
    $ ./testfunc
    function=main,file=testfunc.c,line=4
    $ cat testfunc.c
    #include <stdio.h>
    int main(int argc,char **argv)
    {
    printf("function=%s,file=%s,line=%d\n",__func__,__FILE__,__LINE__);
    }
    $

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
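The stack behaviour quoted at the top of this message (errstart pushes an entry, errmsg fills the top entry, errfinish emits and pops) can be illustrated with a toy model. This is a simplified sketch, not the proposed backend code: every name ending in _sketch, the fixed stack depth, and the single message field are assumptions made for illustration. It shows why a report raised by a helper called inside the argument list does not clobber the outer report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ERRSTACK_MAX 5          /* arbitrary depth for this sketch */

typedef struct
{
    const char *file;           /* filled in behind the scenes */
    int         line;
    const char *message;        /* stashed by errmsg_sketch() */
} ErrEntry;

static ErrEntry err_stack[ERRSTACK_MAX];
static int  err_depth = 0;      /* number of reports currently in progress */
static int  emitted_count = 0;  /* number of reports completed so far */
static char last_message[128];  /* text of the most recently emitted report */

/* Push an empty entry and record the call site. */
static void
errstart_sketch(const char *file, int line)
{
    assert(err_depth < ERRSTACK_MAX);
    err_stack[err_depth].file = file;
    err_stack[err_depth].line = line;
    err_stack[err_depth].message = "";
    err_depth++;
}

/* Stash the message text into the top-of-stack entry. */
static int
errmsg_sketch(const char *msg)
{
    assert(err_depth > 0);
    err_stack[err_depth - 1].message = msg;
    return 0;                   /* dummy value so the call can be an argument */
}

/* Emit the completed top entry, then pop it. */
static void
errfinish_sketch(int dummy)
{
    ErrEntry   *e = &err_stack[err_depth - 1];

    (void) dummy;
    snprintf(last_message, sizeof last_message, "%s", e->message);
    fprintf(stderr, "%s:%d: %s\n", e->file, e->line, e->message);
    emitted_count++;
    err_depth--;
}

/* The comma operator guarantees the push happens before the arguments run. */
#define REPORT_SKETCH(msgcall) \
    (errstart_sketch(__FILE__, __LINE__), errfinish_sketch(msgcall))

/* A formatting helper that itself reports something while it runs. */
static const char *
describe_type_sketch(int type_id)
{
    if (type_id < 0)
        REPORT_SKETCH(errmsg_sketch("unknown type id"));
    return "integer";
}

/* The outer report calls the helper inside its own argument list. */
void
demo_nested_report(void)
{
    REPORT_SKETCH(errmsg_sketch(describe_type_sketch(-1)));
}
```

Calling demo_nested_report() emits the helper's report first and the outer report second, and leaves the stack empty. With a single static buffer instead of a stack, the inner report would have overwritten the outer entry's file and line before the outer message was emitted.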
Re: [HACKERS] Upgrading the backend's error-message infrastructure
Larry Rosenman [EMAIL PROTECTED] writes:
> __FUNCTION__ or an equivalent is MANDATED by C99, and available on
> UnixWare's native cc. You might want to make a configure test for it.

Right, __func__ is the C99 spelling. I did have a configure test in mind
here: __func__ or __FUNCTION__ or NULL is what would get compiled in.

One nice thing about this approach is that we need change only one place
to adjust the set of behind-the-scenes error parameters.

regards, tom lane
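The "__func__ or __FUNCTION__ or NULL" fallback described above might look roughly like the following in C. This is only a sketch: the thread talks about a configure test, whereas this version substitutes the compiler's predefined macros (__STDC_VERSION__, __GNUC__) for the configure result, and the names FUNCNAME_SKETCH and report_site_sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pick the best available spelling; a configure test would set this instead. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define FUNCNAME_SKETCH __func__        /* C99 spelling */
#elif defined(__GNUC__)
#define FUNCNAME_SKETCH __FUNCTION__    /* older gcc spelling */
#else
#define FUNCNAME_SKETCH NULL            /* no function name available */
#endif

/*
 * Returns the name of this routine as the compiler reports it, or NULL
 * when the compiler offers neither spelling.
 */
const char *
report_site_sketch(void)
{
    assert(sizeof(char) == 1);  /* trivial sanity check; always true in C */
    return FUNCNAME_SKETCH;
}
```

Since the choice lives in one macro, call sites never mention either spelling directly, which matches the remark that only one place needs to change to adjust the behind-the-scenes parameters.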
Re: [HACKERS] Upgrading the backend's error-message infrastructure
--On Thursday, March 13, 2003 16:20:21 -0500 Tom Lane [EMAIL PROTECTED] wrote:
> Larry Rosenman [EMAIL PROTECTED] writes:
> > __FUNCTION__ or an equivalent is MANDATED by C99, and available on
> > UnixWare's native cc. You might want to make a configure test for it.
>
> Right, __func__ is the C99 spelling. I did have a configure test in mind
> here: __func__ or __FUNCTION__ or NULL is what would get compiled in.
>
> One nice thing about this approach is that we need change only one place
> to adjust the set of behind-the-scenes error parameters.

Ok, you had said GCC only. Please do use the configure test, and __func__
if it's available.

Thanks,
LER

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749