Hi, I wanted to ask if the issue of URIs as source file names in standard GNU-style error messages has ever come up, and what guidance you might be able to provide on how an application should deal with URIs (containing colon characters) as source file names.
I also wanted to ask whether the "Formatting Error Messages" section of the GNU Coding Standards might be updated to specify how error messages with URIs as source file names should be formatted.

  http://www.gnu.org/prep/standards/standards.html#Errors

To be more specific: the "Formatting Error Messages" section of the current GNU Coding Standards specifies the following basic format for error messages:

  source-file-name:lineno: message

And, for an error message that reports both line and column numbers, either of the following:

  source-file-name:lineno:column: message
  source-file-name:lineno.column: message

The issue with that format is: what should an application be expected to do if the "source-file-name" part contains one or more colon characters -- which it will if it is, say, an HTTP or FTP URI? For example:

  ftp://www.w3.org/foo/bar.html:5: Error; Attribute "charset" not allowed on element "meta" at this point.

For that particular example, the current spec would seem to indicate that an application should expect that the "source-file-name" part is "ftp", the line number is "//www.w3.org/foo/bar.html", and the column number is 5 -- or else that the line number is "//www.w3.org/foo/bar" and the column number is "html", with an extra ":5" thrown in.

There's a further issue if the URI contains more than one colon -- which it could if it were a URI for a remote file on an HTTP server running on a non-standard port; for example:

  http://www.w3.org:8080:5: Error; Attribute "charset" not allowed on element "meta" at this point.

So it seems like it might be beneficial for the GNU Coding Standards to specify a standard way to indicate that the source-file-name part of the error message is a URI instead of a local file.
For example: If the "source-file-name" part is a URI instead of a local path, the error message should use angle brackets to delimit the URI:

  <URI>:lineno: message

So an example error message would look like this:

  <http://www.w3.org:8080/foo/bar.html>:5: Error; Attribute "charset" not allowed on element "meta" at this point.

Applications such as Emacs that have built-in capabilities for parsing GNU-formatted error messages could then be updated to handle the URI case by recognizing the angle brackets.

As for the use case and rationale behind this, consider applications that may accept as input (either directly or indirectly) not just files on a local filesystem but also files/resources at remote locations, with such remote locations being specified by a URI. By "indirectly" I mean that even when an application is processing a file on the local filesystem, that file might reference other files through include/import statements and the like, and such a statement might reference a remote resource/file using a URI. This is, for example, a common occurrence in XSLT stylesheets.

I realize that the "Formatting Error Messages" section was originally intended to define how compilers, specifically, should format their error messages (it in fact starts out by saying "Error messages from compilers should look like this"), and that compilers traditionally are used to compile source files that are on the same local filesystem, not at remote locations. But I think the GNU error format is now used across a wide range of applications -- not just by compilers -- and in particular by applications that do need to report errors in remote files (by giving their URIs).

So I think it would be appropriate and beneficial for the GNU Coding Standards to specify a standard way of formatting error messages for errors in files at remote URIs.
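To make the parsing difference concrete, here is a rough Python sketch of how a tool might read both forms. It is only an illustration under my own assumptions: the regular expressions and the names PLAIN, BRACKETED, and parse_error are hypothetical and are not taken from the GNU Coding Standards or from Emacs.

```python
import re

# Assumed pattern for the existing format:
#   source-file-name:lineno[:column | .column]: message
# The non-greedy file group ends at the first ":<digits>" that lets the
# rest of the line match, which is where URIs with ports go wrong.
PLAIN = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+)(?:[:.](?P<col>\d+))?: (?P<msg>.*)$"
)

# Assumed pattern for the proposed format:
#   <URI>:lineno[:column | .column]: message
# The angle brackets mark exactly where the source name starts and ends.
BRACKETED = re.compile(
    r"^<(?P<file>[^<>]+)>:(?P<line>\d+)(?:[:.](?P<col>\d+))?: (?P<msg>.*)$"
)


def parse_error(text):
    """Return (source, line, column or None, message), or None if no match.

    The bracketed form is tried first, then the plain form.
    """
    m = BRACKETED.match(text) or PLAIN.match(text)
    if m is None:
        return None
    col = m.group("col")
    return (
        m.group("file"),
        int(m.group("line")),
        int(col) if col is not None else None,
        m.group("msg"),
    )
```

With this sketch, the unbracketed message "http://www.w3.org:8080:5: Error; ..." parses as source "http://www.w3.org", line 8080, column 5, which is the misreading described above. The bracketed message "<http://www.w3.org:8080/foo/bar.html>:5: Error; ..." parses as source "http://www.w3.org:8080/foo/bar.html", line 5, with no column. One limitation of the sketch: a local file whose name itself starts with "<" would be treated as the bracketed form.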
--Mike -- Michael(tm) Smith http://people.w3.org/mike/ http://sideshowbarker.net/
