Re: LCLint 3.0.0.17 parse problem
Richard,

>I can find nothing in either C standard that requires a C system to
>support blanks at the ends of lines. I therefore deny that the translation
>unit above *is* strictly conforming. (Nothing in the C standard requires
>a C system to support curly braces, either. Hence trigraphs and digraphs.)

I think these somewhat unusual views are best argued out on comp.std.c.

derek

--
Derek M Jones                               tel: +44 (0) 1252 520 667
Knowledge Software Ltd                      mailto:[EMAIL PROTECTED]
Applications Standards Conformance Testing  http://www.knosof.co.uk
Re: LCLint 3.0.0.17 parse problem
I observed that

>As has often been pointed out in comp.std.c, there is NOTHING in any C or
>C++ standard to forbid a compiler writer defining the end of line indicator
>to be "end of record, preceded by any number of blanks". There is NO
>requirement whatsoever that the "end of line indicator" be just the physical
>end of record.

Derek M Jones <[EMAIL PROTECTED]> wrote in reply:

> True. But the compiler has to handle any strictly conforming program.

Agreed.

> I can write a strictly conforming program (using macros and stringizing)
> where a backslash followed by blanks, followed by end-of-record occurs.

Let's see it! I've tried to think how it might be done. AH!

    preprocessing-token:
        header-name
        identifier
        pp-number
        character-constant
        string-literal
        punctuator
        each non-white-space character that cannot be one of the above*

So

    #define bar(x) #x
    #define foo(x) bar(x)
    #define fred~\
    char *x = foo(fred);
=>
    char *x = "~\";

[lcc and SPARCompiler cc like this, gcc doesn't.]

But if we change fred to

    #define fred~\

then we get

    char *x = "~";

> A compiler that unconditionally turned this into a line splice would
> be faulty.

Well, no. As I've said, a compiler is at liberty to define an end of line indicator however it wants to. Such a compiler would not "turn this into a line splice", because it would be a compiler in which blanks at the end of a line (logically) DID NOT EXIST.

I can find nothing in either C standard that requires a C system to support blanks at the ends of lines. I therefore deny that the translation unit above *is* strictly conforming. (Nothing in the C standard requires a C system to support curly braces, either. Hence trigraphs and digraphs.)

The output of the cc and lcc compilers is arguably incorrect; a \ is supposed to be inserted before each " or \ in the replacement text of the parameter. I think it should have been "~\\", not "~\".

> The purpose of my reference to the #if macro DR was to point out that
> such a macro could not be used by a strictly conforming program within
> a #if directive.

The reason why foo(fred) can't be used within an #if is that it yields a string, and strings can't appear in #if. None of the macro _calls_ involves an embedded newline.

So far, the only way I know of that this can show up is a combination of
 - stringizing, and
 - the fact that ANY character not otherwise forming part of a token
   is allowed as a pp-token.
I think the last rule there is a bad one. gcc -E produces workable output for this example, but gcc -c refuses to compile it.

I don't think any realistic working code is likely to be broken by a compiler that handles fixed records this way.
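For reference, here is a minimal C sketch of the stringizing rules under discussion, limited to cases that conforming compilers should agree on. The macro names mirror bar/foo/fred in upper case, and check_stringize is a hypothetical helper, not something from the thread. It relies on two points from the standard's description of the # operator: an argument passed through an outer macro is expanded before # is applied, and a \ is inserted before each " and \ that appears inside a string literal or character constant in the argument. A lone trailing backslash outside any literal, as in the fred example, is not covered by that insertion rule, which may be why the compilers mentioned above disagree.

```c
#include <assert.h>
#include <string.h>

#define BAR(x) #x          /* stringize the argument exactly as spelled */
#define FOO(x) BAR(x)      /* expand the argument first, then stringize */
#define FRED ~             /* a punctuator, with no trailing backslash */

/* Hypothetical self-check for the three well-defined cases. */
void check_stringize(void)
{
    /* FRED is expanded to ~ before BAR applies #. */
    assert(strcmp(FOO(FRED), "~") == 0);

    /* Applied directly, # suppresses expansion of its operand. */
    assert(strcmp(BAR(FRED), "FRED") == 0);

    /* Inside a string literal, " and \ each gain a leading backslash,
       so the result holds the five characters  " a \ n "  */
    assert(strcmp(BAR("a\n"), "\"a\\n\"") == 0);
}
```

Calling check_stringize() from a test program should pass silently on a conforming implementation.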
Re: LCLint 3.0.0.17 parse problem
Richard,

>So what is an end-of-line indicator? The standard just says
>"In source files, there shall be SOME way of indicating the end of
>each line of text; this International Standard treats such an
>end-of-line indicator as if it were a single new-line character."
>
>As has often been pointed out in comp.std.c, there is NOTHING in any C or
>C++ standard to forbid a compiler writer defining the end of line indicator
>to be "end of record, preceded by any number of blanks". There is NO
>requirement whatsoever that the "end of line indicator" be just the physical
>end of record.

True. But the compiler has to handle any strictly conforming program. I can write a strictly conforming program (using macros and stringizing) where a backslash followed by blanks, followed by end-of-record occurs. A compiler that unconditionally turned this into a line splice would be faulty.

The purpose of my reference to the #if macro DR was to point out that such a macro could not be used by a strictly conforming program within a #if directive. So in this special case a #if followed by backslash, followed by blanks could only be treated as a syntax error, or a line splice.

derek
Re: LCLint 3.0.0.17 parse problem
Derek M Jones <[EMAIL PROTECTED]> wrote:

> #if is a special case in that it is not possible to split macro
> invocations across it, as answered by the following C90 Defect Report:
> http://anubis.dkuug.dk/JTC1/SC22/WG14/www/docs/dr_017.html

Thank you for this reference. But what it says is that #if f(1, 2) is not defined. It doesn't say anything about backslash. The correction says it is not allowed: preprocessing directives are _first_ terminated by newline and _then_ macro-expanded if appropriate.

> It looks like you have found a compiler that, strictly speaking, is
> making use of an extension. Or IBM wiggle a bit and point out that
> their compiler's undefined behaviour on encountering this kind of syntax
> error (which is what it is) is to treat it as a line splice.

The C99 standard says

    [translation phase 1]
    "Physical source file multibyte characters are mapped to the source
    character set (introducing new-line characters for end-of-line
    indicators) if necessary. Trigraph sequences are replaced by
    corresponding single-character internal representations."

Backslash/newline splicing takes place in phase 2, AFTER end-of-line indicators have been replaced.

So what is an end-of-line indicator? The standard just says

    "In source files, there shall be SOME way of indicating the end of
    each line of text; this International Standard treats such an
    end-of-line indicator as if it were a single new-line character."

As has often been pointed out in comp.std.c, there is NOTHING in any C or C++ standard to forbid a compiler writer defining the end of line indicator to be "end of record, preceded by any number of blanks". There is NO requirement whatsoever that the "end of line indicator" be just the physical end of record.

In the spirit of "be strict about what you generate, forgiving about what you accept", the best thing for whoever wrote the C compiler in question to do would be to be very careful to put backslashes in the rightmost column of their header files, but accept any number of spaces between a backslash and end of record.

Note that there is nothing to stop a UNIX C compiler defining ::= ( | )* ? (This could be quite useful if one were trying to compile from a source file on a floppy disc written on a Win/DOS system.)

> You will either have to modify the preprocessor in LCLint,

LCLint really ought to handle this. Spaces after a backslash should be recognised as something that the standard DOES allow (if a compiler writer cares to define "end-of-line indicator" appropriately) but does not require, so is a definite porting problem.

> or make a local copy of the offending header and edit around the problem.

The quickest workaround.
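The "forgiving about what you accept" approach can be sketched in C as a phase 2 splice that tolerates blanks between the backslash and the new-line. This is only an illustration: splice_lenient is a hypothetical helper, not LCLint's preprocessor code, and it assumes ASCII text in which end-of-line indicators have already been mapped to '\n'. It deliberately goes beyond a strict reading, under which only a backslash immediately followed by a new-line is a splice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy `in` to `out`, deleting every backslash that is
   followed only by spaces or tabs and then '\n', together with those blanks
   and the new-line. `out` must have room for strlen(in) + 1 bytes.
   Returns the number of characters written, excluding the terminator. */
size_t splice_lenient(const char *in, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in != '\0') {
        if (*in == '\\') {
            const char *p = in + 1;
            while (*p == ' ' || *p == '\t')
                p++;                    /* skip blanks after the backslash */
            if (*p == '\n') {
                in = p + 1;             /* drop backslash, blanks, new-line */
                continue;
            }
        }
        out[n++] = *in++;               /* anything else is copied as is */
    }
    out[n] = '\0';
    return n;
}
```

A backslash followed by any other character, including one in the middle of a line, is copied through unchanged, so only the padded-record case is affected.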
Re: LCLint 3.0.0.17 parse problem
I did some poking around with a hex editor in the offending header file, and this is what I found:

File: features.h    EBCDIC    Offset: 0xB6E0 / 0x0001B4A3 (%42)

B690  15 40 40 40 40 40 40 7B 89 86 40 4D 5A 84 85 86  #if (!def
B6A0  89 95 85 84 4D 6D C1 D3 D3 6D E2 D6 E4 D9 C3 C5  ined(_ALL_SOURCE
B6B0  5D 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40  )
B6C0  40 40 E0 40 40 40 40 40 40 40 40 40 40 40 40 40  \
B6D0  40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40
B6E0  40 15 40 40 40 40 40 40 40 50 50 40 40 5A 84 85  && !de
B6F0  86 89 95 85 84 4D 6D D6 D7 C5 D5 6D E2 D6 E4 D9  fined(_OPEN_SOUR
B700  C3 C5 5D 40 40 40 40 40 40 40 40 40 40 40 40 40  CE)
B710  40 40 40 E0 40 40 40 40 40 40 40 40 40 40 40 40  \
B720  40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40
B730  40 40 15 40 40 40 40 40 40 40 50 50 40 40 5A 84  && !d
B740  85 86 89 95 85 84 4D 6D D6 D7 C5 D5 6D E2 E8 E2  efined(_OPEN_SYS
B750  5D 5D 40 40 40 40 40 40 40 40 40 40 40 40 40 40  ))
B760  40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40
B770  40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40
B780  40 40 40 15 40 40 40 40 40 40 40 40 40 40 40 40
B790  40 40 40 40 40 40 40 40 40 40 40 40 40 40 40 40

0x15 is the EBCDIC newline character, 0x40 is the space character. It seems that there are an awful lot of spaces between the backslash char (0xE0) and the newline.

On another note, please excuse my ignorance of optimizing compilers and the like. I'm just starting out, and learning just how much I need to learn.

Anthony Giorgio
DBX Developer
Phone: (845) 435-9115
Tie Line: 295-9115
Email: agiorgio AT us.ibm.com

>> Does the file system pad lines with spaces?
>
>Yes it does! It seems that the file is very similar to the old-style IBM
>punch cards, where everything had 80 columns, and anything that wasn't
>filled in was a space. The file is filled out to column 81 with spaces,
>and the \ is in column 66. Shouldn't lclint just ignore the whitespace
>following the trailing \?

> You might like to poke around using a binary editor to see what
> representation is used by IBM.
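The check done by eye in the hex dump can be sketched as a small C routine. The three code points are the ones reported in the dump (0x15 new-line, 0x40 space, 0xE0 backslash); the function name splice_padding and its interface are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Code points as reported in the hex dump above. */
enum { EBCDIC_NL = 0x15, EBCDIC_SPACE = 0x40, EBCDIC_BACKSLASH = 0xE0 };

/* Hypothetical helper: for one record `rec` of `len` bytes (the 0x15
   new-line byte excluded), return how many EBCDIC spaces follow a trailing
   backslash, or -1 if the last non-space byte is not a backslash. */
long splice_padding(const unsigned char *rec, size_t len)
{
    size_t end = len;

    assert(rec != NULL);
    while (end > 0 && rec[end - 1] == EBCDIC_SPACE)
        end--;                          /* walk back over the padding */
    if (end == 0 || rec[end - 1] != EBCDIC_BACKSLASH)
        return -1;                      /* record does not end in a splice */
    return (long)(len - end);
}
```

A result greater than zero identifies a record where a strict "backslash immediately followed by new-line" test would fail even though a continuation was clearly intended.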
Re: LCLint 3.0.0.17 parse problem
The thing about \ is that each C implementation gets to define its own line termination sequence. I may be misreading the standard (actually a draft; I have bought three paper copies of the C89 standard and they have _all_ walked out of my office over the years) but as far as I can see there is no reason why an IBM mainframe C compiler couldn't just say that the character sequence constrained by the C standard is

    data Card = Card [Char]  -- a record with possible space padding

    to_standard :: [Card] -> [Char]
    to_standard [] = []
    to_standard (Card card : cards) =
        trim_right card ++ "\n" ++ to_standard cards

    trim_right :: [Char] -> [Char]
    trim_right line = reverse (dropWhile (<= ' ') (reverse line))

That is, for the purpose of \, there is no reason why the padding spaces needed for fixed length records have to count as existing.

> In most cases it could ignore whitespace after a \, provided there
> were only trailing whitespaces. It is possible to come up with
> programs that rely on the line being a \ followed by whitespace,
> for instance, and not being a line splice. Such cases are rare,
> but that does not mean an implementation is free to change the
> requirements specified in the standard.

True, but an implementation IS free to define how the character sequence constrained by the standard is computed from raw bytes. Since line splicing is in one of the earliest translation phases, it would be interesting to see a legal example where \ was allowed by the standard.
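The same record-to-line mapping can be sketched in C, for readers who prefer it to the Haskell above. records_to_lines is a hypothetical name, the records are assumed to be ASCII and of a single fixed length, and, like trim_right, it treats every trailing character less than or equal to ' ' as padding. On an EBCDIC system the space is 0x40, so the comparison would need adjusting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring to_standard/trim_right: read `nrec` records
   of `reclen` bytes each from `src`, drop trailing characters <= ' ' from
   every record, and write each one to `dst` followed by '\n'.
   `dst` must have room for nrec * (reclen + 1) + 1 bytes.
   Returns the number of characters written, excluding the terminator. */
size_t records_to_lines(const char *src, size_t nrec, size_t reclen, char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    for (size_t r = 0; r < nrec; r++) {
        const char *rec = src + r * reclen;
        size_t len = reclen;

        while (len > 0 && (unsigned char)rec[len - 1] <= ' ')
            len--;                      /* padding does not count as existing */
        memcpy(dst + out, rec, len);
        out += len;
        dst[out++] = '\n';              /* the end-of-line indicator */
    }
    dst[out] = '\0';
    return out;
}
```

After this mapping a backslash in the last used column is immediately followed by a new-line, so an ordinary phase 2 line splice applies without any special handling.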
Re: LCLint 3.0.0.17 parse problem
On Fri, Sep 28, 2001 at 04:01:40PM +, Derek M Jones wrote:
> Anthony,
>
> >> Does the file system pad lines with spaces?
> >
> >Yes it does! It seems that the file is very similar to the old-style IBM
> >punch cards, where everything had 80 columns, and anything that wasn't
> >filled in was a space. The file is filled out to column 81 with spaces,
> >and the \ is in column 66. Shouldn't lclint just ignore the whitespace
> >following the trailing \?
>
> In most cases it could ignore whitespace after a \, provided there
> were only trailing whitespaces. It is possible to come up with
> programs that rely on the line being a \ followed by whitespace,
> for instance, and not being a line splice. Such cases are rare,
> but that does not mean an implementation is free to change the
> requirements specified in the standard.
[...]

Well, in principle it's not OK to ignore them. The C standard clearly says that "a '\' immediately followed by a new-line character" fulfills the desired job.

So for those 'strange' systems, what is lclint supposed to do in your opinion? Would the only option be to use the system's preprocessor?

--
Alexander Mai
[EMAIL PROTECTED]
Re: LCLint 3.0.0.17 parse problem
Anthony,

>> Does the file system pad lines with spaces?
>
>Yes it does! It seems that the file is very similar to the old-style IBM
>punch cards, where everything had 80 columns, and anything that wasn't
>filled in was a space. The file is filled out to column 81 with spaces,
>and the \ is in column 66. Shouldn't lclint just ignore the whitespace
>following the trailing \?

In most cases it could ignore whitespace after a \, provided there were only trailing whitespaces. It is possible to come up with programs that rely on the line being a \ followed by whitespace, for instance, and not being a line splice. Such cases are rare, but that does not mean an implementation is free to change the requirements specified in the standard.

Some implementations that have to exist on file systems that pad lines with spaces use an alternative representation for \. For instance, a \ at the end of a line may be represented by two \\ (but in all other positions be represented by itself). You might like to poke around using a binary editor to see what representation is used by IBM.

You could make a copy of the offending header, edit it, and use the -I option to cause LCLint to pick up that file first.

Are you sure that spaces are the problem? There must be other line splices in the headers. What about my suggestion that there is a bug in LCLint for this case?

derek
Re: LCLint 3.0.0.17 parse problem
> Does the file system pad lines with spaces?

Yes it does! It seems that the file is very similar to the old-style IBM punch cards, where everything had 80 columns, and anything that wasn't filled in was a space. The file is filled out to column 81 with spaces, and the \ is in column 66. Shouldn't lclint just ignore the whitespace following the trailing \?

Anthony Giorgio

Derek M Jones <[EMAIL PROTECTED]>
09/28/2001 11:33 AM
To: Anthony Giorgio <[EMAIL PROTECTED]>
cc: [EMAIL PROTECTED]
Subject: Re: LCLint 3.0.0.17 parse problem

Anthony,

>I'm using LCLint 3.0.0.17 on an IBM zSeries mainframe, and I'm having

Does the file system pad lines with spaces?

>problems getting it to parse the code for the project I'm on. Whenever it
>tries to parse one of the system header files, it gags on the
>preprocessor step. Many of the header files have a construct similar to
>the one below:
>
> #if (lots_of_stuff) \
>     && (other_stuff)
> #define some_flag
> #endif
>
>Whenever I run lclint on a file that includes a header with the above
>construct, it dies with the following error:
>
>/usr/include/features.h:203:67: Invalid character in #if: \
>
>Is it valid to allow #if directives to span lines with the use of '\' ,

This error message is very specific. Perhaps it is a bug in LCLint. The code is certainly conforming.

>and if so, how can I convince lclint that it's okay? If it's not valid,
>then how can I have lclint ignore the problem? I can't change the system
>header files to make them more compliant, even though it might be a good
>idea :)

They are already compliant. Nothing to change.

derek
Re: LCLint 3.0.0.17 parse problem
Anthony,

>I'm using LCLint 3.0.0.17 on an IBM zSeries mainframe, and I'm having

Does the file system pad lines with spaces?

>problems getting it to parse the code for the project I'm on. Whenever it
>tries to parse one of the system header files, it gags on the
>preprocessor step. Many of the header files have a construct similar to
>the one below:
>
> #if (lots_of_stuff) \
>     && (other_stuff)
> #define some_flag
> #endif
>
>Whenever I run lclint on a file that includes a header with the above
>construct, it dies with the following error:
>
>/usr/include/features.h:203:67: Invalid character in #if: \
>
>Is it valid to allow #if directives to span lines with the use of '\' ,

This error message is very specific. Perhaps it is a bug in LCLint. The code is certainly conforming.

>and if so, how can I convince lclint that it's okay? If it's not valid,
>then how can I have lclint ignore the problem? I can't change the system
>header files to make them more compliant, even though it might be a good
>idea :)

They are already compliant. Nothing to change.

derek
LCLint 3.0.0.17 parse problem
I'm using LCLint 3.0.0.17 on an IBM zSeries mainframe, and I'm having problems getting it to parse the code for the project I'm on. Whenever it tries to parse one of the system header files, it gags on the preprocessor step. Many of the header files have a construct similar to the one below:

    #if (lots_of_stuff) \
        && (other_stuff)
    #define some_flag
    #endif

Whenever I run lclint on a file that includes a header with the above construct, it dies with the following error:

    /usr/include/features.h:203:67: Invalid character in #if: \

Is it valid to allow #if directives to span lines with the use of '\', and if so, how can I convince lclint that it's okay? If it's not valid, then how can I have lclint ignore the problem? I can't change the system header files to make them more compliant, even though it might be a good idea :)

Anthony Giorgio