Re: [PATCH] libcpp: Fix up -fdirectives-only preprocessing [PR98882]

2021-02-03 Thread Jeff Law via Gcc-patches



On 1/29/21 4:01 PM, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> GCC 11 ICEs on all -fdirectives-only preprocessing when the files don't end
> with a newline.
>
> The problem is in the assertion.  For empty TUs, buffer->cur == buffer->rlimit,
> so the buffer->rlimit[-1] access triggers UB in the preprocessor; for
> non-empty TUs it refers to the last character in the file, which can be
> anything.
> The preprocessor adds a '\n' character (or '\r', in particular if the
> user file ends with '\r' then it adds another '\r' rather than '\n'), but
> that is added after the limit, i.e. at buffer->rlimit[0].
>
> Now, if the routine handles occasional bumping of pos to buffer->rlimit + 1,
> I think it is just the assert that needs changing.  Usually we read from *pos
> if pos < limit and then e.g. if it is '\r', look at the following character
> (which could be one of those '\n' or '\r' at buffer->rlimit[0]).  There is
> also the case where for '\\' before the limit we read following character
> and if it is '\n', do one thing, if it is '\r' read another character.
> But in that case if '\\' was the last char in the TU, the limit char will be
> '\n', so we are ok.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-01-29  Jakub Jelinek  
>
>   PR preprocessor/98882
>   * lex.c (cpp_directive_only_process): Don't assert that rlimit[-1]
>   is a newline, instead assert that rlimit[0] is either newline or
>   carriage return.  When seeing '\\' followed by '\r', check limit
>   before accessing pos[1].
OK
jeff



[PATCH] libcpp: Fix up -fdirectives-only preprocessing [PR98882]

2021-01-29 Thread Jakub Jelinek via Gcc-patches
Hi!

GCC 11 ICEs on all -fdirectives-only preprocessing when the files don't end
with a newline.

The problem is in the assertion.  For empty TUs, buffer->cur == buffer->rlimit,
so the buffer->rlimit[-1] access triggers UB in the preprocessor; for
non-empty TUs it refers to the last character in the file, which can be
anything.
The preprocessor adds a '\n' character (or '\r', in particular if the
user file ends with '\r' then it adds another '\r' rather than '\n'), but
that is added after the limit, i.e. at buffer->rlimit[0].
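
To illustrate the layout described above (this is not part of the patch),
here is a minimal sketch.  The struct and the sketch_* function names are
hypothetical stand-ins, not libcpp's real types; the only thing taken from
the description is that the terminator lives at rlimit[0], and that it is
'\r' when the file itself ends in '\r'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant fields of a libcpp buffer.  */
struct sketch_buffer
{
  unsigned char *base;          /* Start of the allocation.  */
  const unsigned char *cur;     /* First character of the file.  */
  const unsigned char *rlimit;  /* One past the last character of the file.  */
};

/* Copy LEN bytes of TEXT and store a terminator *after* the limit,
   i.e. at rlimit[0]: '\r' if the file ends in '\r', otherwise '\n'.  */
static bool
sketch_load (struct sketch_buffer *b, const char *text, size_t len)
{
  b->base = malloc (len + 1);
  if (b->base == NULL)
    return false;
  memcpy (b->base, text, len);
  b->base[len] = (len > 0 && text[len - 1] == '\r') ? '\r' : '\n';
  b->cur = b->base;
  b->rlimit = b->base + len;
  return true;
}

/* The old condition.  For an empty file rlimit[-1] lies before the
   buffer, so the sketch refuses to read it and reports failure.  */
static bool
sketch_old_assert_holds (const struct sketch_buffer *b)
{
  if (b->rlimit == b->cur)
    return false;
  return b->rlimit[-1] == '\n';
}

/* The new condition: rlimit[0] is always readable.  */
static bool
sketch_new_assert_holds (const struct sketch_buffer *b)
{
  return b->rlimit[0] == '\n' || b->rlimit[0] == '\r';
}
```

With this model, a file without a trailing newline such as "/*Here*/"
fails the old condition but satisfies the new one, and so does an empty
file.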

Now, if the routine handles occasional bumping of pos to buffer->rlimit + 1,
I think it is just the assert that needs changing.  Usually we read from *pos
if pos < limit and then e.g. if it is '\r', look at the following character
(which could be one of those '\n' or '\r' at buffer->rlimit[0]).  There is
also the case where for '\\' before the limit we read following character
and if it is '\n', do one thing, if it is '\r' read another character.
But in that case if '\\' was the last char in the TU, the limit char will be
'\n', so we are ok.
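
A rough sketch of the peeking pattern discussed above, for illustration
only.  sketch_escaped_newline_len is a hypothetical helper, not the code
in lex.c; it assumes limit[0] is a readable terminator and shows where a
limit check is needed before looking two characters ahead.

```c
#include <assert.h>
#include <stddef.h>

/* Return how many bytes a backslash-newline sequence starting at POS
   occupies, or 0 if there is none.  Assumes LIMIT[0] is readable (the
   terminator stored after the file contents) and nothing beyond it.  */
static size_t
sketch_escaped_newline_len (const unsigned char *pos,
                            const unsigned char *limit)
{
  if (pos >= limit || pos[0] != '\\')
    return 0;

  /* pos < limit, so pos[1] is at worst limit[0]: always in bounds.  */
  if (pos[1] == '\n')
    return 2;

  if (pos[1] == '\r')
    {
      /* pos[2] is only in bounds if pos[1] is still before the limit;
         check that before reading it.  */
      if (pos + 1 < limit && pos[2] == '\n')
        return 3;
      return 2;
    }

  return 0;
}
```

When the backslash is the last character of the file, pos[1] is the
terminator itself and the helper never reads past it.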

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-29  Jakub Jelinek  

PR preprocessor/98882
* lex.c (cpp_directive_only_process): Don't assert that rlimit[-1]
is a newline, instead assert that rlimit[0] is either newline or
carriage return.  When seeing '\\' followed by '\r', check limit
before accessing pos[1].

--- libcpp/lex.c.jj 2021-01-27 11:50:09.174981229 +0100
+++ libcpp/lex.c 2021-01-29 13:08:59.400944436 +0100
@@ -4318,9 +4318,9 @@ cpp_directive_only_process (cpp_reader *
   buffer->cur_note = buffer->notes_used = 0;
   buffer->cur = buffer->line_base = buffer->next_line;
   buffer->need_line = false;
-  /* Files always end in a newline.  We rely on this for
+  /* Files always end in a newline or carriage return.  We rely on this for
 character peeking safety.  */
-  gcc_assert (buffer->rlimit[-1] == '\n');
+  gcc_assert (buffer->rlimit[0] == '\n' || buffer->rlimit[0] == '\r');
 
   const unsigned char *base = buffer->cur;
   unsigned line_count = 0;
--- gcc/testsuite/gcc.dg/cpp/pr98882.c.jj   2021-01-29 13:12:25.933613898 +0100
+++ gcc/testsuite/gcc.dg/cpp/pr98882.c  2021-01-29 13:12:13.717751745 +0100
@@ -0,0 +1,6 @@
+/* PR preprocessor/98882 */
+/* { dg-do preprocess } */
+/* { dg-options "-fdirectives-only" } */
+
+/* Last line does not end with a newline.  */
+/*Here*/
\ No newline at end of file

Jakub