Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
"H.J. Lu" writes: > The new tests failed on Linux/x86: Woops. I have committed the patch below under the obvious rule for this. Sorry for the inconvenience. gcc/testsuite/ChangeLog: * c-c++-common/cpp/warning-zero-location-2.c: Fix error message specifier. diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog index a468447..2da 100644 --- a/gcc/testsuite/ChangeLog +++ b/gcc/testsuite/ChangeLog @@ -1,3 +1,8 @@ +2014-01-29 Dodji Seketeli + + * c-c++-common/cpp/warning-zero-location-2.c: Fix error message + selector. + 2014-01-29 Jakub Jelinek PR middle-end/59917 diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c index c0e0bf7..e919bca 100644 --- a/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c +++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c @@ -7,4 +7,4 @@ #include int main() { return 0; } -/* { dg-error "No such file or directory" { target *-*-* } 4636 } */ +/* { dg-message "" "#include" {target *-*-* } 0 } -- Dodji
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Tue, Jan 28, 2014 at 5:23 AM, Dodji Seketeli wrote:
> Dodji Seketeli writes:
>
>> Here is the patch I am committing right now.
>>
>> gcc/ChangeLog
>>
>>     * input.c (location_get_source_line): Bail out when line number
>>     is zero, and test the return value of
>>     lookup_or_add_file_to_cache_tab.
>>
>> gcc/testsuite/ChangeLog
>>
>>     * c-c++-common/cpp/warning-zero-location.c: New test.
>>     * c-c++-common/cpp/warning-zero-location-2.c: Likewise.
>
> I forgot to say that it passed bootstrap & test on
> x86_64-unknown-linux-gnu against trunk.
>

The new tests failed on Linux/x86:

ERROR: c-c++-common/cpp/warning-zero-location-2.c -std=gnu++11: syntax error in target selector "4636" for " dg-error 10 "No such file or directory" { target *-*-* } 4636 "
ERROR: c-c++-common/cpp/warning-zero-location-2.c -std=gnu++98: syntax error in target selector "4636" for " dg-error 10 "No such file or directory" { target *-*-* } 4636 "
ERROR: c-c++-common/cpp/warning-zero-location-2.c -Wc++-compat : syntax error in target selector "4636" for " dg-error 10 "No such file or directory" { target *-*-* } 4636 "

-- 
H.J.
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Dodji Seketeli writes:

> Here is the patch I am committing right now.
>
> gcc/ChangeLog
>
>     * input.c (location_get_source_line): Bail out when line number
>     is zero, and test the return value of
>     lookup_or_add_file_to_cache_tab.
>
> gcc/testsuite/ChangeLog
>
>     * c-c++-common/cpp/warning-zero-location.c: New test.
>     * c-c++-common/cpp/warning-zero-location-2.c: Likewise.

I forgot to say that it passed bootstrap & test on x86_64-unknown-linux-gnu against trunk.

Thanks.

-- 
Dodji
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Here is the patch I am committing right now.

gcc/ChangeLog

    * input.c (location_get_source_line): Bail out when line number
    is zero, and test the return value of
    lookup_or_add_file_to_cache_tab.

gcc/testsuite/ChangeLog

    * c-c++-common/cpp/warning-zero-location.c: New test.
    * c-c++-common/cpp/warning-zero-location-2.c: Likewise.

diff --git a/gcc/input.c b/gcc/input.c
index 547c177..63cd062 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -698,7 +698,13 @@ location_get_source_line (expanded_location xloc,
   static char *buffer;
   static ssize_t len;
 
-  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (xloc.line == 0)
+    return NULL;
+
+  fcache *c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (c == NULL)
+    return NULL;
+
   bool read = read_line_num (c, xloc.line, &buffer, &len);
 
   if (read && line_len)
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c
new file mode 100644
index 000..c0e0bf7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location-2.c
@@ -0,0 +1,10 @@
+/*
+   { dg-options "-D _GNU_SOURCE -fdiagnostics-show-caret" }
+   { dg-do compile }
+ */
+
+#line 4636 "configure"
+#include
+int main() { return 0; }
+
+/* { dg-error "No such file or directory" { target *-*-* } 4636 } */
diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
new file mode 100644
index 000..ca2e102
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
@@ -0,0 +1,8 @@
+/*
+   { dg-options "-D _GNU_SOURCE -fdiagnostics-show-caret" }
+   { dg-do compile }
+ */
+
+#define _GNU_SOURCE/* { dg-warning "redefined" } */
+
+/* { dg-message "" "#define _GNU_SOURCE" {target *-*-* } 0 }

-- 
Dodji
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On 2014.01.25 at 00:02 +0100, Markus Trippelsdorf wrote:
> On 2014.01.24 at 17:09 +0100, Dodji Seketeli wrote:
> > Jakub Jelinek writes:
> >
> > > On Fri, Jan 24, 2014 at 04:40:52PM +0100, Dodji Seketeli wrote:
> > >> > The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
> > >> > The follow-up patch (fp == NULL check) doesn't help.
> > >>
> > >> I am looking into that, sorry for the inconvenience.
> > >
> > > I'd say we want something like following. Note that while the c == NULL
> > > bailout would be usually sufficient, if you'll do:
> > > echo foobar > ''
> > > it would still crash. Line 0 is used only for the special locations
> > > (command line, built-in macros) and there is no file associated with it
> > > anyway.
> > >
> > > --- gcc/input.c.jj 2014-01-24 16:32:34.0 +0100
> > > +++ gcc/input.c 2014-01-24 16:41:42.012671452 +0100
> > > @@ -698,7 +698,13 @@ location_get_source_line (expanded_locat
> > >    static char *buffer;
> > >    static ssize_t len;
> > >
> > > -  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
> > > +  if (xloc.line == 0)
> > > +    return NULL;
> > > +
> > > +  fcache *c = lookup_or_add_file_to_cache_tab (xloc.file);
> > > +  if (c == NULL)
> > > +    return NULL;
> > > +
> > >    bool read = read_line_num (c, xloc.line, &buffer, &len);
> > >
> > >    if (read && line_len)
> >
> > Indeed.
> >
> > Though, I am testing the patch below that makes read_line_num gracefully
> > handle empty caches or zero locations. The rest of the code should just
> > work with that as is.
> >
> >     * input.c (read_line_num): Gracefully handle non-file locations or
> >     empty caches.
>
> Unfortunately this doesn't fix yet another issue:

Whereas Jakub's patch is fine.

-- 
Markus
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On 2014.01.24 at 17:09 +0100, Dodji Seketeli wrote: > Jakub Jelinek writes: > > > On Fri, Jan 24, 2014 at 04:40:52PM +0100, Dodji Seketeli wrote: > >> > The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 . > >> > The follow-up patch (fp == NULL check) doesn't help. > >> > >> I am looking into that, sorry for the inconvenience. > > > > I'd say we want something like following. Note that while the c == NULL > > bailout would be usually sufficient, if you'll do: > > echo foobar > '' > > it would still crash. Line 0 is used only for the special locations > > (command line, built-in macros) and there is no file associated with it > > anyway. > > > > --- gcc/input.c.jj 2014-01-24 16:32:34.0 +0100 > > +++ gcc/input.c 2014-01-24 16:41:42.012671452 +0100 > > @@ -698,7 +698,13 @@ location_get_source_line (expanded_locat > >static char *buffer; > >static ssize_t len; > > > > - fcache * c = lookup_or_add_file_to_cache_tab (xloc.file); > > + if (xloc.line == 0) > > +return NULL; > > + > > + fcache *c = lookup_or_add_file_to_cache_tab (xloc.file); > > + if (c == NULL) > > +return NULL; > > + > >bool read = read_line_num (c, xloc.line, &buffer, &len); > > > >if (read && line_len) > > Indeed. > > Though, I am testing the patch below that makes read_line_num gracefully > handle empty caches or zero locations. The rest of the code should just > work with that as is. > > * input.c (read_line_num): Gracefully handle non-file locations or > empty caches. 
Unfortunately this doesn't fix yet another issue: markus@x4 /tmp % cat foo.c #line 4636 "configure" #include int main() { return 0; } markus@x4 /tmp % gcc foo.c configure:4636:26: fatal error: .h: No such file or directory gcc: internal compiler error: Segmentation fault (program cc1) 0x40cc8e execute ../../gcc/gcc/gcc.c:2841 0x40cf09 do_spec_1 ../../gcc/gcc/gcc.c:4641 0x40fc91 process_brace_body ../../gcc/gcc/gcc.c:5924 0x40fc91 handle_braces ../../gcc/gcc/gcc.c:5838 0x40d692 do_spec_1 ../../gcc/gcc/gcc.c:5295 0x40fc91 process_brace_body ../../gcc/gcc/gcc.c:5924 0x40fc91 handle_braces ../../gcc/gcc/gcc.c:5838 0x40d692 do_spec_1 ../../gcc/gcc/gcc.c:5295 0x40d28e do_spec_1 ../../gcc/gcc/gcc.c:5410 0x40fc91 process_brace_body ../../gcc/gcc/gcc.c:5924 0x40fc91 handle_braces ../../gcc/gcc/gcc.c:5838 0x40d692 do_spec_1 ../../gcc/gcc/gcc.c:5295 0x40fc91 process_brace_body ../../gcc/gcc/gcc.c:5924 0x40fc91 handle_braces ../../gcc/gcc/gcc.c:5838 0x40d692 do_spec_1 ../../gcc/gcc/gcc.c:5295 0x40fc91 process_brace_body ../../gcc/gcc/gcc.c:5924 0x40fc91 handle_braces ../../gcc/gcc/gcc.c:5838 0x40d692 do_spec_1 ../../gcc/gcc/gcc.c:5295 0x40fc91 process_brace_body ../../gcc/gcc/gcc.c:5924 0x40fc91 handle_braces ../../gcc/gcc/gcc.c:5838 Please submit a full bug report, with preprocessed source if appropriate. -- Markus
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Fri, Jan 24, 2014 at 05:09:29PM +0100, Dodji Seketeli wrote:
>     * input.c (read_line_num): Gracefully handle non-file locations or
>     empty caches.
>
> diff --git a/gcc/input.c b/gcc/input.c
> index 547c177..b05e1da 100644
> --- a/gcc/input.c
> +++ b/gcc/input.c
> @@ -600,7 +600,8 @@ static bool
>  read_line_num (fcache *c, size_t line_num,
>                 char ** line, ssize_t *line_len)
>  {
> -  gcc_assert (line_num > 0);
> +  if (!c || line_num < 1)
> +    return false;
>
>    if (line_num <= c->line_num)
>      {

Ok.

> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c
> @@ -0,0 +1,6 @@
> +/*
> +   { dg-options "-D _GNU_SOURCE" }
> +   { dg-do compile }
> + */
> +
> +#define _GNU_SOURCE /* { dg-warning "redefined" } */

I doubt this would fail without the patch though, because -fno-diagnostics-show-caret is added by default to flags. So, I'd say you also need -fdiagnostics-show-caret in dg-options to reproduce it.

    Jakub
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Jakub Jelinek writes: > On Fri, Jan 24, 2014 at 04:40:52PM +0100, Dodji Seketeli wrote: >> > The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 . >> > The follow-up patch (fp == NULL check) doesn't help. >> >> I am looking into that, sorry for the inconvenience. > > I'd say we want something like following. Note that while the c == NULL > bailout would be usually sufficient, if you'll do: > echo foobar > '' > it would still crash. Line 0 is used only for the special locations > (command line, built-in macros) and there is no file associated with it > anyway. > > --- gcc/input.c.jj2014-01-24 16:32:34.0 +0100 > +++ gcc/input.c 2014-01-24 16:41:42.012671452 +0100 > @@ -698,7 +698,13 @@ location_get_source_line (expanded_locat >static char *buffer; >static ssize_t len; > > - fcache * c = lookup_or_add_file_to_cache_tab (xloc.file); > + if (xloc.line == 0) > +return NULL; > + > + fcache *c = lookup_or_add_file_to_cache_tab (xloc.file); > + if (c == NULL) > +return NULL; > + >bool read = read_line_num (c, xloc.line, &buffer, &len); > >if (read && line_len) Indeed. Though, I am testing the patch below that makes read_line_num gracefully handle empty caches or zero locations. The rest of the code should just work with that as is. * input.c (read_line_num): Gracefully handle non-file locations or empty caches. 
diff --git a/gcc/input.c b/gcc/input.c index 547c177..b05e1da 100644 --- a/gcc/input.c +++ b/gcc/input.c @@ -600,7 +600,8 @@ static bool read_line_num (fcache *c, size_t line_num, char ** line, ssize_t *line_len) { - gcc_assert (line_num > 0); + if (!c || line_num < 1) +return false; if (line_num <= c->line_num) { diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c new file mode 100644 index 000..04a06b2 --- /dev/null +++ b/gcc/testsuite/c-c++-common/cpp/warning-zero-location.c @@ -0,0 +1,6 @@ +/* + { dg-options "-D _GNU_SOURCE" } + { dg-do compile } + */ + +#define _GNU_SOURCE/* { dg-warning "redefined" } */ -- Dodji
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Fri, Jan 24, 2014 at 04:40:52PM +0100, Dodji Seketeli wrote:
> > The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
> > The follow-up patch (fp == NULL check) doesn't help.
>
> I am looking into that, sorry for the inconvenience.

I'd say we want something like the following. Note that while the c == NULL bailout would usually be sufficient, if you do:
  echo foobar > ''
it would still crash. Line 0 is used only for the special locations (command line, built-in macros) and there is no file associated with it anyway.

--- gcc/input.c.jj 2014-01-24 16:32:34.0 +0100
+++ gcc/input.c 2014-01-24 16:41:42.012671452 +0100
@@ -698,7 +698,13 @@ location_get_source_line (expanded_locat
   static char *buffer;
   static ssize_t len;
 
-  fcache * c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (xloc.line == 0)
+    return NULL;
+
+  fcache *c = lookup_or_add_file_to_cache_tab (xloc.file);
+  if (c == NULL)
+    return NULL;
+
   bool read = read_line_num (c, xloc.line, &buffer, &len);
 
   if (read && line_len)

    Jakub
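The two bailouts in the patch above are independent of each other: one rejects line 0, the other rejects a failed cache lookup. A minimal sketch of that guard order follows, assuming hypothetical stand-in types and a one-entry cache (sketch_loc, sketch_cache, sketch_lookup and sketch_get_source_line are illustrative names, not GCC's):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for expanded_location and fcache; the real
   GCC types carry more state.  */
struct sketch_loc
{
  const char *file;
  int line;
};

struct sketch_cache
{
  const char *path;
  const char *first_line;
};

/* Pretend cache holding one known file.  Returns NULL for any other
   path, mimicking a lookup that fails because the file cannot be
   opened.  */
static struct sketch_cache *
sketch_lookup (const char *path)
{
  static struct sketch_cache known = { "known.c", "int x;" };

  if (path == NULL || strcmp (path, known.path) != 0)
    return NULL;
  return &known;
}

/* Return the text of the line at LOC, or NULL if there is none.  */
static const char *
sketch_get_source_line (struct sketch_loc loc)
{
  /* Line 0 marks a special location with no backing file, so bail out
     before touching the cache at all.  */
  if (loc.line == 0)
    return NULL;

  /* The lookup itself can fail; never dereference its result blindly.  */
  struct sketch_cache *c = sketch_lookup (loc.file);
  if (c == NULL)
    return NULL;

  /* This sketch only stores line 1.  */
  return loc.line == 1 ? c->first_line : NULL;
}
```

Checking the line number first means a location with line 0 is rejected even when a file with a matching name happens to exist, which is the case the c == NULL test alone would not cover.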
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Markus Trippelsdorf writes:

> On 2014.01.22 at 09:16 +0100, Dodji Seketeli wrote:
>> Bernd Edlinger writes:
>>
>> > Hi,
>>
>> Hello,
>>
>> > since there was no progress in the last 2 months on that matter,
>>
>> Sorry, this is my bad. I got sidetracked by something else and forgot
>> that I had the patch working et al, and all its bits that need approval
>> got approved. It still can go in right now. It improves performance
>> and fixes the issue the way it was discussed.
>>
>> Here it is, regtested on x86_64-linux-gnu against trunk.
>>
>> If nobody objects in the next 24 hours, I'll commit it.
>
> The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
> The follow-up patch (fp == NULL check) doesn't help.

I am looking into that, sorry for the inconvenience.

-- 
Dodji
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On 2014.01.22 at 09:16 +0100, Dodji Seketeli wrote:
> Bernd Edlinger writes:
>
> > Hi,
>
> Hello,
>
> > since there was no progress in the last 2 months on that matter,
>
> Sorry, this is my bad. I got sidetracked by something else and forgot
> that I had the patch working et al, and all its bits that need approval
> got approved. It still can go in right now. It improves performance
> and fixes the issue the way it was discussed.
>
> Here it is, regtested on x86_64-linux-gnu against trunk.
>
> If nobody objects in the next 24 hours, I'll commit it.

The patch causes http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59935 .
The follow-up patch (fp == NULL check) doesn't help.

-- 
Markus
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Jakub Jelinek writes:

> On Wed, Jan 22, 2014 at 09:16:02AM +0100, Dodji Seketeli wrote:
>> +static fcache*
>> +add_file_to_cache_tab (const char *file_path)
>> +{
>> +
>> +  FILE *fp = fopen (file_path, "r");
>> +  if (ferror (fp))
>> +    {
>> +      fclose (fp);
>> +      return NULL;
>> +    }
>
> I've seen various segfaults here when playing with preprocessed sources
> from PRs (obviously don't have the original source files).
> When fopen fails, it just returns NULL, so I don't see why you just don't
> do
>   if (fp == NULL)
>     return fp;

Right, I am testing the patch below.

    * input.c (add_file_to_cache_tab): Handle the case where fopen
    returns NULL.

diff --git a/gcc/input.c b/gcc/input.c
index 290680c..547c177 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -293,11 +293,8 @@ add_file_to_cache_tab (const char *file_path)
 {
 
   FILE *fp = fopen (file_path, "r");
-  if (ferror (fp))
-    {
-      fclose (fp);
-      return NULL;
-    }
+  if (fp == NULL)
+    return NULL;
 
   unsigned highest_use_count = 0;
   fcache *r = evicted_cache_tab_entry (&highest_use_count);

-- 
Dodji
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Hi,

On Thu, 23 Jan 2014 18:12:45, Jakub Jelinek wrote:
>
> On Wed, Jan 22, 2014 at 09:16:02AM +0100, Dodji Seketeli wrote:
>> +static fcache*
>> +add_file_to_cache_tab (const char *file_path)
>> +{
>> +
>> +  FILE *fp = fopen (file_path, "r");
>> +  if (ferror (fp))
>> +    {
>> +      fclose (fp);
>> +      return NULL;
>> +    }
>
> I've seen various segfaults here when playing with preprocessed sources
> from PRs (obviously don't have the original source files).
> When fopen fails, it just returns NULL, so I don't see why you just don't
> do
>   if (fp == NULL)
>     return fp;
>
> Jakub

This would be a good idea for test cases too.

However the test system always calls the compiler with -fno-diagnostics-show-caret, so I doubt your test case is actually testing anything when it is called from the test environment with that option.

Bernd.
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Wed, Jan 22, 2014 at 09:16:02AM +0100, Dodji Seketeli wrote:
> +static fcache*
> +add_file_to_cache_tab (const char *file_path)
> +{
> +
> +  FILE *fp = fopen (file_path, "r");
> +  if (ferror (fp))
> +    {
> +      fclose (fp);
> +      return NULL;
> +    }

I've seen various segfaults here when playing with preprocessed sources from PRs (obviously don't have the original source files). When fopen fails, it just returns NULL, so I don't see why you just don't do
  if (fp == NULL)
    return fp;

    Jakub
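The point about fopen can be shown in isolation. The following is a minimal sketch, assuming a hypothetical helper named sketch_open_for_reading that is not part of input.c: fopen signals failure through a NULL return, and passing that NULL to ferror or fclose is undefined behavior, which would be consistent with the segfaults described above.

```c
#include <stdio.h>

/* Hypothetical helper, not GCC's add_file_to_cache_tab: open PATH for
   reading and return the stream, or NULL if it cannot be opened.  */
static FILE *
sketch_open_for_reading (const char *path)
{
  FILE *fp = fopen (path, "r");

  /* fopen reports failure by returning NULL.  ferror and fclose both
     require a valid stream, so test the pointer before anything else.  */
  if (fp == NULL)
    return NULL;

  return fp;
}
```

A caller that gets a non-NULL stream back owns it and is responsible for calling fclose on it.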
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Bernd Edlinger writes: > Hi, Hello, > since there was no progress in the last 2 months on that matter, Sorry, this is my bad. I got sidetracked by something else and forgot that I had the patch working et al, and all its bits that need approval got approved. It still can go in right now. It improves performance and fixes the issue the way it was discussed. Here it is, regtested on x86_64-linux-gnu against trunk. If nobody objects in the next 24 hours, I'll commit it. libcpp/ChangeLog: * include/line-map.h (linemap_get_file_highest_location): Declare new function. * line-map.c (linemap_get_file_highest_location): Define it. gcc/ChangeLog: * input.h (location_get_source_line): Take an additional line_size parameter. (void diagnostics_file_cache_fini): Declare new function. * input.c (struct fcache): New type. (fcache_tab_size, fcache_buffer_size, fcache_line_record_size): New static constants. (diagnostic_file_cache_init, total_lines_num) (lookup_file_in_cache_tab, evicted_cache_tab_entry) (add_file_to_cache_tab, lookup_or_add_file_to_cache_tab) (needs_read, needs_grow, maybe_grow, read_data, maybe_read_data) (get_next_line, read_next_line, goto_next_line, read_line_num): New static function definitions. (diagnostic_file_cache_fini): New function. (location_get_source_line): Take an additional output line_len parameter. Re-write using lookup_or_add_file_to_cache_tab and read_line_num. * diagnostic.c (diagnostic_finish): Call diagnostic_file_cache_fini. (adjust_line): Take an additional input parameter for the length of the line, rather than calculating it with strlen. (diagnostic_show_locus): Adjust the use of location_get_source_line and adjust_line with respect to their new signature. While displaying a line now, do not stop at the first null byte. Rather, display the zero byte as a space and keep going until we reach the size of the line. 
* Makefile.in: Add vec.o to OBJS-libcommon gcc/testsuite/ChangeLog: * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0d04-0410-961f-82ee72b054a4 Signed-off-by: Dodji Seketeli --- gcc/Makefile.in| 3 +- gcc/diagnostic.c | 19 +- gcc/diagnostic.h | 1 + gcc/input.c| 633 - gcc/input.h| 5 +- .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes libcpp/include/line-map.h | 8 + libcpp/line-map.c | 40 ++ 8 files changed, 670 insertions(+), 39 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 4d683a0..06c617a 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1472,7 +1472,8 @@ OBJS = \ # Objects in libcommon.a, potentially used by all host binaries and with # no target dependencies. -OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o +OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o \ + vec.o input.o version.o # Objects in libcommon-target.a, used by drivers and by the core # compiler and containing target-dependent code. diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 36094a1..6c83f03 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -176,6 +176,8 @@ diagnostic_finish (diagnostic_context *context) progname); pp_newline_and_flush (context->printer); } + + diagnostic_file_cache_fini (); } /* Initialize DIAGNOSTIC, where the message MSG has already been @@ -259,12 +261,13 @@ diagnostic_build_prefix (diagnostic_context *context, MAX_WIDTH by some margin, then adjust the start of the line such that the COLUMN is smaller than MAX_WIDTH minus the margin. The margin is either 10 characters or the difference between the column - and the length of the line, whatever is smaller. */ + and the length of the line, whatever is smaller. The length of + LINE is given by LINE_WIDTH. 
*/ static const char * -adjust_line (const char *line, int max_width, int *column_p) +adjust_line (const char *line, int line_width, +int max_width, int *column_p) { int right_margin = 10; - int line_width = strlen (line); int column = *column_p; right_margin = MIN (line_width - column, right_margin); @@ -284,6 +287,7 @@ diagnostic_show_locus (diagnostic_context * context, const diagnostic_info *diagnostic) { const char *line; + int line_width; char *buffer; expanded_location s; int m
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Hi,

since there was no progress in the last 2 months on that matter, and we are quite late in Phase 3 now, I dare to propose an alternative, very simple solution for this bug now.

It does not try to improve or degrade the performance at all; instead it simply detects binary files with embedded NULs and stops parsing at that point.

Boot-strapped and regression-tested on X86_64-linux-gnu.
Ok for trunk?

Bernd.

On Thu, 14 Nov 2013 15:01:59, Dodji Seketeli wrote:
>
> Jakub Jelinek writes:
>
>> On Tue, Nov 12, 2013 at 04:33:41PM +0100, Dodji Seketeli wrote:
>>> +
>>> +      memmove (*line, l, len);
>>> +      (*line)[len - 1] = '\0';
>>> +      *line_len = --len;
>>
>> Shouldn't this be testing that len > 0 && (*line)[len - 1] == '\n'
>> first before you decide to overwrite it and decrement len?
>
> That code above is in an if (len > 0) block. So checking that condition
> again is not necessary. Also, I think we don't need to test there is a
> terminal '\n' at the end because get_next_line always returns the line
> content followed either by a '\n' or by a "junk byte" that is right
> after the last byte of the file -- in case we reach end of file w/o
> seeing a '\n'.
>
>> Though in that case there would be no '\0' termination of the string
>> for files not ending in a new-line. So, either get_next_line should
>> append '\n' to the buffer, or you should have there space for that, or
>> you can't rely on zero termination of the string and need to use just
>> the length.
>
> OK, I am settling for doing away with the '\0' altogether.
>
> The patch below makes get_next_line always point to the last character
> of the line before the '\n' when it is present. So '\n' is never
> counted in the string. I guess that's less confusing to people.
>
> Tested on x86_64-unknown-linux-gnu against trunk.
>
> libcpp/ChangeLog:
>
>     * include/line-map.h (linemap_get_file_highest_location): Declare
>     new function.
>     * line-map.c (linemap_get_file_highest_location): Define it.
> > gcc/ChangeLog: > > * input.h (location_get_source_line): Take an additional line_size > parameter. > (void diagnostics_file_cache_fini): Declare new function. > * input.c (struct fcache): New type. > (fcache_tab_size, fcache_buffer_size, fcache_line_record_size): > New static constants. > (diagnostic_file_cache_init, total_lines_num) > (lookup_file_in_cache_tab, evicted_cache_tab_entry) > (add_file_to_cache_tab, lookup_or_add_file_to_cache_tab) > (needs_read, needs_grow, maybe_grow, read_data, maybe_read_data) > (get_next_line, read_next_line, goto_next_line, read_line_num): > New static function definitions. > (diagnostic_file_cache_fini): New function. > (location_get_source_line): Take an additional output line_len > parameter. Re-write using lookup_or_add_file_to_cache_tab and > read_line_num. > * diagnostic.c (diagnostic_finish): Call > diagnostic_file_cache_fini. > (adjust_line): Take an additional input parameter for the length > of the line, rather than calculating it with strlen. > (diagnostic_show_locus): Adjust the use of > location_get_source_line and adjust_line with respect to their new > signature. While displaying a line now, do not stop at the first > null byte. Rather, display the zero byte as a space and keep > going until we reach the size of the line. > * Makefile.in: Add vec.o to OBJS-libcommon > > gcc/testsuite/ChangeLog: > > * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. 
> > git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 > 138bc75d-0d04-0410-961f-82ee72b054a4 > --- > gcc/Makefile.in | 3 +- > gcc/diagnostic.c | 19 +- > gcc/diagnostic.h | 1 + > gcc/input.c | 633 - > gcc/input.h | 5 +- > .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes > libcpp/include/line-map.h | 8 + > libcpp/line-map.c | 40 ++ > 8 files changed, 670 insertions(+), 39 deletions(-) > create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index 49285e5..9fe9060 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1469,7 +1469,8 @@ OBJS = \ > > # Objects in libcommon.a, potentially used by all host binaries and with > # no target dependencies. > -OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o > input.o version.o > +OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o \ > + vec.o input.o version.o > > # Objects in libcommon-target.a, used by drivers and by the core > # compiler and containing target-dependent code. > diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c > index 36094a1..6c83f03 100644 > --- a/gcc/diagnostic.c > +++ b/gcc/diagnostic.c > @@ -176,6 +176,8 @@ diagnostic_finish (diagnostic_context *context) > progname); > pp_newline_and_flush (context->printer); > } > + > + diagnostic_file_cache_fini (); > } > > /* Initialize DIAGNOSTIC, where the message MSG has already been > @@ -259,12 +261,13 @@ diagnostic_build_prefix (diagnostic_context *context, > MAX_WIDTH by some margin, then
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
> "Dodji" == Dodji Seketeli writes:

Dodji>     * include/line-map.h (linemap_get_file_highest_location): Declare
Dodji>     new function.
Dodji>     * line-map.c (linemap_get_file_highest_location): Define it.

I wasn't sure if this is the patch you were needing review for ...

Dodji> +bool linemap_get_file_highest_location (struct line_maps * set,
Dodji> +const char *file_name,
Dodji> +source_location*LOC);

The spacing is slightly off -- one too many before "set", one too few before LOC. And LOC presumably shouldn't be uppercase here.

Dodji> +      const char *fname = set->info_ordinary.maps[i].d.ordinary.to_file;
Dodji> +      if (fname && !strcmp (fname, file_name))

Other spots in this code use filename_cmp.

Otherwise the libcpp bits look ok to me.

Tom
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Jakub Jelinek writes:

> On Tue, Nov 12, 2013 at 04:33:41PM +0100, Dodji Seketeli wrote:
>> +
>> +      memmove (*line, l, len);
>> +      (*line)[len - 1] = '\0';
>> +      *line_len = --len;
>
> Shouldn't this be testing that len > 0 && (*line)[len - 1] == '\n'
> first before you decide to overwrite it and decrement len?

That code above is in an if (len > 0) block. So checking that condition again is not necessary. Also, I think we don't need to test there is a terminal '\n' at the end because get_next_line always returns the line content followed either by a '\n' or by a "junk byte" that is right after the last byte of the file -- in case we reach end of file w/o seeing a '\n'.

> Though in that case there would be no '\0' termination of the string
> for files not ending in a new-line. So, either get_next_line should
> append '\n' to the buffer, or you should have there space for that, or
> you can't rely on zero termination of the string and need to use just
> the length.

OK, I am settling for doing away with the '\0' altogether.

The patch below makes get_next_line always point to the last character of the line before the '\n' when it is present. So '\n' is never counted in the string. I guess that's less confusing to people.

Tested on x86_64-unknown-linux-gnu against trunk.

libcpp/ChangeLog:

    * include/line-map.h (linemap_get_file_highest_location): Declare
    new function.
    * line-map.c (linemap_get_file_highest_location): Define it.

gcc/ChangeLog:

    * input.h (location_get_source_line): Take an additional line_size
    parameter.
    (void diagnostics_file_cache_fini): Declare new function.
    * input.c (struct fcache): New type.
    (fcache_tab_size, fcache_buffer_size, fcache_line_record_size):
    New static constants.
(diagnostic_file_cache_init, total_lines_num) (lookup_file_in_cache_tab, evicted_cache_tab_entry) (add_file_to_cache_tab, lookup_or_add_file_to_cache_tab) (needs_read, needs_grow, maybe_grow, read_data, maybe_read_data) (get_next_line, read_next_line, goto_next_line, read_line_num): New static function definitions. (diagnostic_file_cache_fini): New function. (location_get_source_line): Take an additional output line_len parameter. Re-write using lookup_or_add_file_to_cache_tab and read_line_num. * diagnostic.c (diagnostic_finish): Call diagnostic_file_cache_fini. (adjust_line): Take an additional input parameter for the length of the line, rather than calculating it with strlen. (diagnostic_show_locus): Adjust the use of location_get_source_line and adjust_line with respect to their new signature. While displaying a line now, do not stop at the first null byte. Rather, display the zero byte as a space and keep going until we reach the size of the line. * Makefile.in: Add vec.o to OBJS-libcommon gcc/testsuite/ChangeLog: * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/Makefile.in| 3 +- gcc/diagnostic.c | 19 +- gcc/diagnostic.h | 1 + gcc/input.c| 633 - gcc/input.h| 5 +- .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes libcpp/include/line-map.h | 8 + libcpp/line-map.c | 40 ++ 8 files changed, 670 insertions(+), 39 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 49285e5..9fe9060 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1469,7 +1469,8 @@ OBJS = \ # Objects in libcommon.a, potentially used by all host binaries and with # no target dependencies. 
-OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o +OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o \ + vec.o input.o version.o # Objects in libcommon-target.a, used by drivers and by the core # compiler and containing target-dependent code. diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 36094a1..6c83f03 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -176,6 +176,8 @@ diagnostic_finish (diagnostic_context *context) progname); pp_newline_and_flush (context->printer); } + + diagnostic_file_cache_fini (); } /* Initialize DIAGNOSTIC, where the message MSG has already been @@ -259,12 +261,13 @@ diagnostic_build_prefix (diagnostic_context *context, MAX_WIDTH by some margin, then adjust the start of the line such that the COLUMN is smaller than MAX_WIDTH minus the m
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Tue, Nov 12, 2013 at 04:33:41PM +0100, Dodji Seketeli wrote: > + > + memmove (*line, l, len); > + (*line)[len - 1] = '\0'; > + *line_len = --len; Shouldn't this be testing that len > 0 && (*line)[len - 1] == '\n' first before you decide to overwrite it and decrement len? Though in that case there would be no '\0' termination of the string for files not ending in a new-line. So, either get_next_line should append '\n' to the buffer, or you should have there space for that, or you can't rely on zero termination of the string and need to use just the length. Jakub
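For readers following the thread, the check Jakub asks for can be sketched as below. This is a minimal illustration, not code from the patch: copy_line is a hypothetical helper, and it uses plain malloc rather than GCC's XNEWVEC so that it stands alone. It strips the final byte only when that byte really is a newline, and it reserves one extra byte so the result is NUL-terminated even for a file that does not end in a newline.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: copy the LEN bytes at SRC
   into a freshly allocated buffer, drop a trailing '\n' only if one is
   actually present, and always NUL-terminate.  Returns the buffer (the
   caller frees it) and stores the resulting line length in *OUT_LEN,
   or returns NULL if the allocation fails.  */
static char *
copy_line (const char *src, size_t len, size_t *out_len)
{
  assert (src != NULL && out_len != NULL);

  /* Only strip the last byte when it is a newline.  */
  if (len > 0 && src[len - 1] == '\n')
    len--;

  /* One extra byte for the terminator, so a last line without '\n'
     is still safely terminated.  */
  char *dst = (char *) malloc (len + 1);
  if (dst == NULL)
    return NULL;

  memcpy (dst, src, len);
  dst[len] = '\0';
  *out_len = len;
  return dst;
}
```

Because the length is returned separately, a caller can still handle lines with embedded zero bytes without relying on the terminator.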
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Sorry, I missed one question in the previous email.

Bernd Edlinger writes:

> and what is it if there is no terminal '\n' ?

In that case the entire file is made of one line. For that case get_next_line has allocated enough space for one byte past the end of the file, so there is no buffer overflow here.

-- Dodji
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Bernd Edlinger writes: >>> Using -- on a value that goes out of scope looks >>> awkward IMHO. >> >> I don't understand this sentence. What do you mean by "Using -- on a >> value that goes out of scope"? >> > > I meant the operator -- in *line_len = --len; Sorry, I don't see how that is an issue. This looks like a classical way of passing an output parameter to me. > Maybe, You could also avoid the copying completely, if you just hand out > a pointer to the line buffer as const char*, and use the length instead of the > nul-char as end delimiter ? I thought about avoiding the copying of course. But the issue with that is that that ties the lifetime of the returned line to the time between two invocations of read_next_line. IOW, you'd have to use the line "quickly" before calling read_next_line again. Actually that non-copying API that you are talking about exists in the patch; it's get_next_line. And you see that it's what we use when we want to avoid the copying, e.g, in goto_next_line. But when we want to give the "final" user the string, I believe that copying is less surprising. And from what I could see from the tests I have done, the copying doesn't make the thing slower than without the patch. So I'd like to keep this unless folks have very strong feeling about it. -- Dodji
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
> >>> + memmove (*line, l, len); >>> + (*line)[len - 1] = '\0'; >>> + *line_len = --len; >> >> Generally, I would prefer to use memcpy, >> if it is clear that the memory does not overlap. > > I don't mind. I'll change that in my local copy. Thanks. > >> You copy one char too much and set it to zero? > > It's not one char too much. That char is the terminal '\n' in most > cases. > and what is it if there is no terminal '\n' ? >> Using -- on a value that goes out of scope looks >> awkward IMHO. > > I don't understand this sentence. What do you mean by "Using -- on a > value that goes out of scope"? > I meant the operator -- in *line_len = --len; Maybe, You could also avoid the copying completely, if you just hand out a pointer to the line buffer as const char*, and use the length instead of the nul-char as end delimiter ? Bernd.
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Bernd Edlinger writes: >> + memmove (*line, l, len); >> + (*line)[len - 1] = '\0'; >> + *line_len = --len; > > Generally, I would prefer to use memcpy, > if it is clear that the memory does not overlap. I don't mind. I'll change that in my local copy. Thanks. > You copy one char too much and set it to zero? It's not one char too much. That char is the terminal '\n' in most cases. > Using -- on a value that goes out of scope looks > awkward IMHO. I don't understand this sentence. What do you mean by "Using -- on a value that goes out of scope"? -- Dodji
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Hi, On Tue, 12 Nov 2013 16:33:41, Dodji Seketeli wrote: > > +/* Reads the next line from FILE into *LINE. If *LINE is too small > + (or NULL) it is allocated (or extended) to have enough space to > + containe the line. *LINE_LENGTH must contain the size of the > + initial*LINE buffer. It's then updated by this function to the > + actual length of the returned line. Note that the returned line > + can contain several zero bytes. Also note that the returned string > + is allocated in static storage that is going to be re-used by > + subsequent invocations of read_line. */ > + > +static bool > +read_next_line (fcache *cache, char ** line, ssize_t *line_len) > +{ > + char *l = NULL; > + ssize_t len = get_next_line (cache, &l); > + > + if (len> 0) > + { > + if (*line == NULL) > { > - string[pos + len - 1] = 0; > - return string; > + *line = XNEWVEC (char, len); > + *line_len = len; > } > - pos += len; > - string = XRESIZEVEC (char, string, string_len * 2); > - string_len *= 2; > + else > + if (*line_len < len) > + *line = XRESIZEVEC (char, *line, len); > + > + memmove (*line, l, len); > + (*line)[len - 1] = '\0'; > + *line_len = --len; Generally, I would prefer to use memcpy, if it is clear that the memory does not overlap. You copy one char too much and set it to zero? Using -- on a value that goes out of scope looks awkward IMHO. Bernd. > + return true; > } > - > - return pos ? string : NULL; > + > + return false; > +}
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Hello,

Below is the updated patch amended to take your previous comments into account.

In add_file_to_cache_tab the evicted cache array entry is the one that was least used. Incidentally I also fixed some thinkos and issues that I have seen in the previous patch.

Bootstrapped on x86_64-unknown-linux-gnu against trunk.

libcpp/ChangeLog:

    * include/line-map.h (linemap_get_file_highest_location):
    Declare new function.
    * line-map.c (linemap_get_file_highest_location): Define it.

gcc/ChangeLog:

    * input.h (location_get_source_line): Take an additional line_size
    parameter.
    (void diagnostics_file_cache_fini): Declare new function.
    * input.c (struct fcache): New type.
    (fcache_tab_size, fcache_buffer_size, fcache_line_record_size):
    New static constants.
    (diagnostic_file_cache_init, total_lines_num)
    (lookup_file_in_cache_tab, evicted_cache_tab_entry)
    (add_file_to_cache_tab, lookup_or_add_file_to_cache_tab)
    (needs_read, needs_grow, maybe_grow, read_data, maybe_read_data)
    (get_next_line, read_next_line, goto_next_line, read_line_num):
    New static function definitions.
    (diagnostic_file_cache_fini): New function.
    (location_get_source_line): Take an additional output line_len
    parameter. Re-write using lookup_or_add_file_to_cache_tab and
    read_line_num.
    * diagnostic.c (diagnostic_finish): Call
    diagnostic_file_cache_fini.
    (adjust_line): Take an additional input parameter for the length
    of the line, rather than calculating it with strlen.
    (diagnostic_show_locus): Adjust the use of
    location_get_source_line and adjust_line with respect to their new
    signature. While displaying a line now, do not stop at the first
    null byte. Rather, display the zero byte as a space and keep
    going until we reach the size of the line.
    * Makefile.in: Add vec.o to OBJS-libcommon

gcc/testsuite/ChangeLog:

    * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@204453 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/Makefile.in| 3 +- gcc/diagnostic.c | 19 +- gcc/diagnostic.h | 1 + gcc/input.c| 637 - gcc/input.h| 5 +- .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes libcpp/include/line-map.h | 8 + libcpp/line-map.c | 40 ++ 8 files changed, 674 insertions(+), 39 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 49285e5..9fe9060 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1469,7 +1469,8 @@ OBJS = \ # Objects in libcommon.a, potentially used by all host binaries and with # no target dependencies. -OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o input.o version.o +OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o \ + vec.o input.o version.o # Objects in libcommon-target.a, used by drivers and by the core # compiler and containing target-dependent code. diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 36094a1..6c83f03 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -176,6 +176,8 @@ diagnostic_finish (diagnostic_context *context) progname); pp_newline_and_flush (context->printer); } + + diagnostic_file_cache_fini (); } /* Initialize DIAGNOSTIC, where the message MSG has already been @@ -259,12 +261,13 @@ diagnostic_build_prefix (diagnostic_context *context, MAX_WIDTH by some margin, then adjust the start of the line such that the COLUMN is smaller than MAX_WIDTH minus the margin. The margin is either 10 characters or the difference between the column - and the length of the line, whatever is smaller. */ + and the length of the line, whatever is smaller. The length of + LINE is given by LINE_WIDTH. 
*/ static const char * -adjust_line (const char *line, int max_width, int *column_p) +adjust_line (const char *line, int line_width, +int max_width, int *column_p) { int right_margin = 10; - int line_width = strlen (line); int column = *column_p; right_margin = MIN (line_width - column, right_margin); @@ -284,6 +287,7 @@ diagnostic_show_locus (diagnostic_context * context, const diagnostic_info *diagnostic) { const char *line; + int line_width; char *buffer; expanded_location s; int max_width; @@ -297,22 +301,25 @@ diagnostic_show_locus (diagnostic_context * context, context->last_location = diagnostic->location; s = expand_location_to_spelling_point (diagnostic->lo
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Jakub Jelinek writes: >> -OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o >> input.o version.o >> +OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o >> vec.o input.o version.o > > Too long line? Fixed in my local copy of the patch, thanks. > >> + if (c == '\0') >> +c = ' '; >>pp_character (context->printer, c); > > Does that match how libcpp counts the embedded '\0' character in column > computation? Yes, I think so. _cpp_lex_direct in libcpp/lex.c considers '\0' just like a white space and so the column number is incremented when it's encountered. >> +/* The position (byte count) the the last byte of the line. This >> + normally points to the '\n' character, or to one byte after the >> + last byte of the file, if the file doesn't contain a '\n' >> + character. */ >> +size_t end_pos; > > Does it really help to note this? You can always just walk the line from > start_pos looking for '\n' or end of file. Yes you are right, it's not strictly necessary. But with that end_pos, copying a line is even faster; no need of walking. I thought the goal was to avoid re-doing the work we've already done, as much as possible. > >> +static fcache* >> +add_file_to_cache_tab (const char *file_path) >> +{ >> + static size_t idx; >> + fcache *r; >> + >> + FILE *fp = fopen (file_path, "r"); >> + if (ferror (fp)) >> +{ >> + fclose (fp); >> + return NULL; >> +} >> + >> + r = &fcache_tab[idx]; > > Wouldn't it be better to use some LRU algorithm to determine which > file to kick out of the cache? Have some tick counter of cache lookups (or > creation) and assign the tick counter to the just created resp. used > cache entry, and remove the one with the smallest counter? Hehe, the LRU idea occurred to me too, but I dismissed the idea as something probably over-engineered. But now that you are mentioning it I guess I should give it a try ;-) I'll post a patch about that later then. 
>> + fcache * r = lookup_file_in_cache_tab (file_path); >> + if (r == NULL) > > Formatting (no space after *, extra space after ==). Fixed in my local copy. Thanks. -- Dodji
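The tick-counter eviction that Jakub describes above can be sketched roughly as follows. This is only an illustration under stated assumptions: struct slot, CACHE_SLOTS, cache_lookup and cache_add are hypothetical names, not the fcache code from the patch, and the sketch stores only the path pointer (which the caller must keep alive) rather than any file data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_SLOTS 4   /* illustrative size, not the patch's fcache_tab_size */

struct slot
{
  const char *path;        /* NULL while the slot is unused */
  unsigned long last_use;  /* tick of the most recent add or lookup */
};

static struct slot slots[CACHE_SLOTS];
static unsigned long tick;

/* Return the slot caching PATH, or NULL if there is none.  A hit
   refreshes the slot's tick so it becomes the most recently used.  */
static struct slot *
cache_lookup (const char *path)
{
  assert (path != NULL);
  for (size_t i = 0; i < CACHE_SLOTS; i++)
    if (slots[i].path != NULL && strcmp (slots[i].path, path) == 0)
      {
        slots[i].last_use = ++tick;
        return &slots[i];
      }
  return NULL;
}

/* Store PATH in an unused slot if there is one, otherwise in the slot
   with the smallest tick, i.e. the least recently used one.  */
static struct slot *
cache_add (const char *path)
{
  assert (path != NULL);
  struct slot *victim = &slots[0];
  for (size_t i = 0; i < CACHE_SLOTS; i++)
    {
      if (slots[i].path == NULL)
        {
          victim = &slots[i];
          break;
        }
      if (slots[i].last_use < victim->last_use)
        victim = &slots[i];
    }
  victim->path = path;
  victim->last_use = ++tick;
  return victim;
}
```

With a table this small, a linear scan for the smallest tick is cheap, which is presumably why a simple counter is enough here and no linked list is needed.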
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Mon, Nov 11, 2013 at 11:19:21AM +0100, Dodji Seketeli wrote: > .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes > libcpp/include/line-map.h | 8 + > libcpp/line-map.c | 40 ++ > 8 files changed, 585 insertions(+), 39 deletions(-) > create mode 100644 > gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index 49285e5..50c2482 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1469,7 +1469,7 @@ OBJS = \ > > # Objects in libcommon.a, potentially used by all host binaries and with > # no target dependencies. > -OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o > input.o version.o > +OBJS-libcommon = diagnostic.o diagnostic-color.o pretty-print.o intl.o vec.o > input.o version.o Too long line? > + if (c == '\0') > + c = ' '; >pp_character (context->printer, c); Does that match how libcpp counts the embedded '\0' character in column computation? > +/* The position (byte count) the the last byte of the line. This > + normally points to the '\n' character, or to one byte after the > + last byte of the file, if the file doesn't contain a '\n' > + character. */ > +size_t end_pos; Does it really help to note this? You can always just walk the line from start_pos looking for '\n' or end of file. > +static fcache* > +add_file_to_cache_tab (const char *file_path) > +{ > + static size_t idx; > + fcache *r; > + > + FILE *fp = fopen (file_path, "r"); > + if (ferror (fp)) > +{ > + fclose (fp); > + return NULL; > +} > + > + r = &fcache_tab[idx]; Wouldn't it be better to use some LRU algorithm to determine which file to kick out of the cache? Have some tick counter of cache lookups (or creation) and assign the tick counter to the just created resp. used cache entry, and remove the one with the smallest counter? > + fcache * r = lookup_file_in_cache_tab (file_path); > + if (r == NULL) Formatting (no space after *, extra space after ==). Jakub
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Hello, As it appeared that concerns about the speed of location_get_source_line were as present as the need of just fixing this bug, I have conflated the two concerns in a new attempt below, trying to address the points you guys have raised during the previous reviews. The patch below introduces a cache for the data read from the file we want to emit caret diagnostic for. In that cache it stashes the bytes read from the file as well as a number of positions of line delimiters that we encountered while reading the file. It keeps a number of the last file caches in memory in case location_get_source_line is later asked for lines from the same file. To avoid exploding the memory consumption, the number line delimiter position saved is fixed (100). So if a file is smaller than 100 lines all of its line positions can be saved. That is, if location_get_source_line is first asked to return line 20, all the position of the lines encountered since the beginning of the file -- up to line 20 -- are going to be saved in the cache. Next time, if location_get_source_line is asked to return line 10, as the position of the beginning/end of line 10 is saved in the cache, returning that line is fast. If it's asked to return line 25, it will have to start reading from line 20, not from the beginning of the file. If the file is bigger than 100, then the patch just saves 100 line positions. To evenly spread the line position saved, it needs to know the total number lines of the file. Luckily we can usually get this information from the line map subsystem (from libcpp). The patch thus adds a new entry point in the line map (linemap_get_file_highest_location) that gives the greatest source_location seen for a given file and uses that to decide what line position to save in the cache. 
The speed gain I have seen is variable, depending on the size (in number of quasi-adjacent lines) of the diagnostics, but in some pathological cases I have seen, it can divide the time spent displaying the diagnostics by two or more. I had to add hackery in the code to measure this, unfortunately :-(

The patch doesn't try to reuse the same infrastructure for gcov for now. I am leaving that for later, when I have more time.

Bootstrapped on x86_64-unknown-linux-gnu against trunk.

PS: To ease the review (especially for Tom Tromey, whom I am CC-ing because of the new entry point in the line map sub-system) I am attaching the cover letter of the patch itself that does the analysis of the initial bug. Sorry to the other addressees of this message for the redundancy.

Thanks.

>8<---

In this problem report, the compiler is fed a (bogus) translation unit in which some literals contain bytes whose value is zero. The preprocessor detects that and proceeds to emit diagnostics for that kind of bogus literals. But then when the diagnostics machinery re-reads the input file to display the bogus literals with a caret, it attempts to calculate the length of each of the lines it got using fgets. The line length calculation is done using strlen. But that doesn't work well when the content of the line can have several zero bytes. The result is that read_line never sees the end of the line because strlen repeatedly reports that the line ends before the end-of-line character; so read_line thinks its buffer for reading the line is too small; it thus increases the buffer, leading to a huge memory consumption and disaster.

Here is what this patch does. location_get_source_line is modified to return the length of a source line that can now contain bytes with zero value. diagnostic_show_locus() is then modified to consider that a line can have characters of value zero, and so just shows a white space when instructed to display one of these characters.
Additionally location_get_source_line is modified to avoid re-reading each and every line from the beginning of the file until it reaches the line number N that it is instructed to get; this was leading to annoying quadratic behaviour when reading adjacent lines near the end of (big) files. So a cache is now associated to the file opened in text mode. When the content of the file is read, that content is stashed in the file cache. That file cache is searched for line delimiters. A number of line positions are saved in the cache and a number of file caches are kept in memory. That way when location_get_source_line is asked to read line N + 1, it just has to start reading from line N that it has already read. >8<--- And now the real patch. libcpp/ChangeLog: * include/line-map.h (linemap_get_file_highest_location): Declare new function. * line-map.c (linemap_get_file_highest_location): Define it. gcc/ChangeLog: * input.h (location_get_source_line): Take an additional line_size parameter. (void diagno
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Sorry Dodji, I still do not see how this is supposed to work:

If the previous invocation of get_line had already read some characters of the following line(s), how is that information recovered? I see it is copied behind lineptr[cur_len]. But when get_line is re-entered, cur_len is set to zero again, and all that content, up to 16K, is forgotten. Right?

If an empty line of just a new-line is read, the return value of get_line is 0, and string is "". But the return value of read_line is NULL in that case. Now the function location_get_source_line will leave the while loop. But there may be more lines, probably not just empty ones?

How did you test your patch?

Regards
Bernd.
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Bernd Edlinger writes:

> If you want to have at least a chance to survive something like:
>
> dd if=/dev/zero of=test.c bs=10240 count=1000
>
> gcc -Wall test.c
>
> Then you should change the implementation of read_line to
> _not_ returning something like 100GB of zeros.

I'd say that in that case, we'd rather just die in an OOM condition and be done with it. Otherwise, I fear that read_line might become too slow; you'd have to detect that the content is just zeros, for instance.

> IMHO it would be nice to limit lines returned to 10.000 bytes,
> maybe add "..." or "" if the limit is reached.

In general, setting a limit for pathological cases like this is a good idea, I think. But that seems a bit out of the scope of this particular bug fix; we'd need to, e.g., define a new command-line argument to extend that limit if need be. If people really feel strongly about this I can propose a later patch to set a limit in get_line and define a command-line argument that would override that parameter.

> And maybe it would make the life of read_line's callers lots easier
> if the zero-chars are silently replaced with spaces in the returned
> line buffer.

As speed seemed to be a concern (even if, in my opinion, we are dealing with diagnostics that are being emitted when the compilation has been halted anyway, so we shouldn't be too concerned, unless we are talking about pathological cases), I think that read_line should be fast by default. If a particular caller doesn't want to see the zeros (and thus is ready to pay the speed price) then it can replace the zeros with white space. Otherwise, let's have read_line be as fast as possible.

Also keep in mind that in subsequent patches, read_line might be re-used by, e.g., gcov in nominal contexts where we don't have zeros in the middle of the line. In that case, speed can be a concern.

Thanks for the helpful thoughts.

-- Dodji
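The "let the caller replace the zeros" approach from the message above can be sketched like this. It is a hedged illustration modelled on the display loop in the quoted diagnostic_show_locus hunk, not the actual GCC code: render_line is a hypothetical helper that writes into a caller-supplied buffer instead of calling pp_character.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy at most MAX_WIDTH bytes of the
   LINE_WIDTH-byte LINE into OUT, mapping tabs and zero bytes to a
   space.  OUT must have room for MAX_WIDTH + 1 bytes.  The loop is
   bounded by the explicit length, never by a NUL terminator.
   Returns the number of bytes written, not counting the terminator.  */
static size_t
render_line (const char *line, size_t line_width, size_t max_width,
             char *out)
{
  assert (line != NULL && out != NULL);

  size_t n = 0;
  while (n < max_width && n < line_width)
    {
      char c = line[n];
      if (c == '\t' || c == '\0')
        c = ' ';
      out[n++] = c;
    }
  out[n] = '\0';
  return n;
}
```

Only the caller that prints the line pays for the substitution, so the reader itself can hand back the raw bytes unchanged.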
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Hi, you're welcome. Just one more thought on the design. If you want to have at least a chance to survive something like: dd if=/dev/zero of=test.c bs=10240 count=1000 gcc -Wall test.c Then you should change the implementation of read_line to _not_ returning something like 100GB of zeros. IMHO it would be nice to limit lines returned to 10.000 bytes, maybe add "..." or "" if the limit is reached. Just skip over-sized bytes until the newline is consumed, to make the line numbers consistent. And maybe it would make the life of read_line's callers lots easier if the zero-chars are silently replaced with spaces in the returned line buffer. That would allow to keep the current interface, and somehow reduce the complexity of this patch. What do you think? Regards Bernd. On Tue, 5 Nov 2013 10:41:19, Dodji Seketeli wrote: > > Bernd Edlinger writes: > > [...] > >>> if (!string_len) >>> { >>> string_len = 200; >>> - string = XNEWVEC (char, string_len); >>> + string = XCNEWVEC (char, string_len); >>> } >>> + else >>> + memset (string, 0, string_len); >> >> Is this memset still necessary? > > Of course not ... > > [...] > >> If "ptr" is passed to get_line it will try to reallocate it, >> which must fail, right? >> >> Maybe, this line of code is unreachable? >> >> Who is responsible for reallocating "string" get_line or read_line? > > Correct, these are real concerns. > > > I am wondering what I was thinking. Actually, I think read_line should > almost just call get_line now. Like what is done in the new version of > the patch below; basically if there is a line to return, read_line just > gets it (the static buffer containing the line) from get_line and > returns it, otherwise the static buffer containing the last read line is > left untouched and read_line returns a NULL constant. 
> > > I guess this resolves the valid concern that you raised below: > >> If the previous invocation of read_line already had read some >> characters of the following line, how is that information >> recovered? How is it detected if another file is to be read this >> time? > > Thank you very much for this thorough review. > > Here is the updated patch that I am bootstrapping: > > gcc/ChangeLog: > > * input.h (location_get_source_line): Take an additional line_size > parameter. > * input.c (get_line): New static function definition. > (read_line): Take an additional line_length output parameter to be > set to the size of the line. Use the new get_line function do the > actual line reading. > (location_get_source_line): Take an additional output line_len > parameter. Update the use of read_line to pass it the line_len > parameter. > * diagnostic.c (adjust_line): Take an additional input parameter > for the length of the line, rather than calculating it with > strlen. > (diagnostic_show_locus): Adjust the use of > location_get_source_line and adjust_line with respect to their new > signature. While displaying a line now, do not stop at the first > null byte. Rather, display the zero byte as a space and keep > going until we reach the size of the line. > > gcc/testsuite/ChangeLog: > > * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. > --- > gcc/diagnostic.c | 17 ++-- > gcc/input.c | 111 - > gcc/input.h | 3 +- > .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes > 4 files changed, 97 insertions(+), 34 deletions(-) > create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c > > diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c > index 36094a1..e0c5d9d 100644 > --- a/gcc/diagnostic.c > +++ b/gcc/diagnostic.c > @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context, > MAX_WIDTH by some margin, then adjust the start of the line such > that the COLUMN is smaller than MAX_WIDTH minus the margin. 
The > margin is either 10 characters or the difference between the column > - and the length of the line, whatever is smaller. */ > + and the length of the line, whatever is smaller. The length of > + LINE is given by LINE_WIDTH. */ > static const char * > -adjust_line (const char *line, int max_width, int *column_p) > +adjust_line (const char *line, int line_width, > + int max_width, int *column_p) > { > int right_margin = 10; > - int line_width = strlen (line); > int column = *column_p; > > right_margin = MIN (line_width - column, right_margin); > @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context, > const diagnostic_info *diagnostic) > { > const char *line; > + int line_width; > char *buffer; > expanded_location s; > int max_width; > @@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context, > > context->last_location = diagnostic->location; > s = expand_location_to_spelling_point (diagnostic->location); > - line = location_get_source_line (s); > + line = location_get_source_line (s, &line_width); > if (line == NULL) > return; > > max_width = context->caret_max_width; > - line = adjust_line (line, max_width,
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Bernd Edlinger writes: [...] >> if (!string_len) >> { >> string_len = 200; >> - string = XNEWVEC (char, string_len); >> + string = XCNEWVEC (char, string_len); >> } >> + else >> + memset (string, 0, string_len); > > Is this memset still necessary? Of course not ... [...] > If "ptr" is passed to get_line it will try to reallocate it, > which must fail, right? > > Maybe, this line of code is unreachable? > > Who is responsible for reallocating "string" get_line or read_line? Correct, these are real concerns. I am wondering what I was thinking. Actually, I think read_line should almost just call get_line now. Like what is done in the new version of the patch below; basically if there is a line to return, read_line just gets it (the static buffer containing the line) from get_line and returns it, otherwise the static buffer containing the last read line is left untouched and read_line returns a NULL constant. I guess this resolves the valid concern that you raised below: > If the previous invocation of read_line already had read some > characters of the following line, how is that information > recovered? How is it detected if another file is to be read this > time? Thank you very much for this thorough review. Here is the updated patch that I am bootstrapping: gcc/ChangeLog: * input.h (location_get_source_line): Take an additional line_size parameter. * input.c (get_line): New static function definition. (read_line): Take an additional line_length output parameter to be set to the size of the line. Use the new get_line function do the actual line reading. (location_get_source_line): Take an additional output line_len parameter. Update the use of read_line to pass it the line_len parameter. * diagnostic.c (adjust_line): Take an additional input parameter for the length of the line, rather than calculating it with strlen. (diagnostic_show_locus): Adjust the use of location_get_source_line and adjust_line with respect to their new signature. 
While displaying a line now, do not stop at the first null byte. Rather, display the zero byte as a space and keep going until we reach the size of the line. gcc/testsuite/ChangeLog: * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. --- gcc/diagnostic.c | 17 ++-- gcc/input.c| 111 - gcc/input.h| 3 +- .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes 4 files changed, 97 insertions(+), 34 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 36094a1..e0c5d9d 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context, MAX_WIDTH by some margin, then adjust the start of the line such that the COLUMN is smaller than MAX_WIDTH minus the margin. The margin is either 10 characters or the difference between the column - and the length of the line, whatever is smaller. */ + and the length of the line, whatever is smaller. The length of + LINE is given by LINE_WIDTH. 
*/ static const char * -adjust_line (const char *line, int max_width, int *column_p) +adjust_line (const char *line, int line_width, +int max_width, int *column_p) { int right_margin = 10; - int line_width = strlen (line); int column = *column_p; right_margin = MIN (line_width - column, right_margin); @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context, const diagnostic_info *diagnostic) { const char *line; + int line_width; char *buffer; expanded_location s; int max_width; @@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context, context->last_location = diagnostic->location; s = expand_location_to_spelling_point (diagnostic->location); - line = location_get_source_line (s); + line = location_get_source_line (s, &line_width); if (line == NULL) return; max_width = context->caret_max_width; - line = adjust_line (line, max_width, &(s.column)); + line = adjust_line (line, line_width, max_width, &(s.column)); pp_newline (context->printer); saved_prefix = pp_get_prefix (context->printer); pp_set_prefix (context->printer, NULL); pp_space (context->printer); - while (max_width > 0 && *line != '\0') + while (max_width > 0 && line_width > 0) { char c = *line == '\t' ? ' ' : *line; + if (c == '\0') + c = ' '; pp_character (context->printer, c); max_width--; + line_width--; line++; } pp_newline (context->printer); diff --git a/gcc/input.c b/gcc/input.c index a141a92..9526d88 100644 --- a/gcc/input.c +++
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Hi, On Mon, 4 Nov 2013 16:40:38, Dodji Seketeli wrote: > +static ssize_t > +get_line (char **lineptr, size_t *n, FILE *fp) > +{ > + ssize_t cur_len = 0, len; > + char buf[16384]; > + > + if (lineptr == NULL || n == NULL) > + return -1; > + > + if (*lineptr == NULL || *n == 0) > + { > + *n = 120; > + *lineptr = XNEWVEC (char, *n); > + } > + > + len = fread (buf, 1, sizeof buf, fp); > + if (ferror (fp)) > + return -1; > + > + for (;;) > + { > + size_t needed; > + char *t = (char*) memchr (buf, '\n', len); > + if (t != NULL) len = (t - buf) + 1; > + if (__builtin_expect (len>= SSIZE_MAX - cur_len, 0)) > + return -1; > + needed = cur_len + len + 1; > + if (needed> *n) > + { > + char *new_lineptr; > + if (needed < 2 * *n) > + needed = 2 * *n; > + new_lineptr = XRESIZEVEC (char, *lineptr, needed); > + *lineptr = new_lineptr; > + *n = needed; > + } > + memcpy (*lineptr + cur_len, buf, len); > + cur_len += len; > + if (t != NULL) > + break; > + len = fread (buf, 1, sizeof buf, fp); > + if (ferror (fp)) > + return -1; > + if (len == 0) > + break; > + } > + (*lineptr)[cur_len] = '\0'; > + return cur_len; > +} > + > +/* Reads one line from FILE into a static buffer. LINE_LENGTH is set > + by this function to the length of the returned line. Note that the > + returned line can contain several zero bytes. */ > static const char * > -read_line (FILE *file) > +read_line (FILE *file, int *line_length) > { > static char *string; > - static size_t string_len; > + static size_t string_len, cur_len; > size_t pos = 0; > char *ptr; > > if (!string_len) > { > string_len = 200; > - string = XNEWVEC (char, string_len); > + string = XCNEWVEC (char, string_len); > } > + else > + memset (string, 0, string_len); Is this memset still necessary? If the previous invocation of read_line already had read some characters of the following line, how is that information recovered? How is it detected if another file is to be read this time? 
> > - while ((ptr = fgets (string + pos, string_len - pos, file))) > + ptr = string; > + cur_len = string_len; > + while (size_t len = get_line (&ptr, &cur_len, file)) > { > - size_t len = strlen (string + pos); > - > - if (string[pos + len - 1] == '\n') > + if (ptr[len - 1] == '\n') > { > - string[pos + len - 1] = 0; > + ptr[len - 1] = 0; > + *line_length = len; > return string; > } > pos += len; > string = XRESIZEVEC (char, string, string_len * 2); > string_len *= 2; > - } > - > + ptr = string + pos; If "ptr" is passed to get_line it will try to reallocate it, which must fail, right? Maybe, this line of code is unreachable? Who is responsible for reallocating "string" get_line or read_line? > + cur_len = string_len - pos; > + } > + > + *line_length = pos ? string_len : 0; > return pos ? string : NULL; > } > > /* Return the physical source line that corresponds to xloc in a > buffer that is statically allocated. The newline is replaced by > - the null character. */ > + the null character. Note that the line can contain several null > + characters, so LINE_LEN contains the actual length of the line. */ > > const char * > -location_get_source_line (expanded_location xloc) > +location_get_source_line (expanded_location xloc, > + int& line_len) > { > const char *buffer; > int lines = 1; > @@ -132,7 +204,7 @@ location_get_source_line (expanded_location xloc) > if (!stream) > return NULL; > > - while ((buffer = read_line (stream)) && lines < xloc.line) > + while ((buffer = read_line (stream, &line_len)) && lines < xloc.line) > lines++; > > fclose (stream); Regards Bernd.
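The first question above (what happens to bytes of the following line that an earlier fread already consumed) can be illustrated with a small sketch. This is not the code from the patch: `struct line_reader`, `line_reader_init` and `line_reader_next` are hypothetical names, and the sketch assumes the leftover bytes are kept in a per-file reader object between calls, so nothing past the newline is thrown away and a different file simply gets its own reader.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reader state: one per open file.  */
struct line_reader
{
  FILE *fp;
  char buf[4096];
  size_t pos, len;   /* unread bytes are buf[pos .. len) */
};

static void
line_reader_init (struct line_reader *r, FILE *fp)
{
  assert (r != NULL && fp != NULL);
  r->fp = fp;
  r->pos = r->len = 0;
}

/* Copy the next line (including its '\n', if any) into *LINE, growing
   the buffer with realloc as needed.  Embedded NUL bytes are kept.
   Returns the line length, or -1 at end of file or on failure.  */
static long
line_reader_next (struct line_reader *r, char **line, size_t *cap)
{
  size_t out = 0;
  for (;;)
    {
      if (r->pos == r->len)
        {
          /* Buffer exhausted: refill it from the file.  */
          r->len = fread (r->buf, 1, sizeof r->buf, r->fp);
          r->pos = 0;
          if (r->len == 0)
            break;
        }
      char *start = r->buf + r->pos;
      size_t avail = r->len - r->pos;
      char *nl = memchr (start, '\n', avail);
      size_t take = nl ? (size_t) (nl - start) + 1 : avail;
      if (out + take + 1 > *cap)
        {
          size_t want = (out + take + 1) * 2;
          char *grown = realloc (*line, want);
          if (grown == NULL)
            return -1;
          *line = grown;
          *cap = want;
        }
      memcpy (*line + out, start, take);
      out += take;
      r->pos += take;   /* bytes after the newline stay in buf */
      if (nl)
        break;
    }
  if (out == 0)
    return -1;
  (*line)[out] = '\0';
  return (long) out;
}
```

In this sketch the caller owns the line buffer and the reader owns the read-ahead, which is one possible answer to the question of who is responsible for reallocation.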
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Jakub Jelinek writes:

[...]

> Eventually, I think using int for sizes is just a ticking bomb, what if
> somebody uses > 2GB long lines? Surely, on 32-bit hosts we are unlikely to
> handle it, but why couldn't 64-bit host handle it? Column info maybe bogus
> in there, sure, but at least we should not crash or overflow buffers on it
> ;). Anyway, not something needed to be fixed right now, but in the future
> it would be nicer to use size_t and/or ssize_t here.

Yes. I initially tried to use size_t but found that I'd need to modify several other places to silence many warnings, because these places were using int :-(. So I felt that would be a battle for later. But I am adding this to my TODO. I'll send a patch later that changes this to size_t, and adjusts the other places that need it as well.

[...]

>>    context->last_location = diagnostic->location;
>>    s = expand_location_to_spelling_point (diagnostic->location);
>> -  line = location_get_source_line (s);
>> +  line = location_get_source_line (s, line_width);
>
> I think richi didn't like C++ reference arguments to be used that way (and
> perhaps guidelines don't either), because it isn't immediately obvious
> that line_width is modified by the call. Can you change it to a pointer
> argument instead and pass &line_width?

Sure. I have done the change in the patch below. Sorry for this reflex. I tend to use pointers like these only in places where we can allow them to be NULL.

> XNEWVEC or XRESIZEVEC will never return NULL though, so it doesn't have
> to be tested. Though, the question is if that is what we want, caret
> diagnostics should be optional, if we can't print it, we just won't.

Hmm. This particular bug was noticed because of the explicit OOM message displayed by XNEWVEC/XRESIZEVEC; otherwise, I bet this could have just fallen through the cracks for a little longer. So I'd say let's just use XNEWVEC/XRESIZEVEC and remove the test, as you first suggested.
The caret diagnostics functionality as a whole can be disabled with -fno-diagnostic-show-caret. [...] > Otherwise, LGTM. Thanks. So here is the patch that bootstraps. gcc/ChangeLog: * input.h (location_get_source_line): Take an additional line_size parameter by reference. * input.c (get_line): New static function definition. (read_line): Take an additional line_length output parameter to be set to the size of the line. Use the new get_line function to compute the size of the line returned by fgets, rather than using strlen. Ensure that the buffer is initially zeroed; ensure that when growing the buffer too. (location_get_source_line): Take an additional output line_len parameter. Update the use of read_line to pass it the line_len parameter. * diagnostic.c (adjust_line): Take an additional input parameter for the length of the line, rather than calculating it with strlen. (diagnostic_show_locus): Adjust the use of location_get_source_line and adjust_line with respect to their new signature. While displaying a line now, do not stop at the first null byte. Rather, display the zero byte as a space and keep going until we reach the size of the line. gcc/testsuite/ChangeLog: * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. --- gcc/diagnostic.c | 17 ++-- gcc/input.c| 100 ++--- gcc/input.h| 3 +- .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes 4 files changed, 99 insertions(+), 21 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 36094a1..0ca7081 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context, MAX_WIDTH by some margin, then adjust the start of the line such that the COLUMN is smaller than MAX_WIDTH minus the margin. The margin is either 10 characters or the difference between the column - and the length of the line, whatever is smaller. 
*/ + and the length of the line, whatever is smaller. The length of + LINE is given by LINE_WIDTH. */ static const char * -adjust_line (const char *line, int max_width, int *column_p) +adjust_line (const char *line, int line_width, +int max_width, int *column_p) { int right_margin = 10; - int line_width = strlen (line); int column = *column_p; right_margin = MIN (line_width - column, right_margin); @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context, const diagnostic_info *diagnostic) { const char *line; + int line_width; char *buffer; expanded_location s; int max_width; @@ -297,22 +299,25 @@ diagnostic_show_locus (diag
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Bernd Edlinger writes:

> I see another "read_line" at gcov.c, which seems to be a copy.
>
> Maybe this should be changed too?

I have seen it as well. I'd rather have the patch reviewed first, and only then propose sharing the implementation with the gcov module.

-- 
Dodji
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
> On Mon, Nov 04, 2013 at 12:59:49PM +0100, Bernd Edlinger wrote:
>> I see another "read_line" at gcov.c, which seems to be a copy.
>
> Copy of what? gcov.c read_line hardly can be allowed to fail because out of
> mem unlike this one for caret diagnostics.
> Though, surely, this one could be somewhat adjusted so that it really
> doesn't use a temporary buffer but reads directly into the initially
> malloced, then realloced, buffer. But, if we want it to eventually switch
> to caching the caret diagnostics, it won't be possible/desirable anymore.
>
> Jakub

gcov.c and input.c currently both have a static function "read_line"; the two are currently 100% in sync. Both _can_ fail if the file gets deleted or modified while the function executes. If gcov.c crashes in that event, I'd call it a bug.

Bernd.
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Mon, Nov 04, 2013 at 12:59:49PM +0100, Bernd Edlinger wrote:
> I see another "read_line" at gcov.c, which seems to be a copy.

Copy of what? gcov.c read_line hardly can be allowed to fail because out of mem unlike this one for caret diagnostics.

Though, surely, this one could be somewhat adjusted so that it really doesn't use a temporary buffer but reads directly into the initially malloced, then realloced, buffer. But, if we want it to eventually switch to caching the caret diagnostics, it won't be possible/desirable anymore.

	Jakub
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Hi, I see another "read_line" at gcov.c, which seems to be a copy. Maybe this should be changed too? What do you think? Bernd. On Mon, 4 Nov 2013 12:46:10, Dodji Seketeli wrote: > > Jakub Jelinek writes: > >> I think even as a fallback is the patch too expensive. >> I'd say best would be to write some getline API compatible function >> and just use it, using fread on say fixed size buffer. > > OK, thanks for the insight. I have just used the getline_fallback > function you proposed, slightly amended to use the memory allocation > routines commonly used in gcc and renamed into get_line, with a > hopefully complete comment explaining where this function comes from > etc. > > [...] > >> A slight complication is what to do on mingw/cygwin and other DOS or >> Mac style line ending environments, no idea what fgets exactly does >> there. > > Actually, I think that even fgets just deals with '\n'. The reason, > from what I gathered being that on windows, we fopen the input file in > text mode; and in that mode, the \r\n is transformed into just \n. > > Apparently OSX is compatible with '\n' too. Someone corrects me if I am > saying non-sense here. > > So the patch below is what I am bootstrapping at the moment. > > OK if it passes bootstrap on x86_64-unknown-linux-gnu against trunk? > >> BTW, we probably want to do something with the speed of the caret >> diagnostics too, right now it opens the file again for each single line >> to be printed in caret diagnostics and reads all lines until the right one, >> so imagine how fast is printing of many warnings on almost adjacent lines >> near the end of many megabytes long file. 
>> Perhaps we could remember the last file we've opened for caret diagnostics, >> don't fclose the file right away but only if a new request is for a >> different file, perhaps keep some vector of line start offsets (say starting >> byte of every 100th line or similar) and also remember the last read line >> offset, so if a new request is for the same file, but higher line than last, >> we can just keep getlineing, and if it is smaller line than last, we look up >> the offset of the line / 100, fseek to it and just getline only modulo 100 >> lines. Maybe we should keep not just one, but 2 or 4 opened files as cache >> (again, with the starting line offsets vectors). > > I like this idea. I'll try and work on it. > > And now the patch. > > Cheers. > > gcc/ChangeLog: > > * input.h (location_get_source_line): Take an additional line_size > parameter by reference. > * input.c (get_line): New static function definition. > (read_line): Take an additional line_length output parameter to be > set to the size of the line. Use the new get_line function to > compute the size of the line returned by fgets, rather than using > strlen. Ensure that the buffer is initially zeroed; ensure that > when growing the buffer too. > (location_get_source_line): Take an additional output line_len > parameter. Update the use of read_line to pass it the line_len > parameter. > * diagnostic.c (adjust_line): Take an additional input parameter > for the length of the line, rather than calculating it with > strlen. > (diagnostic_show_locus): Adjust the use of > location_get_source_line and adjust_line with respect to their new > signature. While displaying a line now, do not stop at the first > null byte. Rather, display the zero byte as a space and keep > going until we reach the size of the line. > > gcc/testsuite/ChangeLog: > > * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. 
> --- > gcc/diagnostic.c | 17 ++-- > gcc/input.c | 104 ++--- > gcc/input.h | 3 +- > .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes > 4 files changed, 103 insertions(+), 21 deletions(-) > create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c > > diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c > index 36094a1..0ca7081 100644 > --- a/gcc/diagnostic.c > +++ b/gcc/diagnostic.c > @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context, > MAX_WIDTH by some margin, then adjust the start of the line such > that the COLUMN is smaller than MAX_WIDTH minus the margin. The > margin is either 10 characters or the difference between the column > - and the length of the line, whatever is smaller. */ > + and the length of the line, whatever is smaller. The length of > + LINE is given by LINE_WIDTH. */ > static const char * > -adjust_line (const char *line, int max_width, int *column_p) > +adjust_line (const char *line, int line_width, > + int max_width, int *column_p) > { > int right_margin = 10; > - int line_width = strlen (line); > int column = *column_p; > > right_margin = MIN (line_width - column, right_margin); > @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context, > const diagnostic_info *diagnostic) > { > const char *line; > + int line_width; > char *buffer; > expanded_location s; > int max_width; > @@ -297,22 +299,25 @@ diagnostic_show_
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Mon, Nov 04, 2013 at 12:46:10PM +0100, Dodji Seketeli wrote: > --- a/gcc/diagnostic.c > +++ b/gcc/diagnostic.c > @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context, > MAX_WIDTH by some margin, then adjust the start of the line such > that the COLUMN is smaller than MAX_WIDTH minus the margin. The > margin is either 10 characters or the difference between the column > - and the length of the line, whatever is smaller. */ > + and the length of the line, whatever is smaller. The length of > + LINE is given by LINE_WIDTH. */ > static const char * > -adjust_line (const char *line, int max_width, int *column_p) > +adjust_line (const char *line, int line_width, > + int max_width, int *column_p) Eventually, I think using int for sizes is just a ticking bomb, what if somebody uses > 2GB long lines? Surely, on 32-bit hosts we are unlikely to handle it, but why couldn't 64-bit host handle it? Column info maybe bogus in there, sure, but at least we should not crash or overflow buffers on it ;). Anyway, not something needed to be fixed right now, but in the future it would be nicer to use size_t and/or ssize_t here. > { >int right_margin = 10; > - int line_width = strlen (line); >int column = *column_p; > >right_margin = MIN (line_width - column, right_margin); > @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context, > const diagnostic_info *diagnostic) > { >const char *line; > + int line_width; >char *buffer; >expanded_location s; >int max_width; > @@ -297,22 +299,25 @@ diagnostic_show_locus (diagnostic_context * context, > >context->last_location = diagnostic->location; >s = expand_location_to_spelling_point (diagnostic->location); > - line = location_get_source_line (s); > + line = location_get_source_line (s, line_width); I think richi didn't like C++ reference arguments to be used that way (and perhaps guidelines don't either), because it isn't immediately obvious that line_width is modified by the call. 
Can you change it to a pointer argument instead and pass &line_width? > + *lineptr = XNEWVEC (char, *n); > + if (*lineptr == NULL) > + return -1; XNEWVEC or XRESIZEVEC will never return NULL though, so it doesn't have to be tested. Though, the question is if that is what we want, caret diagnostics should be optional, if we can't print it, we just won't. So perhaps using malloc/realloc here would be better? > > const char * > -location_get_source_line (expanded_location xloc) > +location_get_source_line (expanded_location xloc, > + int& line_len) Ditto. Otherwise, LGTM. Jakub
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Jakub Jelinek writes: > I think even as a fallback is the patch too expensive. > I'd say best would be to write some getline API compatible function > and just use it, using fread on say fixed size buffer. OK, thanks for the insight. I have just used the getline_fallback function you proposed, slightly amended to use the memory allocation routines commonly used in gcc and renamed into get_line, with a hopefully complete comment explaining where this function comes from etc. [...] > A slight complication is what to do on mingw/cygwin and other DOS or > Mac style line ending environments, no idea what fgets exactly does > there. Actually, I think that even fgets just deals with '\n'. The reason, from what I gathered being that on windows, we fopen the input file in text mode; and in that mode, the \r\n is transformed into just \n. Apparently OSX is compatible with '\n' too. Someone corrects me if I am saying non-sense here. So the patch below is what I am bootstrapping at the moment. OK if it passes bootstrap on x86_64-unknown-linux-gnu against trunk? > BTW, we probably want to do something with the speed of the caret > diagnostics too, right now it opens the file again for each single line > to be printed in caret diagnostics and reads all lines until the right one, > so imagine how fast is printing of many warnings on almost adjacent lines > near the end of many megabytes long file. > Perhaps we could remember the last file we've opened for caret diagnostics, > don't fclose the file right away but only if a new request is for a > different file, perhaps keep some vector of line start offsets (say starting > byte of every 100th line or similar) and also remember the last read line > offset, so if a new request is for the same file, but higher line than last, > we can just keep getlineing, and if it is smaller line than last, we look up > the offset of the line / 100, fseek to it and just getline only modulo 100 > lines. 
Maybe we should keep not just one, but 2 or 4 opened files as cache > (again, with the starting line offsets vectors). I like this idea. I'll try and work on it. And now the patch. Cheers. gcc/ChangeLog: * input.h (location_get_source_line): Take an additional line_size parameter by reference. * input.c (get_line): New static function definition. (read_line): Take an additional line_length output parameter to be set to the size of the line. Use the new get_line function to compute the size of the line returned by fgets, rather than using strlen. Ensure that the buffer is initially zeroed; ensure that when growing the buffer too. (location_get_source_line): Take an additional output line_len parameter. Update the use of read_line to pass it the line_len parameter. * diagnostic.c (adjust_line): Take an additional input parameter for the length of the line, rather than calculating it with strlen. (diagnostic_show_locus): Adjust the use of location_get_source_line and adjust_line with respect to their new signature. While displaying a line now, do not stop at the first null byte. Rather, display the zero byte as a space and keep going until we reach the size of the line. gcc/testsuite/ChangeLog: * c-c++-common/cpp/warning-zero-in-literals-1.c: New test file. --- gcc/diagnostic.c | 17 ++-- gcc/input.c| 104 ++--- gcc/input.h| 3 +- .../c-c++-common/cpp/warning-zero-in-literals-1.c | Bin 0 -> 240 bytes 4 files changed, 103 insertions(+), 21 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c index 36094a1..0ca7081 100644 --- a/gcc/diagnostic.c +++ b/gcc/diagnostic.c @@ -259,12 +259,13 @@ diagnostic_build_prefix (diagnostic_context *context, MAX_WIDTH by some margin, then adjust the start of the line such that the COLUMN is smaller than MAX_WIDTH minus the margin. The margin is either 10 characters or the difference between the column - and the length of the line, whatever is smaller. 
*/ + and the length of the line, whatever is smaller. The length of + LINE is given by LINE_WIDTH. */ static const char * -adjust_line (const char *line, int max_width, int *column_p) +adjust_line (const char *line, int line_width, +int max_width, int *column_p) { int right_margin = 10; - int line_width = strlen (line); int column = *column_p; right_margin = MIN (line_width - column, right_margin); @@ -284,6 +285,7 @@ diagnostic_show_locus (diagnostic_context * context, const diagnostic_info *diagnostic) { const char *line; + int line_width; char *buffer; expanded_location s; int max_width; @@ -297,22 +299,25 @@ diagnostic_show
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Thu, Oct 31, 2013 at 04:00:01PM +0100, Dodji Seketeli wrote: > Jakub Jelinek writes: > > > On Thu, Oct 31, 2013 at 03:36:07PM +0100, Bernd Edlinger wrote: > >> if you want to read zero-chars, why don't you simply use fgetc, > >> optionally replacing '\0' with ' ' in read_line? > > > > Because it is too slow? > > > > getline(3) would be much better for this purpose, though of course > > it is a GNU extension in glibc and so we'd need some fallback, which > > very well could be the fgetc or something similar. > > So would getline (+ the current patch as a fallback) be acceptable? I think even as a fallback is the patch too expensive. I'd say best would be to write some getline API compatible function and just use it, using fread on say fixed size buffer (4KB or similar), then for the number of characters returned by fread that were stored into that buffer look for the line terminator there and allocate/copy to the dynamically allocated buffer. A slight complication is what to do on mingw/cygwin and other DOS or Mac style line ending environments, no idea what fgets exactly does there. But, ignoring the DOS/Mac style line endings, it would be roughly (partially from glibc iogetdelim.c). 
ssize_t
getline_fallback (char **lineptr, size_t *n, FILE *fp)
{
  ssize_t cur_len = 0, len;
  char buf[16384];

  if (lineptr == NULL || n == NULL)
    return -1;

  if (*lineptr == NULL || *n == 0)
    {
      *n = 120;
      *lineptr = (char *) malloc (*n);
      if (*lineptr == NULL)
        return -1;
    }

  len = fread (buf, 1, sizeof buf, fp);
  if (ferror (fp))
    return -1;

  for (;;)
    {
      size_t needed;
      char *t = memchr (buf, '\n', len);
      if (t != NULL) len = (t - buf) + 1;
      if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0))
        return -1;
      needed = cur_len + len + 1;
      if (needed > *n)
        {
          char *new_lineptr;
          if (needed < 2 * *n)
            needed = 2 * *n;
          new_lineptr = realloc (*lineptr, needed);
          if (new_lineptr == NULL)
            return -1;
          *lineptr = new_lineptr;
          *n = needed;
        }
      memcpy (*lineptr + cur_len, buf, len);
      cur_len += len;
      if (t != NULL)
        break;
      len = fread (buf, 1, sizeof buf, fp);
      if (ferror (fp))
        return -1;
      if (len == 0)
        break;
    }
  (*lineptr)[cur_len] = '\0';
  return cur_len;
}

For the DOS/Mac style line endings, you probably want to look at what exactly does libcpp do with them.

BTW, we probably want to do something with the speed of the caret diagnostics too, right now it opens the file again for each single line to be printed in caret diagnostics and reads all lines until the right one, so imagine how fast is printing of many warnings on almost adjacent lines near the end of many megabytes long file. Perhaps we could remember the last file we've opened for caret diagnostics, don't fclose the file right away but only if a new request is for a different file, perhaps keep some vector of line start offsets (say starting byte of every 100th line or similar) and also remember the last read line offset, so if a new request is for the same file, but higher line than last, we can just keep getlineing, and if it is smaller line than last, we look up the offset of the line / 100, fseek to it and just getline only modulo 100 lines.
Maybe we should keep not just one, but 2 or 4 opened files as cache (again, with the starting line offsets vectors). Jakub
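The line-offset idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from GCC: `STRIDE`, `struct line_index`, `line_index_build` and `line_index_seek` are hypothetical names, the index is built in one up-front pass rather than incrementally, and the file is assumed to be opened so that ftell/fseek offsets are meaningful (binary mode).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define STRIDE 100   /* hypothetical: remember every 100th line start */

struct line_index
{
  long *offsets;     /* offsets[k] is the start of line k * STRIDE + 1 */
  size_t count;
};

/* Scan FP once and record where lines 1, 101, 201, ... begin.
   Returns 0 on success, -1 on allocation failure.  */
static int
line_index_build (FILE *fp, struct line_index *idx)
{
  size_t cap = 16, line = 1;
  int c;

  assert (fp != NULL && idx != NULL);
  idx->count = 0;
  idx->offsets = malloc (cap * sizeof *idx->offsets);
  if (idx->offsets == NULL)
    return -1;
  rewind (fp);
  idx->offsets[idx->count++] = 0;   /* line 1 starts at offset 0 */
  while ((c = fgetc (fp)) != EOF)
    {
      if (c != '\n')
        continue;
      line++;
      if ((line - 1) % STRIDE != 0)
        continue;
      if (idx->count == cap)
        {
          long *grown = realloc (idx->offsets, 2 * cap * sizeof *grown);
          if (grown == NULL)
            {
              free (idx->offsets);
              idx->offsets = NULL;
              return -1;
            }
          idx->offsets = grown;
          cap *= 2;
        }
      idx->offsets[idx->count++] = ftell (fp);
    }
  return 0;
}

/* Position FP at the start of 1-based LINE: seek to the nearest recorded
   offset at or before it, then skip at most STRIDE - 1 lines.
   Returns 0 on success, -1 if the line does not exist.  */
static int
line_index_seek (FILE *fp, const struct line_index *idx, size_t line)
{
  assert (line >= 1);
  size_t slot = (line - 1) / STRIDE;
  size_t skip = (line - 1) % STRIDE;
  int c;

  if (slot >= idx->count)
    return -1;
  if (fseek (fp, idx->offsets[slot], SEEK_SET) != 0)
    return -1;
  while (skip > 0 && (c = fgetc (fp)) != EOF)
    if (c == '\n')
      skip--;
  return skip == 0 ? 0 : -1;
}
```

With such an index, fetching a line near the end of a large file costs one fseek plus at most 99 skipped lines instead of a scan from the start. The trade-off is one long per 100 lines of memory for each cached file.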
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On 31 October 2013 05:46, Dodji Seketeli wrote:
> +*/
> +static size_t
> +string_length (const char* buf, size_t buf_size)
> +{
> +  for (int i = buf_size - 1; i > 0; --i)
> +    {
> +      if (buf[i] != 0)
> +        return i + 1;
> +
> +      if (buf[i - 1] != 0)
> +        return i;
> +    }
> +  return 0;
> +}

Why do you check both buf[i] and buf[i - 1] within the loop?

Cheers,

Manuel.
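One reading of the quoted loop is that the second test exists only to cover index 0, which the `i > 0` loop bound would otherwise skip. Under that assumption, a single check per iteration is enough. The sketch below uses a hypothetical name, `effective_length`, and is not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Return the length of BUF up to and including its last non-zero byte,
   i.e. BUF_SIZE minus the number of trailing zero bytes.  */
static size_t
effective_length (const char *buf, size_t buf_size)
{
  assert (buf != NULL || buf_size == 0);
  size_t i = buf_size;
  /* Walk backwards over trailing zero bytes; buf[i - 1] is only read
     while i > 0, so index 0 is covered without a second test.  */
  while (i > 0 && buf[i - 1] == 0)
    --i;
  return i;
}
```

Using an unsigned index and testing `buf[i - 1]` also avoids converting `buf_size - 1` to int, which would be a problem for an empty or very large buffer.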
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Jakub Jelinek writes:

> On Thu, Oct 31, 2013 at 03:36:07PM +0100, Bernd Edlinger wrote:
>> if you want to read zero-chars, why don't you simply use fgetc,
>> optionally replacing '\0' with ' ' in read_line?
>
> Because it is too slow?
>
> getline(3) would be much better for this purpose, though of course
> it is a GNU extension in glibc and so we'd need some fallback, which
> very well could be the fgetc or something similar.

So would getline (+ the current patch as a fallback) be acceptable?

-- 
Dodji
Re: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
On Thu, Oct 31, 2013 at 03:36:07PM +0100, Bernd Edlinger wrote:
> if you want to read zero-chars, why don't you simply use fgetc,
> optionally replacing '\0' with ' ' in read_line?

Because it is too slow?

getline(3) would be much better for this purpose, though of course it is a GNU extension in glibc and so we'd need some fallback, which very well could be the fgetc or something similar.

	Jakub
RE: [PATCH] preprocessor/58580 - preprocessor goes OOM with warning for zero literals
Hi, if you want to read zero-chars, why don't you simply use fgetc, optionally replacing '\0' with ' ' in read_line? Bernd.
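For illustration, the fgetc approach suggested here could look roughly like the sketch below. It is a minimal version under stated assumptions: `read_line_fgetc` is a hypothetical name, the buffer is managed with plain malloc/realloc rather than GCC's allocation macros, and the caller frees the returned line.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read one line from FP one byte at a time.  NUL bytes are stored as
   spaces and the trailing newline is dropped.  Returns a malloc'ed,
   NUL-terminated buffer and sets *LEN to the number of bytes stored,
   or returns NULL at end of file or on allocation failure.  */
static char *
read_line_fgetc (FILE *fp, size_t *len)
{
  assert (fp != NULL && len != NULL);
  size_t cap = 128, n = 0;
  char *line = malloc (cap);
  int c;

  if (line == NULL)
    return NULL;
  while ((c = fgetc (fp)) != EOF && c != '\n')
    {
      if (n + 1 >= cap)
        {
          /* Keep room for this byte plus the terminator.  */
          char *grown = realloc (line, cap * 2);
          if (grown == NULL)
            {
              free (line);
              return NULL;
            }
          line = grown;
          cap *= 2;
        }
      line[n++] = (c == '\0') ? ' ' : (char) c;
    }
  if (c == EOF && n == 0)
    {
      free (line);
      return NULL;
    }
  line[n] = '\0';
  *len = n;
  return line;
}
```

Because every NUL byte is replaced while reading, the result is an ordinary C string and strlen on it gives the full line length. The cost is one library call per input byte.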