On Fri, Nov 23, 2012 at 11:23 AM, Uros Bizjak <ubiz...@gmail.com> wrote: > On Fri, Nov 23, 2012 at 8:16 PM, H.J. Lu <hjl.to...@gmail.com> wrote: >> On Fri, Nov 23, 2012 at 10:59 AM, Uros Bizjak <ubiz...@gmail.com> wrote: >>> Hello! >>> >>>> This patch allocates extra 16 bytes for -fsanitize=address so that >>>> asan won't report read beyond memory buffer. It is used by >>>> bootstrap-asan. OK to install? >>> >>> /* Resize buffer if we allocated substantially too much, or if we >>> - haven't enough space for the \n-terminator. */ >>> + haven't enough space for the \n-terminator. Allocate extra 16 >>> + bytes for -fsanitize=address. */ >>> >>> I guess that extra _15_ bytes should be enough? The maximum we need >>> for SSE stringops is additional 15 bytes, when only the first byte is >>> allocated in the 16byte bundle. >>> >> >> If we need to clear the extra bytes, clearing 16 bytes can be faster >> than 16 bytes. > > (I assume "... than 15 bytes"). > > However, this is ideal for the code below, where we add either +16 or > +1, considering also additional byte for the terminator. > > Uros.
Here is the updated patch. I added CLEAR_CPP_PAD_BUFFER to be used later for valgrind. I have no real preferences for 15 vs 16 extra bytes. -- H.J. --- 2012-11-21 H.J. Lu <hongjiu...@intel.com> PR bootstrap/55380 * charset.c (_cpp_convert_input): Clear CPP_PAD_BUFFER_SIZE bytes if CLEAR_CPP_PAD_BUFFER isn't 0. Allocate extra CPP_PAD_BUFFER_SIZE bytes and clear it if CLEAR_CPP_PAD_BUFFER isn't 0. * files.c (read_file_guts): Allocate extra CPP_PAD_BUFFER_SIZE bytes. * internal.h (CPP_PAD_BUFFER_SIZE): New. Defined to 16 for -fsanitize=address if __SANITIZE_ADDRESS__ is defined. (CLEAR_CPP_PAD_BUFFER): New. diff --git a/libcpp/charset.c b/libcpp/charset.c index cba19a6..1ca94a1 100644 --- a/libcpp/charset.c +++ b/libcpp/charset.c @@ -1709,6 +1709,8 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_c harset, to.text = input; to.asize = size; to.len = len; + if (CLEAR_CPP_PAD_BUFFER) + memset (input + size, 0, CPP_PAD_BUFFER_SIZE); } else { @@ -1731,7 +1733,12 @@ _cpp_convert_input (cpp_reader *pfile, const char *input_ charset, /* Resize buffer if we allocated substantially too much, or if we haven't enough space for the \n-terminator. */ if (to.len + 4096 < to.asize || to.len >= to.asize) - to.text = XRESIZEVEC (uchar, to.text, to.len + 1); + { + to.text = XRESIZEVEC (uchar, to.text, + to.len + CPP_PAD_BUFFER_SIZE + 1); + if (CLEAR_CPP_PAD_BUFFER) + memset (to.text + to.len + 1, 0, CPP_PAD_BUFFER_SIZE); + } /* If the file is using old-school Mac line endings (\r only), terminate with another \r, not an \n, so that we do not mistake diff --git a/libcpp/files.c b/libcpp/files.c index 9f84d8c..090f928 100644 --- a/libcpp/files.c +++ b/libcpp/files.c @@ -671,7 +671,7 @@ read_file_guts (cpp_reader *pfile, _cpp_file *file) the majority of C source files. */ size = 8 * 1024; - buf = XNEWVEC (uchar, size + 1); + buf = XNEWVEC (uchar, size + CPP_PAD_BUFFER_SIZE + 1); total = 0; while ((count = read (file->fd, buf + total, size - total)) > 0) { @@ -682,7 +682,7 @@ read_file_guts (cpp_reader *pfile, _cpp_file *file) if (regular) break; size *= 2; - buf = XRESIZEVEC (uchar, buf, size + 1); + buf = XRESIZEVEC (uchar, buf, size + CPP_PAD_BUFFER_SIZE + 1); } } diff --git a/libcpp/internal.h b/libcpp/internal.h index 312b8b5..bc1f311 100644 --- a/libcpp/internal.h +++ b/libcpp/internal.h @@ -77,6 +77,16 @@ struct cset_converter efficiency, and partly to limit runaway recursion. */ #define CPP_STACK_MAX 200 +/* Allocate extra 16 bytes for -fsanitize=address. */ +#ifdef __SANITIZE_ADDRESS__ +#define CPP_PAD_BUFFER_SIZE 16 +#else +#define CPP_PAD_BUFFER_SIZE 0 +#endif + +/* 1 to clear extra 16 bytes. */ +#define CLEAR_CPP_PAD_BUFFER 0 + /* Host alignment handling. */ struct dummy {