Hi bison maintainers,
We have found a NULL pointer dereference in bison and would like to report it.
We are happy to provide any additional information needed.
## Summary
The trace output path in `location_print` calls `boundary_print` without
checking that `loc.end.file` is non-NULL. `boundary_print` then passes the NULL
pointer to `quotearg_n_style(..., b->file)`, and it is dereferenced inside
`quotearg_buffer_restyled`.
## Details
* **Vulnerability Type**: Segmentation fault due to NULL pointer dereference
* **Version**: 3.8.2
- `location_print` passes `loc.start` and `loc.end` to `boundary_print` for
output when `--trace=location` is enabled (`location.c` line 160).
- `boundary_print` unconditionally passes `b->file` to `quotearg_n_style`,
causing undefined behavior when `b->file == NULL` (`location.c` line 149).
- Meanwhile, `lloc_default`, the implementation of `YYLLOC_DEFAULT`, initializes
both `start` and `end` of the LHS location from the end of the last RHS symbol,
`rhs[n].end`. Therefore, if `end.file` of that RHS symbol was never set, the
LHS's `end.file` is NULL as well (`parse-gram.c` lines 3204, 3205).
- This NULL is then passed along `location_print → boundary_print`.
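
The propagation described in the last two bullets can be illustrated with a small stand-alone model. This is only a sketch: the `boundary` and `location` structs are simplified stand-ins, and `same_boundary` and `model_lloc_default` are hypothetical names that mirror the behavior described above rather than bison's actual source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for bison's types (assumed layout, not the real headers). */
typedef struct { const char *file; int line; int column; } boundary;
typedef struct { boundary start; boundary end; } location;

/* Hypothetical helper: true when two boundaries are identical. */
static bool
same_boundary (boundary a, boundary b)
{
  return a.file == b.file && a.line == b.line && a.column == b.column;
}

/* Model of the synthesis described above. rhs is 1-indexed: rhs[1..n]. */
static location
model_lloc_default (const location *rhs, int n)
{
  location loc;
  /* Both ends start out as the end of the last RHS symbol. */
  loc.start = rhs[n].end;
  loc.end = rhs[n].end;

  /* Only `start` is overwritten, by the first non-empty RHS symbol. */
  for (int i = 1; i <= n; i++)
    if (!same_boundary (rhs[i].start, rhs[i].end))
      {
        loc.start = rhs[i].start;
        break;
      }

  /* loc.end.file is whatever rhs[n].end.file was, including NULL. */
  return loc;
}
```

In this model, a right-hand side whose last symbol has `end.file == NULL` yields an LHS with a valid `start.file` but a NULL `end.file`, which is the state `boundary_print` later receives.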
## Reproduction
### Tested Environment
- **Operating System:** Ubuntu 22.04 LTS
- **Architecture:** x86_64
- **Compiler:** gcc with AddressSanitizer (gcc version: 11.4.0)
### Reproduction Steps
```Dockerfile
FROM ubuntu:22.04
RUN apt-get update && \
    apt-get install -y \
        build-essential \
        libtool \
        automake \
        autoconf \
        pkg-config \
        git \
        ca-certificates \
        wget \
        autopoint \
        rsync \
        gcc \
        g++ \
        make
WORKDIR /root/workdir
RUN wget https://ftp.gnu.org/gnu/bison/bison-3.8.2.tar.gz
RUN tar xvf bison-3.8.2.tar.gz
WORKDIR /root/workdir/bison-3.8.2
RUN CFLAGS="-g -O0 -fsanitize=address -fno-omit-frame-pointer" \
    CXXFLAGS="-g -O0 -fsanitize=address -fno-omit-frame-pointer" \
    LDFLAGS="-fsanitize=address" \
    ./configure && make -j$(nproc) && make install
```
### PoC File
```yacc
%poc p[
%%
```
## Output
### ASan Log
```
=================================================================
==8983==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x575f010633b5 bp 0x7ffc35704cf0 sp 0x7ffc35704ba0 T0)
==8983==The signal is caused by a READ memory access.
==8983==Hint: address points to the zero page.
    #0 0x575f010633b5 in quotearg_buffer_restyled lib/quotearg.c:393
    #1 0x575f01064175 in quotearg_n_options lib/quotearg.c:899
    #2 0x575f010645f1 in quotearg_n_style lib/quotearg.c:950
    #3 0x575f00f89bd6 in boundary_print src/location.c:149
    #4 0x575f00f89dcd in location_print src/location.c:164
    #5 0x575f00fc6b35 in yy_symbol_print src/parse-gram.c:1390
    #6 0x575f00fcfcf3 in gram_parse src/parse-gram.c:3099
    #7 0x575f00fe9753 in reader src/reader.c:766
    #8 0x575f00f92b6a in main src/main.c:118
    #9 0x702b6e1c0d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #10 0x702b6e1c0e3f in __libc_start_main_impl ../csu/libc-start.c:392
    #11 0x575f00f47594 in _start (/usr/local/bin/bison+0x36594)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV lib/quotearg.c:393 in quotearg_buffer_restyled
==8983==ABORTING
```
### Crash Flow
* `location_print` is called from `yy_symbol_print` via `YYLOCATION_PRINT`.
* `location_print` normally asserts `aver (loc.start.file); aver
(loc.end.file);` before printing, but the `trace_locations` branch bypasses
these assertions. Once `location_empty (loc)` returns false and the trace flag
is set, `boundary_print` is called with no NULL check.
* `lloc_default` initializes both `start` and `end` of the LHS from the `end` of
the last RHS symbol, then overwrites only `start` with the `start` of the first
non-empty RHS symbol. Therefore, if the last RHS symbol's `end.file` is NULL,
the LHS's `end.file` remains NULL.
* As a result, `boundary_print` receives `b->file == NULL`, and
`quotearg_n_style` forwards `arg == NULL` to `quotearg_buffer_restyled`, which
dereferences it and causes the SEGV.
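
The branch selection in the first bullet can be modeled in isolation. The sketch below is not bison's code: the structs are simplified, `model_location_empty` assumes that the emptiness test only succeeds when every field of both boundaries is zero or NULL, and `print_path`, `model_print_path` and `trace_path_sees_null` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for bison's types (assumed layout). */
typedef struct { const char *file; int line; int column; int byte; } boundary;
typedef struct { boundary start; boundary end; } location;

/* Assumption: a location is "empty" only if all fields are zero/NULL. */
static bool
model_location_empty (location loc)
{
  return !loc.start.file && !loc.start.line && !loc.start.column
    && !loc.end.file && !loc.end.line && !loc.end.column;
}

/* The three branches of location_print as described in the crash flow. */
typedef enum { PATH_EMPTY, PATH_TRACE, PATH_CHECKED } print_path;

static print_path
model_print_path (location loc, bool trace_locations)
{
  if (model_location_empty (loc))
    return PATH_EMPTY;
  if (trace_locations)
    return PATH_TRACE;    /* no aver () on this branch */
  return PATH_CHECKED;    /* aver (loc.start.file); aver (loc.end.file); */
}

/* True when the trace branch would hand a NULL file name to boundary_print. */
static bool
trace_path_sees_null (location loc, bool trace_locations)
{
  return model_print_path (loc, trace_locations) == PATH_TRACE
    && (loc.start.file == NULL || loc.end.file == NULL);
}
```

Under these assumptions, a location with a valid `start.file`, a NULL `end.file` and non-zero line numbers is not empty, so with tracing enabled it takes the unchecked branch.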
### Affected Code
* The trace path in `location_print` (does not verify that `loc.start.file` and
`loc.end.file` are non-NULL before calling `boundary_print`)
* `quotearg_n_style(..., b->file)` in `boundary_print` (no NULL protection)
* LHS position synthesis in `lloc_default` (copying the tail end)
```c
/* src/parse-gram.c */
static void
yy_symbol_print (FILE *yyo,
                 yysymbol_kind_t yykind, YYSTYPE const * const yyvaluep,
                 YYLTYPE const * const yylocationp)
{
  YYFPRINTF (yyo, "%s %s (",
             yykind < YYNTOKENS ? "token" : "nterm", yysymbol_name (yykind));
  YYLOCATION_PRINT (yyo, yylocationp); /*---------------- call ----------------*/
  YYFPRINTF (yyo, ": ");
  yy_symbol_value_print (yyo, yykind, yyvaluep, yylocationp);
  YYFPRINTF (yyo, ")");
}

/* src/location.c */
int
location_print (location loc, FILE *out)
{
  int res = 0;
  if (location_empty (loc))
    res += fprintf (out, "(empty location)");
  else if (trace_flag & trace_locations)
    {
      res += boundary_print (&loc.start, out);
      res += fprintf (out, "-");
      res += boundary_print (&loc.end, out); /*---------------- call ----------------*/
    }
  else
    ...
}

/* src/location.c */
static int
boundary_print (boundary const *b, FILE *out)
{
  return fprintf (out, "%s:%d.%d@%d",
                  quotearg_n_style (3, escape_quoting_style, b->file), /*---------------- call ----------------*/
                  b->line, b->column, b->byte);
}

/* lib/quotearg.c */
char *
quotearg_n_style (int n, enum quoting_style s, char const *arg)
{
  struct quoting_options const o = quoting_options_from_style (s);
  return quotearg_n_options (n, arg, SIZE_MAX, &o); /*---------------- call ----------------*/
}

/* lib/quotearg.c */
static char *
quotearg_n_options (int n, char const *arg, size_t argsize,
                    struct quoting_options const *options)
{
  ...
  size_t size = sv[n].size;
  char *val = sv[n].val;
  /* Elide embedded null bytes since we don't return a size.  */
  int flags = options->flags | QA_ELIDE_NULL_BYTES;
  size_t qsize = quotearg_buffer_restyled (val, size, arg, argsize, /*---------------- call ----------------*/
                                           options->style, flags,
                                           options->quote_these_too,
                                           options->left_quote,
                                           options->right_quote);
  ...
}

/* lib/quotearg.c, crash at line 393 */
static size_t
quotearg_buffer_restyled (char *buffer, size_t buffersize,
                          char const *arg, size_t argsize,
                          enum quoting_style quoting_style, int flags,
                          unsigned int const *quote_these_too,
                          char const *left_quote,
                          char const *right_quote)
{
  ...
    default:
      abort ();
    }
  for (i = 0; ! (argsize == SIZE_MAX ? arg[i] == '\0' : i == argsize); i++) /*------------- CRASH! -------------*/
  ...
}
```
## Proposed Fix
* Fix on the generation side
  + Make `start.file` and `end.file` consistent before returning from
    `lloc_default`.
  + For example, set `end = start` when `end.file == NULL` and
    `start.file != NULL`.
* Fix on the output side
  + Modify `boundary_print` (`location.c` line 149) to print "(NULL)" when
    `b->file` is NULL, as follows:
```c
static int
boundary_print (boundary const *b, FILE *out)
{
  const char *tmp_filename = b->file ? b->file : "(NULL)"; /*---------------- append ----------------*/
  return fprintf (out, "%s:%d.%d@%d",
                  quotearg_n_style (3, escape_quoting_style, tmp_filename),
                  b->line, b->column, b->byte);
}
```
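
For the generation-side option, a minimal sketch of the `end = start` rule could look like the following. The structs are simplified stand-ins and `normalize_location` is a hypothetical helper, not an existing bison function; in a real patch the same check would presumably sit at the end of `lloc_default`.

```c
#include <stddef.h>

/* Simplified stand-ins for bison's types (assumed layout). */
typedef struct { const char *file; int line; int column; int byte; } boundary;
typedef struct { boundary start; boundary end; } location;

/* Hypothetical helper: repair a location whose end has no file name
   by collapsing it onto the start boundary.  */
static location
normalize_location (location loc)
{
  if (loc.end.file == NULL && loc.start.file != NULL)
    loc.end = loc.start;
  return loc;
}
```

Note that this rule discards the original `end` line and column, and it leaves a location untouched when both file names are NULL, so the output-side check may still be needed alongside it.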