Sorry, I forgot to include the command that triggers the crash. When bison is 
run as bison --trace=all poc, the debug (trace) execution path calls 
boundary_print without checking whether loc.end.file is NULL. I've included 
the corrected report below.

Some projects publish recommendations for how to write bug reports, and we 
consulted those.
Since AI models have likely been trained on similar documentation, 
AI-generated reports may end up with a similar format.


--
## Summary

The trace output path in `location_print` calls `boundary_print` even when 
`loc.end.file == NULL`. `boundary_print` then passes NULL to 
`quotearg_n_style(..., b->file)`, which results in a NULL pointer dereference 
in `quotearg_buffer_restyled`.

## Details

* **Vulnerability Type**: Segmentation fault due to NULL pointer dereference
* **Version**: 3.8.2

- When the `trace_locations` flag is set (for example, by `--trace=all`), 
`location_print` passes `loc.start` and `loc.end` to `boundary_print` for 
output (`location.c` line 160).
- `boundary_print` unconditionally passes `b->file` to `quotearg_n_style`, 
which leads to undefined behavior when `b->file == NULL` (`location.c` line 149).
- Meanwhile, `lloc_default`, the implementation of `YYLLOC_DEFAULT`, 
initializes both `start` and `end` of the LHS location from the end of the last 
RHS symbol, `rhs[n].end` (`parse-gram.c` lines 3204, 3205). Therefore, if 
`end.file` of that RHS symbol was never set, the LHS's `end.file` is NULL as well.
- This NULL is then passed along `location_print → boundary_print`.
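
The propagation described above can be modelled in isolation. The following is a minimal sketch, not Bison's actual source: `boundary_t`, `location_t`, `same_boundary`, and `lloc_default_model` are hypothetical names chosen for illustration, and the field layout is assumed from the `b->file`, `b->line`, `b->column`, `b->byte` accesses shown in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified mirrors of the boundary/location types.  */
typedef struct { const char *file; int line, column, byte; } boundary_t;
typedef struct { boundary_t start, end; } location_t;

/* Two boundaries are equal when file, line and column agree.
   A NULL file only compares equal to another NULL file.  */
static int
same_boundary (boundary_t a, boundary_t b)
{
  int same_file = a.file == b.file
                  || (a.file && b.file && strcmp (a.file, b.file) == 0);
  return same_file && a.line == b.line && a.column == b.column;
}

/* Model of the default-location rule described in the report:
   rhs[1..n] are the RHS symbols, rhs[0] is the slot before them.  */
static location_t
lloc_default_model (const location_t *rhs, int n)
{
  assert (rhs != NULL && n >= 0);

  location_t loc;
  /* Both ends start out as the end of the last RHS symbol.  */
  loc.start = rhs[n].end;
  loc.end = rhs[n].end;

  /* Only `start` is overwritten, using the first non-empty RHS symbol.  */
  for (int i = 1; i <= n; i++)
    if (!same_boundary (rhs[i].start, rhs[i].end))
      {
        loc.start = rhs[i].start;
        break;
      }

  /* `end` is never revisited, so a NULL rhs[n].end.file survives.  */
  return loc;
}
```

With a two-symbol RHS whose last symbol has `end.file == NULL`, the returned location has a valid `start.file` but a NULL `end.file`, which is the state that the trace path later tries to print.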

## Reproduction

### Tested Environment

- **Operating System:** Ubuntu 22.04 LTS
- **Architecture:** x86_64
- **Compiler:** gcc with AddressSanitizer (gcc version: 11.4.0)

### Reproduction Steps

### poc file
```yacc
%poc p[

%%
```

```sh
wget https://ftp.gnu.org/gnu/bison/bison-3.8.2.tar.gz
tar xvf bison-3.8.2.tar.gz

cd bison-3.8.2

CFLAGS="-g -O0 -fsanitize=address -fno-omit-frame-pointer" CXXFLAGS="$CFLAGS" \
LDFLAGS="-fsanitize=address" ./configure && make -j$(nproc) && make install

bison --trace=all poc
```

## Output
### ASanLog

```
=================================================================
==8983==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x575f010633b5 bp 0x7ffc35704cf0 sp 0x7ffc35704ba0 T0)
==8983==The signal is caused by a READ memory access.
==8983==Hint: address points to the zero page.
    #0 0x575f010633b5 in quotearg_buffer_restyled lib/quotearg.c:393
    #1 0x575f01064175 in quotearg_n_options lib/quotearg.c:899
    #2 0x575f010645f1 in quotearg_n_style lib/quotearg.c:950
    #3 0x575f00f89bd6 in boundary_print src/location.c:149
    #4 0x575f00f89dcd in location_print src/location.c:164
    #5 0x575f00fc6b35 in yy_symbol_print src/parse-gram.c:1390
    #6 0x575f00fcfcf3 in gram_parse src/parse-gram.c:3099
    #7 0x575f00fe9753 in reader src/reader.c:766
    #8 0x575f00f92b6a in main src/main.c:118
    #9 0x702b6e1c0d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #10 0x702b6e1c0e3f in __libc_start_main_impl ../csu/libc-start.c:392
    #11 0x575f00f47594 in _start (/usr/local/bin/bison+0x36594)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV lib/quotearg.c:393 in quotearg_buffer_restyled
==8983==ABORTING
```

### Crash Flow

* `location_print` is called from `yy_symbol_print` via `YYLOCATION_PRINT`.
* In its normal output path, `location_print` asserts that both files are 
non-NULL (`aver (loc.start.file); aver (loc.end.file);`), but the 
`trace_locations` path bypasses this check: when the location is not empty 
(`location_empty (loc)` is false) and the trace flag is set, `boundary_print` 
is called directly.
* `lloc_default` first sets both `start` and `end` of the LHS to the `end` of 
the last RHS symbol, then overwrites only `start` with the `start` of the first 
non-empty RHS symbol. Therefore, if the last RHS symbol's `end.file` is NULL, 
the LHS's `end.file` remains NULL.
* As a result, `boundary_print` receives `b->file == NULL`, and 
`quotearg_n_style` passes `arg = NULL` on to `quotearg_buffer_restyled`, which 
causes the SEGV.

### Affected Code

* The trace path in `location_print` (does not verify that `loc.start.file` and 
`loc.end.file` are non-NULL before calling `boundary_print`)
* `quotearg_n_style(..., b->file)` in `boundary_print` (no NULL protection)
* LHS position synthesis in `lloc_default` (copying the tail end)

```c
/* Call chain down to the crash at quotearg.c:393 */

yy_symbol_print (FILE *yyo,
                 yysymbol_kind_t yykind, YYSTYPE const * const yyvaluep,
                 YYLTYPE const * const yylocationp)
{
  YYFPRINTF (yyo, "%s %s (",
             yykind < YYNTOKENS ? "token" : "nterm", yysymbol_name (yykind));

  YYLOCATION_PRINT (yyo, yylocationp); /*---------------- call ----------------*/
  YYFPRINTF (yyo, ": ");
  yy_symbol_value_print (yyo, yykind, yyvaluep, yylocationp);
  YYFPRINTF (yyo, ")");
}

int
location_print (location loc, FILE *out)
{
  int res = 0;
  if (location_empty (loc))
    res += fprintf (out, "(empty location)");
  else if (trace_flag & trace_locations)
    {
      res += boundary_print (&loc.start, out);
      res += fprintf (out, "-");
      res += boundary_print (&loc.end, out); /*---------------- call ----------------*/
    }
  else
...
}

static int
boundary_print (boundary const *b, FILE *out)
{
  return fprintf (out, "%s:%d.%d@%d",
                  quotearg_n_style (3, escape_quoting_style, b->file), /*---------------- call ----------------*/
                  b->line, b->column, b->byte);
}

char *
quotearg_n_style (int n, enum quoting_style s, char const *arg)
{
  struct quoting_options const o = quoting_options_from_style (s);
  return quotearg_n_options (n, arg, SIZE_MAX, &o); /*---------------- call ----------------*/
}

static char *
quotearg_n_options (int n, char const *arg, size_t argsize,
                    struct quoting_options const *options)
{
...
    size_t size = sv[n].size;
    char *val = sv[n].val;
    /* Elide embedded null bytes since we don't return a size.  */
    int flags = options->flags | QA_ELIDE_NULL_BYTES;
    size_t qsize = quotearg_buffer_restyled (val, size, arg, argsize, /*---------------- call ----------------*/
                                             options->style, flags,
                                             options->quote_these_too,
                                             options->left_quote,
                                             options->right_quote);

...
}

static size_t
quotearg_buffer_restyled (char *buffer, size_t buffersize,
                          char const *arg, size_t argsize,
                          enum quoting_style quoting_style, int flags,
                          unsigned int const *quote_these_too,
                          char const *left_quote,
                          char const *right_quote)
{
...
    default:
      abort ();
    }

  for (i = 0;  ! (argsize == SIZE_MAX ? arg[i] == '\0' : i == argsize);  i++) /*------------- CRASH! -------------*/
...
}
```


## Proposed Fix

* Fix on the generation side
  + Make `start.file` and `end.file` consistent before returning from 
`lloc_default`.
  + Set `end = start` when `end.file == NULL` and `start.file != NULL`.
* Fix on the output side
  + Modify `boundary_print` to output "(NULL)" when `b->file` is NULL, as follows:

`boundary_print` (`location.c:149`):
```c
static int
boundary_print (boundary const *b, FILE *out)
{
  const char *tmp_filename = b->file ? b->file : "(NULL)"; /*---------------- append ----------------*/
  return fprintf (out, "%s:%d.%d@%d",
                  quotearg_n_style (3, escape_quoting_style, tmp_filename),
                  b->line, b->column, b->byte);
}
```
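
The generation-side option could look roughly like the following. This is only a sketch under stated assumptions: `boundary_t`, `location_t`, and `location_normalize` are hypothetical names that do not exist in Bison, and the field layout is assumed from the accesses shown in this report. In the real code the same check would be applied to the value computed in `lloc_default` before it is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified mirrors of the boundary/location types.  */
typedef struct { const char *file; int line, column, byte; } boundary_t;
typedef struct { boundary_t start, end; } location_t;

/* Repair a location whose end boundary was never set:
   if only `end.file` is NULL, collapse the range onto `start`.  */
static location_t
location_normalize (location_t loc)
{
  if (loc.end.file == NULL && loc.start.file != NULL)
    loc.end = loc.start;

  /* Invariant after repair: a set start implies a set end.  */
  assert (loc.start.file == NULL || loc.end.file != NULL);
  return loc;
}
```

Collapsing `end` onto `start` loses the extent of the range, so whether this or the output-side guard is preferable is a design decision for the maintainers. A location with both files NULL is left unchanged by this sketch.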


Best regards,
Momoko
________________________________
From: Collin Funk <[email protected]>
Sent: October 19, 2025 7:24
To: Momoko Shiraishi <[email protected]>
CC: [email protected] <[email protected]>
Subject: Re: NULL pointer dereference in boundary_print

Momoko Shiraishi <[email protected]> writes:

> ### poc file
> ```yacc
> %poc p[
>
> %%
> ```
> ## Output
> ### ASanLog
>
> ```
> =================================================================
> ==8983==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x575f010633b5 bp 0x7ffc35704cf0 sp 0x7ffc35704ba0 T0)
> ==8983==The signal is caused by a READ memory access.
> ==8983==Hint: address points to the zero page.
>     #0 0x575f010633b5 in quotearg_buffer_restyled lib/quotearg.c:393
>     #1 0x575f01064175 in quotearg_n_options lib/quotearg.c:899
>     #2 0x575f010645f1 in quotearg_n_style lib/quotearg.c:950
>     #3 0x575f00f89bd6 in boundary_print src/location.c:149
>     #4 0x575f00f89dcd in location_print src/location.c:164
>     #5 0x575f00fc6b35 in yy_symbol_print src/parse-gram.c:1390
>     #6 0x575f00fcfcf3 in gram_parse src/parse-gram.c:3099
>     #7 0x575f00fe9753 in reader src/reader.c:766
>     #8 0x575f00f92b6a in main src/main.c:118
>     #9 0x702b6e1c0d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
>     #10 0x702b6e1c0e3f in __libc_start_main_impl ../csu/libc-start.c:392
>     #11 0x575f00f47594 in _start (/usr/local/bin/bison+0x36594)
>
> AddressSanitizer can not provide additional info.
> SUMMARY: AddressSanitizer: SEGV lib/quotearg.c:393 in quotearg_buffer_restyled
> ==8983==ABORTING
> ```

This doesn't occur for me from the git repository or bison-3.8.2. I see
the following, using your './configure' invocation:


    $ ./src/bison ~/input
    /home/collin/input:1.1-4: error: invalid directive: ‘%poc’
        1 | %poc p[
          | ^~~~
    /home/collin/input:3.1-2: error: invalid characters in bracketed name: ‘%%’
        3 | %%
          | ^~
    /home/collin/input:1.7-4.0: error: missing ‘]’ at end of file
        1 | %poc p[
          |       ^
    /home/collin/input:4: error: invalid character: ‘]’

Is this another AI report like the previous ones [1][2]? The formatting
looks suspiciously similar...

Collin

[1] https://lists.gnu.org/archive/html/bug-bison/2025-07/msg00009.html
[2] https://lists.gnu.org/archive/html/bug-bison/2025-07/msg00008.html
