The following issue has been SUBMITTED.
======================================================================
https://www.austingroupbugs.net/view.php?id=1858
======================================================================
Reported By: nabijaczleweli
Assigned To:
======================================================================
Project: 1003.1(2024)/Issue8
Issue ID: 1858
Category: System Interfaces
Type: Clarification Requested
Severity: Editorial
Priority: normal
Status: New
Name: наб
Organization:
User Reference:
Section: open_memstream
Page Number: 1526
Line Number: 51164-51176
Interp Status: ---
Final Accepted Text:
======================================================================
Date Submitted: 2024-09-19 12:25 UTC
Last Modified: 2024-09-19 12:25 UTC
======================================================================
Summary: open_memstream() doesn't say where the result lies
Description:
open_memstream() doesn't actually say where the result is. All
implementations (musl, NetBSD, the illumos gate, OpenBSD, FreeBSD) except
glibc are in agreement that it's in
[*bufp, *bufp + max-ever-ftell]
whereas glibc behaves like
[*bufp, *bufp + *sizep] and mangles the last byte to always be NUL
This stems from running the following program:
#include <stdio.h>

int main() {
    char *dt;
    size_t s;
    FILE *f = open_memstream(&dt, &s);
    fputs("gaming", f);      // 1
    fputs("tereftalan", f);  // 2
    fseek(f, 6, SEEK_SET);   // 3
    fputc('Q', f);           // 4
    fclose(f);
    puts(dt);
    printf("%zu\n", s);
}
which outputs ("gamingQereftalan", 7) on all implementations except glibc,
which yields ("gamingQ", 7).
The relevant bit is ll. 51164-51176:
51164 The stream shall maintain a current position in the allocated buffer and a current buffer length.
51165 The position shall be initially set to zero (the start of the buffer). Each write to the stream shall
51166 start at the current position and move this position by the number of successfully written bytes
51167 for open_memstream( ) or the number of successfully written wide characters for
51168 open_wmemstream( ). The length shall be initially set to zero. If a write moves the position to a
51169 value larger than the current length, the current length shall be set to this position. In this case a
51170 null character for open_memstream( ) or a null wide character for open_wmemstream( ) shall be
51171 appended to the current buffer. For both functions the terminating null is not included in the
51172 calculation of the buffer length.
51173 After a successful fflush( ) or fclose( ), the pointer referenced by bufp shall contain the address of
51174 the buffer, and the variable pointed to by sizep shall contain the smaller of the current buffer
51175 length and the number of bytes for open_memstream( ), or the number of wide characters for
51176 open_wmemstream( ), between the beginning of the buffer and the current file position indicator.
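As a minimal sketch, the bookkeeping in the quoted lines can be modelled
without any real stream. Everything named here (struct memstream_model,
model_write, model_seek, model_sizep, the fixed MODEL_CAP capacity) is
hypothetical and only illustrates a literal reading of ll. 51164-51176; it
deliberately says nothing about where the output range ends, because the
quoted text doesn't either.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of ll. 51164-51176; not a real stream implementation. */
enum { MODEL_CAP = 64 };   /* assumption: fixed capacity instead of realloc */

struct memstream_model {
    char   buf[MODEL_CAP]; /* stands in for the allocated buffer */
    size_t pos;            /* current position */
    size_t len;            /* current buffer length, terminator excluded */
};

/* A write starts at pos and advances it; if pos passes len, len is set
 * to pos and a null is appended (51165-51172). Returns bytes written. */
static size_t model_write(struct memstream_model *m, const char *s)
{
    size_t n = strlen(s);
    if (m->pos + n >= MODEL_CAP)   /* the toy refuses rather than grows */
        return 0;
    memcpy(m->buf + m->pos, s, n);
    m->pos += n;
    if (m->pos > m->len) {
        m->len = m->pos;
        m->buf[m->len] = '\0';
    }
    return n;
}

/* Seeking is modelled as just moving the position. */
static void model_seek(struct memstream_model *m, size_t off)
{
    m->pos = off;
}

/* The value stored through sizep after fflush()/fclose() (51173-51176):
 * the smaller of the current length and the current position. */
static size_t model_sizep(const struct memstream_model *m)
{
    return m->pos < m->len ? m->pos : m->len;
}
```

Starting from a zero-initialised struct memstream_model and replaying the
four numbered steps of the program above (two writes, a seek to 6, a
one-byte write) makes model_sizep() report a smaller value than the length
once the position has been moved back.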
During initial discussion
(https://sourceware.org/pipermail/libc-alpha/2024-September/159999.html) DJ
read the second paragraph to mean "the size of the output is the smaller of
...". It's important to note that it does not say this.
In fact, it doesn't say anything about where the [*bufp, ... range ends.
The reads I can imagine are, in ascending order:
[*bufp, *sizep): this is obviously not it, because 51168-51172
explicitly mention both appending and accounting for a NUL terminator;
if it weren't part of the output, it wouldn't be mentioned
[*bufp, *sizep]: the page never mentions the buffer shrinking, so I'd
say "no", but glibc clearly reads it this way (and also mangles the last
byte)
[*bufp, max-ever-ftell): also obviously not it, for the same reason as
the first half-open range
[*bufp, max-ever-ftell]: this accounts for all behaviours described in
the manual, and agrees with every implementation that isn't glibc
From glibc's behaviour of altering (*bufp)[*sizep] (and DJ's comments), we
can interpret that it considers it to be a part of the output and that the
output ends there, so [*bufp, *sizep].
From everyone else's behaviour of not doing this, we can interpret that
they consider the buffer to be [*bufp, max-ever-ftell]-sized (in this case,
per 51168-51172, the last byte is always NUL by definition, but that
doesn't change anything).
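To make the four candidate readings concrete, here is a small sketch that
turns each one into a byte count starting at *bufp. The struct and function
names are hypothetical, and the ranges use the same shorthand as the list
above (offsets relative to *bufp).

```c
#include <stddef.h>

/* Number of bytes, counted from *bufp, under each candidate reading. */
struct range_candidates {
    size_t half_open_sizep;  /* [*bufp, *sizep)         */
    size_t closed_sizep;     /* [*bufp, *sizep]         */
    size_t half_open_max;    /* [*bufp, max-ever-ftell) */
    size_t closed_max;       /* [*bufp, max-ever-ftell] */
};

static struct range_candidates candidate_lengths(size_t sizep,
                                                 size_t max_ever_ftell)
{
    struct range_candidates c = {
        .half_open_sizep = sizep,
        .closed_sizep    = sizep + 1,          /* includes (*bufp)[*sizep] */
        .half_open_max   = max_ever_ftell,
        .closed_max      = max_ever_ftell + 1, /* includes the appended NUL */
    };
    return c;
}
```

For the test program's final state (*sizep of 7, max-ever-ftell of 16) the
two closed readings come to 8 and 17 bytes, which are the lengths of
"gamingQ" and "gamingQereftalan" plus one terminating byte each.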
If the output range is [*bufp, *sizep] then if fflush(f) was called after
each numbered line the result is:
1. ("gaming\0", 6)
2. ("gamingtereftalan\0", 16)
3. ("gamingt", 6)
4. ("gamingQe", 7)
If the output range is [*bufp, max-ever-ftell] then if fflush(f) was called
after each numbered line the result is:
1. ("gaming\0", 6)
2. ("gamingtereftalan\0", 16)
3. ("gamingtereftalan\0", 6)
4. ("gamingQereftalan\0", 7)
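The per-step comparison above can be checked with a variant of the test
program that calls fflush(f) after each numbered line. This is only a
sketch: flush_and_report and run_steps are hypothetical helper names, and
printing with %s reads up to the first NUL just like puts(dt) in the
original program, so it assumes a NUL exists inside the allocation and
does not show the trailing \0 that the lists above spell out.

```c
#define _POSIX_C_SOURCE 200809L  /* for open_memstream() */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: flush, then print what a caller would see. */
static int flush_and_report(FILE *f, int step, char *const *bufp,
                            const size_t *sizep)
{
    if (fflush(f) != 0)
        return -1;
    /* %s stops at the first NUL, like puts(dt) in the original program. */
    printf("%d. (\"%s\", %zu)\n", step, *bufp, *sizep);
    return 0;
}

/* Hypothetical driver: the four numbered steps, flushing after each.
 * sizes[i] receives *sizep as observed after step i + 1. */
static int run_steps(size_t sizes[4])
{
    char *dt = NULL;
    size_t s = 0;
    FILE *f = open_memstream(&dt, &s);
    if (f == NULL)
        return -1;

    int rc = 0;
    fputs("gaming", f);                     /* 1 */
    rc |= flush_and_report(f, 1, &dt, &s);
    sizes[0] = s;
    fputs("tereftalan", f);                 /* 2 */
    rc |= flush_and_report(f, 2, &dt, &s);
    sizes[1] = s;
    fseek(f, 6, SEEK_SET);                  /* 3 */
    rc |= flush_and_report(f, 3, &dt, &s);
    sizes[2] = s;
    fputc('Q', f);                          /* 4 */
    rc |= flush_and_report(f, 4, &dt, &s);
    sizes[3] = s;

    fclose(f);
    free(dt);
    return rc;
}
```

Calling run_steps() from main() prints one line per step. The sizes are
pinned down by 51173-51176; what differs between the two readings is only
which bytes past *sizep a caller may rely on.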
glibc does a variant of the former that also changes the last byte of the
output to NUL. As far as I can tell, it is not allowed to do this (it can only
add NULs after writes that extend the buffer (51168-51172), so after 1. and
2.).
Thus, IMO, the only consensus implementation that doesn't already violate
the standard text is everyone-sans-glibc.
The EXAMPLE avoids touching this (presumably because, per 0001406, it at
least nominally tries to mirror glibc), and always seeks to a position
that's identical to the former max-ever-ftell (if it didn't, it would be
broken on glibc).
Desired Action:
Please interpret what the output range is.
======================================================================
Issue History
Date Modified Username Field Change
======================================================================
2024-09-19 12:25 nabijaczleweli New Issue
2024-09-19 12:25 nabijaczleweli Name => наб
2024-09-19 12:25 nabijaczleweli Section => open_memstream
2024-09-19 12:25 nabijaczleweli Page Number => 1526
2024-09-19 12:25 nabijaczleweli Line Number => 51164-51176
======================================================================