On Thursday, February 07, 2013 4:12:22 pm Jilles Tjoelker wrote:
> On Tue, Feb 05, 2013 at 03:46:43PM -0500, John Baldwin wrote:
> > I've written an implementation of open_memstream() and
> > open_wmemstream() along with a set of regression tests.  I'm pretty
> > sure open_memstream() is correct, and I believe open_wmemstream() is
> > correct for expected usage.  The latter might even do the right thing
> > if you split a multi-byte character across multiple writes.  One
> > question I have is if my choice to discard any pending multi-byte
> > state in the stream anytime a seek changes the effective position in
> > the output stream.  I think this is correct as stdio will flush any
> > pending data before doing a seek, so if there is a partially parsed
> > character we aren't going to get the rest of it.
> 
> I don't think partially parsed characters can happen with a correct
> application. As per C99, an application must not call byte output
> functions on a wide-oriented stream, and vice versa.

Oh, interesting.  Should I remove those tests for using byte output
functions from the open_wmemstream test?

> Discarding the shift state on fseek()/fseeko() is permitted (but should
> be documented as this is implementation-defined behaviour).
> State-dependent encodings (where this is relevant) are rarely used
> nowadays.
> 
> The conversion to bytes and back probably makes open_wmemstream() quite
> slow but I don't think that is very important.

Note that we do this already for swprintf().  I've added this note:

IMPLEMENTATION NOTES
     Internally all I/O streams are effectively byte-oriented, so using wide-
     oriented operations to write to a stream opened via open_wmemstream()
     results in wide characters being expanded to a stream of multibyte char-
     acters in stdio's internal buffers.  These multibyte characters are then
     converted back to wide characters when written into the stream.  As a
     result, the wide-oriented streams maintain an internal multibyte charac-
     ter conversion state that is cleared on any seek opertion that changes
     the current position.  This should have no effect unless byte-oriented
     output operations are used on a wide-oriented stream.
 
> > http://www.FreeBSD.org/~jhb/patches/open_memstream.patch
> 
> The seek functions should check for overflow in the addition (for
> SEEK_CUR and SEEK_END) and the conversion to size_t.

Note that SEEK_CUR is effectively only called with an offset of 0
via __ftello().  I'm not sure of the best way to check for overflow, do I have 
to do something like this:

static int
will_overflow(size_t off, fpos_t pos)
{
        if (pos < 0)
                return (-pos > off);
        return (SIZE_MAX - off > pos);
}

For SEEK_CUR I would test 'will_overflow(ms->offset, pos)' and
for SEEK_END 'will_overflow(ms->len, pos)'

Presumably SEEK_SET should test for the requested position being
beyond SIZE_MAX as well?

If my above test is ok then I end up with this:

static int
will_overflow(size_t offset, fpos_t pos)
{
        if (pos < 0) {
                if (-pos > off) {
                        errno = EINVAL;
                        return (1);
                }
        } else {
                if (SIZE_MAX - off > pos) {
                        errno = EOVERFLOW;
                        return (1);
                }
        }
        return (0);
}

static fpos_t
memstream_seek(void *cookie, fpos_t pos, int whence)
{
        struct memstream *ms;
#ifdef DEBUG
        size_t old;
#endif

        ms = cookie;
#ifdef DEBUG
        old = ms->offset;
#endif
        switch (whence) {
        case SEEK_SET:
                if (pos > SIZE_MAX) {
                        errno = EOVERFLOW;
                        return (-1);
                }
                ms->offset = pos;
                break;
        case SEEK_CUR:
                if (will_overflow(ms->offset, pos))
                        return (-1);
                ms->offset += pos;
                break;
        case SEEK_END:
                if (will_overflow(ms->len, pos))
                        return (-1);
                ms->offset = ms->len + pos;
                break;
        }
        memstream_update(ms);
#ifdef DEBUG
        fprintf(stderr, "MS: seek(%p, %zd, %d) %zd -> %zd\n", ms, pos, whence,
            old, ms->offset);
#endif
        return (ms->offset);
}

Thanks for reviewing this so far.

-- 
John Baldwin
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Reply via email to