On Tue, Jul 9, 2024 at 10:39 PM Peter Maydell <peter.mayd...@linaro.org>
wrote:


> > +#define MAX_CHUNK_SIZE (128 * 1024 * 1024)
>
> I think we could add a comment here, something like:
>
> /*
>  * Maximum size to fwrite() to the output file at once;
>  * the MSVCRT runtime will not correctly handle fwrite()
>  * of more than 4GB at once.
>  */

Thanks, I agree. It will act as a reminder of why we do it.


> (Does the library fwrite fail for > 4GB, or for >= 4GB ?
> Your commit message says the former, so I've gone with that,
> but if it's an "overflows 32 bit variable" kind of bug then
> 4GB exactly probably also doesn't work.)
>


It fails for > 4GB.
For example, msvcrt.dll!fwrite(buff, (4G + 0xFFF), 1, file) works as follows
(based on the assembly, not the original source; irrelevant parts have been
omitted):

size_t fwrite(const void *buff, size_t element_size, size_t element_count,
              FILE *file_p)
{
    size_t size_t_total_size = element_size * element_count;
    size_t size_t_remain_size = size_t_total_size;

    unsigned int u32_written_bytes;
    unsigned int buf_size;

    /*
     * The register used is r12d, not r12, so I suspect Microsoft
     * declared this as unsigned int (an msvc compiler bug seems
     * unlikely).
     */
    unsigned int u32_chunk_size;

    while (true) {

        if ((file_p->flags & 0x10C) != 0) {
            buf_size = file_p->buf_size;
        } else {
            /* Always reached on the first fwrite() after fopen(). */
            buf_size = 4096;  /* mov r15d, 1000h */
        }

        if (size_t_remain_size > buf_size) {

            u32_chunk_size = size_t_remain_size;

            if (buf_size) {
                /*
                 * div ...; sub r12d, edx: the size_t value is stored
                 * into r12d, losing the high 32 bits.
                 * u32_chunk_size = 0x100000FFF - 0x100000FFF % 0x1000
                 *                = 0x100000FFF - 0xFFF
                 *                = 0x1 0000 0000
                 *                = (u32) 0x1 0000 0000
                 *                = 0
                 */
                u32_chunk_size = size_t_remain_size -
                                 (size_t_remain_size % buf_size);
            }

            /* Calls _write() with a zero size, which returns 0. */
            u32_written_bytes = __write(file_p, data_buff, u32_chunk_size);

            /* They didn't check whether __write() returned 0. */
            if (u32_written_bytes == -1 ||
                u32_written_bytes < u32_chunk_size) {
                return (size_t_total_size - size_t_remain_size) /
                       element_size;
            }

            /* size_t_remain_size -= 0 */
            size_t_remain_size -= u32_written_bytes;
            buff += u32_written_bytes;

            /*
             * size_t_remain_size never decreases to zero, so the
             * while (true) loop spins forever.
             */
            if (size_t_remain_size == 0) {
                return element_count;
            }
            /* ... */
        } else {
            /* ... */
        }
    }
    /* ... */
}

note:
1. The path of the buggy msvcrt.dll is c:\windows\system32\msvcrt.dll
   (mingw64 links against it);
2. The fwrite() implementation in another Microsoft runtime library,
   ucrtbase.dll, is correct (msvc links against it by default).



> Is there a particular reason to use 128MB here? If the
> runtime only fails on 4GB or more, maybe we should use
> a larger MAX_CHUNK_SIZE, like 2GB ?
>

According to the current analysis, all sizes <= 4GB are safe. However, there
are many versions of msvcrt: this bug exists on Server 2008/2019/2022 and
Windows 11 (all with the latest updates), it may also exist in other
versions, and it is difficult to check each version individually. I am not
sure that all versions handle boundary sizes like 2GB/4GB correctly, so I
prefer a relatively conservative value: 128MB.

Maybe we could use #ifdef _WIN32 to differentiate the handling between
Linux and Windows: for Linux it remains unchanged, while for Windows it
writes in chunks with MAX_CHUNK_SIZE set to 1GB.



> > +        while (offset < b->size) {
> > +            chunk_size = (b->size - offset > MAX_CHUNK_SIZE)
> > +                         ? MAX_CHUNK_SIZE
> > +                         : (b->size - offset);
>
> You can write this as
>      chunk_size = MIN(b->size - offset, MAX_CHUNK_SIZE);
> which I think is clearer. (Our osdep header provides MIN().)
>
> I think we should abstract out the bug workaround into a
> separate function, with the same API as fwrite(). Call
> it do_fwrite() or something, and make all the fwrite()
> calls use it. I know at the moment there's only two of
> them, and one of them is the header so never 4GB, but
> I think this more cleanly separates out the "work around
> a runtime library problem" part from the main logic of
> the program, and will mean that if we ever need to rearrange
> how we write out the data in future it will be simple.
>
>
Thanks, you are right!
