Pádraig Brady <[email protected]> writes:

> Note building with -O3 gave the following warning
> which should be fixed, but which I've not looked at:
>   In function 'mbfile_multi_getc',
>       inlined from 'fold_file' at src/fold.c:164:7:
>   ./lib/mbfile.h:235:14: error: writing 1 byte into a region of size 0 
> [-Werror=stringop-overflow=]
>     235 |           *p = *(p + bytes);
>         |              ^
>   ./lib/mbfile.h: In function 'fold_file':
>   ./lib/mbfile.h:82:8: note: at offset [36, 4294967268] into destination 
> object 'buf' of size 4
>      82 |   char buf[MBCHAR_BUF_SIZE];
>         |        ^

Thanks, I didn't compile with -O3. I'll have a look at it.

> Re performance, it's good, but it would be great if
> we could maintain the LC_ALL=C performance like the i18n patch does.
> Some quick testing shows:
>
>   $ yes `seq 100` | head -n 1M > file.in
>
>   # Note /bin/fold has the the (Fedora) i18n patch applied
>   $ for L in en_US.UTF-8 C; do
>       for FOLD in src/fold src/fold-c /bin/fold; do
>         printf "LC_ALL=$L $FOLD: "
>         time LC_ALL=$L $FOLD < file.in | wc -l
>       done
>     done
>
>   LC_ALL=en_US.UTF-8 src/fold: 4194304
>   real        0m1.046s
>   LC_ALL=en_US.UTF-8 src/fold-c: 4194304
>   real        0m8.294s
>   LC_ALL=en_US.UTF-8 /bin/fold: 4194304
>   real        0m11.556s
>   LC_ALL=C src/fold: 4194304
>   real        0m0.979s
>   LC_ALL=C src/fold-c: 4194304
>   real        0m8.277s
>   LC_ALL=C /bin/fold: 4194304
>   real        0m0.976s
>
> I.e. we beat the i18n patch implementation,
> but we don't shortcut the LC_ALL=C case.

Good point. I think it is worth some extra code size and duplication to
preserve the current speed using LC_ALL=C. Maybe something like this:

    if (STREQ_OPT (locale_charset (), "ASCII",
                   'A', 'S', 'C', 'I', 'I', 0, 0, 0, 0))
      fold_file (...)
    else
      fold_file_multibyte (...)

Where fold_file doesn't have to care about the extra information
contained in mb_file_t and mbf_char_t.

> BTW I've some general notes on i18n in coreutils at:
> https://www.pixelbeat.org/docs/coreutils_i18n/

Thanks! That is useful.

Collin

Reply via email to