Pádraig Brady <[email protected]> writes:

> Thanks a lot for working on this.
>
> The idea is sound. I also biased towards using memchr() and strstr()
> with an update to cut(1) I'm working on, as the platform specific
> optimizations for those in glibc are a significant win.
>
> One of your observations confused me though.
> getc() and putchar() should also avoid function calls,
> and degenerate via inlining and macros to directly manipulate stdio mem,
> periodically calling __uflow() and __overflow() respectively.
>
> Compiling uniq(1) on Fedora 43 here with default (-O2) options I see:
>
> $ yes $(yes eeeaae | head -n9 | paste -s -d,) | head -n1M > as.in
> $ ltrace -c uniq as.in
> eeeaae,eeeaae,eeeaae,eeeaae,eeeaae,eeeaae,eeeaae,eeeaae,eeeaae
> % time     seconds  usecs/call     calls      function
> ------ ----------- ----------- --------- --------------------
>  97.95   57.971874          55   1048575 memcmp
>   2.05    1.211896          75     16129 __uflow
> ...
>
> So function calls happen only once per 4KiB rather than per byte.
>
> BTW fwrite(...,1) is also similarly optimized,
> but var=1; fwrite(...,var) is not, so I'm also doing the attached
> in my cut(1) update to avoid the function call overhead.

This all makes sense, but looking at this part of the code:

> +          /* Buffer empty or not accessible.  Consume one byte via getc to
> +             trigger a refill, then loop back to use freadptr on the newly
> +             populated buffer.  */
> +          int c = getc (stream);

Wouldn't it be a bit easier, and a bit safer, to buffer things
ourselves and just use read() and write()?

By that I mean, change 'struct linebuffer' to be something like this:

    struct linebuffer
    {
      char *buffer;        /* The current line.  */
      idx_t nalloc;        /* The number of bytes allocated for
                              BUFFER.  */
      idx_t eol;           /* The index of the delimiter in BUFFER. */
      char iobuf[BUFSIZE]; /* The buffer with read data.  */
      idx_t iobuf_used;    /* The number of bytes in IOBUF not yet
                              copied to BUFFER.  */
    };

Then read up to BUFSIZE bytes at a time into IOBUF, and grow BUFFER and
copy into it only as needed?
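
To make that concrete, here is a minimal sketch of the read side. It is
only an illustration under some assumptions, not a patch: size_t stands
in for idx_t so it compiles without gnulib, IOBUF_SIZE is a placeholder
for BUFSIZE, and the 'length' and 'iobuf_pos' members and the
lb_append/lb_readline helpers are hypothetical names that do not exist
in the current code. It uses memchr() to find the delimiter, in line
with the point above about glibc's optimized versions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

enum { IOBUF_SIZE = 4096 };  /* Hypothetical stand-in for BUFSIZE.  */

struct linebuffer
{
  char *buffer;            /* The current line, delimiter included.  */
  size_t nalloc;           /* The number of bytes allocated for BUFFER.  */
  size_t length;           /* The number of bytes used in BUFFER.  */
  char iobuf[IOBUF_SIZE];  /* The buffer with read data.  */
  size_t iobuf_pos;        /* Offset of the first unconsumed byte.  */
  size_t iobuf_used;       /* The number of bytes in IOBUF not yet
                              copied to BUFFER.  */
};

/* Append N bytes from SRC to LB->buffer, growing it as needed.
   Return 0 on success, -1 if allocation fails.  */
static int
lb_append (struct linebuffer *lb, char const *src, size_t n)
{
  if (lb->nalloc - lb->length < n)
    {
      size_t want = lb->length + n;
      size_t nalloc = lb->nalloc ? lb->nalloc : 64;
      while (nalloc < want)
        nalloc *= 2;
      char *p = realloc (lb->buffer, nalloc);
      if (!p)
        return -1;
      lb->buffer = p;
      lb->nalloc = nalloc;
    }
  memcpy (lb->buffer + lb->length, src, n);
  lb->length += n;
  return 0;
}

/* Read one DELIM-terminated line from FD into LB.
   Return 1 if a line was stored, 0 at end of input, -1 on error.  */
static int
lb_readline (struct linebuffer *lb, int fd, char delim)
{
  lb->length = 0;
  for (;;)
    {
      if (lb->iobuf_used == 0)
        {
          /* IOBUF is drained: refill it with a single read call.  */
          ssize_t nread;
          do
            nread = read (fd, lb->iobuf, sizeof lb->iobuf);
          while (nread < 0 && errno == EINTR);
          if (nread < 0)
            return -1;
          if (nread == 0)
            return lb->length != 0;  /* Last line may lack DELIM.  */
          lb->iobuf_pos = 0;
          lb->iobuf_used = nread;
        }

      /* Copy up to and including DELIM, or everything if it is absent.  */
      char const *start = lb->iobuf + lb->iobuf_pos;
      char const *end = memchr (start, delim, lb->iobuf_used);
      size_t n = end ? (size_t) (end - start) + 1 : lb->iobuf_used;
      if (lb_append (lb, start, n) < 0)
        return -1;
      lb->iobuf_pos += n;
      lb->iobuf_used -= n;
      if (end)
        return 1;
    }
}
```

The struct has to start out zeroed (e.g. from calloc), and the caller
then loops on lb_readline (lb, fd, '\n') until it returns 0 or -1. The
extra 'iobuf_pos' member is one way to avoid a memmove after each line;
shifting the leftover bytes to the front of IOBUF would work too.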

Collin