Pádraig Brady <[email protected]> writes:
> Thanks a lot for working on this.
>
> The idea is sound. I also biased towards using memchr() and strstr()
> with an update to cut(1) I'm working on, as the platform specific
> optimizations for those in glibc are a significant win.
>
> One of your observations confused me though.
> getc() and putchar() should also avoid function calls,
> and degenerate via inlining and macros to directly manipulate stdio mem,
> periodically calling __uflow() and __overflow() respectively.
>
> Compiling uniq(1) on Fedora 43 here with default (-O2) options I see:
>
> $ yes $(yes eeeaae | head -n9 | paste -s -d,) | head -n1M > as.in
> $ ltrace -c uniq as.in
> eeeaae,eeeaae,eeeaae,eeeaae,eeeaae,eeeaae,eeeaae,eeeaae,eeeaae
> % time     seconds  usecs/call     calls      function
> ------ ----------- ----------- --------- --------------------
>  97.95   57.971874          55   1048575 memcmp
>   2.05    1.211896          75     16129 __uflow
> ...
>
> So function calls happen only once per 4KiB rather than per byte.
>
> BTW fwrite(...,1) is also similarly optimized,
> but var=1; fwrite(...,var) is not, so I'm also doing the attached
> in my cut(1) update to avoid the function call overhead.

This all makes sense, but looking at this part of the code:

> + /* Buffer empty or not accessible. Consume one byte via getc to
> + trigger a refill, then loop back to use freadptr on the newly
> + populated buffer. */
> + int c = getc (stream);

Wouldn't it be a bit easier, and a bit safer, to buffer things
ourselves and just use read/write?

By that I mean, change 'struct linebuffer' to be something like this:

    struct linebuffer
    {
      char *buffer;         /* The current line. */
      idx_t nalloc;         /* The number of bytes allocated for
                               BUFFER. */
      idx_t eol;            /* The index of the delimiter in BUFFER. */
      char iobuf[BUFSIZE];  /* The buffer with read data. */
      idx_t iobuf_used;     /* The number of bytes in IOBUF not yet
                               copied to BUFFER. */
    };

Then allocate and copy to BUFFER as needed so that we read BUFSIZE bytes
at a time?
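
To make the idea concrete, here is a rough sketch of what I have in
mind. It is not a patch and has not been tried against coreutils. The
names lb_append and lb_readline, the extra iobuf_pos field, the BUFSIZE
value, and the idx_t typedef (standing in for Gnulib's) are all
assumptions for illustration. It uses memchr to find the delimiter in
the data returned by read, so the glibc optimizations still apply.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

typedef ptrdiff_t idx_t;   /* Assumption: stand-in for Gnulib's idx_t.  */
enum { BUFSIZE = 4096 };   /* Assumption: illustrative size only.  */

struct linebuffer
{
  char *buffer;         /* The current line.  */
  idx_t nalloc;         /* The number of bytes allocated for BUFFER.  */
  idx_t eol;            /* The index of the delimiter in BUFFER.  */
  char iobuf[BUFSIZE];  /* The buffer with read data.  */
  idx_t iobuf_pos;      /* Assumption (not in the proposal): offset of
                           the first byte in IOBUF not yet copied.  */
  idx_t iobuf_used;     /* The number of bytes in IOBUF not yet
                           copied to BUFFER.  */
};

/* Copy N bytes from SRC to LB->buffer at offset LEN, growing the
   allocation as needed.  Return 0 on success, -1 if realloc fails.
   Sketch only: no overflow checking on the size computation.  */
int
lb_append (struct linebuffer *lb, idx_t len, char const *src, idx_t n)
{
  if (lb->nalloc - len < n)
    {
      idx_t want = lb->nalloc ? lb->nalloc : 64;
      while (want - len < n)
        want *= 2;
      char *p = realloc (lb->buffer, want);
      if (!p)
        return -1;
      lb->buffer = p;
      lb->nalloc = want;
    }
  memcpy (lb->buffer + len, src, n);
  return 0;
}

/* Read one DELIM-terminated line from FD into LB->buffer.
   Return 1 if a line was stored, 0 at end of input, -1 on error.
   A final line lacking DELIM gets one appended, so LB->eol always
   indexes a delimiter.  */
int
lb_readline (struct linebuffer *lb, int fd, char delim)
{
  idx_t len = 0;
  for (;;)
    {
      if (lb->iobuf_used == 0)
        {
          /* IOBUF is drained: refill with up to BUFSIZE bytes.  */
          ssize_t r;
          do
            r = read (fd, lb->iobuf, BUFSIZE);
          while (r < 0 && errno == EINTR);
          if (r < 0)
            return -1;
          if (r == 0)
            {
              if (len == 0)
                return 0;
              if (lb_append (lb, len, &delim, 1) < 0)
                return -1;
              lb->eol = len;
              return 1;
            }
          lb->iobuf_pos = 0;
          lb->iobuf_used = r;
        }
      assert (lb->iobuf_pos + lb->iobuf_used <= BUFSIZE);

      /* Copy through the delimiter if present, else everything.  */
      char const *start = lb->iobuf + lb->iobuf_pos;
      char const *end = memchr (start, delim, lb->iobuf_used);
      idx_t n = end ? end - start + 1 : lb->iobuf_used;
      if (lb_append (lb, len, start, n) < 0)
        return -1;
      len += n;
      lb->iobuf_pos += n;
      lb->iobuf_used -= n;
      if (end)
        {
          lb->eol = len - 1;
          return 1;
        }
    }
}
```

The struct has to start out zeroed (for example via calloc, which also
keeps the large IOBUF off the stack). A caller would then loop on
lb_readline (lb, fd, '\n') until it returns 0. I added iobuf_pos so
the leftover bytes do not have to be moved to the front of IOBUF after
every line, but that is a design choice, not something the struct above
requires.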

Collin