Re: cat(1): always use a 64K buffer
I think Todd is right, and this doesn't need to change. Todd C. Miller wrote: > On Sun, 12 Dec 2021 19:15:51 -0600, Scott Cheloha wrote: > > > cat(1) sizes its I/O buffer according to the st_blksize of the first > > file it processes. We don't do this very often in the tree. I'm not > > sure if we should trust st_blksize. > > It sizes the buffer based on st_blksize of stdout, not the input > file. Since st_blksize is the "optimal" blocksize for that device. > Since cat only has a single output, the output buffer only needs > to be sized once. > > > It would be simpler to just choose a value that works in practice and > > always use it. > > I prefer it the way it is now. By using st_blksize you get the > proper block size regardless of whether output is to a pipe, a file > or a tty. I didn't suggest this for tee since it supports multiple > output files. > > - todd >
Re: cat(1): always use a 64K buffer
On Sun, 12 Dec 2021 19:15:51 -0600, Scott Cheloha wrote: > cat(1) sizes its I/O buffer according to the st_blksize of the first > file it processes. We don't do this very often in the tree. I'm not > sure if we should trust st_blksize. It sizes the buffer based on st_blksize of stdout, not the input file. Since st_blksize is the "optimal" blocksize for that device. Since cat only has a single output, the output buffer only needs to be sized once. > It would be simpler to just choose a value that works in practice and > always use it. I prefer it the way it is now. By using st_blksize you get the proper block size regardless of whether output is to a pipe, a file or a tty. I didn't suggest this for tee since it supports multiple output files. - todd
Re: cat(1): always use a 64K buffer
On Mon, Dec 13, 2021 at 02:52:50AM -0600, Scott Cheloha wrote: > > On Dec 13, 2021, at 01:13, Otto Moerbeek wrote: > > > > On Sun, Dec 12, 2021 at 07:15:51PM -0600, Scott Cheloha wrote: > > > >> cat(1) sizes its I/O buffer according to the st_blksize of the first > >> file it processes. We don't do this very often in the tree. I'm not > >> sure if we should trust st_blksize. > >> > >> It would be simpler to just choose a value that works in practice and > >> always use it. > >> > >> 64K works well. We settled on that for tee(1) recently. > > > > Why are you also change the allocation to be per-file? You might as > > well go for a static buffer if you make it fixed size. > > Ottomalloc does canary checks when free(3) > is called if malloc_options is set up for it. > I always thought that was useful when > auditing my code. In this case (large allocation an exact multiple of the page size) any overflow will be caught by the MMU, independent of malloc options. So no big gain here to be gained from calling free. I'd say keep the original one time malloc call. -Otto
Re: cat(1): always use a 64K buffer
> On Dec 13, 2021, at 01:13, Otto Moerbeek wrote: > > On Sun, Dec 12, 2021 at 07:15:51PM -0600, Scott Cheloha wrote: > >> cat(1) sizes its I/O buffer according to the st_blksize of the first >> file it processes. We don't do this very often in the tree. I'm not >> sure if we should trust st_blksize. >> >> It would be simpler to just choose a value that works in practice and >> always use it. >> >> 64K works well. We settled on that for tee(1) recently. > > Why are you also change the allocation to be per-file? You might as > well go for a static buffer if you make it fixed size. Ottomalloc does canary checks when free(3) is called if malloc_options is set up for it. I always thought that was useful when auditing my code.
Re: cat(1): always use a 64K buffer
On Sun, Dec 12, 2021 at 07:15:51PM -0600, Scott Cheloha wrote: > cat(1) sizes its I/O buffer according to the st_blksize of the first > file it processes. We don't do this very often in the tree. I'm not > sure if we should trust st_blksize. > > It would be simpler to just choose a value that works in practice and > always use it. > > 64K works well. We settled on that for tee(1) recently. Why are you also change the allocation to be per-file? You might as well go for a static buffer if you make it fixed size. -Otto > > Thoughts? > > Index: cat.c > === > RCS file: /cvs/src/bin/cat/cat.c,v > retrieving revision 1.32 > diff -u -p -r1.32 cat.c > --- cat.c 24 Oct 2021 21:24:21 - 1.32 > +++ cat.c 13 Dec 2021 00:57:48 - > @@ -33,9 +33,6 @@ > * SUCH DAMAGE. > */ > > -#include > -#include > - > #include > #include > #include > @@ -45,7 +42,7 @@ > #include > #include > > -#define MAXIMUM(a, b)(((a) > (b)) ? (a) : (b)) > +#define BSIZE (64 * 1024) > > int bflag, eflag, nflag, sflag, tflag, vflag; > int rval; > @@ -221,26 +218,20 @@ raw_args(char **argv) > void > raw_cat(int rfd, const char *filename) > { > - int wfd; > + char *buf; > ssize_t nr, nw, off; > - static size_t bsize; > - static char *buf = NULL; > - struct stat sbuf; > + int wfd; > > wfd = fileno(stdout); > - if (buf == NULL) { > - if (fstat(wfd, &sbuf) == -1) > - err(1, "stdout"); > - bsize = MAXIMUM(sbuf.st_blksize, BUFSIZ); > - if ((buf = malloc(bsize)) == NULL) > - err(1, NULL); > - } > - while ((nr = read(rfd, buf, bsize)) != -1 && nr != 0) { > + if ((buf = malloc(BSIZE)) == NULL) > + err(1, NULL); > + while ((nr = read(rfd, buf, BSIZE)) != -1 && nr != 0) { > for (off = 0; nr; nr -= nw, off += nw) { > if ((nw = write(wfd, buf + off, nr)) == -1 || nw == 0) > err(1, "stdout"); > } > } > + free(buf); > if (nr == -1) { > warn("%s", filename); > rval = 1; >
cat(1): always use a 64K buffer
cat(1) sizes its I/O buffer according to the st_blksize of the first file it processes. We don't do this very often in the tree. I'm not sure if we should trust st_blksize. It would be simpler to just choose a value that works in practice and always use it. 64K works well. We settled on that for tee(1) recently. Thoughts? Index: cat.c === RCS file: /cvs/src/bin/cat/cat.c,v retrieving revision 1.32 diff -u -p -r1.32 cat.c --- cat.c 24 Oct 2021 21:24:21 - 1.32 +++ cat.c 13 Dec 2021 00:57:48 - @@ -33,9 +33,6 @@ * SUCH DAMAGE. */ -#include -#include - #include #include #include @@ -45,7 +42,7 @@ #include #include -#define MAXIMUM(a, b) (((a) > (b)) ? (a) : (b)) +#define BSIZE (64 * 1024) int bflag, eflag, nflag, sflag, tflag, vflag; int rval; @@ -221,26 +218,20 @@ raw_args(char **argv) void raw_cat(int rfd, const char *filename) { - int wfd; + char *buf; ssize_t nr, nw, off; - static size_t bsize; - static char *buf = NULL; - struct stat sbuf; + int wfd; wfd = fileno(stdout); - if (buf == NULL) { - if (fstat(wfd, &sbuf) == -1) - err(1, "stdout"); - bsize = MAXIMUM(sbuf.st_blksize, BUFSIZ); - if ((buf = malloc(bsize)) == NULL) - err(1, NULL); - } - while ((nr = read(rfd, buf, bsize)) != -1 && nr != 0) { + if ((buf = malloc(BSIZE)) == NULL) + err(1, NULL); + while ((nr = read(rfd, buf, BSIZE)) != -1 && nr != 0) { for (off = 0; nr; nr -= nw, off += nw) { if ((nw = write(wfd, buf + off, nr)) == -1 || nw == 0) err(1, "stdout"); } } + free(buf); if (nr == -1) { warn("%s", filename); rval = 1;