Robert Elz wrote:
> ps: if there was actually a desire to use dd to do re-buffering, the
> correct usage is not to use "bs=" (which simply does read write with
> a buffer that size) but "obs=" which reads (using ibs if needed, which it
> would not be here), copies to an output buffer and writes only when that
> buffer is full (or on EOF on the input).

If the goal is to minimize writes then it won't matter as long as the
buffer size picked is larger than needed.  Using the same buffer size
for input and output is usually most efficient.

First let's set it to something small to prove that it is buffering as
expected.

  $ printf -- "%s\n" one two | strace -o /tmp/out -e read,write dd status=none bs=2 ; cat /tmp/out
  one
  two
  ...
  read(0, "on", 2) = 2
  write(1, "on", 2) = 2
  read(0, "e\n", 2) = 2
  write(1, "e\n", 2) = 2
  read(0, "tw", 2) = 2
  write(1, "tw", 2) = 2
  read(0, "o\n", 2) = 2
  write(1, "o\n", 2) = 2
  read(0, "", 2) = 0
  +++ exited with 0 +++

Lots of reads and writes but all as expected.

Or set just the output buffer size large.  Then the input buffer size
defaults to 512 bytes on my system.

  $ printf -- "%s\n" one two | strace -o /tmp/out -e write,read dd status=none obs=1M ; cat /tmp/out
  one
  two
  ...
  read(0, "one\ntwo\n", 512) = 8
  read(0, "", 512) = 0
  write(1, "one\ntwo\n", 8) = 8
  +++ exited with 0 +++

But even if ibs is much too small it still behaves okay with a small
input buffer size and a large output buffer size.

  $ printf -- "%s\n" one two | strace -o /tmp/out -e write,read dd status=none ibs=2 obs=1M ; cat /tmp/out
  one
  two
  ...
  read(0, "on", 2) = 2
  read(0, "e\n", 2) = 2
  read(0, "tw", 2) = 2
  read(0, "o\n", 2) = 2
  read(0, "", 2) = 0
  write(1, "one\ntwo\n", 8) = 8
  +++ exited with 0 +++

Then set both ibs and obs to be something quite large using bs= and
let it gather up all of the input and write with that buffer size.

  $ printf -- "%s\n" one two | strace -o /tmp/out -e write,read dd status=none bs=1M ; cat /tmp/out
  one
  two
  ...
  read(0, "one\ntwo\n", 1048576) = 8
  write(1, "one\ntwo\n", 8) = 8
  read(0, "", 1048576) = 0
  +++ exited with 0 +++

It seems to me that using a large buffer size for both read and write
would be the most efficient.  It can then use the same buffer that
data was read into for the output buffer directly.

Bob
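
For anyone without strace handy, dd's own transfer statistics should show the same re-buffering. The sketch below drops status=none and reads the "records in" / "records out" lines that dd prints on stderr, where N+M means N full blocks plus M partial blocks. It assumes GNU dd and forces the C locale so those lines are not translated. The helper name dd_records is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: feed "one\ntwo\n" (8 bytes) to dd with the given
# operands and print only dd's record counts on a single line.
dd_records() {
    # 2>&1 sends the statistics down the pipe; >/dev/null discards the data.
    printf '%s\n' one two |
        LC_ALL=C dd "$@" 2>&1 >/dev/null |
        grep records |
        paste -s -d ' ' -
}

# Same small buffer for reading and writing: every 2-byte read is written
# straight back out, so four full blocks in and four full blocks out.
dd_records bs=2            # 4+0 records in 4+0 records out

# Small input buffer, large output buffer: four 2-byte reads are gathered
# and flushed as one partial output block at EOF.
dd_records ibs=2 obs=1M    # 4+0 records in 0+1 records out
```

The second line is the case the quoted text describes: the output side counts a single partial block, which matches the one write(1, ...) in the ibs=2 obs=1M strace output.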