On Thu, 18 Apr 2024, Martin D Kealey wrote:
> On Wed, 17 Apr 2024, Chet Ramey wrote:
>> It has to do it that way to find the delimiter on a non-seekable file
>> descriptor, since it has to leave everything it didn't consume
>> available on stdin.
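(To make that concrete: on a non-seekable fd the shell is stuck doing something like the sketch below. Hypothetical helper for illustration, not bash's actual code.)

```c
#include <stddef.h>
#include <unistd.h>

/* Sketch of what a shell's 'read' must do on a non-seekable fd:
 * read one byte per syscall, so that nothing past the delimiter is
 * consumed.  Hypothetical helper, not bash's actual code. */
ssize_t read_until(int fd, char *buf, size_t cap, char delim)
{
    size_t n = 0;
    while (n < cap) {
        char c;
        ssize_t r = read(fd, &c, 1);    /* one syscall per byte */
        if (r <= 0)
            return n > 0 ? (ssize_t)n : r;
        buf[n++] = c;
        if (c == delim)
            break;    /* delimiter found; the rest stays on the fd */
    }
    return (ssize_t)n;
}
```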
> Has anyone tried asking any of the kernel teams (Linux, BSD, or other)
> to add a new system call such as readln() or readd()?
You mean, specifically in order to implement a slightly-more-efficient
'read' builtin in the shell?
> I envisage this working like stty cooked mode works on a tty, except it
> would also work on files, pipes, and sockets: you'd get back *at most*
> as many bytes as you ask for, but you may get fewer if a delimiter is
> found. The delimiter is consumed (and returned in the buffer), but
> everything following a delimiter is left available for a subsequent
> read.
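For what it's worth, those semantics can be approximated in userspace by putting a buffer in front of the fd. The catch is that the leftover bytes then live in the wrapping process rather than in the kernel, so it's no help to a shell that must leave unconsumed input on stdin for whatever reads it next. A minimal sketch (names are made up, not a real API):

```c
#include <stddef.h>
#include <unistd.h>

/* Userspace approximation of the envisaged readd(): return at most
 * cap bytes, fewer if the delimiter is found; the delimiter is
 * consumed and included, and the remainder is kept in a private
 * buffer for the next call.  Hypothetical sketch, not a real API. */
struct dreader {
    int fd;
    size_t start, end;    /* unread bytes are buf[start..end) */
    char buf[4096];
};

ssize_t dread(struct dreader *r, char *out, size_t cap, char delim)
{
    size_t n = 0;
    while (n < cap) {
        if (r->start == r->end) {              /* buffer empty: refill */
            ssize_t got = read(r->fd, r->buf, sizeof r->buf);
            if (got <= 0)
                return n > 0 ? (ssize_t)n : got;
            r->start = 0;
            r->end = (size_t)got;
        }
        char c = r->buf[r->start++];
        out[n++] = c;
        if (c == delim)
            break;    /* rest stays buffered -- in this process, alas */
    }
    return (ssize_t)n;
}
```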
One downside is you'd end up with a system call for each token, which is
only a little bit better than the 'read' builtin read(2)'ing 1 byte at a
time. If your program / shell script is going to be processing a long
stream of tokens, it's just going to be more efficient to read(2) a block
at a time and do the tokenizing in userspace. And with any luck you can
find an appropriate command line utility to do that for you, rather than
relying on the shell's 'read' builtin.
(Or for your own programs, use getline(3)/getdelim(3), as Chet mentioned.)
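(A sketch of that, with getdelim(3) doing the tokenizing over stdio's block reads, so syscalls scale with buffer-fulls rather than bytes or tokens. The helper name is made up for illustration.)

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>

/* Count delimiter-terminated tokens on a stream using getdelim(3).
 * stdio refills its buffer with large read(2) calls, and getdelim
 * scans for the delimiter in userspace, so there is roughly one
 * syscall per block rather than per byte or per token.
 * Hypothetical helper, for illustration only. */
size_t count_tokens(FILE *fp, int delim)
{
    char *tok = NULL;
    size_t cap = 0;
    size_t count = 0;

    while (getdelim(&tok, &cap, delim, fp) != -1)
        count++;

    free(tok);    /* getdelim's buffer is reused across calls */
    return count;
}
```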
Carl