On Thu, Apr 16, 2015 at 9:50 AM, Greg Wooledge <wool...@eeg.ccf.org> wrote:
> I don't see why such features should be compiled into bash's read builtin.
> I'd have no problem with adding better splitting/joining/parsing features
> in a more general context, probably operating on a string variable, but
> certainly not operating on a file descriptor.

I don't think they should be part of `read` either. Some way to extend the
BASH_REMATCH mechanism would be better.

>
> Doesn't the underlying C library only guarantee you a single character of
> lookahead when reading? (Or maybe a single byte. I'm way out of date.
> My knowledge of C comes from the days when char = byte.) You can't do
> all this fancy perl-RE-style lookahead stuff on a stream with only a
> single byte/char of lookahead.

Hm, maybe you're referring to ungetc? IIRC one byte is the only guarantee
when dealing with pipes. I don't really care about having it pattern match
while reading a stream. To make that work well would probably involve mmap
(and even then, only on regular files).

Probably the most portable way to support "fancier" regex is to call into
std::regex. Any system with a modern C++ compiler should support ECMAScript
regex, which is close to a superset of ERE.
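
To make the std::regex idea a bit more concrete, here is a minimal sketch of matching a string (not a stream) and returning the groups in the same order BASH_REMATCH uses: index 0 is the whole match, 1..n are the capture groups. The helper name `rematch` is hypothetical and nothing here is taken from bash's source; it only shows the shape such a call could take, assuming a C++11 or later compiler.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of bash): match `text` against an
// ECMAScript-syntax pattern and return the groups in BASH_REMATCH order.
// An empty vector means "no match" or "invalid pattern" in this sketch.
std::vector<std::string> rematch(const std::string& text,
                                 const std::string& pattern) {
    std::vector<std::string> groups;
    try {
        // ECMAScript is already the default grammar; spelled out for clarity.
        const std::regex re(pattern, std::regex::ECMAScript);
        std::smatch m;
        if (std::regex_search(text, m, re)) {
            // m[0] is the whole match, m[1]..m[n] are the capture groups.
            for (const auto& sub : m) {
                groups.push_back(sub.str());
            }
        }
    } catch (const std::regex_error&) {
        // A real integration would report the error; here it is folded
        // into "no match" to keep the sketch short.
        groups.clear();
    }
    return groups;
}
```

Calling `rematch("key=value", "(\\w+)=(\\w+)")` would yield the three entries `key=value`, `key`, and `value`. A pattern such as `foo(?=bar)` also works, and lookahead is one of the things ERE has no syntax for. Since the whole input is a `std::string`, the single-byte pushback limit on streams never comes into play.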