> Regexp(6) handles "characters" that are runes.

perhaps the man page is misleading. rune in this context means utf-8;
see regexp(2). all the functions take char*s.
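
a rough sketch of the rune vs. octet distinction, for illustration only. it uses python's re module rather than plan 9's regexp(2), so none of the names below come from the plan 9 library; the sample strings are made up. the same utf-8 text is matched once as characters and once as raw bytes:

```python
import re

text = "naïve"                # 5 characters
raw = text.encode("utf-8")    # 6 octets: ï is encoded as 0xC3 0xAF

# character-wise: "." consumes one whole character, however many bytes it has
rune_match = re.match(r"na.ve", text)

# octet-wise: "." consumes a single byte, so ï needs two dots
byte_short = re.match(rb"na.ve", raw)
byte_full = re.match(rb"na..ve", raw)

# byte patterns also work on binary data that is not valid utf-8
# (hypothetical blob, not newline-terminated text)
blob = b"\x00\xff\x0aPK\x03\x04rest"
magic = re.search(rb"PK\x03\x04", blob)

print(rune_match is not None, byte_short is not None,
      byte_full is not None, magic.start())
```

a matcher that takes utf-8 in a char* but matches runes behaves like the first case: one "." per character, regardless of how many bytes encode it.
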
> I wonder if Plan9 developers, when trying to design a way towards some
> localization, have ever thought of bytes (octets) regexp, that is using
> regexp with not rune but octets strings (maybe UTF-8 as is) allowing to
> use regexp with binary too, not only newline terminated chunks etc.?

one of the points of plan 9 was to standardize on one character set,
utf-8. imho, localization and character set aren't related unless
one is dealing with 8859-x overlays or some other character set
insufficient to represent the range of languages.

however, sam and acme allow for structured regular expressions, and
are generally not line oriented:

	http://doc.cat-v.org/bell_labs/structural_regexps/se.pdf

and iirc, cinap has written a cifs bit that uses a bit of binary
matching.

- erik