Alaric Snell-Pym scripsit:

> How do you propose we deal with the issues of switching between char
> and u8 operations? I've seen some offhand talk on IRC that once a char
> operation has been used, a port is no good for binary any more, as the
> encoding may be nontrivial and involve some look-ahead; can this not
> be addressed by pushing bytes back into a buffer so a subsequent call
> to read-u8 will return them without calling the underlying custom-port
> read-u8? What is this evil encoding that makes it impossible to switch
> back to binary mode after reading characters? One that uses Huffman
> codes or something?

The bad case is ISO 2022 where multiple different encodings are used,
several at a time, and you use an escape sequence to switch from one
to another. See the Wikipedia article for the incredibly grotty
details. I actually implemented this in the front end to FIGlet.
Anyhow, when you've read the last character in a run, have you or have
you not read the bytes of the escape sequence that terminates the run?

> I ask because it's not uncommon to interleave binary and text in a
> stream. Many binary formats have embedded strings, for example.

True, but the delimitation of the string is format-specific: fixed
length, counted, and NUL-terminated are all in use. That's not
something a general-purpose package can easily handle.

--
John Cowan http://ccil.org/~cowan [email protected]
'My young friend, if you do not now, immediately and instantly, pull
as hard as ever you can, it is my opinion that your acquaintance in
the large-pattern leather ulster' (and by this he meant the Crocodile)
'will jerk you into yonder limpid stream before you can say Jack
Robinson.' --the Bi-Coloured-Python-Rock-Snake
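
To make the escape-sequence question concrete, here is a rough sketch in Python rather than Scheme. It is not how any real port implementation works: read_char, its eager flag, and the sample stream are made up for illustration, and only two ISO-2022-JP escape sequences (switch to JIS X 0208, switch back to ASCII) are modelled. It shows that two equally reasonable decoders leave the byte position in different places after the same character, so a following read-u8 sees different bytes.

```python
# Two ISO-2022-JP escape sequences; the rest of ISO 2022 is ignored here.
TO_JIS = b"\x1b$B"    # switch to JIS X 0208 (two bytes per character)
TO_ASCII = b"\x1b(B"  # switch back to ASCII (one byte per character)


def read_char(data, pos, width, eager):
    """Hypothetical helper: read one character starting at pos.

    Returns (char_bytes, new_pos, new_width). Escape sequences in front of
    the character are always consumed. If eager is true, an escape sequence
    that follows the character is consumed as well.
    """
    def skip_escapes(pos, width):
        while data[pos:pos + 3] in (TO_JIS, TO_ASCII):
            width = 2 if data[pos:pos + 3] == TO_JIS else 1
            pos += 3
        return pos, width

    pos, width = skip_escapes(pos, width)
    char = data[pos:pos + width]
    pos += width
    if eager:
        pos, width = skip_escapes(pos, width)
    return char, pos, width


# One JIS X 0208 character (bytes 0x24 0x22), the escape back to ASCII,
# then a single "binary" byte 0x2a that the caller wants via read-u8.
stream = TO_JIS + b'$"' + TO_ASCII + b"\x2a"

_, lazy_pos, _ = read_char(stream, 0, 1, eager=False)
_, eager_pos, _ = read_char(stream, 0, 1, eager=True)

# The next byte a read-u8 would return under each strategy.
print(hex(stream[lazy_pos]), hex(stream[eager_pos]))  # prints: 0x1b 0x2a
```

The lazy reader stops at offset 5 and hands the ESC byte to the binary consumer; the eager reader stops at offset 8 and hands over 0x2a. Neither is obviously wrong, and that is the ambiguity. A pushback buffer only helps if the port knows which of the two positions the writer of the format intended.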
