Alaric Snell-Pym scripsit:

> How do you propose we deal with the issues of switching between char
> and u8 operations? I've seen some offhand talk on IRC that once a char
> operation has been used, a port is no good for binary any more, as the
> encoding may be nontrivial and involve some look-ahead; can this not
> be addressed by pushing bytes back into a buffer so a subsequent call
> to read-u8 will return them without calling the underlying custom-port
> read-u8? What is this evil encoding that makes it impossible to switch
> back to binary mode after reading characters? One that uses Huffman
> codes or something?

The bad case is ISO 2022, in which several different encodings are in
use at once and an escape sequence switches from one to another.  See
the Wikipedia article for the incredibly grotty details.
I actually implemented this in the front end to FIGlet.

Anyhow, when you've read the last character in a run, have you
or have you not read the bytes of the escape sequence that
terminates the run?
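
To make the ambiguity concrete, here is a minimal sketch in Python (my
choice for illustration only; nothing in this thread depends on it).  It
assumes Python's built-in iso2022_jp codec and a made-up two-byte binary
trailer.  It compares a decoder that stops as soon as the character is
available with one that also swallows the closing escape sequence.

```python
import codecs

# One character of ISO-2022-JP text: ESC $ B switches to JIS X 0208,
# two bytes encode the character, and ESC ( B switches back to ASCII.
encoded = "あ".encode("iso2022_jp")

# Hypothetical binary data that follows the text in the same stream.
payload = encoded + b"\x00\x01"


def bytes_needed_for_first_char(data):
    """Feed one byte at a time; return how many bytes were fed when the
    incremental decoder first produced a character (None if it never did)."""
    decoder = codecs.getincrementaldecoder("iso2022_jp")()
    for i in range(len(data)):
        if decoder.decode(data[i:i + 1]):
            return i + 1
    return None


# A "lazy" reader stops once the character is available.
lazy_offset = bytes_needed_for_first_char(encoded)

# An "eager" reader also consumes the escape sequence that ends the run.
eager_offset = len(encoded)

# What a subsequent binary read would see under each strategy.
remaining_lazy = payload[lazy_offset:]
remaining_eager = payload[eager_offset:]

print(lazy_offset, eager_offset)
print(remaining_lazy, remaining_eager)
```

Both readers hand back the same character, but they leave the byte
position in different places, so a binary read that follows sees either
the escape sequence plus the trailer or the trailer alone.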

> I ask because it's not uncommon to interleave binary and text in a
> stream. Many binary formats have embedded strings, for example. 

True, but the delimitation of the string is format-specific: fixed
length, counted, and NUL-terminated are all in use.  That's not
something a general-purpose package can easily handle.
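
As a rough illustration of why this is format-specific, here is a
Python sketch of the three conventions.  The function names, the
one-byte length prefix, the NUL padding of fixed-length fields, and the
ASCII default are all assumptions made for the example, not part of any
particular format or library.

```python
import io


def read_fixed(stream, size, encoding="ascii"):
    """Fixed-length field: always consume `size` bytes.
    Assumes unused space is padded with NUL bytes."""
    raw = stream.read(size)
    return raw.rstrip(b"\x00").decode(encoding)


def read_counted(stream, encoding="ascii"):
    """Counted string: assumes a one-byte length prefix, then that many
    bytes of text. Raises IndexError if the stream is already at EOF."""
    count = stream.read(1)[0]
    return stream.read(count).decode(encoding)


def read_nul_terminated(stream, encoding="ascii"):
    """NUL-terminated string: read up to, and consume, the first NUL
    byte (or stop at end of stream)."""
    out = bytearray()
    while (byte := stream.read(1)) not in (b"", b"\x00"):
        out += byte
    return out.decode(encoding)


# A made-up record containing one string in each style, then a binary byte.
stream = io.BytesIO(b"abc\x00\x00" + b"\x02hi" + b"yo\x00" + b"\xff")

fixed = read_fixed(stream, 5)
counted = read_counted(stream)
terminated = read_nul_terminated(stream)
trailer = stream.read(1)
```

In each case the format itself says where the text ends, so the reader
pulls out the right bytes first and decodes them afterwards; the binary
byte after the strings is still there to be read.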

-- 
John Cowan   http://ccil.org/~cowan   [email protected]
'My young friend, if you do not now, immediately and instantly, pull
as hard as ever you can, it is my opinion that your acquaintance in the
large-pattern leather ulster' (and by this he meant the Crocodile) 'will
jerk you into yonder limpid stream before you can say Jack Robinson.'
        --the Bi-Coloured-Python-Rock-Snake
