Chris Hanson scripsit:

> John, I'm guessing you'd know: is there a way to detect character
> encoding in R6RS programs?

I'd be surprised if there were, since R6RS lexical syntax is defined
only at the level of characters.

> I've always been impressed with the way XML handled this, even though
> it's a kludge.

XML's scheme is indeed clever, although there are encodings it can't
handle, such as the fiendish (but fortunately fictitious) US-BSCII,
which is the same as US-ASCII except for four byte-to-character
mappings: 0x41:U+0042, 0x42:U+0041, 0x61:U+0062, 0x62:U+0061 (that is,
A/B and a/b are swapped).  Unfortunately, since "us-bscii" in US-BSCII
has the same bytes as "us-ascii" in US-ASCII, there's no way for an
XML processor to distinguish them.
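
To make the collision concrete, here is a minimal Python sketch.
US-BSCII is fictitious, so the helper names bscii_encode and
bscii_decode are invented for illustration; the only assumption is the
four swapped mappings listed above.

```python
# Hypothetical codec for the fictitious US-BSCII: identical to
# US-ASCII except that A/B and a/b are swapped.
SWAP = bytes.maketrans(b"ABab", b"BAba")

def bscii_encode(text: str) -> bytes:
    # Encode as ASCII, then apply the swap to the resulting bytes.
    return text.encode("ascii").translate(SWAP)

def bscii_decode(data: bytes) -> str:
    # The swap is its own inverse, so decoding applies it again.
    return data.translate(SWAP).decode("ascii")

# The two XML declarations produce exactly the same bytes...
decl_ascii = '<?xml version="1.0" encoding="us-ascii"?>'.encode("ascii")
decl_bscii = bscii_encode('<?xml version="1.0" encoding="us-bscii"?>')
print(decl_ascii == decl_bscii)

# ...but the same document bytes mean different things afterwards.
body = b"<A/>"
print(body.decode("ascii"), bscii_decode(body))
```

A processor that sees these bytes has nothing to go on except the
declaration, and the declaration reads the same either way.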

My blog post at
http://recycledknowledge.blogspot.com/2005/07/hello-i-am-xml-encoding-sniffer.html
gives an operational semantics for the level of XML encoding detection
that actually works.
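
For readers who want the general shape without following the link,
here is a rough Python sketch of the first step, based on the byte
patterns in the non-normative Appendix F of the XML 1.0
Recommendation.  It is not the algorithm from the blog post, and the
function name sniff_family and its return labels are made up for this
example.

```python
def sniff_family(data: bytes) -> str:
    """Guess the encoding family from the first four bytes.

    Sketch only: covers byte-order marks and the '<?xm' patterns from
    XML 1.0 Appendix F, and leaves out the rarer UCS-4 byte orders.
    """
    head = data[:4]
    # Four-byte patterns come first so that FF FE 00 00 is not
    # mistaken for a UTF-16 little-endian byte-order mark.
    table = [
        (b"\x00\x00\xfe\xff", "ucs-4-be"),
        (b"\xff\xfe\x00\x00", "ucs-4-le"),
        (b"\x00\x00\x00<", "ucs-4-be"),
        (b"<\x00\x00\x00", "ucs-4-le"),
        (b"\xef\xbb\xbf", "utf-8"),
        (b"\xfe\xff", "utf-16-be"),
        (b"\xff\xfe", "utf-16-le"),
        (b"\x00<\x00?", "utf-16-be"),
        (b"<\x00?\x00", "utf-16-le"),
        (b"<?xm", "ascii-compatible"),
        (b"\x4c\x6f\xa7\x94", "ebcdic"),
    ]
    for prefix, family in table:
        if head.startswith(prefix):
            return family
    # No byte-order mark and no declaration: XML defaults to UTF-8.
    return "utf-8"

print(sniff_family(b'<?xml version="1.0" encoding="us-ascii"?>'))
print(sniff_family(b"\xff\xfe<\x00?\x00"))
```

In the "ascii-compatible" case the processor still has to read the
encoding declaration on the assumption that the bytes are ASCII-like,
which is exactly the step US-BSCII defeats.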

-- 
The Imperials are decadent, 300 pound   John Cowan <[email protected]>
free-range chickens (except they have   http://www.ccil.org/~cowan
teeth, arms instead of wings, and
dinosaurlike tails).                        --Elyse Grasso

