Re: [r6rs-discuss] Proposed features for small Scheme, part 4B: case-sensitivity

John Cowan Wed, 09 Sep 2009 01:01:21 -0700

Arthur Gleck scripsit:

> If someone's case-sensitive code imports `FOO' from a case-folding
> library that exports it as `foo', should it find it?


I'd say no.  Since Unicode case-folding is basically downcasing (for
reasons explained in my Unicode post), the situation of case-preserving
code importing names from case-folding code is straightforward:
normal names without escape sequences arrive as lower case.  The only
circumstance in which case-folding code can export names with uppercase
letters is when the case-folding code contains escape sequences:
see below.

Shiro Kawai scripsit:

> For example, what will you suggest a natural interface to
> import Xlib?  It defines symbols like XK_A and XK_a,
> corresponding user's key input of uppercase and lowercase
> 'a', respectively.
> 
> We can use xk_capital_a and xk_small_a, or xk_\x41; and xk_a,
> respectively.  

[Note: I changed 65 decimal to 41 hex above]

I think the latter observation is the key to the solution.  Given
the presence of escape sequences in Thing One names, all Thing One
implementations, whether in case-preserving or case-folding mode, will
be able to support identifiers with both upper and lower case characters
in them.

That means that the symbols whose print names are "Foo" and "foo" are
always distinct as identifiers.  It's just that in case-folding mode,
the first is spelled \x46;oo or \x46;OO (or variants thereof), whereas
the latter is spelled foo or FOO or Foo (or variants thereof).

In case-preserving mode, though, foo, Foo, and FOO spell distinct
identifiers, known to case-folding mode as foo or Foo or FOO, \x46;oo
or \x46;OO, and \x46;\x4F;\x4F;, respectively.  Yes, the last is horrible;
but if case-folding code needs to import names from case-preserving code,
at least it's *possible*.
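
The correspondence between spellings and print names in the two modes
can be modelled in a few lines of Python.  This is only an illustrative
sketch, not any implementation's reader: the names read_identifier and
fold_case are hypothetical, and str.lower() stands in for Unicode
case-folding on the assumption that the two agree for the names shown.

```python
import re

# Matches an inline hex escape such as \x46; (backslash, x, hex digits, semicolon).
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]+);")


def read_identifier(spelling: str, fold_case: bool) -> str:
    """Return the print name denoted by `spelling` (hypothetical helper).

    Escaped characters are taken literally in both modes; unescaped
    characters are downcased only when fold_case is true.
    """
    parts = []
    pos = 0
    for match in _HEX_ESCAPE.finditer(spelling):
        literal = spelling[pos:match.start()]
        # Assumption: lower() approximates case-folding here.
        parts.append(literal.lower() if fold_case else literal)
        parts.append(chr(int(match.group(1), 16)))
        pos = match.end()
    tail = spelling[pos:]
    parts.append(tail.lower() if fold_case else tail)
    return "".join(parts)


# Case-folding mode: only escapes can introduce uppercase letters.
print(read_identifier("FOO", fold_case=True))
print(read_identifier(r"\x46;OO", fold_case=True))
print(read_identifier(r"\x46;\x4F;\x4F;", fold_case=True))
# Case-preserving mode: the spelling is the print name.
print(read_identifier("Foo", fold_case=False))
```

Under this model the three case-folding spellings above denote the
print names "foo", "Foo", and "FOO", which are exactly the names that
case-preserving code writes directly.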

An alternative is to adopt some form of Common Lisp's |...| convention
for symbols.  All characters within vertical bars are part of the symbol
(only \ and | must be escaped) and no case-folding is done.  In this
way, the case-folded names of the case-preserving identifiers foo, Foo,
and FOO are simply foo (or FOO), |Foo|, and |FOO|, respectively.
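
As a rough sketch of how a case-folding writer might choose between the
bare and the barred spelling, consider the following Python.  The
function name write_symbol and the exact set of characters that force
bars are assumptions made for illustration; the only rule taken from the
convention itself is that \ and | are escaped inside the bars.

```python
# Characters assumed (for this sketch only) to force the |...| form.
_NEEDS_BARS = set(" \t\n()|\\\"';")


def write_symbol(name: str) -> str:
    """Spell the print name `name` for a case-folding reader (hypothetical helper)."""
    is_plain = (
        bool(name)
        and name == name.lower()          # no uppercase letters to protect
        and not (_NEEDS_BARS & set(name)) # no delimiters or escape characters
    )
    if is_plain:
        return name
    # Inside bars only backslash and vertical bar are escaped.
    escaped = name.replace("\\", "\\\\").replace("|", "\\|")
    return "|" + escaped + "|"


for print_name in ["foo", "Foo", "FOO", "(", "foo bar"]:
    print(write_symbol(print_name))
```

With these assumptions, "foo" is written bare, while "Foo", "FOO", "(",
and "foo bar" all come out wrapped in vertical bars.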

The R6RS team rejected this idea, which is already implemented in some
Schemes (Chicken, that I know of), because it can make S-expressions hard
to read: look at (|(| |foo bar| |)|), for example, a list of three symbols
whose print names are "(", "foo bar", and ")".  But perhaps, given the
persistence of R4RS/IEEE/R5RS case-folding code, and the extreme ugliness
of the Unicode escapes as the sole mechanism, it ought to be reconsidered
for Thing One and Thing Two.

> Case-folding symbol comparison seems to me to bring exceptions
> into rules.  The fact that it is very rare makes situations only
> worse---people are tempted to make programs that ignore rare cases,
> which work 99% of the time but break unexpectedly.

I would say instead that the rules, though fixed, have very messy
consequences in certain cases.

-- 
Said Agatha Christie / To E. Philips Oppenheim  John Cowan
"Who is this Hemingway? / Who is this Proust?   [email protected]
Who is this Vladimir / Whatchamacallum,         http://www.ccil.org/~cowan
This neopostrealist / Rabble?" she groused.
        --George Starbuck, Pith and Vinegar

