Re: [r6rs-discuss] Case sensitivity

Ray Dillinger Tue, 24 Feb 2009 12:42:38 -0800

On Tue, 2009-02-24 at 00:14 -0800, Guillermo J. Rozas wrote:

> I'm not going to try to convince you of that one (it is not a  
> technical issue),
> but likewise, please agree that there is no unsurmountable technical
> issue with having a case-insensitive language, and thus, for those
> of us who find that important (due to tradition, preference, or  
> whatever),
> the technical arguments that have been raised are spurious,
> and, sorry to say, 'naive'.


I explained, and argued for, a method of implementing a 
case-insensitive language during the R6RS process.  I see 
that now the Unicode Consortium has blessed what is 
essentially the same method.

It would treat as equivalent characters which are part of 
a reciprocal case pair of single codepoints; that is, if 
you have distinct characters 'x' and 'X', both represented 
in Unicode as single codepoints, and the preferred uppercase
mapping of 'x' is 'X', and the preferred lowercase mapping 
of 'X' is 'x'  (in a default "code" locale), you'd treat 
them as "the same" for purposes of identifier comparison.  

This rule implicitly excludes characters with compatibility 
decompositions, since such a character is never the preferred 
uppercase (or lowercase) mapping of the character that is its 
own preferred lowercase (or uppercase) form.  It also excludes 
accented characters whose accented versions don't exist 
in the opposite case, since in that case one member of the 
"reciprocal pair" is not expressible as a single codepoint. 

This (simple reciprocal pair requirement) sidesteps most 
issues of case-folding strangeness in Unicode by treating
characters which have no single counterpart that forms a 
case pair, or whose case reciprocal is a different number of
codepoints from themselves, as not being amenable to case 
folding at all for purposes of symbol/identifier identity.  
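
As an illustration only (this is not code from the original 
proposal), the reciprocal-pair rule described above might be 
sketched in Python roughly as follows.  The function names are 
hypothetical, and the sketch assumes that Python's 
locale-independent str.lower() and str.upper() can stand in 
for the mappings of a default "code" locale.

```python
def fold_char(c: str) -> str:
    """Fold one codepoint under the reciprocal-pair rule (hypothetical helper).

    A character is folded to its lowercase form only when that lowercase
    form is a single, different codepoint whose uppercase mapping leads
    straight back to the original character.
    """
    lower = c.lower()  # assumption: stands in for the "code locale" mapping
    if len(lower) == 1 and lower != c and lower.upper() == c:
        return lower
    # No reciprocal single-codepoint partner: leave the character alone.
    return c


def fold_identifier(name: str) -> str:
    """Fold every codepoint of an identifier independently."""
    return "".join(fold_char(c) for c in name)


def same_identifier(a: str, b: str) -> bool:
    """Compare two identifiers under the reciprocal-pair rule."""
    return fold_identifier(a) == fold_identifier(b)


if __name__ == "__main__":
    print(same_identifier("Call-With-Values", "call-with-values"))
    print(same_identifier("straße", "STRASSE"))
    print(same_identifier("\u212a", "k"))  # KELVIN SIGN vs. plain 'k'
```

Under these assumptions, ordinary pairs such as 'x' and 'X' are 
merged, while the Kelvin sign (whose lowercase 'k' maps back to 
'K' rather than to itself) and the sharp s (whose uppercase is 
two codepoints) are left untouched.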

Unicode already had case-mapping stability rules in place 
which assured that no identifiers unique by this method  
under any Unicode V3+ would be merged under any future 
version.  I see that those stability rules have been 
strengthened in Unicode V5. 

This would have preserved the R5RS case-insensitive semantics,
insofar as possible under Unicode, without introducing new 
problems for any existing code.  

But it would, undeniably, have been more complex to implement, 
and although most people rejected locale-sensitive semantics
out of hand, no one wanted to bless a single locale as being 
"the code locale" for Scheme.  There was no broad agreement 
that a default or "code locale" could be a politically 
neutral choice, since no single choice could conform to the 
case expectations of users of many different human languages.

When it emerged that a large-ish code base had been converted 
to run under case sensitivity within a matter of hours without 
problems (and that people considered it worthwhile enough to 
actually commit to doing this work), that many major 
implementations already supported and preferred a case-sensitive 
mode of operation, that the technical requirements of widely-used 
methods of interfacing with libraries written in other languages 
were facilitated by case-sensitive operation, and that the results 
of the poll heavily favored case-sensitive behavior (more than 
a 2/3 supermajority, as I recall), I understood it to mean that 
there was community support for a _change_ in the language 
above and beyond the changes strictly necessary for the adoption 
of Unicode, and (somewhat reluctantly) stopped advocating the 
case-insensitive model.

I do not advocate revisiting this decision in R7RS.  Although 
I was on the "losing" side with my proposal to preserve
case-insensitivity, what I was doing was arguing for a way to 
avoid unnecessary change, especially since I saw that change 
as likely to cause a rift and hard feelings within the community
(which unfortunately it has, in alienating a minority of our 
user base).  

The change to case-sensitive semantics was not technically 
necessary.  But at this point a change back would be equally
unnecessary, and probably even more harmful to the community, 
especially since the majority wanted it the way it is now 
and presumably still does. 

                                Bear



