On Tue, 2009-02-24 at 00:14 -0800, Guillermo J. Rozas wrote:
> I'm not going to try to convince you of that one (it is not a
> technical issue),
> but likewise, please agree that there is no unsurmountable technical
> issue with having a case-insensitive language, and thus, for those
> of us who find that important (due to tradition, preference, or
> whatever),
> the technical arguments that have been raised are spurious,
> and, sorry to say, 'naive'.
I explained, and argued for, a method of implementing a
case-insensitive language during the R6RS process. I see
that now the Unicode Consortium has blessed what is
essentially the same method.
It would treat as equivalent characters which are part of
a reciprocal case pair of single codepoints; that is, if
you have distinct characters 'x' and 'X', both represented
in Unicode as single codepoints, and the preferred uppercase
mapping of 'x' is 'X', and the preferred lowercase mapping
of 'X' is 'x' (in a default "code" locale), you'd treat
them as "the same" for purposes of identifier comparison.
This rule implicitly excludes characters with compatibility
decompositions, since they are never the preferred uppercase
or preferred lowercase of the characters which are their own
preferred lowercase or uppercase forms. It also excludes
accented characters whose accented versions don't exist
in the opposite case, since there one of the "reciprocal
pair" is not expressible as a single codepoint.
This simple reciprocal-pair requirement sidesteps most
issues of case-folding strangeness in Unicode: characters
which have no single counterpart that forms a case pair,
or whose case reciprocal is a different number of codepoints
from themselves, are treated as not amenable to case folding
at all for purposes of symbol/identifier identity.
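
As a rough illustration of the reciprocal-pair rule described above, here is
a minimal sketch in Python. It is not the text of any proposal or standard.
The function names (fold_char, fold_identifier, same_identifier) are made up
for this example, and it assumes that Python's locale-independent str.lower()
and str.upper() are an acceptable stand-in for the "preferred" case mappings
of a default "code" locale.

```python
def fold_char(ch):
    """Fold ch to lowercase only if it is the uppercase half of a
    reciprocal pair of single codepoints; otherwise return it unchanged.

    Assumption: str.lower()/str.upper() (Unicode default mappings, not
    locale-dependent) stand in for the "preferred" case mappings.
    """
    lower = ch.lower()
    if (lower != ch                 # ch actually has a lowercase form
            and len(lower) == 1     # ...which is a single codepoint
            and lower.upper() == ch):   # ...and which maps back to ch
        return lower
    return ch


def fold_identifier(name):
    """Fold an identifier one codepoint at a time."""
    return "".join(fold_char(c) for c in name)


def same_identifier(a, b):
    """True if a and b name the same identifier under this rule."""
    return fold_identifier(a) == fold_identifier(b)
```

Under this sketch "Foo" and "FOO" compare equal, while "straße" and
"STRASSE" stay distinct, because the uppercase of U+00DF is two codepoints.
Likewise the Kelvin sign (U+212A) is left alone: its lowercase is "k", but
the uppercase of "k" is the ordinary "K", so the pair is not reciprocal.
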
Unicode already had case-mapping stability rules in place
which assured that no identifiers unique by this method
under any Unicode V3+ would be merged under any future
version. I see that those stability rules have been
strengthened in Unicode V5.
This would have preserved the R5RS case-insensitive semantics,
insofar as possible under Unicode, without introducing new
problems for any existing code.
But it would, undeniably, have been more complex to implement,
and although most people rejected a locale-sensitive semantics
out of hand, no one wanted to bless a single locale as being
"the code locale" for Scheme. There was little agreement
that a default or "code locale" could be a
politically neutral choice, since no single choice could
conform to the case expectations of users of many different
human languages.
Several things then emerged: that a large-ish code base had been
converted to run under case sensitivity within a matter of hours
without problems (and that people considered it worthwhile enough
to actually commit to doing this work); that many major
implementations already supported and preferred a case-sensitive
mode of operation; that the technical requirements of widely used
methods of interfacing with libraries written in other languages
were facilitated by case-sensitive operation; and that the results
of the poll heavily favored case-sensitive behavior (more than
a 2/3 supermajority, as I recall). I understood this to mean that
there was community support for a _change_ in the language
above and beyond the changes strictly necessary for the adoption
of Unicode, and (somewhat reluctantly) stopped advocating the
case-insensitive model.
I do not advocate revisiting this decision in R7RS. Although
I was on the "losing" side with my proposal to preserve
case-insensitivity, what I was doing was arguing for a way to
avoid unnecessary change, especially since I saw that change
as likely to cause a rift and hard feelings within the community
(which unfortunately it has, in alienating a minority of our
user base).
The change to case-sensitive semantics was not technically
necessary. But at this point a change back would be equally
unnecessary, and probably even more harmful to the community,
especially since the majority wanted it the way it is now
and presumably still does.
Bear