On Fri Mar 28 09:18:06 2014, moritz wrote:
> On Wed Mar 05 06:28:49 2014, coke wrote:
> > On Tue Mar 04 12:56:48 2014, moritz wrote:
> > > <moritz> p6: say 'ß'.uc, 'ß'.tc, 'ß'.tclc
> > > <camelia> rakudo-jvm f2471a: OUTPUT«SSSSß␤»
> > > <camelia> ..rakudo-parrot f2471a, rakudo-moar f2471a: OUTPUT«ßßß␤»
> > > <camelia> ..niecza v24-109-g48a8de3: OUTPUT«ßSsSs␤»
> > >
> > > All these answers are wrong. 'ß'.uc is supposed to be 'SS' or possibly
> > > 'ẞ', and 'ß'.tc and 'ß'.tclc should both be 'Ss'.
> >
> > Is this Unicode-specified behavior? (If so, can we have a URL for
> > posterity?)
> 
> Yes. http://www.unicode.org/versions/Unicode6.2.0/ch04.pdf refers to
> SpecialCasing.txt, and SpecialCasing.txt contains this:
> 
> # Format
> # ================================================================================
> # The entries in this file are in the following machine-readable format:
> #
> # <code>; <lower> ; <title> ; <upper> ; (<condition_list> ;)? # <comment>
> #
> # <code>, <lower>, <title>, and <upper> provide character values in hex. If there is more
> # than one character, they are separated by spaces. Other than as used to separate
> # elements, spaces are to be ignored.
> 
> [...]
> # The German es-zed is special--the normal mapping is to SS.
> # Note: the titlecase should never occur in practice. It is equal to titlecase(uppercase(<es-zed>))
> 
> 00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S
> 

This is now implemented, and tests unfudged.
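
For anyone checking the expected values by hand, here is a rough sketch (in Python, not the Rakudo implementation) of how the SpecialCasing.txt entry quoted above decodes into its lower, title, and upper mappings. The function name parse_special_casing is made up for illustration, and the sketch assumes an unconditional entry, i.e. one without a <condition_list> field.

```python
def parse_special_casing(line):
    """Decode one unconditional SpecialCasing.txt entry (hypothetical helper)."""
    # Everything after '#' is a comment and carries no mapping data.
    data = line.split("#", 1)[0]
    # Fields are separated by ';'; surrounding spaces are to be ignored.
    fields = [field.strip() for field in data.split(";")]
    code, lower, title, upper = fields[:4]

    def decode(hex_values):
        # Each field holds one or more space-separated hex code points.
        return "".join(chr(int(h, 16)) for h in hex_values.split())

    return {
        "char": decode(code),
        "lower": decode(lower),
        "title": decode(title),
        "upper": decode(upper),
    }


entry = parse_special_casing(
    "00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S"
)
print(entry)  # {'char': 'ß', 'lower': 'ß', 'title': 'Ss', 'upper': 'SS'}
```

This matches the expectations from the original report: 'SS' for uppercase and 'Ss' for titlecase.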
