Re: [Scheme-reports] DISCUSSION/VOTE: The character tower

John Cowan Tue, 06 May 2014 15:53:34 -0700

Bear scripsit:

> I would be rather upset if 
> 
> (string=? (string #\A #\x301) (string #\xc1)) ==> #f
> 
> These strings have the same value, and if string=? does not detect 
> it, I would say that string=? has a bug in its implementation.


You may not like the specification, but it requires the value #f
in this case, at least if the procedure call completes without an error.

> At the very least, string=? in that case is not an implementation of
> any string comparison conforming with the Unicode standard.

It is and it isn't.  It's a mistake to think that Unicode requires that
applications be unable to differentiate between canonically equivalent
forms.  Rather, what Unicode requires is that Application A not expect
that Application B can distinguish between them.

At one time, R7RS-small had string-ni=? and friends, which were meant
for normalization-independent comparisons.  WG1 removed them from the
small language, but something like them may appear in the large language.
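
As an illustration only, here is a minimal sketch of what a
normalization-independent comparison could look like.  R7RS-small has no
normalization procedures, so the sketch assumes an R6RS implementation with
full Unicode support and borrows string-normalize-nfc from (rnrs unicode).
The name string-ni=? is reused for convenience; the definition below is an
assumption, not the procedure that WG1 removed.

```scheme
(import (rnrs))

;; Hypothetical helper: compare two strings after normalizing both to NFC.
;; This is an assumed definition, not the removed R7RS draft procedure.
(define (string-ni=? a b)
  (string=? (string-normalize-nfc a)
            (string-normalize-nfc b)))

(define decomposed  (string #\A #\x301)) ; A + COMBINING ACUTE ACCENT
(define precomposed (string #\xC1))      ; LATIN CAPITAL LETTER A WITH ACUTE

;; Code-point-wise comparison: the two strings differ.
(display (string=? decomposed precomposed))
(newline)

;; Comparison after normalization: the two strings are equivalent.
(display (string-ni=? decomposed precomposed))
(newline)
```

On a conforming implementation this should display #f and then #t.
Normalizing both arguments on every call is the simplest choice, not
necessarily the cheapest one.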

> Likewise I would be upset if 
> 
> (= (string-length (string #\A #\x301)) 
>    (string-length (string #\xc1))) ==> #f
> 
> for the same reason.  

Same story:  #f is required.
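
The length case follows from the same character-by-character view of
strings.  A small sketch, assuming an R7RS implementation that supports
the full Unicode repertoire (R7RS-small itself only requires ASCII):

```scheme
(import (scheme base) (scheme write))

(define decomposed  (string #\A #\x301)) ; two characters
(define precomposed (string #\xC1))      ; one character

;; string-length counts characters, so the lengths differ.
(write (string-length decomposed))
(newline)
(write (string-length precomposed))
(newline)
(write (= (string-length decomposed) (string-length precomposed)))
(newline)
```

Under that assumption it writes 2, 1 and #f.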

-- 
John Cowan          http://www.ccil.org/~cowan        [email protected]
If I have not seen as far as others, it is because giants were standing
on my shoulders.  --Hal Abelson
