Sharp-S character treated incorrectly in UNICODE_CI_AI collation
----------------------------------------------------------------
Key: CORE-4136
URL: http://tracker.firebirdsql.org/browse/CORE-4136
Project: Firebird Core
Issue Type: Bug
Components: Charsets/Collation
Affects Versions: 2.5.2 Update 1
Environment: Tested on Windows 7 Pro, Firebird 2.5.2 Update 1
Reporter: Stefan Heymann
The UNICODE_CI_AI collation treats the sharp-s character "ß" (U+00DF) incorrectly.
This character, used in German text, is special in that it has only a
lower-case form and no upper-case one, because it derives from a ligature of a
long and a round lowercase "s". (U+1E9E can be ignored here: it is an abstract
invention of the Unicode Consortium that has no practical use in German.)
To reproduce the bug, run this query on a UTF8 database:

select
  case when 'Übergeek' collate unicode_ci_ai like 'ÜB%' collate unicode_ci_ai
       then '=' else '<>' end as test_1,
  case when 'Übergeek' collate unicode_ci_ai like 'üb%' collate unicode_ci_ai
       then '=' else '<>' end as test_2,
  case when 'Fußball' collate unicode_ci_ai like 'fu%' collate unicode_ci_ai
       then '=' else '<>' end as test_3,
  case when 'Fußball' collate unicode_ci_ai like 'fuß%' collate unicode_ci_ai
       then '=' else '<>' end as test_4,
  case when upper ('Fußball') like upper ('fuß%')
       then '=' else '<>' end as test_5
from rdb$database

TEST_4 shows '<>' (a mismatch) where it should show '=' (a match).
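
For comparison, here is a minimal sketch in Python (not Firebird code) of what
a case- and accent-insensitive prefix match would be expected to return for the
first four test values. It assumes Unicode full case folding plus removal of
combining marks as a rough stand-in for a CI_AI comparison; this is not the
algorithm Firebird uses internally, and the helper names fold_ci_ai and
like_prefix are hypothetical.

```python
import unicodedata


def fold_ci_ai(text: str) -> str:
    """Rough case- and accent-insensitive key (an assumption, not Firebird's rule)."""
    # Decompose so that accents become separate combining characters.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks (e.g. the diaeresis of "Ü").
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Full case folding; this maps the sharp s (U+00DF) to "ss".
    return without_marks.casefold()


def like_prefix(value: str, prefix: str) -> bool:
    """Emulate `value LIKE 'prefix%'` on the folded keys."""
    return fold_ci_ai(value).startswith(fold_ci_ai(prefix))


if __name__ == "__main__":
    cases = [
        ("test_1", "Übergeek", "ÜB"),
        ("test_2", "Übergeek", "üb"),
        ("test_3", "Fußball", "fu"),
        ("test_4", "Fußball", "fuß"),
    ]
    for name, value, prefix in cases:
        print(name, "=" if like_prefix(value, prefix) else "<>")
```

Under these assumptions all four cases match, including test_4, because both
sides of the comparison fold the sharp s in the same way.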