[ 
https://issues.apache.org/jira/browse/LUCENE-9939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacob Lauritzen updated LUCENE-9939:
------------------------------------
    Status: Open  (was: Patch Available)

> Proper ASCII folding of Danish/Norwegian characters Ø, Å
> --------------------------------------------------------
>
>                 Key: LUCENE-9939
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9939
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>            Reporter: Jacob Lauritzen
>            Priority: Minor
>              Labels: easyfix, patch, patch-available
>         Attachments: LUCENE-9939.patch
>
>
> The current version of the ASCIIFoldingFilter folds Å, å to A, a and Ø, ø to 
> O, o, which I believe is incorrect.
> Å was introduced by Norway in 1917 and by Denmark in 1948 as a replacement 
> for Aa (which the ASCIIFoldingFilter maps to aa). Aa is still used in a 
> lot of names (for example, the second-largest city in Denmark was originally 
> named Aarhus, renamed to Århus in 1948, and changed back to Aarhus in 2010 
> for internationalization purposes).
> The story of Ø is similar. It is equivalent to Œ (which is mapped to oe), not 
> ö (which is mapped to o), and is generally written as oe in ASCII text.
> The third Danish character, Æ, is already properly mapped to AE.
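
The mapping proposed in the description can be illustrated with a small standalone sketch. This is not the attached patch and does not use Lucene's ASCIIFoldingFilter; the class name DanishFoldingSketch and the method fold are hypothetical. The uppercase outputs AA and OE are an assumption that follows the existing Æ to AE convention, since the description does not spell out the casing. Only precomposed characters are handled.

```java
import java.util.Map;

// Hypothetical illustration of the proposed folding; not Lucene code.
public class DanishFoldingSketch {

    // Unicode escapes keep the source independent of the compiler's file encoding.
    private static final Map<Character, String> FOLDS = Map.of(
        '\u00C5', "AA",  // Å (assumed casing, by analogy with Æ -> AE)
        '\u00E5', "aa",  // å
        '\u00D8', "OE",  // Ø (assumed casing, by analogy with Æ -> AE)
        '\u00F8', "oe",  // ø
        '\u00C6', "AE",  // Æ (already mapped this way, per the description)
        '\u00E6', "ae"); // æ

    // Replaces each mapped character with its two-letter form and copies
    // every other character through unchanged.
    public static String fold(String input) {
        StringBuilder out = new StringBuilder(input.length() + 4);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String replacement = FOLDS.get(c);
            if (replacement != null) {
                out.append(replacement);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold("\u00C5rhus"));   // Århus
        System.out.println(fold("\u00D8resund")); // Øresund
    }
}
```

Under these assumptions, a query for "aarhus" and indexed text containing "Århus" would fold to the same letters once a lowercase filter is also applied, which is the matching behaviour the description argues for.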


