Re: Removing accents and diacritics from a word

Asmus Freytag (c) via Unicode Wed, 17 Jul 2019 17:10:04 -0700

On 7/17/2019 11:25 AM, Sławomir Osipiuk wrote:

“Transliteration”?
Maybe more generic than what you’re looking for. Used for the process of producing the “machine readable zone” on passports:
https://www.icao.int/publications/Documents/9303_p3_cons_en.pdf (see section 6, page 30)
“Accent folding” or “diacritic folding” is used in some places. String folding is “A string transform F, with the property that repeated applications of the same function F produce the same output: F(F(S)) = F(S) for all input strings S”. Accent folding is a special case of that.
https://unicode.org/reports/tr23/#StringFunctionClassificationDefinitions

https://alistapart.com/article/accent-folding-for-auto-complete/
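
As an illustration of the folding property quoted above, here is a minimal Python sketch. It assumes that "diacritic folding" can be approximated by decomposing to NFD and dropping combining marks; the function name fold_diacritics is hypothetical and is not taken from UTR #23 or the linked article. Letters that have no canonical decomposition, such as "ł", pass through unchanged, so this is not a complete transliteration scheme.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Approximate diacritic folding (hypothetical helper, not a standard API)."""
    # Split precomposed letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop every character with a non-zero canonical combining class.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever remains so the result is in a conventional form.
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    once = fold_diacritics("São Tomé")
    twice = fold_diacritics(once)
    print(once)           # Sao Tome
    print(once == twice)  # True: F(F(S)) == F(S) for this input
```

Applying the function a second time changes nothing, which is the F(F(S)) = F(S) behavior that makes it a folding in the sense of the definition above.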

Diacritic folding. Thanks. Just didn't think of the operation as folding the way it came up, but that's what it is.

A./

*From:* Unicode [mailto:[email protected]] *On Behalf Of* Asmus Freytag via Unicode
*Sent:* Wednesday, July 17, 2019 13:38
*To:* Unicode Mailing List
*Subject:* Removing accents and diacritics from a word

A question has come up in another context:
Is there any linguistic term for describing the process of removing accents and diacritics from a word to create its “base form”, e.g. São Tomé to Sao Tome?
The linguistic term "string normalization" appears not that preferable in a computing context.
Any ideas?

A./
