On 7/17/2019 11:25 AM, Sławomir Osipiuk wrote:

“Transliteration”?

Maybe more generic than what you’re looking for. It is the term used for the process of producing the “machine readable zone” on passports:

https://www.icao.int/publications/Documents/9303_p3_cons_en.pdf (see section 6, page 30)

“Accent folding” or “diacritic folding” is used in some places. String folding is “A string transform F, with the property that repeated applications of the same function F produce the same output: F(F(S)) = F(S) for all input strings S”. Accent folding is a special case of that.

https://unicode.org/reports/tr23/#StringFunctionClassificationDefinitions

https://alistapart.com/article/accent-folding-for-auto-complete/
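For illustration only, here is a minimal Python sketch of one common way to do diacritic folding: decompose to NFD, drop the combining marks, and recompose. The function name `fold_diacritics` is made up for this example, and the approach is an assumption on my part, not the method described in either link above. It only handles accents that Unicode represents as combining marks after canonical decomposition.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Illustrative diacritic folding (hypothetical helper, not a standard API)."""
    # NFD splits precomposed letters into base letter + combining marks,
    # e.g. "ã" becomes "a" followed by U+0303 COMBINING TILDE.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns the canonical combining class;
    # it is 0 for base characters and non-zero for combining marks.
    stripped = "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)
    # Recompose whatever is left so the result is in NFC.
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    once = fold_diacritics("São Tomé")
    twice = fold_diacritics(once)
    print(once)           # Sao Tome
    print(once == twice)  # True: F(F(S)) == F(S), i.e. the transform is a folding
```

Letters that have no canonical decomposition, such as "ł" or "ø", pass through unchanged with this approach. Mapping those to ASCII needs an explicit table, which is closer to what the ICAO transliteration rules do.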

Diacritic folding. Thanks. Just didn't think of the operation as folding the way it came up, but that's what it is.

A./


*From:* Unicode [mailto:[email protected]] *On Behalf Of* Asmus Freytag via Unicode
*Sent:* Wednesday, July 17, 2019 13:38
*To:* Unicode Mailing List
*Subject:* Removing accents and diacritics from a word

A question has come up in another context:

Is there any linguistic term for describing the process of removing accents and diacritics from a word to create its “base form”, e.g. São Tomé to Sao Tome?

The linguistic term "string normalization" does not seem preferable in a computing context.

Any ideas?

A./



