On 7/17/2019 11:25 AM, Sławomir Osipiuk wrote:
“Transliteration”?
Maybe more generic than what you’re looking for. Used for the process
of producing the “machine readable zone” on passports:
https://www.icao.int/publications/Documents/9303_p3_cons_en.pdf (see
section 6, page 30)
“Accent folding” or “diacritic folding” is used in some places. String
folding is “A string transform F, with the property that repeated
applications of the same function F produce the same output: F(F(S)) =
F(S) for all input strings S”. Accent folding is a special case of that.
https://unicode.org/reports/tr23/#StringFunctionClassificationDefinitions
https://alistapart.com/article/accent-folding-for-auto-complete/
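As a rough illustration of accent folding and of the F(F(S)) = F(S) property, here is a minimal Python sketch. It assumes that "folding" means decomposing to NFD and dropping nonspacing marks (general category Mn); the function name fold_diacritics is made up for this example and is not taken from TR23 or the linked article.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Hypothetical helper: strip nonspacing marks after canonical decomposition."""
    # NFD splits precomposed letters into base letter + combining marks,
    # e.g. "ã" becomes "a" followed by U+0303 COMBINING TILDE.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything except nonspacing marks (general category "Mn").
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Recompose whatever is left so the result is in NFC.
    return unicodedata.normalize("NFC", stripped)


folded = fold_diacritics("São Tomé")
print(folded)  # Sao Tome

# Folding an already folded string changes nothing: F(F(S)) == F(S).
assert fold_diacritics(folded) == folded
```

Letters that have no canonical decomposition, such as "ł" or "ø", pass through unchanged with this approach, so a practical folding would probably need an extra mapping table for them.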
Diacritic folding. Thanks. Just didn't think of the operation as folding
the way it came up, but that's what it is.
A./
*From:*Unicode [mailto:[email protected]] *On Behalf Of
*Asmus Freytag via Unicode
*Sent:* Wednesday, July 17, 2019 13:38
*To:* Unicode Mailing List
*Subject:* Removing accents and diacritics from a word
A question has come up in another context:
Is there any linguistic term for describing the process of removing
accents and diacritics from a word to create its “base form”, e.g. São
Tomé to Sao Tome?
The linguistic term "string normalization" does not seem well suited to
a computing context.
Any ideas?
A./