OK, but if I, as a German, were to search for München in a context where I only had ASCII characters available, I would type Muenchen.

On Thu, 18 Jul 2019 at 22:23, Asmus Freytag (c) <[email protected]> wrote:

> On 7/18/2019 1:08 PM, Walter Tross wrote:
>
> > Please remember that diacritics carry information.
>
> That goes without saying. The context is a situation like the one
> where you might need to allow someone to enter a word without accents
> (e.g. because they don't have the right keyboard).
>
> > In Italian, e.g., where the grave or acute accent is almost always
> > at the end of words, this information is preserved, when
> > transliterating, by removing the accent and appending an apostrophe,
> > like in però→pero' (pero would be a different word). E.g., my
> > father-in-law has Nicolo' instead of Nicolò on his credit card.
> > In German, ä, ö and ü are transliterated as ae, oe and ue. E.g., the
> > portal of München (Munich) is https://www.muenchen.de/
> > Etc.
>
> Whether to fold the umlauts using the added "e" or just the base
> letter, or to do both, would depend on the circumstance.
>
> This is not about preserving information, but about enabling
> access/search from an approximation of the full word.
>
> A./
>
> > On Thu, 18 Jul 2019 at 02:09, Asmus Freytag (c) via Unicode
> > <[email protected]> wrote:
> >
> > > On 7/17/2019 11:25 AM, Sławomir Osipiuk wrote:
> > >
> > > > “Transliteration”?
> > > >
> > > > Maybe more generic than what you’re looking for. Used for the
> > > > process of producing the “machine readable zone” on passports:
> > > >
> > > > https://www.icao.int/publications/Documents/9303_p3_cons_en.pdf
> > > > (see section 6, page 30)
> > > >
> > > > “Accent folding” or “diacritic folding” is used in some places.
> > > > String folding is “A string transform F, with the property that
> > > > repeated applications of the same function F produce the same
> > > > output: F(F(S)) = F(S) for all input strings S”. Accent folding
> > > > is a special case of that.
> > > >
> > > > https://unicode.org/reports/tr23/#StringFunctionClassificationDefinitions
> > > >
> > > > https://alistapart.com/article/accent-folding-for-auto-complete/
> > >
> > > Diacritic folding. Thanks. Just didn't think of the operation as
> > > folding the way it came up, but that's what it is.
> > >
> > > A./
> > >
> > > > From: Unicode [mailto:[email protected]] On Behalf Of
> > > > Asmus Freytag via Unicode
> > > > Sent: Wednesday, July 17, 2019 13:38
> > > > To: Unicode Mailing List
> > > > Subject: Removing accents and diacritics from a word
> > > >
> > > > A question has come up in another context:
> > > >
> > > > Is there any linguistic term for describing the process of
> > > > removing accents and diacritics from a word to create its “base
> > > > form”, e.g. São Tomé to Sao Tome?
> > > >
> > > > The linguistic term "string normalization" appears not that
> > > > preferable in a computing context.
> > > >
> > > > Any ideas?
> > > >
> > > > A./
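
For anyone who wants to try the variants discussed in this thread, here is a rough Python sketch, not a definitive implementation. It uses only the standard `unicodedata` and `re` modules. The function names (`fold_diacritics`, `fold_german`, `fold_italian`, `search_keys`) are made up for illustration, and the German and Italian rules cover only the cases mentioned above (ä/ö/ü to ae/oe/ue, and a word-final accented vowel to vowel plus apostrophe). Real locale rules are certainly broader.

```python
import re
import unicodedata

# Assumption: only the three umlauts named in the thread are expanded.
GERMAN_UMLAUTS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
})


def fold_diacritics(text):
    """Plain folding: decompose (NFD), then drop all combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_german(text):
    """Expand umlauts with an added 'e', then fold whatever is left."""
    # Compose first so a decomposed "u" + U+0308 is seen as "ü".
    composed = unicodedata.normalize("NFC", text)
    return fold_diacritics(composed.translate(GERMAN_UMLAUTS))


def fold_italian(text):
    """Replace a word-final accented vowel by the vowel plus an apostrophe."""
    composed = unicodedata.normalize("NFC", text)
    marked = re.sub(
        r"[àèéìòù]\b",
        lambda m: fold_diacritics(m.group(0)) + "'",
        composed,
        flags=re.IGNORECASE,
    )
    return fold_diacritics(marked)


def search_keys(word):
    """'Doing both': index a word under the base-letter and added-'e' forms."""
    return {fold_diacritics(word), fold_german(word)}


print(fold_diacritics("São Tomé"))  # Sao Tome
print(fold_german("München"))       # Muenchen
print(fold_italian("però"))         # pero'
print(sorted(search_keys("München")))  # ['Muenchen', 'Munchen']
```

Each of these functions is a folding in the sense quoted above: applying it a second time changes nothing, because the output no longer contains any of the characters the function rewrites. Which one to use, or whether to index under several keys as `search_keys` does, depends on the circumstance.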

