Vincent Lefevre <vinc...@vinc17.net> writes:

> "perldoc perlfaq4" gives in UTF-8 locales
> [...]
> The trick to this problem is avoiding accidental autovivification. If
> you want to check three keys deep, you might na<EF>vely try this:
>
> where <EF> is actually the EF byte as shown by the "less" pager.

This is an interesting bug. I'm going to have to dig in a bit to figure
out what's going on here.

The POD source has naE<0xEF>vely, and for some reason the output is
ISO 8859-1 instead of UTF-8. The underlying formatting module is
Pod::Text, and it defaults to using the same output character set as the
input character set, which in this case is not specified. I think there
may be an old default in play.

Pod::Man breaks here in a different way because it interprets the
diaeresis as a German umlaut and assumes you can just stick an e after
the vowel if you don't have umlauts available. My understanding is that
this German umlaut conversion is only correct for ä, ö, and ü, not for ï
(which I don't believe is a character in German, at least from some quick
searching). I think this may be a very long-standing bug, although
there's a deeper problem: one cannot assume German umlaut rules at all.
The right conversion depends very much on the source language.

> This should be encoded in UTF-8. However, this is a spelling mistake:
> contrary to French, there is no ï in English (at least, my dictionaries
> cannot find such a variant): naively.

naïvely (and naïve) are correct alternate spellings in English. English
historically uses a diaeresis to indicate that two adjacent vowels form
separate syllables rather than a diphthong. This is one of the few
"native" accent marks in the English language, which otherwise only uses
accent marks in loan words and tends to drop them.

It's common in modern English writing to drop the diaeresis, in part
because US English keyboards tend to make typing it difficult, so both
usages are now accepted, but there is a school of thought that the
version with the diaeresis is more correct.
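
As a byte-level illustration of the encoding symptom discussed above,
here is a minimal Python sketch. It is not taken from Pod::Text and says
nothing about how that module works internally; the variable names are
chosen here purely for illustration. It only shows why a lone EF byte
points to ISO 8859-1 output rather than UTF-8:

```python
# Illustration only: how "ï" (U+00EF) is encoded in the two character sets.
word = "na\u00efvely"

latin1_bytes = word.encode("iso-8859-1")  # one byte for ï: 0xEF
utf8_bytes = word.encode("utf-8")         # two bytes for ï: 0xC3 0xAF

print(latin1_bytes)  # b'na\xefvely'
print(utf8_bytes)    # b'na\xc3\xafvely'

# In UTF-8, 0xEF starts a three-byte sequence and must be followed by two
# continuation bytes, so the ISO 8859-1 output is not valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(valid_utf8)  # False
```

A pager running in a UTF-8 locale cannot display that lone byte as a
character, which would be consistent with "less" showing it as <EF>.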

The New Yorker famously insists on diaereses in its house style, even
going so far as to use coöperate when every other publication has
switched to cooperate:

    https://www.merriam-webster.com/words-at-play/mary-norris-diaeresis

The other place you'll sometimes see diaereses in English is with proper
names such as Chloë or Zoë.

-- 
Russ Allbery (r...@debian.org)              <https://www.eyrie.org/~eagle/>