Lapo Luchini wrote:
> Zack Weinberg wrote:
>> The //IGNORE and //TRANSLIT features are glibc / GNU libiconv
>> specific, but I would have thought that they were available in recent
>> Gentoo (they've been around since 2001 give or take).
>
>> Many systems have an iconv(1) command line utility that may be helpful
>> here.
>
> Uh, right, but writing a "known good UTF-8 string" escaped on the
> command line seems a bit trickier to me... no, not really.
>
> % echo "\xC2\xB7" | iconv -f UTF-8 -t CP1252//IGNORE//TRANSLIT
> · (that is, the correct and converted U+00B7 MIDDLE DOT)
> % echo "\xC2\xB7" | iconv -f UTF-8 -t ASCII//IGNORE//TRANSLIT
> .
> % echo "\xC3\x80" | iconv -f UTF-8 -t CP1252//IGNORE//TRANSLIT
> À (that is, correct U+00C0 LATIN CAPITAL LETTER A WITH GRAVE)
> % echo "\xC3\x80" | iconv -f UTF-8 -t ASCII//IGNORE//TRANSLIT
> `A
>
> Derek (or anyone else with Gentoo), what do you get with these?

OK, I managed to reproduce it here at work on a Fedora box; its iconv is
really braindead:

% echo "\xC3\x80" | iconv -f UTF-8 -t ASCII//IGNORE//TRANSLIT
iconv: illegal input sequence at position 3
% echo "\xC3\x80" | iconv -f UTF-8 -t ASCII//IGNORE
iconv: illegal input sequence at position 3
% echo "\xC3\x80" | iconv -f UTF-8 -t ASCII//TRANSLIT
?

So the "solution" on those hosts would be to use only //TRANSLIT, but
that is a partial solution anyway, as not everything can be
transliterated. E.g. the Japanese katakana "po" (U+30DD):

on FreeBSD, with libiconv 1.9.2:
% echo "\xE3\x83\x9D" | iconv -f UTF-8 -t ASCII//IGNORE//TRANSLIT
% echo "\xE3\x83\x9D" | iconv -f UTF-8 -t ASCII//IGNORE
% echo "\xE3\x83\x9D" | iconv -f UTF-8 -t ASCII//TRANSLIT
iconv: (stdin): cannot convert

on Fedora, with iconv bundled inside libc:
% echo "\xE3\x83\x9D" | iconv -f UTF-8 -t ASCII//IGNORE//TRANSLIT
iconv: illegal input sequence at position 4
% echo "\xE3\x83\x9D" | iconv -f UTF-8 -t ASCII//IGNORE
iconv: illegal input sequence at position 4
% echo "\xE3\x83\x9D" | iconv -f UTF-8 -t ASCII//TRANSLIT
?

There isn't any form that does something useful on both. =(

I'll probably take a better look at the problem this evening.

_______________________________________________
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel
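
For anyone who wants to run the same comparison on their own box, here is a rough sketch of a probe, not a tested tool. It assumes a POSIX sh with printf, od and tr available; the function name probe_iconv and its calling convention (input bytes given as printf octal escapes) are made up for this example. It uses printf with octal escapes rather than echo "\x..", because only some shells' echo interprets backslash escapes, and it prints the output as hex so that the terminal encoding doesn't get in the way.

```shell
#!/bin/sh
# probe_iconv BYTES CHARSET  (hypothetical helper, for illustration only)
#   BYTES:   input as printf octal escapes, e.g. '\303\200' for U+00C0.
#            It is used as a printf format, so avoid '%' in it.
#   CHARSET: target charset name, e.g. ASCII or CP1252.
# Prints one line per suffix: the suffix, iconv's exit status, and the
# converted output as hex (empty if nothing was written to stdout).
probe_iconv() {
    for probe_suffix in '//IGNORE//TRANSLIT' '//IGNORE' '//TRANSLIT'; do
        # The exit status of the substitution is that of iconv, the last
        # command in the pipeline; error messages are discarded.
        probe_raw=$(printf "$1" | iconv -f UTF-8 -t "$2$probe_suffix" 2>/dev/null) \
            && probe_status=0 || probe_status=$?
        # Dump the converted bytes as hex, stripping od's spacing.
        probe_hex=$(printf '%s' "$probe_raw" | od -An -tx1 | tr -d ' \n')
        printf '%-20s status=%s out=%s\n' "$probe_suffix" "$probe_status" "$probe_hex"
    done
}
```

Calling it as probe_iconv '\303\200' ASCII and probe_iconv '\343\203\235' ASCII should cover the two cases above (U+00C0 and U+30DD). Note that, unlike echo, printf does not append a newline, so any positions reported by iconv may differ from the ones quoted in this thread.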