Re: g_locale_from_utf8 and minus sign
On Mon, 29 Dec 2008, Behdad Esfahbod wrote:

> Matthias Clasen wrote:
> > On Mon, Dec 29, 2008 at 2:57 PM, Allin Cottrell wrote:
> >
> >> I judged it a devel issue because it raised a question about
> >> whether glib was doing the right thing in not converting U+2212 to
> >> "the nearest" character in ISO-8859-1, 0x2D. However, I accept an
> >> offlist response from Dom Lachowicz, namely that these are
> >> different characters and so glib is right not to convert.
> >
> > This is really a question about iconv behaviour, since glib doesn't do
> > its own conversion. And I guess when you ask the iconv developers about
> > this, they will tell you that iconv is not about guessing the 'nearest'
> > character, but rather about recoding characters from one coded
> > character set to another. A hyphen is not the same character as a
> > minus, thus iconv won't recode the latter to the former, even if they
> > look similar on paper. I agree that it would be more useful if iconv
> > _would_ do what you expected it to do...
>
> The glibc implementation of iconv actually does that if you nicely ask it to:
>
> $ echo − | iconv -f utf8 -t latin1
> iconv: illegal input sequence at position 0
> $ echo − | iconv -f utf8 -t latin1//translit
> -
>
> Should glib try the //translit version first? I think so. That's what I made
> vte do, and filed this bug about it:
>
> http://bugzilla.gnome.org/show_bug.cgi?id=502951

Sounds like a good idea. But I notice there has been no progress on
that bug for over a year. Might it do better as a feature request,
if there is such a thing? (Simply on the grounds that calling for
the default behavior of iconv is unlikely to be considered a bug,
even if one could do better with a different call.)

Allin Cottrell
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: g_locale_from_utf8 and minus sign
Matthias Clasen wrote:
> On Mon, Dec 29, 2008 at 2:57 PM, Allin Cottrell wrote:
>
>> I judged it a devel issue because it raised a question about
>> whether glib was doing the right thing in not converting U+2212 to
>> "the nearest" character in ISO-8859-1, 0x2D. However, I accept an
>> offlist response from Dom Lachowicz, namely that these are
>> different characters and so glib is right not to convert.
>
> This is really a question about iconv behaviour, since glib doesn't do
> its own conversion. And I guess when you ask the iconv developers about
> this, they will tell you that iconv is not about guessing the 'nearest'
> character, but rather about recoding characters from one coded
> character set to another. A hyphen is not the same character as a
> minus, thus iconv won't recode the latter to the former, even if they
> look similar on paper. I agree that it would be more useful if iconv
> _would_ do what you expected it to do...

The glibc implementation of iconv actually does that if you nicely ask it to:

$ echo − | iconv -f utf8 -t latin1
iconv: illegal input sequence at position 0
$ echo − | iconv -f utf8 -t latin1//translit
-

Should glib try the //translit version first? I think so. That's what I made
vte do, and filed this bug about it:

http://bugzilla.gnome.org/show_bug.cgi?id=502951

behdad

> Matthias
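The "try //translit first" idea in the message above can be sketched in C directly against iconv(3). This is only an illustration under stated assumptions, not the code glib or vte actually uses: the helper name utf8_to_latin1_translit is hypothetical, the target charset is hard-coded to ISO-8859-1, and the "//TRANSLIT" suffix is an extension of the glibc (and GNU libiconv) iconv that other implementations may reject or ignore.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of glib): convert a NUL-terminated UTF-8
 * string to ISO-8859-1, asking iconv for transliteration first.
 * Returns 0 on success, -1 on failure (errno as left by iconv). */
int utf8_to_latin1_translit(const char *in, char *out, size_t outsz)
{
    if (outsz == 0)
        return -1;

    /* Assumption: "//TRANSLIT" is understood by the local iconv. */
    iconv_t cd = iconv_open("ISO-8859-1//TRANSLIT", "UTF-8");
    if (cd == (iconv_t) -1) {
        /* Suffix not accepted: fall back to a strict conversion. */
        cd = iconv_open("ISO-8859-1", "UTF-8");
        if (cd == (iconv_t) -1)
            return -1;
    }

    char *inp = (char *) in;      /* iconv takes char **, but does not modify the input */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsz - 1;   /* reserve one byte for the terminator */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    int saved_errno = errno;
    iconv_close(cd);

    if (rc == (size_t) -1) {      /* e.g. EILSEQ or E2BIG */
        errno = saved_errno;
        return -1;
    }
    *outp = '\0';
    return 0;
}
```

With a transliterating iconv, a string starting with U+2212 should come back starting with 0x2D, matching the second shell command above; with a strict iconv the call fails, matching the first.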
Re: g_locale_from_utf8 and minus sign
On Mon, Dec 29, 2008 at 2:57 PM, Allin Cottrell wrote:
>
> I judged it a devel issue because it raised a question about
> whether glib was doing the right thing in not converting U+2212 to
> "the nearest" character in ISO-8859-1, 0x2D. However, I accept an
> offlist response from Dom Lachowicz, namely that these are
> different characters and so glib is right not to convert.

This is really a question about iconv behaviour, since glib doesn't do
its own conversion. And I guess when you ask the iconv developers about
this, they will tell you that iconv is not about guessing the 'nearest'
character, but rather about recoding characters from one coded
character set to another. A hyphen is not the same character as a
minus, thus iconv won't recode the latter to the former, even if they
look similar on paper. I agree that it would be more useful if iconv
_would_ do what you expected it to do...

Matthias
Re: g_locale_from_utf8 and minus sign
On Mon, 29 Dec 2008, Bastien Nocera wrote:

> On Sun, 2008-12-28 at 16:17 -0500, Allin Cottrell wrote:
> > When my app displays numerical output, I've been using a "real"
> > minus sign (U+2212) if the current font supports this (as checked
> > by pango), since it looks better than the usual hyphen-as-minus.
> >
> > The minus sign displays correctly within GTK, but I've noticed
> > that if the app is running on, e.g. an ISO-8859-1 platform, so
> > that output has to be recoded for saving to disk or copying to the
> > clipboard, g_locale_from_utf8 chokes on the minus sign, giving
> >
> > "Invalid byte sequence in conversion input"
> >
> > I'm not sure if this is exactly a bug, but shouldn't U+2212 get
> > successfully mapped onto character 45 (0x002D) in ISO-8859-1?
>
> You should use g_filename_from_utf8() to convert to filenames.

I'm not converting to filenames, I'm converting text output.

> This is an application development question though, and you should use
> the gtk-app-devel-list.

I judged it a devel issue because it raised a question about
whether glib was doing the right thing in not converting U+2212 to
"the nearest" character in ISO-8859-1, 0x2D. However, I accept an
offlist response from Dom Lachowicz, namely that these are
different characters and so glib is right not to convert.

Allin Cottrell
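If U+2212 and 0x2D are treated as different characters, an application that still wants the ASCII fallback has to make the substitution itself before converting. The following C sketch shows one way that might look; the helper name minus_to_hyphen is hypothetical and nothing like it is proposed in the thread. It relies only on the fact that U+2212 is encoded in UTF-8 as the three bytes E2 88 92.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy UTF-8 text from `in` to `out`, replacing each
 * U+2212 MINUS SIGN (bytes E2 88 92) with U+002D HYPHEN-MINUS.
 * Returns the number of bytes written (excluding the terminator),
 * or (size_t) -1 if `out` is too small. */
size_t minus_to_hyphen(const char *in, char *out, size_t outsz)
{
    static const char minus[] = "\xe2\x88\x92";
    size_t n = 0;

    while (*in != '\0') {
        if (n + 1 >= outsz)          /* need room for this byte and the NUL */
            return (size_t) -1;
        if (strncmp(in, minus, 3) == 0) {
            out[n++] = '-';          /* 3-byte minus sign becomes 1 byte */
            in += 3;
        } else {
            out[n++] = *in++;        /* all other bytes are copied unchanged */
        }
    }
    if (n >= outsz)
        return (size_t) -1;
    out[n] = '\0';
    return n;
}
```

The output is still valid UTF-8, so it could then be handed to g_locale_from_utf8 as before; only the display string would keep the typographic minus sign.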
Re: g_locale_from_utf8 and minus sign
On Sun, 2008-12-28 at 16:17 -0500, Allin Cottrell wrote:
> When my app displays numerical output, I've been using a "real"
> minus sign (U+2212) if the current font supports this (as checked
> by pango), since it looks better than the usual hyphen-as-minus.
>
> The minus sign displays correctly within GTK, but I've noticed
> that if the app is running on, e.g. an ISO-8859-1 platform, so
> that output has to be recoded for saving to disk or copying to the
> clipboard, g_locale_from_utf8 chokes on the minus sign, giving
>
> "Invalid byte sequence in conversion input"
>
> I'm not sure if this is exactly a bug, but shouldn't U+2212 get
> successfully mapped onto character 45 (0x002D) in ISO-8859-1?

You should use g_filename_from_utf8() to convert to filenames.

This is an application development question though, and you should use
the gtk-app-devel-list.

Cheers
g_locale_from_utf8 and minus sign
When my app displays numerical output, I've been using a "real"
minus sign (U+2212) if the current font supports this (as checked
by pango), since it looks better than the usual hyphen-as-minus.

The minus sign displays correctly within GTK, but I've noticed
that if the app is running on, e.g. an ISO-8859-1 platform, so
that output has to be recoded for saving to disk or copying to the
clipboard, g_locale_from_utf8 chokes on the minus sign, giving

"Invalid byte sequence in conversion input"

I'm not sure if this is exactly a bug, but shouldn't U+2212 get
successfully mapped onto character 45 (0x002D) in ISO-8859-1?

--
Allin Cottrell
Department of Economics
Wake Forest University