Re: g_locale_from_utf8 and minus sign

2008-12-29 Thread Allin Cottrell
On Mon, 29 Dec 2008, Behdad Esfahbod wrote:

> Matthias Clasen wrote:
> > On Mon, Dec 29, 2008 at 2:57 PM, Allin Cottrell wrote:
> >
> >> I judged it a devel issue because it raised a question about
> >> whether glib was doing the right thing in not converting U+2212 to
> >> "the nearest" character in ISO-8859-1, 0x2D.  However, I accept an
> >> offlist response from Dom Lachowicz, namely that these are
> >> different characters and so glib is right not to convert.
> >
> > This is really a question about iconv behaviour, since glib doesn't do
> > its own conversion. And I guess when you ask the iconv developers about
> > this, they will tell you that iconv is not about guessing the 'nearest'
> > character, but rather about recoding characters from one coded character
> > set to another. A hyphen is not the same character as a minus, thus iconv
> > won't recode the latter to the former, even if they look similar on paper.
> > I agree that it would be more useful if iconv _would_ do what you expected
> > it to do...
>
> The glibc implementation of iconv actually does that if you nicely ask it to:
>
> $ echo − | iconv -f utf8 -t latin1
> iconv: illegal input sequence at position 0
> $ echo − | iconv -f utf8 -t latin1//translit
> -
>
> Should glib try the //translit version first?  I think so.  That's what I made
> vte do, and filed this bug about it:
>
>   http://bugzilla.gnome.org/show_bug.cgi?id=502951

Sounds like a good idea.  But I notice there has been no progress
on that bug for over a year.  Might it do better as a feature
request, if there is such a thing?  (Simply on the grounds that
relying on the default behavior of iconv is unlikely to be
considered a bug, even if one could do better with a different
call.)

Allin Cottrell

___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


Re: g_locale_from_utf8 and minus sign

2008-12-29 Thread Behdad Esfahbod
Matthias Clasen wrote:
> On Mon, Dec 29, 2008 at 2:57 PM, Allin Cottrell wrote:
> 
>> I judged it a devel issue because it raised a question about
>> whether glib was doing the right thing in not converting U+2212 to
>> "the nearest" character in ISO-8859-1, 0x2D.  However, I accept an
>> offlist response from Dom Lachowicz, namely that these are
>> different characters and so glib is right not to convert.
> 
> This is really a question about iconv behaviour, since glib doesn't do
> its own conversion. And I guess when you ask the iconv developers about
> this, they will tell you that iconv is not about guessing the 'nearest'
> character, but rather about recoding characters from one coded character
> set to another. A hyphen is not the same character as a minus, thus iconv
> won't recode the latter to the former, even if they look similar on paper.
> I agree that it would be more useful if iconv _would_ do what you expected
> it to do...

The glibc implementation of iconv actually does that if you nicely ask it to:

$ echo − | iconv -f utf8 -t latin1
iconv: illegal input sequence at position 0
$ echo − | iconv -f utf8 -t latin1//translit
-

Should glib try the //translit version first?  I think so.  That's what I made
vte do, and filed this bug about it:

  http://bugzilla.gnome.org/show_bug.cgi?id=502951
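
The "transliterate what strict conversion cannot map" idea can be
sketched outside glib.  What follows is a rough Python illustration,
not glib or iconv code: the function name encode_with_translit and
the one-entry TRANSLIT_TABLE are hypothetical, and a real //translit
implementation knows far more substitutions than this.

```python
# Hypothetical substitution table: only the case discussed in this thread.
TRANSLIT_TABLE = {"\u2212": "-"}  # U+2212 MINUS SIGN -> U+002D HYPHEN-MINUS


def encode_with_translit(text, table=TRANSLIT_TABLE):
    """Encode to ISO-8859-1, substituting only characters that fail strictly."""
    out = []
    for ch in text:
        try:
            # Strict conversion first: characters with an exact mapping pass through.
            out.append(ch.encode("iso-8859-1"))
        except UnicodeEncodeError:
            replacement = table.get(ch)
            if replacement is None:
                # No known substitute: keep reporting the error, as before.
                raise
            out.append(replacement.encode("iso-8859-1"))
    return b"".join(out)


print(encode_with_translit("\u2212" + "1.5"))
```

Characters that already have an exact ISO-8859-1 mapping are left
alone, so the fallback only changes behavior in cases that would
otherwise have been errors.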

behdad

> Matthias



Re: g_locale_from_utf8 and minus sign

2008-12-29 Thread Matthias Clasen
On Mon, Dec 29, 2008 at 2:57 PM, Allin Cottrell wrote:

>
> I judged it a devel issue because it raised a question about
> whether glib was doing the right thing in not converting U+2212 to
> "the nearest" character in ISO-8859-1, 0x2D.  However, I accept an
> offlist response from Dom Lachowicz, namely that these are
> different characters and so glib is right not to convert.

This is really a question about iconv behaviour, since glib doesn't do
its own conversion. And I guess when you ask the iconv developers about
this, they will tell you that iconv is not about guessing the 'nearest'
character, but rather about recoding characters from one coded character
set to another. A hyphen is not the same character as a minus, thus iconv
won't recode the latter to the former, even if they look similar on paper.
I agree that it would be more useful if iconv _would_ do what you expected
it to do...
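
That the two are distinct coded characters can be checked against the
Unicode character database.  The snippet below is only an illustration
using Python's standard unicodedata module (not anything iconv does
internally); it prints the official names and tests whether NFKC
compatibility normalization folds U+2212 into U+002D.

```python
import unicodedata

# Print the code point and official Unicode name of each character.
for ch in ("\u2212", "\u002d"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Check whether compatibility normalization turns the minus sign
# into the ASCII hyphen-minus.
same_after_nfkc = unicodedata.normalize("NFKC", "\u2212") == "\u002d"
print(same_after_nfkc)
```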

Matthias


Re: g_locale_from_utf8 and minus sign

2008-12-29 Thread Allin Cottrell
On Mon, 29 Dec 2008, Bastien Nocera wrote:

> On Sun, 2008-12-28 at 16:17 -0500, Allin Cottrell wrote:
> > When my app displays numerical output, I've been using a "real"
> > minus sign (U+2212) if the current font supports this (as checked
> > by pango), since it looks better than the usual hyphen-as-minus.
> >
> > The minus sign displays correctly within GTK, but I've noticed
> > that if the app is running on, e.g. an ISO-8859-1 platform, so
> > that output has to be recoded for saving to disk or copying to the
> > clipboard, g_locale_from_utf8 chokes on the minus sign, giving
> >
> > "Invalid byte sequence in conversion input"
> >
> > I'm not sure if this is exactly a bug, but shouldn't U+2212 get
> > successfully mapped onto character 45 (0x002D) in ISO-8859-1?
>
> You should use g_filename_from_utf8() to convert to filenames.

I'm not converting to filenames, I'm converting text output.

> This is an application development question though, and you should use
> the gtk-app-devel-list.

I judged it a devel issue because it raised a question about
whether glib was doing the right thing in not converting U+2212 to
"the nearest" character in ISO-8859-1, 0x2D.  However, I accept an
offlist response from Dom Lachowicz, namely that these are
different characters and so glib is right not to convert.

Allin Cottrell


Re: g_locale_from_utf8 and minus sign

2008-12-29 Thread Bastien Nocera
On Sun, 2008-12-28 at 16:17 -0500, Allin Cottrell wrote:
> When my app displays numerical output, I've been using a "real"
> minus sign (U+2212) if the current font supports this (as checked
> by pango), since it looks better than the usual hyphen-as-minus.
> 
> The minus sign displays correctly within GTK, but I've noticed
> that if the app is running on, e.g. an ISO-8859-1 platform, so
> that output has to be recoded for saving to disk or copying to the
> clipboard, g_locale_from_utf8 chokes on the minus sign, giving
> 
> "Invalid byte sequence in conversion input"
> 
> I'm not sure if this is exactly a bug, but shouldn't U+2212 get
> successfully mapped onto character 45 (0x002D) in ISO-8859-1?

You should use g_filename_from_utf8() to convert to filenames.

This is an application development question though, and you should use
the gtk-app-devel-list.

Cheers



g_locale_from_utf8 and minus sign

2008-12-28 Thread Allin Cottrell
When my app displays numerical output, I've been using a "real"
minus sign (U+2212) if the current font supports this (as checked
by pango), since it looks better than the usual hyphen-as-minus.

The minus sign displays correctly within GTK, but I've noticed
that if the app is running on, e.g. an ISO-8859-1 platform, so
that output has to be recoded for saving to disk or copying to the
clipboard, g_locale_from_utf8 chokes on the minus sign, giving

"Invalid byte sequence in conversion input"

I'm not sure if this is exactly a bug, but shouldn't U+2212 get
successfully mapped onto character 45 (0x002D) in ISO-8859-1?
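
For anyone who wants to see the failure without a GLib build, here is
a small stand-in.  It uses Python's built-in codecs rather than
g_locale_from_utf8 (so the error text differs, and the helper name
strict_to_latin1 is made up for this sketch), but it shows the same
thing: a strict conversion to ISO-8859-1 accepts U+002D and rejects
U+2212.

```python
# Stand-in for a strict UTF-8 -> ISO-8859-1 conversion; not GLib itself.
MINUS_SIGN = "\u2212"    # U+2212 MINUS SIGN
HYPHEN_MINUS = "\u002d"  # U+002D HYPHEN-MINUS, byte 0x2D in ISO-8859-1


def strict_to_latin1(text):
    """Return ISO-8859-1 bytes, or None if any character has no mapping."""
    try:
        return text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return None


print(strict_to_latin1(HYPHEN_MINUS + "1.5"))  # convertible
print(strict_to_latin1(MINUS_SIGN + "1.5"))    # not convertible
```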

-- 
Allin Cottrell
Department of Economics
Wake Forest University
