Bug#847051: Question and possibly bugs about string encoding in gtk-perl

2016-12-20 Thread Dominique Dumont
On Tuesday, 20 December 2016 16:38:52 CET Boyuan Yang wrote:
> This version is in Debian unstable now.
> 
> I tested with several non-UTF-8 locales (zh_CN.GB18030, es_SV aka iso-8859-1, 
> zh_TW.BIG5) and a UTF-8 locale. None of them crashed, so at least this NMU 
> does not do too much damage. We may need some more fixes, though.

As mentioned by Torsten, the name returned by get_name is encoded either in 
UTF-8 or in x11-compound-text. This does not depend on the locale.

I don't know which context may lead get_name to return x11-compound-text, 
though. I can only guess that it depends on the window manager.

Anyway, I'd rather not leave the patch as-is in Debian, so I'll create a new 
nmu (*) package that implements Torsten's suggestion.

All the best

(*) note to perl-gtk people: "nmu" is Debian jargon, don't worry about it.
-- 
 https://github.com/dod38fr/   -o- http://search.cpan.org/~ddumont/
http://ddumont.wordpress.com/  -o-   irc: dod at irc.debian.org



Bug#847051: Question and possibly bugs about string encoding in gtk-perl

2016-12-20 Thread Boyuan Yang
On Thursday, 15 December 2016 20:54:38 CST Dominique Dumont wrote:
> And I'm worried that shutter may crash if used in a non-utf8 environment.

This version is in Debian unstable now.

I tested with several non-UTF-8 locales (zh_CN.GB18030, es_SV aka iso-8859-1, 
zh_TW.BIG5) and a UTF-8 locale. None of them crashed, so at least this NMU 
does not do too much damage. We may need some more fixes, though.

--
Sincerely,
Boyuan Yang



Bug#847051: Question and possibly bugs about string encoding in gtk-perl

2016-12-18 Thread Dominique Dumont
On Friday, 16 December 2016 21:34:55 CET Torsten Schönfeld wrote:
> So it seems like the safest bet would be to try to decode the window 
> name from UTF8, and if that fails, try Encode::X11 and its 
> 'x11-compound-text' (Hi Kevin!).

OK, that makes sense.

For the record, here's what I'm going to use:

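use Encode qw/decode/;
use Encode::X11;    # registers the 'x11-compound-text' encoding

my $raw_name = $win->get_name;
my $name;
# FB_CROAK makes decode die on malformed UTF-8; LEAVE_SRC keeps $raw_name intact
eval { $name = decode( 'UTF-8' , $raw_name, Encode::FB_CROAK | Encode::LEAVE_SRC ); };
$name = decode( 'x11-compound-text', $raw_name ) if $@;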
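
For anyone who wants to try this fallback without a running X session, the same logic can be sketched as a standalone helper. This is only an illustration: decode_window_name is a hypothetical name, the byte strings stand in for what get_name might return, and because Encode::X11 is not a core module the sketch falls back to ISO-8859-1 when it is missing, purely so that the example still runs.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Encode::X11 registers 'x11-compound-text'. It is not a core module,
# so load it lazily and remember whether it is available.
my $have_x11 = eval { require Encode::X11; 1 } ? 1 : 0;

# Hypothetical helper: strict UTF-8 first, then a fallback encoding.
sub decode_window_name {
    my ($raw_name) = @_;

    # FB_CROAK makes decode() die on malformed UTF-8; LEAVE_SRC keeps
    # $raw_name untouched so it can be decoded again below.
    my $name = eval {
        decode( 'UTF-8', $raw_name,
            Encode::FB_CROAK() | Encode::LEAVE_SRC() );
    };
    return $name if defined $name;

    # Assumption for this sketch only: ISO-8859-1 when Encode::X11 is missing.
    my $fallback = $have_x11 ? 'x11-compound-text' : 'iso-8859-1';
    return decode( $fallback, $raw_name );
}

my $from_utf8  = decode_window_name("caf\xC3\xA9");    # valid UTF-8 octets
my $from_other = decode_window_name("caf\xE9");        # not valid UTF-8
```

The first call takes the UTF-8 path, the second one fails the strict check and goes through the fallback. In shutter the input would be $win->get_name instead of a literal.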

Many thanks for the help :-)

All the best



Bug#847051: Question and possibly bugs about string encoding in gtk-perl

2016-12-16 Thread Torsten Schönfeld

On 16.12.2016 20:02, Dominique Dumont wrote:
> On Wednesday, 14 December 2016 10:54:16 CET Boyuan Yang wrote:
> > The original messy output, as shown in the screenshot in the Ubuntu bug,
> > looks like latin-1-encoded binary data being treated as UTF-8-encoded data
> > and displayed anyway.
>
> In more detail, the problematic code boils down to:
>
>   my $wnck_screen = Gnome2::Wnck::Screen->get_default;
>   my ($win) = $wnck_screen->get_windows_stacked; # returns a list of windows; take one
>   my $name = $win->get_name;
>   my $window_item = Gtk2::ImageMenuItem->new_with_label( $name );
>
> $name apparently contains the window name as raw octets instead of a decoded
> UTF-8 string. As a consequence, the list of windows shown by shutter contains
> mojibake.


Yes, Gnome2::Wnck does not do any decoding at all in get_name. This is 
probably by accident, not by intention. If you look at what 
wnck_window_get_name boils down to [link missing], you see that it returns 
UTF8-encoded strings in most cases, but sometimes also strings in X11's 
compound text encoding [link missing].


So it seems like the safest bet would be to try to decode the window 
name from UTF8, and if that fails, try Encode::X11 and its 
'x11-compound-text' (Hi Kevin!).



> > The hacky patch proposed (by me) is using
> > Encode::_utf8_on() to turn on the internal flag for string and mark it as
> > UTF-8.
>
> And I'm worried that shutter may crash if used in a non-utf8 environment.
>
> After some experimentation, I've come up with a safer solution:
>
>   use 5.12.0;
>   use Encode::Locale;
>   use Encode qw/decode/;
>   # ...
>   my $name = decode( 'locale' , $win->get_name);
>
> This works in a UTF-8 locale and is safer than turning the utf8 flag on.


It seems like this would fail in a non-UTF8 locale, however, as 
gnome_wnck_window_get_name seems to always return UTF8 strings, 
regardless of locale.
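
A small sketch of that failure mode, using only core Encode. Encode::Locale is deliberately not used here: 'iso-8859-1' is an assumed stand-in for what the 'locale' alias would resolve to under es_SV, and the sample title is made up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Made-up window title: "café" followed by two CJK characters.
my $title = "caf\x{E9} \x{4E2D}\x{6587}";

# The octets a UTF-8 producing library would hand back for that title.
my $octets = encode( 'UTF-8', $title );

# Decoding with the encoding the data is really in round-trips.
my $as_utf8 = decode( 'UTF-8', $octets );

# Decoding with the (assumed) locale encoding does not fail, it silently
# turns every octet into one Latin-1 character, which is mojibake.
my $as_latin1 = decode( 'iso-8859-1', $octets );

my $utf8_ok   = $as_utf8   eq $title ? 1 : 0;
my $latin1_ok = $as_latin1 eq $title ? 1 : 0;
```

Only the UTF-8 decode gives the original title back. The Latin-1 result is longer than the title because each octet of a multi-byte sequence becomes its own character.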




Bug#847051: Question and possibly bugs about string encoding in gtk-perl

2016-12-16 Thread Dominique Dumont
[ 2nd try ]

On Wednesday, 14 December 2016 10:54:16 CET Boyuan Yang wrote:
> The original messy output, as shown in the screenshot in the Ubuntu bug,
> looks like latin-1-encoded binary data being treated as UTF-8-encoded data
> and displayed anyway.

In more detail, the problematic code boils down to:

 my $wnck_screen = Gnome2::Wnck::Screen->get_default;
 my ($win) = $wnck_screen->get_windows_stacked; # returns a list of windows; take one
 my $name = $win->get_name;
 my $window_item = Gtk2::ImageMenuItem->new_with_label( $name );

$name apparently contains the window name as raw octets instead of a decoded 
UTF-8 string. As a consequence, the list of windows shown by shutter contains 
mojibake.

> The hacky patch proposed (by me) is using
> Encode::_utf8_on() to turn on the internal flag for string and mark it as
> UTF-8.

And I'm worried that shutter may crash if used in a non-utf8 environment.
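
To make that worry concrete, here is a minimal sketch of what the flag hack does when the octets are not UTF-8. The sample bytes are made up (a Latin-1 encoded name), and the variable names are only for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Octets that are NOT valid UTF-8: a lone 0xE9 (Latin-1 e-acute).
my $octets = "caf\xE9";

# The hack: flip the internal flag without looking at the bytes.
my $flagged = $octets;
Encode::_utf8_on($flagged);

# Perl now treats $flagged as a character string, but its buffer is malformed.
my $flag_is_set    = Encode::is_utf8($flagged)      ? 1 : 0;
my $is_well_formed = Encode::is_utf8( $flagged, 1 ) ? 1 : 0;

# A checked decode refuses the same input instead of hiding the problem.
my $decoded = eval {
    decode( 'UTF-8', $octets, Encode::FB_CROAK() | Encode::LEAVE_SRC() );
};
my $decode_failed = defined $decoded ? 0 : 1;
```

The flag ends up set even though the content is not well-formed UTF-8, while the checked decode fails in a way the caller can trap with eval.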

After some experimentation, I've come up with a safer solution:

 use 5.12.0;
 use Encode::Locale;
 use Encode qw/decode/;
 # ...
 my $name = decode( 'locale' , $win->get_name);

This works in a UTF-8 locale and is safer than turning the utf8 flag on.

That said, shouldn't this decoding work be done in Gnome2::Wnck?

All the best




Bug#847051: Question and possibly bugs about string encoding in gtk-perl

2016-12-15 Thread Dominique Dumont
On Wednesday, 14 December 2016 10:54:16 CET Boyuan Yang wrote:
> The original messy output, as shown in the screenshot in the Ubuntu bug,
> looks like latin-1-encoded binary data being treated as UTF-8-encoded data
> and displayed anyway.

In more detail, the problematic code boils down to:

 my $wnck_screen = Gnome2::Wnck::Screen->get_default;
 my ($win) = $wnck_screen->get_windows_stacked; # returns a list of windows; take one
 my $name = $win->get_name;
 my $window_item = Gtk2::ImageMenuItem->new_with_label( $name );

$name apparently contains the window name as raw octets instead of a decoded 
UTF-8 string. As a consequence, the list of windows shown by shutter contains 
mojibake.

> The hacky patch proposed (by me) is using
> Encode::_utf8_on() to turn on the internal flag for string and mark it as
> UTF-8.

And I'm worried that shutter may crash if used in a non-utf8 environment.

After some experimentation, I've come up with a safer solution:

 use 5.12.0;
 use Encode::Locale;
 use Encode qw/decode/;
 # ...
 my $name = decode( 'locale' , $win->get_name);

This works in a UTF-8 locale and is safer than turning the utf8 flag on.

That said, shouldn't this decoding work be done in Gnome2::Wnck?

All the best
