On 2010-05-09 08:41, Emmanuel Rodriguez wrote:
> Gtk2-Perl also returns all strings in UTF-8. If you want to see if a
> string is in UTF-8 or not I suggest that you use Devel::Peek [1]
>
>
>     my $button = new Gtk2::Button( "print" );
>     my $entry = new Gtk2::Entry;
>     $button->signal_connect( "clicked",sub {print $entry->get_text,"\n"} );
>
> I'm certain that $entry->get_text returns an UTF-8 string. I'm guessing
> that the problem lies on the encoding of STDOUT.

I am not concerned with the Perl-internal representation of the string
(looking with Devel::Peek as you suggested, it does seem to be stored
internally as UTF-8 data). The problem is that when I access the data, I
get 8-bit Latin-1 data, long before any file I/O is involved.

In the example above, when I print the data using

    print map( { sprintf("%X ", $_) } unpack("C*", $entry->get_text) ), "\n";

and enter "öä" (LATIN SMALL LETTER O WITH DIAERESIS, LATIN SMALL LETTER A
WITH DIAERESIS), I get "F6 E4". I just managed to get an old system up and
running again (Perl 5.8.5, Gtk2-Perl 1.144), and there the same little
program prints "C3 B6 C3 A4".

I do not prefer one result over the other, but I am pretty sure that all
older systems print the UTF-8 bytes. My original program that brought up
this issue is many years old. It compares a string that is known to be
ISO-8859-1 with a string from GTK, and to get correct results I always had
to convert the latter explicitly.

So obviously there was a change somewhere, and I would like to know when
and where it occurred, so that I can adjust my program to run equally well
on old and new systems. My current solution, checking whether the data I
get looks like UTF-8, is quite insane, because the result does not depend
on user input but on the system the program is running on ...

Regards,
Peter
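
One way to make the hex dump above independent of the system is to encode the string explicitly before unpacking it, so that unpack("C*") always sees a plain byte string. What follows is only a minimal sketch under stated assumptions: hex_bytes is a hypothetical helper (not part of Gtk2-Perl), and the literal string stands in for $entry->get_text. A possible explanation for the difference, not confirmed in this thread, is that Perl 5.10 changed pack/unpack to work on characters rather than on the bytes of the internal UTF-8 representation, which would account for "C3 B6 C3 A4" on 5.8.5 and "F6 E4" on newer systems.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not part of Gtk2-Perl): hex dump of a character
# string after encoding it explicitly to the named encoding.
sub hex_bytes {
    my ($encoding, $text) = @_;
    my $bytes = encode($encoding, $text);   # byte string, UTF8 flag off
    return join(" ", map { sprintf("%02X", $_) } unpack("C*", $bytes));
}

# Assumption: stand-in for $entry->get_text. The two characters "öä",
# forced into Perl's internal UTF-8 representation.
my $text = "\x{F6}\x{E4}";
utf8::upgrade($text);

print hex_bytes("UTF-8", $text), "\n";        # prints "C3 B6 C3 A4"
print hex_bytes("ISO-8859-1", $text), "\n";   # prints "F6 E4"
```

Because the encoding is chosen explicitly, a comparison against a string known to be ISO-8859-1 would no longer need to guess whether the data "looks like" UTF-8.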