On Tuesday, Mar 4, 2003, at 07:59 Asia/Tokyo, David Oftedal wrote:
Hello!

Sorry to make this a mass spam, but I need a program to convert UTF-8 to hex sequences. This is useful for embedding text in non-UTF web pages, but also for creating a Yudit keymap file, which I'm doing at the moment.

For example, a file with the content æøå would yield the output "0x00E6 0x00F8 0x00E5", and the Japanese expression あの人 would yield "0x3042 0x306E 0x4EBA".

Can anyone tell me how to do it without making a program for it myself? It would be VERY helpful, and I've already made 2 programs for assembling this file and I'm not starting on another just yet.

Perl 5.8 lets you do this in a one-liner, reading UTF-8 on standard input and printing one line of hex code points per input line:


perl -MEncode -ple '$_=join(" ",map {sprintf "0x%04X", $_} unpack("U*", decode("utf8",$_)))'

A more descriptive script, doing the same thing step by step, is as follows:

#
use strict;
use Encode;
while(<>){
        chomp $_;
        my $line = decode("utf8" => $_);
        my (@chars) = unpack("U*" => $line);
        my (@hexed) = map {sprintf "0x%04X", $_} @chars;
        my $hexed   = join(" " => @hexed);
        print $hexed, "\n";
}
__END__
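
To check the conversion against the two examples in the question, the same steps can be wrapped in a small function. This is only a sketch: utf8_to_hex is a hypothetical helper name, not part of Encode, and it uses ord on each decoded character where the script above uses unpack("U*"). The input bytes are written as escapes so the result does not depend on how the file itself is saved.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of Encode): takes a string of UTF-8
# encoded bytes and returns its code points as space-separated hex.
sub utf8_to_hex {
    my ($bytes) = @_;
    my $chars = decode("UTF-8", $bytes);    # bytes -> Perl characters
    return join " ", map { sprintf "0x%04X", ord($_) } split //, $chars;
}

# UTF-8 bytes spelled out explicitly for the two examples in the question.
print utf8_to_hex("\xC3\xA6\xC3\xB8\xC3\xA5"), "\n";              # æøå
print utf8_to_hex("\xE3\x81\x82\xE3\x81\xAE\xE4\xBA\xBA"), "\n";  # あの人
```

This should print "0x00E6 0x00F8 0x00E5" and "0x3042 0x306E 0x4EBA". Note that %04X only pads to a minimum of four digits, so a code point above U+FFFF comes out with five or six hex digits rather than being truncated.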

An even funkier example defines "hex" as a custom encoding and applies it to STDOUT as a PerlIO layer:

#
package Encode::Hex;
use strict;
use base qw(Encode::Encoding);
__PACKAGE__->Define('hex');
sub encode($$;$){
    my ($obj, $str, $chk) = @_;
    my @hexed =
        map {$_ == ord("\n") ? chr($_) : sprintf "0x%04X", $_}
            unpack("U*" => $str);
    $_[1] = '' if $chk;
    return join(" " => @hexed);
}
package main;
binmode STDIN => ":utf8";
binmode STDOUT => ":encoding(hex)";
while(<>){
    chomp;
    print $_, "\n";
}
__END__

Dan the (Perl5 Porter|Encode Maintainer)


