Jan Eden wrote:
>
> Hi,

Hello,

> sorry for the lengthy post.
>
> I recently wrote a Perl script to convert 8-bit characters to LaTeX
> commands. The first version (which works just fine) looks like this
> (the ... indicates more lines to follow):

Your regular expressions look like they are longer than 8 bits.

> >#!/usr/bin/perl -pw
> >
> >s/„/{\\glqq}/g;
> >s/“/{\\grqq}/g;
> >s/á/\\'{a}/g;
> >s/à/\\`{a}/g;
> >s/â/\\^{a}/g;
> >s/ä/\\"{a}/g;
> >....
>
> Now I tried to use a hash instead of consecutive replacement commands.
> The second version looked like this:
>
> >#!/usr/bin/perl -w
> >
> >%enctabelle = ("„"=>"{\\glqq}",
> >"“"=>"{\\grqq}",
> >"á"=>"\\'{a}",
> >"à"=>"\\`{a}",
> >"â"=>"\\^{a}",
> >....
> >
> >while (<>) {
> >    $zeile = $_;
> >    foreach $char (keys %enctabelle) {
> >        $zeile =~ s/$char/$enctabelle{$char}/g;
> >    }
> >    print $zeile;
> >}
>
> This worked, too, but it was extremely slow, obviously since the variables
> were compiled over and over again.
>
> I gave it a third try like this (code taken from someone else's script):
>
> >%enctabelle = ("„"=>"{\\glqq}",
> >"“"=>"{\\grqq}",
> >"á"=>"\\'{a}",
> >"à"=>"\\`{a}",
> >"â"=>"\\^{a}",
> >....
> >
> >while (<>) {
> >    s/(.)/exists $enctabelle{$1} ? $enctabelle{$1} : $1/geo;
> >    print;
> >}
>
> This did not change the text at all. When I removed the ternary operator
>
> >s/(.)/exists $enctabelle{$1}/g;
>
> I got an error message like this:
>
> >Line 208: Use of uninitialized value in substitution iterator <> line 1.
>
> Obviously, Perl cannot interpolate variable names like $enctabelle{ä}.
> Both the script and the file to convert are UTF-8 encoded. What's the problem here?

The problem is probably that you are searching for a single byte (.),
not a UTF-8 character.

perldoc perlunicode
perldoc utf8
perldoc bytes

> On another list, I got a rather complicated snippet I did not fully understand:
>
> >#!perl
> >
> >%enctabelle = (...);
> >
> >my $re = '(' . join('|', map quotemeta($_), keys %enctabelle) . ')';
> >$re = qr/$re/;
> >
> >while (<>) {
> >    s/$re/$enctabelle{$1}/g;
> >    print;
> >}
>
> Maybe the quotemeta part is what helps identifying the corresponding value?
>
> Any hints are greatly appreciated,

Do you want the fastest code? The shortest code? The most maintainable
code? What are you trying to accomplish?


John
-- 
use Perl;
program
fulfillment
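
For illustration only, here is a minimal sketch of how the byte-versus-character diagnosis and the two hash-based attempts from this thread could fit together. It is not code posted by either author. It assumes the input has been decoded from UTF-8 into Perl characters before matching, and that the garbled quote keys in the table stand for the German quotation marks „ and “. The sub names to_latex_lookup and to_latex_regex are hypothetical, and the table is only the partial one quoted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                    # string literals in this file are UTF-8 text
use Encode qw(decode);

# Partial table from the thread; the quote keys are an assumption.
my %enctabelle = (
    "„" => '{\\glqq}',
    "“" => '{\\grqq}',
    "á" => "\\'{a}",
    "à" => '\\`{a}',
    "â" => '\\^{a}',
    "ä" => '\\"{a}',
);

# Undecoded UTF-8 input: "ä" arrives as two bytes, so (.) matches each
# byte separately and neither byte is a key in the hash.
my $raw  = "\xC3\xA4";
my $text = decode('UTF-8', $raw);
printf "bytes: %d, characters: %d\n", length($raw), length($text);

# Third attempt from the thread, applied to decoded text (and without /o).
sub to_latex_lookup {
    my ($str) = @_;
    $str =~ s/(.)/exists $enctabelle{$1} ? $enctabelle{$1} : $1/ge;
    return $str;
}

# Single-pass alternation: quotemeta only escapes regex metacharacters in
# the keys; the value is still found through the hash lookup on $1.
my $alternation = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) }    # longer keys first
    keys %enctabelle;
my $re = qr/($alternation)/;

sub to_latex_regex {
    my ($str) = @_;
    $str =~ s/$re/$enctabelle{$1}/g;
    return $str;
}

print to_latex_lookup($text), "\n";
print to_latex_regex("„Käse“ à la carte"), "\n";
```

When reading from files or STDIN instead of literals, the same decoding step is needed first, for example with binmode(STDIN, ':encoding(UTF-8)'). The compiled alternation is built once, which avoids recompiling one pattern per key on every line as in the second attempt.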