On Sep 08, 2005, at 12:39 , Jerzy Giergiel wrote:
> Neither of those fallbacks is OK, I want á converted to accent
> stripped version of itself i.e. a. The second solution isn't very
> helpful either, it's basically tr replacement table which is not
> much fun to write when majority of upper 128 characters need to be
> converted. There's gotta be a simpler and more elegant solution.
> thanks anyway.

Well, it's not that hard to write a tr version if you let perl do the
job.
#!/usr/bin/perl
use strict;
use charnames qw(:full);
my ($from, $to);
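for my $ord (0x80..0xff){
    my $chr  = chr $ord;
    my $name = charnames::viacode($ord);
    # e.g. "LATIN SMALL LETTER A WITH ACUTE" gives $1 = "SMALL", $2 = "A"
    $name =~ /(SMALL|CAPITAL) LETTER ([A-Z]) WITH/i or next;
    # viacode() names are upper case, so lower-case the SMALL letters
    my $az = $1 eq 'CAPITAL' ? uc($2) : lc($2);
    $from .= $chr;
    $to   .= $az;
}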
binmode STDOUT => ":utf8";
print qq(tr[$from]\n [$to];), "\n";
__END__
And here is the output.
tr[ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝàáâãäåçèéêëìíîïñòóôõöøùúûüýÿ]
[AAAAAACEEEEIIIINOOOOOOUUUUYaaaaaaceeeeiiiinoooooouuuuyy];
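
To actually use that table, you could paste it into a small sub. This is only a sketch: strip_accents is a name I made up for illustration, and it assumes the script itself is saved as UTF-8 (hence the use utf8).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # the tr table below contains literal non-ASCII characters

# strip_accents() is a hypothetical helper name, not part of any module.
# It applies the generated tr table to a copy of its argument.
sub strip_accents {
    my ($str) = @_;
    $str =~ tr[ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝàáâãäåçèéêëìíîïñòóôõöøùúûüýÿ]
              [AAAAAACEEEEIIIINOOOOOOUUUUYaaaaaaceeeeiiiinoooooouuuuyy];
    return $str;
}

binmode STDOUT => ":utf8";
print strip_accents("Dàñ thè Ëñçôdé Máìñtâíñêr"), "\n";
# prints: Dan the Encode Maintainer
```

Characters that are not in the table (Æ, ß, Þ and so on) pass through untouched.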
In this kind of case, however, a simple tr/// won't always cut it.
Consider Schrödinger. Usually you spell that "Schroedinger", not
"Schrodinger". So you have to resort to s///g for most cases.
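
Something along these lines would do for the s///g route. Take it as a rough sketch: the %digraph table is my own illustrative (and deliberately incomplete) German-style mapping, and transliterate is a hypothetical name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # the table below contains literal non-ASCII characters

# Illustrative, incomplete mapping; extend it for the languages you need.
my %digraph = (
    'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
    'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue',
    'ß' => 'ss',
);

# Build one character class out of the keys so a single s///g pass suffices.
my $class = join '', keys %digraph;

# transliterate() is a hypothetical helper name.
sub transliterate {
    my ($str) = @_;
    $str =~ s/([$class])/$digraph{$1}/g;
    return $str;
}

binmode STDOUT => ":utf8";
print transliterate("Schrödinger"), "\n";
# prints: Schroedinger
```

Unlike tr///, the replacement can be longer than the character it replaces, which is the whole point here.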
Dàñ thè Ëñçôdé Máìñtâíñêr