Re: intelligent lexically encoding

2005-09-08 Thread Dan Kogai

On Sep 08, 2005, at 12:39 , Jerzy Giergiel wrote:
Neither of those fallbacks is OK, I want á converted to accent  
stripped version of itself i.e. a. The second solution isn't very  
helpful either, it's basically tr replacement table which is not  
much fun to write when majority of upper 128 characters need to be  
converted. There's gotta be a simpler and more elegant solution.   
thanks anyway.


Well, it's not that hard to write a tr version if you let perl do the  
job.


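#!/usr/bin/perl
use strict;
use charnames qw(:full);
# Build a tr/// table that maps accented Latin-1 letters to base letters.
my ($from, $to) = ('', '');
for my $ord (0x80..0xff){
    my $chr  = chr $ord;
    my $name = charnames::viacode($ord);
    # e.g. "LATIN SMALL LETTER A WITH ACUTE"; names without WITH are skipped
    $name =~ /(SMALL|CAPITAL) LETTER ([A-Z]) WITH/i or next;
    # Unicode names are upper case, so small letters must be lowercased
    my $az = $1 eq 'CAPITAL' ? uc($2) : lc($2);
    $from .= $chr;
    $to   .= $az;
}
binmode STDOUT => ":utf8";
print qq(tr[$from]\n  [$to];), "\n";
__END__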

And here is the output.

tr[ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝàáâãäåçèéêëìíîïñòóôõöøùúûüýÿ]
  [AAAAAACEEEEIIIINOOOOOOUUUUYaaaaaaceeeeiiiinoooooouuuuyy];

In this kind of case, however, a simple tr/// won't cut it.
Consider Schrödinger.  Usually you spell that "Schroedinger", not
"Schrodinger".  So you have to resort to s///g for most cases.
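A minimal sketch of the s///g approach, combined with the tr/// table for everything else.  The %multi table and the strip_accents name are illustrative assumptions, not part of any module, and the German-style spellings are only one possible convention.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;    # this source contains literal non-ASCII characters

# Hypothetical table of multi-character replacements (German-style).
# Extend or change it to suit the language you are dealing with.
my %multi = (
    'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
    'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue',
    'ß' => 'ss',
);
my $class = join '', keys %multi;

sub strip_accents {
    my ($str) = @_;
    # First the letters that expand to more than one character ...
    $str =~ s/([$class])/$multi{$1}/g;
    # ... then a plain one-to-one tr/// for the remaining accented letters.
    $str =~ tr[ÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÑÒÓÔÕÖØÙÚÛÜÝàáâãäåçèéêëìíîïñòóôõöøùúûüýÿ]
              [AAAAAACEEEEIIIINOOOOOOUUUUYaaaaaaceeeeiiiinoooooouuuuyy];
    return $str;
}

binmode STDOUT, ':utf8';
print strip_accents('Schrödinger'), "\n";
```

The s///g pass has to run before tr///, otherwise ö would already have been flattened to o.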


Dàñ thè Ëñçôdé Máìñtâíñêr



[Encode] 2.12 Released!

2005-09-08 Thread Dan Kogai

Porters,

I am pleased to release Encode Version 2.12 as follows;

=head1 Availability

http://www.dan.co.jp/~dankogai/cpan/Encode-2.12.tar.gz
and CPAN near you.

=head1 Highlight

You can finally use a coderef as CHECK.

   coderef for CHECK

   As of Encode 2.12 CHECK can also be a code reference which takes the
   ord value of the unmapped character as an argument and returns a string
   that represents the fallback character.  For instance,

     $ascii = encode("ascii", $utf8, sub{ sprintf "<U+%04X>", shift });

   Acts like FB_PERLQQ but <U+XXXX> is used instead of \x{XXXX}.
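A short sketch of the coderef CHECK in use.  The first call is the example from the POD; the second shows how a callback could strip accents by looking at the Unicode character name.  The $strip callback and the choice of '?' for unmappable characters are assumptions made for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use charnames qw(:full);
use Encode qw(encode);

# 1. The fallback from the POD: unmapped characters become <U+XXXX>.
my $utf8  = "Schr\x{f6}dinger \x{263A}";
my $ascii = encode("ascii", $utf8, sub { sprintf "<U+%04X>", shift });
print "$ascii\n";

# 2. Hypothetical accent-stripping fallback: use the base letter found in
#    the character name, or '?' when the name has no "LETTER X WITH" part.
my $strip = sub {
    my $name = charnames::viacode(shift);
    return '?' unless defined $name
        and $name =~ /(SMALL|CAPITAL) LETTER ([A-Z]) WITH/;
    return $1 eq 'CAPITAL' ? $2 : lc $2;
};
my $plain = encode("ascii", "D\x{e0}\x{f1} th\x{e8} \x{cb}\x{f1}\x{e7}\x{f4}d\x{e9}", $strip);
print "$plain\n";
```

Note that a one-character fallback like this still cannot produce "Schroedinger"; the callback would have to return "oe" for that case.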

=head1 Changes

$Revision: 2.12 $ $Date: 2005/09/08 14:17:17 $
! Encode.xs Encode.pm t/fallback.t
  Now accepts coderef for CHECK!
! ucm/8859-7.ucm
  Updated to newer version at unicode.org
  http://rt.cpan.org/NoAuth/Bug.html?id=14222
! lib/Encode/Supported.pod
  More POD typo fixed.
  <[EMAIL PROTECTED]>
! encoding.pm
  More POD typo leftover fixed.
  Message-Id: <[EMAIL PROTECTED]>

=head1 Signature

Dan the Encode Maintainer



Re: [Encode] 2.12 Released!

2005-09-08 Thread Dan Kogai

On Sep 08, 2005, at 23:34 , Sastry wrote:

Hi Dan
Please check my previous mail on Encode problem on EBCDIC. Did you  
apply the patch in this new Version?(Seems like that is broken on  
EBCDIC platform as I happened to test on z/OS)


I will be glad if you can reply to my previous mail at the
earliest!


Ouch.  Crisscrossed.   I checked AFTER I've subscribed the patch. I  
just want to make sure you are talking about the one below;



From:   [EMAIL PROTECTED]
Subject: Re: Encode on EBCDIC patch( Doesn't Work)
Date: September 08, 2005 22:14:35  JST
To:   [EMAIL PROTECTED]
Cc:   perl5-porters@perl.org
Reply-To:   [EMAIL PROTECTED]
Message-Id: <[EMAIL PROTECTED]>


As for the upgrade, 2.12 DOES NOT contain my ad-hoc workaround
patch.  I'm not going to ship anything like that without tests.


Dan the Encode Maintainer