Re: UTF-8 case conversion

2003-09-03 Thread Jarkko Hietaniemi
> B: The following operations look the same but are not quite so; > > from_to($data, ïso-8859-1", ütf8"); #1 > $data = decode(ïso-8859-1", $data); #2 Ooops. My GNU Emacs iso-accents-mode tried to be too helpful, here... -- Jarkko Hietaniemi <[EMAIL PROTECTED]> http://www.iki.fi/jhi/ "Ther
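
For readers of the archive, a minimal sketch of how the two quoted operations differ, with the encoding names spelled as presumably intended ("iso-8859-1", "utf8"). The variable names and the sample string are made up for illustration; only from_to() and decode() come from the quoted text.

```perl
use strict;
use warnings;
use Encode qw(from_to decode);

# Hypothetical sample: "\xD6" is the single Latin-1 byte for O with diaeresis.
my $octets = "\xD6RJAN";
my $chars  = $octets;

# 1: from_to() converts in place; $octets still holds bytes, now UTF-8 encoded.
from_to($octets, "iso-8859-1", "utf8");

# 2: decode() returns a character string; one element per letter.
$chars = decode("iso-8859-1", $chars);

printf "from_to: length %d\n", length($octets);
printf "decode:  length %d\n", length($chars);
```

After #1 the string is six bytes long, because the accented letter now takes two bytes. After #2 it is five characters long, which is the form lc() and ucfirst() need.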

Re: UTF-8 case conversion

2003-09-03 Thread Jarkko Hietaniemi
> > or if you have an editor that handles UTF-8, just say "use utf8;" at > > the top of your script and write your script, including the string > > literals, in UTF-8. > > Oops. Sorry for the wrong advice. I wonder where my head was spinning... I wouldn't call it _bad_, maybe _misguided_ :-

Re: UTF-8 case conversion

2003-09-03 Thread sigfrid . lundberg
On Wed, 3 Sep 2003, Jarkko Hietaniemi wrote: > > > use Encode 'from_to'; > > > > > > my $orjan = 'ÖRJAN'; > > > my $lundstrom = 'LUNDSTRÖM'; > > > > > > print $orjan . ' ' . $lundstrom . "\n"; > > > > > > from_to $orjan,'latin1','utf-8'; > > > from_to $lundstrom,'latin1','utf-8'; > > > > It is my

Re: UTF-8 case conversion

2003-09-03 Thread Andreas J Koenig
> On Wed, 3 Sep 2003 15:58:28 +0300, Jarkko Hietaniemi <[EMAIL PROTECTED]> said: >> > > from_to $orjan,'latin1','utf-8'; >> > > from_to $lundstrom,'latin1','utf-8'; >> > >> > Add this and you're there: >> > >> > binmode STDOUT, ":utf8"; >> >> Now, this did help. I'm starting to l

Re: UTF-8 case conversion

2003-09-03 Thread sigfrid . lundberg
On Wed, 3 Sep 2003, Bart Schuller wrote: > On Wed, Sep 03, 2003 at 01:05:21PM +0200, [EMAIL PROTECTED] wrote: > > use Encode 'from_to'; > > > > my $orjan = 'ÖRJAN'; > > my $lundstrom = 'LUNDSTRÖM'; > > > > print $orjan . ' ' . $lundstrom . "\n"; > > > > from_to $orjan,'latin1','utf-8'; > > from_to

Re: UTF-8 case conversion

2003-09-03 Thread Jarkko Hietaniemi
> > > from_to $orjan,'latin1','utf-8'; > > > from_to $lundstrom,'latin1','utf-8'; > > > > Add this and you're there: > > > > binmode STDOUT, ":utf8"; > > Now, this did help. I'm starting to learn :) Well, yes, that helps in the way that Perl knows you really intend to "speak utf8" to STDOUT,

Re: UTF-8 case conversion

2003-09-03 Thread Jarkko Hietaniemi
> > use Encode 'from_to'; > > > > my $orjan = 'ÖRJAN'; > > my $lundstrom = 'LUNDSTRÖM'; > > > > print $orjan . ' ' . $lundstrom . "\n"; > > > > from_to $orjan,'latin1','utf-8'; > > from_to $lundstrom,'latin1','utf-8'; > > It is my understanding that from_to is the wrong thing to use here. The

Re: UTF-8 case conversion

2003-09-03 Thread sigfrid . lundberg
On Wed, 3 Sep 2003, Andreas J Koenig wrote: > > On Wed, 3 Sep 2003 13:05:21 +0200 (CEST), [EMAIL PROTECTED] said: > > > I wrote a small script (see below), trying to transform ÖRJAN > > LUNDSTRÖM into Örjan Lundström, but it seems to fail, probably because > > of locale related problems.

Re: UTF-8 case conversion

2003-09-03 Thread Bart Schuller
On Wed, Sep 03, 2003 at 01:05:21PM +0200, [EMAIL PROTECTED] wrote: > use Encode 'from_to'; > > my $orjan = 'ÖRJAN'; > my $lundstrom = 'LUNDSTRÖM'; > > print $orjan . ' ' . $lundstrom . "\n"; > > from_to $orjan,'latin1','utf-8'; > from_to $lundstrom,'latin1','utf-8'; It is my understanding that
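
A rough sketch of the decode-based alternative to from_to for the case conversion asked about in this thread. It assumes the input really is Latin-1 bytes, and the array names are hypothetical. It is one way to do it, not necessarily what the posters ended up with.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Latin-1 bytes for the two names ("\xD6" is O with diaeresis).
my @names = ("\xD6RJAN", "LUNDSTR\xD6M");

# Decode first so lc() and ucfirst() operate on characters, not bytes.
my @fixed = map { ucfirst(lc(decode("iso-8859-1", $_))) } @names;

# Encode explicitly on output; a ":utf8" layer on STDOUT would be the
# other option, but then the string must not be encoded by hand as well.
print encode("UTF-8", join(" ", @fixed)), "\n";
```

The point of the sketch is the order of operations: decode on input, change case on the character string, encode once on output.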

Re: UTF-8 case conversion

2003-09-03 Thread Andreas J Koenig
> On Wed, 3 Sep 2003 13:05:21 +0200 (CEST), [EMAIL PROTECTED] said: > I wrote a small script (see below), trying to transform ÖRJAN > LUNDSTRÖM into Örjan Lundström, but it seems to fail, probably because > of locale related problems. My question is then simply. How do I do > this then

UTF-8 case conversion

2003-09-03 Thread sigfrid . lundberg
The perlunicode POD tells me the following: lc(), uc(), lcfirst(), and ucfirst() work for the following cases: the case mapping is from a single Unicode character to another single Unicode character, or the case mapping is from a single Uni

Re: bytes::substr() ?

2003-09-03 Thread Jarkko Hietaniemi
Perl 5.8.1, whenever that happens, will have bytes::substr(). -- Jarkko Hietaniemi <[EMAIL PROTECTED]> http://www.iki.fi/jhi/ "There is this special biologist word we use for 'stable'. It is 'dead'." -- Jack Cohen