Re: How to convert base64 string to utf-8

2004-02-06 Thread Guido Flohr
Hi,

ALexander N. Treyner wrote:
> hello =?windows-1255?Q?=F9=EC=E5=ED_hello_world?=
>
> After converting it by code you wrote into utf-8, the "_" is still
> present between second "hello" and "world".
> Is it right behavior?

You can re-invent the wheel for that problem, or simply use MIME::Words;
see http://www.mail-archive.com/[EMAIL PROTECTED]/msg02108.html

Ciao

Guido
--
Imperia AG, Development
Leyboldstr. 10 - D-50354 Hürth - http://www.imperia.net/


Re: How to convert base64 string to utf-8

2004-02-06 Thread Nick Ing-Simmons
Guido Flohr <[EMAIL PROTECTED]> writes:
>ALexander N. Treyner wrote:
>> Hello All,
>> I'm using utf-8 Postgres database, where I save strings in many languages.
>> I have to match the database with strings encoded in mime base64 or 
>> quoted-printable format. Like next:
>> =?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?=
>> or
>> =?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?=
>> 
>> I think that I need first convert these strings to utf-8, but I can not 
>> find out how to do it.
>
>You are looking for MIME::Words::decode_mimewords().  

Encode also has a MIME-Header encoding:

   Encode::decode('MIME-Header',$tag);

The decode is okay; its version of encode is not compliant.
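
A minimal sketch of that approach, applied to the two sample headers from
the original question. The helper name mime_header_to_utf8 is made up for
illustration; it assumes the charset named in the header is one Encode
knows about.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode the RFC 2047 encoded-words in a header value
# and return the result as UTF-8 octets, ready to compare with the database.
sub mime_header_to_utf8 {
    my ($header) = @_;
    my $chars = decode('MIME-Header', $header);   # Perl character string
    return encode('UTF-8', $chars);               # UTF-8 octets
}

# The two sample headers quoted in this thread.
for my $header (
    '=?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?=',
    '=?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?=',
) {
    print mime_header_to_utf8($header), "\n";
}
```

Whether the database comparison wants UTF-8 octets or Perl character
strings depends on how the database driver is configured, so the final
encode step may not be needed.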

>The function will 
>also give you the charset of the decoded data, and if you are lucky 
>enough, that charset will be known to Encode and you can convert it to 
>UTF-8.  Unfortunately, you will be out of luck for the somewhat common 
>case of UTF-7 (unless it is available in Encode by now).

I personally have never seen anything at all in UTF-7. If it really is
common, we can add it to Encode.

>
>Ciao
>
>Guido



Re: How to convert base64 string to utf-8

2004-02-06 Thread Nick Ing-Simmons
ALexander N. Treyner <[EMAIL PROTECTED]> writes:
>Hi John,
>Your code works perfect.
>But I found one strange thing.
>For example I have next string:
>
>   hello שלום hello world
>
>that converted by the mail client to
>   
>   hello =?windows-1255?Q?=F9=EC=E5=ED_hello_world?=
>
>After converting it by code you wrote into utf-8, the "_" is still 
>present between second "hello" and "world".
>Is it right behavior?

No - '_' is an encoding of 0x20 in the codeset; for ASCII-based
codesets that is a space, so the decoded text should not contain the '_'.
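
To make the rule concrete, here is a rough hand-rolled sketch of the
Q-encoding step for a single encoded-word. decode_q_word is a hypothetical
name, and this is an illustration only, not the code John posted; in
practice MIME::Words or Encode's MIME-Header would do this for you.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode the text part of one Q-encoded word,
# given its charset. Returns a Perl character string.
sub decode_q_word {
    my ($charset, $text) = @_;
    $text =~ tr/_/ /;                               # "_" stands for octet 0x20
    $text =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # "=XX" is one raw octet
    return decode($charset, $text);                 # octets -> characters
}

# The encoded-word from the message above, split into charset and text.
my $chars = decode_q_word('windows-1255', '=F9=EC=E5=ED_hello_world');
print encode('UTF-8', $chars), "\n";
```

The tr/_/ / line is the step that was missing: without it the "_" passes
through untouched, which matches the behaviour Alex describes.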

>
>Thx,
>Alex.