Re: How to convert base64 string to utf-8
Hi,

ALexander N. Treyner wrote:
>  hello =?windows-1255?Q?=F9=EC=E5=ED_hello_world?=
>
>After converting it by code you wrote into utf-8, the "_" is still
>present between second "hello" and "world".
>Is it right behavior?

You can re-invent the wheel for that problem, or simply use MIME::Words,
revisit http://www.mail-archive.com/[EMAIL PROTECTED]/msg02108.html

Ciao

Guido
-- 
Imperia AG, Development
Leyboldstr. 10 - D-50354 Hürth - http://www.imperia.net/
Re: How to convert base64 string to utf-8
Guido Flohr <[EMAIL PROTECTED]> writes:
>ALexander N. Treyner wrote:
>> Hello All,
>> I'm using utf-8 Postgres database, where I save strings in many languages.
>> I have to match the database with strings encoded in mime base64 or
>> quoted-printable format. Like next:
>> =?utf-8?B?15TXoNeUINee16nXlNeZINeR16LXkdeo15nXqi4=?=
>> or
>> =?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?=
>>
>> I think that I need first convert these strings to utf-8, but I can not
>> find out how to do it.
>
>You are looking for MIME::Words::decode_mimewords().

Encode also has a MIME-Header encoding:

  Encode::decode('MIME-Header',$tag);

Its decode is okay; its version of encode is not compliant.

>The function will
>also give you the charset of the decoded data, and if you are lucky
>enough, that charset will be known to Encode and you can convert it to
>UTF-8. Unfortunately, you will be out of luck for the somewhat common
>case of UTF-7 (unless it is available in Encode by now).

I personally have never seen anything at all in UTF-7. If it really is
common, we can add it to Encode.

>
>Ciao
>
>Guido
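To make the Encode route above concrete, here is a minimal sketch. It assumes a reasonably recent Encode that ships the 'MIME-Header' encoding and knows the KOI8-R charset; the variable names are only illustrative, and the sample value is the KOI8-R header quoted earlier in the thread.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Sample header value from the thread (KOI8-R, Q-encoded).
my $raw = '=?KOI8-R?Q?=F0=D2=C9=D7=C5=D4=2C_=ED=C9=D2!!!?=';

# decode() with the 'MIME-Header' encoding undoes the Q/B transfer
# encoding and the charset in one step, returning Perl characters.
my $chars = decode('MIME-Header', $raw);

# Turn the characters into UTF-8 bytes, e.g. for a UTF-8 database
# column or for comparison against UTF-8 data.
my $utf8 = encode('UTF-8', $chars);

print $utf8, "\n";
```

With this sample the decoded text should be the Russian greeting "Привет, Мир!!!". MIME::Words::decode_mimewords() takes a different approach: it hands back the still-encoded bytes together with the charset name, so the Encode::decode($charset, $bytes) step is left to the caller.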
Re: How to convert base64 string to utf-8
ALexander N. Treyner <[EMAIL PROTECTED]> writes:
>Hi John,
>Your code works perfect.
>But I found one strange thing.
>For example I have next string:
>
>  hello hello world
>
>that converted by the mail client to
>
>  hello =?windows-1255?Q?=F9=EC=E5=ED_hello_world?=
>
>After converting it by code you wrote into utf-8, the "_" is still
>present between second "hello" and "world".
>Is it right behavior?

No. Inside a Q-encoded word, '_' is an encoding of the byte 0x20 in the
codeset; for ASCII-based codesets that is a space, so a correct decoder
turns it back into one.

>
>Thx,
>Alex.
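For anyone decoding by hand, the rule above can be sketched as follows. This is only an illustration: decode_q_payload is a hypothetical helper, not part of Encode or MIME::Words, and it handles just the payload of a single Q-encoded word (the part between "?Q?" and "?=").

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: decode the payload of one Q-encoded word.
sub decode_q_payload {
    my ($charset, $payload) = @_;

    # '_' stands for the byte 0x20. Do this before the =XX step so
    # that a literal underscore written as =5F survives.
    $payload =~ tr/_/ /;

    # =XX stands for the byte with hex value XX.
    $payload =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;

    # Interpret the resulting bytes in the declared charset.
    return decode($charset, $payload);
}

my $text = decode_q_payload('windows-1255', '=F9=EC=E5=ED_hello_world');
print encode('UTF-8', $text), "\n";
```

A decoder that only performs the =XX step would leave the underscores in place, which matches the symptom described in the question.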