Re: [GENERAL] Postgres Encoding conversion problem

2008-04-23 Thread Clemens Schwaighofer
On 04/23/2008 04:33 PM, Albe Laurenz wrote:
> Michael Fuhr wrote:
>>>> I sometimes have a problem with conversion of encodings, e.g. from UTF-8
>>>> to ShiftJIS:
>>>>
>>>> ERROR:  character 0xf0a0aeb7 of encoding "UTF8" has no
>>>> equivalent in "SJIS"
>>>>
>>>> I have no idea what character this is, I cannot view it in my
>>>> browser, etc.
>>> It translates to Unicode 10BB7, which is not defined.
>> Actually it's 20BB7.
> 
> Oops, you're correct. Made an error in my calculations. Thanks.
> 
> So that explains the problem.
> Still, to handle it, the offending character needs to be changed before
> converting to SJIS.

I probably won't get around a cleanup step before writing the data out.
*sigh* Or I export the data in UTF-8 ...

-- 
[ Clemens Schwaighofer  -=:~ ]
[ IT Engineer/Manager, TEQUILA\ Japan IT Group   ]
[6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ]
[ Tel: +81-(0)3-3545-7703  Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.co.jp   ]

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Postgres Encoding conversion problem

2008-04-23 Thread Albe Laurenz
Michael Fuhr wrote:
>>> I sometimes have a problem with conversion of encodings eg from UTF-8
>>> to ShiftJIS:
>>>
>>> ERROR:  character 0xf0a0aeb7 of encoding "UTF8" has no
>>> equivalent in "SJIS"
>>>
>>> I have no idea what character this is, I cannot view it in my
>>> browser, etc.
>> 
>> It translates to Unicode 10BB7, which is not defined.
> 
> Actually it's 20BB7.

Oops, you're correct. Made an error in my calculations. Thanks.

So that explains the problem.
Still, to handle it, the offending character needs to be changed before
converting to SJIS.
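
One way to do that cleanup outside the database, sketched in Python under
stated assumptions: the SUBSTITUTIONS table and the to_sjis helper are
hypothetical names, the choice of U+5409 as a stand-in for U+20BB7 is only a
guess at what the user meant, and Python's shift_jis codec is not guaranteed
to agree with PostgreSQL's SJIS conversion table in every case.

```python
# Hypothetical substitution table: characters known to be missing from
# Shift_JIS, mapped to a replacement that the target encoding can hold.
# U+5409 as a stand-in for U+20BB7 is an assumption, not taken from the data.
SUBSTITUTIONS = {"\U00020BB7": "\u5409"}


def to_sjis(text: str) -> bytes:
    """Apply the known substitutions, then encode to Shift_JIS.

    Anything still unmappable after the substitutions is replaced with "?"
    by the codec instead of raising an error.
    """
    cleaned = text.translate(str.maketrans(SUBSTITUTIONS))
    return cleaned.encode("shift_jis", errors="replace")


if __name__ == "__main__":
    # The offending character followed by two ordinary kanji.
    print(to_sjis("\U00020BB7\u91ce\u5bb6"))
```

Whether a silent "?" is acceptable for the remaining characters depends on the
report; raising an error instead only needs errors="strict".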

Yours,
Laurenz Albe



Re: [GENERAL] Postgres Encoding conversion problem

2008-04-23 Thread Michael Fuhr
On Tue, Apr 22, 2008 at 10:37:59AM +0200, Albe Laurenz wrote:
> Clemens Schwaighofer wrote:
> > I sometimes have a problem with conversion of encodings eg from UTF-8
> > to ShiftJIS:
> >
> > ERROR:  character 0xf0a0aeb7 of encoding "UTF8" has no
> > equivalent in "SJIS"
> >
> > I have no idea what character this is, I cannot view it in my
> > browser, etc.
> 
> It translates to Unicode 10BB7, which is not defined.

Actually it's 20BB7.

http://www.unicode.org/cgi-bin/GetUnihanData.pl?codepoint=20BB7
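
For anyone who wants to check the arithmetic, here is a minimal Python sketch
of how the four bytes from the error message decode to a code point. The
variable names are illustrative and not from the thread.

```python
# The byte sequence reported in the PostgreSQL error message.
raw = bytes.fromhex("f0a0aeb7")

# Let the UTF-8 decoder do the work.
code_point = ord(raw.decode("utf-8"))
print(f"U+{code_point:04X}")  # prints U+20BB7

# The same result by hand: a 4-byte UTF-8 sequence carries 3 payload bits
# in the lead byte and 6 payload bits in each continuation byte.
b0, b1, b2, b3 = raw
manual = (
    ((b0 & 0x07) << 18)
    | ((b1 & 0x3F) << 12)
    | ((b2 & 0x3F) << 6)
    | (b3 & 0x3F)
)
print(f"U+{manual:04X}")  # prints U+20BB7
```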

-- 
Michael Fuhr



Re: [GENERAL] Postgres Encoding conversion problem

2008-04-22 Thread Clemens Schwaighofer
On 04/22/2008 07:30 PM, Albe Laurenz wrote:
> Clemens Schwaighofer wrote:
>>>> I sometimes have a problem with conversion of encodings, e.g. from UTF-8
>>>> to ShiftJIS:
>>>>
>>>> ERROR:  character 0xf0a0aeb7 of encoding "UTF8" has no
>>>> equivalent in "SJIS"
>>>>
>>>> I have no idea what character this is, I cannot view it in my
>>>> browser, etc.
>>> It translates to Unicode 10BB7, which is not defined.
>>> I guess that is not intended; can you guess what the character(s) should be?
>> To be honest, no idea. It's some Chinese character; I have no idea how the
>> user input this, because this is a Japanese page.
>>
>> I actually found the character, but only my Mac OS X can show it. It
>> looks similar to a Japanese character used for a name, but how the
>> Chinese one got selected is a mystery to me ...
> 
> Are you sure that your Mac OS X computer interprets the character as
> UTF-8?

I cannot be sure of that; I just searched through a page that has a
complete list. OS X can render it, Linux cannot, and I have not tried Windows.

-- 
[ Clemens Schwaighofer  -=:~ ]
[ IT Engineer/Manager, TEQUILA\ Japan IT Group   ]
[6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ]
[ Tel: +81-(0)3-3545-7703  Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.co.jp   ]



Re: [GENERAL] Postgres Encoding conversion problem

2008-04-22 Thread Albe Laurenz
Clemens Schwaighofer wrote:
>>> I sometimes have a problem with conversion of encodings eg from UTF-8
>>> to ShiftJIS:
>>>
>>> ERROR:  character 0xf0a0aeb7 of encoding "UTF8" has no
>>> equivalent in "SJIS"
>>>
>>> I have no idea what character this is, I cannot view it in my
>>> browser, etc.
>> 
>> It translates to Unicode 10BB7, which is not defined.
>> I guess that is not intended; can you guess what the character(s) should be?
> 
> To be honest, no idea. It's some Chinese character; I have no idea how the
> user input this, because this is a Japanese page.
> 
> I actually found the character, but only my Mac OS X can show it. It
> looks similar to a Japanese character used for a name, but how the
> Chinese one got selected is a mystery to me ...

Are you sure that your Mac OS X computer interprets the character as
UTF-8?

Yours,
Laurenz Albe



Re: [GENERAL] Postgres Encoding conversion problem

2008-04-22 Thread Clemens Schwaighofer
On 04/22/2008 05:37 PM, Albe Laurenz wrote:
> Clemens Schwaighofer wrote:
>> I sometimes have a problem with conversion of encodings eg from UTF-8
>> to ShiftJIS:
>>
>> ERROR:  character 0xf0a0aeb7 of encoding "UTF8" has no
>> equivalent in "SJIS"
>>
>> I have no idea what character this is, I cannot view it in my
>> browser, etc.
> 
> It translates to Unicode 10BB7, which is not defined.
> I guess that is not intended; can you guess what the character(s) should be?

To be honest, no idea. It's some Chinese character; I have no idea how the
user input this, because this is a Japanese page.

I actually found the character, but only my Mac OS X can show it. It
looks similar to a Japanese character used for a name, but how the
Chinese one got selected is a mystery to me ...

>> If I run the conversion through PHP with mb_convert_encoding it works,
>> perhaps it is ignoring the character.
>>
>> Is there a way to do a similar thing, like ignoring this character in
>> postgres too?
> 
> As far as I know, no.
> You'll have to fix the data before you import them.

Well, the web page and the data are in UTF-8, so I never see this issue
there. To catch it I would have to write a method that detects characters
that are illegal in Shift_JIS, and that's difficult.
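
A rough Python sketch of such a detection method, for illustration only. The
helper name find_unencodable is hypothetical, and the check relies on Python's
shift_jis codec, whose mapping may differ in places from the SJIS conversion
that PostgreSQL uses, so treat the result as a first filter rather than a
guarantee.

```python
def find_unencodable(text: str, encoding: str = "shift_jis"):
    """Return (index, character, code point label) for every character
    in text that the given encoding cannot represent."""
    problems = []
    for index, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, ch, f"U+{ord(ch):04X}"))
    return problems


if __name__ == "__main__":
    # The character from the error message, embedded in plain ASCII.
    for index, ch, label in find_unencodable("abc\U00020BB7def"):
        print(index, label)
```

Run over the rows before the CSV export, this at least shows which records
need attention without aborting the whole conversion.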

The reporting is only done in CSV ... so I am not sure it is worth
spending too much time on this.

thanks for the tip.

-- 
[ Clemens Schwaighofer  -=:~ ]
[ IT Engineer/Manager, TEQUILA\ Japan IT Group   ]
[6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ]
[ Tel: +81-(0)3-3545-7703  Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.co.jp   ]



Re: [GENERAL] Postgres Encoding conversion problem

2008-04-22 Thread Albe Laurenz
Clemens Schwaighofer wrote:
> I sometimes have a problem with conversion of encodings eg from UTF-8
> to ShiftJIS:
>
> ERROR:  character 0xf0a0aeb7 of encoding "UTF8" has no
> equivalent in "SJIS"
>
> I have no idea what character this is, I cannot view it in my
> browser, etc.

It translates to Unicode 10BB7, which is not defined.
I guess that is not intended; can you guess what the character(s) should be?

> If I run the conversion through PHP with mb_convert_encoding it works,
> perhaps it is ignoring the character.
>
> Is there a way to do a similar thing, like ignoring this character in
> postgres too?

As far as I know, no.
You'll have to fix the data before you import them.

Yours,
Laurenz Albe



[GENERAL] Postgres Encoding conversion problem

2008-04-21 Thread Clemens Schwaighofer
Hi,

I sometimes have a problem with conversion of encodings, e.g. from UTF-8
to ShiftJIS:

ERROR:  character 0xf0a0aeb7 of encoding "UTF8" has no equivalent in "SJIS"

I have no idea what character this is, I cannot view it in my browser, etc.

If I run the conversion through PHP with mb_convert_encoding it works,
perhaps it is ignoring the character.

Is there a way to do a similar thing, like ignoring this character in
postgres too?

-- 
[ Clemens Schwaighofer  -=:~ ]
[ IT Engineer/Manager, TEQUILA\ Japan IT Group   ]
[6-17-2 Ginza Chuo-ku, Tokyo 104-8167, JAPAN ]
[ Tel: +81-(0)3-3545-7703  Fax: +81-(0)3-3545-7343 ]
[ http://www.tequila.co.jp   ]
