[GENERAL] Re: A question multibye

Tatsuo Ishii Sun, 08 Jul 2001 01:24:39 -0700
From: "Siamack Jabbarzadeh" <[EMAIL PROTECTED]>
Subject: A question multibye
Date: Fri, 06 Jul 2001 18:56:53 
Message-ID: <[EMAIL PROTECTED]>

> Dear Sir/Madam:
> I have some questions on multibyte languages and I hope you can help
> me. First, I was wondering if there is a table (like the ASCII table) for
> multibyte languages?

I am not sure what you want, but PostgreSQL allows a default encoding
to be set per database, not per table.

> Second, assume we have an input made up of some Japanese letters mixed
> with special characters like & and % (which have ASCII values). Now I would
> like to write a parser that takes & and % out and leaves only the Japanese
> letters. Since & and % are ASCII and the letters are
> multibyte, I cannot do the parsing by comparing byte by byte (as we do in
> normal ASCII). How can I do that? Do % and & have multibyte values in
> multibyte systems? If yes, how can I get those values? Could you kindly (if
> you have some solutions to the problem) give me some hints on that?

Japanese has several encodings. I recommend that you use EUC-JP
(Extended Unix Code for Japanese). With EUC-JP, it is very easy to
distinguish Japanese from ASCII even when parsing byte by byte. If a
byte is greater than 0x7f, it is part of a Japanese character;
otherwise it is ASCII.
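
As a rough illustration of the byte test above, here is a minimal
sketch in Python. It assumes the input is already EUC-JP encoded
bytes; the function names strip_ascii_chars and keep_multibyte_only
are made up for this example and are not part of PostgreSQL or any
library. Because every byte of a multibyte EUC-JP character is above
0x7f, an ASCII "&" or "%" can never appear inside a Japanese
character, so filtering byte by byte is safe.

```python
def strip_ascii_chars(data: bytes, unwanted: bytes = b"&%") -> bytes:
    """Remove the given ASCII bytes from EUC-JP encoded data.

    Hypothetical helper: bytes above 0x7f belong to multibyte
    characters and are always kept; ASCII bytes are kept unless
    they are listed in `unwanted`.
    """
    return bytes(b for b in data if b > 0x7F or b not in unwanted)


def keep_multibyte_only(data: bytes) -> bytes:
    """Drop every ASCII byte, keeping only bytes above 0x7f."""
    return bytes(b for b in data if b > 0x7F)


if __name__ == "__main__":
    # Assumption: the text arrives as EUC-JP; Python's "euc_jp" codec
    # is used here only to build and display the sample data.
    raw = "日本&語%".encode("euc_jp")
    cleaned = strip_ascii_chars(raw)
    print(cleaned.decode("euc_jp"))
```

This only works for encodings with the same property as EUC-JP. In
Shift_JIS, for example, the second byte of a two-byte character can
fall in the ASCII range, so the same filter could corrupt the text.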

Anyway, I recommend that you study Japanese encodings first.

See:
ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
--
Tatsuo Ishii
