Re: [HACKERS] Proposal: Adding JIS X 0213 support
Tatsuo Ishii wrote:
>>>> Related to this, when are we going to get the Japanese po files in the
>>>> core distribution?
>>> No idea. In my understanding, current message translating system has
>>> serious problem if wrong locale and encoding is provided(has this
>>> issue been solved in 8.3?).
>> That's certainly true, and it's not solved. But how does keeping the
>> Japanese po files out of the distribution improve the matter?
> Keeping out po files until the problem is solved is just my opinion.

Regrettably, I am of the same opinion. Including the Japanese po files without some improvement would only cause unnecessary trouble.

--
Hiroki Kataoka <[EMAIL PROTECTED]>

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Proposal: Adding JIS X 0213 support
Hi Josh-san.

From: "Josh Berkus"
> Hiroshi,
>
>> We are doing the support including the trouble. It was thought that the
>> place of JPUG was preferable for the reasons why they were problems too
>> peculiar to Japan.
>
> Well, some of PostgreSQL's commercial distributors have been pretty
> surprised when they package PostgreSQL and find out that the main
> distribution has no Japanese support (I know because I get the confused
> emails).
>
> I've an open offer from the Sun i18N people to help with this, if they can
> coordinate with you.

Ah yes, certainly an offer from Sun of Japan. :-)
That support was done as volunteer work together with Honda-san. It seemed wonderful that Solaris would help promote the spread of PostgreSQL!

The resources are offered at the place that JPUG has opened to the public. I think Sun of Japan picked up the know-how there and avoided the problem. Although we are volunteers, we make every effort to support those resources, and the satisfactory results prove it!

Maybe the problem is release speed. However, that might be the same even if the files were put in the official place...

Regards,
Hiroshi Saito
Re: [HACKERS] Proposal: Adding JIS X 0213 support
Hiroshi,

> We are doing the support including the trouble. It was thought that the
> place of JPUG was preferable for the reasons why they were problems too
> peculiar to Japan.

Well, some of PostgreSQL's commercial distributors have been pretty surprised when they package PostgreSQL and find out that the main distribution has no Japanese support (I know because I get the confused emails).

I've an open offer from the Sun i18N people to help with this, if they can coordinate with you.

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco
Re: [HACKERS] Proposal: Adding JIS X 0213 support
Hi.

- Original Message -
From: "Tom Lane" <[EMAIL PROTECTED]>

> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
>>> Related to this, when are we going to get the Japanese po files in the
>>> core distribution?
>
>> No idea. In my understanding, current message translating system has
>> serious problem if wrong locale and encoding is provided(has this
>> issue been solved in 8.3?).
>
> That's certainly true, and it's not solved. But how does keeping the
> Japanese po files out of the distribution improve the matter?

We handle the support ourselves, including that trouble. Hosting the files at JPUG was thought preferable because the problems are so specific to Japan. The support system run by Honda-san, the representative of the document team, has worked well enough up to now.

However, we do not refuse to have the files distributed with the main distribution. The document team should discuss it again, because doing so would require a stronger effort to match the release schedule of the main distribution.

Anyway, please wait a while for the response from Honda-san.

Regards,
Hiroshi Saito
Re: [HACKERS] Proposal: Adding JIS X 0213 support
> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> >> Related to this, when are we going to get the Japanese po files in the
> >> core distribution?
>
> > No idea. In my understanding, current message translating system has
> > serious problem if wrong locale and encoding is provided(has this
> > issue been solved in 8.3?).
>
> That's certainly true, and it's not solved. But how does keeping the
> Japanese po files out of the distribution improve the matter?

Keeping the po files out until the problem is solved is just my opinion. If JPUG (or the maintainers and volunteers of the Japanese po files) decides to include them in the PostgreSQL distribution, I have no right to prevent it.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
Re: [HACKERS] Proposal: Adding JIS X 0213 support
Tatsuo Ishii <[EMAIL PROTECTED]> writes:
>> Related to this, when are we going to get the Japanese po files in the
>> core distribution?

> No idea. In my understanding, current message translating system has
> serious problem if wrong locale and encoding is provided(has this
> issue been solved in 8.3?).

That's certainly true, and it's not solved. But how does keeping the Japanese po files out of the distribution improve the matter?

regards, tom lane
Re: [HACKERS] Proposal: Adding JIS X 0213 support
> Tatsuo,
>
> Related to this, when are we going to get the Japanese po files in the
> core distribution?

No idea. As I understand it, the current message translation system has a serious problem if the wrong locale and encoding are provided (has this issue been solved in 8.3?). AFAIK Hiroki Kataoka, chairman of JPUG, has the same impression.

The Japanese po files are managed by JPUG, so it would be better to ask him or someone else from JPUG who is responsible for them.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
Re: [HACKERS] Proposal: Adding JIS X 0213 support
Tatsuo,

Related to this, when are we going to get the Japanese po files in the core distribution?

--Josh
Re: [HACKERS] Proposal: Adding JIS X 0213 support
> > Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> > >> I'm confused. If this is exactly the same as EUC_JP, why do we need
> > >> any new code at all?
> >
> > > I said *encoding schema" is same, not the contents (character set) is
> > > same. In another word, characters included in EUC_JP are not same as
> > > EUC_JIS_2004.
> >
> > I'm still confused. If the set of characters is different, then surely
> > we need at least a different UTF8<->EUC_JIS_2004 conversion function?
>
> Yes, exactly. I will come up with new conversions later.

I have committed changes to add JIS X 0213 along with conversions.

New encodings:

  EUC_JIS_2004:   JIS X 0213 encoded in EUC
  SHIFT_JIS_2004: JIS X 0213 encoded in Shift JIS (client-only encoding)

These encodings support the following character sets:

  ASCII, JIS X 0201 (single-byte "katakana"), JIS X 0213 plane 1, 2

New conversions:

  EUC_JIS_2004 --> UTF8:           euc_jis_2004_to_utf8
  UTF8 --> EUC_JIS_2004:           utf8_to_euc_jis_2004
  SHIFT_JIS_2004 --> UTF8:         shift_jis_2004_to_utf8
  UTF8 --> SHIFT_JIS_2004:         utf8_to_shift_jis_2004
  EUC_JIS_2004 --> SHIFT_JIS_2004: euc_jis_2004_to_shift_jis_2004
  SHIFT_JIS_2004 --> EUC_JIS_2004: shift_jis_2004_to_euc_jis_2004

To generate the conversion maps, I have created two perl scripts, UCS_to_SHIFT_JIS_2004.pl and UCS_to_EUC_JIS_2004.pl, which use sjis-0213-2004-std.txt and euc-jis-2004-std.txt as the source of the conversion specification. These files can be obtained freely from http://x0213.org.

Conversions to UTF-8 from EUC_JIS_2004 and SHIFT_JIS_2004 require support for UTF-8 "combined characters", i.e. a logical character that consists of two UTF-8 characters. To implement this, I have modified LocalToUtf() and UtfToLocal() by adding a new parameter: a "combined character map".

Docs changes and regression test changes are committed too.

Beware that I have updated the catalog version. Please do initdb.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
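The "combined character" case described in the commit message can be illustrated with a small sketch. It uses Python's built-in `euc_jis_2004` and `shift_jis_2004` codecs as a stand-in, which are an independent implementation and not PostgreSQL's LocalToUtf()/UtfToLocal() or the conversion functions listed above. The sample bytes (JIS X 0213 plane 1, row 4, cell 87) and all variable names are chosen here for illustration only.

```python
# Illustration only: Python's own codecs, not PostgreSQL's conversion routines.

# One logical character in EUC_JIS_2004: two bytes (plane 1, row 4, cell 87).
euc_bytes = b"\xa4\xf7"

# Decoding yields TWO Unicode code points: a base hiragana letter followed by
# a combining mark. This is the "combined character" situation.
text = euc_bytes.decode("euc_jis_2004")
code_points = [f"U+{ord(ch):04X}" for ch in text]

# In UTF-8 the same logical character is two encoded characters (3 bytes each).
utf8_bytes = text.encode("utf-8")

# Conversion in the other directions must fold the pair back into one code.
sjis_bytes = text.encode("shift_jis_2004")
round_trip = sjis_bytes.decode("shift_jis_2004").encode("euc_jis_2004")

print(code_points, len(utf8_bytes), round_trip == euc_bytes)
```

A converter that maps one local code to exactly one Unicode code point cannot represent such entries, which is presumably why a separate map for combined characters is needed.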
Re: [HACKERS] Proposal: Adding JIS X 0213 support
> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> >> I'm confused. If this is exactly the same as EUC_JP, why do we need
> >> any new code at all?
>
> > I said *encoding schema" is same, not the contents (character set) is
> > same. In another word, characters included in EUC_JP are not same as
> > EUC_JIS_2004.
>
> I'm still confused. If the set of characters is different, then surely
> we need at least a different UTF8<->EUC_JIS_2004 conversion function?

Yes, exactly. I will come up with new conversions later.

> After that I will develop conversion part(it will take several days).

--
Tatsuo Ishii
SRA OSS, Inc. Japan
Re: [HACKERS] Proposal: Adding JIS X 0213 support
Tatsuo Ishii <[EMAIL PROTECTED]> writes:
>> I'm confused. If this is exactly the same as EUC_JP, why do we need
>> any new code at all?

> I said *encoding schema" is same, not the contents (character set) is
> same. In another word, characters included in EUC_JP are not same as
> EUC_JIS_2004.

I'm still confused. If the set of characters is different, then surely we need at least a different UTF8<->EUC_JIS_2004 conversion function?

regards, tom lane
Re: [HACKERS] Proposal: Adding JIS X 0213 support
> Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> > I would like to propose adding new character set "JIS X
> > 0213"(http://en.wikipedia.org/wiki/JIS_X_0213).
> > ...
> > Note that since encoding schema of EUC_JIS_2004 is exactly identical
> > to EUC_JP, we can reuse existing encoding routines defined in
> > utls/mb/*.c.
>
> I'm confused. If this is exactly the same as EUC_JP, why do we need
> any new code at all? Why not just a documentation addition saying
> they are the same thing? Or maybe rename EUC_JP to reflect the new
> standard number (we've certainly renamed encodings before).

I said the *encoding scheme* is the same, not that the contents (the character set) are the same. In other words, the characters included in EUC_JP are not the same as those in EUC_JIS_2004.

Also, EUC_JIS_2004 is *not* a superset of EUC_JP. So we need to let EUC_JP and EUC_JIS_2004 coexist.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
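The distinction between a shared encoding scheme and differing character sets can be sketched as follows. The example again relies on Python's built-in `euc_jp` and `euc_jis_2004` codecs as a stand-in for the PostgreSQL encodings of the same names, so it shows the general idea rather than PostgreSQL behaviour. The helper `can_encode` and the sample characters are hypothetical choices made for this illustration, and only one direction of the difference is shown (a character that EUC_JIS_2004 has and EUC_JP lacks).

```python
# Illustration only: Python's codecs stand in for the PostgreSQL encodings.

def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# A character present in both character sets gets the same bytes in both
# encodings, because the byte-level scheme (EUC) is shared.
hiragana_a = "\u3042"
same_bytes = hiragana_a.encode("euc_jp") == hiragana_a.encode("euc_jis_2004")

# A kanji outside the Basic Multilingual Plane that JIS X 0213:2004 includes.
# The character sets behind EUC_JP contain no such characters.
jisx0213_only = "\U00020b9f"
in_2004 = can_encode(jisx0213_only, "euc_jis_2004")
in_euc_jp = can_encode(jisx0213_only, "euc_jp")

print(same_bytes, in_2004, in_euc_jp)
```

Because the same byte layout carries different repertoires, byte-level routines can be shared while the mapping tables to and from Unicode must differ.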
Re: [HACKERS] Proposal: Adding JIS X 0213 support
Tatsuo Ishii <[EMAIL PROTECTED]> writes:
> I would like to propose adding new character set "JIS X
> 0213"(http://en.wikipedia.org/wiki/JIS_X_0213).
> ...
> Note that since encoding schema of EUC_JIS_2004 is exactly identical
> to EUC_JP, we can reuse existing encoding routines defined in
> utls/mb/*.c.

I'm confused. If this is exactly the same as EUC_JP, why do we need any new code at all? Why not just a documentation addition saying they are the same thing? Or maybe rename EUC_JP to reflect the new standard number (we've certainly renamed encodings before).

regards, tom lane
[HACKERS] Proposal: Adding JIS X 0213 support
Hi,

I would like to propose adding a new character set, "JIS X 0213" (http://en.wikipedia.org/wiki/JIS_X_0213). JIS X 0213 is a relatively new Japanese government standard (defined in 2000, revised in 2004) and is becoming important for Japanese users. Moreover, some commercial OSs, including Windows Vista, support JIS X 0213 (some open source OSs support it too, of course). So I believe supporting JIS X 0213 in the upcoming 8.3 will be useful for Japanese users and will help spread PostgreSQL further.

Since JIS X 0213 is a character set, we need to add encodings that support it. Here is the list of additional encodings (the specifications are already published by the government).

1) EUC-JIS-2004

   proposed encoding name: EUC_JIS_2004
   including the following character sets:
   - ASCII
   - JIS X 0213 plane 1
   - JIS X 0201 "katakana"
   - JIS X 0213 plane 2

   Note that since the encoding scheme of EUC_JIS_2004 is exactly identical to that of EUC_JP, we can reuse the existing encoding routines defined in utils/mb/*.c.

2) Shift-JIS-2004

   proposed encoding name: SHIFT_JIS_2004
   including the following character sets (same as EUC-JIS-2004):
   - ASCII
   - JIS X 0213 plane 1
   - JIS X 0201 "katakana"
   - JIS X 0213 plane 2

   Note that this is a client-only encoding, for the same reason as SJIS.

   Note that since the encoding scheme of SHIFT_JIS_2004 is exactly identical to that of SJIS, we can reuse the existing encoding routines defined in utils/mb/*.c.

3) UTF-8

   This is already supported by recent versions of PostgreSQL, and no additional work is required.

o About encoding conversion

  I will add encoding conversions among EUC_JIS_2004, SHIFT_JIS_2004 and UTF-8.

Included are patches against CVS head which should illustrate what I'm proposing in detail. If there's no objection, I will commit them along with documentation changes and regression test updates, and bump the catalog version.

After that I will develop the conversion part (it will take several days).

Comments and suggestions are welcome.

--
Tatsuo Ishii
SRA OSS, Inc. Japan

jisx0213.patch
Description: Binary data
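The claim that existing EUC routines can be reused rests on the byte layout alone. A minimal sketch, assuming the usual EUC layout of four code sets selected by the lead byte, might look like the following. The names `euc_char_len` and `split_euc` are hypothetical and are not the functions in utils/mb/*.c. The routine never consults a character table, so it behaves the same whether the bytes are labelled EUC_JP or EUC_JIS_2004.

```python
# Sketch of a character-length routine for the EUC byte layout.
# Hypothetical helper names; not PostgreSQL's actual code.

SS2 = 0x8E  # single shift 2: one byte of JIS X 0201 katakana follows
SS3 = 0x8F  # single shift 3: two more bytes follow (third code set)

def euc_char_len(lead: int) -> int:
    """Return the byte length of the character that starts with this lead byte."""
    if lead == SS2:
        return 2
    if lead == SS3:
        return 3
    if lead & 0x80:
        return 2  # two-byte code set (plane 1 in EUC_JIS_2004)
    return 1      # ASCII

def split_euc(data: bytes) -> list:
    """Split an EUC byte string into per-character byte sequences."""
    chars = []
    i = 0
    while i < len(data):
        n = euc_char_len(data[i])
        chars.append(data[i:i + n])
        i += n
    return chars

# ASCII, a two-byte code, an SS2 katakana code, and a three-byte SS3 code.
sample = b"a" + b"\xa4\xa2" + b"\x8e\xb1" + b"\x8f\xa1\xa1"
pieces = split_euc(sample)
print([len(p) for p in pieces])
```

Which character a given byte sequence denotes is decided only by the conversion maps, and that is the part that differs between the two encodings.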