Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-20 Thread Kyotaro HORIGUCHI
Hello, At Tue, 13 Sep 2016 11:44:01 +0300, Heikki Linnakangas wrote in <7ff67a45-a53e-4d38-e25d-3a121afea...@iki.fi> > On 09/08/2016 09:35 AM, Kyotaro HORIGUCHI wrote: > > Returning in UTF-8 bloats the result string by about 1.5 times so > > it doesn't seem to make sense comparing with it. But i

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-13 Thread Heikki Linnakangas
On 09/08/2016 09:35 AM, Kyotaro HORIGUCHI wrote: Returning in UTF-8 bloats the result string by about 1.5 times so it doesn't seem to make sense comparing with it. But it takes real = 47.35s. Nice! I was hoping that this would also make the binaries smaller. A few dozen kB of storage is perha
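The "about 1.5 times" figure quoted above follows from byte widths: kana and kanji typically take three bytes in UTF-8 but two in Shift_JIS. A minimal sketch using Python's built-in codecs; the sample string is an arbitrary choice and does not come from the thread:

```python
# Sample text is an illustrative assumption; it contains only kana and kanji.
text = "日本語のテキスト"

utf8_len = len(text.encode("utf-8"))      # 3 bytes per character for this text
sjis_len = len(text.encode("shift_jis"))  # 2 bytes per character for this text
ratio = utf8_len / sjis_len

print(utf8_len, sjis_len, ratio)
```

Text mixed with ASCII gives a smaller ratio, since ASCII characters take one byte in both encodings.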

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-12 Thread Kyotaro HORIGUCHI
At Thu, 8 Sep 2016 07:09:51 +, "Tsunakawa, Takayuki" wrote in <0A3221C70F24FB45833433255569204D1F5E7D4A@G01JPEXMBYT05> > From: pgsql-hackers-ow...@postgresql.org > > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro > > HORIGUCHI > > > > $ time psql postgres -c 'select t.a fr

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-08 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro > HORIGUCHI > > $ time psql postgres -c 'select t.a from t, generate_series(0, )' > > /dev/null > > real 0m22.696s > user 0m16.991s > sys 0m0.182s> > > Using binsearch the result f

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-07 Thread Kyotaro HORIGUCHI
Hello, At Wed, 07 Sep 2016 16:13:04 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160907.161304.112519789.horiguchi.kyot...@lab.ntt.co.jp> > > Implementing radix tree code, then redefining the format of mapping table > > > to support radix tree, then modifying mapping generator scri
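For readers unfamiliar with the idea, a radix tree (trie) lookup keyed on successive UTF-8 bytes might look like the sketch below. It is only an illustration of the technique: the nested-dict layout, the function names `build_trie` and `lookup`, and the hiragana-only table are assumptions, and none of this reflects the mapping-table format proposed in the thread.

```python
def build_trie(chars, target="shift_jis"):
    """Build a byte-wise trie: UTF-8 bytes in, target-encoding bytes out."""
    root = {}
    for ch in chars:
        node = root
        key = ch.encode("utf-8")
        for b in key[:-1]:
            node = node.setdefault(b, {})   # one tree level per input byte
        node[key[-1]] = ch.encode(target)   # leaf holds the converted bytes
    return root

def lookup(root, utf8_bytes):
    """Walk one level per input byte; return None if the path is missing."""
    node = root
    for b in utf8_bytes:
        if not isinstance(node, dict) or b not in node:
            return None
        node = node[b]
    return node if isinstance(node, bytes) else None

# Hypothetical small table: hiragana U+3041..U+3093 only.
hiragana = [chr(cp) for cp in range(0x3041, 0x3094)]
trie = build_trie(hiragana)
result = lookup(trie, "あ".encode("utf-8"))
print(result.hex())
```

The number of steps per lookup is bounded by the length of the input character in bytes rather than by the size of the table.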

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-07 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro > Thanks, by the way, there's another issue related to SJIS conversion. MS932 > has several characters that have multiple code points. By converting texts > in this encoding to and from Unic
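The round-trip issue mentioned above can be observed with Python's built-in `cp932` codec, which is assumed here to stand in for MS932. The sketch scans double-byte codes and collects those whose decoded character re-encodes to a different byte sequence; the helper name `non_roundtrip_codes` is hypothetical.

```python
def non_roundtrip_codes(encoding="cp932"):
    """Find double-byte codes whose decoded character re-encodes differently."""
    found = []
    for lead in range(0x81, 0xFD):
        for trail in range(0x40, 0xFD):
            raw = bytes([lead, trail])
            try:
                ch = raw.decode(encoding)
                back = ch.encode(encoding)
            except UnicodeError:
                continue  # not a mapped double-byte sequence
            # len(ch) == 1 skips pairs that decoded as two single-byte characters
            if len(ch) == 1 and back != raw:
                found.append((raw, ch, back))
    return found

duplicates = non_roundtrip_codes()
print(len(duplicates))
for raw, ch, back in duplicates[:5]:
    print(raw.hex(), f"U+{ord(ch):04X}", back.hex())
```

Each reported row is a byte sequence that would be silently rewritten by a conversion to Unicode and back.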

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-07 Thread Kyotaro HORIGUCHI
Hello, At Tue, 6 Sep 2016 03:43:46 +, "Tsunakawa, Takayuki" wrote in <0A3221C70F24FB45833433255569204D1F5E66CE@G01JPEXMBYT05> > > From: pgsql-hackers-ow...@postgresql.org > > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro > > HORIGUCHI > Implementing radix tree code, then

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
> From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro > HORIGUCHI Implementing radix tree code, then redefining the format of mapping table > to support radix tree, then modifying mapping generator script are needed. > > If no one oppose to thi

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Kyotaro HORIGUCHI
Hello, At Mon, 5 Sep 2016 19:38:33 +0300, Heikki Linnakangas wrote in <529db688-72fc-1ca2-f898-b0b99e300...@iki.fi> > On 09/05/2016 05:47 PM, Tom Lane wrote: > > "Tsunakawa, Takayuki" writes: > >> Before digging into the problem, could you share your impression on > >> whether PostgreSQL can su

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tom Lane
"Tsunakawa, Takayuki" writes: > Using multibyte-functions like mb... to process characters would solve > the problem? Well, sure. The problem is (1) finding all the places that need that (I'd estimate dozens to hundreds of places in the core code, and then there's the question of extensions); (2

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Heikki > But one thing that would help a little, would be to optimize the UTF-8 > -> SJIS conversion. It uses a very generic routine, with a binary search > over a large array of mappings. I bet you
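As a rough picture of the "binary search over a large array of mappings" described above, the sketch below keeps a sorted list of UTF-8 codes and bisects it. The table contents (hiragana only), the names `build_table` and `convert`, and the use of Python's codecs to fill the table are assumptions for illustration; this is not PostgreSQL's conversion routine.

```python
import bisect

def build_table(chars, target="shift_jis"):
    """Return parallel sorted lists: UTF-8 codes (as ints) and target codes."""
    pairs = sorted(
        (int.from_bytes(ch.encode("utf-8"), "big"),
         int.from_bytes(ch.encode(target), "big"))
        for ch in chars
    )
    keys = [k for k, _ in pairs]
    values = [v for _, v in pairs]
    return keys, values

def convert(keys, values, utf8_code):
    """Binary-search the sorted key list; return None when unmapped."""
    i = bisect.bisect_left(keys, utf8_code)
    if i < len(keys) and keys[i] == utf8_code:
        return values[i]
    return None

# Hypothetical small table: hiragana U+3041..U+3093 only.
hiragana = [chr(cp) for cp in range(0x3041, 0x3094)]
keys, values = build_table(hiragana)
code = convert(keys, values, int.from_bytes("あ".encode("utf-8"), "big"))
print(hex(code))
```

Each lookup costs O(log n) comparisons in the table size, which is the per-character overhead a more specialized structure would aim to reduce.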

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
From: Tom Lane [mailto:t...@sss.pgh.pa.us] > "Tsunakawa, Takayuki" writes: > > Before digging into the problem, could you share your impression on > > whether PostgreSQL can support SJIS? Would it be hopeless? > > I think it's pretty much hopeless. Even if we were willing to make every > bit of

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Heikki Linnakangas
On 09/05/2016 05:47 PM, Tom Lane wrote: "Tsunakawa, Takayuki" writes: Before digging into the problem, could you share your impression on whether PostgreSQL can support SJIS? Would it be hopeless? I think it's pretty much hopeless. Agreed. But one thing that would help a little, would be

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tom Lane
"Tsunakawa, Takayuki" writes: > Before digging into the problem, could you share your impression on > whether PostgreSQL can support SJIS? Would it be hopeless? I think it's pretty much hopeless. Even if we were willing to make every bit of code that looks for '\' and other specific at-risk cha

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tatsuo Ishii
> Before digging into the problem, could you share your impression on whether > PostgreSQL can support SJIS? Would it be hopeless? Can't we find any > direction to go? Can I find relevant source code by searching specific words > like "ASCII", "HIGH_BIT", "\\" etc? For starters, you could gr

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
> From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tatsuo Ishii > > But what I'm wondering is why PostgreSQL doesn't support SJIS. Was there > any technical difficulty? Is there anything you are worried about if adding > SJIS? > > Yes, there's a

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tatsuo Ishii
> But what I'm wondering is why PostgreSQL doesn't support SJIS. Was there any > technical difficulty? Is there anything you are worried about if adding SJIS? Yes, there's a technical difficulty with backend code. In many places it is assumed that any string is "ASCII compatible", which means n
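The message above is cut off in this listing, but the underlying property of Shift_JIS is easy to demonstrate: the second byte of a double-byte character can fall in the ASCII range, for example 0x5C (backslash), whereas every byte of a multibyte UTF-8 character has the high bit set. A minimal sketch with Python's built-in codecs; the sample characters and the helper name are illustrative choices:

```python
# Sample characters are an illustrative assumption: both have 0x5C as the
# second byte in Shift_JIS.
samples = "表ソ"

def ascii_range_bytes_in_multibyte(text, encoding):
    """Return bytes < 0x80 that occur inside multibyte characters of `text`."""
    found = []
    for ch in text:
        encoded = ch.encode(encoding)
        if len(encoded) > 1:
            found.extend(b for b in encoded if b < 0x80)
    return found

sjis_hits = ascii_range_bytes_in_multibyte(samples, "shift_jis")
utf8_hits = ascii_range_bytes_in_multibyte(samples, "utf-8")

print([hex(b) for b in sjis_hits])
print(utf8_hits)
```

Code that scans a byte string for `\` without tracking character boundaries would find a match in the middle of such Shift_JIS characters.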

[HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
Hello, I'd like to propose adding SJIS as a database encoding. You may wonder why SJIS is still necessary in the world of Unicode. The purpose is to achieve comparable performance when migrating legacy database systems from other DBMSs with little modification of applications. Recently, w