Thank you for committing this.

At Mon, 13 Mar 2017 21:07:39 +0200, Heikki Linnakangas <hlinn...@iki.fi> wrote in <d5b70078-9f57-0f63-3462-1e564a577...@iki.fi>
> On 03/13/2017 08:53 PM, Tom Lane wrote:
> > Heikki Linnakangas <hlinn...@iki.fi> writes:
> >> It would be nice to run the map_checker tool one more time, though, to
> >> verify that the mappings match those from PostgreSQL 9.6.
> >
> > +1
> >
> >> Just to be sure, and after that the map checker can go to the dustbin.
> >
> > Hm, maybe we should keep it around for the next time somebody has a
> > bright idea in this area?
>
> The map checker compares old-style maps with the new radix maps. The
> next time 'round, we'll need something that compares the radix maps
> with the next great thing. Not sure how easy it would be to adapt.
>
> Hmm. A somewhat different approach might be more suitable for testing
> across versions, though. We could modify the perl scripts slightly to
> print out SQL statements that exercise every mapping. For every
> supported conversion, the SQL script could:
>
> 1. create a database in the source encoding.
> 2. set client_encoding='<target encoding>'
> 3. SELECT a string that contains every character in the source
> encoding.

There are many encodings that can be used as a client encoding but
not as a database encoding. And some encodings, such as UTF-8, have
several one-way conversions. If we do something like this, it would
look like the following.

1. Encoding test
1-1. create a database in UTF-8
1-2. set client_encoding='<source encoding>'
1-3. INSERT all characters defined in the source encoding.
1-4. set client_encoding='UTF-8'
1-5. SELECT a string that contains every character in UTF-8.
2. Decoding test

.... sucks!

I would rather use the convert() function. It can be a large
PL/pgSQL function or a series of "SELECT convert(...)"s. The
latter is doable on-the-fly (by not generating/storing the whole
script).

| -- Test for SJIS->UTF-8 conversion
| ...
| SELECT convert('\0000', 'SJIS', 'UTF-8'); -- results in error
| ...
| SELECT convert('\897e', 'SJIS', 'UTF-8');

> You could then run those SQL statements against old and new server
> version, and verify that you get the same results.

Including the result files in the repository would make this easy
but would bloat it unacceptably. Perhaps put a
mb/Unicode/README.sanity_check?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
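
P.S. As a rough illustration of the on-the-fly series of
"SELECT convert(...)"s mentioned above, something like the sketch
below could emit the statements without storing the whole script.
It is only a sketch under assumptions: the function names
(convert_statements, two_byte_codes) are made up for this mail, the
code range is arbitrary, and a real script would take the byte
sequences from the mapping sources rather than from a plain numeric
range. It writes the argument in bytea hex format ('\x897e') instead
of the shorthand used in the example above.

```python
from typing import Iterable, Iterator


def convert_statements(codes: Iterable[bytes], src: str, dest: str) -> Iterator[str]:
    """Yield a header comment, then one SELECT convert(...) per byte sequence."""
    yield f"-- Test for {src}->{dest} conversion"
    for code in codes:
        # '\x..' is the hex input format for a bytea literal in PostgreSQL.
        yield f"SELECT convert('\\x{code.hex()}', '{src}', '{dest}');"


def two_byte_codes(first: int, last: int) -> Iterator[bytes]:
    """Hypothetical helper: every 2-byte value in [first, last], inclusive."""
    for value in range(first, last + 1):
        yield value.to_bytes(2, "big")


if __name__ == "__main__":
    # Statements are produced lazily, so they can be piped straight into psql.
    for line in convert_statements(two_byte_codes(0x897E, 0x8980), "SJIS", "UTF-8"):
        print(line)
```

The output could be piped into psql against the old and the new
server and the two result streams compared, so that no result file
needs to live in the repository.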