Re: [HACKERS] Unicode mapping scripts cleanup

Peter Eisentraut Tue, 15 Sep 2015 17:36:07 -0700

On 9/1/15 7:27 PM, Tatsuo Ishii wrote:
>> On Tue, Sep 1, 2015 at 5:13 AM, Peter Eisentraut <[email protected]> wrote:
>>>   So apparently, the
>>> CJK to Unicode mappings are still evolving and should be updated
>>> occasionally.  Next steps would be to commit some or all of these
>>> differences after additional verification, and then update the scripts
>>> to use whatever the non-obsolete mapping sources are supposed to be.
>>
>> Would that pose a problem for databases which have data in them
>> already using the old mappings?
> 
> I think so. We must be very careful updating the maps. Adding new
> mapping data would cause fewer problems, but replacing existing mappings
> will definitely be a big problem for users.


Note that I'm not actually proposing to change the mappings, I just want
to get the scripts into working order, to put us into a position to
consider changes if necessary.

That said, I'm not sure what the problem with changes would be.  The
data in the databases doesn't change.  You just see different data
coming out.  It is in the nature of encoding conversion that you don't
get the original data, but an approximation.  Then again, I don't have
any knowledge about how to handle such changes.  But the fact that the
standards bodies are still making changes indicates that such changes
are to be expected and should be handled.  I think this is similar to
time zone changes, and also similar in different ways to collation changes.
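
To make the point about stored data concrete, here is a minimal sketch in
Python. The two tables are hypothetical stand-ins for an old and a new revision
of a mapping file, with values modeled on the familiar wave dash versus
fullwidth tilde discrepancy; the function name and table names are made up for
illustration and are not taken from the PostgreSQL scripts.

```python
# The bytes as stored in the database. Nothing below modifies them.
STORED = b"\x81\x60"

# Hypothetical mapping table revisions (illustrative values only).
OLD_MAP = {0x8160: 0x301C}  # assumed older revision: U+301C WAVE DASH
NEW_MAP = {0x8160: 0xFF5E}  # assumed newer revision: U+FF5E FULLWIDTH TILDE


def to_unicode(data: bytes, table: dict) -> str:
    """Convert a sequence of two-byte codes to Unicode using the given table."""
    chars = []
    for i in range(0, len(data), 2):
        code = (data[i] << 8) | data[i + 1]  # combine the two bytes into one code
        chars.append(chr(table[code]))
    return "".join(chars)


old_text = to_unicode(STORED, OLD_MAP)
new_text = to_unicode(STORED, NEW_MAP)

# Same stored bytes, different characters coming out after conversion.
print(STORED.hex(), hex(ord(old_text)), hex(ord(new_text)))
```

The stored bytes are identical in both cases; only the result of the
conversion differs, which is the kind of change a client would observe after a
map update.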



