[firebird-support] Character sets and collations of columns

Tomasz Tyrakowski t.tyrakow...@sol-system.pl [firebird-support] Fri, 26 Aug 2016 04:59:31 -0700

Hi,

One of the legacy databases I happen to maintain has character sets and 
collations messed up. As far as I know, the database was created with 
default character set "NONE" instead of "WIN1250", which should have 
been set back then (and it's over 10 years old).
Then, some columns have been added with the character set option set 
explicitly to WIN1250 (alter table X add Y varchar(n) character set 
WIN1250), while the old columns still remained as "NONE".
The clients were windows apps and used WIN1250 when connecting to the 
database.
Later on, another maintainer changed the RDB$COLLATION_ID flags for some 
of the columns in RDB$RELATION_FIELDS to indicate WIN1250 (value 3 as 
far as I know, although I'm not entirely sure if this flag actually 
concerns character sets, or only collation order - maybe the maintainer 
was convinced he was clever, while in fact nothing relevant happened), 
and set the default character set of the database to WIN1250 (in a way 
unknown to me - maybe even via binary editing the GDB file and changing 
some byte-flags inside).
So now the DB is kind of a mess (despite reporting dutifully WIN1250 as 
the default character set), with some columns created when the DB had 
"NONE" character set, some other create with character set WIN1250 set 
explicitly upon creation, yet another created without explicit character 
set but when the database has already had the "WIN1250" default 
character set, and possibly some columns, for which the RDB$COLLATION_ID 
was set by hand.
So my question is (in fact I'm not even sure what the right question is 
:( ): assuming the strings in the database are in fact WIN1250-encoded 
(all the clients used WIN1250), is setting the RDB$COLLATION_ID on all 
text columns a good idea and would actually make any difference?
All strange quirks that happen from time to time with this database are 
related to national characters. Sometimes a sort doesn't work as 
expected, sometimes a client function (like strtoupper in PHP) doesn't 
work on data from some of the columns (but works for other data). Is 
there a chance that setting the collation flag might help? Or is there 
another way to set it right?
I know the "right" right thing to do (pun intended) is to create a new 
DB and pump the data, but it has about 400 interconnected tables and 
over 60GB of size, so if there is another way, I would be grateful to 
hear about it first.


Thanks in advance and sorry for this lengthy story.

regards
Tomasz

-- 
__--==============================--__
__--==     Tomasz Tyrakowski    ==--__
__--==        SOL-SYSTEM        ==--__
__--== http://www.sol-system.pl ==--__
__--==============================--__

[firebird-support] Character sets and collations of columns

Reply via email to