Thomas Lockhart writes:

> An aside: I was thinking about this some, from the PoV of using our
> existing type system to handle this (as you might remember, this is an
> inclination I've had for quite a while). I think that most things line
> up fairly well to allow this (and having transaction-enabled features
> may require it), but do notice that the SQL feature of allowing a
> different character set for every column *name* does not map
> particularly well to our underlying structures.
The more I think about it, the more I come to the conclusion that the
SQL framework for "character sets" is both bogus and a red herring.
(And it begins with figuring out exactly what a character set is, as
opposed to a form-of-use, a.k.a.(?) encoding, but let's ignore that.)

The ability to store each column value in a different encoding sounds
interesting, because it allows you to create tables such as

    product_id | product_name_en | product_name_kr | product_name_jp

but you might as well create a table such as

    product_id | lang | product_name

with product_name in Unicode, and have a more extensible application
that way, too.

I think it's fine to have the encoding fixed for the entire database.
It sure makes coding easier.  If you want to be international, you use
Unicode.  If not, you can "optimize" your database by using a more
efficient encoding.  In fact, I think we should consider making UTF-8
the default encoding sometime.

The real issue is the collation.  But the collation is a small subset
of the whole locale/character-set gobbledygook.  Standardized collation
rules in standardized forms exist.  Finding/creating routines to
interpret and apply them should be the focus.  SQL's notion of
funneling the decision of which collation rule to apply through the
character sets is bogus.  It's impossible to pick a default collation
rule for many character sets without applying bias.

-- 
Peter Eisentraut   [EMAIL PROTECTED]
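[Editor's illustration: the wide-table vs. normalized-table contrast above can be sketched concretely. This is a minimal example using SQLite through Python's sqlite3 module rather than PostgreSQL; the table layout (product_id, lang, product_name) comes from the post, while the rows and the connection setup are invented for illustration.]

```python
import sqlite3

# One row per (product, language), with product_name stored as Unicode,
# instead of one product_name_xx column per language.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_name (product_id INTEGER, lang TEXT, product_name TEXT)"
)
rows = [
    (1, "en", "screwdriver"),
    (1, "jp", "ドライバー"),
    (1, "kr", "드라이버"),
]
conn.executemany("INSERT INTO product_name VALUES (?, ?, ?)", rows)

# Supporting a new language is an INSERT, not an ALTER TABLE:
conn.execute(
    "INSERT INTO product_name VALUES (?, ?, ?)", (1, "de", "Schraubendreher")
)

names = dict(
    conn.execute(
        "SELECT lang, product_name FROM product_name WHERE product_id = 1"
    )
)
print(names["de"])  # -> Schraubendreher
```

With the wide-table design, each added language changes the schema of every table that carries translated text; with the normalized design it is only new data.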
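[Editor's illustration: the point that collation is separable from the encoding can also be shown directly. Once everything is stored as Unicode, choosing a collation is just choosing a sort key. The sketch below is deliberately crude — real collation rules, such as the Unicode Collation Algorithm, are far more involved — and the word list is invented:]

```python
import unicodedata

words = ["Zebra", "apple", "émigré"]

# Raw code-point order: uppercase sorts before lowercase, and accented
# letters sort after all unaccented ASCII.
print(sorted(words))  # -> ['Zebra', 'apple', 'émigré']

def collation_key(s):
    # A very crude collation key: case-fold, then strip combining marks
    # after NFD decomposition.  A real collator would use tailored,
    # multi-level weights.
    decomposed = unicodedata.normalize("NFD", s.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(sorted(words, key=collation_key))  # -> ['apple', 'émigré', 'Zebra']
```

The stored bytes never change between the two sorts; only the comparison rule does — which is why the choice of collation need not be funneled through the character set.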