Re: Collation feature discussion

Daniel John Debrunner Mon, 26 Mar 2007 12:49:54 -0800

Roy Lyseng wrote:



Daniel John Debrunner wrote:

Thus Derby could have two character sets:
- USER - UCS repertoire with default collation of UCS_BASIC or UNICODE depending on value of collation JDBC attribute at create database time
 - SYSTEM - UCS repertoire with default collation of UCS_BASIC
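
To illustrate why the choice of default collation matters, here is a minimal Java sketch of the two orderings. It is not Derby code. The class and method names are hypothetical, and using java.text.Collator with an English locale as a stand-in for a non-UCS_BASIC collation is an assumption made only for illustration.

```java
import java.text.Collator;
import java.util.Locale;

public class CollationSketch {
    // UCS_BASIC-style ordering: compare strings code point by code point.
    static int ucsBasicCompare(String a, String b) {
        int[] x = a.codePoints().toArray();
        int[] y = b.codePoints().toArray();
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        // Equal up to the shorter length: the shorter string sorts first.
        return Integer.compare(x.length, y.length);
    }

    // Locale-sensitive ordering (assumption: stands in for a
    // linguistic collation such as the proposed UNICODE one).
    static int localeCompare(String a, String b, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.TERTIARY);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        // 'Z' (U+005A) precedes 'a' (U+0061) in code point order.
        System.out.println(ucsBasicCompare("Z", "a") < 0);
        // A linguistic collation for English places "a" before "Z".
        System.out.println(localeCompare("Z", "a", Locale.ENGLISH) > 0);
    }
}
```

The same pair of strings sorts in opposite order under the two rules, which is why a comparison between a system column and a user column needs an agreed collation.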

I think that you should carefully consider the implications of using two character sets. Among other things, it means that two strings with different character sets are not immediately comparable. And as far as I know, this applies to literals as well. What this means (I think) is that if columns in system tables are defined with character set SYSTEM, columns in user-defined tables are defined with character set USER, and literals are of type USER, then you cannot immediately compare literals with the character columns in the system tables.

Note I'm using "character set" as the SQL Standard defines it (section 4.2.7) and different character sets are comparable if they have a collation in common (section 4.2.2).

I think the SQL Standard also mandates multiple character sets if one wants different default collations. The expression CURRENT_USER has a mandated character set of SQL_IDENTIFIER, thus Derby must support that, and it is required that SQL identifiers have UCS_BASIC collation. Then a CREATE TABLE picks up its collation from its default *character set*, which comes from its schema (11.4 SR10b), so to have a default collation different from that of SQL_IDENTIFIER a different character set is needed.

Another option is to use one character set, but use different collations for different types of tables. You may define that character columns in system tables are created using collation UCS_BASIC, while all user tables are created with a user-defined collation. Because all columns are defined using the same character set, all columns and literals will be comparable.

Is that correct? So far the discussion has assumed columns with different implicit collations are not comparable, see 9.3 SR3e).

I don't think it's a goal to have columns in system tables be comparable with user columns, since if they have different collations the standard says they are not. [assuming no implementation of a <collate clause>]

A goal is to have the SQL queries used for JDBC metadata continue to work, which is what the current discussion around literals is about. The standard seems to define the character set of a string literal poorly.

Just remember that when comparing two strings with different defined collations, you need to consider the collation rules defined by the SQL standard.

Right, I think we are trying to understand those rules, how they apply to Derby and the proposed changes for DERBY-1478.


Thanks,
Dan.
