Re: Collation implementation WAS Re: Should COLLATION attribute related code go in BasicDatabase?

Mike Matrigali Thu, 15 Mar 2007 09:39:46 -0800

I think I am missing/not understanding your direction.


Are there still 4 new types?

/mikem

Daniel John Debrunner wrote:

Mamta Satoor wrote:
Ok, so I spent some time trying to move COLLATION attribute code from DataDictionaryImpl.boot to DataValueFactoryImpl.boot. I thought I could simply put following piece of code in DataValueFactoryImpl.boot method and the Property.COLLATION will get saved in the properties conglomerate.
I think some of this goes back to the intended implementation.
The intended implementation seems to be that there will be variants of the four character datatypes with locale based collation. This is four new (internal) datatypes in Derby that share most code with the existing CHAR, VARCHAR, LONG VARCHAR and CLOB types.
I'm not sure this is the correct approach.
My first thought is that this doesn't scale and doesn't seem like an OO solution. To think ahead, this means any additional collation style will also add four new datatypes, which means there could easily be sixteen or more datatypes to represent the characters. Each datatype will come with some code cost, classes and/or methods per type.
My second concern is that many places get characters and the change must ensure they get the correct datatype. Apart from potentially being a lot of work, the chance of missing some or picking the wrong character types seems high.
What is really required is 'character type + collation'. I've been thinking that looking at the problem in this way may make it more manageable and easier to contain, with the main idea being to only worry about collation type when actually performing a collation. So some initial ideas:
- collation is an attribute of DataTypeDescriptor, not valid for non-character types; 0 for UCS_BASIC, 1 for UNICODE, etc.
       int getCollationType();

- A method on DataValueFactory, returns null if type is UCS_BASIC
       RuleBasedCollator getCharacterCollator(int type)

- A method on StringDataValue
       StringDataValue getValue(RuleBasedCollator collator)

       For SQLChar:
            getValue(null) would return itself
            getValue(non-null) would return a new CollateSQLChar() with the value of the SQLChar and the collator set.
       For CollatorSQLChar
            getValue(null) would return a new SQLChar() with the value of the CollateSQLChar
            getValue(non-null) would return itself with the collator set correctly.
- The collation type (the integer) is written into the meta-data for an index just as ascending/descending is today (including the btree control row, thus making the information available for recovery). Collation type applies to all character columns in the index.
- At SQL collation time, the code generation sets up the various types correctly using the new methods.
- At recovery time the btree uses the collation type and the data value factory to set up its template row array correctly. Something like
     for each dvd in row array
        if (dvd instanceof StringDataValue)
             dvd = dvd.getValue(dvf.getCharacterCollator(type));

- setting the collation property remains in the data dictionary
- basic database sets the locale for the DataValueFactory after it boots it, using a new method on DVF
        void setLocale(Locale locale);
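
The ideas above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Derby code: the nested types here are stand-ins that borrow the names from the list, the compare() method, the getString() accessor, the applyCollation() helper and the choice to make CollatorSQLChar a subclass of SQLChar are all hypothetical, and obtaining the collator by casting the result of Collator.getInstance(locale) is just one possible way to get a RuleBasedCollator.

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

public class CollationSketch {
    // Collation type codes as proposed: 0 for UCS_BASIC, 1 for UNICODE.
    static final int UCS_BASIC = 0;
    static final int UNICODE = 1;

    // Stand-in for the real StringDataValue; compare() and getString() are assumptions.
    interface StringDataValue {
        String getString();
        StringDataValue getValue(RuleBasedCollator collator);
        int compare(StringDataValue other);
    }

    // Plain character value, compared code point by code point (UCS_BASIC).
    static class SQLChar implements StringDataValue {
        protected final String value;
        SQLChar(String value) { this.value = value; }
        public String getString() { return value; }
        public StringDataValue getValue(RuleBasedCollator collator) {
            // null collator: return itself; otherwise hand back a collating variant.
            return collator == null ? this : new CollatorSQLChar(value, collator);
        }
        public int compare(StringDataValue other) {
            return value.compareTo(other.getString());
        }
    }

    // Same value, but comparisons go through the collator.
    static class CollatorSQLChar extends SQLChar {
        private RuleBasedCollator collator;
        CollatorSQLChar(String value, RuleBasedCollator collator) {
            super(value);
            this.collator = collator;
        }
        @Override
        public StringDataValue getValue(RuleBasedCollator collator) {
            if (collator == null) {
                return new SQLChar(value);   // back to the plain type
            }
            this.collator = collator;        // itself, with the collator set
            return this;
        }
        @Override
        public int compare(StringDataValue other) {
            return collator.compare(value, other.getString());
        }
    }

    // Stand-in for DataValueFactory with the two proposed methods.
    static class DataValueFactory {
        private Locale locale = Locale.getDefault();
        void setLocale(Locale locale) { this.locale = locale; }
        RuleBasedCollator getCharacterCollator(int type) {
            if (type == UCS_BASIC) {
                return null;
            }
            Collator c = Collator.getInstance(locale);
            if (!(c instanceof RuleBasedCollator)) {
                throw new IllegalStateException("no rule based collator for " + locale);
            }
            return (RuleBasedCollator) c;
        }
    }

    // The recovery-time loop: only character values are touched.
    static Object[] applyCollation(Object[] row, DataValueFactory dvf, int type) {
        RuleBasedCollator collator = dvf.getCharacterCollator(type);
        for (int i = 0; i < row.length; i++) {
            if (row[i] instanceof StringDataValue) {
                row[i] = ((StringDataValue) row[i]).getValue(collator);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        DataValueFactory dvf = new DataValueFactory();
        dvf.setLocale(Locale.ENGLISH);
        Object[] row = { new SQLChar("a"), Integer.valueOf(7) };
        applyCollation(row, dvf, UNICODE);
        System.out.println(row[0].getClass().getSimpleName());
    }
}
```

The point of the sketch is that callers only ever deal with a character value plus an integer collation type; which concrete class sits behind the value is decided in one place, getValue().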
I think approaching the problem this way will lead to a cleaner solution in the long term and be somewhat easier to implement.
Thanks,
Dan.
