Re: Collation implementation WAS Re: Should COLLATION attribute related code go in BasicDatabase?

Mike Matrigali Thu, 15 Mar 2007 15:43:40 -0800


Daniel John Debrunner wrote:

Mike Matrigali wrote:
Ok, so effectively language will store collation information on a per
column basis.  10.3 will interpret 0 as representing UCS_BASIC, and some
to-be-defined method will assign other values for other collations.

Will need to make sure there aren't any JDBC calls that blindly return
scale currently for character types.
I had to rush the last e-mail about scale since I had to pick my son up
from school, so sorry for that.

I'm not saying that DataTypeDescriptor.getScale() for a character column
changes in any way; its API remains the same, which would be to return
zero for any character column.

However, for a character datatype we could use the space on disk that
scale currently occupies to write collation information, since it's
always written as zero currently for characters. So the writeExternal()
would have something like (not actual methods)

   if (i_am_character_type)
     out.writeInt(collation);
   else
     out.writeInt(scale);


and the readExternal

   int v = in.readInt();
   if (i_am_character_type)
   {
      collation = v;
      scale = 0;
   }
   else
   {
      scale = v;
   }
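
For illustration only, the scale-slot idea quoted above might look like the following self-contained Java sketch. The class `SketchTypeDescriptor`, its fields, and the leading boolean flag are assumptions made for this example; it is not Derby's DataTypeDescriptor, which would already know the type from its own serialized state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a type descriptor (not Derby's DataTypeDescriptor).
class SketchTypeDescriptor {
    final boolean characterType;
    final int scale;      // meaningful only for non-character types
    final int collation;  // meaningful only for character types

    SketchTypeDescriptor(boolean characterType, int scale, int collation) {
        this.characterType = characterType;
        this.scale = scale;
        this.collation = collation;
    }

    // The public answer for scale is unchanged: always zero for character types.
    int getScale() {
        return characterType ? 0 : scale;
    }

    void write(DataOutputStream out) throws IOException {
        // Assumption: a flag stands in for the type information that a real
        // descriptor would have written before reaching the scale slot.
        out.writeBoolean(characterType);
        // One int slot on disk: collation for character types, scale otherwise.
        out.writeInt(characterType ? collation : scale);
    }

    static SketchTypeDescriptor read(DataInputStream in) throws IOException {
        boolean isCharacter = in.readBoolean();
        int v = in.readInt();
        return isCharacter
                ? new SketchTypeDescriptor(true, 0, v)
                : new SketchTypeDescriptor(false, v, 0);
    }

    // Serializes and deserializes in memory to show the slot survives a round trip.
    static SketchTypeDescriptor roundTrip(SketchTypeDescriptor d) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            d.write(new DataOutputStream(bytes));
            return read(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because existing character columns always wrote zero in that slot, data written before the change would read back as collation 0 under this scheme.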

Hope that clears that up.
Dan.

thanks, that is what I thought. I didn't really think about how the
metadata would be returned for scale - probably still worth making sure
we test the metadata scale call in a collated db.

I am just getting clear in my mind what we are doing with language
metadata in the proposal. Since we are writing per-column metadata for
collation in language, it is harder for me to argue against per column
metadata in store.


Physically, I am not sure of the best way to store it.

Are we sure the collation id can be represented as an INT?  I may have
missed it but do we expect a different number here for each different
language, or is there a single number that says sort based on language
and go look up language somewhere else?

options include:

1) The most straightforward would be an array with an entry for each
column, whether it is character or not. If we use compressedInteger
format we can get away with only 1 byte per "null" entry. Note that on
the way out it is easy to tell if it is a character, but on the way back
we only have format ids. I was hoping to have a single call to
datafactory(formatid, collate id) and get back the correct object.


Will it ever make sense to associate a collation with something other
than a character type?

2) some sort of encoded sparse index with entries only for the character
columns (anyone know if there is a Java utility to do this?). The
downside is that this usually means even more data stored than option 1
in some cases.

3) some sort of format that on read would depend on first getting an
uncollated datatype of type format-id and then regetting it based on
some code.  So maybe some extra object creation and extra cpu overhead
to create the template in readExternal.
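
As a rough illustration of option 1, here is a minimal Java sketch that writes one collation id per column using a variable-length integer, so that a zero ("null") entry costs one byte. The class `CollationArraySketch` and its encoding are assumptions for this example; the encoding is a generic 7-bits-per-byte varint, not Derby's actual compressed integer format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of a per-column collation id array (option 1).
class CollationArraySketch {

    // Writes value using 7 data bits per byte; the high bit marks "more bytes follow".
    static void writeVarInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    static int readVarInt(ByteArrayInputStream in) {
        int result = 0;
        int shift = 0;
        int b;
        do {
            b = in.read();
            if (b < 0) {
                throw new IllegalStateException("truncated input");
            }
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    // One entry per column; 0 is used for non-character columns.
    static byte[] encode(int[] collationIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarInt(out, collationIds.length);
        for (int id : collationIds) {
            writeVarInt(out, id);
        }
        return out.toByteArray();
    }

    static int[] decode(byte[] bytes) {
        ByteArrayInputStream in = new ByteArrayInputStream(bytes);
        int[] ids = new int[readVarInt(in)];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = readVarInt(in);
        }
        return ids;
    }
}
```

With this layout the reader gets a collation id for every column position without first having to know which format ids denote character types, at the cost of one byte for each non-character column.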
