This is mixing a lot of things up.  I may also be using the wrong
terminology here.

Character set encodings really only come into play with tools like
ij and import, which have to get strings from the environment into Derby.
The more standard interaction is using JDBC to load a Java string into Derby.
At that level we don't do anything with encodings.
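
To illustrate where the encoding matters, here is a minimal sketch in plain Java (no Derby involved; the class and method names are made up for this example). It only shows that the encoding is a property of the bytes coming from the environment. Once the bytes are decoded, the result is an ordinary Java String, which is what JDBC hands over.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Derby or its tools.
class EncodingBoundary {
    // Decode raw bytes from the environment using an explicit encoding.
    static String decode(byte[] raw, Charset encoding) {
        return new String(raw, encoding);
    }

    public static void main(String[] args) {
        String original = "na\u00efve";

        // The same text as bytes in two different encodings.
        byte[] latin1 = original.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Decoded with the matching encoding, both give the same String.
        String fromLatin1 = decode(latin1, StandardCharsets.ISO_8859_1);
        String fromUtf8 = decode(utf8, StandardCharsets.UTF_8);
        System.out.println(fromLatin1.equals(fromUtf8));

        // Decoded with the wrong encoding, the String is already damaged
        // before any database sees it.
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.equals(original));
    }
}
```

So a mismatch between the environment's encoding and the one a tool assumes shows up when the tool reads its input, not at the JDBC level.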

We happen to use a modified UTF-8 to store strings on disk, and this is
not configurable. But no user interface should depend on this encoding,
and Derby could change this storage format in the future.
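
For anyone unfamiliar with the term, the sketch below shows what "modified UTF-8" means in the JDK, using DataOutputStream.writeUTF. This is an assumption-laden illustration only: Derby's on-disk format is its own and may differ in detail from the JDK's, and the class name is made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; it does not read or write Derby's storage format.
class ModifiedUtf8Demo {
    // Encode with the JDK's modified UTF-8 and drop the 2-byte length
    // prefix that writeUTF puts in front of the data.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        String s = "A\0B"; // contains the NUL character U+0000

        // Standard UTF-8 encodes U+0000 as one zero byte; the JDK's
        // modified UTF-8 encodes it as two bytes so that no zero byte
        // appears in the output.
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);
        System.out.println(modifiedUtf8(s).length);
    }
}
```

The point for users is only that such a format is an internal detail. Strings go in and come out as Java strings either way.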

Logically, all strings at runtime are converted to standard Java chars
(UTF-16 code units).

Before 10.3 we always used the standard Java string compare, which does a
numerical comparison of the Unicode values of the chars to arrive at an
ordering. That is still the default. In 10.3 an option was added to select
territory-based collation when the database is created, so that comparison
depends on the territory of the database. For this, the standard Java
rule-based Collator interfaces are used.  This is documented in the latest
Derby release.
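
The difference between the two orderings can be seen in plain Java, without Derby. This is a minimal sketch using String.compareTo and java.text.Collator; the class and method names are made up for the example, and it is not Derby's actual comparison code.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical demo class, not Derby internals.
class CollationDemo {
    // Default style of ordering: numerical comparison of char values.
    static int codePointOrder(String a, String b) {
        return a.compareTo(b);
    }

    // Territory-based ordering: use the Collator for the given locale.
    static int territoryOrder(String a, String b, Locale territory) {
        return Collator.getInstance(territory).compare(a, b);
    }

    public static void main(String[] args) {
        // 'a' is U+0061 and 'B' is U+0042, so numerically "a" sorts after "B".
        System.out.println(codePointOrder("a", "B") > 0);

        // Under the en_US collation rules, "a" sorts before "B".
        System.out.println(territoryOrder("a", "B", Locale.US) < 0);
    }
}
```

In Derby itself the choice is made once, at database creation time, through the territory and collation=TERRITORY_BASED connection URL attributes described in the documentation. It is not something the application sets per comparison.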

David Van Couvering wrote:
Hi, all.  I am getting some questions from Ken Frank of the NetBeans
internationalization quality team about Java DB and character set
encodings.  Rather than try and play go-between, I'm including him
here so he can directly ask any follow-on questions.

Ken would like to understand how Derby makes use of character
encodings, and how it is affected by various settings.  How does
Derby handle things if the encoding is set to something different from
our default of UTF-8?  Are we impacted, or do we rely on Java routines
such as the Collator and Comparator class to handle this?

Sorry if I'm talking out my ear, i18n is not one of my fortes.

Thanks,

David
