Hi John,
I agree with Bryan that we want to compare Derby's behavior to that of
other databases so that we can understand
1) where Derby diverges from the SQL Standard
2) what features we should add to Derby to increase its usefulness
As you note, datatypes are an important compatibility topic. I think
that you are on the right track when you talk about ORM frameworks. I
think that we would understand a great deal about datatype compatibility
by studying how the major ORM frameworks map datatypes onto various
RDBMSes. The engineers at Hibernate, TopLink, etc. have done a lot of
the analysis already. A systematic compilation of their findings would
be useful.
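
For example, such a compilation might boil down to per-dialect tables like the
following rough Java sketch. The entries are purely illustrative, not a
verified survey of either database's type system:

    import java.sql.Types;
    import java.util.HashMap;
    import java.util.Map;

    /**
     * Illustrative sketch of per-database type-name tables, the kind of
     * mapping an ORM dialect maintains. Entries are examples only, not a
     * verified survey.
     */
    public class TypeMappingSketch {

        static Map<Integer, String> derbyTypes() {
            Map<Integer, String> m = new HashMap<Integer, String>();
            m.put(Types.VARCHAR, "VARCHAR");
            m.put(Types.CLOB, "CLOB");
            m.put(Types.VARBINARY, "VARCHAR FOR BIT DATA");
            m.put(Types.DOUBLE, "DOUBLE");
            return m;
        }

        static Map<Integer, String> mysqlTypes() {
            Map<Integer, String> m = new HashMap<Integer, String>();
            m.put(Types.VARCHAR, "VARCHAR");
            m.put(Types.CLOB, "LONGTEXT");
            m.put(Types.VARBINARY, "VARBINARY");
            m.put(Types.DOUBLE, "DOUBLE");
            return m;
        }

        public static void main(String[] args) {
            // Diffing the two maps shows where the dialects diverge,
            // e.g. CLOB vs. LONGTEXT and the FOR BIT DATA syntax.
            System.out.println("Derby: " + derbyTypes());
            System.out.println("MySQL: " + mysqlTypes());
        }
    }
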
Thanks,
-Rick
Ян Программист wrote:
2010/3/17 Ян Программист <[email protected]>:
>> * No actual compatibility tests for Oracle, SQL Server
>
> I imagine that there are some existing compatibility test suites out
> there. It would be interesting to understand Derby's compatibility
> with other databases. This sounds like a task on which a GSoC student
> could make good progress.
>
> Thanks,
> -Rick
>
>
Well, some useful compatibility features to focus on:
 1. Converting data from one database to another while preserving data
    consistency. This is also why special-purpose tools for converting
    between different databases tend to be useless: the actual length
    of the data in table cells differs between systems. One acceptable
    approach is to create and populate additional tables in the
    consuming database with the same column structure (read: the same
    table schema) for rows whose values do not fit the column types of
    the consuming tables. (A minimal JDBC row-copy sketch follows this
    list.)
 2. Converting data in an entity-centric manner: converting data
    between similar column types based on the stored data itself. If
    the database client code contains strategy-switching logic for how
    to interpret the state of the data described in the database, it
    is possible to populate rows in other tables of the consuming
    database without losing data consistency.
 3. Converting types is not always a good idea. Compatibility can
    instead mean that the consuming database reads the rows through a
    special formatter which interprets the data as its own type. You
    can then copy the data as a binary heap and force the consuming
    database to enable that special formatter, even in a production
    environment (the formatter would point at fixed offsets inside
    each cell's data chunk, for every number/word/... of the actual
    data, so that useless binary delimiters never have to be handled).
    But this only works for data types representing the same class
    (numeric-to-numeric, date-to-date). The stored cell data could
    later be converted to the native format, so the special formatter
    can be dropped and the normal formatter used, once the consistency
    of the state of the entities (ORM?) represented by the rows is
    ensured.
 4. Column type compatibility is not index compatibility. The source
    database contains lots of indexes to speed up the JOINs, so the
    consuming database normally has to re-index the consumed rows. An
    index simply records the positions of cells, and those positions
    differ between databases for the same data because of cell data
    sizes and binary delimiters (storage engine details). The
    delimiters are the same within one storage engine, but cell data
    sizes vary from cell to cell. I still do not know how to solve
    this, but it would be awesome to apply, for example, MySQL indexes
    to JavaDB without re-indexing on the JavaDB side.
 5. I see VARBINARY as a candidate for dealing with aggregation in
    ORM. When a conversion is needed for whatever reason, you can
    fetch the entity data described by the ORM classes (where
    aggregated entities are virtualised through referencing columns on
    either table), take the rows of the aggregated table (the property
    set of the entity) and store them in a VARBINARY column of the
    consuming database. Because it is known how the ORM class rows
    will be iterated over in the consumer database's tables, the
    consumer side can read them back with a binary formatter and use
    the data however it wants. Converting the consumer database's
    VARBINARY back into a set of native columns for the aggregated
    entity rows should be fast enough(?). (A serialization sketch
    follows this list.)
 6. All the numeric precision handling. Yes, the ODBC NUMERIC type
    would no longer matter if column compatibility were implemented
    (automatic precision handling when reading from cells of different
    length (digit count), float-to-int conversion, and so on). That
    can help avoid some ODBC code, as well as JDBC code, both related
    to column type handling, when accessing data from non-native
    database clients (accessing JavaDB from C#, or SQLite from Java,
    for example). (A BigDecimal rescaling sketch follows this list.)
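
Here is a minimal sketch of the kind of row copy that (1) and (2) describe,
using plain JDBC. The connection URLs, table and column names are made-up
placeholders; the point is only that the target database's own column types
re-validate every value as it is inserted:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Statement;

    /**
     * Sketch of copying rows from a source database into a consuming
     * database. All URLs, table and column names are hypothetical.
     */
    public class RowCopySketch {
        public static void main(String[] args) throws Exception {
            try (Connection src = DriverManager.getConnection(
                         "jdbc:mysql://localhost/source_db", "user", "pw");
                 Connection dst = DriverManager.getConnection(
                         "jdbc:derby:consumer_db;create=true")) {
                try (Statement read = src.createStatement();
                     ResultSet rs = read.executeQuery(
                             "SELECT id, name, amount FROM orders");
                     PreparedStatement write = dst.prepareStatement(
                             "INSERT INTO orders (id, name, amount) VALUES (?, ?, ?)")) {
                    while (rs.next()) {
                        write.setLong(1, rs.getLong("id"));
                        // A value too long for the target column could instead be
                        // routed to an overflow table with the same schema, as in (1).
                        write.setString(2, rs.getString("name"));
                        write.setBigDecimal(3, rs.getBigDecimal("amount"));
                        write.addBatch();
                    }
                    write.executeBatch();
                }
            }
        }
    }
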
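And a sketch of the VARBINARY idea in (5): the aggregated rows of an entity
are packed into a single binary value on the consumer side and unpacked again
later. Java serialization stands in here for "a binary format both sides
understand"; the entity_blob table and payload column are hypothetical (in
JavaDB the column type would be VARCHAR ... FOR BIT DATA rather than
VARBINARY):

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.ArrayList;
    import java.util.List;

    /**
     * Sketch of stashing an entity's aggregated rows into a binary column
     * of the consuming database. Table/column names are hypothetical.
     */
    public class VarbinaryAggregationSketch {

        static byte[] toBytes(List<? extends Serializable> rows) throws Exception {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                // Copy into an ArrayList so the container itself is serializable.
                out.writeObject(new ArrayList<Serializable>(rows));
            }
            return buf.toByteArray();
        }

        @SuppressWarnings("unchecked")
        static List<? extends Serializable> fromBytes(byte[] data) throws Exception {
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(data))) {
                return (List<? extends Serializable>) in.readObject();
            }
        }

        static void store(Connection dst, long entityId,
                          List<? extends Serializable> rows) throws Exception {
            try (PreparedStatement ps = dst.prepareStatement(
                    "INSERT INTO entity_blob (entity_id, payload) VALUES (?, ?)")) {
                ps.setLong(1, entityId);
                ps.setBytes(2, toBytes(rows));
                ps.executeUpdate();
            }
        }

        static List<? extends Serializable> load(Connection dst, long entityId)
                throws Exception {
            try (PreparedStatement ps = dst.prepareStatement(
                    "SELECT payload FROM entity_blob WHERE entity_id = ?")) {
                ps.setLong(1, entityId);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    return fromBytes(rs.getBytes(1));
                }
            }
        }
    }
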
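Finally, a small sketch of the "precision auto-dealing" in (6): a DECIMAL
value read from a wider source column is rescaled before it is written into a
narrower target column. The target scale of 2 and the table name are
arbitrary assumptions:

    import java.math.BigDecimal;
    import java.math.RoundingMode;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.Statement;

    /**
     * Sketch of rescaling numeric values so they fit the consuming column.
     * Table/column names and the target scale are hypothetical.
     */
    public class PrecisionSketch {
        static void copyAmounts(Connection src, Connection dst) throws Exception {
            try (Statement read = src.createStatement();
                 ResultSet rs = read.executeQuery("SELECT id, amount FROM measurements");
                 PreparedStatement write = dst.prepareStatement(
                         "UPDATE measurements SET amount = ? WHERE id = ?")) {
                while (rs.next()) {
                    BigDecimal wide = rs.getBigDecimal("amount");
                    // Assume the target column is DECIMAL(10,2): round rather than fail.
                    BigDecimal narrow = wide.setScale(2, RoundingMode.HALF_UP);
                    write.setBigDecimal(1, narrow);
                    write.setLong(2, rs.getLong("id"));
                    write.executeUpdate();
                }
            }
        }
    }
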
*XML persistence driven conversions* of both schema and data, and *"hot"
binary copying* of non-native data from different databases (the special
formatters described above), are the two main goals here.

John