2010/3/17 Ян Программист <[email protected]>:
>> * No actual compatibility tests for Oracle, SQL Server
>>
> I imagine that there are some existing compatibility test suites out here. It would be interesting to understand Derby's compatibility with other databases. This sounds like a task on which a GSoC student could make good progress.
>
> Thanks,
> -Rick

Well, some useful compatibility features to focus on:
1. Converting data from one database to another while preserving data consistency. This is why generic software for converting between different databases tends to be useless: the actual length of the data stored in table cells differs between engines. One acceptable approach is to create and populate additional tables in the consuming database with the same column structure (read: table schema) for entries whose sizes are incompatible with the column types of the consuming tables.

2. Converting data in an entity-centric manner: convert data of similar column types from the point of view of the stored data. If the database client code contains strategy-switching logic for how to interpret the state of the data described in the database, it is possible to populate entries in other tables of the consuming database and so avoid losing data consistency.

3. Converting types is not always a good idea. Compatibility can instead mean that the consuming database reads the entries through a special formatter which interprets the data as its own type. That way you can copy the data as a binary heap and force the consuming database to enable the special formatter to read it, even in a production environment (the formatter would point at specific offsets within each cell's data chunk for each number/word/... of the actually represented data, to avoid dealing with useless binary delimiters). This only works for data types that represent the same class (numeric-to-numeric, date-to-date). The stored cell data could later be converted to the native format so that the special formatter can be dropped and the normal one used, once the consistency of the state of the entities (ORM?) represented in the entries is ensured. (A minimal sketch of such a formatter follows this list.)

4. Column type compatibility is not index compatibility. The source database contains lots of indexes to speed up the JOINs, so the consuming database normally has to re-index the consumed data entries. An index simply records positions of cells, and in different databases those positions differ for the same data because of cell data sizes and binary delimiters (storage engine internals). The delimiters are the same within one storage engine, but cell data sizes vary from cell to cell. I still do not know how to solve this problem, but it would be awesome to apply, for example, MySQL indexes to JavaDB without re-indexing on the JavaDB side.

5. I see VARBINARY as a candidate for dealing with aggregation in an ORM. When you need a conversion for whatever reason, you can fetch the entity data described in the ORM classes (where aggregated entities are virtualised via referencing columns on either table), take the entries of the aggregated table (the property set of the entity) and store them into a VARBINARY column of the consuming database. Because it is known how the ORM class entries will be iterated in the tables of the consumer database, you can read them on the consumer side with a binary formatter and use that data however you want. Converting the consumer database's VARBINARY back into a set of native columns for the aggregated entity entries should be fast enough(?). (A rough JDBC sketch also follows the list.)

6. All the numeric precision handling. Yes, the ODBC NUMERIC type will not make much sense once column compatibility is implemented (automatic precision handling when reading from cells of different length (digit count), float-to-int conversion and so on). (A small BigDecimal sketch is below as well.)
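To make the "special formatter" idea in (3) a bit more concrete, here is a minimal sketch in Java. It assumes a cell has already been copied verbatim as a binary chunk from the source database and that the offsets of the packed values inside that chunk are known; the class name and the offset layout are invented for illustration, not taken from any existing engine.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical "special formatter": reads a raw cell chunk copied from a
// foreign database and exposes the values at known offsets as native types,
// skipping over the source engine's binary delimiters instead of converting.
public class ForeignCellFormatter {

    private final ByteBuffer chunk;

    public ForeignCellFormatter(byte[] rawCell, ByteOrder sourceByteOrder) {
        this.chunk = ByteBuffer.wrap(rawCell).order(sourceByteOrder);
    }

    // Interpret 4 bytes at a known offset as a native int (numeric-to-numeric).
    public int readInt(int offset) {
        return chunk.getInt(offset);
    }

    // Interpret 8 bytes at a known offset as a native double.
    public double readDouble(int offset) {
        return chunk.getDouble(offset);
    }

    // Interpret a length-prefixed UTF-8 sequence at a known offset as a string.
    public String readString(int offset) {
        int length = chunk.getInt(offset);
        byte[] raw = new byte[length];
        for (int i = 0; i < length; i++) {
            raw[i] = chunk.get(offset + 4 + i); // skip the 4-byte length prefix
        }
        return new String(raw, StandardCharsets.UTF_8);
    }
}

The consuming engine would register one such formatter per foreign column layout and only do a real, copying conversion later, once the entity state is known to be consistent, as in the second half of (3).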
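For the VARBINARY route in (5), a rough sketch with plain JDBC. Note that Derby/JavaDB spells the VARBINARY type as VARCHAR (n) FOR BIT DATA; the database name, the table and the serialized property set below are made up for the example.

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class VarbinaryAggregationDemo {

    public static void main(String[] args) throws SQLException, IOException {
        // Embedded Derby (JavaDB) database, created on first use.
        try (Connection conn = DriverManager.getConnection("jdbc:derby:compatDemo;create=true")) {

            // Consuming-side table: the aggregated entity's property set is kept
            // opaque until it is converted into native columns later.
            try (Statement ddl = conn.createStatement()) {
                ddl.executeUpdate("CREATE TABLE ENTITY_BLOB "
                        + "(ID INT PRIMARY KEY, PROPS VARCHAR (256) FOR BIT DATA)");
            }

            // Serialize the aggregated entries (two ints and a string here) in a
            // fixed, known order, exactly as the ORM would iterate them.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(42);
                out.writeInt(7);
                out.writeUTF("aggregated property");
            }

            try (PreparedStatement insert = conn.prepareStatement(
                    "INSERT INTO ENTITY_BLOB (ID, PROPS) VALUES (?, ?)")) {
                insert.setInt(1, 1);
                insert.setBytes(2, bytes.toByteArray());
                insert.executeUpdate();
            }
        }
    }
}

Reading it back on the consumer side is the mirror image: ResultSet.getBytes plus a DataInputStream that walks the properties in the same order, which is the "binary formatter" part; the later conversion to native columns is then a matter of deserializing once and inserting into ordinary columns.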
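And for the precision handling in (6), on the Java side this mostly reduces to BigDecimal bookkeeping. A small sketch, assuming the target column's precision and scale are known from the consuming schema:

import java.math.BigDecimal;
import java.math.RoundingMode;

public class NumericPrecisionDemo {

    // Fit a value read from a foreign NUMERIC/DECIMAL cell into a target column
    // declared as NUMERIC(precision, scale): round to the target scale and
    // reject values with too many significant digits.
    static BigDecimal fit(BigDecimal value, int precision, int scale) {
        BigDecimal scaled = value.setScale(scale, RoundingMode.HALF_UP);
        if (scaled.precision() > precision) {
            throw new ArithmeticException(
                    value + " does not fit NUMERIC(" + precision + "," + scale + ")");
        }
        return scaled;
    }

    public static void main(String[] args) {
        System.out.println(fit(new BigDecimal("123.4567"), 7, 2)); // 123.46
        System.out.println(fit(new BigDecimal("3.0"), 5, 0));      // 3  (float-to-int style narrowing)
    }
}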
This could help avoid some ODBC code, as well as JDBC code, both related to column type handling, when accessing data from non-native database clients (accessing JavaDB from C# or SQLite from Java, for example). *XML persistence driven conversions* of both schema and data, and *"hot" binary copying* of non-native data from different databases (the special formatters in (1)), are the two main goals here.

John
