just my opinion, but for what it's worth:

improved online backup - This seems like a good addition to Derby.  The
current state is that you can take a backup while the system is
running, but updating transactions will block until the backup is
finished.  The recently implemented rollforward recovery makes a fully
non-blocking online backup the next logical step.
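
For context, here is roughly what today's blocking backup looks like
from JDBC.  SYSCS_UTIL.SYSCS_BACKUP_DATABASE is the existing system
procedure; the database name and backup path below are placeholders:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;

    public class BackupExample {
        public static void main(String[] args) throws Exception {
            // Embedded connection; "myDB" is a placeholder name.
            Connection conn =
                DriverManager.getConnection("jdbc:derby:myDB");

            // Today's online backup: updaters block until this returns.
            CallableStatement cs = conn.prepareCall(
                "CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?)");
            cs.setString(1, "/backups/myDB");
            cs.execute();
            cs.close();
            conn.close();
        }
    }

A non-blocking version would let that call proceed while updaters keep
writing, relying on the log to bring the copied pages to a consistent
point.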


table partitioning -
      The question here is why you want to partition the table.  If it
      is just to spread I/O randomly across disks, I don't think it is
      a very useful feature.  The same thing can easily be accomplished
      on most modern hardware/OS's at a lower level while presenting
      the disk farm as one disk to the JVM/Derby.

      Now if you are talking about key partitioning then that may be
      useful, but only if accompanying work is done to partition query
      execution in parallel against those partitions (a small sketch of
      key-based routing follows).  Below I will describe one approach
      that I think is the easiest and most maintainable first step
      towards this.
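
To make "key partitioning" concrete, a minimal sketch of key-based
routing, with all names hypothetical (a real scheme might use range
partitioning instead, to keep range scans on a single partition):

    // Hypothetical: route a row to one of N partitions by its key.
    public final class KeyPartitioner {
        private final int partitionCount;

        public KeyPartitioner(int partitionCount) {
            this.partitionCount = partitionCount;
        }

        /** Hash partitioning: partition = hash(key) mod N. */
        public int partitionFor(Object key) {
            // floorMod stays non-negative even for negative hash codes.
            return Math.floorMod(key.hashCode(), partitionCount);
        }
    }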

replication -
      For this I can see a few directions:

1) master/offline slave(s), hot standby - Again, with the recently
completed work on rollforward recovery it would not be too hard to set
up a secondary system which is ready to take over when the primary
fails.  Basically copy the db, then stream the logs across and apply
them using the existing recovery algorithms when you want to bring the
system online.  Once the first slave-initiated update is applied, no
further updates from the master can be applied using this algorithm.
A rough sketch of the log-shipping half follows.
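
A minimal sketch of shipping archived log files to the standby,
assuming the master archives logs into a directory the standby host
can poll.  The directory layout and polling scheme here are
assumptions, not Derby APIs:

    import java.io.File;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    // Hypothetical log shipper: copy newly archived log files from the
    // master's archive directory to the standby, which applies them via
    // ordinary rollforward recovery when it is brought online.
    public class LogShipper {
        public static void ship(Path masterArchiveDir, Path standbyLogDir)
                throws Exception {
            File[] logs = masterArchiveDir.toFile().listFiles();
            if (logs == null) {
                return; // archive directory not there yet
            }
            for (File f : logs) {
                Path target = standbyLogDir.resolve(f.getName());
                if (Files.exists(target)) {
                    continue; // already shipped
                }
                // Copy to a temporary name, then rename, so the standby
                // never recovers from a half-copied log file.
                Path tmp = standbyLogDir.resolve(f.getName() + ".tmp");
                Files.copy(f.toPath(), tmp,
                        StandardCopyOption.REPLACE_EXISTING);
                Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
            }
        }
    }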


2) master/read-only slave(s), hot standby, with read-only access -
Building on 1, this would again not be too hard.  Some work is needed
to guarantee read access while applying the recovery logic online
rather than during boot.  Same caveats as above.

3) master/(read/write slave(s)), very hard - the usual problems:
what do you do with conflicts?  Such a system may be better handled
by doing higher-level update/conflict tracking than the log, maybe
something like MySQL does.


Load Balancing - I don't know what you are looking for here.  Would
    be interested in more detail.


An approach to a more scalable Derby database (again just an opinion,
and note I have neither the plans nor the expertise to actually build
the following, but it seems like a good project for someone interested
in building distributed optimizer technology):


Taking a shared-nothing approach to scalability, the following seems
like a good first step toward a more scalable Derby database.
Rather than partitioning tables within a single Derby database, use
the existing Derby database software as a single node in a multi-node
distributed database.  To do this, build a new piece of software that
glues a network of Derby databases together; each piece of the
database could be on the same machine or on different machines.


The new software would handle the following:
1) Some new set of ddl which could partition a single distributed
table across multiple local Derby databases.
2) Handle dml, sending it to the appropriate local database.
3) optimizer/execution - this is the interesting part.  It needs to
partition queries and execute them in parallel, sending/receiving data
to/from the local dbs (a sketch follows this list).
4) For extra credit, one could build a fault-tolerant system by
applying RAID-like algorithms to the local dbs.  Lose one local db and
it could be rebuilt from the other replicated dbs.
5) Probably a lot else I haven't mentioned.
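
A sketch of the scatter-gather execution in (3), assuming one JDBC
connection per local partition database and a query that has already
been rewritten per-partition (the connection list and query are
placeholders):

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Hypothetical scatter-gather: run the same query against every
    // local Derby partition in parallel and concatenate the rows.
    public class ScatterGather {
        public static List<String> query(List<Connection> partitions,
                                         String sql) throws Exception {
            ExecutorService pool =
                Executors.newFixedThreadPool(partitions.size());
            List<Future<List<String>>> futures = new ArrayList<>();
            for (Connection conn : partitions) {
                futures.add(pool.submit(() -> {
                    List<String> rows = new ArrayList<>();
                    try (Statement st = conn.createStatement();
                         ResultSet rs = st.executeQuery(sql)) {
                        while (rs.next()) {
                            rows.add(rs.getString(1));
                        }
                    }
                    return rows;
                }));
            }
            // A real optimizer would merge-sort or re-aggregate here;
            // this sketch just concatenates partition results.
            List<String> all = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                all.addAll(f.get());
            }
            pool.shutdown();
            return all;
        }
    }
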
Some benefits of doing this in Derby:
1) If multiple partitioned dbs are local to the distributed server,
then all communication can use the embedded Derby interfaces - making
them go fast.  In a first implementation I would suggest just using
standard JDBC between the distributed Derby server and the local nodes
as the easiest way to get it all working.
2) If using JDBC, the same exact code will work to access local vs.
networked local dbs (see the snippet after this list).
3) It seems that using the same kind of "driver" trick as the network
server, applications could use this new distributed db with no code
changes (apart from the ddl to set the system up).
4) Using Derby modules, one can probably reuse Derby code for some
pieces (like the SQL parser) while not slowing down the core
non-networked Derby version.  If done right, a local system can be
configured that includes no networking overhead, while from the same
codeline a distributed version can also be built.
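
On point 2, the only thing that changes between a local and a
networked node is the connection URL (the database names, host, and
port below are examples; 1527 is Derby's default port):

    import java.sql.Connection;
    import java.sql.DriverManager;

    public class NodeConnections {
        public static void main(String[] args) throws Exception {
            // Embedded (in-process) local partition.
            Connection local =
                DriverManager.getConnection("jdbc:derby:partition0");

            // Same JDBC code against a networked partition, via the
            // Derby client driver.
            Connection remote = DriverManager.getConnection(
                "jdbc:derby://dbhost:1527/partition1");

            local.close();
            remote.close();
        }
    }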


I like this approach to a distributed Derby database rather than
trying to make one set of code handle both local and network paths.
An optimizer is hard enough without making a single optimizer handle
both local and distributed decisions.  It also means that local
performance does not suffer code-path overhead from unused distributed
code.


[EMAIL PROTECTED] wrote:
Hi all,

These days there are some fine databases out there, but none has the
features of Java and its scalability.  For example, I cannot mix a
MySQL 32-bit and 64-bit database on my given hardware; I can only use
32-bit systems or 64-bit systems, not a mix.


MySQL runs fast on Linux and poorly on FreeBSD and other UNIX systems
without some modifications (for example the thread-library issue) or
dirty tricks.


Sometimes a feature will be available on Windows only and not on
Linux, etc.  With a Java SQL database like Derby there is a real
chance to have all the cool features of the database on any system at
any time, as long as a J2SE JVM is present.


Derby is the right way, but are there any plans to make it enterprise
ready (replication, load balancing of connections, online backup,
table partitioning)?


Josh Carpenter