Anders Morken wrote:
I've played around a bit with a different approach - using the
FileChannel class from Java 1.4's new IO API. I've written a class
RAFContainer4 which extends RAFContainer and overrides the readPage and
writePage methods of that class to use read/write(ByteBuffer buf, long
position) in FileChannel to access the container's file, without
synchronizing on the FileContainer during the read and write calls.
With a bit of hackery in BaseDataFileFactory#newContainerObject() this
class is then used instead of the regular RAFContainer on creation of
new RAFContainer objects when Derby runs in a 1.4+ JVM.
This approach gives the JVM and OS the opportunity to issue multiple
file operations concurrently, although we have no guarantees that this
will actually happen. This is JVM/OS dependent, but stracing the Sun
1.4.2_09 VM on Linux 2.6 shows that the VM now uses pread64()/pwrite64()
system calls instead of seek(), read() and write(). pread and pwrite
have similar semantics to the FileChannel#read/write(ByteBuffer buf,
long position) methods, and do not alter the file's seek() position, and
are supposed to be thread safe.
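
To make the positional calls concrete, here is a minimal sketch of page reads and writes through FileChannel. It is not the RAFContainer4 code from the patch: the class name PositionalPageIo, the 4096-byte page size and the method signatures are all assumptions made for the example. Only FileChannel#read/write(ByteBuffer, long) come from the API discussed above.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical helper (not Derby code): page I/O at absolute file offsets. */
final class PositionalPageIo {
    static final int PAGE_SIZE = 4096; // assumed page size for this sketch

    /** Fills 'page' from the given page number without touching the channel's position. */
    static void readPage(FileChannel ch, long pageNumber, byte[] page) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(page);
        long offset = pageNumber * PAGE_SIZE;
        // A positional read may return fewer bytes than requested, so loop.
        while (buf.hasRemaining()) {
            int n = ch.read(buf, offset + buf.position());
            if (n < 0) {
                throw new EOFException("page " + pageNumber + " is past end of file");
            }
        }
    }

    /** Writes 'page' at the given page number without touching the channel's position. */
    static void writePage(FileChannel ch, long pageNumber, byte[] page) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(page);
        long offset = pageNumber * PAGE_SIZE;
        // A positional write may also be partial, so loop until the buffer is drained.
        while (buf.hasRemaining()) {
            ch.write(buf, offset + buf.position());
        }
    }
}
```

Because neither method reads or changes the channel's own position, several threads can call them on the same FileChannel (for instance one obtained from RandomAccessFile#getChannel()) without a shared lock around the seek and the transfer. Whether the operations actually overlap still depends on the JVM and OS.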
Great, Anders. This looks like a promising idea.
Of course only people running Derby on 1.4+ JVMs will have the
opportunity to benefit from this approach. As support for 1.3 is to be
deprecated this might not be much of an issue?
If this means that 1.3 still works, but the old way, I think this is
acceptable.
But anyway, I would like to see if this hack of mine actually works. I
see mentions of a "TPC-B like benchmark" in the threads Øystein links to
above, and wonder if that is something Sun internal, or if it's a
publicly available benchmark implementation that I can get my grubby
little paws on and try out this patch with? =)
The actual code is something we have developed internally here, and I am
not sure we will have time to make it available any time soon. If you
make a patch of your changes, I should be able to test this next week.
If you want to try this out yourself, I think you should be able to make
a sufficient test client quickly. (However, it will take some time to
create the large database). What I used was:
1. A database much larger than the computer's physical memory. I think I
had around 17 GB of data including indexes.
2. A large page cache. I used 500MB on a computer with 2GB RAM.
3. Log device on separate disk. (I.e., you need a computer with 2 disks.)
4. I used TPC-B like transactions, but I would expect any load where
transactions access records in a large table by primary key to work.
Try to avoid frequent lock conflicts or deadlocks. (E.g., two random
accesses to the same table within a transaction are deadlock-prone.)
5. Multi-threaded application where all threads ran the same type of
transaction back-to-back. (I had 20 threads). Our application prints
throughput per thread and total throughput for every 10 second interval
and an average at the end.
6. Run for at least 30 mins to allow for several checkpoints to happen
during your run. (I ran for 1 hour).
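
Items 5 and 6 can be sketched as a small driver. This is not our internal client; the class name ThroughputHarness and its parameters are made up for the example, and the transaction itself is passed in as a plain Runnable so the sketch stays independent of the database.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical driver: N threads run the same transaction back-to-back. */
final class ThroughputHarness {
    /** Prints per-thread and total tx/s each interval, an average at the end, and returns the total count. */
    static long run(int threads, long runMillis, long intervalMillis, Runnable tx)
            throws InterruptedException {
        final AtomicLongArray done = new AtomicLongArray(threads);
        final AtomicBoolean stop = new AtomicBoolean(false);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                while (!stop.get()) {
                    tx.run();                 // one transaction, then straight on to the next
                    done.incrementAndGet(id);
                }
            });
            workers[i].start();
        }
        long[] seen = new long[threads];
        long startNanos = System.nanoTime();
        long deadline = startNanos + runMillis * 1_000_000L;
        while (deadline - System.nanoTime() > 0) {
            Thread.sleep(intervalMillis);
            long sum = 0;
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < threads; i++) {
                long now = done.get(i);
                long delta = now - seen[i];
                seen[i] = now;
                sum += delta;
                // Rates use the nominal interval length, which is close enough for a sketch.
                line.append(String.format("t%d=%.1f ", i, delta * 1000.0 / intervalMillis));
            }
            System.out.println(line + String.format("total=%.1f tx/s", sum * 1000.0 / intervalMillis));
        }
        stop.set(true);
        for (Thread w : workers) {
            w.join();
        }
        long total = 0;
        for (int i = 0; i < threads; i++) {
            total += done.get(i);
        }
        double seconds = (System.nanoTime() - startNanos) / 1e9;
        System.out.printf("average=%.1f tx/s%n", total / seconds);
        return total;
    }
}
```

For the setup described above this would be called with 20 threads, a run time of 30 to 60 minutes, a 10000 ms interval, and a Runnable that executes one transaction on the calling thread's own connection.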
A short description of our TPC-B like app:
4 tables:
branch(bid int, bbal int, junk char(92), primary key(bid))
teller(tid int, bid int, tbal int, junk char(88), primary key(tid))
account(aid int, bid int, abal int, junk char(88), primary key(aid))
history(aid int, tid int, bid int, delta int, tstamp timestamp, primary
key(tstamp,aid,delta))
All primary keys are numbered from 0 to n-1, where n is the number of
rows in the table. *bal columns are initially 0.
teller: bid=tid/10, account:bid=aid/100000
I had 1000 branches, 10000 tellers and 100 million accounts. The history
table is initially empty.
A transaction has 5 statements:
update account set abal = abal+? where aid=? and bid=?
insert into history values (?, ?, ?, ?, CURRENT_TIMESTAMP)
update teller set tbal = tbal+? where tid=? and bid=?
update branch set bbal = bbal+? where bid=?
select abal from account where aid=?
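
As a rough JDBC sketch of one such transaction (not the code from our client): the class name TpcbLikeTx and the method signature are invented for the example, the final select is assumed to take the aid as a parameter marker, and the statements are prepared on every call only to keep the sketch short. A real client would prepare them once per connection.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical sketch of the five-statement transaction described above. */
final class TpcbLikeTx {
    /** Runs the five statements as one transaction and returns the balance read by the select. */
    static int run(Connection conn, int aid, int tid, int bid, int delta) throws SQLException {
        conn.setAutoCommit(false);
        try {
            update(conn, "update account set abal = abal+? where aid=? and bid=?", delta, aid, bid);
            update(conn, "insert into history values (?, ?, ?, ?, CURRENT_TIMESTAMP)", aid, tid, bid, delta);
            update(conn, "update teller set tbal = tbal+? where tid=? and bid=?", delta, tid, bid);
            update(conn, "update branch set bbal = bbal+? where bid=?", delta, bid);
            int abal;
            // Assumption: the last statement selects by the same aid via a parameter marker.
            try (PreparedStatement ps = conn.prepareStatement("select abal from account where aid=?")) {
                ps.setInt(1, aid);
                try (ResultSet rs = ps.executeQuery()) {
                    if (!rs.next()) {
                        throw new SQLException("no account row for aid " + aid);
                    }
                    abal = rs.getInt(1);
                }
            }
            conn.commit();
            return abal;
        } catch (SQLException e) {
            conn.rollback(); // e.g. after a deadlock or lock timeout
            throw e;
        }
    }

    /** Prepares 'sql', binds the int parameters in order, and executes it as an update. */
    private static void update(Connection conn, String sql, int... params) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < params.length; i++) {
                ps.setInt(i + 1, params[i]);
            }
            ps.executeUpdate();
        }
    }
}
```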
In TPC-B there are some rules about selecting teller and account where a
certain percentage of transactions will use a teller from a different
branch than the branch of the account, but I do not think that will
matter here. I suggest picking the balance change at random (we use a
number between -1 million and 1 million), picking a random aid, and
deriving tid and bid from the aid. (I.e., tid=aid/10000, bid=aid/100000.)
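
A small sketch of that parameter selection, assuming the table sizes given earlier (100 million accounts, 10000 tellers, 1000 branches) and the population rule account: bid=aid/100000, so that the derived teller also belongs to the account's branch (tid/10 equals aid/100000 under integer division). The class name TxParams and its fields are invented for the example.

```java
import java.util.Random;

/** Hypothetical holder for one transaction's randomly chosen parameters. */
final class TxParams {
    static final int ACCOUNTS = 100_000_000; // table size from the description above

    final int aid;
    final int tid;
    final int bid;
    final int delta;

    private TxParams(int aid, int tid, int bid, int delta) {
        this.aid = aid;
        this.tid = tid;
        this.bid = bid;
        this.delta = delta;
    }

    /** Picks a random account and a balance change in [-1000000, 1000000]. */
    static TxParams next(Random rnd) {
        int aid = rnd.nextInt(ACCOUNTS);
        int delta = rnd.nextInt(2_000_001) - 1_000_000;
        return forAccount(aid, delta);
    }

    /** Derives teller and branch from the account id using integer division. */
    static TxParams forAccount(int aid, int delta) {
        int tid = aid / 10_000;   // 10000 accounts per teller
        int bid = aid / 100_000;  // 100000 accounts per branch, same as tid / 10
        return new TxParams(aid, tid, bid, delta);
    }
}
```

Each worker thread should use its own Random instance so that the threads do not contend on a shared generator.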
--
Øystein