Sleepycat Software writes:
> I think you're right -- it might be worth instrumenting the
> kernel to note for the test process how many hits it got in
> the buffer cache for the two runs vs. how many times it had
> to do I/O. Or, maybe that information is already available?
I'm glad you agree with these conclusions. I don't know
how to profile the kernel, and since I've already spent a lot of time
benchmarking, I won't go any further. I'd be curious to know the
details but won't have the time to find out.
> Regardless -- would you please characterize for me what the
> tests you're running do? (I know you're using dbbench, were
> there others?) What I'm trying to understand is what the
> data and data access patterns are for these tests.
The tests reflect the typical htdig usage pattern. In short, the test does
the following:
. Keys are <4 bytes int><string>
. The custom compare function compares the strings first and, if they are
  equal, compares the ints.
. count = 0
  The test reads a 700 000-word file N times (-l argument)
  Foreach word
      build key <count++><word>
      data is 1 char ' '
      insert in db
And that's it. It's a BTREE, words are between 3 and 31 characters long,
and since count is different for each word there are no duplicate keys.
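The key layout and comparison described above can be sketched as follows. This is a minimal illustration in Python, not the actual test code: the byte order of the int prefix and the key encoding are assumptions, and the `db` argument is stood in for by a plain dict.

```python
import struct

def make_key(count, word):
    # Key layout from the test: a 4-byte int followed by the word.
    # Little-endian packing is an assumption; the original may differ.
    return struct.pack("<i", count) + word.encode("latin-1")

def compare_keys(a, b):
    # The custom compare: compare the string parts first; only if they
    # are equal, compare the 4-byte int prefixes.
    sa, sb = a[4:], b[4:]
    if sa != sb:
        return -1 if sa < sb else 1
    ia = struct.unpack("<i", a[:4])[0]
    ib = struct.unpack("<i", b[:4])[0]
    return (ia > ib) - (ia < ib)

def insert_all(db, words, n_passes):
    # The insert loop: read the word list N times; each key gets a
    # fresh count, so no key is ever a duplicate. Data is one space.
    count = 0
    for _ in range(n_passes):
        for word in words:
            db[make_key(count, word)] = b" "
            count += 1
```

Because the count prefix differs on every insert, even the same word inserted on two passes produces two distinct keys, which matches the "no duplicate keys" property above.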
This is not a read-only database test pattern. The objective of htdig is to
have a dynamically updated database, and most of the stress comes from the
crawler constantly updating the index. It would be interesting to know
whether a random search pattern instead of a random insert pattern changes
the threshold of the win-win situation. But I can't run endless benchmarks;
I have code to write :-)
--
Loic Dachary
ECILA
100 av. du Gal Leclerc
93500 Pantin - France
Tel: 33 1 56 96 09 80, Fax: 33 1 56 96 09 61
e-mail: [EMAIL PROTECTED] URL: http://www.senga.org/