Hi,

       I'm happy to confirm that we have a win-win situation when
using transparent I/O compression on large Berkeley DB files (> 900 MB)
with random updates.  We save space (50%) and real time (about 28%), at
the cost of roughly 3 times more CPU time with compression. This is
quite logical, since the I/O overhead grows as the file grows. Of course
the gain depends heavily on the usage pattern, I/O speed and CPU speed.
However, the simplest rule is that the bigger your file is, the more you
will gain. Tests on smaller Berkeley DB files (< 900 MB) show that you
lose real time, because I/O is not expensive enough there for
compression to pay off. You still get 50% compression, though, even on
tiny files :-)

    These figures are not really benchmark results, but I have been
getting consistent results for two days. I would really like someone to
confirm this. The patch, which applies to db-2.7.5, is at
http://www.senga.org/htdig/db-2.7.5-compress.patch. I'll be running
more tests this weekend (~10 GB files on a fast machine with lots of
memory).

    I'll commit the patch to htdig3 on Friday, together with the benchit
program, in the test directory.

With compression:

Reading from words.all ... pushed 21671040 words
        Command being timed: "./benchit -Z -S 8192 -C 33554432 -w words.all -l 10"
        User time (seconds): 3030.87
        System time (seconds): 465.54
        Percent of CPU this job got: 35%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 2:46:25
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 237889
        Minor (reclaiming a frame) page faults: 19360840
        Voluntary context switches: 0
        Involuntary context switches: 0
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
-rw-r--r--   1 loic     loic     461365248 Aug  4 21:01 test

Without compression:

Reading from words.all ... pushed 21671040 words
        Command being timed: "./benchit -S 8192 -C 33554432 -w words.all -l 10"
        User time (seconds): 1065.28
        System time (seconds): 124.90
        Percent of CPU this job got: 8%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 3:51:23
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 0
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 666789
        Minor (reclaiming a frame) page faults: 10746
        Voluntary context switches: 0
        Involuntary context switches: 0
        Swaps: 0
        File system inputs: 0
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
-rw-r--r--   1 loic     loic     922714112 Aug  5 00:52 test
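
For anyone who wants to check the arithmetic, here is a small sketch
that recomputes the space, wall clock and CPU figures from the two runs
above. The numbers are copied from the time output and the file sizes;
the helper names (to_seconds, summarize) are only illustrative and are
not part of benchit or the patch.

```python
def to_seconds(elapsed: str) -> int:
    """Convert an h:mm:ss (or m:ss) elapsed-time field to seconds."""
    seconds = 0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# Values copied from the two runs above (times in seconds, sizes in bytes).
with_compression = {
    "elapsed": "2:46:25", "user": 3030.87, "system": 465.54, "size": 461365248,
}
without_compression = {
    "elapsed": "3:51:23", "user": 1065.28, "system": 124.90, "size": 922714112,
}


def summarize(compressed: dict, plain: dict) -> dict:
    """Return space saved (%), wall clock time saved (%) and CPU time ratio."""
    wall_c = to_seconds(compressed["elapsed"])
    wall_p = to_seconds(plain["elapsed"])
    cpu_c = compressed["user"] + compressed["system"]
    cpu_p = plain["user"] + plain["system"]
    return {
        "space_saved_pct": 100 * (1 - compressed["size"] / plain["size"]),
        "time_saved_pct": 100 * (1 - wall_c / wall_p),
        "cpu_ratio": cpu_c / cpu_p,
    }


result = summarize(with_compression, without_compression)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
# space_saved_pct: 50.0
# time_saved_pct: 28.1
# cpu_ratio: 2.9
```

The CPU ratio here is user plus system time, which is what the
"3 times more CPU" remark refers to; the wall clock saving is computed
from the elapsed field only.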
