Hi,

       Here are the results of the last tests. I will not conduct any
more tests since I'm now 100% sure that the following rule applies:

       When using compression, if your database is big enough, you
also gain in real (wall-clock) time.

       Reminder: the transparent Berkeley DB compression code used
in htdig3 compresses the file to 50% of its original size (+ 0.1% in
the worst case).
 
       The value of 'big enough' depends on your configuration.
It's around 1Gb on a PII350, 128Mb RAM, 6Mb/s disk, Linux-2.2, 32Mb cache.
It's around 5Gb on a PIII450, 512Mb RAM, 12Mb/s disk, FreeBSD-3.2, 64Mb cache.

       Why do we gain in real time? There are two candidate factors:

           . A file that is 50% smaller reduces the seek latency to
             fetch blocks.
           . The kernel buffer cache contains twice as much usable data.

       Keith Bostic argues that seek latency has no visible effect. I
tend to agree with him, given the fact that the elevator algorithm of
the kernel sorts I/O operations. The gain is, IMHO, mainly due to the
cache effectively being twice as big because it contains compressed data.
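
       To illustrate the cache argument, here is a rough sketch in
Python. It is only a toy model: it assumes uniform page access, and
the BUFFER_CACHE value is a hypothetical figure picked for
illustration, not something measured on the test machine. The two
file sizes are the ones reported by ls -l in the benchmark output.

```python
# Toy model: share of the database file that fits in the kernel buffer cache.
# Assumptions (not from the benchmark): uniform page access, and a
# hypothetical buffer cache budget chosen only for illustration.

def cacheable_fraction(file_bytes: int, cache_bytes: int) -> float:
    """Fraction of the on-disk file that the buffer cache can hold at once."""
    if file_bytes <= 0:
        raise ValueError("file_bytes must be positive")
    return min(1.0, cache_bytes / file_bytes)

BUFFER_CACHE = 256 * 1024 * 1024   # hypothetical kernel buffer cache budget
UNCOMPRESSED = 5825888256          # size of /spare/test without -z
COMPRESSED = 2913034240            # size of /spare/test with -z

plain = cacheable_fraction(UNCOMPRESSED, BUFFER_CACHE)
packed = cacheable_fraction(COMPRESSED, BUFFER_CACHE)

print(f"uncompressed: {plain:.1%} of the file fits in cache")
print(f"compressed:   {packed:.1%} of the file fits in cache")
print(f"ratio:        {packed / plain:.2f}x")
```

       Whatever budget is chosen, as long as it is smaller than the
compressed file, the same cache covers about twice as large a share
of the compressed file as of the uncompressed one.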

       The following test results (make bench in htdig3/test) were run
on a PIII450, 512Mb RAM, 12Mb/s disk, FreeBSD-3.2, 64Mb cache. The real
times of the two runs differ by only about 5% for a 5.8Gb file
(37825.68s with compression, 36121.84s without), which is why I think
the threshold is around 5Gb.
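
       For anyone who wants to check the figures, the comparison can
be sketched in a few lines of Python. The numbers are copied from the
/usr/bin/time and ls -l output of the two runs; the dictionary and
function names are just illustrative.

```python
# Figures copied from the two "make bench" runs in this message.
# "compressed" is the run with -z, "plain" is the run with CMPR=''.
compressed = {"real": 37825.68, "user": 15860.08, "sys": 6371.04,
              "size": 2913034240, "block_out": 2584557}
plain = {"real": 36121.84, "user": 5456.87, "sys": 674.53,
         "size": 5825888256, "block_out": 2979151}

def pct_change(new: float, old: float) -> float:
    """Relative change of new versus old, in percent."""
    return (new - old) / old * 100.0

size_ratio = compressed["size"] / plain["size"]
real_delta = pct_change(compressed["real"], plain["real"])
cpu_ratio = (compressed["user"] + compressed["sys"]) / (plain["user"] + plain["sys"])
out_delta = pct_change(compressed["block_out"], plain["block_out"])

print(f"file size, compressed / plain:     {size_ratio:.4f}")
print(f"real time, compressed vs plain:    {real_delta:+.1f}%")
print(f"CPU (user+sys), compressed / plain: {cpu_ratio:.2f}x")
print(f"block output ops, compressed vs plain: {out_delta:+.1f}%")
```

       The sign convention is "compressed relative to plain", so a
positive percentage means the compressed run used more of that
resource.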

       From now on I will therefore assume that compression is a very
good thing because it offers a win-win situation for large databases :-)

make bench
make[1]: Entering directory `/usr/home/loic/htdig3/test'
rm -f /spare/test /spare/test_weakcmpr
/usr/bin/time -l ../test/dbbench -C `expr 64 \* 1024 \* 1024` -S 8192 -z -w words.all -l 180 -B /spare/test
Reading from words.all ... pushed 133495560 words
    37825.68 real     15860.08 user      6371.04 sys
     83492  maximum resident set size
       270  average shared memory size
      1333  average unshared data size
       154  average unshared stack size
 172566485  page reclaims
         1  page faults
         0  swaps
     16946  block input operations
   2584557  block output operations
         0  messages sent
         0  messages received
         0  signals received
     28274  voluntary context switches
    317480  involuntary context switches
ls -l /spare/test
-rw-r--r--  1 loic  wheel  2913034240 Aug 12 04:12 /spare/test
if [ -f /spare/test_weakcmpr ] ; then ../db/dist/db_dump -p /spare/test_weakcmpr ; fi
format=print
type=btree
recnum=1
bt_minkey=2
db_pagesize=8192
HEADER=END
./db/dist/db_stat -z -d /spare/test
0x53162 Btree magic number.
6       Btree version number.
Flags:
2       Minimum keys per-page.
8192    Underlying tree page size.
4       Number of levels in the tree.
133M    Number of keys in the tree.
3692    Number of tree internal pages.
707475  Number of tree leaf pages.
0       Number of tree duplicate pages.
0       Number of tree overflow pages.
0       Number of pages on the free list.
12M     Number of bytes free in tree internal pages (59% ff).
2858M   Number of bytes free in tree leaf pages (196% ff).
0       Number of bytes free in tree duplicate pages (0% ff).
0       Number of bytes free in tree overflow pages (0% ff).
make[1]: Leaving directory `/usr/home/loic/htdig3/test'
make CMPR='' bench
make[1]: Entering directory `/usr/home/loic/htdig3/test'
rm -f /spare/test /spare/test_weakcmpr
/usr/bin/time -l ../test/dbbench -C `expr 64 \* 1024 \* 1024` -S 8192  -w words.all -l 180 -B /spare/test
Reading from words.all ... pushed 133495560 words
    36121.84 real      5456.87 user       674.53 sys
     82988  maximum resident set size
       327  average shared memory size
      4776  average unshared data size
       186  average unshared stack size
     21252  page reclaims
         7  page faults
         0  swaps
     24638  block input operations
   2979151  block output operations
         0  messages sent
         0  messages received
         0  signals received
     47394  voluntary context switches
    108297  involuntary context switches
ls -l /spare/test
-rw-r--r--  1 loic  wheel  5825888256 Aug 12 14:56 /spare/test
if [ -f /spare/test_weakcmpr ] ; then ../db/dist/db_dump -p /spare/test_weakcmpr ; fi
./db/dist/db_stat  -d /spare/test
0x53162 Btree magic number.
6       Btree version number.
Flags:
2       Minimum keys per-page.
8192    Underlying tree page size.
4       Number of levels in the tree.
133M    Number of keys in the tree.
3692    Number of tree internal pages.
707475  Number of tree leaf pages.
0       Number of tree duplicate pages.
0       Number of tree overflow pages.
0       Number of pages on the free list.
12M     Number of bytes free in tree internal pages (59% ff).
2858M   Number of bytes free in tree leaf pages (196% ff).
0       Number of bytes free in tree duplicate pages (0% ff).
0       Number of bytes free in tree overflow pages (0% ff).
make[1]: Leaving directory `/usr/home/loic/htdig3/test'
