On Sun, 4 Mar 2001, Geoff Hutchison wrote:
> Date: Sun, 4 Mar 2001 08:33:58 -0600
> From: Geoff Hutchison <[EMAIL PROTECTED]>
> To: "Joe R. Jah" <[EMAIL PROTECTED]>
> Cc: htdig3-dev <[EMAIL PROTECTED]>
> Subject: Re: [htdig-dev] [ANNOUNCE] ht://Dig 3.2.0b3
>
> At 9:45 PM -0800 3/3/01, Joe R. Jah wrote:
> >total 78089
> >-rw-r--r-- 1 jjah www 27633664 Mar 3 02:30 db.docdb
> >-rw-r--r-- 1 jjah www 560128 Mar 3 02:30 db.docs.index
> >-rw-r--r-- 1 jjah www 25340482 Mar 3 02:29 db.wordlist
> >-rw-r--r-- 1 jjah www 26329088 Mar 3 02:29 db.words.db
> >
> >3.2.0b3:
> >
> >total 67792
> >-rw-r--r-- 1 jjah www 5332992 Feb 19 23:15 db.docdb
> >-rw-r--r-- 1 jjah www 2916352 Feb 19 23:15 db.docs.index
> >-rw-r--r-- 1 jjah www 44097536 Feb 19 23:15 db.excerpts
> >-rw-r--r-- 1 jjah www 20443136 Feb 19 23:20 db.words.db
> >-rw-r--r-- 1 jjah www 16384 Feb 19 23:14 db.words.db_weakcmpr
>
> Actually this is very interesting. The db.docdb has been split into
> the db.docdb and the db.excerpts, so the combined size should be
> close. But for some reason the db.excerpts is huge. I'm guessing you
> probably ran htmerge on the 3.1.5 databases, but have the 3.2
> databases been through htpurge?
>
> Still, it's not likely an issue with disk performance/caching--the
> files are pretty close in size.
I restored the database shown above from my backups. I then ran rundig
without -i (with 3.1.5 I always use -i because I can afford 11
minutes). The second indexing with 3.2.0b3 took only five hours and ten
minutes, versus nine and a half hours the first time. Here are more
detailed stats:
_________________________________________________
rundig: Start time: Sat Mar 3 20:35:48 PST 2001
htdig: Start Digging: Sat Mar 3 20:35:48 PST 2001
HTTP statistics
===============
Persistent connections : Yes
HEAD call before GET : No
Connections opened : 230
Connections closed : 229
Changes of server : 0
HTTP Requests : 1178
HTTP KBytes requested : 26571.9
HTTP Average request time : 0.0178268 secs
HTTP Average speed : 1265.33 KBytes/secs
htdig: Done Digging: Sun Mar 4 01:39:45 PST 2001
htpurge: Start purging: Sun Mar 4 01:39:45 PST 2001
htpurge: Done Purging: Sun Mar 4 01:45:56 PST 2001
htmerge: Start Merging: Sun Mar 4 01:45:56 PST 2001
htmerge: Done Merging: Sun Mar 4 01:45:56 PST 2001
rundig: End time: Sun Mar 4 01:45:57 PST 2001
_________________________________________________
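The per-phase durations can be read off the log above with a little date arithmetic. The sketch below is only an illustration: the timestamps are copied by hand from the log, and the estimate of total HTTP time assumes that "HTTP Average request time" is averaged over all 1178 requests, which the log does not state.

```python
from datetime import datetime, timedelta

# Timestamps copied from the rundig log above; all are PST, so the zone is omitted.
log = {
    "rundig":  (datetime(2001, 3, 3, 20, 35, 48), datetime(2001, 3, 4, 1, 45, 57)),
    "htdig":   (datetime(2001, 3, 3, 20, 35, 48), datetime(2001, 3, 4, 1, 39, 45)),
    "htpurge": (datetime(2001, 3, 4, 1, 39, 45), datetime(2001, 3, 4, 1, 45, 56)),
    "htmerge": (datetime(2001, 3, 4, 1, 45, 56), datetime(2001, 3, 4, 1, 45, 56)),
}

# Elapsed wall-clock time per phase.
durations = {phase: end - start for phase, (start, end) in log.items()}

for phase, elapsed in durations.items():
    share = elapsed / durations["rundig"]  # fraction of the whole rundig run
    print(f"{phase:8s} {str(elapsed):>8s} {share:6.1%}")

# Assumption: requests * average request time approximates total time spent in HTTP.
http_time = timedelta(seconds=1178 * 0.0178268)
print("estimated HTTP time:", http_time)
```

By this arithmetic htdig accounts for nearly all of the run and htpurge for about six minutes, while the estimated HTTP time is well under a minute.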
The database shrank further; in particular, db.docdb and db.docs.index
shrank to about a quarter of their sizes after the first indexing. I think
this is because htpurge did not get a chance to run last time. I will keep
running rundig until the size of the database and the indexing speed reach
equilibrium.
total 55520
-rw-r--r-- 1 jjah www 1409024 Mar 4 01:40 db.docdb
-rw-r--r-- 1 jjah www 655360 Mar 4 01:40 db.docs.index
-rw-r--r-- 1 jjah www 38780928 Mar 4 01:40 db.excerpts
-rw-r--r-- 1 jjah www 17954816 Mar 4 01:45 db.words.db
-rw-r--r-- 1 jjah www 16384 Mar 4 01:45 db.words.db_weakcmpr
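The "quarter" figure can be checked against the two 3.2.0b3 listings (Feb 19 and Mar 4). This is a minimal sketch with the byte counts copied by hand from the ls output; the dictionary names are made up for the illustration.

```python
# File sizes in bytes, copied from the two 3.2.0b3 listings in this message.
first_run = {
    "db.docdb": 5332992,
    "db.docs.index": 2916352,
    "db.excerpts": 44097536,
    "db.words.db": 20443136,
    "db.words.db_weakcmpr": 16384,
}
second_run = {
    "db.docdb": 1409024,
    "db.docs.index": 655360,
    "db.excerpts": 38780928,
    "db.words.db": 17954816,
    "db.words.db_weakcmpr": 16384,
}

# Size after the second run as a fraction of the size after the first run.
ratios = {name: second_run[name] / first_run[name] for name in first_run}

for name, ratio in ratios.items():
    print(f"{name:22s} {first_run[name]:>10d} -> {second_run[name]:>10d}  ({ratio:.2f}x)")

total_ratio = sum(second_run.values()) / sum(first_run.values())
print(f"{'all files':22s} ({total_ratio:.2f}x)")
```

The two document databases come out near a quarter of their earlier sizes, while db.excerpts and db.words.db shrink far less.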
Regards,
Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah [EMAIL PROTECTED]