On Thu, 17 Sep 1998, Geoff Hutchison wrote:

> Date: Thu, 17 Sep 1998 23:47:47 -0400
> From: Geoff Hutchison <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: htdig: Problems with using htdig -a
> 
> Hi,
> 
> I consider the following a bug, since it's not documented. Fortunately
> there's an easy workaround.
> 
> I normally run the dig with the switch -a to use alternate files (allowing
> others to search as I'm digging). Usually I don't use the switch -i, so it
> should do an "update" dig and index only the changed or new files (which
> should be a small subset of the 50,000 pages). Then the script moves the
> files into place at the end of the run.
> 
> However, when using "-a" I wasn't seeing an update of the database.
> Essentially htdig looks at the db.docs.work file and finds it empty. So it
> updates the empty db by doing a full initial dig. :-(
> 
> Here's an example solution: (yes, you might want to ignore the first cp
> commands and change the first two mv commands to cp)
> 
> BASEDIR=/opt/htdig
> cp $BASEDIR/db/db.wordlist $BASEDIR/db/db.wordlist.work
> cp $BASEDIR/db/db.docdb $BASEDIR/db/db.docdb.work
> $BASEDIR/bin/htdig -a -s
> $BASEDIR/bin/htmerge -a -s
> mv $BASEDIR/db/db.wordlist.work $BASEDIR/db/db.wordlist
> mv $BASEDIR/db/db.docdb.work $BASEDIR/db/db.docdb
> mv $BASEDIR/db/db.docs.index.work $BASEDIR/db/db.docs.index
> mv $BASEDIR/db/db.words.db.work $BASEDIR/db/db.words.db
> 
> This changed a 1 hr. 30 min. dig into a 15 min dig, even counting the
> shuffling of files. Faster is better. :-)

I have 2809 documents on a local server; I also use the -a switch, and
rundig normally takes about 12 minutes.  I tried your easy workaround
and got the following results:

According to the report I now have 3128 documents, and rundig took about
14 minutes.  My db files grew by 16% to 32%, about 30% for the three
largest:

-rw-r--r--  1 jjah  www  13281280 Sep 17 21:36 db.docdb
-rw-r--r--  1 jjah  www  10482688 Sep 17 02:33 db.docdb.old
-rw-r--r--  1 jjah  www    398336 Sep 17 21:35 db.docs.index
-rw-r--r--  1 jjah  www    343040 Sep 17 02:33 db.docs.index.old
-rw-r--r--  1 jjah  www  22928417 Sep 17 21:36 db.wordlist
-rw-r--r--  1 jjah  www  17329728 Sep 17 02:32 db.wordlist.old
-rw-r--r--  1 jjah  www  19543040 Sep 17 21:34 db.words.db
-rw-r--r--  1 jjah  www  15352832 Sep 17 02:32 db.words.db.old

I assume this increase in the size of the db files and the increase in the
reported number of documents will be cumulative over time if one uses this
workaround.  It will probably increase the actual search time as well ;(
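
For anyone who wants to check the growth on their own system, here is a
minimal sketch that compares each db file with its .old copy and prints the
percentage change.  It assumes the previous run's files were kept with a
.old suffix, as in my listing above; the DBDIR default and the growth_pct
helper are my own placeholders, not part of ht://Dig.

```shell
#!/bin/sh
# growth_pct OLD NEW: print the percentage change from OLD to NEW
# with one decimal place (hypothetical helper, not an ht://Dig tool).
growth_pct() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) * 100 / old }'
}

# Assumed location of the db files; override with DBDIR=... as needed.
DBDIR=${DBDIR:-/opt/htdig/db}

# Compare each current file with the .old copy from the previous run.
# -s skips a missing or empty .old file, which also avoids dividing by zero.
for f in db.docdb db.docs.index db.wordlist db.words.db; do
    if [ -f "$DBDIR/$f" ] && [ -s "$DBDIR/$f.old" ]; then
        old=$(wc -c < "$DBDIR/$f.old")
        new=$(wc -c < "$DBDIR/$f")
        echo "$f: $(growth_pct "$old" "$new")%"
    fi
done
```

Running this after each dig and keeping the output would show whether the
growth really is cumulative or levels off after the first run.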

Joe

     _/   _/_/_/       _/              ____________    __o
     _/   _/   _/      _/         ______________     _-\<,_
 _/  _/   _/_/_/   _/  _/                     ......(_)/ (_)
  _/_/ oe _/   _/.  _/_/ ah        [EMAIL PROTECTED]
