Re: [htdig] merging two databases - am I doing this right?

Geoff Hutchison Sun, 10 Nov 2002 21:18:15 -0800

On Saturday, November 9, 2002, at 04:27  PM, Dan Langille wrote:

years of data.  I'm seeking comments on my approach.

You're doing some unnecessary moving around, but I'd need some clarifications. I *think* what you're doing is indexing into the -merge databases, which are used *only* for indexing and for the eventual merge into the old database. The main database, in turn, is used only for searching and as the target of the merge.

Right?

cp adsl-merge.docdb.work      adsl-merge.docdb
cp adsl-merge.docs.index.work adsl-merge.docs.index
cp adsl-merge.wordlist.work   adsl-merge.wordlist
cp adsl-merge.words.db.work   adsl-merge.words.db

OK, if you're only ever using the -merge databases for merging, then this copy step is completely useless. Only the .work files are ever touched, so you end up with a lot of duplicate data.

After the merge, this moves the new search data into production:
mv adsl.docdb.work      adsl.docdb
mv adsl.docs.index.work adsl.docs.index
mv adsl.wordlist.work   adsl.wordlist
mv adsl.words.db.work   adsl.words.db

OK, but the .wordlist file is never used by htsearch. So you might as well leave it as a .work file (where it's used by the merge) and never copy it either.

Also, if you have the disk space, you can change those "mv" commands to "cp" and leave the .work files in place. It'll save time, at the cost of keeping two copies on disk.
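
Put together, the last step could look roughly like the sketch below. It only assumes the adsl.* file names quoted above; the function name publish_db and the directory argument are made up for illustration and are not part of ht://Dig.

```shell
#!/bin/sh
# Hypothetical helper (not part of ht://Dig): publish the merged
# databases for htsearch. Assumes the adsl.*.work files named in
# this thread live in the directory passed as $1.
publish_db() {
    db_dir=$1
    # .wordlist is left out on purpose: htsearch never reads it, so it
    # can stay as adsl.wordlist.work, where the merge uses it.
    for name in docdb docs.index words.db; do
        # cp instead of mv, so the .work files stay in place.
        cp "$db_dir/adsl.$name.work" "$db_dir/adsl.$name" || return 1
    done
}
```

You would call it after the merge has finished, e.g. publish_db /path/to/your/db, with whatever directory your databases actually live in. The copy of the adsl-merge.* files is simply dropped.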

-Geoff

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
