I've just installed htdig. In our situation we need to index multiple domains in such a way that htsearch can search a combined index, so that a keyword search returns results from any of the domains.

I was hoping to use a single database and run htdig against only one (or relatively few) URLs at a time, thereby staggering the re-indexing of the database. As I have been running it, however, htdig appears to re-check every URL that is already in the database, presumably to determine whether any have changed. I can see the rationale for this, but it results in a substantial (and quite possibly unacceptable) increase in workload.

Is there any way to prevent this re-checking behavior?

Whether or not there is, I have been unable to locate any clear documentation on file handling. Specifically:

A. Which input data files are mandatory, and which are optional, for each of the three components (htdig, htmerge, and htsearch)?
B. Which data files do htdig and htmerge create and/or update?

What I think I want to develop is an approach under which htdig is run against partial databases (each containing results from relatively few domains), and htmerge is then used to merge the results from each of the partial databases into a combined database that htsearch queries.

If there is an FAQ, or equivalent, that covers this, please let me know.

Steven P Haver
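
P.S. To make the intent concrete, here is a rough sketch of the staggered scheme I have in mind. It only builds the command lines for one run and executes nothing. The config file paths and the function name are hypothetical, and I am assuming that htdig and htmerge both take `-c` to name a config file and that htmerge takes `-m` to merge the database described by another config file into the one named by `-c`. I have not confirmed either against the documentation, nor do I know whether merging the same partial database repeatedly replaces its older entries in the combined database.

```python
from typing import List


def staggered_commands(part_confs: List[str], combined_conf: str,
                       run_index: int) -> List[List[str]]:
    """Return the argv lists for one staggered run (nothing is executed)."""
    if not part_confs:
        raise ValueError("need at least one partial config file")
    # Rotate through the partial databases, re-digging one per run.
    conf = part_confs[run_index % len(part_confs)]
    return [
        # Re-dig only the domains listed in this partial config.
        ["htdig", "-c", conf],
        # Build the index for that partial database.
        ["htmerge", "-c", conf],
        # Assumed usage: fold the partial database into the combined one.
        ["htmerge", "-c", combined_conf, "-m", conf],
    ]


if __name__ == "__main__":
    # Hypothetical paths, one config per group of domains.
    confs = ["/etc/htdig/part_a.conf", "/etc/htdig/part_b.conf"]
    for argv in staggered_commands(confs, "/etc/htdig/combined.conf", 0):
        print(" ".join(argv))
```

Each run would touch only one partial database, so the re-checking cost would be limited to the URLs in that partition, if the assumptions above hold.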
