On Fri, 17 May 2002, Torsten Neuer wrote:
> > ht://dig's databases so that I can save the extra effort of mirroring?
> >
> > I was using rsync to keep my bandwidth low, but now I need to switch to
> > something that works like wget so that I can get static html snapshots
> > instead of the actual cgi/php/asp source pages.
> (2) Use the host name aliasing features of ht://Dig which allows
>     you to index the (mirrored) pages locally and still provide
>     search services for any host that mirrors the respective sites.
...
> However, if you have to rsync the pages to be mirrors, you can
> still index locally.

Torsten pretty much summed up what you could do. At the moment, you can't
use the ht://Dig indexing to mirror. For one, the document database for
ht://Dig doesn't store the entire document: no images, no HTML tags, an
excerpt limited by max_head_length, and so on.

You can, however, mirror and then have htdig index your mirrored copy.
Attributes you may find interesting include local_urls (for picking files
off the filesystem) and search_rewrite_rules (for rewriting a local URL
used for indexing to a non-local one), among others.

<http://www.htdig.org/attrs.html#local_urls>
<http://www.htdig.org/attrs.html#search_rewrite_rules>

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
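
As a rough illustration of the mirror-then-index setup, here is a minimal
sketch of the relevant htdig.conf lines. The host names and the directory
/var/www/mirror/ are made up for the example, and the backslash escaping
follows the style of the examples in the attribute documentation linked
above. search_rewrite_rules may not exist in older releases, so check
attrs.html for your version before relying on it.

```
# Hypothetical setup: the mirror is served at http://localhost/mirror/
# and stored on disk under /var/www/mirror/ (both are assumptions).
start_url:            http://localhost/mirror/

# Read documents straight from the filesystem instead of over HTTP.
# Format: URL-prefix=path-prefix
local_urls:           http://localhost/mirror/=/var/www/mirror/

# Rewrite the indexed (local) URLs to the public ones in search results.
# Format: regex replacement pairs; www.example.com is a placeholder.
search_rewrite_rules: http://localhost/mirror/(.*) http://www.example.com/\\1
```

With something like this, htdig would index the files under
/var/www/mirror/ while htsearch results point at the original site.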

