On Fri, 17 May 2002, Torsten Neuer wrote:

> > ht://dig's databases so that I can save the extra effort of mirroring?
> > 
> > I was using rsync to keep my bandwidth low, but now I need to switch to
> > something that works like wget so that I can get static html snapshots
> > instead of the actual cgi/php/asp source pages.

> (2) Use the host name aliasing features of ht://Dig which allows
>     you to index the (mirrored) pages locally and still provide
>     search services for any host that mirrors the respective sites.
...
>     However, if you have to rsync the pages to be mirrors, you can
>     still index locally.


Torsten pretty much summed up what you could do. At the moment, you can't
use ht://Dig's indexing to mirror. For one, the ht://Dig document database
doesn't store the entire document: no images, no HTML tags, and only an
excerpt limited by max_head_length.

You can, however, mirror and then have htdig index your mirrored
copy. Attributes you may find interesting include local_urls (for picking
files off the filesystem) and search_rewrite_rules (for rewriting a local
URL used for indexing to a non-local one), among others.
 <http://www.htdig.org/attrs.html#local_urls>
 <http://www.htdig.org/attrs.html#search_rewrite_rules>
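
As a rough sketch, the two attributes might be combined along these lines
in htdig.conf. The hostnames and the /var/www/mirror/ path are made up for
illustration, and search_rewrite_rules may not be available in every
release, so check attrs.html against the version you are running:

```
# Hypothetical setup: the mirrored pages are served at http://localhost/mirror/
start_url: http://localhost/mirror/

# Read matching URLs straight from disk instead of fetching them over HTTP
# (format: URL-prefix=filesystem-path; the path here is an assumption)
local_urls: http://localhost/mirror/=/var/www/mirror/

# At search time, rewrite the local URL back to the public one
# (format: regex followed by replacement; www.example.com is a placeholder)
search_rewrite_rules: http://localhost/mirror/ http://www.example.com/
```

With something like this, htdig would read the mirrored files locally while
htsearch results point at the original site.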

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/


