On Wed, Jun 13, 2007 at 06:44:02PM +0100, Andrew M. Bishop wrote:
>> For now you should disable modify-html when you use wget
>> (not only because of this issue). There is a tricky way
>> to achieve it without changing configs. WWWOFFLE does
>> not modify pages when ht/dig or some other indexing bot
>> requests them. It looks at the "User-Agent" header. So, if
>> you run wget with --user-agent="ht/dig" (read the sources
>> for the exact user-agent value, I don't remember it), you
>> will not receive the delete links and the other links from
>> the footer.
>
> Another header would be "Cache-Control: no-transform" or
> you could use "Pragma: wwwoffle-client". Either of these
> two would be preferred to faking a web indexer.
BTW, this does not prevent the robot from requesting (and, when
online, fetching?) unspooled pages. So the User-Agent approach is
still the better way.

--
Max
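
For reference, a rough sketch of the wget invocations discussed in this
thread. It only prints the command lines rather than running them. The
proxy address (localhost:8080), the example URL, and the helper function
names are assumptions made for illustration; "ht/dig" is just the
placeholder from the quoted message, and the exact User-Agent string
WWWOFFLE checks for still has to be looked up in its sources.

```shell
#!/bin/sh
# Assumption: WWWOFFLE is reachable as an HTTP proxy here; adjust as needed.
PROXY="http://localhost:8080/"
# Assumption: placeholder value only; the exact string must come from the
# WWWOFFLE sources.
BOT_AGENT="ht/dig"

# Print (do not run) a wget command that poses as an indexing bot.
wget_as_bot() {
    printf 'http_proxy=%s wget --user-agent="%s" "%s"\n' \
        "$PROXY" "$BOT_AGENT" "$1"
}

# Print (do not run) a wget command that sends an extra request header.
wget_with_header() {
    printf 'http_proxy=%s wget --header="%s" "%s"\n' \
        "$PROXY" "$2" "$1"
}

wget_as_bot "http://example.com/"
wget_with_header "http://example.com/" "Cache-Control: no-transform"
wget_with_header "http://example.com/" "Pragma: wwwoffle-client"
```

Copy whichever printed line fits and run it by hand once the values have
been checked against your own setup.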
