Ok, so I may have an answer to why this is not working:

When blocking is on, the slimserver answers all URLs with empty files
instead of HTTP 401 or 403.

Googlebot picks this up and thinks there are no restrictions on
crawling.. of course they don't fetch the robots.txt for every URL they
crawl, that would be bad.  Once you open it up, Googlebot thinks it can
crawl again and starts its dance.

Options:
* Slimserver should always return the robots.txt.
* Slimserver should return 401 or 403 on connections from
non-authorized hosts (a rough sketch of this is below).
* Slimserver should TCP RST non-authorized hosts.
* Google should detect and avoid slimservers in their crawl.
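
To illustrate the first two options, here's a rough sketch (Python, not
SlimServer's actual Perl code; the allowed-host list and port are made
up) of a server that always serves a restrictive robots.txt and answers
403 to non-authorized hosts instead of handing back empty files:

from http.server import BaseHTTPRequestHandler, HTTPServer

# Hosts allowed past the block -- these addresses are made up for the example.
ALLOWED_HOSTS = {"127.0.0.1", "192.168.1.10"}

# robots.txt that tells every crawler to stay out.
ROBOTS_TXT = b"User-agent: *\nDisallow: /\n"

class BlockingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Always serve robots.txt so crawlers can see the restriction.
        if self.path == "/robots.txt":
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(ROBOTS_TXT)))
            self.end_headers()
            self.wfile.write(ROBOTS_TXT)
            return
        # Non-authorized hosts get an explicit 403 instead of an empty body.
        if self.client_address[0] not in ALLOWED_HOSTS:
            self.send_error(403)
            return
        # Authorized hosts would get the normal server response here.
        self.send_response(200)
        self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 9000), BlockingHandler).serve_forever()

With responses like that, Googlebot gets an explicit "you're not
welcome" signal instead of a pile of empty 200s.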

Personally I think this is something that should be fixed on the client
router or in the slimserver.. But Google is good about these things.  Hrmm...


-- 
SuperQ
