Re: all - didn't notice it was linux-il

host 66.249.79.57
57.79.249.66.in-addr.arpa domain name pointer crawl-66-249-79-57.googlebot.com.
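
For anyone who wants to script this check rather than run host by hand, here is a minimal sketch in Python. It does the reverse lookup shown above and then a forward lookup on the returned name to confirm it maps back to the same IP, since a PTR record on its own can be set by whoever controls the address block. The function name, the suffix list and the injectable resolver parameters are my own assumptions for illustration, not anything fail2ban or Google ships.

```python
import socket

# Assumption: hostname suffixes treated as belonging to Google crawlers.
# Adjust the tuple to whatever list you trust.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname_ex):
    """Return True if ip reverse-resolves to a Google crawler hostname
    and that hostname resolves forward to the same ip."""
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        hostname = reverse(ip)[0]
    except OSError:
        return False  # no PTR record, or the lookup failed

    # Compare on the end of the name, so "googlebot.com.evil.example" fails.
    if not hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # gethostbyname_ex returns (hostname, aliases, ipv4 addresses)
        addresses = forward(hostname)[2]
    except OSError:
        return False

    # Forward confirmation: the claimed name must map back to the same IP.
    return ip in addresses
```

Calling is_verified_googlebot("66.249.79.57") performs live DNS queries, so the answer depends on the current records. The resolver arguments exist only so the logic can be exercised offline with stand-in functions.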
That PTR record suggests it is indeed Googlebot. Why would they put GCE on Googlebot hosts? The risk of getting the bots blocked is far too high.

I would guess that either:
- They were testing a new crawler that is misbehaving (plausible, but less likely I think, since they would test it locally first), or
- Some other site contains a link to the content you don't have, or someone (intentionally) ran a query on the Google search engine that triggered the crawler to try said URI.

Regards,
Eliyahu - אליהו

2014-05-20 12:14 GMT+03:00 Rabin Yasharzadehe <ra...@rabin.io>:

> Good point, thank you.
>
> *-- Rabin*
>
> On Tue, May 20, 2014 at 10:23 AM, shimi <linux...@shimi.net> wrote:
>
>> On Tue, May 20, 2014 at 10:15 AM, Rabin Yasharzadehe <ra...@rabin.io> wrote:
>>
>>> I have installed fail2ban on one of my servers, and created a set of
>>> rules to block some request the (from my point of view) looks like probing
>>> attempts.
>>>
>>> One of the rules is to block on site, any request to *.jsp which i don't
>>> have on this server.
>>>
>>> Today i got a mail about a blocked IP which belong to Google (based on
>>> whois).
>>> # whois 66.249.79.57
>>>
>>> can any one tell me, why Googlebot will search for something i don't
>>> have any reference to in my site?
>>>
>> The ".." does look strange, I think Googlebot always use Canonical URLs
>> in general...
>>
>> Just a note: The fact that there's no reference in your site (if that is
>> indeed a fact...) - does NOT say that there isn't such a reference in any
>> other site on the Internet...
>>
>> Note that Google also has GCE - I would assume the netblocks for GCE
>> would also say "Google"... maybe it's a crawler which is not really
>> Googlebot, rather than an impersonator running through GCE...
>>
>> -- Shimi
>>
>
> _______________________________________________
> Linux-il mailing list
> Linux-il@cs.huji.ac.il
> http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il