Replying to all; I didn't notice this was on linux-il.

$ host 66.249.79.57
57.79.249.66.in-addr.arpa domain name pointer crawl-66-249-79-57.googlebot.com.

This suggests that it is indeed Googlebot. Why would they put GCE on
googlebot hosts? That carries far too high a risk of getting their bots
blocked. I would guess that either
- They are testing a new crawler that is misbehaving (possible, but less
likely I think, since they should/would test it locally first), or
- Some other site contains a link to the content you don't have, or someone
(intentionally) ran a query on the Google search engine that triggered
the crawler into trying said URI.
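
Note that the reverse lookup above is only half of the check, since
whoever controls the PTR record for an address can make it say anything.
The usual way to confirm is a forward-confirmed reverse DNS lookup: resolve
the IP to a name, check the domain, then resolve that name back and see
whether you get the same IP. A minimal sketch in Python, assuming the
accepted domains are googlebot.com and google.com (the function name, the
suffix list and the injectable resolver parameters are my own choices for
illustration, not anything fail2ban provides):

```python
import socket

# Assumption: domains treated as belonging to Google's crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_googlebot(ip, reverse=socket.gethostbyaddr,
                 forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a single IPv4 address.

    `reverse` and `forward` default to the stdlib resolvers and can be
    swapped out, which makes the logic testable without network access.
    """
    try:
        # Step 1: PTR lookup, e.g. crawl-66-249-79-57.googlebot.com
        hostname = reverse(ip)[0].rstrip(".").lower()
    except OSError:
        return False  # no PTR record or resolver failure

    # Step 2: the name must sit under one of the expected domains.
    if not hostname.endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # Step 3: resolve the name back to its addresses (IPv4 only here).
        addresses = forward(hostname)[2]
    except OSError:
        return False

    # Step 4: accept only if the forward lookup includes the original IP.
    return ip in addresses
```

Calling is_googlebot("66.249.79.57") with the default resolvers performs
live DNS queries, so the answer depends on what DNS returns at that moment.
An address that passes could be whitelisted before fail2ban gets to ban it.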

Regards,
Eliyahu - אליהו


2014-05-20 12:14 GMT+03:00 Rabin Yasharzadehe <ra...@rabin.io>:

> Good point, thank you.
>
>
> *-- Rabin*
>
>
> On Tue, May 20, 2014 at 10:23 AM, shimi <linux...@shimi.net> wrote:
>
>> On Tue, May 20, 2014 at 10:15 AM, Rabin Yasharzadehe <ra...@rabin.io> wrote:
>>
>>> I have installed fail2ban on one of my servers, and created a set of
>>> rules to block some requests that (from my point of view) look like
>>> probing attempts.
>>>
>>> One of the rules blocks, site-wide, any request to *.jsp, which I
>>> don't have on this server.
>>>
>>> Today I got a mail about a blocked IP which belongs to Google (based
>>> on whois).
>>> # whois 66.249.79.57
>>>
>>> Can anyone tell me why Googlebot would request something I don't
>>> have any reference to on my site?
>>>
>>>
>> The ".." does look strange; I think Googlebot generally uses canonical
>> URLs...
>>
>> Just a note: The fact that there's no reference in your site (if that is
>> indeed a fact...) - does NOT say that there isn't such a reference in any
>> other site on the Internet...
>>
>> Note that Google also has GCE - I would assume the netblocks for GCE
>> would also say "Google"... maybe it's a crawler which is not really
>> Googlebot, but rather an impersonator running through GCE...
>>
>> -- Shimi
>>
>>
>
> _______________________________________________
> Linux-il mailing list
> Linux-il@cs.huji.ac.il
> http://mailman.cs.huji.ac.il/mailman/listinfo/linux-il
>
>
