Hi Vinny, 

Thanks for your comment. I have made the changes you suggested in the 
myhotelcar.com/robots.txt file, but the issue is still not resolved. By my 
analysis, hits from bots, specifically the Ahrefs bot, have been increasing 
day by day, and the issue now seems critical. 

Please help me get out of this situation. I would be happy to have your 
advice on this. 

On Tuesday, April 21, 2015 at 10:07:28 PM UTC+5:30, Vinny P wrote:
>
> On Mon, Apr 20, 2015 at 11:32 PM, Ashutosh Mishra <ashutosh.n...@gmail.com> 
> wrote:
>>
>> I have also searched a lot, and I found that the Ahrefs bot doesn't obey 
>> the robots.txt rules.
>> Many people have suggested that I can block it via an .htaccess file, but 
>> I don't want to go that way because I couldn't find an .htaccess file in 
>> Google App Engine hosting. So please suggest a way to filter out these 
>> spam bots.
>>
>
>
> The .htaccess file isn't supported in App Engine. 
>
> If this is the real Ahrefs bot, it should support robots.txt. I looked in 
> your robots.txt file: I see you disallowing Baidu, Yandex and a wildcard 
> disallow, but not specifically AhrefsBot. Try adding the following to your 
> robots file:
>
> User-agent: AhrefsBot
> Disallow: /
>
> According to the AhrefsBot robot page, you can also email them directly to 
> ask them to stop; see https://ahrefs.com/robot
>
>
> On Mon, Apr 20, 2015 at 11:36 PM, Ashutosh Mishra <ashutosh.n...@gmail.com> 
> wrote:
>
>> I think you have identified the issue correctly: they are regularly 
>> hitting a particular set of pages, the hotel pages, which are dynamically 
>> generated. You are also correct about the RSS and sitemap feeds.
>> So please tell me how to overcome this issue, as these spam bots, 
>> especially the Ahrefs bot, are consuming a lot of my server bandwidth 
>> unnecessarily. I want a good solution so that I will not face any spam 
>> bot hurdle in future. 
>>
>
>
> This happens to a lot of websites with a large set of dynamically 
> generated pages. 
>
> Honestly the best solution would be to sign up for Cloudflare ( 
> https://www.cloudflare.com/google ) and use their tools to help filter 
> incoming traffic. You can also do what Barry suggested earlier, and start 
> blocking the IPs that ahrefsbot is using. 
>
> If you're willing to do some coding, you can write a filter into your 
> application to check for the useragent and kick back a 429 HTTP status code 
> (Too Many Requests) if traffic is too high: 
> http://tools.ietf.org/html/rfc6585#page-3
>
>  
>  
> -----------------
> -Vinny P
> Technology & Media Consultant
> Chicago, IL
>
> App Engine Code Samples: http://www.learntogoogleit.com
>
>
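
For reference, the user-agent filter Vinny describes above could look roughly 
like the following. This is only a minimal sketch, assuming a Python WSGI 
application; the class name BotThrottle, the thresholds, and the in-memory 
counter are illustrative assumptions and do not come from App Engine or from 
this thread. The counter is kept per instance and is not locked, so on a 
multi-instance or multi-threaded deployment the effective limit would be 
approximate.

```python
import time
from collections import deque

# Assumed values for illustration; tune them to your own traffic.
BLOCKED_AGENT_TOKENS = ("ahrefsbot",)
MAX_REQUESTS = 10      # allowed bot requests per window
WINDOW_SECONDS = 60    # length of the sliding window


class BotThrottle(object):
    """WSGI middleware (hypothetical name) that returns 429 to matching
    user agents once they exceed max_requests within the window."""

    def __init__(self, app, tokens=BLOCKED_AGENT_TOKENS,
                 max_requests=MAX_REQUESTS, window=WINDOW_SECONDS,
                 clock=time.time):
        self.app = app
        self.tokens = tuple(t.lower() for t in tokens)
        self.max_requests = max_requests
        self.window = window
        self.clock = clock
        self.hits = deque()  # timestamps of recently admitted bot requests

    def __call__(self, environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(token in agent for token in self.tokens):
            now = self.clock()
            # Forget requests that have fallen out of the window.
            while self.hits and now - self.hits[0] >= self.window:
                self.hits.popleft()
            if len(self.hits) >= self.max_requests:
                start_response("429 Too Many Requests",
                               [("Content-Type", "text/plain"),
                                ("Retry-After", str(self.window))])
                return [b"Too many requests\n"]
            self.hits.append(now)
        # Everyone else (and bots under the limit) reaches the real app.
        return self.app(environ, start_response)
```

To try it, wrap your existing WSGI application object, for example 
`app = BotThrottle(app)`, where `app` stands for whatever your handler 
configuration already points at. Setting max_requests to 0 would turn the 
throttle into an outright block for the listed user agents.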
