Good spot, raster.

I've blocked the following User-Agents on the frontend load balancer, so
those bandwidth suckers can no longer reach any VM/website behind it.

   acl u-robots-bad hdr(User-Agent) -i Ahrefsbot
   acl u-robots-bad hdr(User-Agent) -i Baiduspider
   acl u-robots-bad hdr(User-Agent) -i Cliqzbot
   acl u-robots-bad hdr(User-Agent) -i DotBot
   acl u-robots-bad hdr(User-Agent) -i MJ12bot
   acl u-robots-bad hdr(User-Agent) -i Semrushbot
   acl u-robots-bad hdr(User-Agent) -i YandexBot
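
As a side note (a sketch only, not what is deployed): hdr(User-Agent)
matches the whole header value exactly (case-insensitively), while real
crawler UA strings usually embed the bot name inside a longer
"Mozilla/5.0 (compatible; ...)" string. A substring match plus an
explicit deny rule would look something like this; the frontend name
here is hypothetical:

   frontend fe-http
       # substring match on the User-Agent header, case-insensitive
       acl u-robots-bad hdr_sub(User-Agent) -i ahrefsbot baiduspider semrushbot
       # reject matching requests with an HTTP 403
       http-request deny if u-robots-bad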

https://git.enlightenment.org/ is back online.

Cheers,
Bertrand

On 25/04/2017 22:40, Bertrand Jacquin wrote:
> Hey,
> 
> Taking a look
> 
> Cheers
> 
> On 25/04/2017 19:51, Dave wrote:
>> Try the following apache config in your directory directive, or 
>> .htaccess
>> file:
>> 
>> BrowserMatchNoCase Baiduspider botblock
>> BrowserMatchNoCase Semrushbot botblock
>> BrowserMatchNoCase Ahrefsbot botblock
>> Order Deny,Allow
>> Deny from env=botblock
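>> 
>> If the server is Apache 2.4, Order/Deny,Allow needs mod_access_compat;
>> a sketch of the native 2.4 equivalent (untested here) would be:
>> 
>> BrowserMatchNoCase Baiduspider botblock
>> BrowserMatchNoCase Semrushbot botblock
>> BrowserMatchNoCase Ahrefsbot botblock
>> <RequireAll>
>>     Require all granted
>>     Require not env botblock
>> </RequireAll>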
>> 
>>  Should block those specific bots, while allowing others to use http. It
>> could take a few weeks before the bots realise you've made a change to
>> your robots.txt.
>> 
>>  Cheers,
>>  davek
>> 
>> 
>> 
>>  In the year 2017, of the month of April, on the 26th day, Carsten
>> Haitzler wrote:
>>> I've had to disable the whole http support for now for 
>>> git.enlightenment.org
>>> because several bots are crawling it causing our VM to basically be 
>>> loaded with
>>> 10-20 cgit cgi's running git queries for history etc. continually. 
>>> I/O and
>>> system load is going through the roof as a result and causing other 
>>> stuff like
>>> phab to crawl and begin timing out.
>>> 
>>> So anyone using HTTP for doing cmdline git stuff is, at this moment, 
>>> going to
>>> find things not working. SSH and GIT protocol should still work. I'll 
>>> keep this
>>> shut down for a few hours hoping the bots give up.
>>> 
>>> I added a robots.txt and edited the cgitrc to deny all bots from
>>> indexing git.enlightenment.org - but the bots seem to be ignoring that
>>> now that they have decided to start indexing.
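>>> 
>>> (For reference, the deny-all robots.txt, assuming it is served from
>>> the site root, is just:
>>> 
>>>    User-agent: *
>>>    Disallow: /
>>> 
>>> though, as above, the bots are free to ignore it.)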
>>> 
>>> I am wondering if this has been the cause of our issues - being 
>>> overloaded by
>>> indexer bots. FYI I counted 3 different bots indexing cgit at the 
>>> same time:
>>> Baiduspider, Semrushbot, Ahrefsbot.
>>> 
>>> I hope later they will start listening to robots.txt, but for now I 
>>> need to
>>> keep things off until the bots give up.
>>> 
>>> --
>>> ------------- Codito, ergo sum - "I code, therefore I am" 
>>> --------------
>>> The Rasterman (Carsten Haitzler)    ras...@rasterman.com
>>> 
>>> 
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>>> _______________________________________________
>>> enlightenment-devel mailing list
>>> enlightenment-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel

-- 
Bertrand
