On Tue, 25 Apr 2017 22:58:14 -0700 Bertrand Jacquin <[email protected]> said:

sounds good. one day they'll find the robots.txt ... i hope. but not today.
this will indeed be good for now. turning git.e.org off entirely was an
emergency solution until i either could see these time out (they didn't after a
few hrs) or a better solution like below was put in place. :)
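
for what it's worth, the deny-all robots.txt can be sanity-checked offline. a
minimal sketch, assuming the file holds just the standard deny-all pair (the
real contents are not shown in this thread) and using python's stock
urllib.robotparser; the helper name allowed() is made up for illustration:

```python
from urllib.robotparser import RobotFileParser

# Assumed contents: the thread only says robots.txt denies all bots,
# so this standard deny-all pair is an illustration, not the real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler may fetch url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The three bots counted in the original report.
    for bot in ("Baiduspider", "Semrushbot", "Ahrefsbot"):
        print(bot, allowed(bot, "https://git.enlightenment.org/"))
```

this only says what a well-behaved crawler should do. it does nothing against
bots that ignore the file, which is why the block on the load balancer matters.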

phabricator works oh so much better now! like i can actually use arc to apply
patches/diffs as opposed to getting 503s every time...

> Good spot raster.
> 
> I've blocked the following User-Agents on the frontend load balancer, so
> no other VM/website can be reached by those bit suckers.
> 
>    acl u-robots-bad hdr(User-Agent) -i Ahrefsbot
>    acl u-robots-bad hdr(User-Agent) -i Baiduspider
>    acl u-robots-bad hdr(User-Agent) -i Cliqzbot
>    acl u-robots-bad hdr(User-Agent) -i DotBot
>    acl u-robots-bad hdr(User-Agent) -i MJ12bot
>    acl u-robots-bad hdr(User-Agent) -i Semrushbot
>    acl u-robots-bad hdr(User-Agent) -i YandexBot
> 
> https://git.enlightenment.org/ is back online.
> 
> Cheers,
> Bertrand
> 
> On 25/04/2017 22:40, Bertrand Jacquin wrote:
> > Hey,
> > 
> > Taking a look
> > 
> > Cheers
> > 
> > On 25/04/2017 19:51, Dave wrote:
> >> Try the following apache config in your directory directive, or
> >> .htaccess file:
> >> 
> >> BrowserMatchNoCase Baiduspider botblock
> >> BrowserMatchNoCase Semrushbot botblock
> >> BrowserMatchNoCase Ahrefsbot botblock
> >> Order Deny,Allow
> >> Deny from env=botblock
> >> 
> >>  Should block those specific bots, while allowing others to use http.
> >> It could take a few weeks before the bots realise you've made a change
> >> to your robots.txt.
> >> 
> >>  Cheers,
> >>  davek
> >> 
> >> 
> >> 
> >>  In the year 2017, of the month of April, on the 26th day, Carsten
> >> Haitzler wrote:
> >>> I've had to disable the whole http support for now for
> >>> git.enlightenment.org because several bots are crawling it causing our
> >>> VM to basically be loaded with 10-20 cgit cgi's running git queries for
> >>> history etc. continually. I/O and system load is going through the roof
> >>> as a result and causing other stuff like phab to crawl and begin timing
> >>> out.
> >>> 
> >>> So anyone using HTTP for doing cmdline git stuff is, at this moment,
> >>> going to find things not working. SSH and GIT protocol should still
> >>> work. I'll keep this shut down for a few hours hoping the bots give up.
> >>> 
> >>> I added a robots.txt and edited the cgitrc to deny all bots from
> >>> indexing git.enlightenment.org - but the bots seem to be ignoring that
> >>> now that they have decided to start indexing.
> >>> 
> >>> I am wondering if this has been the cause of our issues - being
> >>> overloaded by indexer bots. FYI I counted 3 different bots indexing cgit
> >>> at the same time: Baiduspider, Semrushbot, Ahrefsbot.
> >>> 
> >>> I hope later they will start listening to robots.txt, but for now I
> >>> need to keep things off until the bots give up.
> >>> 
> >>> --
> >>> ------------- Codito, ergo sum - "I code, therefore I am" --------------
> >>> The Rasterman (Carsten Haitzler)    [email protected]
> >>> 
> >>> 
> >>> ------------------------------------------------------------------------------
> >>> Check out the vibrant tech community on one of the world's most
> >>> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> >>> _______________________________________________
> >>> enlightenment-devel mailing list
> >>> [email protected]
> >>> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
> 
> -- 
> Bertrand
> 
> 
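
For reference, the intent of the User-Agent rules quoted above (both the
haproxy ACLs and the Apache BrowserMatchNoCase lines) can be sketched as a
case-insensitive substring test. This is only an illustration under
assumptions: BLOCKED_BOTS and is_blocked() are hypothetical names, and it is
not the load balancer's actual logic. One caveat worth checking against the
haproxy docs: as far as I know, plain hdr() compares the whole header value,
so for real User-Agent strings a substring match such as hdr_sub() may be
what is wanted.

```python
# Bot names taken from the ACL list quoted above.
BLOCKED_BOTS = (
    "Ahrefsbot",
    "Baiduspider",
    "Cliqzbot",
    "DotBot",
    "MJ12bot",
    "Semrushbot",
    "YandexBot",
)


def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent contains a blocked bot name (any case)."""
    ua = user_agent.lower()
    return any(bot.lower() in ua for bot in BLOCKED_BOTS)


if __name__ == "__main__":
    # Sample User-Agent strings, made up for illustration.
    samples = [
        "Mozilla/5.0 (compatible; AhrefsBot/5.2; +http://ahrefs.com/robot/)",
        "git/2.11.0",
    ]
    for ua in samples:
        print("deny " if is_blocked(ua) else "allow", ua)
```

Real crawler User-Agent strings embed the bot name inside a longer value,
which is why the sketch tests for containment rather than equality.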


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    [email protected]

