On Tuesday 12 November 2019 11:01:08 Lee wrote:
> On 11/11/19, Gene Heskett <ghesk...@shentel.net> wrote:
> > On Monday 11 November 2019 08:33:13 Greg Wooledge wrote:
> > ... snip ...
>
> >> I *know* I told you to look at your log files, and to turn on
> >> user-agent logging if necessary.
> >>
> >> I don't remember seeing you ever *post* your log files here, not
> >> even a single line from a single instance of this bot. Maybe I
> >> missed it.
> >
> > Only one log file seems to have useful data, the "other..." file,
> > and I have posted several single lines here, but here's a few more:
> > ... snip ...
> >
> > [11/Nov/2019:12:11:39 -0500] "GET
> > /gene/nitros9/level1/coco1_6309/bootfiles/bootfile_covga_cocosdc
> > HTTP/1.1" 200 16133 "-" "Mozilla/5.0 (compatible; Daum/4.1;
> > +http://cs.daum.net/faq/15/4118.html?faqId=28966)"
> >
> > I did ask earlier if daum was a bot but no one answered. They are
> > becoming a mite pesky.
>
> Google translate can be your friend:
> https://translate.google.com/translate?hl=&sl=ko&tl=en&u=https%3A%2F%2Fcs.daum.net%2Ffaq%2F15%2F4118.html
>
> Note they even tell you how to turn off collection:
>   I want to automatically exclude documents from my site from web
>   document search results.
>   [robots.txt Exclusion using file]
>   Please write the following in Notepad, and save it as robots.txt file
>   to the root directory.
>
>   User-agent: DAUM
>   Disallow: /
>
>   Using * instead of DAUM can prevent web collection robots from
>   collecting documents on all search services, not just Daum.
>
> So let's take a look at what you've got:
> $ curl http://geneslinuxbox.net:6309/robots.txt
> # $Id: robots.txt 410967 2009-08-06 19:44:54Z oden $
> # $HeadURL: svn+ssh://svn.mandriva.com/svn/packages/cooker/apache-conf/current/SOURCES/robots.txt $
> # exclude help system from robots
>
> User-agent: googlebot-Image
> Disallow: /
>
> User-agent: googlebot
> Disallow: /
>
> User-agent: *
> Disallow: /manual/
>
> User-agent: *
> Disallow: /manual-2.2/
>
> User-agent: *
> Disallow: /addon-modules/
>
> User-0agent: *
> Disallow: /doc/
>
> User-agent: *
> Disallow: /images/
>
> # the next line is a spam bot trap, for grepping the logs. you should
> _really_ change this to something else...
> #Disallow: /all_our_e-mail_addresses
> # same idea here...
>
> User-agent: *
> Disallow: /admin/
>
> # but allow htdig to index our doc-tree
> # User-agent: htdig
> # Disallow:
>
> User-agent: *
> Disallow: stress test
>
> User-agent: stress-agent
> Disallow: /
>
> User-agent *
> Disallow: /
>
> $
>
> You're missing a ':' - it should be
> User-agent: *
> Disallow: /
>
> and I don't think "User-0agent: *" is going to do what you want..
>
> Regards,
> Lee

It didn't. So I had been adding iptables rules, but I had to reboot
this morning to get a baseline cups start, only to find my iptables
rules were all gone and the bots are DDoSing me again. Grrrrrrr
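For next time, a minimal sketch of keeping the rules across reboots,
assuming Debian's iptables-persistent package (the package name and
path below are the Debian defaults; the netblock is the RFC 5737
documentation range, a stand-in for whatever bot you're dropping, not
any bot's real address space):

$ sudo apt-get install iptables-persistent
$ # drop an offending netblock (192.0.2.0/24 is only a placeholder)
$ sudo iptables -A INPUT -s 192.0.2.0/24 -j DROP
$ # save the live ruleset where netfilter-persistent reloads it at boot
$ sudo sh -c 'iptables-save > /etc/iptables/rules.v4'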
So I have to find all that in the history and re-invent a 33-line
filter DROP. I'll be back when I've stuck a hot tater in semrush's
exit port. (The politer robots.txt route is sketched in the P.S.
below.)

Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
If we desire respect for the law, we must first make the law
respectable. - Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/gene>
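P.S. The politer knob to try first: semrush's crawler identifies
itself as "SemrushBot" and is documented to obey robots.txt, so a
stanza like the Daum one Lee quoted should work, assuming the bot
really does fetch and honor it:

User-agent: SemrushBot
Disallow: /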