No DNS. I suspected SSH was causing it. After several brute force
attempts I would see an error in the log, something like "expected 50
got 5". The only thing I can find on that has something to do with keys,
and I do have keys installed for SCP'ing backup files. Anyway, it would
take a couple of hours after that error showed up for things to get
really bad. If I let it go for 10-12 hours, all locally generated ICMP
would eventually show 80% packet loss.
This is one of the few routers I have with 5.26. It was deployed shortly
after 5.26 became available last year, when v6 wasn't all that stable
yet. I
have since set the SSH service to allow only my NOC management subnet
and it has been running fine for days now. So whatever/whoever was
attacking the SSH server is now completely blocked. I have no doubt
whatever malformed request they were sending was causing it. I really
think the SSH "fix" in 5.26 has a memory leak. Like I said, I could ping
it remotely just fine, with no loss or out-of-order packets, so I don't
think the problem reached the kernel; it seemed to affect only local
user-space processes. Again, memory leak. Good job, MikroTik! I'm
guessing they
have no more interest in v5 either.
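
For anyone wanting to do the same, limiting the SSH service to a
management subnet can be sketched roughly as below. The subnet
10.10.0.0/24 is only a placeholder, not the actual NOC range.

```routeros
# Assumption: 10.10.0.0/24 stands in for the NOC management subnet.
# Only sources inside this range will be allowed to reach the SSH service.
/ip service set ssh address=10.10.0.0/24

# Check the "address" column to confirm the restriction is in place.
/ip service print
```

This restricts the service itself, so it applies regardless of what the
firewall filter rules are doing.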

On 11/15/2014 8:19 AM, Nicholas Eastman via Af wrote:

We use 5.25 and 5.26 on most of our routers. The main issues we've
seen are SSH brute force and DNS relay. We have a central DNS server
in our NOC that we send everyone to, so we disabled "Allow remote
requests." If you do use the routers for DNS at the site, a firewall
rule could easily do the same job, so that they are not being hit from
outside. As for the rest, we use an address list and firewall rules to
block access to the router's configuration interfaces except from our
office or local management IPs.
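
A rough sketch of both measures, for illustration only. The interface
name ether1 (assumed to be the upstream port), the list name "mgmt", and
both subnets are placeholders, and the port list assumes the default
service ports.

```routeros
# Option 1: stop answering DNS queries from clients entirely
# (the "Allow remote requests" checkbox in Winbox).
/ip dns set allow-remote-requests=no

# Option 2: keep DNS for the site, but drop queries arriving from outside.
# Assumption: ether1 is the upstream (WAN) interface.
/ip firewall filter
add chain=input in-interface=ether1 protocol=udp dst-port=53 action=drop comment="drop outside DNS (udp)"
add chain=input in-interface=ether1 protocol=tcp dst-port=53 action=drop comment="drop outside DNS (tcp)"

# Management access: list the allowed sources (placeholder subnets).
/ip firewall address-list
add list=mgmt address=192.0.2.0/24 comment="office (placeholder)"
add list=mgmt address=10.10.0.0/24 comment="local management (placeholder)"

# Drop FTP, SSH, Telnet, WWW, Winbox and API (default ports) from
# any source that is not in the mgmt list.
/ip firewall filter
add chain=input protocol=tcp dst-port=21,22,23,80,8291,8728 src-address-list=!mgmt action=drop comment="config interfaces from mgmt only"
```

Rule order matters: these drops need to sit above any broader accept
rules in the input chain.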

As for the ICMP packets being out of order, you might try something
like Greg Sowell's implementation of a ping brute force block. We
don't employ it on site routers right now, but I have seen it catch
some IPs on some customer setups we have done. It is part of his
"Border Router Firewall Script" example, which can be found here:
http://gregsowell.com/?p=4013

On Nov 10, 2014 7:05 PM, "George Skorup (Cyber Broadcasting) via Af"
<af@afmug.com> wrote:

I've got an RB1100AH running 5.26. Something has been happening
every day for about the past week and it gets all screwy. I've
confirmed there are no site temperature or power issues. Here's
what happens in the screwy state: I can ping it and it responds
fine. I can log into Winbox or the CLI and try to ping anything,
even local same-subnet stuff, and I get a bunch of packet loss.
SNMP responses are hit or miss as well. I did a packet capture and
it shows the ICMP packets all out of order. Reboot it and
everything works fine again, until next time. The only thing I
haven't tried yet is pinging 127.0.0.1 to see if the same packet
loss happens.

I see a bunch of SSH brute force attempts, but I'm using the brute
force protection firewall scripts to add sequential attempts to an
address list to stop them. And that works fine. But I'm wondering,
since 5.26 is the "ssh - fixed denial of service;" version, did
this "fix" break something else? I don't see this on any other
routers running 5.25, RB1100s and 493s. This is a remote router,
so I do not want to try downgrading to 5.25 or upgrading to v6
without someone there. And if I'm going to send someone there, I'm
probably better off replacing it, but then I'll never know WTF is
causing this.
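
For readers unfamiliar with the brute force protection scripts mentioned
above, the commonly circulated staged address-list approach looks
roughly like this. It is a sketch and not necessarily the exact rules in
use on this router; the list names and timeouts are assumptions.

```routeros
/ip firewall filter
# Drop anything already blacklisted before it reaches the SSH service.
add chain=input protocol=tcp dst-port=22 src-address-list=ssh_blacklist action=drop comment="drop ssh brute forcers"

# Each new connection within a minute moves the source up one stage.
# Stages are checked highest first so a single connection advances
# only one step per pass through the rules.
add chain=input protocol=tcp dst-port=22 connection-state=new src-address-list=ssh_stage3 action=add-src-to-address-list address-list=ssh_blacklist address-list-timeout=10d
add chain=input protocol=tcp dst-port=22 connection-state=new src-address-list=ssh_stage2 action=add-src-to-address-list address-list=ssh_stage3 address-list-timeout=1m
add chain=input protocol=tcp dst-port=22 connection-state=new src-address-list=ssh_stage1 action=add-src-to-address-list address-list=ssh_stage2 address-list-timeout=1m
add chain=input protocol=tcp dst-port=22 connection-state=new action=add-src-to-address-list address-list=ssh_stage1 address-list-timeout=1m
```

Note that the first few attempts from each source still reach the SSH
daemon before the blacklist kicks in, so this limits brute forcing but
does not shield the service from a single malformed request.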