Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-17 Thread Steven Kenney via AF
I've been considering moving to iBGP myself.  We have several hundred routes
as well; however, a couple of multi-link hops use multiple backhauls as
failovers, and those sometimes cause state changes, which in turn sometimes
cause issues with OSPF.  Most towers are fed from fiber, though.  The
issues are the bad links, as you said.

On Fri, Apr 14, 2023 at 2:18 PM castarritt  wrote:

> We are up to over 700 in a single area and it is starting to take more
> than just a few seconds for it to reconverge after a change, thinking it
> might be time to change things.
>
> On Fri, Apr 14, 2023 at 10:17 AM Dennis Burgess 
> wrote:
>
>> We have customers with a bit over 200 in one area.  It's really not about
>> how many devices you have; it depends on how many state changes you
>> normally have.  That network is VERY stable, with a lot of fiber, so it
>> works well.
>>
>>
>>
>> *From:* AF  *On Behalf Of * castarritt
>> *Sent:* Thursday, April 13, 2023 9:38 AM
>> *To:* AnimalFarm Microwave Users Group 
>> *Subject:* Re: [AFMUG] Tik 1072 watchdog reboot bug
>>
>>
>>
>> How many routes are you running over OSPF?
>>
>>
>>
>> On Thu, Apr 13, 2023 at 9:34 AM Steven Kenney via AF 
>> wrote:
>>
>> Try pushing a total of 20+Gbps, and probably more if you combine all the
>> ports.  I'm not talking about running normally. I've run OSPF without an
>> update for up to a year without a crash.  I'm talking about when you need
>> to make major changes to the structure of your area, add or remove p2p
>> connections, etc.  It tends not to like that, at least on ROS6 and the 1072.
>>
>>
>>
>> On Thu, Apr 13, 2023 at 12:00 AM Dennis Burgess 
>> wrote:
>>
>> We have had ospf on routers running 5+ gig of traffic with uptimes of
>> over 200 days without issues. I can name a few customers that had or have
>> those.  Just a FYI.
>>
>>
>>
>>
>>
>>
>>
>> *Dennis Burgess, Mikrotik Certified Trainer MTCNA, MTCRE, MTCWE, MTCTCE,
>> MTCINE, MTCSE, HE IPv6 Sage, Cambium ePMP Certified *
>>
>> Author of "Learn RouterOS- Second Edition”
>>
>> *Link Technologies, Inc* -- Mikrotik & WISP Support Services
>>
>> *Office*: 314-735-0270  Website: http://www.linktechs.net
>>
>> Need to Automate MikroTik Backups:  https://cloud.linktechs.net
>>
>> Create Wireless Coverage’s with www.towercoverage.com
>>
>>
>>
>> *From:* AF  *On Behalf Of *Steven Kenney via AF
>> *Sent:* Wednesday, April 12, 2023 1:18 PM
>> *To:* AnimalFarm Microwave Users Group 
>> *Cc:* Steven Kenney 
>> *Subject:* Re: [AFMUG] Tik 1072 watchdog reboot bug
>>
>>
>>
>> OSPF will also kill the system and force a watchdog reboot.  If I remove
>> a long-standing link between routers, sure enough the router will reboot
>> itself a couple of days later.  With anything OSPF, when it comes to
>> removing existing rules (if you have enough going on), it will die.
>>
>>
>>
>> On Wed, Apr 12, 2023 at 1:05 PM Josh Luthman 
>> wrote:
>>
>> Then why did mine have a kernel panic when there is no connection
>> tracking?  Why is it solved with significantly more traffic and only
>> changing the firewall?
>>
>>
>>
>> On Wed, Apr 12, 2023 at 11:46 AM Trey Scarborough  wrote:
>>
>> It's a known hardware issue with connection tracking enabled and hardware
>> offload. It has a hard limit on the number of connections it supports, and
>> that limit is pretty low. It's high enough that you won't notice until you
>> get significant traffic, but low enough that it is a common issue. The fix
>> is to turn off connection tracking. I know this isn't the best solution,
>> but it's the only one that works. This and the hardware availability of the
>> processor are the reasons they are discontinued. The good news is that
>> moving over to the newer generation seems to resolve this, but it comes
>> with a handful of version 7 quirks.
>>
>> On 4/11/2023 5:55 PM, Alex Kessler wrote:
>>
>> Been experiencing this bug for years while running NAT and connection
>> tracking.  Rebooting every few months while running v6 latest.  Does v7
>> have any known fixes to resolve these watchdog reboots?
>>
>>
>>
>>
>>
>>
>> ---
>>
>>
>>
>>
>> From: "Colin Stanners" < cstanners at gmail.com >
>> To: "af" < af at af.afmug.com >
>> Sent: Monday, December 21, 2020 12:59:09 AM
>

Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-17 Thread Dennis Burgess
Yeah, 700 is a bit much.  It really depends on how many changes you have at one
time.  I have multiple books on OSPF; some state 80 is too many, others state
150, and so on, so there is no definitive rule on it.  It really depends on how
quickly you want it to converge.  Keep in mind, there is no issue with having
MULTIPLE backbone areas; you just need to segment it accordingly and then put
static routes between them.
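
Since the point above is that state changes matter more than device count, here is a minimal Python sketch of how one might count adjacency state changes per neighbor over a time window to spot the noisy links. The event tuples, router IDs, and threshold are all made up for illustration; this is not a RouterOS log format, and real data would have to be parsed into this shape first.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical event records: (timestamp, neighbor router ID, new state).
# The field layout is an assumption for illustration, not a RouterOS log format.
EVENTS = [
    ("2023-04-14 10:00:05", "10.0.0.7", "Down"),
    ("2023-04-14 10:00:40", "10.0.0.7", "Full"),
    ("2023-04-14 10:12:11", "10.0.0.7", "Down"),
    ("2023-04-14 10:12:50", "10.0.0.7", "Full"),
    ("2023-04-14 11:30:00", "10.0.0.9", "Down"),
]


def count_state_changes(events, start, window):
    """Count adjacency state changes per neighbor within [start, start + window)."""
    end = start + window
    counts = Counter()
    for stamp, neighbor, _state in events:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        if start <= when < end:
            counts[neighbor] += 1
    return counts


def noisy_neighbors(counts, threshold):
    """Return neighbors whose change count meets or exceeds the threshold, sorted."""
    return sorted(n for n, c in counts.items() if c >= threshold)


if __name__ == "__main__":
    start = datetime(2023, 4, 14, 10, 0, 0)
    counts = count_state_changes(EVENTS, start, timedelta(hours=1))
    print(dict(counts))
    print(noisy_neighbors(counts, threshold=4))
```

A neighbor that keeps showing up in that list is a candidate for fixing the link or moving it behind an area boundary, whatever the total router count is.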


Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-14 Thread castarritt
We are up to over 700 in a single area and it is starting to take more than
just a few seconds for it to reconverge after a change, thinking it might
be time to change things.


Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-14 Thread Dennis Burgess
We have customers with a bit over 200 in one area.  It's really not about how
many devices you have; it depends on how many state changes you normally have.
That network is VERY stable, with a lot of fiber, so it works well.


Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-13 Thread castarritt
How many routes are you running over OSPF?


Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-13 Thread Steven Kenney via AF
Try pushing a total of 20+Gbps, and probably more if you combine all the
ports.  I'm not talking about running normally. I've run OSPF without an
update for up to a year without a crash.  I'm talking about when you need
to make major changes to the structure of your area, add or remove p2p
connections, etc.  It tends not to like that, at least on ROS6 and the 1072.


Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread TJ Trout
I think an open firewall can cause reboots, but it will still reboot with
connection tracking enabled. Try going to v7 (7.8); you can always go back
later.

On Wed, Apr 12, 2023, 12:37 PM Josh Luthman 
wrote:

> Dropping ***ALL*** input except what you need to access the router from
> good hosts.
>
> It doesn't matter if the service is listening or not, firewall it.  It
> doesn't matter if you've restricted IPs on the service, firewall it.
>
> On Wed, Apr 12, 2023 at 2:30 PM Alex Kessler 
> wrote:
>
>> What needs changed with the firewall?
>>
>>
>> On 4/12/2023 9:27 AM, Josh Luthman wrote:
>>
>> Input firewall seems to be the right answer.  Not updating.
>>
>> On Tue, Apr 11, 2023 at 6:59 PM Alex Kessler 
>> wrote:
>>
>>> Been experiencing this bug for years while running NAT and connection
>>> tracking.  Rebooting every few months while running v6 latest.  Does v7
>>> have any known fixes to resolve these watchdog reboots?
>>>
>>>
>>>
>>>
>>>
>>>
>>> ---
>>>
>>>
>>>
>>>
>>> From: "Colin Stanners" < cstanners at gmail.com >
>>> To: "af" < af at af.afmug.com >
>>> Sent: Monday, December 21, 2020 12:59:09 AM
>>> Subject: Re: [AFMUG] Mikrotik 1072 Frustrations
>>>
>>> This last year, I've seen a MikroTik CCR1072 switch from long being
>>> rock-solid to now having occasional random reboots (from watchdog) or 100%
>>> CPU usage, which strangles the BGP process. In the latter case,
>>> tools->profile would show the firewall taking 100% of CPU, even after
>>> temporarily disabling all firewall filter and NAT rules and connection
>>> tracking. Not fun.
>>>
>>> MT tech support did not seem super helpful or interested, mostly
>>> recommending to disable watchdog (unacceptable on a production router) or
>>> to upgrade firmware (without specifying the suspected cause of the problem
>>> or nature of the fix).
>>>
>>> Tried 1 update, that didn't seem to help, have now tried another...
>>>
>>> On Sun, Dec 20, 2020, 11:38 PM Steven Kenney < steve at wavedirect.org
>>> > wrote:
>>> MikroTik has been rock solid for me for years, until this year and the
>>> 1072s. Random reboots set off by the watchdog timer on all of my 1072s, some
>>> more than others. Threads in the forum all discuss exactly the same
>>> problem. It's a connection tracking issue; however, I need connection
>>> tracking on one particular router. I've adjusted everything I could.
>>> Firmware and board firmware are all up to date, etc. It happens randomly with
>>> low levels of traffic and high levels of traffic, sometimes a couple of times
>>> a day, sometimes after weeks. No DDoS evidence at all from upstream routers.
>>> Configs checked and rechecked by third-party experts. I graph everything
>>> about the MikroTik and there are no clues or anything abnormal happening
>>> before the crash. Plenty of memory, disk space, CPU, etc. Replaced all the
>>> trannies, power cables and such. Not running BGP, only OSPF, on the one
>>> that is giving me the most trouble.
>>>
>>> I even have a serial console cable running from them to my Opengear and
>>> set it to log pretty much everything to console, including the kernel, and
>>> nothing. A hard freeze.
>>>
>>> Then there is MikroTik support... I had never needed their support
>>> until now. So I put a ticket in, and the shitty attitude I'm getting from
>>> them makes it seem like they KNOW there is something wrong with the hardware
>>> and are intentionally not being helpful. It is pretty clear from
>>> all the people reporting this issue that there IS an issue.
>>>
>>> If this is any indication of how things are going to go with MikroTik on
>>> the newer hardware going forward, I think it's time to jump to an
>>> enterprise-level system, Juniper most likely. A shame, because they are just
>>> about keeping up with the demands with their hardware, getting closer to
>>> 100Gbps etc. and ROS7 ... but at their current pace I think we've outgrown them.
>>>
>>> All the threads discussing this issue have been absolutely quiet when it
>>> comes to MikroTik jumping in to comment or help troubleshoot. I
>>> think they know they had bad hardware out there and do not want to honor
>>> warranties. I've heard rumors of bad batches of 1072s.
>>>
>>> Anyone else encounter this?
>>>
>>>
>>> --
>>>
>>> *Alex*
>>> Alex Kessler / TECHNICAL OPERATIONS CENTER
>>> *O (Ohio)* 740.212.3773 / *O (All other markets)* 888.966.5690 / 145 
>>> Columbus
>>> Rd, Athens, OH 45701 / point-broadband.com
-- 
AF mailing list
AF@af.afmug.com
http://af.afmug.com/mailman/listinfo/af_af.afmug.com


Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Josh Luthman
Drop ***ALL*** input except what you need to access the router from
good hosts.

It doesn't matter if the service is listening or not, firewall it.  It
doesn't matter if you've restricted IPs on the service, firewall it.
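
A minimal sketch of what that could look like on RouterOS, offered as an
assumption rather than a config anyone in this thread posted. The address
list name "mgmt" and the 192.0.2.0/24 range are placeholders, and the OSPF
accept rule only belongs there if the router actually runs OSPF (BGP peers
would need their own accept rule):

```
# placeholder list of trusted management hosts; substitute your own range
/ip firewall address-list
add list=mgmt address=192.0.2.0/24 comment="placeholder: trusted management hosts"

/ip firewall filter
# allow management access only from the trusted list
add chain=input src-address-list=mgmt action=accept comment="management from good hosts"
# keep routing adjacencies alive (assumption: this router speaks OSPF)
add chain=input protocol=ospf action=accept comment="only if this router speaks OSPF"
# everything else addressed to the router itself is dropped
add chain=input action=drop comment="drop all other input"
```

Apply it from Safe Mode so a mistake does not lock you out. These rules are
stateless, so replies to traffic the router itself originates (DNS, NTP and
so on) also hit the final drop unless you add accepts for them.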

On Wed, Apr 12, 2023 at 2:30 PM Alex Kessler 
wrote:

> What needs changed with the firewall?
>
>
> On 4/12/2023 9:27 AM, Josh Luthman wrote:
>
> Input firewall seems to be the right answer.  Not updating.
>
> On Tue, Apr 11, 2023 at 6:59 PM Alex Kessler 
> wrote:
>
>> Been experiencing this bug for years while running NAT and connection
>> tracking.  Rebooting every few months while running v6 latest.  Does v7
>> have any known fixes to resolve these watchdog reboots?
>>
>> ---
>>
>> From: "Colin Stanners" < cstanners at gmail.com >
>> To: "af" < af at af.afmug.com >
>> Sent: Monday, December 21, 2020 12:59:09 AM
>> Subject: Re: [AFMUG] Mikrotik 1072 Frustrations
>>
>> This last year, I've seen a MikroTik CCR1072 switch from long being
>> rock-solid to now having occasional random reboots (from watchdog) or 100%
>> CPU usage, which strangles the BGP process. In the latter case,
>> tools->profile would show the firewall taking 100% of CPU, even after
>> temporarily disabling all firewall filter and NAT rules and connection
>> tracking. Not fun.
>>
>> MT tech support did not seem super helpful or interested, mostly
>> recommending to disable watchdog (unacceptable on a production router) or
>> to upgrade firmware (without specifying the suspected cause of the problem
>> or nature of the fix).
>>
>> Tried 1 update, that didn't seem to help, have now tried another...
>>
>> On Sun, Dec 20, 2020, 11:38 PM Steven Kenney < steve at wavedirect.org >
>> wrote:
>> Mikrotik has been rock solid for me for years. Until this year and the
>> 1072's. Random reboots set off by watchdog timer on all of my 1072's. Some
>> more than others. Threads in the forum all discuss the same problem
>> exactly. It's a connection tracking issue... however I need connection
>> tracking on one particular router. I've adjusted everything I could.
>> Firmware and board firmware all up to date etc. Happens randomly with low
>> levels of traffic, high levels of traffic, sometimes a couple times a day,
>> sometimes weeks. No DDOS evidence at all from upstream routers. Configs
>> checked and rechecked by third party experts. I graph everything about the
>> Mikrotik and there are no clues or anything abnormal happening before the
>> crash. Plenty of memory, disk space, CPU etc. Replaced all the trannies,
>> power cables and such. Not running BGP, only OSPF, on the one that is giving
>> me the most trouble.
>>
>> Even have a serial console cable plugged into them to my opengear and set
>> it to log pretty much everything to console including the kernel and
>> nothing. A hard freeze.
>>
>> Then there is Mikrotik support... I've never needed their support before
>> until now. So I put a ticket in and the shitty attitude I'm getting from
>> them seems like they KNOW there is something wrong with the hardware and
>> they are intentionally not being helpful. It is pretty clear to see with
>> all the people reporting this issue that there IS an issue.
>>
>> If this is any indication of how things are going to go with Mikrotik on
>> the newer hardware going forward I think it's time to jump to an enterprise
>> level system. Juniper most likely. Shame because they are just about
>> keeping up with the demands with their hardware. Getting closer to 100Gbps
>> etc and ROS7 ... but at their current pace I think we've outgrown them.
>>
>> All the threads discussing this issue have been absolutely quiet when it
>> comes to Mikrotik jumping in to mention or try to help troubleshoot. I
>> think they know they had bad hardware out there and do not want to honor
>> warranties. I've heard rumors of bad batches of 1072's.
>>
>> Anyone else encounter this?
>>
>>
>> --
>>
>> *Alex*
>> Alex Kessler / TECHNICAL OPERATIONS CENTER
>> *O (Ohio)* 740.212.3773 / *O (All other markets)* 888.966.5690 / 145 Columbus
>> Rd, Athens, OH 45701 / point-broadband.com


Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Sterling Jacobson
That’s a nice setup with the MX204 and A10.

Since we upgraded to 20+ CCR2004 units running v7 and 5 CCR2216 units, we
have had no issues with our edge, core and access network running CGNAT.
We separate roles onto separate hardware: each major site has local CGNAT on
one 2004 for access and one 2004 core for OSPF/MPLS. All of those sites
aggregate to two core CCR2216 units running OSPF and iBGP, and three CCR2216
units handle edge BGP to providers, taking full tables and running iBGP
between each other and the core.

No problems, doing over 10Gbps on 25Gbps interfaces (Edge is 100Gbps 
interfaces).

Been running for a few months so far, with no lockups or problems on any of
the units.

I’m pretty happy with it, costs a lot less than Juniper and A10, but it is what 
it is.



From: AF  On Behalf Of Christopher Tyler
Sent: Wednesday, April 12, 2023 12:25 PM
To: AnimalFarm Microwave Users Group 
Subject: Re: [AFMUG] Tik 1072 watchdog reboot bug

We had this same issue. Replacing it with a CCR2216 didn't fix the problem. We 
ended up going to Juniper MX204's and A10 CGNAT boxes for NAT.




From: AF on behalf of Josh Luthman
Sent: Wednesday, April 12, 2023 11:41 AM
To: AnimalFarm Microwave Users Group
Subject: Re: [AFMUG] Tik 1072 watchdog reboot bug

Then why did mine have a kernel panic when there is no connection tracking?  
Why is it solved with significantly more traffic and only changing the firewall?

On Wed, Apr 12, 2023 at 11:46 AM Trey Scarborough wrote:

It's a known hardware issue with connection tracking enabled and hardware
offload. It has a hard limit on the number of connections it supports, and
that limit is pretty low. It's high enough that you won't notice until you get
significant traffic, but low enough that it is a common issue. The fix is to
turn off connection tracking. I know this isn't the best solution, but it's
the only one that works. This and the hardware availability of the processor
are the reason they are discontinued. The good news is that moving over to
the newer generation seems to resolve this, but it comes with a handful of
version 7 quirks.

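
For anyone who wants to check this against their own router, here is a short
sketch using standard RouterOS commands (nothing here was posted in the
thread). The print output includes total-entries and max-entries, which
shows how close the table is to its limit. Whether that limit is what
triggers the watchdog is the claim made above, not something these commands
confirm:

```
# show connection tracking settings, including total-entries and max-entries
/ip firewall connection tracking print

# count the connections currently being tracked
/ip firewall connection print count-only

# the workaround described above: turn tracking off entirely
/ip firewall connection tracking set enabled=no
```

Disabling tracking also disables NAT and any firewall rule that matches on
connection state, so it is not an option on a router that has to do NAT.
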
On 4/11/2023 5:55 PM, Alex Kessler wrote:

Been experiencing this bug for years while running NAT and connection tracking. 
 Rebooting every few months while running v6 latest.  Does v7 have any known 
fixes to resolve these watchdog reboots?






Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Christopher Tyler
We had this same issue. Replacing it with a CCR2216 didn't fix the problem. We 
ended up going to Juniper MX204's and A10 CGNAT boxes for NAT.



Christopher Tyler
Senior Network Engineer
Total Highspeed Internet Solutions
+1 417-851-1107 ext 9002
ch...@totalhighspeed.net

This institution is an equal opportunity provider and employer. Esta 
institución es un proveedor de servicios con igualdad de oportunidades.

From: AF on behalf of Josh Luthman
Sent: Wednesday, April 12, 2023 11:41 AM
To: AnimalFarm Microwave Users Group 
Subject: Re: [AFMUG] Tik 1072 watchdog reboot bug

Then why did mine have a kernel panic when there is no connection tracking?  
Why is it solved with significantly more traffic and only changing the firewall?


Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Steven Kenney via AF
OSPF will also kill the system and force a watchdog reboot.  If I remove a
long-standing link between routers, sure enough the router will reboot itself
a couple of days later.  With anything OSPF, when it comes to removing existing
rules (if you have enough going on), it will die.

On Wed, Apr 12, 2023 at 1:05 PM Josh Luthman 
wrote:

> Then why did mine have a kernel panic when there is no connection
> tracking?  Why is it solved with significantly more traffic and only
> changing the firewall?

Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Steven Kenney via AF
About 10Gbps seems to be the choke point.  Infuriating actually.

On Wed, Apr 12, 2023 at 12:13 PM Trey Scarborough  wrote:

> It's a known hardware issue with connection tracking enabled and hardware
> offload. It has a hard limit on the number of connections it supports, and
> that limit is pretty low. It's high enough that you won't notice until you get
> significant traffic, but low enough that it is a common issue. The fix is to
> turn off connection tracking. I know this isn't the best solution, but it's
> the only one that works. This and the hardware availability of the processor
> are the reason they are discontinued. The good news is that moving over to
> the newer generation seems to resolve this, but it comes with a handful of
> version 7 quirks.

Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Alex Kessler

What needs changed with the firewall?


On 4/12/2023 9:27 AM, Josh Luthman wrote:

Input firewall seems to be the right answer.  Not updating.

On Tue, Apr 11, 2023 at 6:59 PM Alex Kessler 
 wrote:


Been experiencing this bug for years while running NAT and
connection tracking.  Rebooting every few months while running v6
latest.  Does v7 have any known fixes to resolve these watchdog
reboots?








Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Josh Luthman
Then why did mine have a kernel panic when there is no connection
tracking?  Why is it solved with significantly more traffic and only
changing the firewall?

On Wed, Apr 12, 2023 at 11:46 AM Trey Scarborough  wrote:

> It's a known hardware issue with connection tracking enabled and hardware
> offload. It has a hard limit on the number of connections it supports, and
> that limit is pretty low. It's high enough that you won't notice until you get
> significant traffic, but low enough that it is a common issue. The fix is to
> turn off connection tracking. I know this isn't the best solution, but it's
> the only one that works. This and the hardware availability of the processor
> are the reason they are discontinued. The good news is that moving over to
> the newer generation seems to resolve this, but it comes with a handful of
> version 7 quirks.
> On 4/11/2023 5:55 PM, Alex Kessler wrote:
>
> Been experiencing this bug for years while running NAT and connection
> tracking.  Rebooting every few months while running v6 latest.  Does v7
> have any known fixes to resolve these watchdog reboots?
>
>
>
>
>
>
> ---
>
>
>
>
> From: "Colin Stanners" < cstanners at gmail.com >
> To: "af" < af at af.afmug.com >
> Sent: Monday, December 21, 2020 12:59:09 AM
> Subject: Re: [AFMUG] Mikrotik 1072 Frustrations
>
> This last year, I've seen a MikroTik CCR1072 switch from long being
> rock-solid to now having occasional random reboots (from watchdog) or 100%
> CPU usage, which strangles the BGP process. In the latter case,
> tools->profile would show the firewall taking 100% of CPU, even after
> temporarily disabling all firewall filter and NAT rules and connection
> tracking. Not fun.
>
> MT tech support did not seem super helpful or interested, mostly
> recommending to disable watchdog (unacceptable on a production router) or
> to upgrade firmware (without specifying the suspected cause of the problem
> or nature of the fix).
>
> Tried 1 update, that didn't seem to help, have now tried another...
>
> On Sun, Dec 20, 2020, 11:38 PM Steven Kenney < steve at wavedirect.org >
> wrote:
> MIkrotik has been rock solid for me for years. Until this year and the
> 1072's. Random reboots set off by watchdog timer on all of my 1072's. Some
> more than others. Threads in the forum all discuss the same problem
> exactly. Its a connection tracking issue.. however I need connection
> tracking on one particular router. I've adjusted everything I could.
> Firmware and board firmware all up to date etc. Happens randomly with low
> levels of traffic, high levels of traffic, sometimes a couple times a day,
> sometimes weeks. No DDOS evidence at all from upstream routers. Configs
> checked and rechecked by third party experts. I graph everything about the
> Mikrotik and there are no clues or anything abnormal happening before the
> crash. Plenty of memory, disk space, CPU etc. Replaces all the trannies,
> power cables and such. Not running BGP only OSPF on the one that is giving
> me the most trouble.
>
> Even have a serial console cable plugged into them to my opengear and set
> it to log pretty much everything to console including the kernel and
> nothing. A hard freeze.
>
> Then there is Mikrotik support... I've never needed their support before
> until now. So I put a ticket in and the shitty attitude I'm getting from
> them seems like they KNOW there is something wrong with the hardware and
> they are intentionally not being helpful. It is pretty clear to see with
> all the people reporting this issue that there IS an issue.
>
> If this is any indication of how things are going to go with Mikrotik on
> the newer hardware going forward, I think it's time to jump to an enterprise
> level system. Juniper most likely. Shame, because they are just about
> keeping up with the demands with their hardware. Getting closer to 100Gbps
> etc. and ROS7 ... but at their current pace I think we've outgrown them.
>
> All the threads discussing this issue have been absolutely quiet when it
> comes to Mikrotik jumping in to comment or try to help troubleshoot. I
> think they know they had bad hardware out there and do not want to honor
> warranties. I've heard rumors of bad batches of 1072s.
>
> Anyone else encounter this?
>
>
> --
>
> *Alex*
> Alex Kessler / TECHNICAL OPERATIONS CENTER
> *O (Ohio)* 740.212.3773 / *O (All other markets)* 888.966.5690 / 145 Columbus
> Rd, Athens, OH 45701 / point-broadband.com
>
-- 
AF mailing list
AF@af.afmug.com
http://af.afmug.com/mailman/listinfo/af_af.afmug.com


Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Sterling Jacobson
Exactly this below.

We used 1072 units as core Edge/BGP and OSPF/MPLS only, no connection tracking.

We upgraded from the 1072s to 2116 units running v7 on all edge units, and to 
2004 units running v7 on all core units and also all access units using 
connection tracking. The 2004 units are great little processors of traffic, 
when they are in stock.

From: AF  On Behalf Of Trey Scarborough
Sent: Wednesday, April 12, 2023 9:06 AM
To: af@af.afmug.com
Subject: Re: [AFMUG] Tik 1072 watchdog reboot bug


It's a known hardware issue with connection tracking enabled and hardware 
offload. It has a hard limit on the number of connections it supports, and that 
limit is pretty low. It's high enough that you won't notice until you get 
significant traffic, but low enough that it is a common issue. The fix is to 
turn off connection tracking. I know this isn't the best solution, but it's the 
only one that works. This and the hardware availability of the processor are 
the reasons they are discontinued. The good news is that moving over to the 
newer generation seems to resolve this, but it comes with a handful of version 
7 quirks.
On 4/11/2023 5:55 PM, Alex Kessler wrote:

Been experiencing this bug for years while running NAT and connection tracking. 
 Rebooting every few months while running v6 latest.  Does v7 have any known 
fixes to resolve these watchdog reboots?







Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Trey Scarborough
It's a known hardware issue with connection tracking enabled and hardware 
offload. It has a hard limit on the number of connections it supports, 
and that limit is pretty low. It's high enough that you won't notice 
until you get significant traffic, but low enough that it is a common 
issue. The fix is to turn off connection tracking. I know this isn't the 
best solution, but it's the only one that works. This and the hardware 
availability of the processor are the reasons they are discontinued. The 
good news is that moving over to the newer generation seems to resolve 
this, but it comes with a handful of version 7 quirks.
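
For anyone wanting to try that, here is a minimal sketch of the RouterOS 
commands involved. It is not from the original posts and assumes the standard 
RouterOS v6 menu paths. Check the output on your own router before changing 
anything, and note that NAT and connection-state firewall matchers depend on 
connection tracking, so they stop working once it is disabled.

```
# Show whether tracking is enabled and how full the table is
# (compare total-entries with max-entries in the output)
/ip firewall connection tracking print

# Count the connections currently being tracked
/ip firewall connection print count-only

# Turn connection tracking off entirely (this breaks NAT and stateful rules)
/ip firewall connection tracking set enabled=no
```

Setting enabled=auto or enabled=yes turns tracking back on if the router 
turns out to need it.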


On 4/11/2023 5:55 PM, Alex Kessler wrote:


Been experiencing this bug for years while running NAT and connection 
tracking.  Rebooting every few months while running v6 latest.  Does 
v7 have any known fixes to resolve these watchdog reboots?








Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Tyson Burris
I believe we have one in the data center that has this issue.
It’s just unacceptable, to be honest. Seems like the solution is to replace it, 
I guess.



Tyson Burris, President
Internet Communications Inc.
739 Commerce Dr.
Franklin, IN 46131

Office # 317-738-0320
Cell/Direct # 317-412-1540
Online: www.surfici.net


From: AF  On Behalf Of Josh Luthman
Sent: Wednesday, April 12, 2023 9:27 AM
To: AnimalFarm Microwave Users Group 
Subject: Re: [AFMUG] Tik 1072 watchdog reboot bug

Input firewall seems to be the right answer.  Not updating.

On Tue, Apr 11, 2023 at 6:59 PM Alex Kessler <akess...@intelliwave.com> wrote:

Been experiencing this bug for years while running NAT and connection tracking. 
 Rebooting every few months while running v6 latest.  Does v7 have any known 
fixes to resolve these watchdog reboots?







Re: [AFMUG] Tik 1072 watchdog reboot bug

2023-04-12 Thread Josh Luthman
Input firewall seems to be the right answer.  Not updating.

On Tue, Apr 11, 2023 at 6:59 PM Alex Kessler 
wrote:

> Been experiencing this bug for years while running NAT and connection
> tracking.  Rebooting every few months while running v6 latest.  Does v7
> have any known fixes to resolve these watchdog reboots?