Re: ipfw: switching sets does stall the machine

David Wolfskill Fri, 14 Jun 2019 10:21:45 -0700

On Fri, Jun 14, 2019 at 05:33:02PM +0200, Peter wrote:
> 
> Hi,
> I am trying to use two different configurations (production and test)
> loaded into different sets, and switch between them with
> 
>    # ipfw set disable ... enable ...
> 
> When testing my script, this did work, except once the machine went
> into "swap_pager indefinite wait" and was lost.


IIRC, this message means that a command was sent to a disk controller
and at least 20 seconds have elapsed with no response from that
controller.  That doesn't seem like an "ipfw" issue, per se.

> Then, after reboot (and automatically loading the production rules) I
> tried to load and switch to the test rules, and immediately got ATA
> COMMAND TIMEOUT and the machine was lost.

Again, that's a disk subsystem (apparently) doing Bad Things.

> I repeated this a few times, it is nicely reproducible: within 3-5
> seconds after the new rules are loaded, the machine locks up and is
> lost.

It's at least plausible that switching the rule sets merely provokes a
particular disk I/O pattern, and that pattern is what actually triggers
the failure.

> I analyzed more closely by running "top -HPS" in rtprio, and found
> this:
>  * loading the rules is no problem.
>  * when switching sets, the command returns, but then within a few
>    seconds the machine becomes unresponsive and stays so until the
>    watchdog hits.
>  * The last thing seen in "top" (before it freezes) is this thread
>    eating 85% CPU (and running with high priority):
>    [irq12: uhci0 uhci1]
> 
> 
> Is there a known workaround?
> ....

My inclination is for you to check the disk drive(s), cabling, and
controller(s) before much else.
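
As a rough starting point for that check, here is a minimal sketch that
counts kernel log lines hinting at disk or controller trouble.  The
function name count_disk_errors and the pattern list are my own
assumptions for illustration; the patterns are only based on the
messages quoted above and are certainly not exhaustive.

```shell
#!/bin/sh
# Count log lines (read from stdin) that look like disk/controller trouble.
# Hypothetical helper; the pattern list is an assumption, not a complete set.
count_disk_errors() {
    # -i: case-insensitive, so "ATA COMMAND TIMEOUT" matches "timeout".
    # -c: print only the number of matching lines.
    # grep exits non-zero when nothing matches; "|| true" hides that status.
    grep -E -i -c 'timeout|swap_pager.*indefinite wait' || true
}

# Typical use on the affected machine (reads the kernel message buffer):
#   dmesg | count_disk_errors
# Per-device interrupt counts, to see whether irq12 is unusually busy:
#   vmstat -i
```

A non-zero count before any set switch would suggest the disk subsystem
was already unhappy, independent of ipfw.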

Peace,
david
-- 
David H. Wolfskill                              [email protected]
Donald Trump advocated for the executions of five factually innocent young men.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
