carp(4) debugging

2006-10-10 Thread Brian A. Seklecki
I'm building -current right now.  I'm looking forward to improvements 
between vlan(4) and carp(4) post 3.7.


I'm curious: are there any new debugging mechanisms for carp(4) in
-current/4.x?  I was looking at the ip_carp.{c,h} changelog, and it isn't
obvious whether there are any.


I.e., does ifconfig(8)'ing the DEBUG flag onto the interface generate any 
helpful output to log(9)?  Something along the lines of what you would get 
from "debug standby error", "debug standby event", "debug standby terse" 
in an IOS environment?


Anything to help debug the decision-making algorithm used in the
master/standby/backup election process.


Certainly a way to log events (interfaces, etc.) and the resulting actions 
taken by the code would be useful in mission critical environments.


Anything beats "tcpdump 'proto carp'" and making guesses from there.

TIA,

-lava (Brian A. Seklecki - Pittsburgh, PA, USA)
   http://www.spiritual-machines.org/



Re: carp(4) debugging

2006-10-10 Thread Ryan McBride
On Tue, Oct 10, 2006 at 05:50:50PM -0400, Brian A. Seklecki wrote:
> Certainly a way to log events (interfaces, etc.) and the resulting actions 
> taken by the code would be useful in mission critical environments.
> 
> Anything beats "tcpdump 'proto carp'" and making guesses from there.

Nothing new to 4.0, but a few of the things you can do besides using
tcpdump are:

route monitor 
- see interface link state change
sysctl net.inet.carp.log=1
- generates primarily protocol error messages
netstat -sp carp
- display a number of relevant counters

If you want to do more complicated things, like run commands when carp
interfaces change state, you can have a look at ifstated.

-Ryan
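
The `netstat -sp carp` counters are easiest to read when two snapshots taken some time apart are compared. What follows is a minimal Python sketch of that idea, not something from the tools listed above. It assumes each counter line has the form "<number> <description>"; the helper names `parse_counters` and `diff_counters` and the sample numbers are made up for illustration. On a real system the two snapshots would come from running the command twice.

```python
import re

# Matches lines such as "118961 packets sent (IPv4)". Leading whitespace or
# punctuation is skipped. The "<number> <description>" shape is an assumption.
COUNTER_RE = re.compile(r"^\W*(\d+)\s+(.+?)\s*$")


def parse_counters(text):
    """Return {description: count} for every '<number> <description>' line."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def diff_counters(before, after):
    """Return {description: delta} for the counters that changed."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value != before.get(name, 0)
    }


# Hypothetical snapshots, for illustration only.
SNAP_1 = """carp:
        100 packets received (IPv4)
        0 discarded for bad authentication
        40 packets sent (IPv4)
        2 send failed due to mbuf memory error
"""
SNAP_2 = """carp:
        160 packets received (IPv4)
        0 discarded for bad authentication
        70 packets sent (IPv4)
        5 send failed due to mbuf memory error
"""

if __name__ == "__main__":
    changes = diff_counters(parse_counters(SNAP_1), parse_counters(SNAP_2))
    for name, delta in changes.items():
        print(f"{delta:+d}  {name}")
```

Counters that stay at zero drop out of the diff, so an error counter that is still climbing stands out.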



Re: carp(4) debugging

2006-10-11 Thread Brian A. Seklecki

Exciting stuff; totally missed the log sysctl.

The netstat(8) output reveals some interesting info about a persistent
failover condition:


$ netstat -sp carp
carp:
7731906 packets received (IPv4)
0 packets received (IPv6)
0 packets discarded for bad interface
0 packets discarded for wrong TTL
0 packets shorter than header
0 discarded for bad checksums
0 discarded packets with a bad version
0 discarded because packet too short
0 discarded for bad authentication
0 discarded for bad vhid
0 discarded because of a bad address list
118961 packets sent (IPv4)
0 packets sent (IPv6)

** 152 send failed due to mbuf memory error


And yet:

$ netstat -m

[...snip...]

290/558/6144 mbuf clusters in use (current/peak/max)
1224 Kbytes allocated to network (53% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Which is interesting because an identical backup unit does not exhibit
these errors at all, even when running as MASTER for weeks on end.


The mbufs aren't getting exhausted; MRTG doesn't show the interfaces getting
saturated either.  The machine has an absurd amount of RAM for a router, too.
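
The cross-check above (carp reports mbuf send failures while the system-wide mbuf statistics show no pressure) can be sketched in Python. This is only an illustration: it assumes `netstat -m` lines shaped like the ones quoted in this message, and the names `parse_mbuf_stats` and `pool_looks_exhausted` are hypothetical.

```python
import re

# Excerpt of the `netstat -m` output quoted in this message.
NETSTAT_M = """290/558/6144 mbuf clusters in use (current/peak/max)
1224 Kbytes allocated to network (53% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
"""


def parse_mbuf_stats(text):
    """Pull cluster usage and the denied/delayed counters out of the text."""
    stats = {}
    match = re.search(r"(\d+)/(\d+)/(\d+) mbuf clusters in use", text)
    if match:
        current, peak, maximum = (int(g) for g in match.groups())
        stats.update(current=current, peak=peak, max=maximum)
    for key in ("denied", "delayed"):
        match = re.search(rf"(\d+) requests for memory {key}", text)
        if match:
            stats[key] = int(match.group(1))
    return stats


def pool_looks_exhausted(stats):
    """True if allocations were denied or the cluster peak reached the limit."""
    return stats.get("denied", 0) > 0 or stats["peak"] >= stats["max"]


if __name__ == "__main__":
    stats = parse_mbuf_stats(NETSTAT_M)
    peak_share = stats["peak"] / stats["max"]
    print(f"peak cluster use: {peak_share:.1%} of max")
    print("pool exhausted?", pool_looks_exhausted(stats))
```

With the figures quoted here, peak cluster use is under a tenth of the limit and nothing was denied, which is why the carp "send failed" counter is surprising.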


Also interesting is where it is printed: as if it belonged under the IPv6
statistics, even though these systems have a userland and kernel compiled
without IPv6 support.


But since this is 3.7-era code, it's hard to imagine troubleshooting this
further.  Certainly a 4.x upgrade is in order before I go chasing down an
mbuf exhaustion problem.


This is most likely related somehow to the absurdly high number of max
states (set limit states 20, etc.)


~BAS


l8*
-lava (Brian A. Seklecki - Pittsburgh, PA, USA)
   http://www.spiritual-machines.org/

"...from back in the heady days when "helpdesk" meant nothing, "diskquota"
meant everything, and lives could be bought and sold for a couple of pages
of laser printout - and frequently were."