Checkpoint clusters are based on the same poor HA forwarding design, but try telling that to the firewall guys...

Back to the issue: Broadcasts and multicasts (non-programmed) are always
handled by interrupts, thus consuming CPU resources. CoPP can't handle that
(at least not on PFC3), so you rely entirely on the h/w-based rate-limiters
available on your PFC platform. For example, if there wasn't an HSRP
rate-limiter shipped with the SX code, you could quite easily kill the box
with a few megs of HSRP. The same is true for all other B/Mcasts.

--
Christian

On 16.12.2012, at 23:22, Tony Varriale <tvarri...@comcast.net> wrote:

> On 12/16/2012 5:59 AM, Robert Williams wrote:
>> Hi, I'll try to go into some additional detail on the traffic and other
>> router config elements now.
>>
>> The traffic is basically made up of a randomly generated packet which is
>> almost identical to the below.
>>
>> The 'random' element is that the source port is different every time.
>>
>> This packet was 10.0.5.200 (00:50:56:a6:00:23) -> 10.0.5.88
>> (01:00:5e:7f:05:77)
>>
>> The test interface on the 6500 is currently on 10.0.5.123.
>>
>> The below packet was captured on the control-plane going towards the
>> Route-Processor CPU.
>>
>> ----------------------------------------------------------------------------------
>> No.     Time        Source      Destination  Protocol Length Info
>> 23985   2023.684297 10.0.5.200  10.0.5.88    TCP      60     config-port > 0 [<None>] Seq=1 Win=512 Len=0
>>
>> Frame 23985: 60 bytes on wire (480 bits), 60 bytes captured (480 bits)
>>     Arrival Time: Dec 16, 2012 11:36:32.951556000 UTC
>>     Epoch Time: 1355657792.951556000 seconds
>>     [Time delta from previous captured frame: 0.000300000 seconds]
>>     [Time delta from previous displayed frame: 0.000300000 seconds]
>>     [Time since reference or first frame: 2023.684297000 seconds]
>>     Frame Number: 23985
>>     Frame Length: 60 bytes (480 bits)
>>     Capture Length: 60 bytes (480 bits)
>>     [Frame is marked: True]
>>     [Frame is ignored: False]
>>     [Protocols in frame: eth:ip:tcp]
>>     [Coloring Rule Name: TCP]
>>     [Coloring Rule String: tcp]
>> Ethernet II, Src: Vmware_a6:00:23 (00:50:56:a6:00:23), Dst: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
>>     Destination: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
>>         Address: IPv4mcast_7f:05:77 (01:00:5e:7f:05:77)
>>         .... ...1 .... .... .... .... = IG bit: Group address (multicast/broadcast)
>>         .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
>>     Source: Vmware_a6:00:23 (00:50:56:a6:00:23)
>>         Address: Vmware_a6:00:23 (00:50:56:a6:00:23)
>>         .... ...0 .... .... .... .... = IG bit: Individual address (unicast)
>>         .... ..0. .... .... .... .... = LG bit: Globally unique address (factory default)
>>     Type: IP (0x0800)
>>     Trailer: 000000000000
>> Internet Protocol Version 4, Src: 10.0.5.200 (10.0.5.200), Dst: 10.0.5.88 (10.0.5.88)
>>     Version: 4
>>     Header length: 20 bytes
>>     Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport))
>>         0000 00.. = Differentiated Services Codepoint: Default (0x00)
>>         .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00)
>>     Total Length: 40
>>     Identification: 0x7b6e (31598)
>>     Flags: 0x00
>>         0... .... = Reserved bit: Not set
>>         .0.. .... = Don't fragment: Not set
>>         ..0. .... = More fragments: Not set
>>     Fragment offset: 0
>>     Time to live: 64
>>     Protocol: TCP (6)
>>     Header checksum: 0xe042 [correct]
>>         [Good: True]
>>         [Bad: False]
>>     Source: 10.0.5.200 (10.0.5.200)
>>     Destination: 10.0.5.88 (10.0.5.88)
>> Transmission Control Protocol, Src Port: config-port (3577), Dst Port: 0 (0), Seq: 1, Len: 0
>>     Source port: config-port (3577)
>>     Destination port: 0 (0)
>>     [Stream index: 3651]
>>     Sequence number: 1 (relative sequence number)
>>     Acknowledgement number: Broken TCP. The acknowledge field is nonzero while the ACK flag is not set
>>     Header length: 20 bytes
>>     Flags: 0x000 (<None>)
>>         000. .... .... = Reserved: Not set
>>         ...0 .... .... = Nonce: Not set
>>         .... 0... .... = Congestion Window Reduced (CWR): Not set
>>         .... .0.. .... = ECN-Echo: Not set
>>         .... ..0. .... = Urgent: Not set
>>         .... ...0 .... = Acknowledgement: Not set
>>         .... .... 0... = Push: Not set
>>         .... .... .0.. = Reset: Not set
>>         .... .... ..0. = Syn: Not set
>>         .... .... ...0 = Fin: Not set
>>     Window size value: 512
>>     [Calculated window size: 512]
>>     [Window size scaling factor: -1 (unknown)]
>>     Checksum: 0xd021 [validation disabled]
>>         [Good Checksum: False]
>>         [Bad Checksum: False]
>> ----------------------------------------------------------------------------------
>>
>> As for the multicast configuration on the box - it doesn't run any end-user
>> multicast services, other than VRRP/HSRP between itself and a partner 6500
>> (for gateway resilience).
>>
>> As such there is no multicast configuration. In fact, if anything it would
>> be ideal if the box dropped all multicast traffic apart from the HSRP/VRRP
>> to be honest.
>>
>> The reason I think this may be causing issues is because it is destined to a
>> non-multicast IP, but with a multicast MAC....?
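
To make the "non-multicast IP, but multicast MAC" mismatch concrete, here is a minimal Python sketch. It is not from the thread: the function names are made up for illustration, and 239.255.5.119 is just one hypothetical group address that happens to map to the captured MAC under the standard RFC 1112 mapping (01:00:5e plus the low 23 bits of the group address). It only checks addresses; it says nothing about how the PFC treats the frame.

```python
import ipaddress

def is_group_mac(mac: str) -> bool:
    """True if the I/G bit (lowest bit of the first octet) is set."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)

def ipv4_multicast_mac(group: str) -> str:
    """Map an IPv4 group to its Ethernet MAC: 01:00:5e + low 23 bits."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not in 224.0.0.0/4")
    low23 = int(addr) & 0x7FFFFF
    octets = [0x01, 0x00, 0x5E,
              (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{o:02x}" for o in octets)

# Addresses taken from the capture quoted above.
dst_ip = "10.0.5.88"
dst_mac = "01:00:5e:7f:05:77"

print(ipaddress.IPv4Address(dst_ip).is_multicast)  # False: unicast IP
print(is_group_mac(dst_mac))                       # True: group MAC
# Hypothetical group address, chosen only because it maps to the same MAC.
print(ipv4_multicast_mac("239.255.5.119"))         # 01:00:5e:7f:05:77
```

So at layer 2 the frame looks like IPv4 multicast, while the layer 3 destination is an ordinary unicast address.
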
>>
>> I also tried the suggestion of disabling CoPP and the traffic was still
>> hitting the CPU at the same rate.
>>
>> To answer the other questions, the TTL on these test packets is 64 and the
>> router has "IP options drop" set globally. There are also rate-limits for
>> TTL expired and all interfaces in question have "no ip unreachables" set. In
>> fact, the test interface config is currently:
>>
>> interface Vlan10
>>  ip address 10.0.5.123 255.255.255.0
>>  ip access-group test in
>>  no ip redirects
>>  no ip unreachables
>>  no ip proxy-arp
>>
>> I have also tried enabling/disabling these on the vlan interface:
>>  ip pim snooping
>>  ip igmp version 3
>>
>> But no impact was seen.
>>
>> There is also a test ACL I have been experimenting with to try and match the
>> test traffic, which (after receiving 100,000 test packets) shows the
>> following:
>>
>> Extended IP access list test
>>     10 deny ip host 10.0.5.200 any (9 matches)
>>     20 deny ip any host 10.0.5.88
>>     30 deny ip any 224.0.0.0 0.15.255.255 (4 matches)
>>     1000 permit ip any any (504 matches)
>>
>> So even though I've specifically matched the traffic source and destination
>> IPs, I'm not getting matches or drops.
>>
>> (The "permit ip any any" is matching other random traffic we have on that
>> test network at the moment and increments normally without the test packets)
>>
>> Some additional background info:
>>
>> The situation arose in the real world when a Windows NLB cluster went
>> offline and there was a load of traffic heading to its shared IPv4 address
>> ('not' a multicast IP, but 'is' a multicast MAC) - so the switch flooded to
>> all ports, including the 6500 upstream, triggering high CPU.
>>
>> Thanks again!
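
On the test ACL quoted above: going purely by the IP header, every test packet should hit entry 10. The rough Python sketch below models only IOS-style wildcard matching. The table layout and the names wildcard_match and first_match are hypothetical, and it does not model how the PFC or the RP actually processes the frame. It also shows that the wildcard 0.15.255.255 on entry 30 covers only 224.0.0.0 to 224.15.255.255; covering all of 224.0.0.0/4 would need 15.255.255.255.

```python
import ipaddress

def wildcard_match(addr: str, base: str, wildcard: str) -> bool:
    """IOS-style wildcard match: bits set in the wildcard are 'don't care'."""
    a = int(ipaddress.IPv4Address(addr))
    b = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (a & care) == (b & care)

ANY = ("0.0.0.0", "255.255.255.255")

# (sequence, (src base, src wildcard), (dst base, dst wildcard))
ACL_TEST = [
    (10, ("10.0.5.200", "0.0.0.0"), ANY),
    (20, ANY, ("10.0.5.88", "0.0.0.0")),
    (30, ANY, ("224.0.0.0", "0.15.255.255")),
    (1000, ANY, ANY),
]

def first_match(src: str, dst: str):
    """Return the sequence number of the first entry matching src/dst."""
    for seq, (sb, sw), (db, dw) in ACL_TEST:
        if wildcard_match(src, sb, sw) and wildcard_match(dst, db, dw):
            return seq
    return None  # would fall through to the implicit deny

print(first_match("10.0.5.200", "10.0.5.88"))   # 10
print(first_match("10.0.5.1", "239.255.5.119")) # 1000, entry 30 does not cover it
```

That fits the observation in the quoted message: with 100,000 test packets and only 9 matches on entry 10, the counters do not reflect what plain header matching would predict.
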
>>
>> Robert Williams
>> Custodian Data Centre
>> Email: rob...@custodiandc.com
>> http://www.CustodianDC.com
>>
>> Robert Williams
>> Backline / Operations Team
>> Custodian DataCentre
>> tel: +44 (0)1622 230382
>> email: rob...@custodiandc.com
>> http://www.custodiandc.com/disclaimer.txt
>>
>> -----Original Message-----
>> From: cisco-nsp-boun...@puck.nether.net
>> [mailto:cisco-nsp-boun...@puck.nether.net] On Behalf Of Phil Mayers
>> Sent: 16 December 2012 11:26
>> To: cisco-nsp@puck.nether.net
>> Subject: Re: [c-nsp] All multicast punting to CPU on 6500
>>
>> I think the implication is that it's possible for a CoPP policy to prevent
>> the forwarding hardware "seeing" the multicast and installing the hardware
>> shortcuts to drop uninteresting traffic.
>>
>> You might try disabling CoPP to see if that changes things.
>>
>> You weren't very specific about the type of multicast traffic and the
>> multicast config on the box. I'm going to guess it's IPv4/IPv6 multicast
>> from the MAC addresses, but is the 6500 configured for multicast routing,
>> and is it enabled on that interface? If so, what does "sh ip mr <the group>"
>> say?
>>
>> I assume you've eliminated the really obvious things like TTL=1 and IP
>> options / special packet stuff?
> This covers the issue well.
>
> http://www.cisco.com/en/US/products/hw/switches/ps708/products_configuration_example09186a0080a07203.shtml
>
> Highly recommended to stay away from MS NLB. It's been designed poorly for
> over 7 years (that I know of personally).
>
> I think your test is invalid. You should come up with a real use case(s).
> In the world of networking, most of us can come up with tests that would
> crush and boggle any box.
>
> tv
> _______________________________________________
> cisco-nsp mailing list  cisco-nsp@puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/