[j-nsp] Broadcast storm on M7i fxp0 kills the CFEB?
All,

Yesterday, an error caused a loop in our OOB network. This resulted in one of our route reflectors failing, badly.

Apparently, the broadcast storm caused the CFEB to die. Both 1GE ports went link-down, which is understandable since the CFEB actually seems to have rebooted:

    admin@ext-m7i-2> show chassis cfeb
    CFEB status:
    ...
      Start time:    2012-06-21 14:46:39 BST
      Uptime:        22 hours, 24 minutes, 7 seconds

The box logged all sorts of horrible messages, which suggest the internal control connections (via fxp1) somehow hung or died. Possibly the RE CPU was pegged?

To say that this is disturbing is an understatement; surely there should be no conceivable way for traffic on fxp0 to cause the CFEB to crash?

Has anyone else seen this? Does anyone have any ideas why it happened, and how I can ensure it cannot happen in future?

This is an RE 5.0 (400MHz) upgraded to 768MB of RAM, running 10.4R8.5.

___
juniper-nsp mailing list
juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] Broadcast storm on M7i fxp0 kills the CFEB?
Hello Phil,

I have seen this happen a few times and with different platforms. A good way to avoid this is to configure policing on the OOB switch ports facing the REs.

Regards
Amos

On 22 Jun 2012, at 15:16, Phil Mayers wrote:
> Has anyone else seen this? Does anyone have any ideas why it
> happened, and how I can ensure it cannot happen in future?
Re: [j-nsp] Broadcast storm on M7i fxp0 kills the CFEB?
On 22/06/12 13:29, Amos Rosenboim wrote:
> Hello Phil, I have seen this happen a few times and with different
> platforms. A good way to avoid this is to configure policing on the
> OOB switch ports facing the REs.

Unfortunately, our OOB network is constructed from older, repurposed equipment. I doubt we have the ability to do the required egress policing.

What kind of policing parameters have you successfully used?
Re: [j-nsp] Broadcast storm on M7i fxp0 kills the CFEB?
Phil,

Actually, I am not surprised that this happened to you. The fxp0 interface is a funny animal: it isn't really as isolated from the rest of the box as you would think. Since all IP broadcast/multicast on layer 3 interfaces gets sent to the RE by default, a loop that starts to pump out tons of broadcasts will crush the RE and/or the forwarding path to the RE. It does not matter whether the storm happens on regular interfaces or on fxp0.

The only way you can mitigate this is with RE protection filters. For example, you can implement a policer on fxp0 that handles packet bursts on ingress. But I found it just as easy to enumerate which protocols and/or source IPs need access to fxp0 and discard the rest using a firewall filter.

I learned the hard way :-) You can follow this thread to find out what I went through:

http://www.gossamer-threads.com/lists/nsp/juniper/31311

My experience has been with the MX, but I am pretty sure the same applies to the M7i.

Clarke Morledge
College of William and Mary
Information Technology - Network Engineering
Jones Hall (Room 18)
Williamsburg VA 23187
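For readers who want a starting point, the "enumerate what needs access and discard the rest" approach described above might look roughly like the following. This is only a sketch, not the configuration used by anyone in this thread: the filter name protect-fxp0, the prefix-list name mgmt-hosts, the 192.0.2.0/24 prefix and the choice of SSH, SNMP and ping as permitted services are all assumptions made for illustration.

```
policy-options {
    /* Hypothetical list of management stations; replace with real prefixes */
    prefix-list mgmt-hosts {
        192.0.2.0/24;
    }
}
firewall {
    family inet {
        /* Hypothetical filter name */
        filter protect-fxp0 {
            term allow-ssh {
                from {
                    source-prefix-list {
                        mgmt-hosts;
                    }
                    protocol tcp;
                    destination-port ssh;
                }
                then accept;
            }
            term allow-snmp {
                from {
                    source-prefix-list {
                        mgmt-hosts;
                    }
                    protocol udp;
                    destination-port snmp;
                }
                then accept;
            }
            term allow-ping {
                from {
                    source-prefix-list {
                        mgmt-hosts;
                    }
                    protocol icmp;
                    icmp-type [ echo-request echo-reply ];
                }
                then accept;
            }
            /* Everything else arriving on fxp0, including IP broadcasts, is dropped */
            term discard-rest {
                then discard;
            }
        }
    }
}
interfaces {
    fxp0 {
        unit 0 {
            family inet {
                filter {
                    input protect-fxp0;
                }
            }
        }
    }
}
```

Two caveats apply. A family inet filter only matches IP packets, so it does nothing about ARP broadcasts. And because the final term discards everything not explicitly accepted, a filter like this should be loaded with commit confirmed so that a mistake in the accept terms does not cut off management access.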
Re: [j-nsp] Broadcast storm on M7i fxp0 kills the CFEB?
On 6/22/12 6:28 AM, Phil Mayers wrote:
> On 22/06/12 13:29, Amos Rosenboim wrote:
> > Hello Phil, I have seen this happen a few times and with different
> > platforms. A good way to avoid this is to configure policing on the
> > OOB switch ports facing the REs.
>
> Unfortunately, our OOB network is constructed from older, repurposed
> equipment. I doubt we have the ability to do the required egress
> policing.
>
> What kind of policing parameters have you successfully used?

The ARP policer is the one that normally kicks in. There are global defaults which, IIRC, vary by platform. The trick is to have per-interface limits which are lower than the global limits, so that the policer renders that one interface unusable without rendering all ARP learning on the box dead at once.
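A per-interface ARP policer of the kind described above could be sketched as follows. The policer name arp-per-if, the rate and burst values, and the interface ge-0/0/0 are placeholders chosen for illustration; nobody in this thread gave actual numbers, and whether a policer applied this way is honoured on fxp0 for a given platform and release is not confirmed here.

```
firewall {
    /* Hypothetical policer; values are illustrative and should be set
       below the platform's default ARP policer limits */
    policer arp-per-if {
        if-exceeding {
            bandwidth-limit 64k;
            burst-size-limit 8k;
        }
        then discard;
    }
}
interfaces {
    /* Placeholder interface name */
    ge-0/0/0 {
        unit 0 {
            family inet {
                policer {
                    arp arp-per-if;
                }
            }
        }
    }
}
```

The point of the design is that an interface with its own ARP policer is rate-limited independently, so a storm on that interface exhausts only its own allowance rather than the shared default that every other interface depends on.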