Re: [j-nsp] MX punting packets to RE - why?
On 3 February 2016 at 19:09, Ross Halliday wrote:

> Oh dear, that sounds like quite the chore. I don't understand your
> reasoning behind lowering the parameters so far from the defaults,
> though. 3000 pps/5000 packet burst is how the box ships. Or am I to
> read between the lines re: "random recommendation"? lol

A lot of the DDoS-protection limits are 20kpps by default, which is more
than the NPU will even punt to the PFE CPU, so there will be an
additional policer limiting more strictly anyhow. The defaults are
unfortunately not sane. The only reason you need to punt multicast at
all is to fix your ingress interface in the HW, so really 1 packet per
group will do; anything extra is just useless work for the CPU.

> Maybe this is something I should talk with JTAC about at this point. I
> don't want to slam the RE but I don't want to have such a massive
> cutout, either.

Absolutely, it's always a good idea to engage vendor support.

> Oh, the redundancy definitely works, don't get me wrong. For some
> reason the MX is deciding it has to resolve packets instead of just
> sending whatever comes in with that VLAN tag into an l2circuit.

The reason is that the ingress interface of the mcast stream changed, so
the multicast tree was incorrect.

> Internet multicast, as we have things now, would be an absolute
> nightmare. But as far as unknown DoS vectors and other quirkiness, I
> compare it to IPv6 a few years ago. Everybody basically does it
> half-assed because nobody uses it. The only applications we have for
> multicast are TV service delivery and some timing protocols here and
> there.

I did quite a few multicast setups for companies running CCTV, where the
CCTV by default sends to multicast (but can be changed to send unicast).
In each of these configurations only a single host joined (the
recorder), so multicast was just useless complexity with no advantages.
I guess it's my failure that I was consistently unable to convince them
to reconfigure the CCTVs for unicast.
--
  ++ytti
___
juniper-nsp mailing list
juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] MX punting packets to RE - why?
Hey again,

> No, something like this:
>
>   edit system ddos-protection protocols resolve mcast-v4
>   set bandwidth 100
>   set burst 100
>   set flow-level-bandwidth logical-interface 20
>   set flow-level-detection subscriber off
>   set flow-level-detection logical-interface on
>
> So we allow on aggregate 100pps of mcast-v4 resolve, but only 20pps
> per IFL. So even if one IFL is misbehaving, another IFL's mcast-v4
> resolve works fine.
>
> There are only 4k policers available in HW for ddos-protection, which
> makes me turn off subscriber detection, as it's easy for an attacker
> to generate a 'new subscriber' (like in BGP, change SADDR => new
> subscriber) and congest those 4k policer slots.
>
> I set logical-interface to 'on' (instead of 'automatic', which means
> they are added dynamically if the aggregate policer is
> out-of-contract; with 'on' they are always there, which guarantees a
> well-behaving IFL does not suffer for the time it takes software to
> detect the violation and add the IFL policers).
>
> This is just a random recommendation. And the funny thing is, you'd
> need to do the same for _all_ protocols available under
> 'ddos-protection'; there are quite many of them, maybe 100, so you'll
> get several thousand new lines of config just for this. I wish there
> was a way to set explicit default values, but there isn't.

Oh dear, that sounds like quite the chore. I don't understand your
reasoning behind lowering the parameters so far from the defaults,
though. 3000 pps/5000 packet burst is how the box ships. Or am I to read
between the lines re: "random recommendation"? lol

Maybe this is something I should talk with JTAC about at this point. I
don't want to slam the RE but I don't want to have such a massive
cutout, either.

> Hmm, then the redundancy really should have worked, unless you're
> doing some IGMP snooping or something in the switches (on by default)
> which might require convergence of multicast state at L2 too.
>
> If you're rocking RSTP or MST, and it's correctly configured (all
> non-l2-core ports _MUST_ be portfast, because MST will not unblock
> downstream ports until there is explicit permission from all upstream
> ports; if you don't have portfast on one port which is not speaking
> MST, that port will block the whole MST convergence, as you're waiting
> for explicit permission from it, which will never come).

Oh, the redundancy definitely works, don't get me wrong. For some reason
the MX is deciding it has to resolve packets instead of just sending
whatever comes in with that VLAN tag into an l2circuit.

> Yeah, multicast is a tricky subject

That's the understatement of the year!

> I dislike multicast, I believe there are probably a lot of yet unknown
> DoS vectors in it, and I would never run internet multicast. But for
> well controlled internal applications, it may sometimes be the least
> bad solution.

Internet multicast, as we have things now, would be an absolute
nightmare. But as far as unknown DoS vectors and other quirkiness, I
compare it to IPv6 a few years ago. Everybody basically does it
half-assed because nobody uses it. The only applications we have for
multicast are TV service delivery and some timing protocols here and
there.

> But getting help on the setup probably is a huge chore; just getting
> an external person to understand the network takes time.

Yes, that's for sure. Thank you very much for all of your feedback and
effort into trying to understand what's going on here. I'm going to see
if our CSE is subscribed to this list and ask him to take a peek.

Thanks again.

Ross
Re: [j-nsp] MX punting packets to RE - why?
On 3 February 2016 at 02:18, Ross Halliday wrote:

Hey,

> Yes, on the entire MPC I will see unrelated control plane protocols
> bounce, eg. spanning-tree. If I recall correctly some protocols are
> handled by the TRIO chips, right? I don't see any of my BFD-managed
> ISIS adjacencies drop.
>
>> If yes, just limit the mcast resolve to something reasonable, 100pps
>> should be plenty, provided we're not competing with actual attack
>> traffic.
>>
>> I would start with ddos-protection fixes and see if it behaves better
>> with more restricted punting.
>
> I assume you're referring to "set forwarding-options multicast
> resolve-rate", right?

No, something like this:

  edit system ddos-protection protocols resolve mcast-v4
  set bandwidth 100
  set burst 100
  set flow-level-bandwidth logical-interface 20
  set flow-level-detection subscriber off
  set flow-level-detection logical-interface on

So we allow on aggregate 100pps of mcast-v4 resolve, but only 20pps per
IFL. So even if one IFL is misbehaving, another IFL's mcast-v4 resolve
works fine.

There are only 4k policers available in HW for ddos-protection, which
makes me turn off subscriber detection, as it's easy for an attacker to
generate a 'new subscriber' (like in BGP, change SADDR => new
subscriber) and congest those 4k policer slots.

I set logical-interface to 'on' (instead of 'automatic', which means
they are added dynamically if the aggregate policer is out-of-contract;
with 'on' they are always there, which guarantees a well-behaving IFL
does not suffer for the time it takes software to detect the violation
and add the IFL policers).

This is just a random recommendation. And the funny thing is, you'd need
to do the same for _all_ protocols available under 'ddos-protection';
there are quite many of them, maybe 100, so you'll get several thousand
new lines of config just for this. I wish there was a way to set
explicit default values, but there isn't.
> I'm in the habit of running VSTP for everything (the Cisco half of my
> brain keeps trying to type rapid-pvst+) that isn't a two-port affair.
> BPDUs are definitely making it through, everything checks out. The
> paths over the l2circuits are normally blocked via increased interface
> cost.

Hmm, then the redundancy really should have worked, unless you're doing
some IGMP snooping or something in the switches (on by default) which
might require convergence of multicast state at L2 too.

If you're rocking RSTP or MST, and it's correctly configured (all
non-l2-core ports _MUST_ be portfast, because MST will not unblock
downstream ports until there is explicit permission from all upstream
ports; if you don't have portfast on one port which is not speaking MST,
that port will block the whole MST convergence, as you're waiting for
explicit permission from it, which will never come).

> One of the VLANs carried as an l2circuit by the MXes between the EXes
> is actually not spanning-tree controlled, but a "backup" PIM
> interface. Essentially a clone of the EX-EX direct link, but with
> higher metric. Unlike the other VLANs this one always has the PIM and
> BGP adjacency sending traffic over it. The ddos-protection
> resolve-mcast4 action trips when multicast is slammed over that or one
> of the VSTP-managed VLANs transitions to a forwarding state.
>
> I can do up a diagram if that would help. I'm really not sure how I'd
> explain this to JTAC and wanted to get some real-world experience from
> guys who are working with this stuff.

Yeah, multicast is a tricky subject; I've had to learn it maybe 3 times
now, since it's needed so rarely it's easy to forget. I dislike
multicast, I believe there are probably a lot of yet unknown DoS vectors
in it, and I would never run internet multicast. But for well controlled
internal applications, it may sometimes be the least bad solution.
But getting help on the setup probably is a huge chore; just getting an
external person to understand the network takes time.

--
  ++ytti
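Saku's aggregate-plus-per-IFL scheme above can be sketched as a toy
model. This is pure Python for illustration, not Junos internals: the
numbers mirror his example config (100 pps aggregate, 20 pps per
logical interface), and the IFL names and offered rates are made up.

```python
# Toy model (not Junos internals) of the aggregate + per-IFL policer
# hierarchy from the example config above. One simulated one-second
# window; each packet spends one token.
class TokenBucket:
    def __init__(self, rate_pps, burst):
        self.rate_pps = rate_pps   # documented intent; single window here
        self.tokens = burst

    def allow(self):
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

aggregate = TokenBucket(rate_pps=100, burst=100)
per_ifl = {"ifl-A": TokenBucket(20, 20), "ifl-B": TokenBucket(20, 20)}

offered = {"ifl-A": 1000, "ifl-B": 10}   # ifl-A is misbehaving
passed = {"ifl-A": 0, "ifl-B": 0}

for ifl, pkts in offered.items():
    for _ in range(pkts):
        # a packet must clear its per-IFL policer, then the aggregate
        if per_ifl[ifl].allow() and aggregate.allow():
            passed[ifl] += 1

print(passed)   # the well-behaved IFL is unaffected by the noisy one
```

The point of the always-on per-IFL stage is visible in the result: the
misbehaving IFL is clamped to its 20 pps share and the quiet IFL's
packets all pass.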
Re: [j-nsp] MX punting packets to RE - why?
Thanks Michael. Looks like I'm at 66 pps like Dragan mentioned. Some
night I'll set up a maintenance window and play with this knob...

Cheers

Ross

-Original Message-
From: Michael Hare [mailto:michael.h...@wisc.edu]
Sent: Monday, February 01, 2016 10:19 PM
To: Ross Halliday
Cc: juniper-nsp@puck.nether.net
Subject: RE: [j-nsp] MX punting packets to RE - why?

[snip]
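As a back-of-envelope on that 66 resolves/sec figure: the time to
rebuild a full channel lineup scales linearly with the rate. The
500-group lineup size below is an assumption for illustration, and 200
and 1000 are simply sample values within the hidden knob's documented
100..1000 range.

```python
# Back-of-envelope only: seconds to re-resolve a channel lineup at the
# default kernel rate of 66 resolves/sec versus sample resolve-rate
# values. The 500-group lineup size is assumed, not from the thread.
groups = 500
times = {rate: groups / rate for rate in (66, 200, 1000)}
for rate, t in times.items():
    print(f"{rate:4d} resolves/s -> {t:5.1f} s for {groups} S,G entries")
```

So at the default rate a 500-channel lineup takes on the order of 7-8
seconds to come back, which matches the "blackholed for a while after
failover" symptom described in the thread.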
Re: [j-nsp] MX punting packets to RE - why?
Hello,

>> If I am understanding what you guys are saying correctly, this would
>> cause everything to get punted to the CPU until a new hardware
>> shortcut is created, and in the meantime - since our entire channel
>> lineup is in there - this would hammer the DoS protection mechanism?
>
> Yes, if the ingress interface does not match, they will be punted.

Okay, thanks

>> Can the rate at which the joins are sent out be slowed? I can live
>> with a bit of a delay on the channels coming back to life, but not
>> with the entire slot getting blackholed...
>
> What do you mean 'entire slot being blackholed', do you mean losing
> unrelated control-plane stuff, like BGP/ARP etc?

Yes, on the entire MPC I will see unrelated control plane protocols
bounce, eg. spanning-tree. If I recall correctly some protocols are
handled by the TRIO chips, right? I don't see any of my BFD-managed ISIS
adjacencies drop.

> If yes, just limit the mcast resolve to something reasonable, 100pps
> should be plenty, provided we're not competing with actual attack
> traffic.
>
> I would start with ddos-protection fixes and see if it behaves better
> with more restricted punting.

I assume you're referring to "set forwarding-options multicast
resolve-rate", right?

> Further research might involve figuring out if both MX boxes have
> multicast state with the source towards the local EX port, and clients
> subscribed, so that no convergence is needed.

Interesting concept! Doesn't bother me at all, we like the idea of
having our multicast available everywhere anyway.

> It wasn't obvious to me what kind of negative impact you observe when
> the EX-EX link goes down. How are you now stopping a loop in the EX
> network? You have a direct physical link between them, then you have
> the l2circuit as well? But it looks like you're not carrying BPDUs
> over the l2circuit? So if you rely on STP, I'm not entirely sure how
> the L2 redundancy works; which port is normally being blocked? The
> actual physical link between switches or the link via l2circuit? My
> first guess would be that there would be an L2 loop in the topology
> and nothing to stop it, so I'm not sure I understand why it works at
> all.

I'm in the habit of running VSTP for everything (the Cisco half of my
brain keeps trying to type rapid-pvst+) that isn't a two-port affair.
BPDUs are definitely making it through, everything checks out. The paths
over the l2circuits are normally blocked via increased interface cost.

One of the VLANs carried as an l2circuit by the MXes between the EXes is
actually not spanning-tree controlled, but a "backup" PIM interface.
Essentially a clone of the EX-EX direct link, but with higher metric.
Unlike the other VLANs this one always has the PIM and BGP adjacency
sending traffic over it. The ddos-protection resolve-mcast4 action trips
when multicast is slammed over that or one of the VSTP-managed VLANs
transitions to a forwarding state.

I can do up a diagram if that would help. I'm really not sure how I'd
explain this to JTAC and wanted to get some real-world experience from
guys who are working with this stuff.

Thanks for all your help!

Ross
Re: [j-nsp] MX punting packets to RE - why?
Ross-

Change 'fpc0' to 'afeb0' in your failed command. I got goose eggs, but
this lab chassis isn't doing multicast, which may play a part.

$user@mx104-lab-re0> request pfe execute target afeb0 command "show nhdb mcast resolve"
SENT: Ukern command: show nhdb mcast resolve
GOT:
GOT: Nexthop Info:
GOT:    ID     Type      Protocol    Resolve-Rate
GOT: -----  ---------  ----------  --------------
LOCAL: End of file

-Michael

-Original Message-
From: juniper-nsp [mailto:juniper-nsp-boun...@puck.nether.net] On Behalf Of Ross Halliday
Sent: Monday, February 1, 2016 2:38 PM
To: Dragan Jovicic ; Saku Ytti
Cc: juniper-nsp@puck.nether.net
Subject: Re: [j-nsp] MX punting packets to RE - why?

[snip]
Re: [j-nsp] MX punting packets to RE - why?
On 1 February 2016 at 22:37, Ross Halliday wrote:

Hey,

> If I am understanding what you guys are saying correctly, this would
> cause everything to get punted to the CPU until a new hardware
> shortcut is created, and in the meantime - since our entire channel
> lineup is in there - this would hammer the DoS protection mechanism?

Yes, if the ingress interface does not match, they will be punted.

> Can the rate at which the joins are sent out be slowed? I can live
> with a bit of a delay on the channels coming back to life, but not
> with the entire slot getting blackholed...

What do you mean 'entire slot being blackholed', do you mean losing
unrelated control-plane stuff, like BGP/ARP etc?

If yes, just limit the mcast resolve to something reasonable, 100pps
should be plenty, provided we're not competing with actual attack
traffic.

I would start with ddos-protection fixes and see if it behaves better
with more restricted punting.

Further research might involve figuring out if both MX boxes have
multicast state with the source towards the local EX port, and clients
subscribed, so that no convergence is needed.

It wasn't obvious to me what kind of negative impact you observe when
the EX-EX link goes down. How are you now stopping a loop in the EX
network? You have a direct physical link between them, then you have the
l2circuit as well? But it looks like you're not carrying BPDUs over the
l2circuit? So if you rely on STP, I'm not entirely sure how the L2
redundancy works; which port is normally being blocked? The actual
physical link between switches or the link via l2circuit? My first guess
would be that there would be an L2 loop in the topology and nothing to
stop it, so I'm not sure I understand why it works at all.

--
  ++ytti
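The ingress-interface check Saku describes can be sketched as a tiny
lookup. This is illustrative only, not Trio internals: the interface
names are taken from this thread, but the group address and the mapping
itself are made up.

```python
# Sketch of the behaviour described above: the forwarding HW binds each
# multicast group to an expected ingress interface (IIF), and a packet
# arriving on any other interface is punted to the host path so the
# state can be re-resolved. Group address and mapping are hypothetical.
expected_iif = {"239.1.1.1": "mt-0/0/0.1081344"}

def classify(group, iif):
    """Return the forwarding decision for one multicast packet."""
    if expected_iif.get(group) == iif:
        return "hw-forward"
    return "punt"  # IIF mismatch (or unknown group) -> host punt

print(classify("239.1.1.1", "mt-0/0/0.1081344"))
print(classify("239.1.1.1", "xe-0/3/0.3812"))
```

This is why a failover that merely moves the stream from the MDT
interface to the physical uplink sends the whole lineup to the host
path until the state is rewritten.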
Re: [j-nsp] MX punting packets to RE - why?
Hi Saku and Dragan,

Thank you for the responses, and apologies for the ambiguity.

The EXes are our video source switches. The PIM RP is shared with MSDP
to an anycast address. The MXes connect to the EXes at L3 via BGP - the
MX1/EX1 link is de-prioritized with a metric. Most of our receivers ride
off of MX2, with a few further downstream. Due to some interop issues
and our use of VBR we've settled on a single MDT for this VRF. Being the
default MDT it is of course joined on all PEs with this VRF. During
normal operation MX1, which doesn't have any active traffic for this
VRF, has a full list of mcast routes with the source interface of the
MDT.

So, in the first failure scenario - let's say EX2 or MX2 totally dies -
MX1 will lose the preferred BGP route to the RP and sources and see
everything over the MX1/EX1 link, so all of the S,G entries will need to
be updated from mt-0/0/0.1081344 to xe-0/3/0.3812.

If I am understanding what you guys are saying correctly, this would
cause everything to get punted to the CPU until a new hardware shortcut
is created, and in the meantime - since our entire channel lineup is in
there - this would hammer the DoS protection mechanism?

Can the rate at which the joins are sent out be slowed? I can live with
a bit of a delay on the channels coming back to life, but not with the
entire slot getting blackholed...

I am also open to tweaking the DoS protection settings, but it seems to
me that a 10x increase would be opening myself up to really slamming the
RE and causing even bigger problems. I come from SUP720 world, and I
rather like having a box that can process BFD and BGP updates at the
same time LOL

The other failure scenario is when the EX1/EX2 link goes down. When this
happens, all devices are still up, so as far as BGP or really anything
on the MX "knows", nothing has changed. Metrics and next-hops are
identical to the PEs. Instead of pulling video from the direct link, EX1
& EX2 can only see each other through VLANs that the MXes carry as
EoMPLS l2circuits. This is what truly baffles me, as none of what you
guys mentioned should apply to an l2circuit.

Also,

> request pfe execute target fpc0 command "show nhdb mcast resolve"
error: command is not valid on the mx104

:(

Thanks for your help guys!

Ross

From: Dragan Jovicic [mailto:dragan...@gmail.com]
Sent: Sunday, January 31, 2016 7:44 AM
To: Saku Ytti
Cc: Ross Halliday; juniper-nsp@puck.nether.net
Subject: Re: [j-nsp] MX punting packets to RE - why?

[snip]
Re: [j-nsp] MX punting packets to RE - why?
Correct me if I'm wrong, this looks like MX doesn't have multicast cache for all those S,G routes (in inet.1). So first packet of each S,G entry must first be resolved by kernel and downloaded to PFE. DDOS feature is activated because large influx of unresolved packets are passing trough the router. You could change default DDOS setting for this type of traffic on your FPC. Another thing that comes to mind is that kernel itself has limited number of resolves per second, which is 66. That is, 66 different NH S,G entries will be resolved per second. dj@mx-re0> request pfe execute target fpc0 command "show nhdb mcast resolve" SENT: Ukern command: show nhdb mcast resolve GOT: GOT: Nexthop Info: GOT:ID TypeProtocolResolve-Rate GOT: - -- --- GOT: 1927 ResolveIPv6 66 GOT: 1962 ResolveIPv4 66 LOCAL: End of file This is modified by (hidden) knob: dj@mx-re0# set forwarding-options multicast resolve-rate ? Possible completions: Multicast resolve rate (100..1000 per second) {master}[edit] Mind you, I haven't tested this. HTH, Regards On Sat, Jan 30, 2016 at 12:04 PM, Saku Ytti wrote: > Hey Ross, > > It's not clear to me if the mcast is only inside the EX or if it's > also on the MX's. And it's not clear to me how the faults impact the > multicast distribution tree. On stable state, do both MX80's have > mcast states for groups? Or only one of them? > > Trio maps each multicast group into an input interface, if mismatch > occurs, that is group ingresses from other input interface than the > specified, I believe this causes host punt. > > Alas DDoS-protection limits are quite insane, like 20kpps for many > protocols, that's more than NPU=>LC_PCU punting allows for, so it'll > kill pretty much everything. I'd set protocols I don't need to > 10-100pps, non-critical protocols I need to 4kpps and critical > protocols I need to 8kpps. 
> And yes, configure each and every ddos-protocol, it'll inflate the > config quite a bit, but there is always 'set apply-flags omit', which > can be useful way to reduce config cruft about standard-configs you > don't really want to review in normally. > > > On 29 January 2016 at 23:36, Ross Halliday > wrote: > > Hi list, > > > > I've run into an oddity that's been causing us some issues. First, a > diagram! > > > > EX1EX2 > > | | > > | | > > MX1MX2 > > > > EX1 and EX2 are independent switches (not VC) that run a ton of video > traffic. EX4200 on 12.3R8.7 > > MX1 and MX2 are MPLS PEs that ingest video and send it out to our > network. MX104 on 13.3R4.6 > > Several VLANs span EX1 and EX2 as each switch has a server that requires > Layer 2 to the other unit. (active/active middleware) > > EX1-EX2 link is direct fiber carrying VLANs > > MX1-MX2 link is MPLS > > > > The MX ports facing the EXes terminate L3 as well as hauling L2: > > > > MX1: > > > > xe-0/3/0 { > > description "EX1 xe-3/1/0"; > > flexible-vlan-tagging; > > hold-time up 5000 down 0; > > encapsulation flexible-ethernet-services; > > unit 3810 { > > description "Backup link between TV switches"; > > encapsulation vlan-ccc; > > vlan-id-list [ 304 810-811 3810 3813 3821-3822 ]; > > } > > unit 3812 { > > description "Video feed 2/2 from head end switch"; > > vlan-id 3812; > > family inet { > > address MX1/31; > > } > > } > > } > > l2circuit { > > neighbor MX2 { > > interface xe-0/3/0.3810 { > > virtual-circuit-id 3810; > > description "IPTV switch redundant link"; > > no-control-word; > > } > > } > > } > > > > MX2: > > > > xe-0/3/0 { > > description "EX1 xe-0/1/0"; > > flexible-vlan-tagging; > > hold-time up 5000 down 0; > > encapsulation flexible-ethernet-services; > > unit 3810 { > > description "Backup link between TV switches"; > > encapsulation vlan-ccc; > > vlan-id-list [ 304 810-811 3813 3821-3822 ]; > > } > > unit 3811 { > > description "Video feed 1/2 from head end switch"; > > vlan-id 3811; > > 
> >         family inet {
> >             address MX2/31;
> >         }
> >     }
> > }
> > l2circuit {
> >     neighbor MX1 {
> >         interface xe-0/3/0.3810 {
> >             virtual-circuit-id 3810;
> >             description "IPTV switch redundant link";
> >             no-control-word;
> >         }
> >     }
> > }
> >
> > We have dual L3 feeds from "the switches" to "the routers", and VLANs
> > are carried over an l2circuit should the direct link between EX1 & EX2
> > bite the dust. It should be noted that MX1 is basically a "backup" -
> > traffic normally flows EX1-EX2-MX2.
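[Saku's per-protocol tuning advice quoted above can be sketched in set-style
configuration. This is an untested sketch, not a verified recipe: the
bandwidth/burst values are illustrative numbers taken from the rates
discussed in the thread, and the protocol names should be checked against
your Junos release before committing anything.]

```
# Tighten the multicast resolve punt path (illustrative values, not defaults)
set system ddos-protection protocols resolve mcast-v4 bandwidth 100
set system ddos-protection protocols resolve mcast-v4 burst 100

# Keep a critical protocol at a higher rate (illustrative)
set system ddos-protection protocols bgp aggregate bandwidth 8000
set system ddos-protection protocols bgp aggregate burst 8000

# Hide the resulting config bulk from everyday 'show configuration' output
set system ddos-protection apply-flags omit
```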
Re: [j-nsp] MX punting packets to RE - why?
Hey Ross,

It's not clear to me if the mcast is only inside the EXes or if it's
also on the MXes. And it's not clear to me how the faults impact the
multicast distribution tree. In the stable state, do both MX80s have
mcast states for the groups? Or only one of them?

Trio maps each multicast group to an input interface; if a mismatch
occurs, that is, the group ingresses on an interface other than the
specified one, I believe this causes a host punt.

Alas, the DDoS-protection limits are quite insane, like 20kpps for
many protocols. That's more than NPU=>LC_CPU punting allows for, so
it'll kill pretty much everything. I'd set protocols I don't need to
10-100pps, non-critical protocols I need to 4kpps, and critical
protocols I need to 8kpps.

And yes, configure each and every ddos-protocol. It'll inflate the
config quite a bit, but there is always 'set apply-flags omit', which
can be a useful way to reduce config cruft from standard configs you
don't really want to review normally.

On 29 January 2016 at 23:36, Ross Halliday wrote:
> Hi list,
>
> I've run into an oddity that's been causing us some issues. First, a
> diagram!
>
> EX1    EX2
>  |      |
>  |      |
> MX1    MX2
>
> EX1 and EX2 are independent switches (not VC) that run a ton of video
> traffic. EX4200 on 12.3R8.7
> MX1 and MX2 are MPLS PEs that ingest video and send it out to our
> network. MX104 on 13.3R4.6
> Several VLANs span EX1 and EX2 as each switch has a server that
> requires Layer 2 to the other unit.
> (active/active middleware)
> EX1-EX2 link is direct fiber carrying VLANs
> MX1-MX2 link is MPLS
>
> The MX ports facing the EXes terminate L3 as well as hauling L2:
>
> MX1:
>
> xe-0/3/0 {
>     description "EX1 xe-3/1/0";
>     flexible-vlan-tagging;
>     hold-time up 5000 down 0;
>     encapsulation flexible-ethernet-services;
>     unit 3810 {
>         description "Backup link between TV switches";
>         encapsulation vlan-ccc;
>         vlan-id-list [ 304 810-811 3810 3813 3821-3822 ];
>     }
>     unit 3812 {
>         description "Video feed 2/2 from head end switch";
>         vlan-id 3812;
>         family inet {
>             address MX1/31;
>         }
>     }
> }
> l2circuit {
>     neighbor MX2 {
>         interface xe-0/3/0.3810 {
>             virtual-circuit-id 3810;
>             description "IPTV switch redundant link";
>             no-control-word;
>         }
>     }
> }
>
> MX2:
>
> xe-0/3/0 {
>     description "EX1 xe-0/1/0";
>     flexible-vlan-tagging;
>     hold-time up 5000 down 0;
>     encapsulation flexible-ethernet-services;
>     unit 3810 {
>         description "Backup link between TV switches";
>         encapsulation vlan-ccc;
>         vlan-id-list [ 304 810-811 3813 3821-3822 ];
>     }
>     unit 3811 {
>         description "Video feed 1/2 from head end switch";
>         vlan-id 3811;
>         family inet {
>             address MX2/31;
>         }
>     }
> }
> l2circuit {
>     neighbor MX1 {
>         interface xe-0/3/0.3810 {
>             virtual-circuit-id 3810;
>             description "IPTV switch redundant link";
>             no-control-word;
>         }
>     }
> }
>
> We have dual L3 feeds from "the switches" to "the routers", and VLANs
> are carried over an l2circuit should the direct link between EX1 & EX2
> bite the dust. It should be noted that MX1 is basically a "backup" -
> traffic normally flows EX1-EX2-MX2. The goal of this setup is that we
> can take out any link and still have our video working.
>
> It works... eventually.
>
> The problem I am running into is that when a failure occurs, or I
> simply pull a VLAN from the EX1-EX2 link, multicast is suddenly slammed
> either across or into the MXes.
> When that happens, I get this lovely message:
>
> jddosd[1527]: DDOS_PROTOCOL_VIOLATION_SET: Protocol resolve:mcast-v4 is
> violated at fpc 0 for 38 times, started at 2016-01-27 04:59:55 EST
> jddosd[1527]: DDOS_PROTOCOL_VIOLATION_CLEAR: Protocol resolve:mcast-v4
> has returned to normal. Violated at fpc 0 for 38 times, from 2016-01-27
> 04:59:55 EST to 2016-01-27 04:59:55 EST
>
> ...and traffic (maybe just of the offending class) on that slot is
> dumped for a little while.
>
> > show ddos-protection protocols resolve statistics
>
> Packet type: mcast-v4
> System-wide information:
>   Bandwidth is no longer being violated
>     No. of FPCs that have received excess traffic: 1
>     Last violation started at: 2016-01-27 04:59:55 EST
>     Last violation ended at:   2016-01-27 04:59:55 EST
>     Duration of last violation: 00:00:00    Number of violations: 38
>   Received:  4496939             Arrival rate:     0 pps
>   Dropped:   2161644             Max arrival rate: 45877 pps
> Routing Engine information:
>   Policer is never violate
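[Once a violation like the one quoted above has latched, the per-protocol
policer state can be inspected and reset from the operational CLI. A hedged
sketch; verify the exact command forms on your Junos release:]

```
show ddos-protection protocols resolve mcast-v4
clear ddos-protection protocols resolve states
clear ddos-protection protocols resolve statistics
```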
[j-nsp] MX punting packets to RE - why?
Hi list,

I've run into an oddity that's been causing us some issues. First, a
diagram!

EX1    EX2
 |      |
 |      |
MX1    MX2

EX1 and EX2 are independent switches (not VC) that run a ton of video
traffic. EX4200 on 12.3R8.7
MX1 and MX2 are MPLS PEs that ingest video and send it out to our
network. MX104 on 13.3R4.6
Several VLANs span EX1 and EX2 as each switch has a server that requires
Layer 2 to the other unit. (active/active middleware)
EX1-EX2 link is direct fiber carrying VLANs
MX1-MX2 link is MPLS

The MX ports facing the EXes terminate L3 as well as hauling L2:

MX1:

xe-0/3/0 {
    description "EX1 xe-3/1/0";
    flexible-vlan-tagging;
    hold-time up 5000 down 0;
    encapsulation flexible-ethernet-services;
    unit 3810 {
        description "Backup link between TV switches";
        encapsulation vlan-ccc;
        vlan-id-list [ 304 810-811 3810 3813 3821-3822 ];
    }
    unit 3812 {
        description "Video feed 2/2 from head end switch";
        vlan-id 3812;
        family inet {
            address MX1/31;
        }
    }
}
l2circuit {
    neighbor MX2 {
        interface xe-0/3/0.3810 {
            virtual-circuit-id 3810;
            description "IPTV switch redundant link";
            no-control-word;
        }
    }
}

MX2:

xe-0/3/0 {
    description "EX1 xe-0/1/0";
    flexible-vlan-tagging;
    hold-time up 5000 down 0;
    encapsulation flexible-ethernet-services;
    unit 3810 {
        description "Backup link between TV switches";
        encapsulation vlan-ccc;
        vlan-id-list [ 304 810-811 3813 3821-3822 ];
    }
    unit 3811 {
        description "Video feed 1/2 from head end switch";
        vlan-id 3811;
        family inet {
            address MX2/31;
        }
    }
}
l2circuit {
    neighbor MX1 {
        interface xe-0/3/0.3810 {
            virtual-circuit-id 3810;
            description "IPTV switch redundant link";
            no-control-word;
        }
    }
}

We have dual L3 feeds from "the switches" to "the routers", and VLANs are
carried over an l2circuit should the direct link between EX1 & EX2 bite
the dust. It should be noted that MX1 is basically a "backup" - traffic
normally flows EX1-EX2-MX2. The goal of this setup is so that we can take
out any link and still have our video working.

It works... eventually.
The problem I am running into is that when a failure occurs, or I simply
pull a VLAN from the EX1-EX2 link, multicast is suddenly slammed either
across or into the MXes. When that happens, I get this lovely message:

jddosd[1527]: DDOS_PROTOCOL_VIOLATION_SET: Protocol resolve:mcast-v4 is
violated at fpc 0 for 38 times, started at 2016-01-27 04:59:55 EST
jddosd[1527]: DDOS_PROTOCOL_VIOLATION_CLEAR: Protocol resolve:mcast-v4
has returned to normal. Violated at fpc 0 for 38 times, from 2016-01-27
04:59:55 EST to 2016-01-27 04:59:55 EST

...and traffic (maybe just of the offending class) on that slot is dumped
for a little while.

> show ddos-protection protocols resolve statistics

Packet type: mcast-v4
System-wide information:
  Bandwidth is no longer being violated
    No. of FPCs that have received excess traffic: 1
    Last violation started at: 2016-01-27 04:59:55 EST
    Last violation ended at:   2016-01-27 04:59:55 EST
    Duration of last violation: 00:00:00    Number of violations: 38
  Received:  4496939             Arrival rate:     0 pps
  Dropped:   2161644             Max arrival rate: 45877 pps
Routing Engine information:
  Policer is never violated
    Received:  130584            Arrival rate:     0 pps
    Dropped:   0                 Max arrival rate: 1 pps
    Dropped by aggregate policer: 0
FPC slot 0 information:
  Policer is no longer being violated
    Last violation started at: 2016-01-27 04:59:57 EST
    Last violation ended at:   2016-01-27 04:59:57 EST
    Duration of last violation: 00:00:00    Number of violations: 38
  Received:  4496939             Arrival rate:     0 pps
  Dropped:   2161644             Max arrival rate: 45877 pps
    Dropped by this policer: 2161644
    Dropped by aggregate policer: 0
    Dropped by flow suppression: 0
  Flow counts:
    Aggregation level    Current    Total detected    State
    Subscriber           0          0                 Active

Once the thing recovers, everything works again. But I cannot change a
VLAN, a spanning tree topology, or work on anything without risking
serious impact to my network!

I understand that the 'resolve' protocol means these packets are being
sent to the RE.

...why the hell are they being sent to the RE?
Even when there's a change in the traffic that gets sent into that
l2circuit - shouldn't this just be punte
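[The 'resolve' punts described in this thread can be cross-checked against
the multicast cache state mentioned in the first reply. A hedged diagnostic
sketch; the group address used here is hypothetical:]

```
# Multicast cache entries live in inet.1
show route table inet.1

# Per-group forwarding state, including the expected incoming interface
show multicast route group 239.1.1.1 extensive

# PIM's view of the same S,G join state
show pim join extensive
```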