Re: [j-nsp] Strange VRRP problem -- question about restarting process
So weird; there must be a bug inside.

On Nov 3, 2012 9:12 PM, Terry Baranski terry.baranski.l...@gmail.com wrote:

Had this happen on an m40e recently and ended up having to reboot it.

-Terry

___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] Strange VRRP problem -- question about restarting process
On Fri, Nov 2, 2012 at 6:55 PM, John Neiberger jneiber...@gmail.com wrote:

Sorry for the barrage of emails, but I wanted to mention that we can ping in both directions through the switch from one router interface to the other, so we know broadcasts (ARP) and unicast work. How could the problem be limited to only multicast if IGMP snooping is off? This is such a simple scenario that it's really bugging me that I can't figure it out. lol

Had this happen on an m40e recently and ended up having to reboot it.

-Terry
[j-nsp] Strange VRRP problem -- question about restarting process
We have a very odd problem that we've been dealing with for a couple of weeks. JTAC is involved but we have not come to a resolution yet. The gist of the problem is that we have two MX960s and we're running VRRP on multiple interfaces with different Cisco switches in between each pair of Juniper interfaces.

[J] -- [C] -- [C] -- [J]

The switches are just layer two and we're running VRRP on the routers. The problem is that one day, three of the interfaces on the backup router suddenly stopped receiving VRRP messages from its peer. JTAC seems to think that the Cisco switches just suddenly stopped forwarding VRRP messages to the backup router, but that makes zero sense unless some bizarre issue just happened to occur on multiple unrelated switches at exactly the same moment. I'm still leaning toward a problem on the router.

Which leads me to my question. What is the risk of restarting the VRRP process? I see we have soft and graceful as options. Both sound fairly low-risk. I'm tempted to just restart the process on the backup router to see if that fixes the problem. What do you think?

Thanks,
John
Re: [j-nsp] Strange VRRP problem -- question about restarting process
We have a very odd problem that we've been dealing with for a couple of weeks. JTAC is involved but we have not come to a resolution yet. The gist of the problem is that we have two MX960s and we're running VRRP on multiple interfaces with different Cisco switches in between each pair of Juniper interfaces.

[J] -- [C] -- [C] -- [J]

The switches are just layer two and we're running VRRP on the routers. The problem is that one day, three of the interfaces on the backup router suddenly stopped receiving VRRP messages from its peer. JTAC seems to think that the Cisco switches just suddenly stopped forwarding VRRP messages to the backup router, but that makes zero sense unless some bizarre issue just happened to occur on multiple unrelated switches at exactly the same moment. I'm still leaning toward a problem on the router.

Did you try disabling IGMP snooping for the VLAN on the switches?
Re: [j-nsp] Strange VRRP problem -- question about restarting process
Well, that's fairly straightforward. Either:

(1) VRRP on master [J] stopped sending, or
(2) CSCO switches stopped forwarding VRRP hellos, or
(3) backup [J] drops incoming VRRP hellos.

You can verify (1) by using monitor traffic interface blah no-resolve size .
(2) could be verified with SPAN/RSPAN.
(3) cannot be verified with monitor traffic interface _if_ there is an input FW filter: monitor traffic interface, a.k.a. tcpdump, does not capture packets dropped by a FW filter.

Which raises a question: do you have an input FW filter on the VRRP interfaces or lo0, and if yes, do you allow protocol vrrp as well as AH/proto 51, and have you added/changed the VRRP auth type recently? Proto 51 is used when VRRP MD5 auth is configured.

In any case, I'd suggest configuring a FW filter to log/syslog incoming VRRP packets (dst.ip 224.0.0.18/32) on backup [J].

HTH
Rgds
Alex

- Original Message -
From: John Neiberger jneiber...@gmail.com
To: juniper-nsp@puck.nether.net
Sent: Friday, November 02, 2012 3:37 PM
Subject: [j-nsp] Strange VRRP problem -- question about restarting process

We have a very odd problem that we've been dealing with for a couple of weeks. JTAC is involved but we have not come to a resolution yet. The gist of the problem is that we have two MX960s and we're running VRRP on multiple interfaces with different Cisco switches in between each pair of Juniper interfaces.

[J] -- [C] -- [C] -- [J]

The switches are just layer two and we're running VRRP on the routers. The problem is that one day, three of the interfaces on the backup router suddenly stopped receiving VRRP messages from its peer. JTAC seems to think that the Cisco switches just suddenly stopped forwarding VRRP messages to the backup router, but that makes zero sense unless some bizarre issue just happened to occur on multiple unrelated switches at exactly the same moment. I'm still leaning toward a problem on the router.

Which leads me to my question. What is the risk of restarting the VRRP process? I see we have soft and graceful as options. Both sound fairly low-risk. I'm tempted to just restart the process on the backup router to see if that fixes the problem. What do you think?

Thanks,
John
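As a rough illustration of what the suggested log/syslog filter has to match, here is a minimal Python sketch (not Junos configuration) that classifies a raw IPv4 packet by the two fields named in the message above: destination 224.0.0.18 and either protocol 112 (VRRP) or protocol 51 (AH, which the thread says is used when VRRP MD5 auth is configured). The function names `classify_ipv4` and `build_header` are hypothetical and exist only for this example; the sample source address is a documentation address, not one from the thread.

```python
import socket
import struct

VRRP_GROUP = "224.0.0.18"  # VRRP multicast group mentioned in the thread
PROTO_VRRP = 112           # IANA IP protocol number for VRRP
PROTO_AH = 51              # AH; per the thread, seen with VRRP MD5 auth


def classify_ipv4(packet: bytes) -> str:
    """Return 'vrrp', 'vrrp-ah' or 'other' for a raw IPv4 packet.

    The packet is expected to start at the IPv4 header (no Ethernet header).
    """
    # Need at least a minimal 20-byte header with version nibble 4.
    if len(packet) < 20 or packet[0] >> 4 != 4:
        return "other"
    protocol = packet[9]                     # protocol field, byte offset 9
    dst = socket.inet_ntoa(packet[16:20])    # destination address, bytes 16..19
    if dst != VRRP_GROUP:
        return "other"
    if protocol == PROTO_VRRP:
        return "vrrp"
    if protocol == PROTO_AH:
        return "vrrp-ah"
    return "other"


def build_header(protocol: int, src: str, dst: str, ttl: int = 255) -> bytes:
    """Hypothetical helper: build a bare 20-byte IPv4 header for the demo.

    The checksum is left at 0 because classify_ipv4 does not verify it.
    """
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0, ttl, protocol, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )


if __name__ == "__main__":
    for proto in (PROTO_VRRP, PROTO_AH, 1):
        header = build_header(proto, "192.0.2.1", VRRP_GROUP)
        print(proto, classify_ipv4(header))
```

A filter that only matches protocol vrrp would miss the AH-wrapped case, which is why the message asks about both.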
Re: [j-nsp] Strange VRRP problem -- question about restarting process
Sorry for the lack of replies. I got swamped today and haven't had a chance to look at this much. Another one of our engineers has been working it. I did notice that the three interfaces I originally looked at back when this started all seem to be fine now. However, this weird behavior seems to have moved to some other interfaces. I'm going to need to investigate a bit more to find out what changed when I wasn't looking. :)

We do not have IGMP snooping enabled on the Cisco switches and we have no inbound filters that would block traffic. In fact, we have this identical config on several different routers and dozens of interfaces and switches with no problem. Whatever is wrong seems to be isolated to this router.

I'll try to regroup and get the latest info.

Thanks!
John

On Fri, Nov 2, 2012 at 11:18 AM, Alex Arseniev alex.arsen...@gmail.com wrote:

Well, that's fairly straightforward. Either:

(1) VRRP on master [J] stopped sending, or
(2) CSCO switches stopped forwarding VRRP hellos, or
(3) backup [J] drops incoming VRRP hellos.

You can verify (1) by using monitor traffic interface blah no-resolve size .
(2) could be verified with SPAN/RSPAN.
(3) cannot be verified with monitor traffic interface _if_ there is an input FW filter: monitor traffic interface, a.k.a. tcpdump, does not capture packets dropped by a FW filter.

Which raises a question: do you have an input FW filter on the VRRP interfaces or lo0, and if yes, do you allow protocol vrrp as well as AH/proto 51, and have you added/changed the VRRP auth type recently? Proto 51 is used when VRRP MD5 auth is configured.

In any case, I'd suggest configuring a FW filter to log/syslog incoming VRRP packets (dst.ip 224.0.0.18/32) on backup [J].

HTH
Rgds
Alex
Re: [j-nsp] Strange VRRP problem -- question about restarting process
Okay, I've been looking at this for a little bit and it's just really bizarre. I was wrong about the connectivity earlier. It's really just a single Cisco 4948 in the middle between these two MX960s. IGMP snooping is not enabled, nor are there any inbound filters on the routers. I have verified that our RE filter is allowing VRRP.

We have verified with the monitor traffic command that router 1 is sending and receiving VRRP multicasts, but router 2 is not receiving them, only sending them. The switch is a pretty vanilla config. The two links are in the same VLAN and there are no special features enabled, like MAC filtering or whatever. It's very straightforward, which is why we're all stumped. Something is stopping those multicasts from reaching router 2, but for the life of me I don't see what it could be.

On Fri, Nov 2, 2012 at 3:53 PM, John Neiberger jneiber...@gmail.com wrote:

Sorry for the lack of replies. I got swamped today and haven't had a chance to look at this much. Another one of our engineers has been working it. I did notice that the three interfaces I originally looked at back when this started all seem to be fine now. However, this weird behavior seems to have moved to some other interfaces. I'm going to need to investigate a bit more to find out what changed when I wasn't looking. :)

We do not have IGMP snooping enabled on the Cisco switches and we have no inbound filters that would block traffic. In fact, we have this identical config on several different routers and dozens of interfaces and switches with no problem. Whatever is wrong seems to be isolated to this router.

I'll try to regroup and get the latest info.

Thanks!
John
Re: [j-nsp] Strange VRRP problem -- question about restarting process
Sorry for the barrage of emails, but I wanted to mention that we can ping in both directions through the switch from one router interface to the other, so we know broadcasts (ARP) and unicast work. How could the problem be limited to only multicast if IGMP snooping is off? This is such a simple scenario that it's really bugging me that I can't figure it out. lol

On Fri, Nov 2, 2012 at 4:43 PM, John Neiberger jneiber...@gmail.com wrote:

Okay, I've been looking at this for a little bit and it's just really bizarre. I was wrong about the connectivity earlier. It's really just a single Cisco 4948 in the middle between these two MX960s. IGMP snooping is not enabled, nor are there any inbound filters on the routers. I have verified that our RE filter is allowing VRRP.

We have verified with the monitor traffic command that router 1 is sending and receiving VRRP multicasts, but router 2 is not receiving them, only sending them. The switch is a pretty vanilla config. The two links are in the same VLAN and there are no special features enabled, like MAC filtering or whatever. It's very straightforward, which is why we're all stumped. Something is stopping those multicasts from reaching router 2, but for the life of me I don't see what it could be.
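One reason a ping test may not settle the question: ARP goes to the broadcast MAC and the echo traffic to unicast MACs, while VRRP advertisements are sent to a multicast destination MAC derived from 224.0.0.18. So the frames that fail are addressed differently at layer two from the ones that succeed. The small Python sketch below shows the standard IPv4-multicast-to-Ethernet mapping (01:00:5e plus the low 23 bits of the group address); the function name `multicast_mac` is made up for this example, and nothing here is taken from the routers or the switch in the thread.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet destination MAC.

    Uses the standard mapping: the fixed prefix 01:00:5e followed by the
    low 23 bits of the group address.
    """
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF          # keep only the low 23 bits
    mac = (0x01005E << 24) | low23        # prepend the 01:00:5e prefix
    # Format the 48-bit value as six colon-separated hex octets.
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


if __name__ == "__main__":
    print(multicast_mac("224.0.0.18"))
```

When capturing on the switch side (for example with SPAN, as suggested earlier in the thread), this is the destination MAC to look for on the port facing router 2.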