Re: [j-nsp] Strange ARP issue on M7i

2012-08-15 Thread Saku Ytti
On (2012-08-14 13:09 -0700), Jonathan Lassoff wrote:

> Moral of the story, as I see it: avoid static routing.

This is a bit circular. The vendor had a software defect in ARP, and from that
you concluded that we should use dynamic routing instead of static routing.
However, our choice of configuration does not affect the quality of the code as
implemented by the vendor. We might just as well hit a BGP defect doing
something nasty, and someone might then draw the conclusion 'avoid BGP routing'.

The moral of the story is: avoid broken software, which is easier said than
done.

-- 
  ++ytti
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] Strange ARP issue on M7i

2012-08-15 Thread Jonathan Lassoff
On Wed, Aug 15, 2012 at 12:13 AM, Saku Ytti s...@ytti.fi wrote:
> On (2012-08-14 13:09 -0700), Jonathan Lassoff wrote:
>
>> Moral of the story, as I see it: avoid static routing.
>
> This is a bit circular. The vendor had a software defect in ARP, and from that
> you concluded that we should use dynamic routing instead of static routing.
> However, our choice of configuration does not affect the quality of the code as
> implemented by the vendor. We might just as well hit a BGP defect doing
> something nasty, and someone might then draw the conclusion 'avoid BGP routing'.
>
> The moral of the story is: avoid broken software, which is easier said than
> done.

You make a very good point here.

My point was more that routing (RIB) and next-hop path information
ought to be learned and/or monitored over protocols that ride that
same path, so that any path failures are detected and routed around.


Re: [j-nsp] Strange ARP issue on M7i

2012-08-15 Thread Saku Ytti
On (2012-08-15 00:21 -0700), Jonathan Lassoff wrote:

> My point was more that routing (RIB) and next-hop path information
> ought to be learned and/or monitored over protocols that ride that
> same path, so that any path failures are detected and routed around.

With a static route they are, too. The ARP timeout in JunOS is 20 minutes by
default, so it will just take quite a long time to invalidate the static route
(short of bugs like the one the OP sees).

Cisco uses 4 hours, which is absolutely ridiculous.

Linux uses 1 minute, which is shorter than the default BGP hold time on either
Cisco or Juniper. So a statically routed Linux box would converge faster than a
BGP-routed Juniper when a peer suddenly disappears.

Of course, both the ARP timeout and the BGP hold time are tunable, and both BGP
and static routes can run BFD.
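
As a rough illustration of the comparison above, the following Python sketch
ranks the worst-case time needed to notice a silently vanished neighbor. The
ARP timeouts are the figures quoted in this message. The BGP hold times (90 s
for Junos, 180 s for Cisco) are commonly documented defaults and are an
assumption here, as are all names in the code. BFD is left out because its
detection time depends entirely on the configured intervals.

```python
# Worst-case seconds before a dead neighbor is noticed, per mechanism.
# ARP values are the ones quoted in this thread; BGP hold times are
# assumed vendor defaults (check the documentation for your release).
DETECTION_SECONDS = {
    "Linux static (ARP)": 1 * 60,
    "Junos BGP (hold time)": 90,
    "Cisco BGP (hold time)": 180,
    "Junos static (ARP)": 20 * 60,
    "Cisco static (ARP)": 4 * 60 * 60,
}


def rank_by_detection(times):
    """Return (name, seconds) pairs sorted from fastest to slowest."""
    return sorted(times.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, seconds in rank_by_detection(DETECTION_SECONDS):
        print(f"{name:<24} {seconds:>6} s")
```

With these numbers the statically routed Linux box comes out ahead of both BGP
defaults, which is the point made above; tuning any of the timers changes the
order.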

-- 
  ++ytti


Re: [j-nsp] Strange ARP issue on M7i

2012-08-15 Thread Markus

Hi JP and all,

thanks for all the replies. show policer shows:

ad...@ffm01.rt show policer
Policers:
Name                                      Packets
__default_arp_policer__                   1140304
__policer_tmpl__-term                           0
__policer_tmpl__-fc0                            0
__policer_tmpl__-fc0                            0
__policer_tmpl__-fc1                            0
__policer_tmpl__-fc0                            0
__policer_tmpl__-fc1                            0
__policer_tmpl__-fc2                            0
__policer_tmpl__-fc0                            0
__policer_tmpl__-fc1                            0
__policer_tmpl__-fc2                            0
__policer_tmpl__-fc3                            0

What does that mean?

I don't seem to have anything configured related to that:

ad...@ffm01.rt show configuration | grep arp
 empty 

Thank you!
Markus


On 14.08.2012 21:37, JP Senior wrote:

> Hi, Markus.
> I have experienced issues in previous deployments that have involved built-in
> ARP policers.
>
> Hit up 'show policer', and look for __default_arp_policer__.
>
> JP Senior



Re: [j-nsp] Strange ARP issue on M7i

2012-08-15 Thread apurva modh
It means that the default ARP policer is dropping ARP packets.
You don't need to apply this policer to any interface; it is applied to all
interfaces by default. The default values of the ARP policer are tuned so that
it does not interrupt the normal ARP mechanism, so the counter in
'show policer' should not increment in an ideal scenario. Check whether any
machine is spoofing or flooding ARP.
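
To make the counter behavior concrete, here is a minimal token-bucket sketch
in Python. It is not the Junos implementation, and the rate and burst values
are made up for illustration; the real defaults of the ARP policer are not
stated in this thread. It only shows why a well-behaved LAN leaves the drop
counter at zero while a flood makes it climb.

```python
class Policer:
    """Token-bucket policer that counts the packets it drops."""

    def __init__(self, rate_pps, burst):
        self.rate_pps = rate_pps      # tokens added per second (hypothetical)
        self.burst = burst            # bucket depth in packets (hypothetical)
        self.tokens = float(burst)
        self.last = 0.0               # arrival time of the previous packet
        self.dropped = 0              # plays the role of the "Packets" column

    def offer(self, now):
        """Return True if a packet arriving at time `now` is allowed."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never beyond the bucket depth.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_pps)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        self.dropped += 1
        return False


def simulate(policer, pps, seconds):
    """Offer `pps` evenly spaced packets per second for `seconds` seconds."""
    for i in range(pps * seconds):
        policer.offer(i / pps)
    return policer.dropped


if __name__ == "__main__":
    print("normal:", simulate(Policer(rate_pps=100, burst=20), pps=10, seconds=2))
    print("flood: ", simulate(Policer(rate_pps=100, burst=20), pps=1000, seconds=2))
```

A counter in the millions, as in the output quoted below, would in this model
point at sustained ARP traffic well above the policed rate.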

By the way, your Junos is very old. Try upgrading to a newer Junos; there have
been many improvements since then.

On Wed, Aug 15, 2012 at 11:44 PM, Markus unive...@truemetal.org wrote:

> Hi JP and all,
>
> thanks for all the replies. show policer shows:
>
> ad...@ffm01.rt show policer
> Policers:
> Name                                      Packets
> __default_arp_policer__                   1140304
> __policer_tmpl__-term                           0
> __policer_tmpl__-fc0                            0
> __policer_tmpl__-fc0                            0
> __policer_tmpl__-fc1                            0
> __policer_tmpl__-fc0                            0
> __policer_tmpl__-fc1                            0
> __policer_tmpl__-fc2                            0
> __policer_tmpl__-fc0                            0
> __policer_tmpl__-fc1                            0
> __policer_tmpl__-fc2                            0
> __policer_tmpl__-fc3                            0
>
> What does that mean?


[j-nsp] Strange ARP issue on M7i

2012-08-14 Thread Markus

Hi all,

Last night I encountered something weird (in my opinion). I'm not sure if it is
Juniper related, but maybe someone here has seen something like this?

I was experiencing a strange effect where several websites hosted on a
Linux KVM VM didn't load properly. They would load, but 90% of the time
they would hang in some strange way, with the browser displaying
"Waiting for www.sitename.com..." after the whole page had loaded, or even
before anything of the page was displayed. A minute later it would sometimes
work, but only for a short period of time. After eliminating MySQL, Apache,
KVM etc. as the source of the problem, I logged into the M7i in front of that
host and saw:


ad...@ffm01.rt show arp no-resolve |grep 195.100.100.7
00:25:90:38:66:c6 195.100.100.7    ge-0/0/0.0    none
00:25:90:38:66:c6 195.100.101.34   ge-0/0/0.0    none

195.100.100.7 is the KVM host. So I thought: why is 101.34 up?
It's an IP that hasn't been in use for years, yet in the Juniper config a
whole /24 was still being routed to it. I thought, OK, the KVM host
got hax0red or something and the intruder assigned 101.34, but I couldn't
find anything. 101.34 wasn't reachable from any machine in the same LAN,
and the MAC could not be seen either. There was no traffic to/from it on the
switch monitoring port either. All I saw was traffic (port scans, I
think) to the /24, which ended up on the KVM host (195.100.100.7). That
indicated that either the KVM host really was also saying "I have
195.100.101.34", or the Juniper insisted that the IP is at that MAC. I
suspect the latter. I physically shut down the KVM host and cleared the
ARP cache on the Juniper: 195.100.100.7 was gone, but 195.100.101.34 was
still there with the identical MAC as before.
I then removed the static route entry for the /24 which was pointing to
195.100.101.34, and only then did the ARP entry for 195.100.101.34 disappear!
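
For anyone wanting to spot the same symptom without reading the table by eye,
here is a small Python sketch that flags MAC addresses appearing behind more
than one IP. It assumes whitespace-separated 'MAC IP interface flags' columns
as in the 'show arp no-resolve' output above; the header line and the third
host in the sample data are made up for illustration, and the function name is
hypothetical.

```python
from collections import defaultdict


def macs_with_multiple_ips(arp_output):
    """Map each MAC to its sorted IPs, keeping only MACs seen on >1 IP."""
    ips_by_mac = defaultdict(set)
    for line in arp_output.splitlines():
        fields = line.split()
        # Skip headers, blanks, and anything that is not 'MAC IP ...'.
        if len(fields) < 2 or fields[0].count(":") != 5:
            continue
        ips_by_mac[fields[0].lower()].add(fields[1])
    return {mac: sorted(ips) for mac, ips in ips_by_mac.items() if len(ips) > 1}


# First two entries are from the output above; the header and the last
# entry are hypothetical filler.
SAMPLE = """\
MAC Address       Address         Interface     Flags
00:25:90:38:66:c6 195.100.100.7   ge-0/0/0.0    none
00:25:90:38:66:c6 195.100.101.34  ge-0/0/0.0    none
00:11:22:33:44:55 195.100.100.9   ge-0/0/0.0    none
"""

if __name__ == "__main__":
    for mac, ips in macs_with_multiple_ips(SAMPLE).items():
        print(mac, "->", ", ".join(ips))
```

A MAC behind several IPs is not wrong in itself (a host with secondary
addresses looks the same), so a hit is only a starting point for checking.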


Isn't that weird? Where did that ARP entry come from, and why was it
kept on the Juniper for so long and only removed after I removed
the static route for that /24?


I'm running JUNOS 8.0R2.8. :)

This didn't eliminate the problem with the websites' reachability. I
think it is something local to my dialup connection, as I see a lot of
TCP retransmission errors when accessing any site on any of the VMs
hosted on that KVM host. Through an alternative dialup provider
everything is fine. Other sites on other boxes in the same LAN work just
fine via the first provider, though. The problem comes and goes now.
Really puzzled!


Anyway, can't stop thinking about the ARP thing so I thought I would ask 
here! Thank you very much!


Regards
Markus





Re: [j-nsp] Strange ARP issue on M7i

2012-08-14 Thread JP Senior
Hi, Markus.
I have experienced issues in previous deployments that have involved built-in 
ARP policers.

Hit up 'show policer', and look for __default_arp_policer__.

JP Senior




Re: [j-nsp] Strange ARP issue on M7i

2012-08-14 Thread Tobias Heister
Hi

On 14.08.2012 15:12, Markus wrote:
> Isn't that weird? Where did that arp entry come from and why was it saved on
> the Juniper for so long, and only got removed after I removed the static
> routing of that /24?

We saw a similar thing a short time ago on an MX480 running 10.4R9.
In our case it was a BGP route pointing to a no-longer-existing IP address as
the next-hop. The ARP entry for this IP address stayed active as long as there
was an active route for it.
Even clearing the ARP cache with 'clear arp hostname x.x.x.x' did not do the
trick. The next-hop IP had been gone for several weeks, and the ARP entry had
low timeout values left but never expired.
After clearing the route, the ARP entry vanished as expected.

I guess something keeps the ARP entry from being deleted as long as there are,
or at any time were, forwarding entries in the FIB for it.
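
That guess can be written down as a toy model. The Python sketch below is
purely an illustration of the hypothesis, not of how Junos actually manages
ARP and next-hops: an ARP entry is "pinned" by every route that uses it as a
next-hop, a clear only removes unpinned entries, and deleting the last route
releases the entry. All class and method names are invented, and the /24
prefix is a placeholder because the thread does not say which /24 was routed.

```python
class ArpTable:
    """Toy ARP cache in which routes pin the next-hop entries they use."""

    def __init__(self):
        self.entries = {}   # ip -> mac
        self.pins = {}      # ip -> set of route prefixes using it as next-hop

    def learn(self, ip, mac):
        self.entries[ip] = mac

    def add_route(self, prefix, next_hop):
        self.pins.setdefault(next_hop, set()).add(prefix)

    def delete_route(self, prefix, next_hop):
        self.pins.get(next_hop, set()).discard(prefix)
        # Once no route references the next-hop, the stale entry is released.
        if not self.pins.get(next_hop):
            self.pins.pop(next_hop, None)
            self.entries.pop(next_hop, None)

    def clear(self):
        """Like clearing the ARP cache: drop entries that no route pins."""
        for ip in list(self.entries):
            if not self.pins.get(ip):
                del self.entries[ip]


if __name__ == "__main__":
    table = ArpTable()
    table.learn("195.100.100.7", "00:25:90:38:66:c6")
    table.learn("195.100.101.34", "00:25:90:38:66:c6")
    table.add_route("195.100.101.0/24", "195.100.101.34")  # placeholder prefix
    table.clear()
    print(sorted(table.entries))   # only the pinned next-hop survives
    table.delete_route("195.100.101.0/24", "195.100.101.34")
    print(sorted(table.entries))   # now empty
```

The model reproduces both observations in this thread (entry survives a
clear, entry vanishes with the route), but it says nothing about where the
entry came from in the first place.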

-- 
Kind Regards
Tobias Heister


Re: [j-nsp] Strange ARP issue on M7i

2012-08-14 Thread Jonathan Lassoff
On Tue, Aug 14, 2012 at 1:00 PM, Tobias Heister li...@tobias-heister.de wrote:
> Hi
>
> On 14.08.2012 15:12, Markus wrote:
>> Isn't that weird? Where did that arp entry come from and why was it saved on
>> the Juniper for so long, and only got removed after I removed the static
>> routing of that /24?
>
> We saw a similar thing a short time ago on an MX480 running 10.4R9.
> In our case it was a BGP route pointing to a no-longer-existing IP address as
> the next-hop. The ARP entry for this IP address stayed active as long as
> there was an active route for it.
> Even clearing the ARP cache with 'clear arp hostname x.x.x.x' did not do the
> trick. The next-hop IP had been gone for several weeks, and the ARP entry had
> low timeout values left but never expired.
> After clearing the route, the ARP entry vanished as expected.
>
> I guess something keeps the ARP entry from being deleted as long as there are,
> or at any time were, forwarding entries in the FIB for it.

Probably because the information ARP learns is used to build the
next-hop in the forwarding table (which needs to know what Ethernet
address to put in the destination MAC).

However, I would think that the route should become unreachable or be
pruned if ARP is failing.
What if the remote router died for some reason? If the ARP entry and
next-hop were kept in place, the path would not work, but the route
would stay active.
A dynamic routing protocol with BFD would see this right away and
move traffic, but this would break any static routes that rely on any
dynamism with ARP and next-hops.

Moral of the story, as I see it: avoid static routing.

--j


Re: [j-nsp] Strange ARP issue on M7i

2012-08-14 Thread Tobias Heister
Hi,

On 14.08.2012 22:09, Jonathan Lassoff wrote:
> A dynamic routing protocol with BFD would see this right away and
> move traffic, but this would break any static routes that rely on any
> dynamism with ARP and next-hops.
>
> Moral of the story, as I see it: avoid static routing.

At least in our case it was a BGP route with a third-party next-hop (a server)
living on a connected LAN segment.
So we could not have been saved by BFD in this case, but I admit it's a special
setup.

But it is funny that this behavior is present across platforms (M7i to MX with
DPC) and Junos versions (from 8.0 to 10.4), though of course this may have been
coincidence.

-- 
Kind Regards
Tobias Heister


Re: [j-nsp] Strange ARP issue on M7i

2012-08-14 Thread Jonathan Lassoff
On Tue, Aug 14, 2012 at 1:20 PM, Tobias Heister li...@tobias-heister.de wrote:
> Hi,
>
> On 14.08.2012 22:09, Jonathan Lassoff wrote:
>> A dynamic routing protocol with BFD would see this right away and
>> move traffic, but this would break any static routes that rely on any
>> dynamism with ARP and next-hops.
>>
>> Moral of the story, as I see it: avoid static routing.
>
> At least in our case it was a BGP route with a third-party next-hop (a server)
> living on a connected LAN segment.
> So we could not have been saved by BFD in this case, but I admit it's a special
> setup.

I'm confused, because you said that "The next-hop ip was gone for
several weeks".

In this case, wouldn't BGP detect the neighbor as down and remove the
route from the RIB?

--j