Re: ARP issues when using ldpd and MPLS pseudowires

Henry Bonath Mon, 01 Apr 2019 19:40:34 -0700

Lee, I just read your post about 5 minutes before you sent this. I agree:
I think this is all related.
I'm not running pseudowires in my environment, only L3VPN, but we're all
talking MPLS here.


I came across this thread from the dev mailing list, posted by (I assume)
the same Adrian Close who started this thread:
http://openbsd-archive.7691.n7.nabble.com/ARP-issues-when-using-ldpd-8-and-mpw-4-td360853.html

It looks like a patch may have been produced, but I do not know how to test
it. I'm not sure whether I can pull down just a small part of the
OpenBSD source or whether the entire OS has to be built (although I'd love
to learn how to do this).

If I'm reading this right, the issue is in if_ethersubr.c and shows up when
running LDP: when ARPing for a neighbor,
ARP "who has" requests go out for the address of the neighbor router's LDP
ID, not for its directly connected (broadcast-adjacent) address.
In my case, I run loopbacks advertised into OSPF and use those for LDP and
BGP. ARP requests go out for those loopback
IP addresses, which are not broadcast-adjacent, so my ARP entries end up
expired/incomplete. (ARP should know that these
addresses are not within my interface's subnet and therefore should not be
sending ARPs for them!)
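
To make the expectation concrete, here is a rough Python sketch of the
on-link check I would expect before an ARP request goes out. This is NOT the
kernel code from if_ethersubr.c, only an illustration. The function names,
the /24 prefix on the interface, and the 10.255.0.1 loopback are all made up
for the example.

```python
import ipaddress


def is_on_link(target, interface_cidr):
    """Return True if target lies inside the interface's connected subnet."""
    iface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(target) in iface.network


def arp_target(dest, next_hop, interface_cidr):
    """Pick the address that should be ARPed for.

    On-link destinations are resolved directly; anything else (for example
    a neighbor's loopback / LDP ID) should be resolved via the on-link next
    hop instead.
    """
    if is_on_link(dest, interface_cidr):
        return dest
    if next_hop is not None and is_on_link(next_hop, interface_cidr):
        return next_hop
    raise ValueError("no on-link address to ARP for")


# Hypothetical values: interface address/prefix and loopback are assumptions.
IFACE = "100.92.64.10/24"
NEIGHBOR_LOOPBACK = "10.255.0.1"   # LDP ID, learned via OSPF, not on-link
NEIGHBOR_ADJACENT = "100.92.64.68"  # directly connected address

print(arp_target(NEIGHBOR_LOOPBACK, NEIGHBOR_ADJACENT, IFACE))
```

With these made-up values the ARP request should go to the directly
connected address, never to the loopback.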

If someone could help me get this patch built, I'd gladly reach back out to
the developer to see if we could get this rolled into a syspatch
or maybe into 6.5, which I assume is right around the corner. I have
not tested MPLS on 6.5 yet.

On Mon, Apr 1, 2019 at 10:18 PM Lee Nelson <lnel...@nelnet.org> wrote:

> This sounds very similar to the problem I mentioned over the last couple
> of days in an email with the subject "Trouble forwarding between mpw's in
> bridge (6.4)".
>
> Our environments are very different, but I think the underlying problem
> may be the same. In short, arp inside of a bridge works as it should except
> between mpw's (pseudowires). An arp broadcast entering the bridge on one
> mpw exits the bridge properly on physical interfaces, but does not get
> properly encapsulated onto the other mpw. The problem probably affects all
> broadcast traffic, but so far arp is the only broadcast traffic I have
> dealt with. Like you, I have to statically configure entries in the arp
> tables. This hack does not scale.
>
> On Mon, Apr 1, 2019, 18:36 Henry Bonath <he...@thebonaths.com> wrote:
>
>> Tom, Adrian, et al -
>>
>> I posted about this issue a few weeks ago - apparently this
>> affects more than just
>> VirtualBox or VMware; I am experiencing this *EXACT* thing on Hyper-V as
>> well.
>> I have not tried this on metal.
>>
>> My network looks like this:
>>
>> (Customer VMs)<--->(Hyper-V OpenBSD 6.4 PE)<--->(CISCO ASR P)<--->(CISCO ME3750 PE)<--->(CE)
>> The Layer-2 between the Hyper-V and Cisco ASR is a Cisco Nexus 5672.
>> I am using L3VPN instead of Pseudowire.
>>
>> Some of the ARP entries will time out and when they do, LDP will crash.
>> (ldp engine terminated; signal 10)
>> 100.92.64.37                         (incomplete)         hvn0 expired  <---- ARP timed out
>> 100.92.64.68                         00:b7:71:93:32:95    hvn0 6m8s     <---- ARP about to time out
>>
>> ARP timing out makes no sense, as these devices are all running OSPF with
>> each other.
>> Granted, OSPF uses multicast to 224.0.0.5, but I would think that would be
>> enough to keep ARP up.
>>
>> In my environment I use Salt to manage my systems, and my PE formula has
>> static ARP entries
>> that get added, but that's a workaround, not really a fix.
>>
>> 100.92.64.37                         (incomplete)         hvn0 expired  <--- ARP still missing for this guy
>> 100.92.64.68                         00:b7:71:93:32:95    hvn0 expired  <--- ARP timed out while writing this
>>
>> # ping 100.92.64.68
>> PING 100.92.64.68 (100.92.64.68): 56 data bytes
>> ping: sendmsg: Host is down
>> ping: wrote 100.92.64.68 64 chars, ret=-1
>> ping: sendmsg: Host is down
>>
>> Other ARP entries stay up: the ones for hosts that do not run LDP and,
>> oddly enough, other OpenBSD systems.
>> It seems like this only happens with OpenBSD LDP against Cisco IOS/IOS-XE
>> (in my environment, anyway).
>>
>> -Henry
>>
>>
>> On Wed, Mar 13, 2019 at 7:28 PM Tom Smyth <tom.sm...@wirelessconnect.eu> wrote:
>>
>> > Adrian,
>> > sorry I only saw this now ...   when trying to go through old unread
>> > mails
>> >
>> > I would be very wary of vmware virtual networking and Layer 2
>> > forwarding.
>> > I loved vmware before I discovered the ridiculous shortcomings in
>> > their virtual networks
>> >
>> >  Vmware Virtual Switches  vmxnet
>> > they are not switches or bridges...   :(  they (vmware) over-optimised,
>> > and the virtual switches forward to and from VMs by default based on
>> > MACs learned
>> > via each attached machine's vmx config file.
>> > the workaround is promiscuous mode for the virtual switch... (turns
>> > your virtual switch into
>> > a crappy hub)
>> >  but this copies packets (frames) that are destined
>> > for one virtual machine attached to the virtual switch, so if
>> > you have high traffic volumes
>> > and a lot of machines attached to the virtual LAN ...  your perf is
>> > going to suck ...
>> > also you need to allow forged transmits on the virtual switch (MACs
>> > that don't match the vmx machine
>> > MAC configuration, which is what all bridged packets from behind your
>> > openbsd guest will appear as) ...
>> >
>> > if you are desperate to use vmware ... check out the labs...  they had
>> > an "improved"
>> > virtual switch with MAC learning capabilities ... (only downside is
>> > that particular virtual switch
>> > has no MAC ageing, so your virtual switch FIB won't flush
>> > without rebooting the host)
>> >
>> > apparently vmware have a switch that has proper MAC learning from
>> > virtual machines that
>> > are bridging, but this requires the super duper awesome license (the
>> > enterprise + or something like that)
>> >
>> > If you still need to use vmware on a lesser license, perhaps use a
>> > multiport card + SR-IOV and avoid their poor virtual switches
>> >
>> > basically you will have a lot of hassle with that,
>> > I hope this helps ... 352 days later :/
>> >
>> > Tom Smyth
>> > PS
>> > Einstein once said " you should make things as simple as possible but
>> > no simpler" it would appear vmware
>> > did not heed this advice...  and you dont have to be a genius to work
>> > that out ... :)   (because I did :) )
>> >
>> > On Fri, 16 Mar 2018 at 04:55, Adrian Close <adr...@close.wattle.id.au> wrote:
>> > >
>> > > Hi,
>> > >
>> > > I'm looking at doing some MPLS/VPLS stuff with OpenBSD, in particular
>> > > using 'mpw' pseudowires.  I've created a test network comprising two
>> > > "PE" and two "P" hosts, to transport Ethernet traffic between service
>> > > ports on the PE hosts across the MPLS network, based on an example I
>> > > found online.  I'm using a 6.3 snapshot from March 11th.
>> > >
>> > >    [firewall] = [em0][mpw0][PE1][em1] - [em0][P1][em1] - [em1][P2][em0] - [em1][PE2][mpw0][em0] = [host]
>> > >
>> > > PE1 em0 and mpw0 are in a bridge, PE1 em1 is MPLS, P1 em0/1 are MPLS etc.
>> > >
>> > > This is all working great, except for short outages which turn out to
>> > > coincide with the ARP cache expiry time for the P router's IP address
>> > > on the PE host.
>> > >
>> > > When the ARP entry times out (or is manually deleted), the PE host
>> > > doesn't ARP for the P router IP, but instead sends ARP who-has queries
>> > > for other, definitely non-local things, such as the IP address for the
>> > > other PE host's router-id.  After a minute or so it finally ARPs for
>> > > the P router IP and things work again.
>> > >
>> > > This only happens when "ldpd" is running (and I think only when the
>> > > pseudowires are actually up).  If I stop "ldpd" on the PE host, ARP
>> > > works fine as expected every time.
>> > >
>> > > I guess I could fix this with static ARP entries, but that doesn't
>> > > seem like quite the right thing.  My test setup is running in VirtualBox
>> > > VMs.  I also replicated the issue under VMware ESX using 'vic'
>> > > interfaces.
>> > >
>> > > Does anyone have any clues on this?
>> > >
>> > > Thanks in advance,
>> > >
>> > > Adrian Close
>> > >
>> >
>> >
>> > --
>> > Kindest regards,
>> > Tom Smyth
>> >
>> > Mobile: +353 87 6193172
>> >
>> >
>>
>
