Re: [bess] EVPN Draft Comments

Ravi Shekhar Tue, 03 Feb 2015 23:38:53 -0800

The approach of A/B/C each independently generating a route to X based only on 
local MAC learning works even if physical link failure detection is not an 
option. In most practical cases, X will have enough active flows that local 
learning on A/B/C will hopefully happen well before the connection to A fails. 
When that is not the case, the current scheme covers it via the ESI route. Can 
you elaborate a bit more on when it does not work?


There are other problems with the approach that you have cited. For instance, 
if the advertisement of a route to X from A triggers B and C to generate a 
route to X as well, what happens if X goes away? The PHY layer will not help 
A/B/C detect that X has gone away. A will keep its advertisement intact (even 
after X has aged out locally) because it is seeing a route to X from B/C, much 
as B/C created the route to X based on A's advertisement initially. Similarly, 
B/C will keep their routes intact because A's route is lingering around. This 
creates a circular dependency, and the route to X will never get withdrawn. 
Even if this can be fixed with extra state and extra bits in the advertisement 
from B/C (which may eventually make the scheme even more complex), advertising 
a route based on another route and tracking that dependency in the same route 
table will require extra BGP machinery that does not exist today.
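
To make the circular dependency concrete, here is a minimal Python sketch. It 
is a toy model and not BGP: each PE simply does or does not advertise a route 
to X, and the function name converge and the flag derive_from_peers are 
hypothetical names chosen for illustration. It assumes all PEs re-evaluate in 
lock-step rounds.

```python
# Toy model, not BGP: each PE either advertises a MAC route to X or does not.
# All names here (converge, derive_from_peers) are hypothetical.

def converge(local, derive_from_peers, advertised=(), max_rounds=10):
    """Return the set of PEs advertising X once the state stops changing.

    local: dict mapping PE name -> True if X is currently learned locally.
    derive_from_peers: if True, a PE also advertises X whenever it sees
        a route to X from any other PE (the scheme under discussion).
    advertised: PEs advertising X before this run starts.
    """
    advertised = set(advertised)
    for _ in range(max_rounds):
        nxt = set()
        for pe, learned_locally in local.items():
            sees_peer_route = any(p != pe for p in advertised)
            if learned_locally or (derive_from_peers and sees_peer_route):
                nxt.add(pe)
        if nxt == advertised:
            break
        advertised = nxt
    return advertised


# X is up and has only hashed flows to A; B and C derive their routes from A.
up = converge({"A": True, "B": False, "C": False}, derive_from_peers=True)

# X goes away: local learning ages out on every PE.
gone = {"A": False, "B": False, "C": False}
stale = converge(gone, derive_from_peers=True, advertised=up)
clean = converge(gone, derive_from_peers=False, advertised=up)

print(sorted(up))     # all three PEs advertise X
print(sorted(stale))  # still all three: nobody withdraws first
print(sorted(clean))  # empty: local aging alone withdraws the route
```

With derived routes, every PE justifies its advertisement by pointing at the 
others, so the set never shrinks. With local learning only, aging out is 
enough to withdraw the route.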

I would think that the closer a PE is to the source of information that causes 
it to generate a route - like local MAC learning for a MAC route, or LAG link 
presence for an ESI route - the more accurate its information will be. The 
farther the PE is from the source of information - like a PE depending on 
another PE's route to generate its own MAC route - the slower convergence will 
be and the more complex it will be to keep track of dependent state.

My 2 cents.
- Ravi.

-----Original Message-----
From: Russ White [mailto:ru...@riw.us] 
Sent: Tuesday, February 03, 2015 7:47 PM
To: Ravi Shekhar; John E Drake
Cc: 'Rabadan, Jorge (Jorge)'; bess@ietf.org
Subject: RE: [bess] EVPN Draft Comments


> <Ravi> Using a route from another PE (A here) to inject a route by other PEs 
> (B/C) has its pitfalls. For instance, withdrawals are going to be tough. Say 
> A has died for good, and X goes away – what mechanism will invalidate this 
> route from B? If it is local-aging at B, then B might as well use 
> local-learning to advertise the route in the first place.

This is a pitfall in every conceivable scheme, in fact, whenever you have 
transmit capability to a device at the other end of the link but can't see it 
to know its actual status. If the CE and PE both fail at the same time while 
you're assuming connectivity you can't prove, you're always going to run into 
this problem -- including your aliasing scheme.

> <Ravi> In most practical situations, X would rehash its flow to B/C if A has 
> died. And B/C will learn the MAC of X (if they already hadn’t due to other 
> flows), and will publish the route again (if they already hadn’t).

So let's work through the process --

- A fails
- The advertisements A was sending are, after peer down, removed from the table
- X continues sending traffic until either the session resets or (hopefully) 
the interface down on A propagates towards X in some way -- but there's no way 
of actually knowing what this looks like, as we don't have any idea what's 
actually between A and X
- Eventually, X begins recomputing its hash, and starts sending towards B
- B learns the newly attached host, and readvertises it
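
Assuming the steps above happen one after another, the delays simply add up. 
A rough Python sketch of that sum follows. The durations are made-up 
placeholders, not measurements, and failover_timeline is a hypothetical 
helper name.

```python
from itertools import accumulate


def failover_timeline(steps):
    """Return (step name, cumulative seconds since A failed) for each step."""
    names = [name for name, _ in steps]
    totals = accumulate(delay for _, delay in steps)
    return list(zip(names, totals))


# Placeholder durations in seconds, chosen only to illustrate the arithmetic.
steps = [
    ("A's routes removed after peer down", 3.0),
    ("X notices A is gone", 1.0),
    ("X rehashes flows toward B", 0.5),
    ("B learns X and readvertises", 0.5),
]

timeline = failover_timeline(steps)
for name, elapsed in timeline:
    print(f"{elapsed:5.1f}s  {name}")
```

The last entry is the earliest point at which B's own route to X exists, and 
every step before it is one nobody can bound without knowing what sits 
between A and X.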

There are a lot of "ifs, ands, and buts" in here to cover, and a lot of time. 
Either both A and B can reach the same set of hosts, or they cannot. If they 
can, then the link should be treated as a broadcast link, which means it's 
reachable from every upstream on the LAG connected to it. I would still prefer 
a solution that doesn't play this sort of "I can reach all the same things he 
can" game -- the DR type of system is much cleaner, and much more robust to 
modifications and future enhancements than aliasing will be.

Of course, the real solution is -- don't use LAGs when you're running layer 3 
control plane mechanisms in the first place, but you must get out of the 
layer-2-only mindset to see that LAGs are causing you nothing but trouble all 
around in a proactive control plane with high density link counts.

:-)

Russ


