The approach of A/B/C independently generating a route to X based only on local MAC learning works even if physical link failure detection is not an option. And in most practical cases, X will have enough active flows that local learning on A/B/C will hopefully happen well before the connection to A fails. In the event that this is not the case, the current scheme takes care of it via the ESI route. Can you elaborate a bit more on when it does not work?
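
For concreteness, here is a rough sketch of the two trigger rules under discussion, written as a toy Python model. It is only an illustration under simplifying assumptions: the names (`local_only`, `derived`, `settle`, `run`) are made up for this example, all PEs re-evaluate in lockstep, and nothing here reflects how a real BGP implementation stores or withdraws routes. `local_only` advertises X only while the PE has learned X itself; `derived` also advertises X whenever some other PE is advertising it, which is the approach you cited.

```python
# Toy model only; every name here is hypothetical, not a BGP or EVPN API.
PES = ("A", "B", "C")

def local_only(pe, local_macs, adverts):
    """Advertise X only while this PE has learned X locally."""
    return pe in local_macs

def derived(pe, local_macs, adverts):
    """Advertise X if learned locally OR if any other PE advertises X."""
    return pe in local_macs or any(p != pe for p in adverts)

def settle(rule, local_macs, adverts):
    """Re-evaluate every PE in lockstep until the advertised set is stable."""
    adverts = set(adverts)
    while True:
        new = {pe for pe in PES if rule(pe, local_macs, adverts)}
        if new == adverts:
            return adverts
        adverts = new

def run(rule):
    # Step 1: only A has learned X locally.
    before = settle(rule, local_macs={"A"}, adverts=set())
    # Step 2: X goes away and has aged out of every local MAC table.
    after = settle(rule, local_macs=set(), adverts=before)
    return before, after

for rule in (local_only, derived):
    before, after = run(rule)
    print(rule.__name__, sorted(before), sorted(after))
```

In this model the `local_only` rule ends with no PE advertising X once X has aged out, while the `derived` rule leaves A, B and C all still advertising X, each one justified only by the others.
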
There are other problems with the approach that you have cited. For instance, if the advertisement of a route to X from A triggers B and C to generate a route to X as well, what happens if X goes away? The phy layer will not help A/B/C detect that X has gone away. A will keep its advertisement intact (even after X has aged out locally) because it is seeing a route to X from B/C, much like B/C created the route to X based on A's advertisement initially. Similarly, B/C will keep their routes intact because A's route is lingering around. A circular dependency is created, and the route to X will never get withdrawn.

Not to mention that, even if this can be fixed with extra state and extra bits in the advertisement from B/C (which may eventually make the scheme even more complex), advertising a route based on another route and keeping track of this dependency in the same route table will require extra BGP machinery that does not exist today.

I would think that the closer a PE is to the source of information that causes it to generate a route - like local MAC learning for a MAC route, or LAG link presence for an ESI route - the more accurate its information will be. The farther the PE is from the source of information - like a PE depending on another PE's route to generate its own MAC route - the slower convergence will be and the higher the complexity of keeping track of dependent state.

My 2 cents.

- Ravi.

-----Original Message-----
From: Russ White [mailto:ru...@riw.us]
Sent: Tuesday, February 03, 2015 7:47 PM
To: Ravi Shekhar; John E Drake
Cc: 'Rabadan, Jorge (Jorge)'; bess@ietf.org
Subject: RE: [bess] EVPN Draft Comments

> <Ravi> Using a route from another PE (A here) to inject a route by other PEs
> (B/C) has its pitfalls. For instance withdrawals are going to be tough. Say A
> has died for good, and X goes away – what mechanism will invalidate this route
> from B?
> If it is local-aging at B, then B might as well use local-learning to
> advertise the route in the first place.

This is a pitfall in every conceivable scheme, in fact, when you have transmit capability to a device you can't see at the other end of the link to know its actual status. If the CE and PE both fail at the same time when you're assuming connectivity you can't prove, you're always going to run into this problem -- including your aliasing scheme.

> <Ravi> In most practical situations, X would rehash its flow to B/C if A has
> died. And B/C will learn the MAC of X (if they already hadn’t due to other
> flows), and will publish the route again (if they already hadn’t).

So let's work through the process --

- A fails
- The advertisements A was sending are, after peer down, removed from the table
- X continues sending traffic until either the session resets or (hopefully) the interface down on A propagates towards X in some way -- but there's no way of actually knowing what this looks like, as we don't have any idea what's actually between A and X
- Eventually, X begins refactoring its hash, and starts sending towards B
- B learns the new attached host, and readvertises it

There are a lot of "ifs, ands, and buts" in here to cover, and a lot of time. Either both A and B can reach the same set of hosts, or they cannot. If they can, then the link should be treated as a broadcast, which means it's reachable from every upstream on the LAG connected to it.

I would still prefer a solution that doesn't play this sort of "I can reach all the same things he can" game -- the DR type of system is much cleaner, and much more robust to modifications and future enhancements than aliasing will be.

Of course, the real solution is -- don't use LAGs when you're doing layer 3 control plane mechanisms in the first place, but you must get out of the layer 2 only mindset to see that LAGs are causing you nothing but trouble all around in a proactive control plane with high density link counts.

:-)

Russ

_______________________________________________
BESS mailing list
BESS@ietf.org
https://www.ietf.org/mailman/listinfo/bess