Re: [bess] John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: (with DISCUSS and COMMENT)

John Scudder Mon, 17 May 2021 12:32:32 -0700

Hi Adrian,

Comments in line.

> On May 16, 2021, at 7:25 AM, Adrian Farrel <adr...@olddog.co.uk> wrote:
> 
> 
> Hi John,
>  
> Trying to dismantle this…
>  
> We are saying that a site is integral.

I don’t think I saw any place in the draft that states that assumption.

> You are asking: what happens if a site becomes partitioned so that some 
> prefixes are accessible through one GW and some through another.

Actually, that wasn’t my scenario; I see I must have expressed it poorly. I’ll 
come back to that, but first, your own scenario.

> Consider a site with a set of prefixes S
> Consider two GWs: GW1 and GW2
> Initially GW1 and GW2 discover each other.
> So GW1 advertises reachability to S, and by the way GW2 exists
> GW2 advertises reachability to S, and by the way GW1 exists
> Now the site becomes partitioned so that GW1 can reach S1 and GW2 can reach 
> S2. (S = S1 U S2,  S1 n S2 = E)
>  
> You ask:
>       1. What happens to packets for S2 arriving at GW1?
>       2. What is the remedy in the protocol?
>  
> My answer to 1. is that the packets will be black-holed either at GW1 or 
> inside the site.
> My observation is that:
>       • GW1 cannot reach GW2 inside the site. If it could, then S2 would be 
> reachable via GW1
>       • It is contrary to BCP38 for GW1 to forward a packet back into the 
> external AS to be routed to GW2
>  
> My answer to 2. is that when the site becomes partitioned:
>       • GW1 will stop advertising the whole of S and will fall back to 
> advertising just S1
>       • GW2 will stop advertising the whole of S and will fall back to 
> advertising just S2

This, too, is unstated in the draft, and it’s not like it will happen for free 
by natural operation of every router. But in the scenario you’ve picked that 
seems OK, since this practice is needed for a multihomed site regardless of 
whether your draft is in use or not: if the site partitions, either you 
deaggregate, or you black hole. (“Deaggregate” might be actual deaggregation, 
or simply advertising only the subset of individual prefixes that remain 
reachable, depending on the address allocation scheme in use, of course.)
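
As a rough illustration of that fallback, here is a minimal Python sketch. 
Everything in it is an assumption made for illustration: the helper name 
advertisement(), the dictionary layout, and the two documentation prefixes 
standing in for S1 and S2. None of it comes from the draft.

```python
def advertisement(gw, reachable, discovered_peers):
    """Hypothetical helper: what one gateway advertises.

    reachable        -- set of prefixes this gateway can currently reach
    discovered_peers -- set of gateways it currently auto-discovers
    """
    return {
        "prefixes": set(reachable),
        "gateways": {gw} | set(discovered_peers),
    }


# Illustrative prefixes only: S = S1 U S2, and S1 and S2 do not overlap.
S1 = {"192.0.2.0/25"}
S2 = {"192.0.2.128/25"}
S = S1 | S2

# Before the partition: both gateways reach all of S and discover each other.
before_gw1 = advertisement("GW1", S, {"GW2"})
before_gw2 = advertisement("GW2", S, {"GW1"})

# After the partition (and once discovery has timed out): each gateway
# advertises only the subset it can still reach, and only itself.
after_gw1 = advertisement("GW1", S1, set())
after_gw2 = advertisement("GW2", S2, set())
```

In this sketch the black hole is avoided only because each gateway shrinks 
its own prefix set; nothing in the sketch makes that shrinking automatic.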

>       • Initially, GW1 and GW2 will still advertise each other’s existence, 
> but will “soon” un-auto-discover each other
> At this point the site is effectively two sites that use the same site 
> identifier.
>  
> How quickly this takes place depends on precisely what the failure case is, 
> how fast the failure detection is done, and how fast BGP converges.  
>  
> *Perhaps* there is a wrinkle *if* the autodetection advertisements are sent 
> external to the site. In this case, GW1 would continue to discover GW2 and so 
> would readvertise it (and vice versa). This would continue to lead to the 
> broken condition you noted. I think we assumed that the peering between GW1 
> and GW2 would be internal to the site, because otherwise it would constitute 
> traffic leaving the site and re-entering it (breaking BCP38 again). If it 
> would help, we could make this point clear by saying that the peering between 
> GW1 and GW2 must be within the site.

Seems reasonable. However, it wasn’t the problem I was posing. Since that seems 
to have gotten lost in the quote-and-snip chain, here it is again:

"The autodiscovery mechanism is clear as far as it goes, but I think not all 
failure modes are addressed. In particular, if there’s partial connectivity 
within a domain, I think long-term black holing can ensue. Consider this case: 
GW1 and GW2 are gateways in domain A. GW3 is a gateway in domain B. GW1 and GW2 
discover one another and advertise one another’s encapsulation information 
accordingly, when advertising a route to prefix X. However, there’s a problem 
within GW1 and GW2’s domain, such that GW1 can reach X, but GW2 can’t. Even 
though GW2 may know it can’t reach X, and indeed GW2 isn’t advertising X, GW1 
is still advertising GW2 as a viable gateway to reach X, and GW3 may well route 
traffic for X via GW2."

The key difference between the question you answered, and the one I asked, is 
that you answer what happens in the case of a complete partition, and I ask 
what happens in the case of inconsistent routing within the site. When you 
worked through your scenario, you included this:

>       • GW1 cannot reach GW2 inside the site. If it could, then S2 would be 
> reachable via GW1

That isn’t the case in my scenario: GW1 can reach GW2, but GW1 cannot reach S2. I 
think the assumption you stated up top (“the site is integral”) answers this 
question too: this case isn’t covered since it violates the assumption. I think 
that’s OK. It’s a case I’d call silly if I hadn’t seen it happen, but it’s 
still low-probability. The question was only: is it worth stating the 
assumption and what is/isn’t covered?

"Admittedly, having partial connectivity within a domain as I’ve described is a 
broken situation to begin with, but stuff happens, and your spec would make 
matters worse. It might be worth acknowledging this issue somewhere in the 
document?"
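
A minimal Python sketch of the inconsistent-routing case may help. The 
function names, the data shapes, and the prefix used for X are all invented 
for illustration. The first function follows the behaviour as I read it: GW1 
lists every discovered gateway on every prefix GW1 itself can reach. The 
second is a purely hypothetical alternative in which GW1 somehow knows what 
each peer can reach; how it would learn that is not something I found in the 
draft.

```python
def advertise_all_discovered(gw, reachable, discovered):
    """List every discovered gateway on every prefix gw can reach."""
    return {prefix: {gw} | set(discovered) for prefix in reachable}


def advertise_filtered(gw, reachable, peer_reachable):
    """Hypothetical: list a peer only for prefixes that peer can reach.

    peer_reachable maps peer name -> set of prefixes that peer reaches.
    How gw would obtain this mapping is an open question.
    """
    return {
        prefix: {gw} | {p for p, pfx in peer_reachable.items() if prefix in pfx}
        for prefix in reachable
    }


def black_hole_gateways(advert, truth):
    """For each prefix, the listed gateways that cannot actually reach it."""
    return {
        prefix: {g for g in gws if prefix not in truth.get(g, set())}
        for prefix, gws in advert.items()
    }


X = "192.0.2.0/24"                    # illustrative prefix only
truth = {"GW1": {X}, "GW2": set()}    # GW1 reaches X, GW2 does not

naive = advertise_all_discovered("GW1", truth["GW1"], {"GW2"})
filtered = advertise_filtered("GW1", truth["GW1"], {"GW2": truth["GW2"]})

naive_holes = black_hole_gateways(naive, truth)
filtered_holes = black_hole_gateways(filtered, truth)
```

With the first function, GW2 ends up listed for X even though it cannot reach 
X, so a remote gateway such as GW3 may pick it. With the hypothetical filtered 
variant, GW2 is not listed for X.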

I hope this is clearer now.

Thanks,

—John

> Cheers,
> Adrian
>  
> From: John Scudder <j...@juniper.net> 
> Sent: 14 May 2021 22:25
> To: Adrian Farrel <adr...@olddog.co.uk>
> Cc: The IESG <i...@ietf.org>; draft-ietf-bess-datacenter-gate...@ietf.org; 
> bess-cha...@ietf.org; bess@ietf.org; Matthew Bocci <matthew.bo...@nokia.com>
> Subject: Re: John Scudder's Discuss on draft-ietf-bess-datacenter-gateway-10: 
> (with DISCUSS and COMMENT)
>  
> Having re-read Section 3 carefully (and skimmed the rest) I still think what 
> the document says (as opposed to what’s in the authors’ heads?) is the first 
> description I give below. Let me know if you want me to walk through my 
> reasoning in detail with reference to the document. 
>  
> —John
> 
> 
>> On May 14, 2021, at 4:12 PM, John Scudder <j...@juniper.net> wrote:
>> 
>>  Hi Adrian, 
>>  
>> Thanks for your reply. Pressed for time at the moment but one partial 
>> response:
>> 
>> 
>>> On May 14, 2021, at 1:04 PM, Adrian Farrel <adr...@olddog.co.uk> wrote:
>>>  
>>> Agree with you that "stuff happens." I think that what you have described 
>>> is a window not a permanent situation.
>>> When GW2 knows it can't reach X any more, it will stop advertising X, and 
>>> GW1 will receive that and will update what it advertises on behalf of GW2.
>>  
>> Ah, perhaps I have badly misunderstood the way this works. I had thought it 
>> went something like this:
>>  
>> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
>> - GW1 knows the set S of internal prefixes it can reach
>> - GW1 advertises each prefix from S with both GW1 and GW2 in the tunnel 
>> attribute
>>  
>> In the description above, there’s no notion of GW2 telling GW1 what internal 
>> prefixes GW2 can reach, or GW1 caring.  Now I suppose you are telling me 
>> that it goes:
>>  
>> - GW1 knows it can reach GW2 because of GW2’s auto discovery route
>> - GW1 knows the full set of prefixes GW2 can reach. _How does it know this?_
>> - GW1 constructs each advertisement listing only the correct set of gateways 
>> in the tunnel attribute
>>  
>> The key question is the one I’ve highlighted: how does GW1 come to know 
>> GW2’s internally-reachable prefixes? I didn’t notice any of this in the 
>> spec. Maybe it was just my sloppy reading, I’ll look again.
>> 
>> 
>>> Further, if GW1 can no longer receive advertisements from GW2 then it will 
>>> stop advertising on behalf of GW2.
>>  
>> Yes, that’s understood, but I was positing a case where just because GW1 can 
>> reach GW2 stably, and just because GW1 can reach X stably, it does not imply 
>> GW2 can reach X.
>>  
>> —John

_______________________________________________
BESS mailing list
BESS@ietf.org
https://www.ietf.org/mailman/listinfo/bess
