[Speaking as an IDR chair and the shepherd for the DMZ document in this 
response]

Tiger,


> On May 6, 2026, at 03:17, Tiger Xu <[email protected]> wrote:
> The following are my late comments:
> First, it is anomalous for a document purely focused on use cases to be 
> advanced as a Standards Track RFC.

There has been contention over whether the DMZ document belongs on the 
standards track. Thanks for registering your opinion. The shepherd report 
notes that the AD also prefers Informational status.

> Second, it is inappropriate to progress this VPN-unrelated draft within the 
> BESS Working Group, rather than the IDR Working Group.

The overlaps among the various link-bandwidth technologies did lead to mixed 
claims as to which working group was most appropriate for this work.  Review of 
the contents of the material originally in the DMZ draft resulted in the RFC 
4360-bis work to clarify non-transitive extended community behavior.  Since 
there was overlap between the working groups, the BESS and IDR chairs ensured 
the work was visible on both mailing lists, and also that the working group 
last calls covering RFC 4360-bis and the DMZ draft were done on both mailing 
lists. 


> 
> Third, the original DMZ use case is inherently straightforward and 
> well-defined, which renders a standalone dedicated draft entirely 
> unnecessary. The relevant scenarios can be sufficiently addressed by 
> incorporating a modest amount of descriptive text into the existing 
> link-bandwidth draft.

This point was also discussed.  "Doing math" using link-bandwidth was deemed an 
inappropriate expansion of the signaling portion of the link-bandwidth 
document, especially when the form of that math may vary depending on use case. 
 The link-bandwidth draft was progressed without such use cases, which were 
left to the previously adopted DMZ draft.

> Fourth, the new use cases and associated technical approaches introduced in 
> draft-ietf-bess-ebgp-dmz -07 (published 20 July 2025) were already fully and 
> explicitly documented in draft-xu-idr-fare -00 (released 1 July 2024).

There are certainly overlaps in the problem space between the DMZ and FARE 
documents. draft-xu-idr-fare-00 was issued in January 2025.  The DMZ draft had 
been worked on for several years, with its use cases refined along the way.  Is 
there a specific use case you are referring to here?  If so, please clarify 
which section of the document you're discussing.

Also, is there a concern that work covered in the overlapping portion of the 
documents has undisclosed IPR considerations?  Or is this more a matter of the 
DMZ document appropriating a use case without attribution to its 
contributors?


> Further detailed analysis is presented below. IMHO, the IETF, as a leading 
> global standards-setting body, ought not to endorse or tolerate such 
> non-compliant community practices. Allowing such precedent would undermine 
> the IETF’s reputation for impartiality and erode its ecosystem of original 
> technical innovation.

I'll address the technical points below, but what "non-compliant community 
practices" are you referring to?  Please be explicit.


> 
> 1. Issues and solutions introduced in version -07

I'm presuming this refers to draft-ietf-bess-ebgp-dmz-07.

> The -07 draft states:
> “With the existing rules for the DMZ link bandwidth, this is not possible. 
> First the LB extended community is not sent over EBGP. Secondly the DMZ does 
> not have a notion of conveying the cumulative link bandwidth (of the directed 
> tree rooted at a node) to an upstream router. To enable the use case 
> described above, the cumulative link bandwidth of R1 and R2 has to be 
> advertised by R3 to R4, and, similarly, the cumulative bandwidth of R6 and R7 
> has to be advertised by R5 to R4. This will enable R4 to load-balance based 
> on the proportion of the cumulative link bandwidth that it receives from its 
> downstream routers R3 and R5.”
> In essence, this changes the extended community from non‑transitive to 
> transitive and introduces the concept of bandwidth aggregation – both of 
> which were already present in draft-xu-idr-fare version -00.

The substance of the work for the DMZ draft has always been about two 
components: what the math looks like for aggregation or disaggregation 
cases, and how that is signaled.  An attempt was made to workshop the 
ambiguity in non-transitive extended community behavior at an eBGP boundary 
in a BESS document, which is the complaint you made near the top of this 
email.  Since this was a general ambiguity in RFC 4360, the decision was made 
to cover that transitivity issue in RFC 4360-bis, providing clarification for 
all non-transitive extended communities.

Because, honestly, the idea of selectively overriding the extended community 
for the one use case was inappropriate.  Thankfully we seem to have broad 
consensus on that point, illustrating that the review across working groups was 
happening - albeit rather late.

The use case in draft-xu-idr-fare-00 essentially says "set the link-bw at the 
leaf in a new transitive extended community, perform math on a hop by hop 
basis".  This is different than the use case in DMZ where the non-transitive 
use cases leverage regenerating the community with the local node's contents.
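
As a rough illustration of the regeneration approach, using the R1..R7 
topology from the text quoted above: each intermediate node replaces the 
received value with its own cumulative bandwidth before advertising upstream, 
and R4 load-balances in proportion to what it receives.  This is only a 
sketch.  The per-link bandwidth values and the function names are 
illustrative assumptions, not taken from either draft.

```python
from fractions import Fraction

# Hypothetical downstream link bandwidths in Gbps (illustrative values only).
downstream_bw = {
    "R3": {"R1": 10, "R2": 20},  # R3 reaches the prefix via R1 and R2
    "R5": {"R6": 10, "R7": 10},  # R5 reaches the prefix via R6 and R7
}


def cumulative_bandwidth(paths):
    """Sum the bandwidth of every downstream path a node has for the prefix."""
    return sum(paths.values())


def load_balance_weights(advertised):
    """Split traffic in proportion to the bandwidth each neighbor advertised."""
    total = sum(advertised.values())
    return {nbr: Fraction(bw, total) for nbr, bw in advertised.items()}


# R3 and R5 each regenerate the community with their own cumulative value
# before advertising the prefix to R4.
advertised_to_r4 = {
    node: cumulative_bandwidth(paths) for node, paths in downstream_bw.items()
}

# R4 weights its two next hops by the cumulative values it received.
weights_at_r4 = load_balance_weights(advertised_to_r4)

for nbr, weight in weights_at_r4.items():
    print(f"R4 -> {nbr}: advertised {advertised_to_r4[nbr]} Gbps, share {weight}")
```

With these assumed inputs, R3 advertises 30 and R5 advertises 20, so R4 sends 
3/5 of the traffic toward R3 and 2/5 toward R5.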


> 2. Issues and solutions introduced in version -08
> Version -08 adds:
> “In addition, as illustrated in the previous sections, BGP may have to 
> consider a combination of the local link and remote bandwidth when computing 
> the weights for weighted load‑balancing. Any function of the two may be used 
> like for instance a ‘minimum’ function … The weight of each path may then be 
> based on either: only the remote bandwidth, only the local link bandwidth, or 
> a function of both.”
> This introduces the minimum function (choosing the smaller value between the 
> link bandwidth and the path bandwidth of the received route). The draft also 
> acknowledges for the first time the necessity of path bandwidth:
> “In our example, the value 30Gbps advertised by R3 represents an aggregated 
> path bandwidth.”
> Again, both the minimum function and the concept of path bandwidth were 
> already described in draft-xu-idr-fare version -00.

For this point, I'll ask that the DMZ authors respond regarding the provenance 
of the use case.  I'd also suggest that the BESS chairs and AD halt 
progression of the document until that point is settled.  At a minimum, if 
there's an issue of attribution, that should be addressed.

Further, I ask again on this point whether there is undisclosed IPR that you 
consider an issue for this use case.
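
For readers following the technical side of this point, the three weighting 
options in the -08 text quoted above (remote bandwidth only, local link 
bandwidth only, or a function of both such as a minimum) can be sketched as 
below.  The mode names, function names, and bandwidth values are illustrative 
assumptions and do not come from either draft.

```python
from fractions import Fraction


def path_weight(local_link_bw, remote_bw, mode):
    """Return the raw weight of one path under the chosen weighting mode."""
    if mode == "remote":
        return remote_bw
    if mode == "local":
        return local_link_bw
    if mode == "min":
        # A function of both: the path is limited by the smaller of the two.
        return min(local_link_bw, remote_bw)
    raise ValueError(f"unknown mode: {mode}")


def normalized_weights(paths, mode):
    """Normalize the raw per-path weights into traffic shares."""
    raw = {
        nbr: path_weight(local, remote, mode)
        for nbr, (local, remote) in paths.items()
    }
    total = sum(raw.values())
    return {nbr: Fraction(weight, total) for nbr, weight in raw.items()}


# Hypothetical view from R4: (local link bandwidth, received remote bandwidth)
# in Gbps for each next hop.
paths_at_r4 = {"R3": (25, 30), "R5": (40, 20)}

for mode in ("remote", "local", "min"):
    shares = normalized_weights(paths_at_r4, mode)
    print(mode, {nbr: str(share) for nbr, share in shares.items()})
```

With these assumed inputs the three modes give different splits, which is why 
the choice of function matters per use case.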


> 
> Notably, even with the aforementioned revisions, draft-ietf-bess-ebgp-dmz 
> remains inapplicable to 5‑stage CLOS networks—the prevailing mainstream 
> architecture deployed in 100K-XPU and even million-XPU scale AI clusters.   I 
> hope the community will revisit these aspects before progressing the draft 
> further.

For this point, I'll let authors and WG members contribute their experience.  I 
believe there are implementations of the DMZ feature in multi-stage fabrics.

This concludes my response to your original mail covering WG matters.  I'll 
have a followup email offering some observations on draft-xu-idr-fare.  

-- Jeff

> Best regards,
> Tiger

_______________________________________________
BESS mailing list -- [email protected]
To unsubscribe send an email to [email protected]
