[Lsr] Fwd: New Version Notification for draft-xu-lsr-fare-01.txt

2024-01-29 Thread xuxiaohu_i...@hotmail.com
Hi all,

Packet-granular adaptive routing, also known as congestion-aware packet spray, 
has been widely recognized as an ideal load-balancing mechanism for AI Ethernet 
networks. Some cloud providers have implemented their in-house packet spray 
approaches, which are mainly built on proactive and real-time congestion 
detection along all possible ECMP paths.
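
As a rough illustration of the idea (not any provider's actual implementation), congestion-aware packet spray can be sketched as choosing, for every packet, the ECMP path with the shallowest queue. The function names and the use of queue depth as the congestion signal are assumptions made for this sketch.

```python
import random


def pick_path(queue_depths, rng):
    """Return the index of the least congested path, breaking ties randomly."""
    least = min(queue_depths)
    candidates = [i for i, depth in enumerate(queue_depths) if depth == least]
    return rng.choice(candidates)


def spray(num_packets, initial_queue_depths, seed=0):
    """Spray packets one at a time onto the currently least-loaded path.

    Queues never drain in this toy model, so the result only shows how
    per-packet decisions steer traffic away from an already congested path.
    """
    rng = random.Random(seed)
    queue_depths = list(initial_queue_depths)  # do not mutate the caller's list
    for _ in range(num_packets):
        queue_depths[pick_path(queue_depths, rng)] += 1
    return queue_depths


if __name__ == "__main__":
    # Path 0 starts congested, so all six packets go to paths 1 and 2.
    print(spray(6, [4, 0, 0]))
```

Starting from queue depths [4, 0, 0], six packets leave the queues at [4, 3, 3]: the congested path receives nothing until the others catch up.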

To achieve a non-blocking network fabric, it seems more suitable for network 
switches to perform packet spray, since they can obtain information about 
congestion between switches more quickly and easily. Some major network chip 
vendors have developed proprietary congestion notification mechanisms built 
on their own data-plane signaling. However, the aim of the UEC is to deliver 
an Ethernet-based open, interoperable, high-performance full-communications 
stack for the growing network demands of AI and HPC at scale. To meet that 
aim, it is worthwhile to pursue an open, standards-based approach to packet 
spray. This draft is a step towards that goal. Any comments and suggestions 
are welcome.

Best regards,
Xiaohu

From: internet-dra...@ietf.org
Date: Monday, January 29, 2024 16:36
To: Hang Wu, Hongyi Huang, Junjie Wang, Qingliang Zhang, 
Xiaohu Xu, Yadong Liu, Yinben Xia, Zongying He
Subject: New Version Notification for draft-xu-lsr-fare-01.txt
A new version of Internet-Draft draft-xu-lsr-fare-01.txt has been successfully
submitted by Xiaohu Xu and posted to the
IETF repository.

Name:     draft-xu-lsr-fare
Revision: 01
Title:    Fully Adaptive Routing Ethernet
Date:     2024-01-29
Group:    Individual Submission
Pages:    9
URL:      https://www.ietf.org/archive/id/draft-xu-lsr-fare-01.txt
Status:   https://datatracker.ietf.org/doc/draft-xu-lsr-fare/
HTMLized: https://datatracker.ietf.org/doc/html/draft-xu-lsr-fare
Diff:     https://author-tools.ietf.org/iddiff?url2=draft-xu-lsr-fare-01

Abstract:

   Large language models (LLMs) like ChatGPT have become increasingly
   popular in recent years due to their impressive performance in
   various natural language processing tasks.  These models are built by
   training deep neural networks on massive amounts of text data, often
   consisting of billions or even trillions of parameters.  However, the
   training process for these models can be extremely resource-
   intensive, requiring the deployment of thousands or even tens of
   thousands of GPUs in a single AI training cluster.  Therefore, three-
   stage or even five-stage CLOS networks are commonly adopted for AI
   networks.  The non-blocking nature of the network becomes increasingly
   critical for large-scale AI models.  Therefore, adaptive routing is
   necessary to dynamically load balance traffic to the same destination
   over multiple ECMP paths, based on network capacity and even
   congestion information along those paths.



The IETF Secretariat

___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


[Lsr] Re: New Version Notification for draft-xu-lsr-flooding-reduction-in-clos-01.txt

2023-11-28 Thread xuxiaohu_i...@hotmail.com
Hi Jeff,

Open/R 
(https://engineering.fb.com/2017/11/15/connectivity/open-r-open-routing-for-modern-networks/) 
is actually an in-house link-state routing protocol.

The following text is quoted from the above article.

"For many years, Facebook has been operating these large-scale fabrics solely 
with Border Gateway Protocol (BGP). While BGP brings its strengths, especially 
with respect to policy enforcement and scale, we saw opportunities to improve 
and simplify the design by having Open/R and BGP work together."

"Open/R provides APIs allowing remote agents to learn the link state or 
subscribe to database updates, such as notifications of a link capacity change. 
For example, this could be used to compute label-switched paths and program 
them on the network from a central location."

In short, don’t lose confidence in using link-state routing protocols in data 
centers :)

Best regards,
Xiaohu


From: Jeff Tantsura
Date: Monday, November 27, 2023 05:49
To: Acee Lindem
Cc: Les Ginsberg (ginsberg), Tony Li, xuxiaohu_i...@hotmail.com, 
lsr@ietf.org
Subject: Re: [Lsr] New Version Notification for 
draft-xu-lsr-flooding-reduction-in-clos-01.txt

I agree with all of the aforementioned comments.

With respect to AI/ML networking: if a controller is used, what is required 
is link-state exposure northbound, not a link-state protocol in the fabric. 
(I could argue for RIFT though ;-))
I’d urge you to take a look at Meta’s deployment in their ML clusters 
(publicly available): they use BGP as the routing protocol to exchange 
reachability (and build ECMP sets) and to provide a backup if a 
controller-computed next hop goes away or before a new one has been computed.
Open/R is used northbound to expose the topology (BGP-LS could be used in 
exactly the same way).

To summarize: an LS protocol brings no additional value in scaled-out 
leaf-spine fabrics, without significant modifications - it doesn’t work in 
irregular topologies such as DF, etc.
Existing proposals, for which there are shipping implementations and 
operational experience, have proven their relative value in suitable 
deployments.

Cheers,
Jeff

> On Nov 26, 2023, at 12:20, Acee Lindem  wrote:
>
> Speaking as WG member:
>
> I agree. The whole Data Center IGP flooding discussion went on years ago and 
> the simplistic enhancement proposed in the subject draft is neither relevant 
> nor useful now.
>
> Thanks,
> Acee
>
>> On Nov 24, 2023, at 11:55 PM, Les Ginsberg (ginsberg) 
>>  wrote:
>>
>> Xiaohu –
>> I also point out that there are at least two existing drafts which 
>> specifically address IS-IS flooding reduction in CLOS networks and do so in 
>> greater detail and with more robustness than what is in your draft:
>> https://datatracker.ietf.org/doc/draft-ietf-lsr-distoptflood/
>> https://datatracker.ietf.org/doc/draft-ietf-lsr-isis-spine-leaf-ext/
>> I do not see a need for yet another draft specifically aimed at CLOS 
>> networks.
>> Note that work on draft-ietf-lsr-isis-spine-leaf-ext was suspended due to 
>> lack of interest in deploying an IGP solution in CLOS networks.
>> You are suggesting in draft-xu-lsr-fare that AI is going to change this. 
>> Well, maybe, but if so I think we should return to the solutions already 
>> available and prioritize work on them.
>>Les
>>  From: Lsr  On Behalf Of Tony Li
>> Sent: Thursday, November 23, 2023 8:39 AM
>> To: xuxiaohu_i...@hotmail.com
>> Cc: lsr@ietf.org
>> Subject: Re: [Lsr] New Version Notification for 
>> draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> Hi,
>> What you’re proposing is already described in IS-IS Mesh Groups 
>> (https://www.rfc-editor.org/rfc/rfc2973.html) and improved upon in Dynamic 
>> Flooding 
>> (https://datatracker.ietf.org/doc/html/draft-ietf-lsr-dynamic-flooding).
>> Regards,
>> Tony
>>
>>
>> On Nov 23, 2023, at 8:29 AM, xuxiaohu_i...@hotmail.com wrote:
>> Hi all,
>> Any comments or suggestions are welcome.
>> Best regards,
>> Xiaohu
>> From: internet-dra...@ietf.org
>> Date: Wednesday, November 22, 2023 11:37
>> To: Xiaohu Xu
>> Subject: New Version Notification for 
>> draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> A new version of Internet-Draft 
>> draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> has been successfully submitted by Xiaohu Xu and posted to the
>> IETF repository.
>>
>> Name:     draft-xu-lsr-flooding-reduction-in-clos
>> Revision: 01
>> Title:    Flooding Reduction in CLOS Networks
>> Date:     2023-11-22
>> Group:    Individual Submission
>> Pages:    6
>> URL:      https://www.ietf.org/archive/id/draft-xu-lsr-flooding-reduction-in-clos-01.txt
>> Status:   
>> https://datatracker.ietf.org/doc/draft-xu

[Lsr] Re: New Version Notification for draft-xu-lsr-flooding-reduction-in-clos-01.txt

2023-11-27 Thread xuxiaohu_i...@hotmail.com


Hi Jeff,

The Meta ML-cluster deployment you mentioned clearly indicates that 
centralized TE cannot respond fast enough to network failures, let alone to 
changes in congestion.

In fact, some vendors have used the BGP Link Bandwidth Extended Community to 
achieve capacity-aware global path selection in AI networks, although it is 
not as efficient as an IGP, especially for congestion-aware path selection.
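
A minimal sketch of capacity-aware path selection: assuming the bandwidth toward each next hop has already been learned (for example from the BGP Link Bandwidth Extended Community), the ECMP set can be weighted in proportion to that bandwidth. The next-hop names, the integer Gbit/s units, and the bucket-table approach are illustrative assumptions, not a description of any vendor's implementation.

```python
from functools import reduce
from math import gcd


def ucmp_weights(bandwidth_gbps):
    """Reduce per-next-hop bandwidths (integers) to the smallest weight ratio."""
    divisor = reduce(gcd, bandwidth_gbps.values())
    return {nh: bw // divisor for nh, bw in bandwidth_gbps.items()}


def build_buckets(weights):
    """Expand weights into a bucket table; a flow hash indexes into it."""
    buckets = []
    for next_hop, weight in sorted(weights.items()):
        buckets.extend([next_hop] * weight)
    return buckets


def select_next_hop(buckets, flow_hash):
    """Pick a next hop for a flow; wider paths own more buckets."""
    return buckets[flow_hash % len(buckets)]


if __name__ == "__main__":
    # spine2 has lost half of its uplink capacity (hypothetical numbers).
    weights = ucmp_weights({"spine1": 400, "spine2": 200, "spine3": 400})
    print(weights)
    print(build_buckets(weights))
```

With these hypothetical inputs the weights reduce to 2:1:2, so the degraded next hop receives one fifth of the flows instead of one third.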

Best regards,
Xiaohu


[Lsr] Re: New Version Notification for draft-xu-lsr-flooding-reduction-in-clos-01.txt

2023-11-27 Thread xuxiaohu_i...@hotmail.com
Hi Les,

I have to say that the second draft you mentioned is irrelevant, since leaf 
nodes need to know the topology in order to perform congestion-aware path 
selection. In contrast, the goal of that draft is to hide the topology from 
leaf nodes.

As for the first draft, the draft itself admits: “The calculations 
described here seem complex, which might lead the reader to conclude that the 
cost of calculation is so much higher than the cost of flooding that this 
optimization is counter-productive.”  Frankly, it has at least four 
drawbacks:

1) It’s too complex. For example, using a hash function to select the 
designated flooding nodes for a given LSP would become a nightmare when 
debugging LSP flooding issues, IMHO.

2) It’s not useful in the most popular 5-stage CLOS topology for AI networks, 
as described in my draft (please refer to Facebook’s F16 DCN architecture if 
you are confused about the difference between the two 5-stage CLOS topologies 
described in the two drafts), since each spine node (located in PoD X and 
connected to PoD-interconnect plane Y) must flood received LSPs between all 
leaf nodes in PoD X and all super-nodes in PoD-interconnect plane Y.

3) It’s ineffective at reducing flooding once the topology becomes asymmetric 
due to link failures, which are common in large networks.

4) It would slow down IGP convergence, which is critical for 
congestion-aware path selection, since it needs to build the Two-Hop List 
(THL) and Remote Neighbor's List (RNL) before flooding a received LSP.

Best regards,
Xiaohu



From: Les Ginsberg (ginsberg)
Date: Saturday, November 25, 2023 12:55
To: Tony Li, xuxiaohu_i...@hotmail.com
Cc: lsr@ietf.org
Subject: RE: [Lsr] New Version Notification for 
draft-xu-lsr-flooding-reduction-in-clos-01.txt

Xiaohu –

I also point out that there are at least two existing drafts which specifically 
address IS-IS flooding reduction in CLOS networks and do so in greater detail 
and with more robustness than what is in your draft:

https://datatracker.ietf.org/doc/draft-ietf-lsr-distoptflood/

https://datatracker.ietf.org/doc/draft-ietf-lsr-isis-spine-leaf-ext/

I do not see a need for yet another draft specifically aimed at CLOS networks.

Note that work on draft-ietf-lsr-isis-spine-leaf-ext was suspended due to lack 
of interest in deploying an IGP solution in CLOS networks.
You are suggesting in draft-xu-lsr-fare that AI is going to change this. Well, 
maybe, but if so I think we should return to the solutions already available 
and prioritize work on them.

   Les


From: Lsr  On Behalf Of Tony Li
Sent: Thursday, November 23, 2023 8:39 AM
To: xuxiaohu_i...@hotmail.com
Cc: lsr@ietf.org
Subject: Re: [Lsr] New Version Notification for 
draft-xu-lsr-flooding-reduction-in-clos-01.txt

Hi,

What you’re proposing is already described in IS-IS Mesh Groups 
(https://www.rfc-editor.org/rfc/rfc2973.html) and improved upon in Dynamic 
Flooding 
(https://datatracker.ietf.org/doc/html/draft-ietf-lsr-dynamic-flooding).

Regards,
Tony



On Nov 23, 2023, at 8:29 AM, xuxiaohu_i...@hotmail.com wrote:

Hi all,

Any comments or suggestions are welcome.

Best regards,
Xiaohu

From: internet-dra...@ietf.org
Date: Wednesday, November 22, 2023 11:37
To: Xiaohu Xu
Subject: New Version Notification for draft-xu-lsr-flooding-reduction-in-clos-01.txt
A new version of Internet-Draft draft-xu-lsr-flooding-reduction-in-clos-01.txt
has been successfully submitted by Xiaohu Xu and posted to the
IETF repository.

Name:     draft-xu-lsr-flooding-reduction-in-clos
Revision: 01
Title:    Flooding Reduction in CLOS Networks
Date:     2023-11-22
Group:    Individual Submission
Pages:    6
URL:      https://www.ietf.org/archive/id/draft-xu-lsr-flooding-reduction-in-clos-01.txt
Status:   https://datatracker.ietf.org/doc/draft-xu-lsr-flooding-reduction-in-clos/
HTMLized: https://datatracker.ietf.org/doc/html/draft-xu-lsr-flooding-reduction-in-clos
Diff:     https://author-tools.ietf.org/iddiff?url2=draft-xu-lsr-flooding-reduction-in-clos-01

Abstract:

   In a CLOS topology, an OSPF (or ISIS) router may receive identical
   copies of an LSA (or LSP) from multiple OSPF (or ISIS) neighbors.
   Moreover, two OSPF (or ISIS) neighbors may exchange the same LSA (or
   LSP) simultaneously.  This results in unnecessary flooding of link-
   state information, which wastes the precious resources of OSPF (or
   ISIS) routers.  Therefore, this document proposes extensions to OSPF
   (or ISIS) to reduce this flooding within CLOS networks.  The
   reduction of OSPF (or ISIS) flooding is highly beneficial for
   improving the scalability of CLOS networks.
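
To make the redundancy described in the abstract concrete, the following sketch simulates plain flooding in a two-tier leaf-spine fabric and counts how many copies of one LSP each node receives. The topology builder, the node names, and the FIFO delivery order are simplifying assumptions for illustration; this models ordinary flooding only, not the mechanism proposed in the draft.

```python
from collections import deque


def build_leaf_spine(num_leaves, num_spines):
    """Every leaf connects to every spine; returns an adjacency dict."""
    leaves = [f"leaf{i}" for i in range(num_leaves)]
    spines = [f"spine{i}" for i in range(num_spines)]
    adjacency = {leaf: list(spines) for leaf in leaves}
    adjacency.update({spine: list(leaves) for spine in spines})
    return adjacency


def flood(adjacency, origin):
    """Flood one LSP from origin; return the copies received per node.

    A node re-floods the LSP on every link except the one it first
    arrived on, and silently drops later duplicate copies.
    """
    received = {node: 0 for node in adjacency}
    has_lsp = {origin}
    queue = deque((origin, nbr) for nbr in adjacency[origin])
    while queue:
        sender, receiver = queue.popleft()
        received[receiver] += 1
        if receiver in has_lsp:
            continue  # duplicate copy: counted, but not flooded again
        has_lsp.add(receiver)
        for nbr in adjacency[receiver]:
            if nbr != sender:
                queue.append((receiver, nbr))
    return received


if __name__ == "__main__":
    copies = flood(build_leaf_spine(num_leaves=4, num_spines=2), "leaf0")
    print(copies, "total:", sum(copies.values()))
```

In this toy run with four leaves and two spines, 11 copies are sent although only 5 nodes need the LSP; every other leaf hears it once per spine, and a leaf and a spine can send each other the same LSP at the same time.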



The IETF Secretariat

___
Lsr mailing list
Lsr@ietf.org
https://www

[Lsr] Re: New Version Notification for draft-xu-lsr-flooding-reduction-in-clos-01.txt

2023-11-26 Thread xuxiaohu_i...@hotmail.com
Hi Tony,

I agree that the mesh group feature described in RFC 2973 is a relatively 
simple mechanism for reducing LSP flooding. However, it is still somewhat 
complex, since 1) it relies on correct static configuration of all links; 2) 
it modifies how generated LSPs are transmitted; 3) it modifies the 
transmission behavior of CSNPs; and 4) mesh groups must be connected by 
"transit" circuits that are ‘meshInactive’.
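
For readers unfamiliar with mesh groups, here is a rough sketch of the per-circuit flooding decision for a received LSP, based on my reading of RFC 2973. The state names meshInactive, meshBlocked and meshSet come from the RFC; the `Circuit` class and `circuits_to_flood` function are hypothetical names for this sketch, and the handling of generated LSPs and CSNPs is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

MESH_INACTIVE = "meshInactive"  # circuit floods normally
MESH_BLOCKED = "meshBlocked"    # circuit never floods LSPs
MESH_SET = "meshSet"            # circuit belongs to a numbered mesh group


@dataclass(frozen=True)
class Circuit:
    name: str
    state: str = MESH_INACTIVE
    group: Optional[int] = None  # only meaningful when state is meshSet


def circuits_to_flood(circuits, incoming):
    """Return names of circuits on which a received LSP is re-flooded."""
    selected = []
    for circuit in circuits:
        if circuit.name == incoming.name:
            continue  # never send back on the receiving circuit
        if circuit.state == MESH_BLOCKED:
            continue
        same_group = (
            incoming.state == MESH_SET
            and circuit.state == MESH_SET
            and circuit.group == incoming.group
        )
        if same_group:
            continue  # other members of the mesh group already have it
        selected.append(circuit.name)
    return selected


if __name__ == "__main__":
    a = Circuit("a", MESH_SET, 1)
    b = Circuit("b", MESH_SET, 1)
    c = Circuit("c")  # the "transit" circuit, left meshInactive
    d = Circuit("d", MESH_BLOCKED)
    print(circuits_to_flood([a, b, c, d], incoming=a))
    print(circuits_to_flood([a, b, c, d], incoming=c))
```

The sketch shows why the static configuration matters: whether an LSP reaches the rest of the network depends entirely on which circuits were left meshInactive.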

In contrast, the flooding reduction mechanism proposed in my draft is much 
simpler.

Best regards,
Xiaohu








From: Tony Li on behalf of Tony Li
Date: Friday, November 24, 2023 00:39
To: xuxiaohu_i...@hotmail.com
Cc: lsr@ietf.org
Subject: Re: [Lsr] New Version Notification for 
draft-xu-lsr-flooding-reduction-in-clos-01.txt

Hi Xiaohu,

What you’re proposing is already described in IS-IS Mesh Groups 
(https://www.rfc-editor.org/rfc/rfc2973.html) and improved upon in Dynamic 
Flooding 
(https://datatracker.ietf.org/doc/html/draft-ietf-lsr-dynamic-flooding).

Regards,
Tony


On Nov 23, 2023, at 8:29 AM, xuxiaohu_i...@hotmail.com wrote:

Hi all,

Any comments or suggestions are welcome.

Best regards,
Xiaohu


From: internet-dra...@ietf.org
Date: Wednesday, November 22, 2023 11:37
To: Xiaohu Xu
Subject: New Version Notification for draft-xu-lsr-flooding-reduction-in-clos-01.txt

A new version of Internet-Draft draft-xu-lsr-flooding-reduction-in-clos-01.txt
has been successfully submitted by Xiaohu Xu and posted to the
IETF repository.

Name:     draft-xu-lsr-flooding-reduction-in-clos
Revision: 01
Title:    Flooding Reduction in CLOS Networks
Date:     2023-11-22
Group:    Individual Submission
Pages:    6
URL:      https://www.ietf.org/archive/id/draft-xu-lsr-flooding-reduction-in-clos-01.txt
Status:   https://datatracker.ietf.org/doc/draft-xu-lsr-flooding-reduction-in-clos/
HTMLized: https://datatracker.ietf.org/doc/html/draft-xu-lsr-flooding-reduction-in-clos
Diff:     https://author-tools.ietf.org/iddiff?url2=draft-xu-lsr-flooding-reduction-in-clos-01

Abstract:

   In a CLOS topology, an OSPF (or ISIS) router may receive identical
   copies of an LSA (or LSP) from multiple OSPF (or ISIS) neighbors.
   Moreover, two OSPF (or ISIS) neighbors may exchange the same LSA (or
   LSP) simultaneously.  This results in unnecessary flooding of link-
   state information, which wastes the precious resources of OSPF (or
   ISIS) routers.  Therefore, this document proposes extensions to OSPF
   (or ISIS) to reduce this flooding within CLOS networks.  The
   reduction of OSPF (or ISIS) flooding is highly beneficial for
   improving the scalability of CLOS networks.



The IETF Secretariat


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


[Lsr] Re: New Version Notification for draft-xu-lsr-fare-00.txt

2023-11-23 Thread xuxiaohu_i...@hotmail.com
Hi Tony,

Thank you for your comment. I fully agree that it is crucial to reduce the 
risk of oscillation when propagating real-time topology and link congestion 
information across the network. Since link congestion information is highly 
dynamic, it is essential to use a threshold on the variation in available 
link capacity to trigger and suppress updates. At the very least, link 
capacity information that is relatively stable can be used to achieve global 
load-balancing, particularly when multiple physical links are deployed 
between peers.
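
The threshold idea can be sketched as follows: a new advertisement is triggered only when the available capacity has moved by at least a configured fraction of the link capacity since the last advertised value, and smaller variations are suppressed. The class name, the 20% threshold, and the sample values are assumptions for illustration; the draft may define the trigger differently, and a threshold by itself does not prove the absence of oscillation.

```python
class CapacityAdvertiser:
    """Suppress updates until available capacity changes by a set fraction."""

    def __init__(self, link_capacity_gbps, threshold_fraction):
        self.link_capacity_gbps = link_capacity_gbps
        self.threshold_fraction = threshold_fraction
        self.last_advertised = None  # nothing advertised yet

    def observe(self, available_gbps):
        """Return True if this sample should trigger a new advertisement."""
        if self.last_advertised is None:
            self.last_advertised = available_gbps
            return True  # the first sample is always advertised
        change = abs(available_gbps - self.last_advertised)
        if change >= self.threshold_fraction * self.link_capacity_gbps:
            self.last_advertised = available_gbps
            return True
        return False  # below threshold: keep the previously flooded value


if __name__ == "__main__":
    advertiser = CapacityAdvertiser(link_capacity_gbps=400, threshold_fraction=0.2)
    samples = [400, 390, 370, 310, 330, 400]
    print([advertiser.observe(sample) for sample in samples])
```

With a 400 Gbit/s link and a 20% threshold, only the samples 400, 310 and the final 400 would be flooded; the 390, 370 and 330 readings stay within 80 Gbit/s of the last advertised value and are suppressed.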

Some network chip vendors have adopted proprietary signals to propagate 
congestion information for global load-balancing. The draft simply proposes 
an alternative approach, based on open standards, that could be used across 
different network chips.

Best regards,
Xiaohu



From: Tony Li on behalf of Tony Li
Date: Friday, November 24, 2023 00:36
To: xuxiaohu_i...@hotmail.com
Cc: lsr@ietf.org
Subject: Re: [Lsr] New Version Notification for draft-xu-lsr-fare-00.txt


Hi Xiaohu,

One way of achieving this would be to use the Unreserved Bandwidth TLV 
(https://datatracker.ietf.org/doc/html/rfc5305#autoid-10) to report the unused 
bandwidth on a link.

Then, you would have to explain how this does not become an oscillator. I’m not 
optimistic.
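
For reference, a minimal sketch of how the unreserved bandwidth value from RFC 5305 could be encoded. In RFC 5305 it is sub-TLV type 11 of the Extended IS Reachability TLV, 32 octets long, carrying eight IEEE floating-point values (priority 0 through 7) in bytes per second. The function names and the 10 Gbit/s sample value are assumptions of this sketch, not part of any implementation discussed here.

```python
import struct

UNRESERVED_BW_SUBTLV_TYPE = 11  # RFC 5305 unreserved bandwidth sub-TLV
NUM_PRIORITIES = 8              # one value per priority level 0..7


def encode_unreserved_bandwidth(bytes_per_second):
    """Encode eight per-priority values as type, length, 8 x float32."""
    if len(bytes_per_second) != NUM_PRIORITIES:
        raise ValueError("expected one value per priority level (8 in total)")
    value = struct.pack("!8f", *bytes_per_second)  # network byte order
    return struct.pack("!BB", UNRESERVED_BW_SUBTLV_TYPE, len(value)) + value


def decode_unreserved_bandwidth(data):
    """Decode a sub-TLV produced by encode_unreserved_bandwidth."""
    subtlv_type, length = struct.unpack("!BB", data[:2])
    if subtlv_type != UNRESERVED_BW_SUBTLV_TYPE or length != 32 or len(data) < 34:
        raise ValueError("not a well-formed unreserved bandwidth sub-TLV")
    return list(struct.unpack("!8f", data[2:34]))


if __name__ == "__main__":
    # 10 Gbit/s unused at every priority, expressed in bytes per second.
    encoded = encode_unreserved_bandwidth([1.25e9] * NUM_PRIORITIES)
    print(len(encoded), encoded[:2].hex())
    print(decode_unreserved_bandwidth(encoded))
```

Reporting the value is the easy part; the oscillation concern raised above is about how routers react to the reported values, which this encoding does not address.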

Regards,
Tony


On Nov 23, 2023, at 8:27 AM, xuxiaohu_i...@hotmail.com wrote:

Hi all,

Any comments or suggestions are welcome.

Best regards,
Xiaohu



From: internet-dra...@ietf.org
Date: Friday, November 24, 2023 00:13
To: Xiaohu Xu
Subject: New Version Notification for draft-xu-lsr-fare-00.txt

A new version of Internet-Draft draft-xu-lsr-fare-00.txt has been successfully
submitted by Xiaohu Xu and posted to the
IETF repository.

Name:     draft-xu-lsr-fare
Revision: 00
Title:    Fully Adaptive Routing Ethernet
Date:     2023-11-22
Group:    Individual Submission
Pages:    7
URL:      https://www.ietf.org/archive/id/draft-xu-lsr-fare-00.txt
Status:   https://datatracker.ietf.org/doc/draft-xu-lsr-fare/
HTMLized: https://datatracker.ietf.org/doc/html/draft-xu-lsr-fare


Abstract:

   Large language models (LLMs) like ChatGPT have become increasingly
   popular in recent years due to their impressive performance in
   various natural language processing tasks.  These models are built by
   training deep neural networks on massive amounts of text data, often
   consisting of billions or even trillions of parameters.  However, the
   training process for these models can be extremely resource-
   intensive, requiring the deployment of thousands or even tens of
   thousands of GPUs in a single AI training cluster.  Therefore, three-
   stage or even five-stage CLOS networks are commonly adopted for AI
   networks.  The non-blocking nature of the network becomes increasingly
   critical for large-scale AI models.  Therefore, adaptive routing is
   necessary to dynamically load balance traffic to the same destination
   over multiple ECMP paths, based on network capacity and even
   congestion information along those paths.



The IETF Secretariat


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


[Lsr] Fwd: New Version Notification for draft-xu-lsr-flooding-reduction-in-clos-01.txt

2023-11-23 Thread xuxiaohu_i...@hotmail.com
Hi all,

Any comments or suggestions are welcome.

Best regards,
Xiaohu


From: internet-dra...@ietf.org
Date: Wednesday, November 22, 2023 11:37
To: Xiaohu Xu
Subject: New Version Notification for draft-xu-lsr-flooding-reduction-in-clos-01.txt

A new version of Internet-Draft draft-xu-lsr-flooding-reduction-in-clos-01.txt
has been successfully submitted by Xiaohu Xu and posted to the
IETF repository.

Name:     draft-xu-lsr-flooding-reduction-in-clos
Revision: 01
Title:    Flooding Reduction in CLOS Networks
Date:     2023-11-22
Group:    Individual Submission
Pages:    6
URL:      https://www.ietf.org/archive/id/draft-xu-lsr-flooding-reduction-in-clos-01.txt
Status:   https://datatracker.ietf.org/doc/draft-xu-lsr-flooding-reduction-in-clos/
HTMLized: https://datatracker.ietf.org/doc/html/draft-xu-lsr-flooding-reduction-in-clos
Diff:     https://author-tools.ietf.org/iddiff?url2=draft-xu-lsr-flooding-reduction-in-clos-01

Abstract:

   In a CLOS topology, an OSPF (or ISIS) router may receive identical
   copies of an LSA (or LSP) from multiple OSPF (or ISIS) neighbors.
   Moreover, two OSPF (or ISIS) neighbors may exchange the same LSA (or
   LSP) simultaneously.  This results in unnecessary flooding of link-
   state information, which wastes the precious resources of OSPF (or
   ISIS) routers.  Therefore, this document proposes extensions to OSPF
   (or ISIS) to reduce this flooding within CLOS networks.  The
   reduction of OSPF (or ISIS) flooding is highly beneficial for
   improving the scalability of CLOS networks.



The IETF Secretariat


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr


[Lsr] Fwd: New Version Notification for draft-xu-lsr-fare-00.txt

2023-11-23 Thread xuxiaohu_i...@hotmail.com
Hi all,

Any comments or suggestions are welcome.

Best regards,
Xiaohu



From: internet-dra...@ietf.org
Date: Friday, November 24, 2023 00:13
To: Xiaohu Xu
Subject: New Version Notification for draft-xu-lsr-fare-00.txt

A new version of Internet-Draft draft-xu-lsr-fare-00.txt has been successfully
submitted by Xiaohu Xu and posted to the
IETF repository.

Name:     draft-xu-lsr-fare
Revision: 00
Title:    Fully Adaptive Routing Ethernet
Date:     2023-11-22
Group:    Individual Submission
Pages:    7
URL:      https://www.ietf.org/archive/id/draft-xu-lsr-fare-00.txt
Status:   https://datatracker.ietf.org/doc/draft-xu-lsr-fare/
HTMLized: https://datatracker.ietf.org/doc/html/draft-xu-lsr-fare


Abstract:

   Large language models (LLMs) like ChatGPT have become increasingly
   popular in recent years due to their impressive performance in
   various natural language processing tasks.  These models are built by
   training deep neural networks on massive amounts of text data, often
   consisting of billions or even trillions of parameters.  However, the
   training process for these models can be extremely resource-
   intensive, requiring the deployment of thousands or even tens of
   thousands of GPUs in a single AI training cluster.  Therefore, three-
   stage or even five-stage CLOS networks are commonly adopted for AI
   networks.  The non-blocking nature of the network becomes increasingly
   critical for large-scale AI models.  Therefore, adaptive routing is
   necessary to dynamically load balance traffic to the same destination
   over multiple ECMP paths, based on network capacity and even
   congestion information along those paths.



The IETF Secretariat


___
Lsr mailing list
Lsr@ietf.org
https://www.ietf.org/mailman/listinfo/lsr