Dear Nitsan:
The mechanism relies on the controller to configure buffer resources, combined
with per-router buffer threshold settings and the exchange of backpressure
messages. Using preset trigger conditions and the cross-device buffer
coordination process, it is intended to ensure that backpressure takes effect
within the time that router buffers can absorb a traffic burst.
For details, please refer to Section 3.2 of
draft-hs-rtgwg-wan-lossless-framework, which reads as follows:
3.2. Use and Management of Multi-Level Network Buffers
Since temporary bandwidth is shared and not dedicated, it exhibits
weaker SLA guarantees. If traffic experiences jitter during
transmission, network device buffers can absorb packets to reduce
packet loss.
3.2.1. Specific Requirements:
* *Single Device Buffer Sharing and Management*: Single devices
should implement fine-grained buffer divisions based on traffic
priority and slice. These buffers should be isolated to avoid
mutual interference. Initial buffer resource allocation is
determined by the controller and configured across all devices in
the domain via control plane protocols.
* *Cross-Device Buffer Coordination*: Given the nature of large data
transmissions, a single device's buffer might be insufficient for
absorbing bursty traffic. Therefore, multiple devices' buffers of
the same fine-grained type (e.g., same priority and slice) should
be used collectively. For example, if device C in the path
A->B->C is congested and its buffer is insufficient, it should
notify upstream devices B or A to utilize their similar buffers to
absorb some traffic. This involves:
- Control Signaling: Using control signaling packets to notify
upstream devices to buffer packets, reducing the burden on the
congested device. If upstream device buffers also reach a
threshold, further notifications should be triggered upstream.
Control signaling should include buffer index (e.g., slice ID),
control instructions, and parameters. Controller configuration
or segment routing can help determine upstream device
addresses. Upon congestion relief, upstream devices should be
notified to release buffered traffic. This notification
mechanism can be inspired by IEEE PFC mechanisms but requires
more granular backpressure.
- Trigger Conditions for Buffer Coordination: The local device-
triggering cross-device buffer coordination requires pre-set
conditions. Controllers can configure device-specific
thresholds to customize trigger conditions for each device,
slice, and priority.
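
The coordination steps quoted above can be sketched in a few lines of Python.
This is only an illustrative model, not the draft's mechanism: the Device
class, its field names, the packet counts, and the rule that release happens
as soon as the queue drops below the same threshold are all assumptions made
for the example. Control signaling is reduced to setting a flag on the
upstream hop, and signaling delay is not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Device:
    """One buffer class (same slice and priority) on one router. Hypothetical model."""
    name: str
    capacity: int                          # buffer size, in packets
    threshold: int                         # controller-configured trigger, in packets
    upstream: Optional["Device"] = None
    downstream: Optional["Device"] = None
    queued: int = 0                        # packets currently buffered
    dropped: int = 0                       # packets lost to buffer overflow
    paused: bool = False                   # downstream has asked this device to hold
    signalled: bool = False                # this device has asked upstream to hold

    def _store(self, packets: int) -> None:
        room = self.capacity - self.queued
        self.dropped += max(0, packets - room)
        self.queued += min(packets, room)
        # Trigger condition: at or above the threshold, notify the upstream hop once.
        if (self.queued >= self.threshold
                and self.upstream is not None and not self.signalled):
            self.signalled = True
            self.upstream.paused = True

    def receive(self, packets: int) -> None:
        # Buffer at the last hop (the congested egress), while paused, or
        # behind packets that are already held (keeps FIFO order).
        if self.downstream is None or self.paused or self.queued > 0:
            self._store(packets)
        else:
            self.downstream.receive(packets)

    def drain(self, packets: int) -> None:
        """Send up to `packets` buffered packets onward, unless paused."""
        if self.paused:
            return
        sent = min(packets, self.queued)
        self.queued -= sent
        if self.downstream is not None:
            self.downstream.receive(sent)
        # Congestion relief: below the threshold, tell upstream to release.
        if self.signalled and self.queued < self.threshold:
            self.signalled = False
            self.upstream.paused = False


# Path A->B->C with identical, made-up buffer settings.
a = Device("A", capacity=10, threshold=8)
b = Device("B", capacity=10, threshold=8, upstream=a)
c = Device("C", capacity=10, threshold=8, upstream=b)
a.downstream, b.downstream = b, c

for burst in (8, 8, 5):      # a 21-packet burst enters at A
    a.receive(burst)
held = (a.queued, b.queued, c.queued)

c.drain(4)                   # C's egress recovers and releases B
b.drain(3)                   # B forwards again and releases A
print(held, (a.queued, b.queued, c.queued), a.dropped + b.dropped + c.dropped)
```

In this toy run the burst exceeds any single device's buffer, yet nothing is
dropped because B and A hold the overflow. A real design would also have to
account for packets already in flight when the notification is sent.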
BR,
Zhengxin
Zhengxin Han
From: Nitsan Dolev
Sent: 2025-07-24 20:12
To: 韩政鑫(联通集团本部); Tony Li
Cc: rtgwg
Subject: [rtgwg] Re: [EXTERNAL] Re: Continue discussion on “Use Cases, Requirements,
and Framework for Implementing Lossless Techniques in Wide Area Networks”
presentation in the RTGWG
Dear Zhengxin,
Could you please explain how one can ensure that your proposal below can act
within the time that neighboring router buffers can absorb bursts?
Looking forward,
Nitsan Dolev
From: 韩政鑫(联通集团本部) <[email protected]>
Sent: Thursday, July 24, 2025 1:56 PM
To: Tony Li <[email protected]>
Cc: rtgwg <[email protected]>
Subject: [EXTERNAL] [rtgwg] Re: Continue discussion on “Use Cases,
Requirements, and Framework for Implementing Lossless Techniques in Wide Area
Networks” presentation in the RTGWG
Hi Tony,
We are not focused on QoS. Instead, we aim to utilize the large buffers of
routers to notify upstream devices to slow down or pause packet transmission
before the congestion queue is full, thereby achieving losslessness.
The cache-based retransmission method has a significant negative impact on
performance, so this is not a good choice.
Of course, directly applying traditional PFC to wide area networks still poses
challenges. Therefore, we are proposing an enhanced PFC mechanism.
BR,
Zhengxin
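On July 24, 2025, at 18:01, Tony Li <[email protected]> wrote:
[This is an external email. Please verify the sender's identity and handle any links and attachments in the message with caution.]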
Apparently, I still don't understand your requirements. You say lossless, yet
you aren't willing to deal with retransmissions. This would seem to be
problematic when there are link errors.
If what you seek is simply QoS, well, we've solved that problem before. Flow
control is not necessary.
T
On Thu, Jul 24, 2025 at 11:49 AM 韩政鑫(联通集团本部) <[email protected]> wrote:
Hi Tony,
Thanks for your valuable historical perspective. We’ve reviewed materials on
LAPB and X.25 networks, and it’s true that early approaches like LAPB had
limitations—leading Internet designers to adopt a different architectural path.
LAPB in X.25 relies on hop-by-hop retransmission for error correction,
introducing significant latency and throughput bottlenecks. However, modern
flow control mechanisms, such as PFC, detect queue thresholds and rapidly
throttle traffic upstream of congestion points. This actively prevents
congestion without retransmission, using backpressure with extremely low
latency.
We fully agree that preventing congestion for all traffic across the entire
network is impractical and would incur severe costs. Instead of targeting all
traffic, we prioritize high-priority services to ensure their performance. Is
there value in precisely preventing congestion for high-priority flows to
reduce packet loss and guarantee high throughput for RDMA transmission over
long distances? This is why we believe refined flow control at the tenant/flow
level is necessary.
Additionally, we believe upgrading all network devices is not feasible. There
should be a lightweight, cross-hop technical solution. For example, only the
routers at both ends are upgraded. In special cases, such as when the distance
is quite long, a few intermediate nodes may be further upgraded to quickly
alleviate congestion.
BR,
ZhengXin
Zhengxin Han
From: Tony Li
Sent: 2025-07-23 21:29
To:
Cc: rtgwg; shavitt; 庞冉(联通集团本部); 阮征(联通集团本部)
Subject: [rtgwg] Re: Continue discussion on “Use Cases, Requirements, and Framework
for Implementing Lossless Techniques in Wide Area Networks” presentation in the
RTGWG
Hi,
If your goal is to prevent congestion loss in the network, then you will find
that you effectively need to prevent congestion in the network.
That is possible and has been done before. The approach for doing this is to
ensure that each router has flow-control and retransmission at the link layer.
You also need to extend this back to the originating hosts.
This has been done before. See the LAPB link layer protocol that underlies
X.25 networks. The performance implications are rather severe.
You might consider that these approaches are an entirely different architecture
that the Internet designers decided to avoid back around 1969.
Regards,
Tony
On Wed, Jul 23, 2025 at 3:14 PM <"韩政鑫(联通集团本部)"@mf1-de.cloudmails.net> wrote:
Hi all,
We gave a presentation in the RTGWG session, focusing on the topic “Use
Cases, Requirements, and Framework for Implementing Lossless Techniques in Wide
Area Networks”. During the meeting, we received two comments. Since time was
limited, we can continue the discussion on this mailing list.
1. Shouldn’t this be handled at layer four (the transport layer) or the
application layer using forward error correction (FEC)? That way, it can be
solved end-to-end, instead of requiring further communication between
routing devices. (Comment from Yuval SHAVITT).
Response:
FEC detects and corrects bit errors in data transmission, which ensures
data integrity and reduces packet loss caused by bit errors. However, our
primary focus is on packet loss resulting from network congestion due to
traffic aggregation and bursts, and such packet loss significantly affects RDMA
throughput and transmission efficiency.
To address this, we propose using fine-grained flow control mechanisms (e.g.,
enhanced PFC) between routing devices in the WAN to promptly mitigate
congestion, achieve an extremely low packet loss rate, and guarantee efficient
RDMA transmission over long distances. Meanwhile, to avoid large-scale upgrades
of network devices, we have also submitted a draft to the SPRING working group
that supports cross-hop flow control notification and processing
(https://datatracker.ietf.org/doc/draft-ruan-spring-priority-flow-control-sid/).
Admittedly, end-to-end solutions at layer four or the application layer, such
as fast source rate control notifications (e.g., ECN, Fast CNP), are also
integrated into our framework to tackle issues at the source end.
Nevertheless, because WANs have long RTTs, these mechanisms may respond with
delay, which limits their effectiveness in rapidly alleviating congestion.
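
As a rough illustration of why the RTT matters here: a sender transmitting at
a constant rate emits about rate x RTT worth of data before a signal that
needs one round trip can take effect. The short Python sketch below computes
that figure. The function name and the sample values (100 Gbit/s, a 30 ms
end-to-end RTT, a 5 ms single-hop RTT) are assumptions chosen for the example
and are not taken from the drafts.

```python
def bytes_in_flight(rate_gbps: float, rtt_ms: float) -> float:
    """Bytes sent at a constant rate during one round-trip time."""
    bits = rate_gbps * 1e9 * rtt_ms / 1e3
    return bits / 8


# Illustrative values only: the same link rate with two different control loops.
end_to_end = bytes_in_flight(100, 30)   # source reacts after an end-to-end RTT
single_hop = bytes_in_flight(100, 5)    # upstream router reacts after a hop RTT

print(f"end-to-end loop: {end_to_end / 1e6:.1f} MB arrive before the source slows")
print(f"single-hop loop: {single_hop / 1e6:.1f} MB arrive before upstream pauses")
```

With these sample numbers the end-to-end loop lets about 375 MB arrive at the
congestion point before the source reacts, while the shorter hop-level loop
lets about 62.5 MB arrive. Whatever the actual figures are, the buffer at the
congested device has to cover that amount for the loop to avoid loss.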
We think network device optimizations and end-side improvements are
complementary rather than conflicting. Similar to data center networks,
combining network-layer technologies with transport/application layer
mechanisms can achieve lossless transmission. Besides, as communication
operators, we focus more on the network side and hope to further reduce the
packet loss rate in WANs to provide robust network services for upper-layer
applications.
2. Regarding the relationship with DetNet, here are some of our thoughts, and we
welcome further discussion and insights from the DetNet community.
Deterministic networking typically emphasizes bounded low latency and jitter,
catering to latency-critical scenarios such as industrial control. Our current
focus, however, is on efficient transmission of massive (TB/PB-level) data over
long distances, for example, distributed AI training and inference across
geographically dispersed data centers.
From our view, deterministic networking can achieve lossless transmission (with
zero packet loss) through advance resource reservation and time-slot-based
scheduling. Does the deterministic network eliminate network congestion
entirely? Additionally, lossless transmission (with packet loss close to
zero) could also be achieved through congestion control, path optimization, QoS,
etc. So is each approach suited to different scenarios, with different
trade-offs between effectiveness and implementation cost?
Draft links:
https://datatracker.ietf.org/doc/draft-hs-rtgwg-wan-lossless-uc
https://datatracker.ietf.org/doc/draft-hs-rtgwg-wan-lossless-framework/
Any feedback and comments are welcome!
Best Regards,
Zhengxin Han
Zhengxin Han
Next Generation Internet Research Department
Research Institute
CHINA UNITED NETWORK COMMUNICATIONS CORPORATION LIMITED
Mobile: +86-18601275531
E-mail: [email protected]
_______________________________________________
rtgwg mailing list -- [email protected]
To unsubscribe send an email to [email protected]