When an application sets SO_RCVBUF, the kernel marks the socket with
SOCK_RCVBUF_LOCK and disables receive window auto-tuning; the application
expects strict memory bounds. However, recent TCP memory fragmentation
optimizations still apply dynamic truesize penalties to the `scaling_ratio`
of these locked sockets.

For workloads that receive small, fragmented packets (such as a Java
Tomcat server), this penalty can drop `scaling_ratio` as low as 1. That
shrinks the advertised window derived from the locked buffer, leading to
Silly Window Syndrome (SWS) deadlocks and 504 Gateway Timeouts.

Fix this by skipping the truesize penalty for sockets with
`SOCK_RCVBUF_LOCK` set. To keep the kernel protected against memory
exhaustion from large aggregated payloads (e.g., GRO), still apply the
penalty when `skb->len` exceeds the advertised MSS (`advmss`).

Fixes: a2cbb1603943 ("tcp: Update window clamping condition")
Reported-by: Karen Badiryan <[email protected]>
Signed-off-by: Ankit Jain <[email protected]>
---
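Note for reviewers: the gating logic can be modelled in userspace as the
minimal sketch below. It is not kernel code. The helper names
should_update_scaling_ratio() and model_scaling_ratio() are hypothetical,
and the two macro values are assumed to match the kernel's definitions
(check them against your tree). The ratio helper follows the
"(skb->len << TCP_RMEM_TO_WIN_SCALE) / truesize, minimum 1" calculation
that the changed condition guards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the kernel's values; verify against your tree. */
#define SOCK_RCVBUF_LOCK	2
#define TCP_RMEM_TO_WIN_SCALE	8

/*
 * Hypothetical userspace model of the patched condition in
 * tcp_measure_rcv_mss(): returns true when scaling_ratio would be
 * recomputed for this segment.
 */
static bool should_update_scaling_ratio(unsigned int len,
					unsigned int rcv_mss,
					unsigned int userlocks,
					unsigned int skb_len,
					unsigned int advmss)
{
	/* Unchanged behaviour: nothing to do when rcv_mss is not updated. */
	if (len == rcv_mss)
		return false;

	/* Unlocked sockets keep the existing truesize penalty. */
	if (!(userlocks & SOCK_RCVBUF_LOCK))
		return true;

	/* Locked sockets are penalised only for payloads above advmss. */
	return skb_len > advmss;
}

/*
 * Hypothetical model of the ratio itself: payload bytes per byte of
 * truesize, scaled by 2^TCP_RMEM_TO_WIN_SCALE, never below 1.
 */
static uint8_t model_scaling_ratio(uint32_t skb_len, uint32_t truesize)
{
	uint64_t val;

	assert(truesize > 0);
	val = ((uint64_t)skb_len << TCP_RMEM_TO_WIN_SCALE) / truesize;

	/* Guard for this model only: keep the result inside a u8. */
	if (val > UINT8_MAX)
		val = UINT8_MAX;

	return val ? (uint8_t)val : 1;
}
```

In this model, a locked socket receiving a 100-byte segment with advmss
1460 keeps its current ratio, while a 3000-byte aggregated segment still
triggers the recalculation.
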
net/ipv4/tcp_input.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d5c9e65d9760..569299dafa88 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -240,8 +240,14 @@ static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb)
 		/* Note: divides are still a bit expensive.
 		 * For the moment, only adjust scaling_ratio
 		 * when we update icsk_ack.rcv_mss.
+		 *
+		 * Protect locked SO_RCVBUF from Silly Window Syndrome
+		 * due to truesize penalties on small packets. Allow
+		 * penalty if aggregate payload (e.g., GRO) exceeds MSS.
 		 */
-		if (unlikely(len != icsk->icsk_ack.rcv_mss)) {
+		if (unlikely(len != icsk->icsk_ack.rcv_mss &&
+			     (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK) ||
+			      skb->len > tcp_sk(sk)->advmss))) {
 			u64 val = (u64)skb->len << TCP_RMEM_TO_WIN_SCALE;
 			u8 old_ratio = tcp_sk(sk)->scaling_ratio;
--
2.53.0