On Thu, 4 Mar 2021 22:26:58 +0100 Eric Dumazet wrote: > It would be nice if tun driver would have the ability to delay TX > completions by N usecs, > so that packetdrill tests could be used. > > It is probably not too hard to add such a feature.
Add an ioctl to turn off the skb_orphan, queue the skbs in tun_do_read() to free them from a timer? > I was testing this (note how I also removed the tcp_rearm_rto(sk) call) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index > 6f450e577975c7be9537338c8a4c0673d7d36c4c..9ef92ca55e530f76ad793d7342442c4ec06165f7 > 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -6471,11 +6471,10 @@ static bool tcp_rcv_fastopen_synack(struct > sock *sk, struct sk_buff *synack, > tcp_fastopen_cache_set(sk, mss, cookie, syn_drop, try_exp); > > if (data) { /* Retransmit unacked data in SYN */ > - skb_rbtree_walk_from(data) { > - if (__tcp_retransmit_skb(sk, data, 1)) > - break; > - } > - tcp_rearm_rto(sk); > + skb_rbtree_walk_from(data) > + tcp_mark_skb_lost(sk, data); > + > + tcp_xmit_retransmit_queue(sk); > NET_INC_STATS(sock_net(sk), > LINUX_MIB_TCPFASTOPENACTIVEFAIL); > return true; AFAICT this works great now: ==> TFO case ret:-16 (0) ca_state:0 skb:ffff8881d3513800! FREED swapper/5 -- skb 0xffff8881d3513800 freed after: 1us ----- First: __tcp_retransmit_skb+1 tcp_retransmit_skb+18 tcp_xmit_retransmit_queue.part.70+339 tcp_rcv_state_process+2491 tcp_v6_do_rcv+405 tcp_v6_rcv+2984 Second: __tcp_retransmit_skb+1 tcp_retransmit_skb+18 tcp_xmit_retransmit_queue.part.70+339 tcp_tsq_write.part.71+146 tcp_tsq_handler+53 tcp_tasklet_func+181 sk:0xffff8885adc16f00 skb:ffff8881d3513800 --- 61us acked:1 The other case where we hit RTO after __tcp_retransmit_skb() fails is: ==> non-TFO case ret:-11 (0) ca_state:3 skb:ffff8883d71dd400! ----- First: __tcp_retransmit_skb+1 tcp_retransmit_skb+18 tcp_xmit_retransmit_queue.part.70+339 tcp_ack+2270 tcp_rcv_established+303 tcp_v6_do_rcv+190 Second: __tcp_retransmit_skb+1 tcp_retransmit_skb+18 tcp_retransmit_timer+716 tcp_write_timer_handler+136 tcp_write_timer+141 call_timer_fn+43 sk:0xffff88801772d340 skb:ffff8883d71dd400 --- 51738us acked:47324 Which I believe is this: if (refcount_read(&sk->sk_wmem_alloc) > min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf)) return -EAGAIN; Because __tcp_retransmit_skb() seems to bail before inet6_sk_rebuild_header gets called. Should we arm TSQ here as well?