Hi Neal,

I tried out your new suggested patches and indeed it looks like it is
working. The duration of the freezes looks like it has reduced from an
RTO to 10ms (tcp probe reports SRTT measurements of about 200us just
before the freeze). So the PTO value seems to be correctly set to
max(2*SRTT, 10ms).

Best,
-Steve

On Tue, Nov 21, 2017 at 7:46 PM, Neal Cardwell <ncardw...@google.com> wrote:
> On Tue, Nov 21, 2017 at 10:02 PM, Steve Ibanez <siba...@stanford.edu> wrote:
>> Hi Neal,
>>
>> I just tried out your fix for enabling TLPs in the CWR state (while
>> leaving tcp_tso_should_defer() unchanged), but I'm still seeing the
>> host enter long timeouts. Feel free to let me know if there is
>> something else you'd like me to try.
>
> Oh, interesting. That was surprising to me, until I re-read the TLP
> code. I think the TLP code is accidentally preventing the TLP timer
> from being set in cases where TSO deferral is using cwnd unused,
> because of this part of the logic:
>
>   if ((tp->snd_cwnd > tcp_packets_in_flight(tp)) &&
>       !tcp_write_queue_empty(sk))
>         return false;
>
> AFAICT it would be great for the TLP timer to be set if TSO deferral
> decides not to send. That way the TLP timer firing can unwedge the
> connection (in only a few milliseconds in LAN cases) if TSO deferral
> decides to defer sending and ACK stop arriving. Removing those 3 lines
> might allow TLP to give us much of the benefit of having a timer to
> unwedge things after TSO deferral, without adding any new timers or
> code.
>
> If you have time, would you be able to try the following two patches
> together in your test set-up?
>
> (1)
> commit 1ade85cd788cfed0433a83da03e299f396769c73
> Author: Neal Cardwell <ncardw...@google.com>
> Date:   Tue Nov 21 22:33:30 2017 -0500
>
>     tcp: allow TLP in CWR
>
>     (Also allows TLP in disorder, though this is somewhat academic, since
>     in disorder RACK will almost always override the TLP timer with a
>     reorder timeout.)
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 4ea79b2ad82e..deccf8070f84 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2536,11 +2536,11 @@ bool tcp_schedule_loss_probe(struct sock *sk,
> bool advancing_rto)
>
>         early_retrans = sock_net(sk)->ipv4.sysctl_tcp_early_retrans;
>         /* Schedule a loss probe in 2*RTT for SACK capable connections
> -        * in Open state, that are either limited by cwnd or application.
> +        * not in loss recovery, that are either limited by cwnd or 
> application.
>          */
>         if ((early_retrans != 3 && early_retrans != 4) ||
>             !tp->packets_out || !tcp_is_sack(tp) ||
> -           icsk->icsk_ca_state != TCP_CA_Open)
> +           icsk->icsk_ca_state >= TCP_CA_Recovery)
>                 return false;
>
>         if ((tp->snd_cwnd > tcp_packets_in_flight(tp)) &&
>
> (2)
> commit ccd377a601c14dc82826720d93afb573a388022e (HEAD ->
> gnetnext8xx-tlp-recalc-rto-on-ack-fix-v4)
> Author: Neal Cardwell <ncardw...@google.com>
> Date:   Tue Nov 21 22:34:42 2017 -0500
>
>     tcp: allow scheduling TLP timer if TSO deferral leaves some cwnd unused
>
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index deccf8070f84..1724cc2bbf1a 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2543,10 +2543,6 @@ bool tcp_schedule_loss_probe(struct sock *sk,
> bool advancing_rto)
>             icsk->icsk_ca_state >= TCP_CA_Recovery)
>                 return false;
>
> -       if ((tp->snd_cwnd > tcp_packets_in_flight(tp)) &&
> -            !tcp_write_queue_empty(sk))
> -               return false;
> -
>         /* Probe timeout is 2*rtt. Add minimum RTO to account
>          * for delayed ack when there's one outstanding packet. If no RTT
>          * sample is available then probe after TCP_TIMEOUT_INIT.
>
> Thanks!
>
> neal

Reply via email to