[PATCH v2] sctp: avoid refreshing heartbeat timer too often

2016-04-06 Thread Marcelo Ricardo Leitner
Currently, on high-rate SCTP streams the heartbeat timer refresh can
consume quite a lot of resources: timer updates are costly, and the
timeout contains a random factor, which a) is also costly to compute
and b) defeats the mod_timer() optimization of skipping the update
when a timer is re-armed to the same value. It may even cause the
timer to be slightly advanced, for no good reason.
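The mod_timer() shortcut referred to above can be shown with a minimal
userspace sketch (the struct and function names are illustrative, not
the kernel's actual implementation): re-arming a pending timer to its
current expiry can be skipped entirely, but a per-packet random offset
makes every expiry different, so the cheap path is never taken.

```c
#include <stdbool.h>

/* Illustrative model of the shortcut: if the timer is already pending
 * with the same expiry, mod_timer() can return without touching the
 * timer wheel at all. */
struct sketch_timer {
	bool pending;
	unsigned long expires;
};

/* Returns true when the timer actually had to be (re)programmed. */
static bool sketch_mod_timer(struct sketch_timer *t, unsigned long expires)
{
	if (t->pending && t->expires == expires)
		return false;	/* cheap path: nothing to do */
	t->pending = true;
	t->expires = expires;	/* expensive path: modify the timer */
	return true;
}
```

With a random heartbeat offset, `expires` differs on every call, so the
cheap path above never triggers on the hot path.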

As suggested by David Laight, this patch removes the timer update from
the hot path by leaving the timer running and re-evaluating at
expiration time whether a heartbeat is still needed, similarly to what
is done for TCP. If it is not needed anymore, the timer is
re-scheduled to the new timeout, taking the time already elapsed into
account.

For this, we now record the last tx timestamp per transport, updated
in the same spots where the hb timer used to be restarted on tx. Also,
sctp_transport_reset_timers is split into sctp_transport_reset_t3_rtx
and sctp_transport_reset_hb_timer, so we can re-arm T3 without
re-arming the heartbeat one.
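The expiry-side re-evaluation described above can be sketched as
follows (a simplified userspace model based on the description; the
helper name and the jiffies-style arithmetic are assumptions, not the
patch's actual code). When the heartbeat timer fires, if data was
transmitted recently the handler sends nothing and only computes how
much longer to sleep:

```c
/* Decide what to do when the heartbeat timer fires.  Returns 0 when a
 * heartbeat is due now, otherwise the remaining delay to re-arm the
 * timer for, measured from 'now'. */
static unsigned long hb_timer_expired(unsigned long now,
				      unsigned long last_time_sent,
				      unsigned long hb_interval)
{
	unsigned long elapsed = now - last_time_sent;

	if (elapsed < hb_interval)
		/* Recent tx: the transport is not idle, so skip the
		 * heartbeat and re-arm for the time still missing. */
		return hb_interval - elapsed;

	return 0;	/* transport idle long enough: heartbeat now */
}
```

This way the transmit path only stores a timestamp, and all the timer
arithmetic happens at the much rarer expiration.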

On loopback with an MTU of 65535 and data chunks of 1636 bytes, so
that we have a considerable number of chunks without stressing system
calls, with netperf -t SCTP_STREAM -l 30, perf looked like this
before:

Samples: 103K of event 'cpu-clock', Event count (approx.): 2583300
  Overhead  Command  Shared Object  Symbol
+   6,15%  netperf  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
-   5,43%  netperf  [kernel.vmlinux]  [k] _raw_write_unlock_irqrestore
   - _raw_write_unlock_irqrestore
      - 96,54% _raw_spin_unlock_irqrestore
         - 36,14% mod_timer
            + 97,24% sctp_transport_reset_timers
            + 2,76% sctp_do_sm
         + 33,65% __wake_up_sync_key
         + 28,77% sctp_ulpq_tail_event
         + 1,40% del_timer
      - 1,84% mod_timer
         + 99,03% sctp_transport_reset_timers
         + 0,97% sctp_do_sm
      + 1,50% sctp_ulpq_tail_event

And after this patch, now with netperf -l 60:

Samples: 230K of event 'cpu-clock', Event count (approx.): 5770725
  Overhead  Command  Shared Object  Symbol
+   5,65%  netperf  [kernel.vmlinux]  [k] memcpy_erms
+   5,59%  netperf  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
-   5,05%  netperf  [kernel.vmlinux]  [k] _raw_spin_unlock_irqrestore
   - _raw_spin_unlock_irqrestore
      + 49,89% __wake_up_sync_key
      + 45,68% sctp_ulpq_tail_event
      - 2,85% mod_timer
         + 76,51% sctp_transport_reset_t3_rtx
         + 23,49% sctp_do_sm
      + 1,55% del_timer
+   2,50%  netperf  [sctp]            [k] sctp_datamsg_from_user
+   2,26%  netperf  [sctp]            [k] sctp_sendmsg

Throughput-wise, this goes from 6800 Mbps without the patch to
7050 Mbps with it, a ~3.7% improvement.

Signed-off-by: Marcelo Ricardo Leitner 
---
v2:
  patch re-checked, fixed indentation issue.

 include/net/sctp/structs.h |  8 +++-
 net/sctp/outqueue.c        | 15 ++-
 net/sctp/sm_make_chunk.c   |  3 +--
 net/sctp/sm_sideeffect.c   | 36 
 net/sctp/transport.c       | 19 +--
 5 files changed, 47 insertions(+), 34 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 6df1ce7a411c548bda4163840a90578b6e1b4cfe..5a404c354f4c787e57a6ab6ca68b3ffc537eb1cc 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -847,6 +847,11 @@ struct sctp_transport {
 	 */
 	ktime_t last_time_heard;
 
+	/* When was the last time that we sent a chunk using this
+	 * transport? We use this to check for idle transports
+	 */
+	unsigned long last_time_sent;
+
 	/* Last time(in jiffies) when cwnd is reduced due to the congestion
 	 * indication based on ECNE chunk.
 	 */
@@ -952,7 +957,8 @@ void sctp_transport_route(struct sctp_transport *, union sctp_addr *,
 			  struct sctp_sock *);
 void sctp_transport_pmtu(struct sctp_transport *, struct sock *sk);
 void sctp_transport_free(struct sctp_transport *);
-void sctp_transport_reset_timers(struct sctp_transport *);
+void sctp_transport_reset_t3_rtx(struct sctp_transport *);
+void sctp_transport_reset_hb_timer(struct sctp_transport *);
 int sctp_transport_hold(struct sctp_transport *);
 void sctp_transport_put(struct sctp_transport *);
 void sctp_transport_update_rto(struct sctp_transport *, __u32);
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 8d3d3625130ee0fd294998554a9290d57eae56e7..084718f9b3dad09e21e41e34b989e25627058c98 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -866,8 +866,10 @@ static int sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp)
 			 * sender MUST assure that at least one T3-rtx
 			 * timer is running.
 			 */
-			if (chunk->chunk_hdr->type == SCTP_CID_FWD_TSN)
-				sctp_transport_reset_timers(transport);
+			if (chunk->chunk_hdr->type == SCTP_CID_FWD_TSN) {
+				sctp_transport_reset_t3_rtx(transport);
+				transport->last_time_sent = jiffies;
+			}

Re: [PATCH v2] sctp: avoid refreshing heartbeat timer too often

2016-04-10 Thread David Miller
From: Marcelo Ricardo Leitner 
Date: Wed,  6 Apr 2016 15:15:19 -0300

> Currently, on high-rate SCTP streams the heartbeat timer refresh can
> consume quite a lot of resources: timer updates are costly, and the
> timeout contains a random factor, which a) is also costly to compute
> and b) defeats the mod_timer() optimization of skipping the update
> when a timer is re-armed to the same value. It may even cause the
> timer to be slightly advanced, for no good reason.
> 
> As suggested by David Laight, this patch removes the timer update from
> the hot path by leaving the timer running and re-evaluating at
> expiration time whether a heartbeat is still needed, similarly to what
> is done for TCP. If it is not needed anymore, the timer is
> re-scheduled to the new timeout, taking the time already elapsed into
> account.
> 
> For this, we now record the last tx timestamp per transport, updated
> in the same spots where the hb timer used to be restarted on tx. Also,
> sctp_transport_reset_timers is split into sctp_transport_reset_t3_rtx
> and sctp_transport_reset_hb_timer, so we can re-arm T3 without
> re-arming the heartbeat one.
> 
> On loopback with an MTU of 65535 and data chunks of 1636 bytes, so
> that we have a considerable number of chunks without stressing system
> calls, with netperf -t SCTP_STREAM -l 30, perf looked like this
> before:
 ...
> And after this patch, now with netperf -l 60:
 ...
> Throughput-wise, this goes from 6800 Mbps without the patch to
> 7050 Mbps with it, a ~3.7% improvement.
> 
> Signed-off-by: Marcelo Ricardo Leitner 

Applied, thanks Marcelo.