Re: [PATCH v2 net-next] softirq: reduce latencies

2013-01-05 Thread Sedat Dilek
[ QUOTE ]

From: Eric Dumazet 
Date: Thu, 03 Jan 2013 23:49:40 -0800

> From: Eric Dumazet 
>
> In various network workloads, __do_softirq() latencies can be up
> to 20 ms if HZ=1000, and 200 ms if HZ=100.
>
> This is because we iterate 10 times in the softirq dispatcher,
> and some actions can consume a lot of cycles.
>
> This patch changes the fallback to ksoftirqd condition to :
>
> - A time limit of 2 ms.
> - need_resched() being set on current task
>
> When one of this condition is met, we wakeup ksoftirqd for further
> softirq processing if we still have pending softirqs.
 ...
> Signed-off-by: Eric Dumazet 

Acked-by: David S. Miller 

[ /QUOTE ]

Tested-by: Sedat Dilek  (against Linux v3.8-rc2)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 net-next] softirq: reduce latencies

2013-01-04 Thread David Miller
From: Eric Dumazet 
Date: Thu, 03 Jan 2013 23:49:40 -0800

> From: Eric Dumazet 
> 
> In various network workloads, __do_softirq() latencies can be up
> to 20 ms if HZ=1000, and 200 ms if HZ=100.
> 
> This is because we iterate 10 times in the softirq dispatcher,
> and some actions can consume a lot of cycles.
> 
> This patch changes the fallback to ksoftirqd condition to :
> 
> - A time limit of 2 ms.
> - need_resched() being set on current task
> 
> When one of this condition is met, we wakeup ksoftirqd for further
> softirq processing if we still have pending softirqs.
 ...
> Signed-off-by: Eric Dumazet 

Acked-by: David S. Miller 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 net-next] softirq: reduce latencies

2013-01-04 Thread Eric Dumazet
On Fri, 2013-01-04 at 01:12 -0800, Joe Perches wrote:
> On Fri, 2013-01-04 at 00:23 -0800, Eric Dumazet wrote:
> > On Fri, 2013-01-04 at 00:15 -0800, Joe Perches wrote:
> > > Perhaps MAX_SOFTIRQ_TIME should be
> > > #define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
> > > though it would be nicer if it were a compile time constant.
> > 
> > If you send a patch to convert msecs_to_jiffies() to an inline function
> > when HZ = 1000, I will gladly use it instead of (2*HZ/1000)
> > 
> > Right now, max(1, msecs_to_jiffies(2)) uses way too many instructions,
> > while it should be the constant 2, known at compile time.
> 
> Something like this might work.
> 
> This is incomplete, it just does msecs_to_jiffies,
> and it should convert usecs_to_jiffies and the
> jiffies_to_ types too.
> 
> Maybe it's worthwhile.
> 
> It does reduce object size by 16 bytes per call site
> (x86-32) when the argument is a constant. There are
> about 800 of these jiffies conversions in kernel sources.
> 
> What do you think?
> 

I think this is something to discuss in another thread, and definitely
worth to do, at least for msecs_to_jiffies()

We have many HZ references everywhere that could be cleaned up using
this.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 net-next] softirq: reduce latencies

2013-01-04 Thread Joe Perches
On Fri, 2013-01-04 at 00:23 -0800, Eric Dumazet wrote:
> On Fri, 2013-01-04 at 00:15 -0800, Joe Perches wrote:
> > Perhaps MAX_SOFTIRQ_TIME should be
> > #define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
> > though it would be nicer if it were a compile time constant.
> 
> If you send a patch to convert msecs_to_jiffies() to an inline function
> when HZ = 1000, I will gladly use it instead of (2*HZ/1000)
> 
> Right now, max(1, msecs_to_jiffies(2)) uses way too many instructions,
> while it should be the constant 2, known at compile time.

Something like this might work.

This is incomplete, it just does msecs_to_jiffies,
and it should convert usecs_to_jiffies and the
jiffies_to_ types too.

Maybe it's worthwhile.

It does reduce object size by 16 bytes per call site
(x86-32) when the argument is a constant. There are
about 800 of these jiffies conversions in kernel sources.

What do you think?

 include/linux/jiffies.h | 49 -
 kernel/time.c   | 42 +++---
 2 files changed, 51 insertions(+), 40 deletions(-)

diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h
index 82ed068..c67ddcf 100644
--- a/include/linux/jiffies.h
+++ b/include/linux/jiffies.h
@@ -291,7 +291,54 @@ extern unsigned long preset_lpj;
  */
 extern unsigned int jiffies_to_msecs(const unsigned long j);
 extern unsigned int jiffies_to_usecs(const unsigned long j);
-extern unsigned long msecs_to_jiffies(const unsigned int m);
+extern unsigned long __msecs_to_jiffies(const unsigned int m);
+
+static inline unsigned long __inline_msecs_to_jiffies(const unsigned int m)
+{
+   /*
+* Negative value, means infinite timeout:
+*/
+   if ((int)m < 0)
+   return MAX_JIFFY_OFFSET;
+
+#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ)
+   /*
+* HZ is equal to or smaller than 1000, and 1000 is a nice
+* round multiple of HZ, divide with the factor between them,
+* but round upwards:
+*/
+   return (m + (MSEC_PER_SEC / HZ) - 1) / (MSEC_PER_SEC / HZ);
+#elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC)
+   /*
+* HZ is larger than 1000, and HZ is a nice round multiple of
+* 1000 - simply multiply with the factor between them.
+*
+* But first make sure the multiplication result cannot
+* overflow:
+*/
+   if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET))
+   return MAX_JIFFY_OFFSET;
+
+   return m * (HZ / MSEC_PER_SEC);
+#else
+   /*
+* Generic case - multiply, round and divide. But first
+* check that if we are doing a net multiplication, that
+* we wouldn't overflow:
+*/
+   if (HZ > MSEC_PER_SEC && m > jiffies_to_msecs(MAX_JIFFY_OFFSET))
+   return MAX_JIFFY_OFFSET;
+
+   return (MSEC_TO_HZ_MUL32 * m + MSEC_TO_HZ_ADJ32)
+   >> MSEC_TO_HZ_SHR32;
+#endif
+}
+
+#define msecs_to_jiffies(x)\
+   (__builtin_constant_p(x) ?  \
+__inline_msecs_to_jiffies(x) : \
+__msecs_to_jiffies(x))
+
 extern unsigned long usecs_to_jiffies(const unsigned int u);
 extern unsigned long timespec_to_jiffies(const struct timespec *value);
 extern void jiffies_to_timespec(const unsigned long jiffies,
diff --git a/kernel/time.c b/kernel/time.c
index d226c6a..231f2ac 100644
--- a/kernel/time.c
+++ b/kernel/time.c
@@ -425,47 +425,11 @@ EXPORT_SYMBOL(ns_to_timeval);
  *
  * We must also be careful about 32-bit overflows.
  */
-unsigned long msecs_to_jiffies(const unsigned int m)
+unsigned long __msecs_to_jiffies(const unsigned int m)
 {
-   /*
-* Negative value, means infinite timeout:
-*/
-   if ((int)m < 0)
-   return MAX_JIFFY_OFFSET;
-
-#if HZ <= MSEC_PER_SEC && !(MSEC_PER_SEC % HZ)
-   /*
-* HZ is equal to or smaller than 1000, and 1000 is a nice
-* round multiple of HZ, divide with the factor between them,
-* but round upwards:
-*/
-   return (m + (MSEC_PER_SEC / HZ) - 1) / (MSEC_PER_SEC / HZ);
-#elif HZ > MSEC_PER_SEC && !(HZ % MSEC_PER_SEC)
-   /*
-* HZ is larger than 1000, and HZ is a nice round multiple of
-* 1000 - simply multiply with the factor between them.
-*
-* But first make sure the multiplication result cannot
-* overflow:
-*/
-   if (m > jiffies_to_msecs(MAX_JIFFY_OFFSET))
-   return MAX_JIFFY_OFFSET;
-
-   return m * (HZ / MSEC_PER_SEC);
-#else
-   /*
-* Generic case - multiply, round and divide. But first
-* check that if we are doing a net multiplication, that
-* we wouldn't overflow:
-*/
-   if (HZ > MSEC_PER_SEC && m > jiffies_to_msecs(MAX_JIFFY_OFFSET))
-   return MAX_JIFFY_OFFSET;
-
-   return (MSEC_TO_HZ_MUL32 * m + MSEC_TO_HZ_ADJ32)
-   >> MSEC_TO_HZ_SHR32;
-#endif
+   return __inline_msecs_to_

Re: [PATCH v2 net-next] softirq: reduce latencies

2013-01-04 Thread Eric Dumazet
On Fri, 2013-01-04 at 00:15 -0800, Joe Perches wrote:
> On Thu, 2013-01-03 at 23:49 -0800, Eric Dumazet wrote:
> > In various network workloads, __do_softirq() latencies can be up
> > to 20 ms if HZ=1000, and 200 ms if HZ=100.
> > This patch changes the fallback to ksoftirqd condition to :
> > - A time limit of 2 ms.
> 
> []
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> []
> > +#define MAX_SOFTIRQ_TIME  max(1, (2*HZ/1000))
> 
> And if HZ is 1?
>  

Then its OK.  2*1/1000 -> 20 ticks -> 2 ms


> >  asmlinkage void __do_softirq(void)
> >  {
> []
> > +   unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
> 
> Perhaps MAX_SOFTIRQ_TIME should be
> 
> #define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
> 
> though it would be nicer if it were a compile time constant.

If you send a patch to convert msecs_to_jiffies() to an inline function
when HZ = 1000, I will gladly use it instead of (2*HZ/1000)

Right now, max(1, msecs_to_jiffies(2)) uses way too many instructions,
while it should be the constant 2, known at compile time.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 net-next] softirq: reduce latencies

2013-01-04 Thread Joe Perches
On Thu, 2013-01-03 at 23:49 -0800, Eric Dumazet wrote:
> In various network workloads, __do_softirq() latencies can be up
> to 20 ms if HZ=1000, and 200 ms if HZ=100.
> This patch changes the fallback to ksoftirqd condition to :
> - A time limit of 2 ms.

[]
> diff --git a/kernel/softirq.c b/kernel/softirq.c
[]
> +#define MAX_SOFTIRQ_TIME  max(1, (2*HZ/1000))

And if HZ is 1?
 
>  asmlinkage void __do_softirq(void)
>  {
[]
> + unsigned long end = jiffies + MAX_SOFTIRQ_TIME;

Perhaps MAX_SOFTIRQ_TIME should be

#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)

though it would be nicer if it were a compile time constant.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 net-next] softirq: reduce latencies

2013-01-03 Thread Eric Dumazet
From: Eric Dumazet 

In various network workloads, __do_softirq() latencies can be up
to 20 ms if HZ=1000, and 200 ms if HZ=100.

This is because we iterate 10 times in the softirq dispatcher,
and some actions can consume a lot of cycles.

This patch changes the fallback to ksoftirqd condition to :

- A time limit of 2 ms.
- need_resched() being set on current task

When one of this condition is met, we wakeup ksoftirqd for further
softirq processing if we still have pending softirqs.

Using need_resched() as the only condition can trigger RCU stalls,
as we can keep BH disabled for too long.

I ran several benchmarks and got no significant difference in
throughput, but a very significant reduction of latencies (one order
of magnitude) :

In following bench, 200 antagonist "netperf -t TCP_RR" are started in
background, using all available cpus.

Then we start one "netperf -t TCP_RR", bound to the cpu handling the NIC
IRQ (hard+soft)

Before patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=550110.424
MIN_LATENCY=146858
MAX_LATENCY=997109
P50_LATENCY=305000
P90_LATENCY=55
P99_LATENCY=71
MEAN_LATENCY=376989.12
STDDEV_LATENCY=184046.92

After patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=40545.492
MIN_LATENCY=9834
MAX_LATENCY=78366
P50_LATENCY=33583
P90_LATENCY=59000
P99_LATENCY=69000
MEAN_LATENCY=38364.67
STDDEV_LATENCY=12865.26

Signed-off-by: Eric Dumazet 
Cc: David Miller 
Cc: Tom Herbert 
Cc: Ben Hutchings 
---
v2: min(1, (2*HZ/1000)) -> max(1, (2*HZ/1000)), as spotted by Ben

 kernel/softirq.c |   17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index ed567ba..8d5e4be 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -195,21 +195,21 @@ void local_bh_enable_ip(unsigned long ip)
 EXPORT_SYMBOL(local_bh_enable_ip);
 
 /*
- * We restart softirq processing MAX_SOFTIRQ_RESTART times,
- * and we fall back to softirqd after that.
+ * We restart softirq processing for at most 2 ms,
+ * and if need_resched() is not set.
  *
- * This number has been established via experimentation.
+ * These limits have been established via experimentation.
  * The two things to balance is latency against fairness -
  * we want to handle softirqs as soon as possible, but they
  * should not be able to lock up the box.
  */
-#define MAX_SOFTIRQ_RESTART 10
+#define MAX_SOFTIRQ_TIME  max(1, (2*HZ/1000))
 
 asmlinkage void __do_softirq(void)
 {
struct softirq_action *h;
__u32 pending;
-   int max_restart = MAX_SOFTIRQ_RESTART;
+   unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
int cpu;
unsigned long old_flags = current->flags;
 
@@ -264,11 +264,12 @@ restart:
local_irq_disable();
 
pending = local_softirq_pending();
-   if (pending && --max_restart)
-   goto restart;
+   if (pending) {
+   if (time_before(jiffies, end) && !need_resched())
+   goto restart;
 
-   if (pending)
wakeup_softirqd();
+   }
 
lockdep_softirq_exit();
 


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/