Re: What's the status of parallel netisr?

2008-09-20 Thread Jian Qiu
Hi, Kevin,

 Did you try locking down the CPUs used with cpuset (FreeBSD) or taskset
 (Linux)? This can make a very substantial difference. Something like a
 UDP cannon will run far more efficiently if locked to a single CPU and
 will run best if that CPU is not processing the interrupts.


As far as I know, on the sending path a UDP packet is put directly on
the send queue of the relevant NIC, and the UDP stack code runs on the
CPU where the sending application is running. On the receiving path,
if the packet is received from a loopback interface, the UDP stack
code runs in the context of the netisr softirq.

Did you mean I should bind the sending application to one CPU and the
netisr softirq to another CPU?

Thanks.

Jian
___
freebsd-net@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: What's the status of parallel netisr?

2008-09-20 Thread Jian Qiu
Hi, Kris,

 In our application-level tests FreeBSD significantly out-performs Linux, so
 either you have found a different workload, or something is not configured
 equally.  One important thing I can think of off the top of my head is that
 Linux has a larger socket buffer size by default, so try tuning that on
 FreeBSD or confirm they are equal.

 If that still fails, can you provide test code?

 Kris


I tried, but a larger socket buffer does not seem to help.
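
To confirm that the buffer sizes really are equal on both systems, something
like the following sketch can request a size and read back what the kernel
actually granted. The helper name is hypothetical, and the kernel may round,
double, cap, or reject the request depending on system limits, so the values
to compare are the returned ones, not the requested one.

```python
import socket


def request_buffers(sock, size):
    """Ask for `size`-byte send/receive buffers; return what the kernel reports."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    # Read the values back: the granted size can differ from the request.
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return snd, rcv


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    print(request_buffers(s, 65536))
    s.close()
```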

I also tried netperf and iperf. Both applications achieve better
throughput on Linux.

So I feel the result is not specific to my test code.

My code is very simple. Basically, a client process calls sendto in a
loop while a server calls recvfrom in a loop.
Besides these, a few additional lines collect the throughput statistics. If
necessary, I will post the code here.
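
A minimal sketch of the kind of test described above, not the original code
(which was not posted): a receiver thread calls recvfrom in a loop while the
main thread calls sendto in a loop over 127.0.0.1, and the received byte count
gives the throughput. The function name, the default payload size, and the use
of a thread in place of a separate server process are assumptions made for
illustration, so its numbers are not comparable with the figures in this thread.

```python
import errno
import socket
import threading
import time


def udp_loopback_throughput(duration=1.0, payload_size=1024):
    """Return received MB/s for a sendto()/recvfrom() loop over 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
    server.settimeout(0.2)             # lets the receive loop notice `done`
    addr = server.getsockname()

    received = [0]                     # bytes seen by the receiver thread
    done = threading.Event()

    def receiver():
        while not done.is_set():
            try:
                data, _ = server.recvfrom(65535)
            except socket.timeout:
                continue
            received[0] += len(data)

    worker = threading.Thread(target=receiver)
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"x" * payload_size
    start = time.monotonic()
    while time.monotonic() - start < duration:
        try:
            client.sendto(payload, addr)
        except OSError as exc:
            # Some systems report a full send queue as ENOBUFS; just retry.
            if exc.errno != errno.ENOBUFS:
                raise
    elapsed = time.monotonic() - start

    done.set()
    worker.join()
    client.close()
    server.close()
    return received[0] / elapsed / 1e6


if __name__ == "__main__":
    print("%.1f MB/s" % udp_loopback_throughput())
```

Datagrams dropped because the receive buffer is full are simply not counted,
so the result is the receive-side rate, not the send rate.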

BTW, I did the tests on Linux 2.26.5. Which Linux kernel did you use?

Could you please provide some more information on your test?

Many thanks.

Jian


Re: What's the status of parallel netisr?

2008-09-18 Thread Jian Qiu
Thanks again for the info.

As you suggested, I ran the test on the most recent 7.0-stable-200807 kernel.

The SMP throughput on the new kernel improved to around 90MB/s.

However, the SMP kernel still had no advantage over UP, at least for this
kind of single-threaded application.

I then ran the same test on Linux with both SMP and UP kernels.

I observed the same trend.

The throughput on UP (~210MB/sec) was also much better than on SMP (~170MB/sec).

However, I was surprised again that the local UDP throughput on Linux
was more than double that of FreeBSD.

Since all these tests were performed on the same machine, the kernel
must be what makes such a big difference.

I'm curious what the major performance bottleneck in the FreeBSD network stack is.

Is there any plan in the community to address these issues?

Many thanks.

Jian

On Wed, Sep 17, 2008 at 3:27 AM, Kris Kennaway [EMAIL PROTECTED] wrote:
 Jian Qiu wrote:

 Interesting.

 I did a test on local UDP throughput.

 I was surprised to find that performance with an SMP kernel was
 worse than with UP (~74MB/s vs. 96MB/s).

 I had thought parallel netisr might be a solution.

 Make sure you are testing with either 8.0 or 7.1 (or late 7.0-STABLE), i.e.
 after the fixes to improve UDP performance on SMP systems.

 Kris




Re: What's the status of parallel netisr?

2008-09-18 Thread Kevin Oberman
 Date: Thu, 18 Sep 2008 22:50:10 +0800
 From: Jian Qiu [EMAIL PROTECTED]
 Sender: [EMAIL PROTECTED]
 
 Thanks again for the info.
 
 As you suggested, I ran the test on the most recent 7.0-stable-200807 kernel.
 
 The SMP throughput on the new kernel improved to around 90MB/s.
 
 However, the SMP kernel still had no advantage over UP, at least for this
 kind of single-threaded application.
 
 I then ran the same test on Linux with both SMP and UP kernels.
 
 I observed the same trend.
 
 The throughput on UP (~210MB/sec) was also much better than on SMP (~170MB/sec).
 
 However, I was surprised again that the local UDP throughput on Linux
 was more than double that of FreeBSD.
 
 Since all these tests were performed on the same machine, the kernel
 must be what makes such a big difference.
 
 I'm curious what the major performance bottleneck in the FreeBSD network
 stack is.
 
 Is there any plan in the community to address these issues?

Did you try locking down the CPUs used with cpuset (FreeBSD) or taskset
(Linux)? This can make a very substantial difference. Something like a
UDP cannon will run far more efficiently if locked to a single CPU and
will run best if that CPU is not processing the interrupts.
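
As a sketch of the pinning suggested above, done from inside the test program
instead of with cpuset or taskset: Python exposes os.sched_setaffinity on
platforms that provide the call (Linux in particular), so the helper below
returns False where it is missing. The helper name is hypothetical.

```python
import os


def pin_to_cpu(cpu):
    """Pin the calling process to one CPU; return False if unsupported here."""
    if not hasattr(os, "sched_setaffinity"):
        return False                   # platform does not offer this call
    os.sched_setaffinity(0, {cpu})     # pid 0 means "the calling process"
    return os.sched_getaffinity(0) == {cpu}


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        # Prefer the highest-numbered allowed CPU, which avoids CPU 0
        # whenever more than one CPU is available.
        print(pin_to_cpu(max(os.sched_getaffinity(0))))
```

From the shell, the equivalent would be along the lines of
"cpuset -l 1 ./sender" on FreeBSD or "taskset -c 1 ./sender" on Linux, where
./sender is a placeholder for the test program.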
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751




Re: What's the status of parallel netisr?

2008-09-18 Thread Kris Kennaway

Jian Qiu wrote:

Thanks again for the info.

As you suggested, I ran the test on the most recent 7.0-stable-200807 kernel.

The SMP throughput on the new kernel improved to around 90MB/s.

However, the SMP kernel still had no advantage over UP, at least for this
kind of single-threaded application.

I then ran the same test on Linux with both SMP and UP kernels.

I observed the same trend.

The throughput on UP (~210MB/sec) was also much better than on SMP (~170MB/sec).

However, I was surprised again that the local UDP throughput on Linux
was more than double that of FreeBSD.

Since all these tests were performed on the same machine, the kernel
must be what makes such a big difference.

I'm curious what the major performance bottleneck in the FreeBSD network stack is.

Is there any plan in the community to address these issues?


In our application-level tests FreeBSD significantly out-performs Linux, 
so either you have found a different workload, or something is not 
configured equally.  One important thing I can think of off the top of 
my head is that Linux has a larger socket buffer size by default, so try 
tuning that on FreeBSD or confirm they are equal.


If that still fails, can you provide test code?

Kris



Re: What's the status of parallel netisr?

2008-09-16 Thread Jian Qiu
Interesting.

I did a test on local UDP throughput.

I was surprised to find that performance with an SMP kernel was
worse than with UP (~74MB/s vs. 96MB/s).

I had thought parallel netisr might be a solution.

Anyway, thanks for the info.

On Tue, Sep 16, 2008 at 3:46 PM, Kris Kennaway [EMAIL PROTECTED] wrote:
 Jian Qiu wrote:

 I noticed there was a project trying to parallelize netisr in SMP.

 But I cannot find the relevant code in either stable 7 or current 8.

 I'm wondering what's the current status of this project?

 When will it be merged into FreeBSD source tree?

 It's available in a Perforce branch owned by rwatson (sorry, I don't have
 the branch name handy), but in my tests it either produced no benefits, or
 actually reduced performance.  This is surprising and the reasons for this
 are still unknown.

 Kris



Re: What's the status of parallel netisr?

2008-09-16 Thread Kevin Oberman
 Date: Tue, 16 Sep 2008 22:43:25 +0800
 From: Jian Qiu [EMAIL PROTECTED]
 Sender: [EMAIL PROTECTED]
 
 Interesting.
 
 I did a test on local UDP throughput.
 
 I was surprised to find that performance with an SMP kernel was
 worse than with UP (~74MB/s vs. 96MB/s).

Look at CPU affinity. I have seen significant jumps in performance when
things switch between CPUs. It's best to lock the UDP cannon to a
single CPU, and for that CPU not to be CPU0. (This applies to both the BSD
and Linux systems that I have worked with.)
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751

