Re: [Lse-tech] Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-11 Thread David Collier-Brown

On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> workload with Samba, and I wanted to get some feedback on results so
> far.

Also consider using Andrew Tridgell's
dbench/tbench/smbtorture suite in this process: it
is mathematically comparable to NetBench, runs on
smaller numbers of load-generating machines, and
can give better breakdowns into the disk component,
the network component and the on-server component
of the available performance.

I also have some results from SPARC Linux: send me email.

--dave
-- 
David Collier-Brown,   | Always do right. This will gratify 
Performance & Engineering Team | some people and astonish the rest.
Americas Customer Engineering  |  -- Mark Twain
(905) 415-2849 | [EMAIL PROTECTED]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [Lse-tech] Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-10 Thread Maneesh Soni

On Wed, May 09, 2001 at 12:30:35PM -0500, Andrew M. Theurer wrote:
> I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on
> http://lse.sourceforge.net to post the test data.
> 
> Andrew Theurer
> 
> Mike Kravetz wrote:
> > 
> > On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> > >
> > > I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> > > workload with Samba, and I wanted to get some feedback on results so
> > > far.
> > 
> > Do you have any kernel profile or lock contention data?
> > 
> > --
> > Mike Kravetz [EMAIL PROTECTED]
> > IBM Linux Technology Center

Hello Andrew,

If in the kernprof data you find "fget" as one of the high rankers (say in the
top 10), can you try the scalable FD management patch, which uses the
read-copy-update mechanism to protect files_struct?

Working patches for the read-copy-update mechanism and FD management are
currently available at http://lse.sourceforge.net/locking/rclock.html as
rclock-2.4.2-01.patch and files_struct_rcu-2.4.2-03.patch, but we are working on
simpler interfaces. Also let me know if you need the patches for a different
2.4 kernel version.

Maneesh

-- 
Maneesh Soni
IBM Linux Technology Center,
IBM India Software Lab, Bangalore.
email: [EMAIL PROTECTED]
http://lse.sourceforge.net/locking/rclock.html



Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-10 Thread Andrew M. Theurer

Bruce Allan wrote:
> 
> Andrew Theurer wrote:
> > I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> > significant problems with lockmeter.  csum_partial_copy_generic was the
> > highest % in profile, at 4.34%.  I'll see if we can get some space on
> > http://lse.sourceforge.net to post the test data.
> 
> The Netfinity system that you are using has two different supported GigE
> adapters.  I assume you are using one of these types - Netfinity Gigabit
> Ethernet Adapter (19K4401) and the Netfinity Gigabit Ethernet SX Server
> Adapter (06P3701); using the acenic.c and e1000.c drivers, respectively.
> From what I understand after initial perusal of the two drivers, the former
> has receive checksumming support on the adapter itself while the latter,
> the one you are using, does not support hardware checksumming (at least, it
> is not enabled by the driver).

Bruce,

According to Intel's driver for Pro/1000, it supports checksum on Rx via
module option "XsumRX=1".  I have not tried this yet because we are
waiting on our Gbps switch hardware.

> Are you able to re-run your tests with GigE adapters that support
> checksumming on the hardware instead of doing it in the kernel?  If not, I
> will be running similar tests in a very similar configuration (with the
> 19K4401 adapters) in the near future and can share results if you'd like.

Yes, hopefully we will be running the new setup (64 clients, many Gbps
adapters) in about 2-3 weeks.  At that point I'd like to get some
results for 8-way as well.  It would definitely be a good idea to
compare results.

Andrew Theurer



Re: [Lse-tech] Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-10 Thread Dipankar Sarma

Hello Andrew,

You would need to contact one of the administrators of the LSE project for this.
You would need a developer id for uploading. You can get all the information
from http://sourceforge.net/projects/lse/.

I think it will be very helpful to have the results, including lockmeter
and kernprof data, available on lse.sourceforge.net.

Thanks
Dipankar
-- 
Dipankar Sarma  <[EMAIL PROTECTED]> Project: http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.

On Wed, May 09, 2001 at 12:30:35PM -0500, Andrew M. Theurer wrote:
> I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on
> http://lse.sourceforge.net to post the test data.
> 
> Andrew Theurer
> 
> Mike Kravetz wrote:
> > 
> > On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> > >
> > > I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> > > workload with Samba, and I wanted to get some feedback on results so
> > > far.
> > 
> > Do you have any kernel profile or lock contention data?
> > 
> > --
> > Mike Kravetz [EMAIL PROTECTED]
> > IBM Linux Technology Center
> 
> ___
> Lse-tech mailing list
> [EMAIL PROTECTED]
> http://lists.sourceforge.net/lists/listinfo/lse-tech




Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Kenichi Okuyama

> "AMT" == Andrew M Theurer <[EMAIL PROTECTED]> writes:
AMT> I would like to help improve SMP scalability on this workload.  If you
AMT> have questions or comments about the above results, or if you are
AMT> conducting similar tests, please send email to
AMT> [EMAIL PROTECTED]  I have some ideas on my next steps,
AMT> but would like to discuss first.


Did you check the vmstat results for each benchmark?

Most of the problems are caused by the kernel. If you look at the vmstat
results, more than 80% of CPU time is spent in the kernel.

It's true that the heavy kernel overhead is due to Samba: Samba generates
lots and lots of requests against the kernel (not only disk I/O, but also
a lot of signal handling, etc.).

So, there are really two things we need to do.

1) Make Linux more scalable.
   (This sometimes looks like tuning, but it's really bug
   fixing. So don't ask the performance team to tune. Let them FIX.)

2) Make Samba work with fewer signals.
   This means: don't call useless system calls, use shared memory
   more effectively, divide the Samba source into an OS-dependent part
   and an OS-independent part so that you can tune for a specific OS
   and still have a wide userland, etc.
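
The vmstat check suggested above can be sketched in Python. This is only an illustration: the function name and the sample vmstat output are invented for the example (they are not measurements from this thread), and the parser assumes the usual layout with a header row containing "us" and "sy" columns.

```python
def mean_system_cpu(vmstat_text):
    """Average the 'sy' (kernel CPU %) column over all sample rows."""
    lines = [line.split() for line in vmstat_text.strip().splitlines()]
    # The column header is the row that names both the "us" and "sy" fields.
    header = next(l for l in lines if "us" in l and "sy" in l)
    sy_index = header.index("sy")
    # Sample rows start with a number and have one value per header column.
    rows = [l for l in lines if l and l[0].isdigit() and len(l) == len(header)]
    return sum(int(r[sy_index]) for r in rows) / len(rows)


# Invented sample output, for illustration only (NOT data from the benchmark).
SAMPLE = """
 r  b  swpd   free   buff  cache  si  so  bi  bo   in   cs us sy id
 3  0     0  10240   2048  40960   0   0   5  20 9000 4000  8 85  7
 4  0     0  10200   2048  41000   0   0   4  22 9100 4100 10 82  8
"""

print(f"mean kernel CPU share: {mean_system_cpu(SAMPLE):.1f}%")
```

A consistently high "sy" figure relative to "us" is what would support the claim that most of the time goes to the kernel rather than to smbd's own code.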
 
Kenichi Okuyama.



Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Chris Evans


On Wed, 9 May 2001, Alan Cox wrote:

> > significant problems with lockmeter.  csum_partial_copy_generic was the
> > highest % in profile, at 4.34%.  I'll see if we can get some space on
>
> Are you using Anton's optimisations to samba to use sendfile?

And you might like to try 2.4.4 (I saw 2.4.0 and 2.4.3 mentioned). 2.4.4
has the zerocopy TCP stuff (or was it 2.4.3 :)

Also, if the load is not disk limited, you might like to try Mingo's
pagecache/timers scalability patches, etc.

Cheers
Chris




Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Bruce Allan


Andrew Theurer wrote:
> I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on
> http://lse.sourceforge.net to post the test data.

The Netfinity system that you are using has two different supported GigE
adapters.  I assume you are using one of these types - Netfinity Gigabit
Ethernet Adapter (19K4401) and the Netfinity Gigabit Ethernet SX Server
Adapter (06P3701); using the acenic.c and e1000.c drivers, respectively.
From what I understand after initial perusal of the two drivers, the former
has receive checksumming support on the adapter itself while the latter,
the one you are using, does not support hardware checksumming (at least, it
is not enabled by the driver).
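
For context on what is being offloaded: csum_partial_copy_generic copies packet data while computing the Internet checksum in software. A minimal Python sketch of that checksum (the 16-bit ones' complement sum described in RFC 1071) follows. It shows the arithmetic only; it is not the kernel's optimized routine, and the function name is made up for the example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Example words taken from the worked example in RFC 1071.
print(hex(internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))))
```

When the adapter verifies this sum on receive, the kernel can skip the per-byte pass over incoming data, which is why the choice of adapter and driver matters for the profile above.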

Are you able to re-run your tests with GigE adapters that support
checksumming on the hardware instead of doing it in the kernel?  If not, I
will be running similar tests in a very similar configuration (with the
19K4401 adapters) in the near future and can share results if you'd like.

Bruce Allan/Beaverton/IBM
IBM Linux Technology Center - OS Gold
503-578-4187   T/L 775-4187
[EMAIL PROTECTED]





Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Andrew M. Theurer

Alan Cox wrote:
>
> > significant problems with lockmeter.  csum_partial_copy_generic was the
> > highest % in profile, at 4.34%.  I'll see if we can get some space on
> 
> Are you using Anton's optimisations to samba to use sendfile?
> 
> Alan

Not yet.  As I understand it, we need a supported NIC to take advantage
of the sendfile/zero copy patch.  Once we have the HW, we will use it.

Thanks,

Andrew Theurer



Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Alan Cox

> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on

Are you using Anton's optimisations to samba to use sendfile?

Alan




Re: [Lse-tech] Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Christoph Hellwig

On Wed, May 09, 2001 at 12:30:35PM -0500, Andrew M. Theurer wrote:
> I do have kernprof ACG and lockmeter for a 4P run.  We saw no
> significant problems with lockmeter.  csum_partial_copy_generic was the
> highest % in profile, at 4.34%.  I'll see if we can get some space on
> http://lse.sourceforge.net to post the test data.

Maybe you should try kernel 2.4.4 (with zerocopy TCP/IP) and Anton's
sendfile for samba patch.  A copy of the latter was posted to lkml - see
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0101.3/0484.html,
even if that may be unusable due to HTML crappiness.
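
To illustrate the idea behind a sendfile-based send path (this is not Anton's patch, just a sketch of the system call it relies on), here is a small Python example using os.sendfile on Linux. The helper names send_file and demo are invented for the example; whether the transfer is truly zero-copy also depends on kernel and NIC support.

```python
import os
import socket
import tempfile


def send_file(sock, path):
    """Send a whole file over a connected socket with sendfile(2)."""
    sent = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent < size:
            # The kernel moves data from the page cache to the socket;
            # the file contents never pass through a user-space buffer.
            n = os.sendfile(sock.fileno(), f.fileno(), sent, size - sent)
            if n == 0:
                break
            sent += n
    return sent


def demo():
    """Round-trip a small temporary file through a local socket pair."""
    a, b = socket.socketpair()
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello samba")
        tmp.flush()
        send_file(a, tmp.name)
    a.close()
    data = b.recv(64)
    b.close()
    return data
```

The contrast is with a read()/write() loop, where every block is copied into the server process and back out again before it reaches the network stack.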

Christoph

-- 
Of course it doesn't work. We've performed a software upgrade.



Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Andrew M. Theurer

I do have kernprof ACG and lockmeter for a 4P run.  We saw no
significant problems with lockmeter.  csum_partial_copy_generic was the
highest % in profile, at 4.34%.  I'll see if we can get some space on
http://lse.sourceforge.net to post the test data.

Andrew Theurer

Mike Kravetz wrote:
> 
> On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> >
> > I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> > workload with Samba, and I wanted to get some feedback on results so
> > far.
> 
> Do you have any kernel profile or lock contention data?
> 
> --
> Mike Kravetz [EMAIL PROTECTED]
> IBM Linux Technology Center



Re: Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Mike Kravetz

On Wed, May 09, 2001 at 11:29:22AM -0500, Andrew M. Theurer wrote:
> 
> I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
> workload with Samba, and I wanted to get some feedback on results so
> far.

Do you have any kernel profile or lock contention data?

-- 
Mike Kravetz [EMAIL PROTECTED]
IBM Linux Technology Center



Linux 2.4 Scalability, Samba, and Netbench

2001-05-09 Thread Andrew M. Theurer

Hello,

I am evaluating Linux 2.4 SMP scalability, using Netbench(r) as a
workload with Samba, and I wanted to get some feedback on results so
far.  I would appreciate comments and any suggestions for improving
scalability on this workload.

The environment consists of an Intel Profusion based SMP with 8 x 700
MHz Xeon, 1 MB L2, 14+ GB RAM, ServeRAID, 8 Intel ethernet cards (IBM
Netfinity 8500R).  There are 16 clients (500 MHz PII, 128 MB) running
Windows NT.  I tested uniprocessor, 2-way, and 4-way SMP
configurations.  Future plans include testing 8-way performance when
more test clients are available.  Netbench(r) 7.0.1 was used with the
enterprise disk suite test.  The test was modified to use 2 engines per
client, and the range of test clients was changed from 1-60 to 8-16
(for 2P & 4P) and 4-12 (for uniprocessor).

My initial results for linux 2.4.0, ext2 are as follows:

# Eng   [UP]   [2P]   [4P]
  08    149
  12    199
  16    227    236    260
  20    193    272    317    Mbps
  24    223    283    369
  28           285    396
  32           285    405

Same test, but with IRQ to processor affinity for 2P & 4P on the 8
ethernet cards:
# Eng   [2P]   [4P]
  16    231    259
  20    278    297
  24    293    320    Mbps
  28    297    365
  32    299    399*
*Still investigating; we had some cpu idle time
 on the 4P/32 engines, but not on the test configuration
 without IRQ affinity.
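
For readers reproducing the IRQ affinity runs: on 2.4 kernels this kind of binding is typically done by writing a hexadecimal CPU bitmask to /proc/irq/<n>/smp_affinity. The Python sketch below only computes such masks for a round-robin spread. The IRQ numbers are hypothetical, and the post does not say which assignment was actually used.

```python
def round_robin_affinity(irqs, num_cpus):
    """Map each IRQ number to a hex bitmask selecting exactly one CPU."""
    masks = {}
    for position, irq in enumerate(irqs):
        cpu = position % num_cpus          # cycle through CPUs 0..num_cpus-1
        masks[irq] = format(1 << cpu, "x")  # bit N set means CPU N
    return masks


# Hypothetical IRQ numbers for 8 ethernet cards (not from the test system).
EXAMPLE_IRQS = [16, 17, 18, 19, 20, 21, 22, 23]

for irq, mask in round_robin_affinity(EXAMPLE_IRQS, 4).items():
    # Printed for review only; nothing is written to /proc here.
    print(f"echo {mask} > /proc/irq/{irq}/smp_affinity")
```

With 8 cards and 4 processors this spread puts two cards on each CPU, which is one plausible layout for the 4P runs.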

And for linux 2.4.3 with reiserfs:
# Eng   [UP]   [2P]   [4P]
  08    130
  12    190
  16    203    210    231
  20    190    235    279
  24    200    249    319    Mbps
  28           239    360
  32           251    335

Same, but with IRQ affinity for 2P & 4P on the 8 ethernet cards:
# Eng   [2P]   [4P]
  16    224    236
  20    220    308
  24    252    331    Mbps
  28    269    375
  32    267    382

 --All results in Mbps, using Netbench(r) 7.0.1 and Samba 2.0.7
 --Netbench(r) is available at http://www.netbench.com
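
One way to read the tables above is as speedup over uniprocessor at the same engine count. A short Python sketch using the linux 2.4.0 / ext2 figures without IRQ affinity, restricted to the rows where all three configurations were measured (the helper name speedup is invented here):

```python
# Throughput in Mbps, copied from the linux 2.4.0 / ext2 table above.
RESULTS = {
    16: {"UP": 227, "2P": 236, "4P": 260},
    20: {"UP": 193, "2P": 272, "4P": 317},
    24: {"UP": 223, "2P": 283, "4P": 369},
}


def speedup(row, config):
    """Throughput of `config` relative to uniprocessor for one engine count."""
    return row[config] / row["UP"]


for engines, row in sorted(RESULTS.items()):
    print(
        f"{engines} engines: "
        f"2P x{speedup(row, '2P'):.2f}, 4P x{speedup(row, '4P'):.2f}"
    )
```

Ideal scaling would give 2.0 and 4.0 for 2P and 4P respectively, so the gap between those and the computed ratios is a rough measure of how much scalability is being lost on this workload.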

I would like to help improve SMP scalability on this workload.  If you
have questions or comments about the above results, or if you are
conducting similar tests, please send email to
[EMAIL PROTECTED]  I have some ideas on my next steps,
but would like to discuss first.

Regards,

Andrew Theurer


