Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-20 Thread Jakob Oestergaard
On Tue, Apr 19, 2005 at 06:46:28PM -0400, Trond Myklebust wrote: > On Tue, 19.04.2005 at 21:45 (+0200), Jakob Oestergaard wrote: > > > It mounts a home directory from a 2.6.6 NFS server - the client and > > server are on a hub'ed 100Mbit network. > > > > On the earlier 2.6 client I/O

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-20 Thread Jakob Oestergaard
On Tue, Apr 19, 2005 at 06:46:28PM -0400, Trond Myklebust wrote: On Tue, 19.04.2005 at 21:45 (+0200), Jakob Oestergaard wrote: It mounts a home directory from a 2.6.6 NFS server - the client and server are on a hub'ed 100Mbit network. On the earlier 2.6 client I/O performance was as

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-19 Thread Trond Myklebust
On Tue, 19.04.2005 at 21:45 (+0200), Jakob Oestergaard wrote: > It mounts a home directory from a 2.6.6 NFS server - the client and > server are on a hub'ed 100Mbit network. > > On the earlier 2.6 client I/O performance was as one would expect on > hub'ed 100Mbit - meaning, not exactly

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-19 Thread Jakob Oestergaard
On Tue, Apr 12, 2005 at 11:28:43AM +0200, Jakob Oestergaard wrote: ... > > But still, guys, it is the *same* server with tg3 that runs well with a > 2.4 client but poorly with a 2.6 client. > > Maybe I'm just staring myself blind at this, but I can't see how a > general problem on the server

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-19 Thread Jakob Oestergaard
On Tue, Apr 12, 2005 at 11:28:43AM +0200, Jakob Oestergaard wrote: ... But still, guys, it is the *same* server with tg3 that runs well with a 2.4 client but poorly with a 2.6 client. Maybe I'm just staring myself blind at this, but I can't see how a general problem on the server (such as

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-19 Thread Trond Myklebust
On Tue, 19.04.2005 at 21:45 (+0200), Jakob Oestergaard wrote: It mounts a home directory from a 2.6.6 NFS server - the client and server are on a hub'ed 100Mbit network. On the earlier 2.6 client I/O performance was as one would expect on hub'ed 100Mbit - meaning, not exactly stellar,

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-12 Thread Jakob Oestergaard
On Tue, Apr 12, 2005 at 11:03:29AM +1000, Greg Banks wrote: > On Tue, 2005-04-12 at 01:42, Jakob Oestergaard wrote: > > Yes, as far as I know - the Broadcom Tigon3 driver does not have the > > option of enabling/disabling RX polling (if we agree that is what we're > > talking about), but looking

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-12 Thread Jakob Oestergaard
On Tue, Apr 12, 2005 at 11:03:29AM +1000, Greg Banks wrote: On Tue, 2005-04-12 at 01:42, Jakob Oestergaard wrote: Yes, as far as I know - the Broadcom Tigon3 driver does not have the option of enabling/disabling RX polling (if we agree that is what we're talking about), but looking in

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Greg Banks
On Tue, 2005-04-12 at 01:42, Jakob Oestergaard wrote: > Yes, as far as I know - the Broadcom Tigon3 driver does not have the > option of enabling/disabling RX polling (if we agree that is what we're > talking about), but looking in tg3.c it seems that it *always* > unconditionally uses NAPI...

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Mon, Apr 11, 2005 at 11:21:45AM -0400, Trond Myklebust wrote: > On Mon, 11.04.2005 at 16:41 (+0200), Jakob Oestergaard wrote: > > > > That can mean either that the server is dropping fragments, or that the > > > client is dropping the replies. Can you generate a similar tcpdump on > > > the

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Trond Myklebust
On Mon, 11.04.2005 at 16:41 (+0200), Jakob Oestergaard wrote: > > That can mean either that the server is dropping fragments, or that the > > client is dropping the replies. Can you generate a similar tcpdump on > > the server? > > Certainly; http://unthought.net/sparrow.dmp.bz2 So, it

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Mon, Apr 11, 2005 at 10:35:25AM -0400, Trond Myklebust wrote: > On Mon, 11.04.2005 at 15:47 (+0200), Jakob Oestergaard wrote: > > > Certainly; > > > > http://unthought.net/binary.dmp.bz2 > > > > I got an 'invalid snaplen' with the 9 you suggested, the above dump > > is done with 9000

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Trond Myklebust
On Mon, 11.04.2005 at 15:47 (+0200), Jakob Oestergaard wrote: > Certainly; > > http://unthought.net/binary.dmp.bz2 > > I got an 'invalid snaplen' with the 9 you suggested, the above dump > is done with 9000 - if you need another snaplen please just let me know. So, the RPC itself looks

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Mon, Apr 11, 2005 at 08:35:39AM -0400, Trond Myklebust wrote: ... > That certainly shouldn't be the case (and isn't on any of my setups). Is > the behaviour identical same on both the PIII and the Opteron systems? The dual opteron is the nfs server The dual athlon is the 2.4 nfs client The

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Trond Myklebust
On Mon, 11.04.2005 at 09:48 (+0200), Jakob Oestergaard wrote: > tcp with timeo=600 causes retransmits (as seen with nfsstat) to drop to > zero. Good. > File Block Num Seq Read Rand Read Seq Write Rand Write > DirSize Size Thr Rate (CPU%) Rate (CPU%) Rate (CPU%)

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Sat, Apr 09, 2005 at 05:52:32PM -0400, Trond Myklebust wrote: > On Sat, 09.04.2005 at 23:35 (+0200), Jakob Oestergaard wrote: > > > 2.6.11.6: (dual PIII 1GHz, 2G RAM, Intel e1000) > > > > File Block Num Seq Read Rand Read Seq Write Rand Write > > DirSize Size

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Sat, Apr 09, 2005 at 05:52:32PM -0400, Trond Myklebust wrote: On Sat, 09.04.2005 at 23:35 (+0200), Jakob Oestergaard wrote: 2.6.11.6: (dual PIII 1GHz, 2G RAM, Intel e1000) File Block Num Seq Read Rand Read Seq Write Rand Write DirSize Size Thr Rate

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Trond Myklebust
On Mon, 11.04.2005 at 09:48 (+0200), Jakob Oestergaard wrote: tcp with timeo=600 causes retransmits (as seen with nfsstat) to drop to zero. Good. File Block Num Seq Read Rand Read Seq Write Rand Write DirSize Size Thr Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Mon, Apr 11, 2005 at 08:35:39AM -0400, Trond Myklebust wrote: ... That certainly shouldn't be the case (and isn't on any of my setups). Is the behaviour identical same on both the PIII and the Opteron systems? The dual opteron is the nfs server The dual athlon is the 2.4 nfs client The

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Trond Myklebust
On Mon, 11.04.2005 at 15:47 (+0200), Jakob Oestergaard wrote: Certainly; http://unthought.net/binary.dmp.bz2 I got an 'invalid snaplen' with the 9 you suggested, the above dump is done with 9000 - if you need another snaplen please just let me know. So, the RPC itself looks good,

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Mon, Apr 11, 2005 at 10:35:25AM -0400, Trond Myklebust wrote: On Mon, 11.04.2005 at 15:47 (+0200), Jakob Oestergaard wrote: Certainly; http://unthought.net/binary.dmp.bz2 I got an 'invalid snaplen' with the 9 you suggested, the above dump is done with 9000 - if you need

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Trond Myklebust
On Mon, 11.04.2005 at 16:41 (+0200), Jakob Oestergaard wrote: That can mean either that the server is dropping fragments, or that the client is dropping the replies. Can you generate a similar tcpdump on the server? Certainly; http://unthought.net/sparrow.dmp.bz2 So, it looks to me

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Jakob Oestergaard
On Mon, Apr 11, 2005 at 11:21:45AM -0400, Trond Myklebust wrote: On Mon, 11.04.2005 at 16:41 (+0200), Jakob Oestergaard wrote: That can mean either that the server is dropping fragments, or that the client is dropping the replies. Can you generate a similar tcpdump on the server?

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-11 Thread Greg Banks
On Tue, 2005-04-12 at 01:42, Jakob Oestergaard wrote: Yes, as far as I know - the Broadcom Tigon3 driver does not have the option of enabling/disabling RX polling (if we agree that is what we're talking about), but looking in tg3.c it seems that it *always* unconditionally uses NAPI... I've

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-09 Thread Trond Myklebust
On Sat, 09.04.2005 at 23:35 (+0200), Jakob Oestergaard wrote: > 2.6.11.6: (dual PIII 1GHz, 2G RAM, Intel e1000) > > File Block Num Seq Read Rand Read Seq Write Rand Write > DirSize Size Thr Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%) > --- -- ---

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-09 Thread Jakob Oestergaard
On Thu, Apr 07, 2005 at 12:17:51PM -0400, Trond Myklebust wrote: > On Thu, 07.04.2005 at 17:38 (+0200), Jakob Oestergaard wrote: > > > I tweaked the VM a bit, put the following in /etc/sysctl.conf: > > vm.dirty_writeback_centisecs=100 > > vm.dirty_expire_centisecs=200 > > > > The defaults

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-09 Thread Jakob Oestergaard
On Thu, Apr 07, 2005 at 12:17:51PM -0400, Trond Myklebust wrote: On Thu, 07.04.2005 at 17:38 (+0200), Jakob Oestergaard wrote: I tweaked the VM a bit, put the following in /etc/sysctl.conf: vm.dirty_writeback_centisecs=100 vm.dirty_expire_centisecs=200 The defaults are 500 and

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-09 Thread Trond Myklebust
On Sat, 09.04.2005 at 23:35 (+0200), Jakob Oestergaard wrote: 2.6.11.6: (dual PIII 1GHz, 2G RAM, Intel e1000) File Block Num Seq Read Rand Read Seq Write Rand Write DirSize Size Thr Rate (CPU%) Rate (CPU%) Rate (CPU%) Rate (CPU%) --- -- --- ---

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Trond Myklebust
On Thu, 07.04.2005 at 17:38 (+0200), Jakob Oestergaard wrote: > I tweaked the VM a bit, put the following in /etc/sysctl.conf: > vm.dirty_writeback_centisecs=100 > vm.dirty_expire_centisecs=200 > > The defaults are 500 and 3000 respectively... > > This improved things a lot; the client is
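
Both sysctls quoted in the message above are expressed in centiseconds (hundredths of a second). The following is a minimal Python sketch, not taken from the thread, that converts the quoted values to seconds so the size of the change is easier to see. The function and dictionary names are illustrative assumptions; only the sysctl names and the four numbers come from the message.

```python
# Illustrative sketch only: convert the centisecond sysctl values quoted in
# the thread into seconds. Function and variable names are hypothetical.


def centisecs_to_seconds(value_cs: int) -> float:
    """Convert a value given in centiseconds (1/100 s) to seconds."""
    return value_cs / 100.0


# Defaults as stated in the quoted message ("500 and 3000 respectively").
DEFAULTS = {
    "vm.dirty_writeback_centisecs": 500,
    "vm.dirty_expire_centisecs": 3000,
}

# Values the poster put in /etc/sysctl.conf.
TWEAKED = {
    "vm.dirty_writeback_centisecs": 100,
    "vm.dirty_expire_centisecs": 200,
}


def describe(settings: dict) -> dict:
    """Return the same settings with every value converted to seconds."""
    return {name: centisecs_to_seconds(value) for name, value in settings.items()}


if __name__ == "__main__":
    for label, settings in (("default", DEFAULTS), ("tweaked", TWEAKED)):
        for name, seconds in describe(settings).items():
            print(f"{label:8s} {name} = {seconds:g} s")
```

Read this way, the tweak shortens the writeback interval from 5 s to 1 s and the expiry age of dirty data from 30 s to 2 s.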

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Greg Banks
On Thu, Apr 07, 2005 at 05:38:48PM +0200, Jakob Oestergaard wrote: > On Thu, Apr 07, 2005 at 09:19:06AM +1000, Greg Banks wrote: > ... > > How large is the client's RAM? > > 2GB - (32 bit kernel because it's dual PIII, so I use highmem) Ok, that's probably not enough to fully trigger some of

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Jakob Oestergaard
On Thu, Apr 07, 2005 at 09:19:06AM +1000, Greg Banks wrote: ... > How large is the client's RAM? 2GB - (32 bit kernel because it's dual PIII, so I use highmem) A few more details: With standard VM settings, the client will be laggy during the copy, but it will also have a load average around

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Jakob Oestergaard
On Wed, Apr 06, 2005 at 05:28:56PM -0400, Trond Myklebust wrote: ... > A look at "nfsstat" might help, as might "netstat -s". > > In particular, I suggest looking at the "retrans" counter in nfsstat. When doing a 'cp largefile1 largefile2' on the client, I see approx. 10 retransmissions per
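
A figure such as "retransmissions per second" is normally derived from two readings of a cumulative counter taken a known interval apart. The following Python sketch shows that arithmetic under stated assumptions: how the counter is read (for example from the nfsstat output) is left out, and the sample readings in the usage section are made-up numbers, not measurements from the thread.

```python
# Illustrative sketch only: derive a per-second rate from two readings of a
# cumulative counter such as the "retrans" counter mentioned above.
# The function name and the sample readings are hypothetical.


def retrans_per_second(first: int, second: int, interval_s: float) -> float:
    """Return the average counter increase per second between two readings."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if second < first:
        # A cumulative counter should never decrease; a drop suggests a
        # reset (e.g. a remount or reboot) between the two readings.
        raise ValueError("counter decreased between readings")
    return (second - first) / interval_s


if __name__ == "__main__":
    # Made-up readings taken 10 seconds apart.
    rate = retrans_per_second(first=1200, second=1300, interval_s=10.0)
    print(f"{rate:g} retransmissions per second")
```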

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Jakob Oestergaard
On Wed, Apr 06, 2005 at 05:28:56PM -0400, Trond Myklebust wrote: ... A look at nfsstat might help, as might netstat -s. In particular, I suggest looking at the retrans counter in nfsstat. When doing a 'cp largefile1 largefile2' on the client, I see approx. 10 retransmissions per second in

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Jakob Oestergaard
On Thu, Apr 07, 2005 at 09:19:06AM +1000, Greg Banks wrote: ... How large is the client's RAM? 2GB - (32 bit kernel because it's dual PIII, so I use highmem) A few more details: With standard VM settings, the client will be laggy during the copy, but it will also have a load average around 10

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Greg Banks
On Thu, Apr 07, 2005 at 05:38:48PM +0200, Jakob Oestergaard wrote: On Thu, Apr 07, 2005 at 09:19:06AM +1000, Greg Banks wrote: ... How large is the client's RAM? 2GB - (32 bit kernel because it's dual PIII, so I use highmem) Ok, that's probably not enough to fully trigger some of the

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-07 Thread Trond Myklebust
On Thu, 07.04.2005 at 17:38 (+0200), Jakob Oestergaard wrote: I tweaked the VM a bit, put the following in /etc/sysctl.conf: vm.dirty_writeback_centisecs=100 vm.dirty_expire_centisecs=200 The defaults are 500 and 3000 respectively... This improved things a lot; the client is now

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-06 Thread Greg Banks
On Wed, Apr 06, 2005 at 06:01:23PM +0200, Jakob Oestergaard wrote: > > Problem; during simple tests such as a 'cp largefile0 largefile1' on the > client (under the mountpoint from the NFS server), the client becomes > extremely laggy, NFS writes are slow, and I see very high CPU > utilization by

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-06 Thread Trond Myklebust
On Wed, 06.04.2005 at 18:01 (+0200), Jakob Oestergaard wrote: > What do I do? > > Performance sucks and the profiles do not make sense... > > Any suggestions would be greatly appreciated, A look at "nfsstat" might help, as might "netstat -s". In particular, I suggest looking at the

bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-06 Thread Jakob Oestergaard
Hello list, Setup; NFS server (dual opteron, HW RAID, SCA disk enclosure) on 2.6.11.6 NFS client (dual PIII) on 2.6.11.6 Both on switched gigabit ethernet - I use NFSv3 over UDP (tried TCP but this makes no difference). Problem; during simple tests such as a 'cp largefile0 largefile1' on

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-06 Thread Trond Myklebust
On Wed, 06.04.2005 at 18:01 (+0200), Jakob Oestergaard wrote: What do I do? Performance sucks and the profiles do not make sense... Any suggestions would be greatly appreciated, A look at nfsstat might help, as might netstat -s. In particular, I suggest looking at the retrans counter

Re: bdflush/rpciod high CPU utilization, profile does not make sense

2005-04-06 Thread Greg Banks
On Wed, Apr 06, 2005 at 06:01:23PM +0200, Jakob Oestergaard wrote: Problem; during simple tests such as a 'cp largefile0 largefile1' on the client (under the mountpoint from the NFS server), the client becomes extremely laggy, NFS writes are slow, and I see very high CPU utilization by