Re: Massive slowdown when re-querying large nfs dir

2007-11-08 Thread Andrew Morton
> On Thu, 8 Nov 2007 10:44:35 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:
> Andrew Morton wrote:
> > > > I would suggest getting a 'tcpdump -s0' trace and seeing (with
> > > > wireshark) what is different between the various cases.
> > >
> > > Thanks Neil for looking into this.  Your suggestion has already been
> > > answered in a previous post, where the difference has been attributed to
> > > "ls -l" inducing lookup for the first try, which is fast, and getattr
> > > for later tries, which is super-slow.
> > >
> > > Now it's easy to blame the userland rpc.nfs.V2 server for this, but
> > > what's not clear is how come 2.4.31 handles getattr faster than 2.6.23?
> >
> > We broke 2.6?  It'd be interesting to run the ls in an infinite loop on
> > the client them start poking at the server.  Is the 2.6 server doing
> > physical IO?  Is the 2.6 server consuming more system time?  etc.  A basic
> > `vmstat 1' trace for both 2.4 and 2.6 would be a starting point.
> >
> > Could be that there's some additional latency caused by networking
> > changes, too.  I expect the tcpdump/wireshark/etc traces would have
> > sufficient resolution for us to be able to see that.
> 
> The problem turns out to be "tune2fs -O dir_index".
> Removing that feature resolves the big slowdown.

Doh.  Well worked-out.

> Does 2.4.31 support this feature?

No.  This explains it.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Massive slowdown when re-querying large nfs dir

2007-11-07 Thread Al Boldi
Andrew Morton wrote:
> > > I would suggest getting a 'tcpdump -s0' trace and seeing (with
> > > wireshark) what is different between the various cases.
> >
> > Thanks Neil for looking into this.  Your suggestion has already been
> > answered in a previous post, where the difference has been attributed to
> > "ls -l" inducing lookup for the first try, which is fast, and getattr
> > for later tries, which is super-slow.
> >
> > Now it's easy to blame the userland rpc.nfs.V2 server for this, but
> > what's not clear is how come 2.4.31 handles getattr faster than 2.6.23?
>
> We broke 2.6?  It'd be interesting to run the ls in an infinite loop on
> the client them start poking at the server.  Is the 2.6 server doing
> physical IO?  Is the 2.6 server consuming more system time?  etc.  A basic
> `vmstat 1' trace for both 2.4 and 2.6 would be a starting point.
>
> Could be that there's some additional latency caused by networking
> changes, too.  I expect the tcpdump/wireshark/etc traces would have
> sufficient resolution for us to be able to see that.

The problem turns out to be "tune2fs -O dir_index".
Removing that feature resolves the big slowdown.

Does 2.4.31 support this feature?

Neil Brown wrote:
> Maybe an "strace -tt" of the nfs server might show some significant
> difference.

###
# ls -l <3K dir entry> (first try after mount inducing lookup) in ~3sec
# strace -tt rpc.nfsd

08:28:14.668557 time([1194499694])  = 1194499694
08:28:14.669420 alarm(5)= 2
08:28:14.669667 select(1024, [4 5], NULL, NULL, NULL) = 1 (in [4])
08:28:14.670142 recvfrom(4, 
"\275\3607{\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\2\0\0\0\4"..., 8800, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 
[16]) = 116
08:28:14.670554 time(NULL)  = 1194499694
08:28:14.670711 time([1194499694])  = 1194499694
08:28:14.670875 lstat("/a/x", {st_mode=S_IFDIR|0755, st_size=36864, ...}) = 0
08:28:14.671134 time([1194499694])  = 1194499694
08:28:14.671302 lstat("/a/x/3619", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
08:28:14.671530 time([1194499694])  = 1194499694
08:28:14.671701 alarm(2)= 5
08:28:14.671903 time([1194499694])  = 1194499694
08:28:14.672060 lstat("/a/x/3619", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
08:28:14.672305 time([1194499694])  = 1194499694
08:28:14.672508 sendto(4, 
"\275\3607{\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 16) 
= 128
08:28:14.672909 time([1194499694])  = 1194499694
08:28:14.673869 alarm(5)= 2
08:28:14.674145 select(1024, [4 5], NULL, NULL, NULL) = 1 (in [4])
08:28:14.674589 recvfrom(4, 
"\276\3607{\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\2\0\0\0\4"..., 8800, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 
[16]) = 116
08:28:14.675003 time(NULL)  = 1194499694
08:28:14.675160 time([1194499694])  = 1194499694
08:28:14.675321 lstat("/a/x", {st_mode=S_IFDIR|0755, st_size=36864, ...}) = 0
08:28:14.675581 time([1194499694])  = 1194499694
08:28:14.675749 lstat("/a/x/3631", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
08:28:14.675979 time([1194499694])  = 1194499694
08:28:14.676150 alarm(2)= 5
08:28:14.676348 time([1194499694])  = 1194499694
08:28:14.676505 lstat("/a/x/3631", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
08:28:14.676746 time([1194499694])  = 1194499694
08:28:14.676952 sendto(4, 
"\276\3607{\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 128, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 16) 
= 128

##
# ls -l <3K dir entry> (second try after mount inducing getattr) in ~11sec
# strace -tt rpc.nfsd

08:28:40.963668 time([1194499720])  = 1194499720
08:28:40.964525 alarm(5)= 2
08:28:40.964772 select(1024, [4 5], NULL, NULL, NULL) = 1 (in [4])
08:28:40.965215 recvfrom(4, 
",\3747{\0\0\0\0\0\0\0\2\0\1\206\243\0\0\0\2\0\0\0\1\0\0"..., 8800, 0, 
{sa_family=AF_INET, sin_port=htons(888), sin_addr=inet_addr("10.0.0.111")}, 
[16]) = 108
08:28:40.965609 time(NULL)  = 1194499720
08:28:40.965763 time([1194499720])  = 1194499720
08:28:40.965941 stat("/", {st_mode=S_IFDIR|0755, st_size=2048, ...}) = 0
08:28:40.966176 setfsuid(0) = 0
08:28:40.966329 stat("/", {st_mode=S_IFDIR|0755, st_size=2048, ...}) = 0
08:28:40.966539 stat("/", {st_mode=S_IFDIR|0755, st_size=2048, ...}) = 0
08:28:40.966748 open("/", O_RDONLY|O_NONBLOCK) = 0
08:28:40.966919 fcntl(0, F_SETFD, FD_CLOEXEC) = 0
08:28:40.967084 lseek(0, 0, SEEK_CUR)   = 0
08:28:40.967240 getdents(0, /* 71 entries */, 3933) = 1220
08:28:40.968195 close(0)= 0
08:28:40.968351 stat("/a/", {st_mode=S_IFDIR|0755, st_size=1024, ...}) = 0
08:28:40.968583 stat("/a/", 

Re: Massive slowdown when re-querying large nfs dir - CORRECTION

2007-11-07 Thread Neil Brown
On Thursday November 8, [EMAIL PROTECTED] wrote:
> 
> Not really a credible difference as the reported difference is between
> two *clients* and the speed of getattr vs lookup would depend on the
> *server*. 

Sorry, my bad.  I misread your original problem description.  It would
appear to be a server difference.

Maybe an "strace -tt" of the nfs server might show some significant
difference.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Massive slowdown when re-querying large nfs dir

2007-11-07 Thread Neil Brown
On Wednesday November 7, [EMAIL PROTECTED] wrote:
> Neil Brown wrote:
> >
> > I would suggest getting a 'tcpdump -s0' trace and seeing (with
> > wireshark) what is different between the various cases.
> 
> Thanks Neil for looking into this.  Your suggestion has already been answered 
> in a previous post, where the difference has been attributed to "ls -l" 
> inducing lookup for the first try, which is fast, and getattr for later 
> tries, which is super-slow.

Not really a credible difference as the reported difference is between
two *clients* and the speed of getattr vs lookup would depend on the
*server*. 

> 
> Now it's easy to blame the userland rpc.nfs.V2 server for this, but what's 
> not clear is how come 2.4.31 handles getattr faster than 2.6.23?

I suspect a more detailed analysis of the traces is in order.  I
strongly suspect you will see a difference between the two clients,
and you have only reported a difference between the first and second
"ls -l" (unless I missed some email).

It seems most likely that 2.6 is issuing substantially more GETATTR
requests than 2.4.  There have certainly been reports of this in the
past and they have been either fixed or justified.
This may be a new situation.  Or it may be that 2.4 was being fast by
being incorrect in some way.  Only an analysis of the logs would tell.

Maybe you would like to post the (binary, using "-s 0") traces for
both 2.4 and 2.6

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Massive slowdown when re-querying large nfs dir

2007-11-07 Thread Andrew Morton
> On Wed, 7 Nov 2007 12:36:26 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:
> Neil Brown wrote:
> > On Tuesday November 6, [EMAIL PROTECTED] wrote:
> > > > On Tue, 6 Nov 2007 14:28:11 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:
> > > > Al Boldi wrote:
> > > > > There is a massive (3-18x) slowdown when re-querying a large nfs dir
> > > > > (2k+ entries) using a simple ls -l.
> > > > >
> > > > > On 2.6.23 client and server running userland rpc.nfs.V2:
> > > > > first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> > > > > more tries: time -p ls -l <2k+ entry dir>  in ~8sec
> > > > >
> > > > > first  try: time -p ls -l <5k+ entry dir>  in ~9sec
> > > > > more tries: time -p ls -l <5k+ entry dir>  in ~180sec
> > > > >
> > > > > On 2.6.23 client and 2.4.31 server running userland rpc.nfs.V2:
> > > > > first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> > > > > more tries: time -p ls -l <2k+ entry dir>  in ~7sec
> > > > >
> > > > > first  try: time -p ls -l <5k+ entry dir>  in ~8sec
> > > > > more tries: time -p ls -l <5k+ entry dir>  in ~43sec
> > > > >
> > > > > Remounting the nfs-dir on the client resets the problem.
> > > > >
> > > > > Any ideas?
> > > >
> > > > Ok, I played some more with this, and it turns out that nfsV3 is a lot
> > > > faster.  But, this does not explain why the 2.4.31 kernel is still
> > > > over 4-times faster than 2.6.23.
> > > >
> > > > Can anybody explain what's going on?
> > >
> > > Sure, Neil can! ;)
> 
> Thanks Andrew!
> 
> > Nuh.
> > He said "userland rpc.nfs.Vx".  I only do "kernel-land NFS".  In these
> > days of high specialisation, each line of code is owned by a different
> > person, and finding the right person is hard
> >
> > I would suggest getting a 'tcpdump -s0' trace and seeing (with
> > wireshark) what is different between the various cases.
> 
> Thanks Neil for looking into this.  Your suggestion has already been answered 
> in a previous post, where the difference has been attributed to "ls -l" 
> inducing lookup for the first try, which is fast, and getattr for later 
> tries, which is super-slow.
> 
> Now it's easy to blame the userland rpc.nfs.V2 server for this, but what's 
> not clear is how come 2.4.31 handles getattr faster than 2.6.23?
> 

We broke 2.6?  It'd be interesting to run the ls in an infinite loop on the
client them start poking at the server.  Is the 2.6 server doing physical
IO?  Is the 2.6 server consuming more system time?  etc.  A basic `vmstat
1' trace for both 2.4 and 2.6 would be a starting point.

Could be that there's some additional latency caused by networking changes,
too.  I expect the tcpdump/wireshark/etc traces would have sufficient
resolution for us to be able to see that.

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Massive slowdown when re-querying large nfs dir

2007-11-07 Thread Al Boldi
Neil Brown wrote:
> On Tuesday November 6, [EMAIL PROTECTED] wrote:
> > > On Tue, 6 Nov 2007 14:28:11 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:
> > > Al Boldi wrote:
> > > > There is a massive (3-18x) slowdown when re-querying a large nfs dir
> > > > (2k+ entries) using a simple ls -l.
> > > >
> > > > On 2.6.23 client and server running userland rpc.nfs.V2:
> > > > first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> > > > more tries: time -p ls -l <2k+ entry dir>  in ~8sec
> > > >
> > > > first  try: time -p ls -l <5k+ entry dir>  in ~9sec
> > > > more tries: time -p ls -l <5k+ entry dir>  in ~180sec
> > > >
> > > > On 2.6.23 client and 2.4.31 server running userland rpc.nfs.V2:
> > > > first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> > > > more tries: time -p ls -l <2k+ entry dir>  in ~7sec
> > > >
> > > > first  try: time -p ls -l <5k+ entry dir>  in ~8sec
> > > > more tries: time -p ls -l <5k+ entry dir>  in ~43sec
> > > >
> > > > Remounting the nfs-dir on the client resets the problem.
> > > >
> > > > Any ideas?
> > >
> > > Ok, I played some more with this, and it turns out that nfsV3 is a lot
> > > faster.  But, this does not explain why the 2.4.31 kernel is still
> > > over 4-times faster than 2.6.23.
> > >
> > > Can anybody explain what's going on?
> >
> > Sure, Neil can! ;)

Thanks Andrew!

> Nuh.
> He said "userland rpc.nfs.Vx".  I only do "kernel-land NFS".  In these
> days of high specialisation, each line of code is owned by a different
> person, and finding the right person is hard
>
> I would suggest getting a 'tcpdump -s0' trace and seeing (with
> wireshark) what is different between the various cases.

Thanks Neil for looking into this.  Your suggestion has already been answered 
in a previous post, where the difference has been attributed to "ls -l" 
inducing lookup for the first try, which is fast, and getattr for later 
tries, which is super-slow.

Now it's easy to blame the userland rpc.nfs.V2 server for this, but what's 
not clear is how come 2.4.31 handles getattr faster than 2.6.23?


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Massive slowdown when re-querying large nfs dir

2007-11-06 Thread Neil Brown
On Tuesday November 6, [EMAIL PROTECTED] wrote:
> > On Tue, 6 Nov 2007 14:28:11 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:
> > Al Boldi wrote:
> > > There is a massive (3-18x) slowdown when re-querying a large nfs dir (2k+
> > > entries) using a simple ls -l.
> > >
> > > On 2.6.23 client and server running userland rpc.nfs.V2:
> > > first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> > > more tries: time -p ls -l <2k+ entry dir>  in ~8sec
> > >
> > > first  try: time -p ls -l <5k+ entry dir>  in ~9sec
> > > more tries: time -p ls -l <5k+ entry dir>  in ~180sec
> > >
> > > On 2.6.23 client and 2.4.31 server running userland rpc.nfs.V2:
> > > first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> > > more tries: time -p ls -l <2k+ entry dir>  in ~7sec
> > >
> > > first  try: time -p ls -l <5k+ entry dir>  in ~8sec
> > > more tries: time -p ls -l <5k+ entry dir>  in ~43sec
> > >
> > > Remounting the nfs-dir on the client resets the problem.
> > >
> > > Any ideas?
> > 
> > Ok, I played some more with this, and it turns out that nfsV3 is a lot 
> > faster.  But, this does not explain why the 2.4.31 kernel is still over 
> > 4-times faster than 2.6.23.
> > 
> > Can anybody explain what's going on?
> > 
> 
> Sure, Neil can! ;)

Nuh.
He said "userland rpc.nfs.Vx".  I only do "kernel-land NFS".  In these
days of high specialisation, each line of code is owned by a different
person, and finding the right person is hard

I would suggest getting a 'tcpdump -s0' trace and seeing (with
wireshark) what is different between the various cases.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Massive slowdown when re-querying large nfs dir

2007-11-06 Thread Andrew Morton
> On Tue, 6 Nov 2007 14:28:11 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:
> Al Boldi wrote:
> > There is a massive (3-18x) slowdown when re-querying a large nfs dir (2k+
> > entries) using a simple ls -l.
> >
> > On 2.6.23 client and server running userland rpc.nfs.V2:
> > first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> > more tries: time -p ls -l <2k+ entry dir>  in ~8sec
> >
> > first  try: time -p ls -l <5k+ entry dir>  in ~9sec
> > more tries: time -p ls -l <5k+ entry dir>  in ~180sec
> >
> > On 2.6.23 client and 2.4.31 server running userland rpc.nfs.V2:
> > first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> > more tries: time -p ls -l <2k+ entry dir>  in ~7sec
> >
> > first  try: time -p ls -l <5k+ entry dir>  in ~8sec
> > more tries: time -p ls -l <5k+ entry dir>  in ~43sec
> >
> > Remounting the nfs-dir on the client resets the problem.
> >
> > Any ideas?
> 
> Ok, I played some more with this, and it turns out that nfsV3 is a lot 
> faster.  But, this does not explain why the 2.4.31 kernel is still over 
> 4-times faster than 2.6.23.
> 
> Can anybody explain what's going on?
> 

Sure, Neil can! ;)
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Massive slowdown when re-querying large nfs dir

2007-11-06 Thread Al Boldi
Al Boldi wrote:
> There is a massive (3-18x) slowdown when re-querying a large nfs dir (2k+
> entries) using a simple ls -l.
>
> On 2.6.23 client and server running userland rpc.nfs.V2:
> first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> more tries: time -p ls -l <2k+ entry dir>  in ~8sec
>
> first  try: time -p ls -l <5k+ entry dir>  in ~9sec
> more tries: time -p ls -l <5k+ entry dir>  in ~180sec
>
> On 2.6.23 client and 2.4.31 server running userland rpc.nfs.V2:
> first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
> more tries: time -p ls -l <2k+ entry dir>  in ~7sec
>
> first  try: time -p ls -l <5k+ entry dir>  in ~8sec
> more tries: time -p ls -l <5k+ entry dir>  in ~43sec
>
> Remounting the nfs-dir on the client resets the problem.
>
> Any ideas?

Ok, I played some more with this, and it turns out that nfsV3 is a lot 
faster.  But, this does not explain why the 2.4.31 kernel is still over 
4-times faster than 2.6.23.

Can anybody explain what's going on?


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Massive slowdown when re-querying large nfs dir

2007-11-04 Thread Al Boldi
Matthew Wilcox wrote:
> How about tcpdumping and seeing what requests are flowing across the
> wire?  You might be able to figure out what's being done differently.

I think lookup is faster than getattr.


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Massive slowdown when re-querying large nfs dir

2007-11-04 Thread Matthew Wilcox
On Mon, Nov 05, 2007 at 07:58:38AM +0300, Al Boldi wrote:
> Any ideas?

How about tcpdumping and seeing what requests are flowing across the
wire?  You might be able to figure out what's being done differently.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Massive slowdown when re-querying large nfs dir

2007-11-04 Thread Al Boldi
There is a massive (3-18x) slowdown when re-querying a large nfs dir (2k+ 
entries) using a simple ls -l.

On 2.6.23 client and server running userland rpc.nfs.V2:
first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
more tries: time -p ls -l <2k+ entry dir>  in ~8sec

first  try: time -p ls -l <5k+ entry dir>  in ~9sec
more tries: time -p ls -l <5k+ entry dir>  in ~180sec

On 2.6.23 client and 2.4.31 server running userland rpc.nfs.V2:
first  try: time -p ls -l <2k+ entry dir>  in ~2.5sec
more tries: time -p ls -l <2k+ entry dir>  in ~7sec

first  try: time -p ls -l <5k+ entry dir>  in ~8sec
more tries: time -p ls -l <5k+ entry dir>  in ~43sec

Remounting the nfs-dir on the client resets the problem.

Any ideas?


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html