Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 16 Sep 2009 12:26:04 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: On Tue, 15 Sep 2009, Jeff Layton wrote: Yow, that version of mount.cifs is really old. I wonder if it may be passing bad mount options to the kernel? Might be interesting to strace that. Something like: # strace -f -s 256 -e mount mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx ...it'll probably have a cleartext password in it so you might want to doctor the options a bit before sending along if you do. Alternately, you might just want to try a newer version of mount.cifs and see whether that fixes this. Tried a newer version of mount.cifs without any change. Ok, good to rule that out then. I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount /company. Actually, the trace of a hanging mount would probably be interesting. Does the 32-bit capture that you sent represent a mount attempt that hung? Or was it successful? No it was successful. Hmm, ok. That isn't going to tell me as much as a mount that fails. For now, I suggest that we focus on determining why these mounts hang/fail. After that we can see whether the solution there has any bearing on why the server is so slow to respond to this particular client. What's the devname that you're giving to the mount command for the clameter dir? If there's more than 1 path component after the hostname, then the problem may be in the old version of mount.cifs. Some of them had broken handling for path prefixes. its //machinename/company/clameter So two components. Also good to know. What we should probably do at this point is track down why the 32-bit client has such a hard time mounting the clameter dir. Here's what would be most helpful: 1) some debug log info of the mount attempt: # modprobe cifs # echo 7 /proc/fs/cifs/cifsFYI ...then attempt the mount. After it hangs for a few seconds, ^c the mount to kill it. Collect the output from dmesg and send it to me. That should give me some idea of what the client is doing during this phase. If you can simultaneously capture wire traffic during the same mount attempt that would also be helpful. Cheers, -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Thu, 10 Sep 2009, Jeff Layton wrote: In any case, I think we need to look closely at what's happening at mount time. First, I'll need some other info: 1) output of /sbin/mount.cifs -V from both machines The 32 bit machine #/sbin/mount.cifs -V mount.cifs version: 1.5 mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx 64 bit machine $ /sbin/mount.cifs -V mount.cifs version: 1.12-3.4.0 mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=w2k 3) wire captures from mount attempts on both machines. Try to mount the clameter dir on both boxes and do captures of each attempt. Maybe this time use -s 0 with tcpdump so we get all of the traffic. I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount /company. There may be crackable password hashes in the captures, so you may want to send them to me privately and not cc the list. Ok will follow. -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Tue, 15 Sep 2009, Jeff Layton wrote: Yow, that version of mount.cifs is really old. I wonder if it may be passing bad mount options to the kernel? Might be interesting to strace that. Something like: # strace -f -s 256 -e mount mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx ...it'll probably have a cleartext password in it so you might want to doctor the options a bit before sending along if you do. Alternately, you might just want to try a newer version of mount.cifs and see whether that fixes this. Tried a newer version of mount.cifs without any change. I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount /company. Actually, the trace of a hanging mount would probably be interesting. Does the 32-bit capture that you sent represent a mount attempt that hung? Or was it successful? No it was successful. What's the devname that you're giving to the mount command for the clameter dir? If there's more than 1 path component after the hostname, then the problem may be in the old version of mount.cifs. Some of them had broken handling for path prefixes. its //machinename/company/clameter So two components. -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Mon, 14 Sep 2009 16:10:47 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: On Thu, 10 Sep 2009, Jeff Layton wrote: In any case, I think we need to look closely at what's happening at mount time. First, I'll need some other info: 1) output of /sbin/mount.cifs -V from both machines The 32 bit machine #/sbin/mount.cifs -V mount.cifs version: 1.5 //chiprodfs2/company /mnt -ouser=clameter,domain=xxx mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx Yow, that version of mount.cifs is really old. I wonder if it may be passing bad mount options to the kernel? Might be interesting to strace that. Something like: # strace -f -s 256 -e mount mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx ...it'll probably have a cleartext password in it so you might want to doctor the options a bit before sending along if you do. Alternately, you might just want to try a newer version of mount.cifs and see whether that fixes this. 64 bit machine $ /sbin/mount.cifs -V mount.cifs version: 1.12-3.4.0 mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=w2k 3) wire captures from mount attempts on both machines. Try to mount the clameter dir on both boxes and do captures of each attempt. Maybe this time use -s 0 with tcpdump so we get all of the traffic. I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount /company. Actually, the trace of a hanging mount would probably be interesting. Does the 32-bit capture that you sent represent a mount attempt that hung? Or was it successful? There may be crackable password hashes in the captures, so you may want to send them to me privately and not cc the list. Ok will follow. Thanks for the info, I had a look at the captures. They both look fairly similar. The main difference is that the 32-bit box doesn't seem to have sent any more calls after sending a QPathInfo call to the server for the root inode of the mount. What's the devname that you're giving to the mount command for the clameter dir? If there's more than 1 path component after the hostname, then the problem may be in the old version of mount.cifs. Some of them had broken handling for path prefixes. -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009 12:33:21 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: On Sat, 5 Sep 2009, Jeff Layton wrote: It looks like it's just taking 5s for the server to respond here. Do you happen to have a wire capture of one of these events? That may tell us more than cifsFYI info... I did a tcpdump and nothing stands out. Server acks the cmd 50 and then waits 5 seconds before sending the data. 16:23:34.336373 IP (tos 0x0, ttl 64, id 20616, offset 0, flags [DF], proto 6, length: 118) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: P 2801206064:2801206142(78) ack 468207120 win 190 16:23:34.336624 IP (tos 0x0, ttl 125, id 19869, offset 0, flags [DF], proto 6, length: 206) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: P 1:167(166) ack 78 win 64548 16:23:34.336636 IP (tos 0x0, ttl 64, id 20617, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 78:78(0) ack 167 win 190 16:23:34.336669 IP (tos 0x0, ttl 64, id 20618, offset 0, flags [DF], proto 6, length: 128) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: P 78:166(88) ack 167 win 190 16:23:34.456343 IP (tos 0x0, ttl 125, id 20045, offset 0, flags [DF], proto 6, length: 40) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . [tcp sum ok] 167:167(0) ack 166 win 64460 hiccup 16:23:39.284930 IP (tos 0x0, ttl 125, id 27544, offset 0, flags [DF], proto 6, length: 230) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 167:357(190) ack 166 win 64460 16:23:39.324060 IP (tos 0x0, ttl 64, id 20619, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 357 win 190 A binary capture would probably be easier to infer something from -- we'd be able to open it up in wireshark and get a little more info about what sort of call the client is doing. My suspicion would be that the server needs to perform an oplock break to another client before it can send the response. The only way I know how to tell that is to sniff all SMB traffic on the server and watch for oplock break calls to other clients when these stalls occur. -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009, Jeff Layton wrote: My suspicion would be that the server needs to perform an oplock break to another client before it can send the response. The only way I know how to tell that is to sniff all SMB traffic on the server and watch for oplock break calls to other clients when these stalls occur. That could be tested by switching them off right? If I do echo 0 /proc/fs/cifs/OplockEnabled and then remount the volume it should switch off oplocks? This has no effect on the stalls. -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Fri, 4 Sep 2009 12:27:35 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: This is on 32 bit x86 on a Dell 1950 After mouting a cifs share we have 5 second hiccups. Typical log output when doing a simple ls /mnt: Sep 4 16:21:43 rd-spare kernel: fs/cifs/transport.c: For smb_command 50 Sep 4 16:21:43 rd-spare kernel: fs/cifs/transport.c: Sending smb: total_len 118 Sep 4 16:21:43 rd-spare kernel: fs/cifs/inode.c: CIFS VFS: leaving cifs_revalidate (xid = 258) rc = 0 Sep 4 16:21:43 rd-spare kernel: fs/cifs/dir.c: CIFS VFS: in cifs_lookup as Xid: 263 with uid: 0 Sep 4 16:21:43 rd-spare kernel: fs/cifs/dir.c: parent inode = 0xf58d2e60 name is: AutoWire.bmp and dentry = 0xf5adb63c Sep 4 16:21:43 rd-spare kernel: fs/cifs/dir.c: NULL inode in lookup Sep 4 16:21:43 rd-spare kernel: fs/cifs/dir.c: Full path: \AutoWire.bmp inode = 0x(null) Sep 4 16:21:43 rd-spare kernel: fs/cifs/inode.c: Getting info on \AutoWire.bmp Sep 4 16:21:43 rd-spare kernel: fs/cifs/transport.c: For smb_command 50 Sep 4 16:21:43 rd-spare kernel: fs/cifs/transport.c: Sending smb: total_len 104 5 second hiccup Sep 4 16:21:48 rd-spare kernel: fs/cifs/connect.c: rfc1002 length 0xce Sep 4 16:21:48 rd-spare kernel: fs/cifs/connect.c: rfc1002 length 0xc0 (adding linux-cifs-client mailing list) It looks like it's just taking 5s for the server to respond here. Do you happen to have a wire capture of one of these events? That may tell us more than cifsFYI info... Sep 4 16:21:48 rd-spare kernel: fs/cifs/inode.c: inode 0xf5876518 old_time=26000 new_time=32751 Sep 4 16:21:48 rd-spare kernel: fs/cifs/inode.c: cifs_revalidate - inode unchanged Sep 4 16:21:48 rd-spare kernel: fs/cifs/file.c: CIFS VFS: in cifs_writepages as Xid: 264 with uid: 0 Sep 4 16:21:48 rd-spare kernel: fs/cifs/file.c: CIFS VFS: leaving cifs_writepages (xid = 264) rc = 0 Sep 4 16:21:48 rd-spare kernel: fs/cifs/inode.c: CIFS VFS: leaving cifs_revalidate (xid = 262) rc = 0 Sep 4 16:21:48 rd-spare kernel: fs/cifs/inode.c: CIFS VFS: in cifs_revalidate as Xid: 265 with uid: 0 Sep 4 16:21:48 rd-spare kernel: fs/cifs/inode.c: Revalidate: \Akamai Headsets.doc inode 0xf5876518 count 2 dentry: 0xf5ada8d0 d_time 260 00 jiffies 32751 Sep 4 16:21:48 rd-spare kernel: fs/cifs/inode.c: CIFS VFS: leaving cifs_revalidate (xid = 265) rc = 0 Sep 4 16:21:48 rd-spare kernel: fs/cifs/inode.c: CIFS VFS: in cifs_revalidate as Xid: 266 with uid: 0 Sep 4 16:21:48 rd-spare kernel: fs/cifs/inode.c: Revalidate: \Akamai Headsets.doc inode 0xf5876518 count 2 dentry: 0xf5ada8d0 d_time 260 00 jiffies 32751 This is happening intermittently on a variety of hosts. cat /proc/fs/cifs/DebugData Display Internal CIFS Data Structures for Debugging --- CIFS Version 1.60 Active VFS Requests: 2 Servers: 1) Name: 10.2.4.64 Domain: W2K Uses: 1 OS: Windows Server 2003 R2 3790 Service Pack 2 NOS: Windows Server 2003 R2 5.2 Capability: 0x1f3fd SMB session status: 1 TCP status: 1 Local Users To Server: 1 SecMode: 0x3 Req On Wire: 2 Shares: 1) \\chiprodfs2\company Mounts: 1 Type: NTFS DevInfo: 0x20 Attributes: 0x700ff PathComponentMax: 255 Status: 0x1 type: DISK MIDs: State: 2 com: 50 pid: 5951 tsk: f756d1b0 mid 277 State: 2 com: 50 pid: 6044 tsk: f69d4760 mid 278 cat /proc/fs/cifs/Stats Resources in use CIFS Session: 1 Share (unique mount targets): 1 SMB Request/Response Buffer: 5 Pool size: 5 SMB Small Req/Resp Buffer: 1 Pool size: 30 Operations (MIDs): 2 0 session 0 share reconnects Total vfs operations: 525 maximum at one time: 3 1) \\chiprodfs2\company SMBs: 305 Oplock Breaks: 0 Reads: 0 Bytes: 0 Writes: 0 Bytes: 0 Flushes: 0 Locks: 0 HardLinks: 0 Symlinks: 0 Opens: 0 Closes: 0 Deletes: 0 Posix Opens: 0 Posix Mkdirs: 0 Mkdirs: 0 Rmdirs: 0 Renames: 0 T2 Renames 0 FindFirst: 2 FNext 0 FClose 0 What is this ??? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Sat, 5 Sep 2009, Jeff Layton wrote: It looks like it's just taking 5s for the server to respond here. Do you happen to have a wire capture of one of these events? That may tell us more than cifsFYI info... I did a tcpdump and nothing stands out. Server acks the cmd 50 and then waits 5 seconds before sending the data. 16:23:34.336373 IP (tos 0x0, ttl 64, id 20616, offset 0, flags [DF], proto 6, length: 118) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: P 2801206064:2801206142(78) ack 468207120 win 190 16:23:34.336624 IP (tos 0x0, ttl 125, id 19869, offset 0, flags [DF], proto 6, length: 206) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: P 1:167(166) ack 78 win 64548 16:23:34.336636 IP (tos 0x0, ttl 64, id 20617, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 78:78(0) ack 167 win 190 16:23:34.336669 IP (tos 0x0, ttl 64, id 20618, offset 0, flags [DF], proto 6, length: 128) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: P 78:166(88) ack 167 win 190 16:23:34.456343 IP (tos 0x0, ttl 125, id 20045, offset 0, flags [DF], proto 6, length: 40) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . [tcp sum ok] 167:167(0) ack 166 win 64460 hiccup 16:23:39.284930 IP (tos 0x0, ttl 125, id 27544, offset 0, flags [DF], proto 6, length: 230) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 167:357(190) ack 166 win 64460 16:23:39.324060 IP (tos 0x0, ttl 64, id 20619, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 357 win 190 16:23:39.324292 IP (tos 0x0, ttl 125, id 27563, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 357:1817(1460) ack 166 win 64460 16:23:39.324300 IP (tos 0x0, ttl 64, id 20620, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 1817 win 190 16:23:39.324306 IP (tos 0x0, ttl 125, id 27564, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 1817:3277(1460) ack 166 win 64460 16:23:39.324311 IP (tos 0x0, ttl 64, id 20621, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 3277 win 188 16:23:39.324315 IP (tos 0x0, ttl 125, id 27565, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 3277:4737(1460) ack 166 win 64460 16:23:39.324319 IP (tos 0x0, ttl 64, id 20622, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 4737 win 186 16:23:39.324321 IP (tos 0x0, ttl 125, id 27566, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 4737:6197(1460) ack 166 win 64460 16:23:39.324324 IP (tos 0x0, ttl 64, id 20623, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 6197 win 184 16:23:39.324329 IP (tos 0x0, ttl 125, id 27567, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 6197:7657(1460) ack 166 win 64460 16:23:39.324332 IP (tos 0x0, ttl 64, id 20624, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 7657 win 182 16:23:39.324335 IP (tos 0x0, ttl 125, id 27568, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 7657:9117(1460) ack 166 win 64460 16:23:39.324337 IP (tos 0x0, ttl 64, id 20625, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 9117 win 180 16:23:39.324354 IP (tos 0x0, ttl 125, id 27569, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 9117:10577(1460) ack 166 win 64460 16:23:39.324362 IP (tos 0x0, ttl 64, id 20626, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 10577 win 190 16:23:39.324371 IP (tos 0x0, ttl 125, id 27570, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 10577:12037(1460) ack 166 win 64460 16:23:39.324374 IP (tos 0x0, ttl 64, id 20627, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 12037 win 188 16:23:39.324377 IP (tos 0x0, ttl 125, id 27571, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds fawkes.jules.org.43355: . 12037:13497(1460) ack 166 win 64460 16:23:39.324379 IP (tos 0x0, ttl 64, id 20628, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009, Jeff Layton wrote: Unfortunately I doubt there's much you can do from your client to prevent that (if that is the case). There may be a way to turn off oplocks on the server side, but that may very well be even worse for performance. Also note that these hiccups occur when simply doing an ls we are not accessing or writing files. -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009, Jeff Layton wrote: That'll stop your client from requesting oplocks, but that won't prevent others from doing so. If my suspicion is correct, then another client is holding an oplock and the server needs to break it before it can reply to yours. Unfortunately I doubt there's much you can do from your client to prevent that (if that is the case). There may be a way to turn off oplocks on the server side, but that may very well be even worse for performance. Hmmm... We can look at that. Another interesting tidbit is that I have never seen this from a 64 bit Linux kernel. Only occurs with 32 bit kernels it seems. -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009 13:07:52 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: On Wed, 9 Sep 2009, Jeff Layton wrote: My suspicion would be that the server needs to perform an oplock break to another client before it can send the response. The only way I know how to tell that is to sniff all SMB traffic on the server and watch for oplock break calls to other clients when these stalls occur. That could be tested by switching them off right? If I do echo 0 /proc/fs/cifs/OplockEnabled and then remount the volume it should switch off oplocks? This has no effect on the stalls. That'll stop your client from requesting oplocks, but that won't prevent others from doing so. If my suspicion is correct, then another client is holding an oplock and the server needs to break it before it can reply to yours. Unfortunately I doubt there's much you can do from your client to prevent that (if that is the case). There may be a way to turn off oplocks on the server side, but that may very well be even worse for performance. -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009, Jeff Layton wrote: That sounds rather strange. Maybe we do have a bug of some sort? The thing to do might be to get a binary capture of the 32-bit traffic around the time of the stalls. We could then inspect the packets and see whether we have something wrong in there. Capture attached.-- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009, Jeff Layton wrote: Well, I can see the delays in the capture, but the snarflen for the capture is a little too small to tell much else. Can you redo the capture with a larger snarflen (maybe -s 512 or so)? -s 1000 version attached. Also, were you able to tell anything from a server-side capture? Is the server issuing oplock breaks at those times? Thats a pretty busy system. They have not gotten around to do any logging on that end. -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Thu, 10 Sep 2009, Jeff Layton wrote: I assume that the 32 and 64 bit clients you have are calling ls in the same dir. If so, maybe a similar capture from a 64-bit client might help us see the difference? 64 bit trace attached.-- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Thu, 10 Sep 2009, Jeff Layton wrote: A couple of differences. First, the ls's were done in different directories since they had different search patterns: Right. 32 bit cannot mount the clameter directory for strange reasons. I have to go one level higher. The 64-bit capture was done in a directory with only 50 files, whereas the other one had at least 600-700 files (capture ends before it finished listing the files). That may make quite a bit of difference on the server (not sure how windows works internally in this case). Right. I just remounted the 64 bit on the same directory. No delays. The only other substantive difference I see is that the Level of Interest that the client is requesting is different: 32 == SMB_FIND_FILE_DIRECTORY_INFO 64 == SMB_FIND_FILE_ID_FULL_DIR_INFO That probably means that the 32 bit client has disabled CIFS_MOUNT_SERVER_INUM for some reason. That means that it's not asking the server for the windows equivalent of inode numbers. We typically disable that flag automatically if a query for the inode number of a path fails. I added the serverino option on the 32 bit system. No effect. Since these are the same server, that may be an indicator that the server is serving out info from two different filesystem types (maybe FAT vs. NTFS, or maybe even a CDROM or something). If so, then that may help explain some of the performance delta there. I'd be more interested to see how the 64 bit client behaves when it mounts the exact same share and does an ls in the same directory as the 32 bit client. No its all on the same file system. New capture attached for same directory. -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
One other issue that may be important: The mounting operation is very slow on 32 bit. Could it be that the handshake does not work out? -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Thu, 10 Sep 2009 14:53:12 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: On Wed, 9 Sep 2009, Jeff Layton wrote: Well, I can see the delays in the capture, but the snarflen for the capture is a little too small to tell much else. Can you redo the capture with a larger snarflen (maybe -s 512 or so)? -s 1000 version attached. Also, were you able to tell anything from a server-side capture? Is the server issuing oplock breaks at those times? Thats a pretty busy system. They have not gotten around to do any logging on that end. Ok. I had a look at the capture. The stalls seem to be occurring on FIND_FILE requests. Those are similar to READDIRPLUS requests in NFS, it returns a list of files that match a particular set of criteria and their attributes. Each time the client is making one of these calls to the server, it requests a set of up to 150 files. The server grinds for 5s each time and then responds. The calls themselves seem to be sane AFAICT. I don't see any problems with the parameters we're sending for the search. I also had a look over the FIND_FIRST code and it doesn't seem to have any obvious word size related problems. I assume that the 32 and 64 bit clients you have are calling ls in the same dir. If so, maybe a similar capture from a 64-bit client might help us see the difference? Thanks, -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Thu, 10 Sep 2009 15:42:28 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: On Thu, 10 Sep 2009, Jeff Layton wrote: I assume that the 32 and 64 bit clients you have are calling ls in the same dir. If so, maybe a similar capture from a 64-bit client might help us see the difference? 64 bit trace attached. A couple of differences. First, the ls's were done in different directories since they had different search patterns: 32 == \* 64 == \clameter\* ...did they also mount different shares from the server? The 64-bit capture was done in a directory with only 50 files, whereas the other one had at least 600-700 files (capture ends before it finished listing the files). That may make quite a bit of difference on the server (not sure how windows works internally in this case). The only other substantive difference I see is that the Level of Interest that the client is requesting is different: 32 == SMB_FIND_FILE_DIRECTORY_INFO 64 == SMB_FIND_FILE_ID_FULL_DIR_INFO That probably means that the 32 bit client has disabled CIFS_MOUNT_SERVER_INUM for some reason. That means that it's not asking the server for the windows equivalent of inode numbers. We typically disable that flag automatically if a query for the inode number of a path fails. Since these are the same server, that may be an indicator that the server is serving out info from two different filesystem types (maybe FAT vs. NTFS, or maybe even a CDROM or something). If so, then that may help explain some of the performance delta there. I'd be more interested to see how the 64 bit client behaves when it mounts the exact same share and does an ls in the same directory as the 32 bit client. Cheers, -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Thu, 10 Sep 2009 17:27:53 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: Right. 32 bit cannot mount the clameter directory for strange reasons. I have to go one level higher. [...] One other issue that may be important: The mounting operation is very slow on 32 bit. Could it be that the handshake does not work out? Ok, looks like the 64 bit client is using a different level of interest than the 32 bit on the FIND_FIRST call. I suspect that that difference may account for the difference in response time. It's not completely clear to me why that would be. Maybe a windows bug that causes a slowdown with that LOI? In any case, I think we need to look closely at what's happening at mount time. First, I'll need some other info: 1) output of /sbin/mount.cifs -V from both machines 2) mount options that you're using on both boxes 3) wire captures from mount attempts on both machines. Try to mount the clameter dir on both boxes and do captures of each attempt. Maybe this time use -s 0 with tcpdump so we get all of the traffic. There may be crackable password hashes in the captures, so you may want to send them to me privately and not cc the list. Thanks, -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009 13:33:33 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: On Wed, 9 Sep 2009, Jeff Layton wrote: Unfortunately I doubt there's much you can do from your client to prevent that (if that is the case). There may be a way to turn off oplocks on the server side, but that may very well be even worse for performance. Also note that these hiccups occur when simply doing an ls we are not accessing or writing files. Hmm... The hiccups you posted in the original email happened during a QPathInfo call (somewhat similar to a NFS GETATTR). I wouldn't think that would cause an oplock break, but I suppose it might. The server might decide that it needs to revoke the oplock in order to retrieve accurate size, LastWriteTime (aka mtime), etc. It could also be a windows bug... Here's an excerpt from an IRC conversation on this in #samba-technical, that might give a little info: 13:42 jlayton would a QPathInfo call cause an oplock break? 13:42 jlayton (typically)? 13:47 sdann jlayton, no it shouldn't, as it's path based and could be done with a stat() call. Only an open() or brl() operation should break an oplock. 13:48 jlayton ok, good to know -- thx 13:49 jlayton sdann: actually though, I'm asking about win2k3 server... 13:49 jlayton do you know whether it might break the oplock on a qpathinfo? 13:49 jlayton i.e. to get accurate size info, for instance 13:50 sdann well in general, only opens, writes (truncate included), and byte-range-lock ops break oplocks 13:50 sdann so any kind of meta-data request should not 13:51 jlayton hmm ok, one of the linux-kernel guys is seeing QPathInfo calls go out to win2k3 server and the server waits 5s before responding 13:51 jlayton my initial thought was oplock break to another client is causing the stall, but maybe it's something else 13:51 coffeedude sdann, SetFileInfo (allocationInfo and EndofFile) will as well. 13:51 jlayton I'm pretty sure this is QPathInfo call 13:52 sdann a quick torture test in source4/torture/raw/oplock.c would solve the issue :) 13:52 coffeedude jlayton, internally in Windows, the NTFS interface is handle based so I assume the server does a NtCreateFile(), QueryInformationFile(), CloseFile(). 13:52 jlayton ahhh maybe so 13:52 coffeedude jlayton, the internal opens should done with FILE_READ_ATTRIBUTES so they don't cause a break but it could be a Windows bug. 13:53 jlayton sounds plausible 13:53 jlayton coffeedude, sdann: thanks! 13:53 coffeedude jlayton, any open with nothing other than FILE_READ_ATTRIBUTES, FILE_WRITE_ATTRIBUTES or SYNCHRONIZE should nto cause an oplock break either. 13:53 sdann coffeedude, yeah that's certainly possible 13:53 coffeedude jlayton, any open with nothing other than FILE_READ_ATTRIBUTES, FILE_WRITE_ATTRIBUTES or SYNCHRONIZE should nto cause an oplock break either. 13:53 sdann coffeedude, yeah that's certainly possible 13:53 coffeedude sdann, only know cause I've done it :) I'd probably start with sniffing traffic at the server side and see if you can correlate the stalls with traffic to other hosts (oplock breaks in particular). If so then maybe consider patching the server or testing with a different flavor of windows. -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009 13:28:24 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: On Wed, 9 Sep 2009, Jeff Layton wrote: That'll stop your client from requesting oplocks, but that won't prevent others from doing so. If my suspicion is correct, then another client is holding an oplock and the server needs to break it before it can reply to yours. Unfortunately I doubt there's much you can do from your client to prevent that (if that is the case). There may be a way to turn off oplocks on the server side, but that may very well be even worse for performance. Hmmm... We can look at that. Another interesting tidbit is that I have never seen this from a 64 bit Linux kernel. Only occurs with 32 bit kernels it seems. That sounds rather strange. Maybe we do have a bug of some sort? The thing to do might be to get a binary capture of the 32-bit traffic around the time of the stalls. We could then inspect the packets and see whether we have something wrong in there. -- Jeff Layton jlay...@redhat.com -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba
Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups
On Wed, 9 Sep 2009 17:27:57 -0400 (EDT) Christoph Lameter c...@linux-foundation.org wrote: On Wed, 9 Sep 2009, Jeff Layton wrote: That sounds rather strange. Maybe we do have a bug of some sort? The thing to do might be to get a binary capture of the 32-bit traffic around the time of the stalls. We could then inspect the packets and see whether we have something wrong in there. Capture attached. Well, I can see the delays in the capture, but the snarflen for the capture is a little too small to tell much else. Can you redo the capture with a larger snarflen (maybe -s 512 or so)? Also, were you able to tell anything from a server-side capture? Is the server issuing oplock breaks at those times? Cheers, Jeff -- To unsubscribe from this list go to the following URL and read the instructions: https://lists.samba.org/mailman/options/samba