Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-18 Thread Jeff Layton
On Wed, 16 Sep 2009 12:26:04 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

 On Tue, 15 Sep 2009, Jeff Layton wrote:
 
  Yow, that version of mount.cifs is really old. I wonder if it may be
  passing bad mount options to the kernel? Might be interesting to strace
  that. Something like:
 
  # strace -f -s 256 -e mount mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx
 
  ...it'll probably have a cleartext password in it so you might want to
  doctor the options a bit before sending along if you do.
 
  Alternately, you might just want to try a newer version of mount.cifs
  and see whether that fixes this.
 
 Tried a newer version of mount.cifs without any change.
 

Ok, good to rule that out then.

   I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount
   /company.
  
 
  Actually, the trace of a hanging mount would probably be interesting.
 
  Does the 32-bit capture that you sent represent a mount attempt that
  hung? Or was it successful?
 
 No it was successful.
 

Hmm, ok. That isn't going to tell me as much as a mount that fails. For
now, I suggest that we focus on determining why these mounts hang/fail.
After that we can see whether the solution there has any bearing on why
the server is so slow to respond to this particular client.

  What's the devname that you're giving to the mount command for the
  clameter dir? If there's more than 1 path component after the
  hostname, then the problem may be in the old version of mount.cifs.
  Some of them had broken handling for path prefixes.
 
 It's //machinename/company/clameter
 
 So two components.
 

Also good to know.

What we should probably do at this point is track down why the 32-bit
client has such a hard time mounting the clameter dir. Here's what
would be most helpful:

1) some debug log info of the mount attempt:

# modprobe cifs
# echo 7 > /proc/fs/cifs/cifsFYI

...then attempt the mount. After it hangs for a few seconds, ^c the
mount to kill it. Collect the output from dmesg and send it to me. That
should give me some idea of what the client is doing during this phase.

If you can simultaneously capture wire traffic during the same mount
attempt that would also be helpful.

Cheers,
-- 
Jeff Layton jlay...@redhat.com
-- 
To unsubscribe from this list go to the following URL and read the
instructions:  https://lists.samba.org/mailman/options/samba


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-16 Thread Christoph Lameter
On Thu, 10 Sep 2009, Jeff Layton wrote:

 In any case, I think we need to look closely at what's happening at
 mount time. First, I'll need some other info:

 1) output of /sbin/mount.cifs -V from both machines

The 32 bit machine

#/sbin/mount.cifs -V
mount.cifs version: 1.5

mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx

64 bit machine

$ /sbin/mount.cifs -V
mount.cifs version: 1.12-3.4.0

mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=w2k

 3) wire captures from mount attempts on both machines. Try to mount the
 clameter dir on both boxes and do captures of each attempt. Maybe
 this time use -s 0 with tcpdump so we get all of the traffic.

I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount
/company.

 There may be crackable password hashes in the captures, so you may want
 to send them to me privately and not cc the list.

Ok will follow.



Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-16 Thread Christoph Lameter
On Tue, 15 Sep 2009, Jeff Layton wrote:

 Yow, that version of mount.cifs is really old. I wonder if it may be
 passing bad mount options to the kernel? Might be interesting to strace
 that. Something like:

 # strace -f -s 256 -e mount mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx

 ...it'll probably have a cleartext password in it so you might want to
 doctor the options a bit before sending along if you do.

 Alternately, you might just want to try a newer version of mount.cifs
 and see whether that fixes this.

Tried a newer version of mount.cifs without any change.

  I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount
  /company.
 

 Actually, the trace of a hanging mount would probably be interesting.

 Does the 32-bit capture that you sent represent a mount attempt that
 hung? Or was it successful?

No it was successful.

 What's the devname that you're giving to the mount command for the
 clameter dir? If there's more than 1 path component after the
 hostname, then the problem may be in the old version of mount.cifs.
 Some of them had broken handling for path prefixes.

It's //machinename/company/clameter

So two components.



Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-15 Thread Jeff Layton
On Mon, 14 Sep 2009 16:10:47 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

 On Thu, 10 Sep 2009, Jeff Layton wrote:
 
  In any case, I think we need to look closely at what's happening at
  mount time. First, I'll need some other info:
 
  1) output of /sbin/mount.cifs -V from both machines
 
 The 32 bit machine
 
 #/sbin/mount.cifs -V
 mount.cifs version: 1.5
 mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx
 

Yow, that version of mount.cifs is really old. I wonder if it may be
passing bad mount options to the kernel? Might be interesting to strace
that. Something like:

# strace -f -s 256 -e mount mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx

...it'll probably have a cleartext password in it so you might want to
doctor the options a bit before sending along if you do.
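
One way to doctor the output before sending it is to scrub the password
mechanically. The following is only a rough sketch: it assumes the
credential shows up as pass=... or password=... inside the comma-separated
option string, and the sample line is a made-up illustration of what a
mount() call might look like in strace output, not something captured from
this system. A password that itself contains a comma would only be
partially scrubbed, so check the result by eye.

```python
import re

# Assumption: the secret appears as pass=<value> or password=<value> in the
# comma-separated mount option string printed by strace.
_SECRET = re.compile(r'\b(pass|password)=[^,"\s]*')


def redact(line: str) -> str:
    """Replace the value of any pass=/password= option with a placeholder."""
    return _SECRET.sub(lambda m: m.group(1) + "=REDACTED", line)


# Hypothetical sample line, for illustration only.
sample = ('mount("//chiprodfs2/company", "/mnt", "cifs", 0, '
          '"user=clameter,domain=xxx,pass=hunter2") = 0')
print(redact(sample))
```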

Alternately, you might just want to try a newer version of mount.cifs
and see whether that fixes this.

 64 bit machine
 
 $ /sbin/mount.cifs -V
 mount.cifs version: 1.12-3.4.0
 
 mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=w2k
 
  3) wire captures from mount attempts on both machines. Try to mount the
  clameter dir on both boxes and do captures of each attempt. Maybe
  this time use -s 0 with tcpdump so we get all of the traffic.
 
 I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount
 /company.
 

Actually, the trace of a hanging mount would probably be interesting.

Does the 32-bit capture that you sent represent a mount attempt that
hung? Or was it successful?

  There may be crackable password hashes in the captures, so you may want
  to send them to me privately and not cc the list.
 
 Ok will follow.
 

Thanks for the info, I had a look at the captures. They both look
fairly similar. The main difference is that the 32-bit box doesn't seem
to have sent any more calls after sending a QPathInfo call to the
server for the root inode of the mount.

What's the devname that you're giving to the mount command for the
clameter dir? If there's more than 1 path component after the
hostname, then the problem may be in the old version of mount.cifs.
Some of them had broken handling for path prefixes.
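
To make "path components" concrete, here is a small sketch of how a devname
splits into host, share and path prefix. It is not the parser that
mount.cifs actually uses; split_devname is a hypothetical helper written
only to show which part of the string counts as the prefix.

```python
def split_devname(devname: str):
    """Split //host/share[/prefix...] into (host, share, prefix components)."""
    if not devname.startswith("//"):
        raise ValueError("expected a //host/share style devname")
    # Drop empty pieces so a trailing or doubled slash does not count.
    parts = [p for p in devname[2:].split("/") if p]
    if len(parts) < 2:
        raise ValueError("devname needs at least a host and a share")
    host, share, *prefix = parts
    return host, share, prefix


host, share, prefix = split_devname("//machinename/company/clameter")
# Anything after the share name is the path prefix.
print(host, share, prefix)
```

A devname with a non-empty prefix list is the case where older mount.cifs
versions may have misbehaved.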

-- 
Jeff Layton jlay...@redhat.com


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Jeff Layton
On Wed, 9 Sep 2009 12:33:21 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

 On Sat, 5 Sep 2009, Jeff Layton wrote:
 
  It looks like it's just taking 5s for the server to respond here. Do
  you happen to have a wire capture of one of these events? That may tell
  us more than cifsFYI info...
 
 I did a tcpdump and nothing stands out. Server acks the cmd 50 and then
 waits 5 seconds before sending the data.
 
 16:23:34.336373 IP (tos 0x0, ttl  64, id 20616, offset 0, flags [DF], proto 6, length: 118) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: P 2801206064:2801206142(78) ack 468207120 win 190
 16:23:34.336624 IP (tos 0x0, ttl 125, id 19869, offset 0, flags [DF], proto 6, length: 206) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: P 1:167(166) ack 78 win 64548
 16:23:34.336636 IP (tos 0x0, ttl  64, id 20617, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 78:78(0) ack 167 win 190
 16:23:34.336669 IP (tos 0x0, ttl  64, id 20618, offset 0, flags [DF], proto 6, length: 128) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: P 78:166(88) ack 167 win 190
 16:23:34.456343 IP (tos 0x0, ttl 125, id 20045, offset 0, flags [DF], proto 6, length: 40) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . [tcp sum ok] 167:167(0) ack 166 win 64460
 
 hiccup
 
 16:23:39.284930 IP (tos 0x0, ttl 125, id 27544, offset 0, flags [DF], proto 6, length: 230) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 167:357(190) ack 166 win 64460
 16:23:39.324060 IP (tos 0x0, ttl  64, id 20619, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 357 win 190


A binary capture would probably be easier to infer something from --
we'd be able to open it up in wireshark and get a little more info
about what sort of call the client is doing.
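
In the meantime, the stalls in a text dump like the one quoted above can be
located mechanically. This is just a sketch that assumes tcpdump's default
HH:MM:SS.microseconds timestamp at the start of each packet line; find_gaps
is a hypothetical helper, and the sample reuses three timestamps from the
dump with the rest of each line left out.

```python
import re
from datetime import datetime

# Each packet line in tcpdump's default text output starts with a timestamp.
_TIMESTAMP = re.compile(r"^\s*(\d{2}:\d{2}:\d{2}\.\d{6})\s")


def find_gaps(lines, threshold=1.0):
    """Return (previous, current, seconds) for packets further apart than threshold."""
    gaps = []
    previous = None
    for line in lines:
        match = _TIMESTAMP.match(line)
        if match is None:
            continue  # wrapped continuation line, or a note such as "hiccup"
        current = datetime.strptime(match.group(1), "%H:%M:%S.%f")
        if previous is not None:
            seconds = (current - previous).total_seconds()
            if seconds < 0:
                seconds += 24 * 3600  # capture ran past midnight
            if seconds > threshold:
                gaps.append((previous.strftime("%H:%M:%S.%f"),
                             current.strftime("%H:%M:%S.%f"),
                             seconds))
        previous = current
    return gaps


sample = [
    "16:23:34.336669 IP",
    "16:23:34.456343 IP",
    "hiccup",
    "16:23:39.284930 IP",
]
for before, after, seconds in find_gaps(sample):
    print(f"{before} -> {after}: {seconds:.3f}s")
```

This only shows where the silence is, not why; the binary capture is still
needed to see which SMB call the server was sitting on.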

My suspicion would be that the server needs to perform an oplock break
to another client before it can send the response. The only way I know
how to tell that is to sniff all SMB traffic on the server and watch
for oplock break calls to other clients when these stalls occur.

-- 
Jeff Layton jlay...@redhat.com


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Christoph Lameter
On Wed, 9 Sep 2009, Jeff Layton wrote:

 My suspicion would be that the server needs to perform an oplock break
 to another client before it can send the response. The only way I know
 how to tell that is to sniff all SMB traffic on the server and watch
 for oplock break calls to other clients when these stalls occur.

That could be tested by switching them off right? If I do

echo 0 > /proc/fs/cifs/OplockEnabled

and then remount the volume it should switch off oplocks?

This has no effect on the stalls.



Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Jeff Layton
On Fri, 4 Sep 2009 12:27:35 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

 This is on 32 bit x86 on a Dell 1950
 
 After mounting a cifs share we have 5 second hiccups. Typical log output
 when doing a simple ls /mnt:
 
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/transport.c: For smb_command 50
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/transport.c: Sending smb:
 total_len 118
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: leaving
 cifs_revalidate (xid = 258) rc = 0
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/dir.c: CIFS VFS: in cifs_lookup
 as Xid: 263 with uid: 0
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/dir.c: parent inode = 0xf58d2e60
 name is: AutoWire.bmp and dentry = 0xf5adb63c
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/dir.c: NULL inode in lookup
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/dir.c: Full path: \AutoWire.bmp
 inode = 0x(null)
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/inode.c: Getting info on 
 \AutoWire.bmp
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/transport.c: For smb_command 50
 Sep  4 16:21:43 rd-spare kernel:  fs/cifs/transport.c: Sending smb:  
 total_len 104
 
   5 second hiccup
 
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/connect.c: rfc1002 length 0xce
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/connect.c: rfc1002 length 0xc0

(adding linux-cifs-client mailing list)

It looks like it's just taking 5s for the server to respond here. Do
you happen to have a wire capture of one of these events? That may tell
us more than cifsFYI info...

 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: inode 0xf5876518 
 old_time=26000 new_time=32751
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: cifs_revalidate - inode 
 unchanged
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/file.c: CIFS VFS: in 
 cifs_writepages as Xid: 264 with uid: 0
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/file.c: CIFS VFS: leaving 
 cifs_writepages (xid = 264) rc = 0
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: leaving 
 cifs_revalidate (xid = 262) rc = 0
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: in 
 cifs_revalidate as Xid: 265 with uid: 0
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: Revalidate: \Akamai Headsets.doc inode 0xf5876518 count 2 dentry: 0xf5ada8d0 d_time 26000 jiffies 32751
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: leaving 
 cifs_revalidate (xid = 265) rc = 0
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: in 
 cifs_revalidate as Xid: 266 with uid: 0
 Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: Revalidate: \Akamai Headsets.doc inode 0xf5876518 count 2 dentry: 0xf5ada8d0 d_time 26000 jiffies 32751
 
 
 This is happening intermittently on a variety of hosts.
 
 cat /proc/fs/cifs/DebugData
 
 Display Internal CIFS Data Structures for Debugging
 ---
 CIFS Version 1.60
 Active VFS Requests: 2
 Servers:
 1) Name: 10.2.4.64  Domain: W2K Uses: 1 OS: Windows Server 2003 R2 3790
 Service Pack 2
   NOS: Windows Server 2003 R2 5.2 Capability: 0x1f3fd
   SMB session status: 1   TCP status: 1
   Local Users To Server: 1 SecMode: 0x3 Req On Wire: 2
   Shares:
   1) \\chiprodfs2\company Mounts: 1 Type: NTFS DevInfo: 0x20
 Attributes: 0x700ff
 PathComponentMax: 255 Status: 0x1 type: DISK
 
   MIDs:
   State: 2 com: 50 pid: 5951 tsk: f756d1b0 mid 277
   State: 2 com: 50 pid: 6044 tsk: f69d4760 mid 278
 
 cat /proc/fs/cifs/Stats
 
 Resources in use
 CIFS Session: 1
 Share (unique mount targets): 1
 SMB Request/Response Buffer: 5 Pool size: 5
 SMB Small Req/Resp Buffer: 1 Pool size: 30
 Operations (MIDs): 2
 
 0 session 0 share reconnects
 Total vfs operations: 525 maximum at one time: 3
 
 1) \\chiprodfs2\company
 SMBs: 305 Oplock Breaks: 0
 Reads:  0 Bytes: 0
 Writes: 0 Bytes: 0
 Flushes: 0
 Locks: 0 HardLinks: 0 Symlinks: 0
 Opens: 0 Closes: 0 Deletes: 0
 Posix Opens: 0 Posix Mkdirs: 0
 Mkdirs: 0 Rmdirs: 0
 Renames: 0 T2 Renames 0
 FindFirst: 2 FNext 0 FClose 0
 
 
 What is this ???
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
 


-- 
Jeff Layton jlay...@redhat.com


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Christoph Lameter
On Sat, 5 Sep 2009, Jeff Layton wrote:

 It looks like it's just taking 5s for the server to respond here. Do
 you happen to have a wire capture of one of these events? That may tell
 us more than cifsFYI info...

I did a tcpdump and nothing stands out. Server acks the cmd 50 and then
waits 5 seconds before sending the data.

16:23:34.336373 IP (tos 0x0, ttl  64, id 20616, offset 0, flags [DF], proto 6, length: 118) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: P 2801206064:2801206142(78) ack 468207120 win 190
16:23:34.336624 IP (tos 0x0, ttl 125, id 19869, offset 0, flags [DF], proto 6, length: 206) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: P 1:167(166) ack 78 win 64548
16:23:34.336636 IP (tos 0x0, ttl  64, id 20617, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 78:78(0) ack 167 win 190
16:23:34.336669 IP (tos 0x0, ttl  64, id 20618, offset 0, flags [DF], proto 6, length: 128) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: P 78:166(88) ack 167 win 190
16:23:34.456343 IP (tos 0x0, ttl 125, id 20045, offset 0, flags [DF], proto 6, length: 40) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . [tcp sum ok] 167:167(0) ack 166 win 64460

hiccup

16:23:39.284930 IP (tos 0x0, ttl 125, id 27544, offset 0, flags [DF], proto 6, length: 230) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 167:357(190) ack 166 win 64460
16:23:39.324060 IP (tos 0x0, ttl  64, id 20619, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 357 win 190
16:23:39.324292 IP (tos 0x0, ttl 125, id 27563, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 357:1817(1460) ack 166 win 64460
16:23:39.324300 IP (tos 0x0, ttl  64, id 20620, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 1817 win 190
16:23:39.324306 IP (tos 0x0, ttl 125, id 27564, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 1817:3277(1460) ack 166 win 64460
16:23:39.324311 IP (tos 0x0, ttl  64, id 20621, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 3277 win 188
16:23:39.324315 IP (tos 0x0, ttl 125, id 27565, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 3277:4737(1460) ack 166 win 64460
16:23:39.324319 IP (tos 0x0, ttl  64, id 20622, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 4737 win 186
16:23:39.324321 IP (tos 0x0, ttl 125, id 27566, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 4737:6197(1460) ack 166 win 64460
16:23:39.324324 IP (tos 0x0, ttl  64, id 20623, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 6197 win 184
16:23:39.324329 IP (tos 0x0, ttl 125, id 27567, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 6197:7657(1460) ack 166 win 64460
16:23:39.324332 IP (tos 0x0, ttl  64, id 20624, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 7657 win 182
16:23:39.324335 IP (tos 0x0, ttl 125, id 27568, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 7657:9117(1460) ack 166 win 64460
16:23:39.324337 IP (tos 0x0, ttl  64, id 20625, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 9117 win 180
16:23:39.324354 IP (tos 0x0, ttl 125, id 27569, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 9117:10577(1460) ack 166 win 64460
16:23:39.324362 IP (tos 0x0, ttl  64, id 20626, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 10577 win 190
16:23:39.324371 IP (tos 0x0, ttl 125, id 27570, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 10577:12037(1460) ack 166 win 64460
16:23:39.324374 IP (tos 0x0, ttl  64, id 20627, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 12037 win 188
16:23:39.324377 IP (tos 0x0, ttl 125, id 27571, offset 0, flags [DF], proto 6, length: 1500) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 12037:13497(1460) ack 166 win 64460
16:23:39.324379 IP (tos 0x0, ttl  64, id 20628, offset 0, flags [DF], proto 6, 
length: 40) fawkes.jules.org.43355  

Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Christoph Lameter
On Wed, 9 Sep 2009, Jeff Layton wrote:

 Unfortunately I doubt there's much you can do from your client to
 prevent that (if that is the case). There may be a way to turn off
 oplocks on the server side, but that may very well be even worse for
 performance.

Also note that these hiccups occur when simply doing an

ls

we are not accessing or writing files.



Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Christoph Lameter
On Wed, 9 Sep 2009, Jeff Layton wrote:

 That'll stop your client from requesting oplocks, but that won't
 prevent others from doing so. If my suspicion is correct, then another
 client is holding an oplock and the server needs to break it before it
 can reply to yours.

 Unfortunately I doubt there's much you can do from your client to
 prevent that (if that is the case). There may be a way to turn off
 oplocks on the server side, but that may very well be even worse for
 performance.

Hmmm... We can look at that.

Another interesting tidbit is that I have never seen this from a 64 bit
Linux kernel. Only occurs with 32 bit kernels it seems.



Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Jeff Layton
On Wed, 9 Sep 2009 13:07:52 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

 On Wed, 9 Sep 2009, Jeff Layton wrote:
 
  My suspicion would be that the server needs to perform an oplock break
  to another client before it can send the response. The only way I know
  how to tell that is to sniff all SMB traffic on the server and watch
  for oplock break calls to other clients when these stalls occur.
 
 That could be tested by switching them off right? If I do
 
 echo 0 > /proc/fs/cifs/OplockEnabled
 
 and then remount the volume it should switch off oplocks?
 
 This has no effect on the stalls.
 

That'll stop your client from requesting oplocks, but that won't
prevent others from doing so. If my suspicion is correct, then another
client is holding an oplock and the server needs to break it before it
can reply to yours.

Unfortunately I doubt there's much you can do from your client to
prevent that (if that is the case). There may be a way to turn off
oplocks on the server side, but that may very well be even worse for
performance.

-- 
Jeff Layton jlay...@redhat.com


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Christoph Lameter
On Wed, 9 Sep 2009, Jeff Layton wrote:

 That sounds rather strange. Maybe we do have a bug of some sort? The
 thing to do might be to get a binary capture of the 32-bit traffic
 around the time of the stalls. We could then inspect the packets and
 see whether we have something wrong in there.

Capture attached.

Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Christoph Lameter
On Wed, 9 Sep 2009, Jeff Layton wrote:

 Well, I can see the delays in the capture, but the snaplen for the
 capture is a little too small to tell much else. Can you redo the
 capture with a larger snaplen (maybe -s 512 or so)?

-s 1000 version attached.

 Also, were you able to tell anything from a server-side capture? Is the
 server issuing oplock breaks at those times?

That's a pretty busy system. They have not gotten around to doing any logging
on that end.

Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Christoph Lameter
On Thu, 10 Sep 2009, Jeff Layton wrote:

 I assume that the 32 and 64 bit clients you have are calling ls in
 the same dir. If so, maybe a similar capture from a 64-bit client might
 help us see the difference?

64 bit trace attached.

Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Christoph Lameter
On Thu, 10 Sep 2009, Jeff Layton wrote:

 A couple of differences. First, the ls's were done in different
 directories since they had different search patterns:

Right. 32 bit cannot mount the clameter directory for strange reasons. I
have to go one level higher.

 The 64-bit capture was done in a directory with only 50 files,
 whereas the other one had at least 600-700 files (capture ends before
 it finished listing the files). That may make quite a bit of difference
 on the server (not sure how windows works internally in this case).

Right. I just remounted the 64 bit on the same directory. No delays.

 The only other substantive difference I see is that the Level of
 Interest that the client is requesting is different:

 32 == SMB_FIND_FILE_DIRECTORY_INFO
 64 == SMB_FIND_FILE_ID_FULL_DIR_INFO

 That probably means that the 32 bit client has disabled
 CIFS_MOUNT_SERVER_INUM for some reason. That means that it's not asking
 the server for the windows equivalent of inode numbers. We typically
 disable that flag automatically if a query for the inode number of a
 path fails.

I added the serverino option on the 32 bit system. No effect.

 Since these are the same server, that may be an indicator that the
 server is serving out info from two different filesystem types (maybe
 FAT vs. NTFS, or maybe even a CDROM or something). If so, then that may
 help explain some of the performance delta there. I'd be more
 interested to see how the 64 bit client behaves when it mounts the
 exact same share and does an ls in the same directory as the 32 bit
 client.

No, it's all on the same file system.

New capture attached for same directory.

Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-11 Thread Christoph Lameter

One other issue that may be important: The mounting operation is very slow
on 32 bit. Could it be that the handshake does not work out?




Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-10 Thread Jeff Layton
On Thu, 10 Sep 2009 14:53:12 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

 On Wed, 9 Sep 2009, Jeff Layton wrote:
 
  Well, I can see the delays in the capture, but the snaplen for the
  capture is a little too small to tell much else. Can you redo the
  capture with a larger snaplen (maybe -s 512 or so)?
 
 -s 1000 version attached.
 
  Also, were you able to tell anything from a server-side capture? Is the
  server issuing oplock breaks at those times?
 
 That's a pretty busy system. They have not gotten around to doing any logging
 on that end.

Ok. I had a look at the capture. The stalls seem to be occurring on
FIND_FILE requests. Those are similar to READDIRPLUS requests in NFS:
they return a list of files that match a particular set of criteria,
along with their attributes.

Each time the client is making one of these calls to the server, it
requests a set of up to 150 files. The server grinds for 5s each time
and then responds.
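
As a back-of-the-envelope illustration of what that costs for a whole
directory listing, here is a small sketch. The 150 entries per request and
the 5s stall are the figures from the capture; the directory size of 700 is
only an example value, and estimated_listing_stall is a hypothetical helper.

```python
import math


def estimated_listing_stall(file_count, files_per_request=150, stall_seconds=5.0):
    """Return (number of FIND requests, total seconds spent stalled)."""
    # Each request returns at most files_per_request entries, and each one
    # is assumed to stall for the same stall_seconds.
    requests = math.ceil(file_count / files_per_request)
    return requests, requests * stall_seconds


requests, seconds = estimated_listing_stall(700)
print(f"{requests} requests, about {seconds:.0f}s of stalls")
```

So even a moderately large directory could turn a single ls into tens of
seconds of waiting if every request stalls the same way.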

The calls themselves seem to be sane AFAICT. I don't see any problems
with the parameters we're sending for the search. I also had a look
over the FIND_FIRST code and it doesn't seem to have any obvious
word size related problems.

I assume that the 32 and 64 bit clients you have are calling ls in
the same dir. If so, maybe a similar capture from a 64-bit client might
help us see the difference?

Thanks,
-- 
Jeff Layton jlay...@redhat.com


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-10 Thread Jeff Layton
On Thu, 10 Sep 2009 15:42:28 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

 On Thu, 10 Sep 2009, Jeff Layton wrote:
 
  I assume that the 32 and 64 bit clients you have are calling ls in
  the same dir. If so, maybe a similar capture from a 64-bit client might
  help us see the difference?
 
 64 bit trace attached.

A couple of differences. First, the ls's were done in different
directories since they had different search patterns:

32 == \*
64 == \clameter\*

...did they also mount different shares from the server?

The 64-bit capture was done in a directory with only 50 files,
whereas the other one had at least 600-700 files (capture ends before
it finished listing the files). That may make quite a bit of difference
on the server (not sure how windows works internally in this case).

The only other substantive difference I see is that the Level of
Interest that the client is requesting is different:

32 == SMB_FIND_FILE_DIRECTORY_INFO
64 == SMB_FIND_FILE_ID_FULL_DIR_INFO

That probably means that the 32 bit client has disabled
CIFS_MOUNT_SERVER_INUM for some reason. That means that it's not asking
the server for the windows equivalent of inode numbers. We typically
disable that flag automatically if a query for the inode number of a
path fails.
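
Roughly, the choice of info level can be pictured like this. It is a
simplified sketch rather than the kernel code: the numeric values are the
ones I recall from fs/cifs/cifspdu.h and should be checked against the
source, and level_for_mount is a hypothetical stand-in for the real logic.

```python
# (numeric level, whether each returned entry carries a server file ID)
# Values as recalled from fs/cifs/cifspdu.h; treat them as an assumption.
INFO_LEVELS = {
    "SMB_FIND_FILE_DIRECTORY_INFO": (0x101, False),
    "SMB_FIND_FILE_ID_FULL_DIR_INFO": (0x105, True),
}


def level_for_mount(server_inum_enabled: bool) -> str:
    """Pick the FIND info level depending on the CIFS_MOUNT_SERVER_INUM flag."""
    if server_inum_enabled:
        return "SMB_FIND_FILE_ID_FULL_DIR_INFO"
    return "SMB_FIND_FILE_DIRECTORY_INFO"


for flag in (True, False):
    name = level_for_mount(flag)
    value, has_id = INFO_LEVELS[name]
    print(f"server inum {flag}: {name} ({value:#x}), file IDs returned: {has_id}")
```

Seen that way, the 32 bit capture matches the case where the flag is off
and the 64 bit one the case where it is on.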

Since these are the same server, that may be an indicator that the
server is serving out info from two different filesystem types (maybe
FAT vs. NTFS, or maybe even a CDROM or something). If so, then that may
help explain some of the performance delta there. I'd be more
interested to see how the 64 bit client behaves when it mounts the
exact same share and does an ls in the same directory as the 32 bit
client.

Cheers,
-- 
Jeff Layton jlay...@redhat.com


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-10 Thread Jeff Layton
On Thu, 10 Sep 2009 17:27:53 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

 Right. 32 bit cannot mount the clameter directory for strange reasons. I
 have to go one level higher.
[...]
 One other issue that may be important: The mounting operation is very slow
 on 32 bit. Could it be that the handshake does not work out?

Ok, looks like the 64 bit client is using a different level of interest
than the 32 bit on the FIND_FIRST call. I suspect that that difference
may account for the difference in response time.

It's not completely clear to me why that would be. Maybe a windows bug
that causes a slowdown with that LOI?

In any case, I think we need to look closely at what's happening at
mount time. First, I'll need some other info:

1) output of /sbin/mount.cifs -V from both machines

2) mount options that you're using on both boxes

3) wire captures from mount attempts on both machines. Try to mount the
clameter dir on both boxes and do captures of each attempt. Maybe
this time use -s 0 with tcpdump so we get all of the traffic.

There may be crackable password hashes in the captures, so you may want
to send them to me privately and not cc the list.

Thanks,
-- 
Jeff Layton jlay...@redhat.com


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-09 Thread Jeff Layton
On Wed, 9 Sep 2009 13:33:33 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

> On Wed, 9 Sep 2009, Jeff Layton wrote:
>
> > Unfortunately I doubt there's much you can do from your client to
> > prevent that (if that is the case). There may be a way to turn off
> > oplocks on the server side, but that may very well be even worse for
> > performance.
>
> Also note that these hiccups occur when simply doing an
>
>    ls
>
> we are not accessing or writing files.
>

Hmm...

The hiccups you posted in the original email happened during a
QPathInfo call (somewhat similar to an NFS GETATTR). I wouldn't think
that would cause an oplock break, but I suppose it might. The server
might decide that it needs to revoke the oplock in order to retrieve
accurate size, LastWriteTime (aka mtime), etc. It could also be a
Windows bug...

Here's an excerpt from an IRC conversation about this in
#samba-technical that might give a little info:

13:42 <jlayton> would a QPathInfo call cause an oplock break?
13:42 <jlayton> (typically)?
13:47 <sdann> jlayton, no it shouldn't, as it's path based and could be done
       with a stat() call. Only an open() or brl() operation should break
       an oplock.
13:48 <jlayton> ok, good to know -- thx
13:49 <jlayton> sdann: actually though, I'm asking about win2k3 server...
13:49 <jlayton> do you know whether it might break the oplock on a qpathinfo?
13:49 <jlayton> i.e. to get accurate size info, for instance
13:50 <sdann> well in general, only opens, writes (truncate included), and
       byte-range-lock ops break oplocks
13:50 <sdann> so any kind of meta-data request should not
13:51 <jlayton> hmm ok, one of the linux-kernel guys is seeing QPathInfo calls
       go out to win2k3 server and the server waits 5s before responding
13:51 <jlayton> my initial thought was oplock break to another client is
       causing the stall, but maybe it's something else
13:51 <coffeedude> sdann, SetFileInfo (allocationInfo and EndofFile) will as
       well.
13:51 <jlayton> I'm pretty sure this is QPathInfo call
13:52 <sdann> a quick torture test in source4/torture/raw/oplock.c would solve
       the issue :)
13:52 <coffeedude> jlayton, internally in Windows, the NTFS interface is
       handle based so I assume the server does a NtCreateFile(),
       QueryInformationFile(), CloseFile().
13:52 <jlayton> ahhh maybe so
13:52 <coffeedude> jlayton, the internal opens should be done with
       FILE_READ_ATTRIBUTES so they don't cause a break but it could be a
       Windows bug.
13:53 <jlayton> sounds plausible
13:53 <jlayton> coffeedude, sdann: thanks!
13:53 <coffeedude> jlayton, any open with nothing other than
       FILE_READ_ATTRIBUTES, FILE_WRITE_ATTRIBUTES or SYNCHRONIZE should
       not cause an oplock break either.
13:53 <sdann> coffeedude, yeah that's certainly possible
13:53 <coffeedude> sdann, only know cause I've done it :)

I'd probably start with sniffing traffic at the server side and see if
you can correlate the stalls with traffic to other hosts (oplock breaks
in particular).

If so, then maybe consider patching the server or testing with a
different flavor of Windows.

-- 
Jeff Layton jlay...@redhat.com


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-09 Thread Jeff Layton
On Wed, 9 Sep 2009 13:28:24 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

> On Wed, 9 Sep 2009, Jeff Layton wrote:
>
> > That'll stop your client from requesting oplocks, but that won't
> > prevent others from doing so. If my suspicion is correct, then another
> > client is holding an oplock and the server needs to break it before it
> > can reply to yours.
> >
> > Unfortunately I doubt there's much you can do from your client to
> > prevent that (if that is the case). There may be a way to turn off
> > oplocks on the server side, but that may very well be even worse for
> > performance.
>
> Hmmm... We can look at that.
>
> Another interesting tidbit is that I have never seen this from a 64 bit
> Linux kernel. Only occurs with 32 bit kernels it seems.
>

That sounds rather strange. Maybe we do have a bug of some sort? The
thing to do might be to get a binary capture of the 32-bit traffic
around the time of the stalls. We could then inspect the packets and
see whether we have something wrong in there.
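
A rough sketch of how such a capture could be scanned for the stalls,
assuming a classic libpcap file (microsecond timestamps, not pcapng).
The function name and the 4 second threshold are made up for
illustration. It only reports silent periods between consecutive
packets, which is a proxy for a slow reply, not a real request/response
match:

```python
import struct

def find_gaps(pcap_bytes, threshold=4.0):
    """Return (before, after) timestamp pairs for every silent period
    of at least `threshold` seconds between consecutive packets."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"          # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"          # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond pcap file")

    offset = 24               # skip the 24-byte global header
    gaps = []
    prev = None
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_usec, captured len, on-wire len
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset)
        ts = ts_sec + ts_usec / 1e6
        if prev is not None and ts - prev >= threshold:
            gaps.append((prev, ts))
        prev = ts
        offset += 16 + incl_len   # jump over the captured packet data
    return gaps
```

Something like find_gaps(open("capture32.pcap", "rb").read()) would
then list the quiet periods; capture32.pcap is a placeholder name. The
timestamps it returns can be lined up with the same moments in a
server-side capture.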

-- 
Jeff Layton jlay...@redhat.com


Re: [Samba] 2.6.31-rc8: CIFS with 5 seconds hiccups

2009-09-09 Thread Jeff Layton
On Wed, 9 Sep 2009 17:27:57 -0400 (EDT)
Christoph Lameter c...@linux-foundation.org wrote:

> On Wed, 9 Sep 2009, Jeff Layton wrote:
>
> > That sounds rather strange. Maybe we do have a bug of some sort? The
> > thing to do might be to get a binary capture of the 32-bit traffic
> > around the time of the stalls. We could then inspect the packets and
> > see whether we have something wrong in there.
>
> Capture attached.

Well, I can see the delays in the capture, but the snaplen for the
capture is a little too small to tell much else. Can you redo the
capture with a larger snaplen (maybe -s 512 or so)?
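
One way to check whether a capture was cut short by the snapshot
length, sketched in Python. Each record in a classic libpcap file
stores both the captured length and the original on-wire length, so any
record where the first is smaller was truncated. The function name is
made up for illustration, and the sketch assumes the classic
microsecond pcap format rather than pcapng:

```python
import struct

def truncation_summary(pcap_bytes):
    """Return (snaplen, truncated, total) for a classic pcap file."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"          # little-endian file
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"          # big-endian file
    else:
        raise ValueError("not a classic microsecond pcap file")

    # Global header: magic, version major/minor, thiszone, sigfigs,
    # snaplen, link type (24 bytes in total).
    _magic, _vmaj, _vmin, _zone, _sigfigs, snaplen, _linktype = \
        struct.unpack_from(endian + "IHHiIII", pcap_bytes, 0)

    offset = 24
    truncated = total = 0
    while offset + 16 <= len(pcap_bytes):
        _sec, _usec, incl_len, orig_len = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset)
        total += 1
        if incl_len < orig_len:   # fewer bytes saved than were on the wire
            truncated += 1
        offset += 16 + incl_len
    return snaplen, truncated, total
```

If the truncated count is non-zero, the SMB payloads in those packets
are incomplete and the capture is worth redoing with a larger -s value.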

Also, were you able to tell anything from a server-side capture? Is the
server issuing oplock breaks at those times?

Cheers,
Jeff