Re: [ceph-users] ceph-fuse slow cache?

2018-09-12 Thread Stefan Kooman
Quoting Yan, Zheng (uker...@gmail.com):
> please add the '-f' option (trace child processes' syscalls) to strace,

Good suggestion. We now see all apache child processes doing their thing.
We have been stracing / debugging this issue on and off, but nothing
obvious has turned up. We are still trying to get a grip on it. Some
websites are not "affected" and are just quick; others have these "stalls".

As soon as we have more ceph related info we will let you know.

Gr. Stefan

-- 
| BIT BV  http://www.bit.nl/Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-fuse slow cache?

2018-08-27 Thread Yan, Zheng
On Mon, Aug 27, 2018 at 4:47 AM Stefan Kooman  wrote:
>
> Hi,
>
> Quoting Yan, Zheng (uker...@gmail.com):
> > Could you strace the apache process and check which syscall waits for a long time.
>
> Yes, that's how I did all the tests (strace -t -T apache2 -X). With
> debug=20 (ceph-fuse) you see apache waiting for almost 20 seconds before it 
> starts serving data:
>
> 13:33:55 accept4(4, {sa_family=AF_INET6, sin6_port=htons(36829), 
> inet_pton(AF_INET6, ":::213.136.12.151", _addr), sin6_flowinfo=0, 
> sin6_scope_id=0}, [28], SOCK_CLOEXEC) = 24 <1.218381>
> 13:33:56 getsockname(24, {sa_family=AF_INET6, sin6_port=htons(80), 
> inet_pton(AF_INET6, ":::10.5.80.8", _addr), sin6_flowinfo=0, 
> sin6_scope_id=0}, [28]) = 0 <0.000113>
> 13:33:56 fcntl(24, F_GETFL) = 0x2 (flags O_RDWR) <0.72>
> 13:33:56 fcntl(24, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.25>
> 13:33:56 clone(child_stack=0, 
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
> child_tidptr=0x7f132cf6aa50) = 24260 <0.001235>
> 13:33:56 wait4(24260, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 
> 24260 <19.424578>
> 13:34:15 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24260, 
> si_uid=5003, si_status=0, si_utime=35, si_stime=27} ---
> 13:34:15 close(24)  = 0 <0.60>
> 13:34:15 clone(child_stack=0, 
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
> child_tidptr=0x7f132cf6aa50) = 24263 <0.001003>
> 13:34:15 wait4(24263, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 
> 24263 <5.043079>
> 13:34:20 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24263, 
> si_uid=5003, si_status=0, si_utime=0, si_stime=0} ---
> 13:34:20 close(24)  = 0 <0.96>
>
> Gr. Stefan
>

please add the '-f' option (trace child processes' syscalls) to strace,
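For concreteness, a sketch of a full invocation (the PID lookup and the trace file path are placeholders), plus an awk filter to surface the slow syscalls afterwards:

```shell
# Follow forked children (-f), wall-clock timestamps (-tt), per-syscall
# elapsed time (-T). Attaching by PID is just one option; prefixing
# "apache2 -X" with strace as before works too.
strace -f -tt -T -o /tmp/apache.trace -p "$(pidof -s apache2)"

# Afterwards, print only syscalls whose trailing <elapsed> exceeds one second.
awk -F'[<>]' 'NF > 2 && $(NF-1) + 0 > 1.0' /tmp/apache.trace
```

Applied to the trace quoted above, the filter would single out the two slow wait4() calls (19.4 s and 5.0 s).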



Re: [ceph-users] ceph-fuse slow cache?

2018-08-27 Thread Stefan Kooman
Hi,

Quoting Yan, Zheng (uker...@gmail.com):
> Could you strace the apache process and check which syscall waits for a long time.

Yes, that's how I did all the tests (strace -t -T apache2 -X). With
debug=20 (ceph-fuse) you see apache waiting for almost 20 seconds before it 
starts serving data:

13:33:55 accept4(4, {sa_family=AF_INET6, sin6_port=htons(36829), 
inet_pton(AF_INET6, ":::213.136.12.151", _addr), sin6_flowinfo=0, 
sin6_scope_id=0}, [28], SOCK_CLOEXEC) = 24 <1.218381>
13:33:56 getsockname(24, {sa_family=AF_INET6, sin6_port=htons(80), 
inet_pton(AF_INET6, ":::10.5.80.8", _addr), sin6_flowinfo=0, 
sin6_scope_id=0}, [28]) = 0 <0.000113>
13:33:56 fcntl(24, F_GETFL) = 0x2 (flags O_RDWR) <0.72>
13:33:56 fcntl(24, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.25>
13:33:56 clone(child_stack=0, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7f132cf6aa50) = 24260 <0.001235>
13:33:56 wait4(24260, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 24260 
<19.424578>
13:34:15 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24260, 
si_uid=5003, si_status=0, si_utime=35, si_stime=27} ---
13:34:15 close(24)  = 0 <0.60>
13:34:15 clone(child_stack=0, 
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x7f132cf6aa50) = 24263 <0.001003>
13:34:15 wait4(24263, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 24263 
<5.043079>
13:34:20 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=24263, 
si_uid=5003, si_status=0, si_utime=0, si_stime=0} ---
13:34:20 close(24)  = 0 <0.96>

Gr. Stefan



Re: [ceph-users] ceph-fuse slow cache?

2018-08-26 Thread Yan, Zheng
Could you strace the apache process and check which syscall waits for a long time.
On Sat, Aug 25, 2018 at 3:04 AM Stefan Kooman  wrote:
>
> Quoting Gregory Farnum (gfar...@redhat.com):
>
> > Hmm, these aren't actually the start and end times to the same operation.
> > put_inode() is literally adjusting a refcount, which can happen for reasons
> > ranging from the VFS doing something that drops it to an internal operation
> > completing to a response coming back from the MDS. You should be able to
> > find requests coming in from the kernel and a response going back out (the
> > function names will be prefixed with "ll_", eg "ll_lookup").
>
> Ok, next try. One page visit and grepped "sess_". See results below:
>
> 2018-08-25 11:10:42.142 7ff2a0c2c700  3 client.15224195 ll_lookup 
> 0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5
> 2018-08-25 11:10:42.142 7ff2a0c2c700 10 client.15224195 _lookup concluded 
> ENOENT locally for 0x168547c.head(faked_ino=0 ref=2 ll_ref=4 cap_refs={} 
> open={} mode=40755 size=0/0 nlink=1 btime=2018-08-23 14:23:09.291832 
> mtime=2018-08-25 11:08:11.025085 ctime=2018-08-25 11:08:11.025085 
> caps=pAsLsXsFs(0=pAsLsXsFs) COMPLETE parents=0x15218bb.head["sessions"] 
> 0x557ddecc6100) dn 'sess_bss8rh5bug6bffvj6s67vnsce5'
> 2018-08-25 11:10:42.142 7ff2a0c2c700  3 client.15224195 ll_lookup 
> 0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5 -> -2 (0)
> 2018-08-25 11:10:42.142 7ff2a142d700  3 client.15224195 ll_lookup 
> 0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5
> 2018-08-25 11:10:42.142 7ff2a142d700 10 client.15224195 _lookup concluded 
> ENOENT locally for 0x168547c.head(faked_ino=0 ref=2 ll_ref=5 cap_refs={} 
> open={} mode=40755 size=0/0 nlink=1 btime=2018-08-23 14:23:09.291832 
> mtime=2018-08-25 11:08:11.025085 ctime=2018-08-25 11:08:11.025085 
> caps=pAsLsXsFs(0=pAsLsXsFs) COMPLETE parents=0x15218bb.head["sessions"] 
> 0x557ddecc6100) dn 'sess_bss8rh5bug6bffvj6s67vnsce5'
> 2018-08-25 11:10:42.142 7ff2a142d700  3 client.15224195 ll_lookup 
> 0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5 -> -2 (0)
> 2018-08-25 11:10:42.142 7ff2a0c2c700  8 client.15224195 _ll_create 
> 0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5 0100600 131138, uid 5003, 
> gid 5003
> 2018-08-25 11:10:42.142 7ff2a0c2c700 10 client.15224195 _lookup concluded 
> ENOENT locally for 0x168547c.head(faked_ino=0 ref=2 ll_ref=5 cap_refs={} 
> open={} mode=40755 size=0/0 nlink=1 btime=2018-08-23 14:23:09.291832 
> mtime=2018-08-25 11:08:11.025085 ctime=2018-08-25 11:08:11.025085 
> caps=pAsLsXsFs(0=pAsLsXsFs) COMPLETE parents=0x15218bb.head["sessions"] 
> 0x557ddecc6100) dn 'sess_bss8rh5bug6bffvj6s67vnsce5'
>
> ^^ check for an existing session file, which does not exist yet.
>
> 2018-08-25 11:10:42.142 7ff2a0c2c700  8 client.15224195 _create(0x168547c 
> sess_bss8rh5bug6bffvj6s67vnsce5, 0100600)
> 2018-08-25 11:10:42.142 7ff2a0c2c700 20 client.15224195 get_or_create 
> 0x168547c.head(faked_ino=0 ref=3 ll_ref=5 cap_refs={} open={} mode=40755 
> size=0/0 nlink=1 btime=2018-08-23 14:23:09.291832 mtime=2018-08-25 
> 11:08:11.025085 ctime=2018-08-25 11:08:11.025085 caps=pAsLsXsFs(0=pAsLsXsFs) 
> COMPLETE parents=0x15218bb.head["sessions"] 0x557ddecc6100) name 
> sess_bss8rh5bug6bffvj6s67vnsce5
> 2018-08-25 11:10:42.142 7ff2a0c2c700 15 client.15224195 link dir 
> 0x557ddecc6100 'sess_bss8rh5bug6bffvj6s67vnsce5' to inode 0 dn 0x557ddf053de0 
> (new dn)
> 2018-08-25 11:10:42.142 7ff2a0c2c700 20 client.15224195 choose_target_mds 
> inode dir hash is 2 on sess_bss8rh5bug6bffvj6s67vnsce5 => 3288169645
> 2018-08-25 11:10:42.142 7ff2a0c2c700 10 client.15224195 send_request 
> client_request(unknown.0:872 create 
> #0x168547c/sess_bss8rh5bug6bffvj6s67vnsce5 2018-08-25 11:10:42.148081 
> caller_uid=5003, caller_gid=5003{5003,}) v4 to mds.0
> 2018-08-25 11:10:42.142 7ff2a0c2c700  1 -- [2001:7b8:81:6::18]:0/4024171085 
> --> [2001:7b8:80:3:0:2c:3:2]:6800/1086374448 -- client_request(unknown.0:872 
> create #0x168547c/sess_bss8rh5bug6bffvj6s67vnsce5 2018-08-25 
> 11:10:42.148081 caller_uid=5003, caller_gid=5003{5003,}) v4 -- 0x557ddf884080 
> con 0
> 2018-08-25 11:10:42.142 7ff2a5c36700 12 client.15224195 insert_dentry_inode 
> 'sess_bss8rh5bug6bffvj6s67vnsce5' vino 0x16d9844.head in dir 
> 0x168547c.head dn 0x557ddf053de0
> 2018-08-25 11:10:42.142 7ff2a5c36700 15 client.15224195 link dir 
> 0x557ddecc6100 'sess_bss8rh5bug6bffvj6s67vnsce5' to inode 0x557ddfaa0680 dn 
> 0x557ddf053de0 (old dn)
> 2018-08-25 11:10:42.142 7ff2a5c36700 10 client.15224195 put_inode on 
> 0x16d9844.head(faked_ino=0 ref=3 ll_ref=0 cap_refs={} open={} mode=100600 
> size=0/4194304 nlink=1 btime=2018-08-25 11:10:42.148081 mtime=2018-08-25 
> 11:10:42.148081 ctime=2018-08-25 11:10:42.148081 
> caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x16d9844 ts 0/0 
> objects 0 dirty_or_tx 0] 
> parents=0x168547c.head["sess_bss8rh5bug6bffvj6s67vnsce5"] 0x557ddfaa0680)
> 2018-08-25 

Re: [ceph-users] ceph-fuse slow cache?

2018-08-25 Thread Stefan Kooman
Quoting Gregory Farnum (gfar...@redhat.com):

> Hmm, these aren't actually the start and end times to the same operation.
> put_inode() is literally adjusting a refcount, which can happen for reasons
> ranging from the VFS doing something that drops it to an internal operation
> completing to a response coming back from the MDS. You should be able to
> find requests coming in from the kernel and a response going back out (the
> function names will be prefixed with "ll_", eg "ll_lookup").

Ok, next try. One page visit and grepped "sess_". See results below:

2018-08-25 11:10:42.142 7ff2a0c2c700  3 client.15224195 ll_lookup 
0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5
2018-08-25 11:10:42.142 7ff2a0c2c700 10 client.15224195 _lookup concluded 
ENOENT locally for 0x168547c.head(faked_ino=0 ref=2 ll_ref=4 cap_refs={} 
open={} mode=40755 size=0/0 nlink=1 btime=2018-08-23 14:23:09.291832 
mtime=2018-08-25 11:08:11.025085 ctime=2018-08-25 11:08:11.025085 
caps=pAsLsXsFs(0=pAsLsXsFs) COMPLETE parents=0x15218bb.head["sessions"] 
0x557ddecc6100) dn 'sess_bss8rh5bug6bffvj6s67vnsce5'
2018-08-25 11:10:42.142 7ff2a0c2c700  3 client.15224195 ll_lookup 
0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5 -> -2 (0)
2018-08-25 11:10:42.142 7ff2a142d700  3 client.15224195 ll_lookup 
0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5
2018-08-25 11:10:42.142 7ff2a142d700 10 client.15224195 _lookup concluded 
ENOENT locally for 0x168547c.head(faked_ino=0 ref=2 ll_ref=5 cap_refs={} 
open={} mode=40755 size=0/0 nlink=1 btime=2018-08-23 14:23:09.291832 
mtime=2018-08-25 11:08:11.025085 ctime=2018-08-25 11:08:11.025085 
caps=pAsLsXsFs(0=pAsLsXsFs) COMPLETE parents=0x15218bb.head["sessions"] 
0x557ddecc6100) dn 'sess_bss8rh5bug6bffvj6s67vnsce5'
2018-08-25 11:10:42.142 7ff2a142d700  3 client.15224195 ll_lookup 
0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5 -> -2 (0)
2018-08-25 11:10:42.142 7ff2a0c2c700  8 client.15224195 _ll_create 
0x168547c.head sess_bss8rh5bug6bffvj6s67vnsce5 0100600 131138, uid 5003, 
gid 5003
2018-08-25 11:10:42.142 7ff2a0c2c700 10 client.15224195 _lookup concluded 
ENOENT locally for 0x168547c.head(faked_ino=0 ref=2 ll_ref=5 cap_refs={} 
open={} mode=40755 size=0/0 nlink=1 btime=2018-08-23 14:23:09.291832 
mtime=2018-08-25 11:08:11.025085 ctime=2018-08-25 11:08:11.025085 
caps=pAsLsXsFs(0=pAsLsXsFs) COMPLETE parents=0x15218bb.head["sessions"] 
0x557ddecc6100) dn 'sess_bss8rh5bug6bffvj6s67vnsce5'

^^ check for an existing session file, which does not exist yet.

2018-08-25 11:10:42.142 7ff2a0c2c700  8 client.15224195 _create(0x168547c 
sess_bss8rh5bug6bffvj6s67vnsce5, 0100600)
2018-08-25 11:10:42.142 7ff2a0c2c700 20 client.15224195 get_or_create 
0x168547c.head(faked_ino=0 ref=3 ll_ref=5 cap_refs={} open={} mode=40755 
size=0/0 nlink=1 btime=2018-08-23 14:23:09.291832 mtime=2018-08-25 
11:08:11.025085 ctime=2018-08-25 11:08:11.025085 caps=pAsLsXsFs(0=pAsLsXsFs) 
COMPLETE parents=0x15218bb.head["sessions"] 0x557ddecc6100) name 
sess_bss8rh5bug6bffvj6s67vnsce5
2018-08-25 11:10:42.142 7ff2a0c2c700 15 client.15224195 link dir 0x557ddecc6100 
'sess_bss8rh5bug6bffvj6s67vnsce5' to inode 0 dn 0x557ddf053de0 (new dn)
2018-08-25 11:10:42.142 7ff2a0c2c700 20 client.15224195 choose_target_mds inode 
dir hash is 2 on sess_bss8rh5bug6bffvj6s67vnsce5 => 3288169645
2018-08-25 11:10:42.142 7ff2a0c2c700 10 client.15224195 send_request 
client_request(unknown.0:872 create 
#0x168547c/sess_bss8rh5bug6bffvj6s67vnsce5 2018-08-25 11:10:42.148081 
caller_uid=5003, caller_gid=5003{5003,}) v4 to mds.0
2018-08-25 11:10:42.142 7ff2a0c2c700  1 -- [2001:7b8:81:6::18]:0/4024171085 --> 
[2001:7b8:80:3:0:2c:3:2]:6800/1086374448 -- client_request(unknown.0:872 create 
#0x168547c/sess_bss8rh5bug6bffvj6s67vnsce5 2018-08-25 11:10:42.148081 
caller_uid=5003, caller_gid=5003{5003,}) v4 -- 0x557ddf884080 con 0
2018-08-25 11:10:42.142 7ff2a5c36700 12 client.15224195 insert_dentry_inode 
'sess_bss8rh5bug6bffvj6s67vnsce5' vino 0x16d9844.head in dir 
0x168547c.head dn 0x557ddf053de0
2018-08-25 11:10:42.142 7ff2a5c36700 15 client.15224195 link dir 0x557ddecc6100 
'sess_bss8rh5bug6bffvj6s67vnsce5' to inode 0x557ddfaa0680 dn 0x557ddf053de0 
(old dn)
2018-08-25 11:10:42.142 7ff2a5c36700 10 client.15224195 put_inode on 
0x16d9844.head(faked_ino=0 ref=3 ll_ref=0 cap_refs={} open={} mode=100600 
size=0/4194304 nlink=1 btime=2018-08-25 11:10:42.148081 mtime=2018-08-25 
11:10:42.148081 ctime=2018-08-25 11:10:42.148081 
caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x16d9844 ts 0/0 
objects 0 dirty_or_tx 0] 
parents=0x168547c.head["sess_bss8rh5bug6bffvj6s67vnsce5"] 0x557ddfaa0680)
2018-08-25 11:10:42.142 7ff2a5c36700 20 client.15224195 link  inode 
0x557ddfaa0680 parents now 0x168547c.head["sess_bss8rh5bug6bffvj6s67vnsce5"]
2018-08-25 11:10:42.142 7ff2a5c36700 10 client.15224195 put_inode on 
0x16d9844.head(faked_ino=0 ref=2 ll_ref=0 cap_refs={} open={} mode=100600 

Re: [ceph-users] ceph-fuse slow cache?

2018-08-24 Thread Gregory Farnum
On Fri, Aug 24, 2018 at 1:20 AM Stefan Kooman  wrote:

> Hi Gregory,
>
> Quoting Gregory Farnum (gfar...@redhat.com):
> > This is quite strange. Given that you have a log, I think what you want
> to
> > do is find one request in the log, trace it through its lifetime, and see
> > where the time is elapsed. You may find a bifurcation, where some
> > categories of requests happen instantly but other categories take a
> second
> > or more; focus on the second, obviously.
>
> So that is what I did. Turns out it's not the (slow) cache at all, probably
> not to your surprise. The reads are quite fast actually; compared to the
> kernel client it's ~8 ms slower, or ~40%. It looks like a couple of
> writes / updates, to at least a session file, are slow:
>
> 2018-08-23 16:40:25.631 7f79156a8700 10 client.15158830 put_inode on
> 0x1693859.head(faked_ino=0 ref=5 ll_ref=1 cap_refs={} open={3=1}
> mode=100600 size=0/4194304 nlink=1 btime=2018-08-23 16:40:25.632601
> mtime=2018-08-23 16:40:25.632601 ctime=2018-08-23 16:40:25.632601
> caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x1693859 ts 0/0
> objects 0 dirty_or_tx 0]
> parents=0x168547c.head["sess_ia0agoj01ul4rob7ji55ouca41"]
> 0x5646ff0e8000)
>
> 2018-08-23 16:40:28.547 7f79156a8700 10 client.15158830
> update_inode_file_time 0x1693859.head(faked_ino=0 ref=4 ll_ref=1
> cap_refs={} open={3=1} mode=100600 size=0/4194304 nlink=1
> btime=2018-08-23 16:40:25.632601 mtime=2018-08-23 16:40:25.632601
> ctime=2018-08-23 16:40:25.632601
> caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x1693859 ts 0/0
> objects 0 dirty_or_tx 0]
> parents=0x168547c.head["sess_ia0agoj01ul4rob7ji55ouca41"]
> 0x5646ff0e8000) pAsxLsXsxFsxcrwb ctime 2018-08-23 16:40:25.632601 mtime
> 2018-08-23 16:40:25.632601
>

Hmm, these aren't actually the start and end times to the same operation.
put_inode() is literally adjusting a refcount, which can happen for reasons
ranging from the VFS doing something that drops it to an internal operation
completing to a response coming back from the MDS. You should be able to
find requests coming in from the kernel and a response going back out (the
function names will be prefixed with "ll_", eg "ll_lookup").


>
> So, almost 3 seconds. The page is only served after this and, possibly, after
> some cache files have been written. Note though that ceph-fuse is in
> debug=20 mode. Apparently the kernel client is _much_ faster at writing
> than ceph-fuse. If I write a file with "dd" (from /dev/urandom) it's in
> the tens of milliseconds range, not seconds. atime / ctime changes are
> handled in < 5 ms.
>
> I wonder if tuning file-striping [1] with stripe units of 4KB would be
> beneficial in this case.
>
> Gr. Stefan
>
> [1]: http://docs.ceph.com/docs/master/dev/file-striping/
>
> --
> | BIT BV  http://www.bit.nl/Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688
> <+31%20318%20648%20688> / i...@bit.nl
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-fuse slow cache?

2018-08-24 Thread Stefan Kooman
Hi Gregory,

Quoting Gregory Farnum (gfar...@redhat.com):
> This is quite strange. Given that you have a log, I think what you want to
> do is find one request in the log, trace it through its lifetime, and see
> where the time is elapsed. You may find a bifurcation, where some
> categories of requests happen instantly but other categories take a second
> or more; focus on the second, obviously.

So that is what I did. Turns out it's not the (slow) cache at all, probably
not to your surprise. The reads are quite fast actually; compared to the
kernel client it's ~8 ms slower, or ~40%. It looks like a couple of
writes / updates, to at least a session file, are slow:

2018-08-23 16:40:25.631 7f79156a8700 10 client.15158830 put_inode on
0x1693859.head(faked_ino=0 ref=5 ll_ref=1 cap_refs={} open={3=1}
mode=100600 size=0/4194304 nlink=1 btime=2018-08-23 16:40:25.632601
mtime=2018-08-23 16:40:25.632601 ctime=2018-08-23 16:40:25.632601
caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x1693859 ts 0/0
objects 0 dirty_or_tx 0]
parents=0x168547c.head["sess_ia0agoj01ul4rob7ji55ouca41"]
0x5646ff0e8000)

2018-08-23 16:40:28.547 7f79156a8700 10 client.15158830
update_inode_file_time 0x1693859.head(faked_ino=0 ref=4 ll_ref=1
cap_refs={} open={3=1} mode=100600 size=0/4194304 nlink=1
btime=2018-08-23 16:40:25.632601 mtime=2018-08-23 16:40:25.632601
ctime=2018-08-23 16:40:25.632601
caps=pAsxLsXsxFsxcrwb(0=pAsxLsXsxFsxcrwb) objectset[0x1693859 ts 0/0
objects 0 dirty_or_tx 0]
parents=0x168547c.head["sess_ia0agoj01ul4rob7ji55ouca41"]
0x5646ff0e8000) pAsxLsXsxFsxcrwb ctime 2018-08-23 16:40:25.632601 mtime
2018-08-23 16:40:25.632601

So, almost 3 seconds. The page is only served after this and, possibly, after
some cache files have been written. Note though that ceph-fuse is in
debug=20 mode. Apparently the kernel client is _much_ faster at writing
than ceph-fuse. If I write a file with "dd" (from /dev/urandom) it's in
the tens of milliseconds range, not seconds. atime / ctime changes are
handled in < 5 ms.

I wonder if tuning file-striping [1] with stripe units of 4KB would be
beneficial in this case.

Gr. Stefan

[1]: http://docs.ceph.com/docs/master/dev/file-striping/
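If we end up trying that, file layouts are set per directory via extended attributes before new files are created in it. A sketch only: the mount point is a placeholder, and note that Ceph may reject stripe units below 64 KiB, so the 4 KB idea itself may need checking first:

```shell
# Hypothetical: set a 64 KiB stripe unit on the sessions directory.
# Layout xattrs only affect files created after the change.
setfattr -n ceph.dir.layout.stripe_unit -v 65536 /mnt/cephfs/sessions

# Verify the resulting layout.
getfattr -n ceph.dir.layout /mnt/cephfs/sessions
```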



Re: [ceph-users] ceph-fuse slow cache?

2018-08-22 Thread Gregory Farnum
This is quite strange. Given that you have a log, I think what you want to
do is find one request in the log, trace it through its lifetime, and see
where the time is elapsed. You may find a bifurcation, where some
categories of requests happen instantly but other categories take a second
or more; focus on the second, obviously.
-Greg
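Mechanically, one way to find where the time goes is to flag every log line that follows a gap. A sketch: it assumes the "date time thread ..." line format seen in the excerpts in this thread, the one-second threshold is arbitrary, and it does not handle midnight rollover:

```shell
# Print each ceph-fuse debug log line preceded by a gap of more than 1s,
# with the gap length prepended. Timestamp is field 2 as "HH:MM:SS.frac".
awk '{
  split($2, t, ":");
  now = t[1] * 3600 + t[2] * 60 + t[3];
  if (NR > 1 && now - prev > 1.0)
    printf "+%.3fs %s\n", now - prev, $0;
  prev = now;
}' ceph-client.log
```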

On Tue, Aug 21, 2018 at 1:31 PM Stefan Kooman  wrote:

> Hi,
>
> I'm trying to find out why ceph-fuse client(s) are slow. Luminous 12.2.7
> Ceph cluster, Mimic 13.2.1 ceph-fuse client. Ubuntu xenial,
> 4.13.0-38-generic kernel.
>
> Test case:
> 25 curl requests directed at a single threaded apache process (apache2
> -X).
>
> When the requests are handled by the ceph kernel client it takes about 1.5
> seconds for the first "GET /" (all fs inode / dentry caches dropped
> beforehand). Subsequent requests only take ~0.4 seconds, so the fs caches
> seem to do their job.
>
> With ceph-fuse it's a bit different. The first and subsequent requests
> ("GET /") take around 4-5 seconds every single time, as if the
> ceph-fuse / vfs cache does not work. "ceph daemon client.id dump_cache"
> shows that all dentries and inodes are in the cache, ~589 of them. The
> relevant part of one of them:
>
> "caps": [
> {
> "mds": 0,
> "auth": 1,
> "mds": 0,
> "ino": "0x1096a6a",
> "cap_id": 50538956,
> "issued": "pAsLsXsFscr",
> "wanted": "-",
> "seq": 1,
> "issue_seq": 1,
> "mseq": 0,
> "gen": 0
> }
> ],
> "auth_cap": 0,
> "dirty_caps": "-",
> "shared_gen": 1,
> "cache_gen": 1,
> "hold_caps_until": "0.00",
>
> Not sure what the "issued" caps indicate. This client is currently the only
> client for this directory, so the cache should not be invalidated.
> On the MDS side I hardly see any requests from this client (I checked
> "dump_ops_in_flight" every second), so I guess they should come from the
> cache. But why does it take so long? I have run ceph-fuse in debug mode
> (--debug-client=20), but this of course results in a lot of output, and I'm
> not sure what to look for.
>
> Watching "mds_requests" on the client every second does not show any
> request.
>
> I know the performance of the ceph kernel client is (much) better than
> ceph-fuse, but does this also apply to objects in the cache?
>
> Thanks for any hints.
>
> Gr. Stefan
>
> P.s. ceph-fuse luminous client 12.2.7 shows the same result. The only active
> MDS server has 256 GB of cache and hardly any load, so most inodes / dentries
> should be cached there as well.
>
>


[ceph-users] ceph-fuse slow cache?

2018-08-21 Thread Stefan Kooman
Hi,

I'm trying to find out why ceph-fuse client(s) are slow. Luminous 12.2.7
Ceph cluster, Mimic 13.2.1 ceph-fuse client. Ubuntu xenial, 4.13.0-38-generic 
kernel.

Test case:
25 curl requests directed at a single threaded apache process (apache2
-X).
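For the record, the request loop is the obvious one; a sketch (the URL is a placeholder for the site being tested):

```shell
# Fire 25 sequential requests and print the total time of each one.
for i in $(seq 1 25); do
  curl -s -o /dev/null -w '%{time_total}\n' http://localhost/
done
```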

When the requests are handled by the ceph kernel client it takes about 1.5
seconds for the first "GET /" (all fs inode / dentry caches dropped
beforehand). Subsequent requests only take ~0.4 seconds, so the fs caches
seem to do their job.

With ceph-fuse it's a bit different. The first and subsequent requests
("GET /") take around 4-5 seconds every single time, as if the
ceph-fuse / vfs cache does not work. "ceph daemon client.id dump_cache"
shows that all dentries and inodes are in the cache, ~589 of them. The
relevant part of one of them:

"caps": [
{
"mds": 0,
"auth": 1,
"mds": 0,
"ino": "0x1096a6a",
"cap_id": 50538956,
"issued": "pAsLsXsFscr",
"wanted": "-",
"seq": 1,
"issue_seq": 1,
"mseq": 0,
"gen": 0
}
],
"auth_cap": 0,
"dirty_caps": "-",
"shared_gen": 1,
"cache_gen": 1,
"hold_caps_until": "0.00",

Not sure what the "issued" caps indicate. This client is currently the only
client for this directory, so the cache should not be invalidated.
On the MDS side I hardly see any requests from this client (I checked
"dump_ops_in_flight" every second), so I guess they should come from the
cache. But why does it take so long? I have run ceph-fuse in debug mode
(--debug-client=20), but this of course results in a lot of output, and I'm not
sure what to look for.

Watching "mds_requests" on the client every second does not show any request.
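The polling can be sketched like this (the .asok paths are placeholders; check /var/run/ceph/ for the real socket names on your hosts):

```shell
# Poll the client and MDS admin sockets once a second.
while sleep 1; do
  ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok mds_requests
  ceph --admin-daemon /var/run/ceph/ceph-mds.a.asok dump_ops_in_flight
done

# The dump_cache output can be summarised with a rough inode count:
ceph --admin-daemon /var/run/ceph/ceph-client.admin.asok dump_cache > /tmp/cache.json
grep -c '"ino"' /tmp/cache.json
```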

I know the performance of the ceph kernel client is (much) better than
ceph-fuse, but does this also apply to objects in the cache?

Thanks for any hints.

Gr. Stefan

P.s. ceph-fuse luminous client 12.2.7 shows the same result. The only active
MDS server has 256 GB of cache and hardly any load, so most inodes / dentries
should be cached there as well.

