Hi,
I had a problem with a CephFS freeze on a client: it was impossible to
re-enable the mountpoint. A simple "ls /mnt" command blocked completely
(and of course umount/remount was impossible too), so I had to reboot
the host. But even a "normal" reboot didn't work; the host wouldn't
stop, and I had to do a hard reboot. In short, it was like a big
"NFS" freeze. ;)
In the logs there was nothing relevant on the client side, and just
this line on the cluster side:
~# cat /var/log/ceph/ceph-mds.1.log
[...]
2015-05-14 17:07:17.259866 7f3b5cffc700 0 log_channel(cluster) log [INF] :
closing stale session client.1342358 192.168.21.207:0/519924348 after 301.329013
[...]
And indeed, the freeze was probably triggered by a brief network
interruption.
Here is my configuration:
- OS: Ubuntu 14.04 on the client and on the cluster nodes.
- Kernel: 3.16.0-36-generic on the client and on the cluster nodes
  (apt-get install linux-image-generic-lts-utopic).
- Ceph version: Hammer on the client and on the cluster nodes (0.94.1-1trusty).
On the client, I use the cephfs kernel module (not ceph-fuse). Here
is the fstab line on the client node:
10.0.2.150,10.0.2.151,10.0.2.152:/ /mnt ceph
noatime,noacl,name=cephfs,secretfile=/etc/ceph/secret,_netdev 0 0
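For comparison, if I switched to ceph-fuse, I believe the fstab line
would look something like this (just a sketch on my part; the id= value
is the client name without the "client." prefix, and I haven't verified
the exact option spelling):

```
# Hypothetical ceph-fuse equivalent of the kernel mount above;
# "id=cephfs" and the option list are assumptions, not tested.
id=cephfs,conf=/etc/ceph/ceph.conf  /mnt  fuse.ceph  noatime,_netdev  0 0
```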
My only mds-related setting in ceph.conf is:
mds cache size = 1000000
That's all.
Here are my questions:
1. Is this kind of freeze normal? Can I avoid these freezes with a
more recent kernel on the client?
2. Can I avoid these freezes by using ceph-fuse instead of the kernel
cephfs module? But in that case, won't cephfs performance be worse?
Or am I wrong?
3. Is there a parameter in ceph.conf to tell the mds to be more patient
before closing the "stale session" of a client?
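From what I could find, something like the following might apply; the
default of "mds session autoclose" would match the ~300 s in the log
line above, but I'm not sure these are the right option names or sane
values, hence my question:

```
[mds]
    # Assumed settings, not verified: how long (in seconds) the MDS
    # waits without hearing from a client before considering its
    # session stale, and before closing the session entirely.
    mds session timeout = 120
    mds session autoclose = 600
```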
I'm in a testing period and a hard reboot of my cephfs clients would
be quite annoying for me. Thanks in advance for your help.
--
François Lafont
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com