Re: [ceph-users] How to avoid kernel conflicts
The systems on which the `rbd map` hang occurred are definitely not
under memory stress. I don't believe they are doing a lot of disk I/O
either. Here's the basic set-up:

* all nodes in the "data-plane" are identical
* they each host an OSD instance, sharing one of the drives
* I'm running Docker containers using an RBD volume plugin and Docker Compose
* when the hang happens, the most visible behavior is that `docker ps` hangs
* then I run `systemctl status` and see an `rbd map` process spawned by
  the RBD volume plugin
* I then tried an `strace -f -p <pid>`, and that process promptly exits
  (with RC 0) and the hang resolves itself

I'll try to capture the strace output the next time I run into it and
share it with the mailing list.

Thanks, Ilya.

-kc

> On May 9, 2016, at 2:21 AM, Ilya Dryomov wrote:
>
> On Mon, May 9, 2016 at 12:19 AM, K.C. Wong wrote:
>>
>>> As the tip said, you should not use rbd via kernel module on an OSD host
>>>
>>> However, using it with userspace code (librbd etc, as in kvm) is fine
>>>
>>> Generally, you should not have both:
>>> - "server" in userspace
>>> - "client" in kernelspace
>>
>> If `librbd` would help avoid this problem, then switching to `rbd-fuse`
>> should do the trick, right?
>>
>> The reason for my line of questioning is that I've seen occasional
>> freezes of `rbd map` that are resolved by a 'slight tap' by way of an
>> strace. There is definitely great attractiveness to not having
>> specialized nodes, making every one the same as the next one on the rack.
>
> The problem with placing the kernel client on the OSD node is the
> potential deadlock under heavy I/O when memory becomes scarce. It's
> not recommended, but people are doing it - if you don't stress your
> system too much, it'll never happen.
>
> "rbd map" freeze is definitely not related to the above. Did the actual
> command hang? Could you describe what you saw in more detail and how
> strace helped?
> It could be that you ran into
> http://tracker.ceph.com/issues/14737
>
> Thanks,
>
>                Ilya

K.C. Wong
kcw...@verseon.com
4096R/B8995EDE E527 CBE8 023E 79EA 8BBB 5C77 23A6 92E9 B899 5EDE
hkps://hkps.pool.sks-keyservers.net

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
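For anyone else hitting the same hang, the diagnostic sequence described above can be sketched roughly as follows. The `pgrep` pattern and the log path are assumptions (the original report doesn't name the exact process command line); adjust for your volume plugin:

```shell
# Rough sketch of the diagnostic sequence described in the thread.
# The pgrep pattern and output path are assumptions; adjust as needed.

# 1. Confirm the Docker daemon is wedged (docker ps hangs when it is):
timeout 10 docker ps >/dev/null 2>&1 || echo "docker ps hung or failed"

# 2. Find the rbd map process spawned by the volume plugin:
pid=$(pgrep -f 'rbd map' | head -n 1)

# 3. Attach strace and log to a file to share with the list; as noted
#    above, merely attaching strace has been enough to unstick it:
if [ -n "$pid" ]; then
    strace -f -p "$pid" -o /tmp/rbd-map.strace
else
    echo "no rbd map process found"
fi
```

Logging to a file with `-o` keeps the full output around for the mailing list even if the process exits the moment strace attaches.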
Re: [ceph-users] How to avoid kernel conflicts
On Mon, May 9, 2016 at 12:19 AM, K.C. Wong wrote:
>
>> As the tip said, you should not use rbd via kernel module on an OSD host
>>
>> However, using it with userspace code (librbd etc, as in kvm) is fine
>>
>> Generally, you should not have both:
>> - "server" in userspace
>> - "client" in kernelspace
>
> If `librbd` would help avoid this problem, then switching to `rbd-fuse`
> should do the trick, right?
>
> The reason for my line of questioning is that I've seen occasional
> freezes of `rbd map` that are resolved by a 'slight tap' by way of an
> strace. There is definitely great attractiveness to not having
> specialized nodes, making every one the same as the next one on the rack.

The problem with placing the kernel client on the OSD node is the
potential deadlock under heavy I/O when memory becomes scarce. It's
not recommended, but people are doing it - if you don't stress your
system too much, it'll never happen.

"rbd map" freeze is definitely not related to the above. Did the actual
command hang? Could you describe what you saw in more detail and how
strace helped?

It could be that you ran into http://tracker.ceph.com/issues/14737

Thanks,

                Ilya
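When gathering the "more detail" asked for above, a minimal sketch for capturing where a stuck `rbd map` is sitting in the kernel may help; this is an assumption about what's useful for a tracker report, not something prescribed in the thread, and most of these reads need root:

```shell
# Minimal sketch for capturing evidence of a stuck "rbd map" before
# poking it with strace; most of these steps require root.

pid=$(pgrep -f 'rbd map' | head -n 1)

if [ -n "$pid" ]; then
    # Kernel-side stack of the stuck task -- typically the most useful
    # detail to attach to a tracker report:
    cat /proc/"$pid"/stack
    # Process state: "D" means uninterruptible sleep, i.e. stuck in the kernel
    ps -o pid,stat,wchan:32,cmd -p "$pid"
fi

# Ask the kernel to dump all blocked tasks into the ring buffer
# (requires sysrq to be enabled):
{ echo w > /proc/sysrq-trigger; } 2>/dev/null || echo "sysrq-trigger needs root"
dmesg 2>/dev/null | tail -n 50
```

A task stuck in "D" state with an rbd/libceph function at the top of its stack is exactly the signature the tracker issue above would want to see.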
Re: [ceph-users] How to avoid kernel conflicts
> As the tip said, you should not use rbd via kernel module on an OSD host
>
> However, using it with userspace code (librbd etc, as in kvm) is fine
>
> Generally, you should not have both:
> - "server" in userspace
> - "client" in kernelspace

If `librbd` would help avoid this problem, then switching to `rbd-fuse`
should do the trick, right?

The reason for my line of questioning is that I've seen occasional
freezes of `rbd map` that are resolved by a 'slight tap' by way of an
strace. There is definitely great attractiveness to not having
specialized nodes, making every one the same as the next one on the rack.

Thanks,

-kc

K.C. Wong
kcw...@verseon.com
4096R/B8995EDE E527 CBE8 023E 79EA 8BBB 5C77 23A6 92E9 B899 5EDE
hkps://hkps.pool.sks-keyservers.net
Re: [ceph-users] How to avoid kernel conflicts
As the tip said, you should not use rbd via the kernel module on an OSD host.

However, using it with userspace code (librbd etc., as in kvm) is fine.

Generally, you should not have both:
- "server" in userspace
- "client" in kernelspace

On 07/05/2016 22:13, K.C. Wong wrote:
> Hi,
>
> I saw this tip in the troubleshooting section:
>
>   DO NOT mount kernel clients directly on the same node as your Ceph
>   Storage Cluster, because kernel conflicts can arise. However, you can
>   mount kernel clients within virtual machines (VMs) on a single node.
>
> Does this mean having a converged deployment is a bad idea? Do I really
> need dedicated storage nodes?
>
> By converged, I mean every node hosting an OSD. At the same time,
> workload on the node may mount RBD volumes or access CephFS. Do I have
> to isolate the OSD daemon in its own VM?
>
> Any advice would be appreciated.
>
> -kc
>
> K.C. Wong
> kcw...@verseon.com
> 4096R/B8995EDE E527 CBE8 023E 79EA 8BBB 5C77 23A6 92E9 B899 5EDE
> hkps://hkps.pool.sks-keyservers.net
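To make the kernel-vs-userspace distinction concrete, here is a hedged sketch; pool "rbd" and image "myimage" are placeholders, not names from this thread, and the cluster-dependent commands are guarded or shown as comments:

```shell
# Contrast of kernel vs. userspace RBD clients on an OSD host.
# Pool "rbd" and image "myimage" are placeholders.

# Kernel client -- the thing to AVOID on an OSD host (creates /dev/rbd0
# through the krbd kernel module, i.e. "client" in kernelspace):
#   rbd map rbd/myimage

# Userspace alternative 1: rbd-fuse exposes images as files via FUSE,
# so no RBD client state lives in the host kernel:
if command -v rbd-fuse >/dev/null 2>&1; then
    mkdir -p /mnt/rbd-fuse
    rbd-fuse /mnt/rbd-fuse -p rbd || echo "rbd-fuse failed (no cluster reachable?)"
else
    echo "rbd-fuse not installed on this host"
fi

# Userspace alternative 2: let qemu/KVM open the image through librbd,
# so the guest sees a block device but the OSD host's kernel is never
# the RBD client:
#   qemu-system-x86_64 -m 1024 -drive format=rbd,file=rbd:rbd/myimage
```

This is why the troubleshooting tip allows kernel clients inside VMs: the krbd module then runs in the guest kernel, not in the kernel that also hosts the OSDs.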
[ceph-users] How to avoid kernel conflicts
Hi,

I saw this tip in the troubleshooting section:

  DO NOT mount kernel clients directly on the same node as your Ceph
  Storage Cluster, because kernel conflicts can arise. However, you can
  mount kernel clients within virtual machines (VMs) on a single node.

Does this mean having a converged deployment is a bad idea? Do I really
need dedicated storage nodes?

By converged, I mean every node hosting an OSD. At the same time,
workload on the node may mount RBD volumes or access CephFS. Do I have
to isolate the OSD daemon in its own VM?

Any advice would be appreciated.

-kc

K.C. Wong
kcw...@verseon.com
4096R/B8995EDE E527 CBE8 023E 79EA 8BBB 5C77 23A6 92E9 B899 5EDE
hkps://hkps.pool.sks-keyservers.net