On Wed, Aug 8, 2018 at 4:46 PM Jake Grimmett <j...@mrc-lmb.cam.ac.uk> wrote:
>
> Hi John,
>
> With regard to memory pressure: does the cephfs fuse client also cause a
> deadlock, or is this just the kernel client?

TBH, I'm not expert enough on the kernel-side implementation of fuse to
say. Ceph does have the fuse_disable_pagecache option, which might reduce
the probability of issues if you're committed to running clients and
servers on the same node.
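
For a ceph-fuse client that would be something along these lines in
ceph.conf (untested sketch -- double-check the option name and default
against the docs for your release before relying on it):

  [client]
  # assumes a co-located ceph-fuse client; bypasses the FUSE page cache
  # so dirty client pages can't build up on a memory-starved OSD node
  fuse_disable_pagecache = true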

> We run the fuse client on ten OSD nodes, and use parsync (parallel
> rsync) to back up two beegfs systems (~1PB).
>
> Ordinarily fuse works OK, but any OSD problems can cause an out of
> memory error on other osd threads as they recover, e.g.:
>
> kernel: [<ffffffff9cf98906>] out_of_memory+0x4b6/0x4f0
> kernel: Out of memory: Kill process 1927903 (ceph-osd) score 27 or
> sacrifice child
>
> Limiting the bluestore cache (as follows) prevents the OOM error, and
> allows us to run the cephfs fuse client reliably:
>
> bluestore_cache_size = 209715200
> bluestore_cache_kv_max = 134217728
>
> We have 45 OSDs per box, 128GB RAM, dual E5-2620 v4,
> mimic 13.2.1. A load average of 16 or so is normal...
>
> Could our OOM errors (with a default config) be caused by us running
> cephfs fuse on the osd servers?

I wouldn't rule it out, but this is also a pretty high density of OSDs
per node to begin with. If each OSD is at least a few terabytes, you're
on the wrong side of the rule of thumb on resources (1GB RAM per TB of
OSD storage). I'd also be concerned about having only one quarter of a
CPU core for each OSD.
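
As a rough back-of-the-envelope check (sketch only, using just the
numbers in your mail):

  128GB RAM / 45 OSDs               ~= 2.8GB per OSD
  at ~1GB RAM per TB of OSD storage -> comfortable only if each OSD is
                                       under roughly 3TB, before the OS,
                                       page cache and ceph-fuse take
                                       their share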

Sounds like you've got your settings tuned to something that's working
in practice though, so I wouldn't mess with it :-)

John

>
> many thanks!
>
> Jake
>
> On 07/08/18 20:36, John Spray wrote:
> > On Tue, Aug 7, 2018 at 5:42 PM Reed Dier <reed.d...@focusvq.com> wrote:
> >>
> >> This is the first I am hearing about this as well.
> >
> > This is not a Ceph-specific thing -- it can also affect similar
> > systems like Lustre.
> >
> > The classic case is when, under some memory pressure, the kernel tries
> > to free memory by flushing the client's page cache, but doing the
> > flush means allocating more memory on the server, making the memory
> > pressure worse, until the whole thing just seizes up.
> >
> > John
> >
> >> Granted, I am using ceph-fuse rather than the kernel client at this point,
> >> but that isn’t etched in stone.
> >>
> >> Curious if there is more to share.
> >>
> >> Reed
> >>
> >> On Aug 7, 2018, at 9:47 AM, Webert de Souza Lima <webert.b...@gmail.com>
> >> wrote:
> >>
> >>
> >> Yan, Zheng <uker...@gmail.com> wrote on Tue, Aug 7, 2018 at 7:51 PM:
> >>>
> >>> On Tue, Aug 7, 2018 at 7:15 PM Zhenshi Zhou <deader...@gmail.com> wrote:
> >>> This can cause memory deadlock. You should avoid doing this.
> >>>
> >>>> Yan, Zheng <uker...@gmail.com> wrote on Tue, Aug 7, 2018 at 19:12:
> >>>>>
> >>>>> did you mount cephfs on the same machines that run ceph-osd?
> >>>>>
> >>
> >>
> >> I didn't know about this. I run this setup in production. :P
> >>
> >> Regards,
> >>
> >> Webert Lima
> >> DevOps Engineer at MAV Tecnologia
> >> Belo Horizonte - Brasil
> >> IRC NICK - WebertRLZ

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com