I had a similar problem with some relatively underpowered servers (2x
E5-2603, 6 cores, 1.7 GHz, no HT; 12-14 2 TB OSDs per server; 32 GB RAM).

There was a process on a couple of the servers that would hang and chew up
all available CPU. When that happened, I started getting scrub errors on
those servers.
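
For what it's worth, this is roughly how I correlated the two; treat it as
a sketch, since output and flags can differ between versions:

    # spot the runaway process hogging the CPU
    top -b -n 1 | head -n 20

    # list any PGs that scrubbing has flagged as inconsistent
    ceph health detail | grep -i inconsistent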



On Mon, Mar 5, 2018 at 8:45 AM, Jan Marquardt <j...@artfiles.de> wrote:

> On 05.03.18 at 13:13, Ronny Aasen wrote:
> > I had some similar issues when I started my proof of concept; I remember
> > the snapshot deletion especially well.
> >
> > The rule of thumb for filestore, which I assume you are running, is 1 GB of
> > RAM per TB of OSD. So with 8 x 4 TB OSDs you are looking at 32 GB of RAM
> > for the OSDs, plus a few GB for the mon service and a few GB for the OS
> > itself.
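> >
> > To see how close a node actually is to that, something like this sums the
> > resident memory of all OSD daemons (a sketch; it assumes the daemons run
> > as ceph-osd processes):
> >
> >     ps -C ceph-osd -o rss= | awk '{sum+=$1} END {printf "%.1f GB\n", sum/1024/1024}'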
> >
> > I suspect that if you inspect your dmesg log and memory graphs, you will
> > find that the out-of-memory killer ends your OSDs when the snapshot
> > deletion (or any other high-load task) runs.
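> >
> > Something along these lines should show whether the OOM killer has been at
> > work (-T gives readable timestamps; the exact wording of the kernel
> > messages varies with kernel version):
> >
> >     dmesg -T | grep -Ei 'out of memory|oom-killer|killed process'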
> >
> > I ended up reducing the number of OSDs per node, since the old mainboard I
> > used was already maxed out on memory.
>
> Well, thanks for the broad hint. Somehow I assumed we met the
> recommendations, but of course you are right. We'll check whether our boards
> support 48 GB of RAM. Unfortunately, there are currently no such messages in
> the logs, but I can't rule out that there have been some in the past.
>
> > Corruptions occurred for me as well, and they were normally associated with
> > disks dying or giving read errors. Ceph often managed to fix them, but
> > sometimes I had to just remove the ailing OSD disk.
> >
> > Have some graphs to look at. Personally I used munin/munin-node, since it
> > was just an apt-get away from functioning graphs.
> >
> > I also used smartmontools to send me emails about ailing disks, and
> > smartctl to check all disks for errors.
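> >
> > For reference, roughly how that looked (package names are Debian/Ubuntu;
> > the smartd.conf line is only a sketch, adjust the schedule and mail
> > address to taste):
> >
> >     apt-get install munin munin-node smartmontools
> >
> >     # in /etc/smartd.conf: monitor all disks, run a short self-test daily, mail on trouble
> >     DEVICESCAN -a -o on -S on -s (S/../.././02) -m root@localhost
> >
> >     # manual one-off check of a single disk
> >     smartctl -a /dev/sdX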
>
> I'll check the S.M.A.R.T. data. I am wondering whether scrubbing errors are
> always caused by disk problems or whether they could also be triggered by
> flapping OSDs or other circumstances.
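>
> I suppose I can dig into the individual inconsistencies with something like
> this (just a sketch; the PG id is a placeholder):
>
>     ceph health detail | grep -i inconsistent
>     rados list-inconsistent-obj <pgid> --format=json-pretty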
>
> > Good luck with Ceph!
>
> Thank you!
