I can almost guarantee what you're seeing is PG subfolder splitting.  When
a subfolder in a PG reaches a threshold number of objects, it splits into
16 subfolders.  Every cluster I manage gets blocked requests and OSDs
marked down while this is happening.  To stop the OSDs from being marked
down, I increase osd_heartbeat_grace until they no longer get marked down
during the splitting.  Based on your email, starting at 5 minutes looks
like a good place.  The blocked requests will still persist, but at least
the OSDs won't be marked down regularly, adding peering to the headache.
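
For what it's worth, the split point is governed by
filestore_split_multiple and filestore_merge_threshold (roughly
filestore_split_multiple * abs(filestore_merge_threshold) * 16 objects per
subdirectory, if I remember the formula correctly).  A minimal sketch of
how I bump the grace period, assuming a Jewel-era cluster; 300 seconds is
just an example value, and if I recall correctly the mons consult this
setting too when they process failure reports:

    # inject at runtime on all OSDs
    ceph tell osd.\* injectargs '--osd_heartbeat_grace 300'

    # repeat per monitor (placeholder id)
    ceph tell mon.<id> injectargs '--osd_heartbeat_grace 300'

    # persist it in /etc/ceph/ceph.conf so it survives restarts
    [global]
    osd_heartbeat_grace = 300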

In 10.2.5 and 0.94.9, a way was added to take an OSD offline and tell it to
split the subfolders of its PGs.  I haven't done this myself yet, but I plan
to figure it out the next time I come across this sort of behavior.
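
If I'm reading the release notes right, it's the apply-layout-settings op
in ceph-objectstore-tool.  Roughly something like the below, though I
haven't run it myself yet, so treat it as a sketch (the OSD id, paths and
pool name are placeholders):

    # stop the OSD, pre-split its filestore directories offline, start it again
    systemctl stop ceph-osd@2
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
        --journal-path /var/lib/ceph/osd/ceph-2/journal \
        --op apply-layout-settings --pool <poolname>
    systemctl start ceph-osd@2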

On Wed, Apr 12, 2017 at 8:55 AM Jogi Hofmüller <j...@mur.at> wrote:

> Dear all,
>
> we run a small cluster [1] that is exclusively used for virtualisation
> (kvm/libvirt). Recently we started to run into performance problems
> (slow requests, failing OSDs) for no *obvious* reason (at least not for
> us).
>
> We do nightly snapshots of VM images and keep the snapshots for 14
> days. Currently we run 8 VMs in the cluster.
>
> At first it looked like the problem was related to snapshotting images
> of VMs that were up and running (or to deleting those snapshots
> after 14 days). So we changed the procedure to first suspend the VM and
> then snapshot its image(s). Snapshots are made at 4 am.
>
> When we removed *all* the old snapshots (the ones taken of running VMs)
> the cluster suddenly behaved normally again, but after two days of
> creating snapshots (not deleting any) of suspended VMs, the slow
> requests started again (although far less frequently than before).
>
> This morning we experienced successive failures of 4 of our 6 OSDs
> (e.g. osd.2 IPv4:6800/1621 failed (2 reporters from different host after
> 49.976472 >= grace 46.444312)), resulting in HEALTH_WARN with up to
> about 20% of PGs active+undersized+degraded, stale+active+clean or
> remapped+peering. No OSD failure lasted longer than 4 minutes. After
> 15 minutes everything was back to normal again. The noise started at
> 6:25 am, a time when cron.daily scripts run here.
>
> We have no clue what could have caused this behavior :( There seems to
> be no shortage of resources (CPU, RAM, network) that would explain what
> happened, but maybe we did not look in the right places. So any hint on
> where to look/what to look for would be greatly appreciated :)
>
> [1]  cluster setup
>
> Three nodes: ceph1, ceph2, ceph3
>
> ceph1 and ceph2
>
>     1x Intel(R) Xeon(R) CPU E3-1275 v3 @ 3.50GHz
>     32 GB RAM
>     RAID1 for OS
>     1x Intel 530 Series SSDs (120GB) for Journals
>     3x WDC WD2500BUCT-63TWBY0 for OSDs (1TB)
>     2x Gbit Ethernet bonded (802.3ad) on HP 2920 Stack
>
> ceph3
>
>     virtual machine
>     1 CPU
>     4 GB RAM
>
> Software
>
>     Debian GNU/Linux Jessie (8.7)
>     Kernel 3.16
>     ceph 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)
>
> Ceph Services
>
> 3 Monitors: ceph1, ceph2, ceph3
>
> 6 OSDs: ceph1 (3), ceph2 (3)
>
> Regards,
> --
> J.Hofmüller
>
>            Nisiti
>            - Abie Nathan, 1927-2008
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
