Hello,

I've been trying to nail down a nasty performance issue related to
scrubbing. I am mostly using radosgw, with a handful of buckets containing
millions of objects of various sizes. Whenever Ceph scrubs, both regular
and deep, radosgw blocks on external requests, and the cluster accumulates
a pile of requests that have been blocked for > 32 seconds. OSDs are also
frequently marked down.
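
In case it's useful, here's the sort of thing I've been running to dig
into the blocked ops (osd.0 is just an example; the daemon commands have
to run on the node hosting that OSD):

deploy@drexler:~$ ceph health detail | grep blocked
deploy@drexler:~$ sudo ceph daemon osd.0 dump_ops_in_flight
deploy@drexler:~$ sudo ceph daemon osd.0 dump_historic_ops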

According to atop, the OSDs being deep scrubbed are reading at only 5 MB/s
to 8 MB/s, and a scrub of a 6.4 GB placement group takes 10-20 minutes
(which lines up: 6.4 GB at 5-8 MB/s works out to roughly 13-22 minutes).

Here's a screenshot of atop from a node:
https://s3.amazonaws.com/rwgps/screenshots/DgSSRyeF.png
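
Unless someone warns me off, I'm tempted to throttle scrubbing at runtime
along these lines (untested on my cluster, and as I understand it the
osd_disk_thread_ioprio_* options only take effect with the CFQ scheduler):

deploy@drexler:~$ ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
deploy@drexler:~$ ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
deploy@drexler:~$ ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'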

First question: is this a reasonable scrubbing speed for a very lightly
used cluster? Here are some cluster details:

deploy@drexler:~$ ceph --version
ceph version 0.94.1-5-g85a68f9 (85a68f9a8237f7e74f44a1d1fbbd6cb4ac50f8e8)


2x Xeon E5-2630 per node, 64 GB of RAM per node.


deploy@drexler:~$ ceph status
    cluster 234c6825-0e2b-4256-a710-71d29f4f023e
     health HEALTH_WARN
            118 requests are blocked > 32 sec
     monmap e1: 3 mons at {drexler=
10.0.0.36:6789/0,lucy=10.0.0.38:6789/0,paley=10.0.0.34:6789/0}
            election epoch 296, quorum 0,1,2 paley,drexler,lucy
     mdsmap e19989: 1/1/1 up {0=lucy=up:active}, 1 up:standby
     osdmap e1115: 12 osds: 12 up, 12 in
      pgmap v21748062: 1424 pgs, 17 pools, 3185 GB data, 20493 kobjects
            10060 GB used, 34629 GB / 44690 GB avail
                1422 active+clean
                   1 active+clean+scrubbing+deep
                   1 active+clean+scrubbing
  client io 721 kB/s rd, 33398 B/s wr, 53 op/s

deploy@drexler:~$ ceph osd tree
ID WEIGHT   TYPE NAME        UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 43.67999 root default
-2 14.56000     host paley
 0  3.64000         osd.0         up  1.00000          1.00000
 3  3.64000         osd.3         up  1.00000          1.00000
 6  3.64000         osd.6         up  1.00000          1.00000
 9  3.64000         osd.9         up  1.00000          1.00000
-3 14.56000     host lucy
 1  3.64000         osd.1         up  1.00000          1.00000
 4  3.64000         osd.4         up  1.00000          1.00000
 7  3.64000         osd.7         up  1.00000          1.00000
11  3.64000         osd.11        up  1.00000          1.00000
-4 14.56000     host drexler
 2  3.64000         osd.2         up  1.00000          1.00000
 5  3.64000         osd.5         up  1.00000          1.00000
 8  3.64000         osd.8         up  1.00000          1.00000
10  3.64000         osd.10        up  1.00000          1.00000


My OSDs are 4 TB 7200 RPM Hitachi Deskstars, using XFS, with Samsung 850
Pro journals (very slow for sync writes; I've ordered S3700 replacements,
but as far as I understand things the journals shouldn't pose problems for
reads). MONs are co-located with the OSD nodes, but the nodes are fairly
beefy and have very low load. Drives sit behind an expander backplane on
an LSI SAS3008 controller.
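
If it's relevant, this is the sync-write fio test I was planning to run to
confirm the 850 Pros really are the journal bottleneck (/dev/sdX is a
placeholder; writing to a raw device is destructive, so I'd point it at a
spare partition):

deploy@drexler:~$ sudo fio --name=journal-test --filename=/dev/sdX \
    --direct=1 --sync=1 --rw=write --bs=4k --numjobs=1 \
    --iodepth=1 --runtime=60 --time_based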

I have a fairly standard config as well:

https://gist.github.com/kingcu/aae7373eb62ceb7579da
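
For reference, these are the Hammer-era scrub knobs I've been eyeing, with
values that are illustrative rather than anything I've validated:

[osd]
# at most one concurrent scrub per OSD (already the default)
osd max scrubs = 1
# sleep between scrub chunks to let client IO through (default 0)
osd scrub sleep = 0.1
# fewer objects per scrub chunk means shorter lock hold times (default 25)
osd scrub chunk max = 5
# read size used by deep scrub (default 524288)
osd deep scrub stride = 524288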

I know that I don't have a ton of OSDs, but I'd still expect a little
better performance than this. Check out the Munin graphs for my three
nodes:

http://munin.ridewithgps.com/ridewithgps.com/drexler.ridewithgps.com/index.html#disk
http://munin.ridewithgps.com/ridewithgps.com/paley.ridewithgps.com/index.html#disk
http://munin.ridewithgps.com/ridewithgps.com/lucy.ridewithgps.com/index.html#disk


Any input would be appreciated before I start trying to micro-optimize
config params or upgrading to Infernalis.


Cheers,

Cullen
