I can only comment on the journals (the "logs"). I would recommend using three journals (6 disks as mirror pairs) per system, and adding a CRUSH map hierarchy level for the journal drives so that any given PG will never put two replicas behind the same journal device. That will also reduce your failure domain: losing one journal SSD then only takes out the OSDs behind it. A rough sketch of what I mean is below the quoted message.

On Jan 29, 2014 4:26 PM, "Geraint Jones" <gera...@koding.com> wrote:
> Hi Guys,
>
> We have the current config:
>
> 2 x storage servers, 128 GB RAM, dual E5-2609, LSI MegaRAID SAS 9271-4i;
> each server has 24 x 3 TB disks. These were originally set up as 8 groups
> of 3-disk RAID0 (we are slowly moving to one OSD per disk). We initially
> had the journals stored on an SSD; however, after a disk failure this led
> to terrible performance (await on the SSD was huge), so we added some PCIe
> SSDs. This didn't change much, as we still saw the big await on the SSDs,
> so we removed them. That made recovery pretty good: we were able to carry
> on working while recovery was taking place, whereas previously recovery
> meant the VMs were down.
>
> Now, two mornings in a row, I have been paged due to really slow I/O.
>
> It appears that the background deep scrubbing was using all the IO the
> disks had. This is the output of iostat -xk 3 during scrubbing:
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           31.25    0.00    6.17   10.79    0.00   51.78
>
> Device:  rrqm/s  wrqm/s     r/s     w/s     rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdb        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sda        0.00   12.33    0.00   26.67      0.00   130.67     9.80     0.07    2.55    0.00    2.55   2.55   6.80
> sdi        0.00    8.00  269.33  319.33  31660.00  5124.33   124.98     4.51    7.47    5.68    8.98   1.67  98.13
> sdk        0.00    0.00    9.00   88.00    117.33  2208.17    47.95     0.10    1.03   10.67    0.05   0.95   9.20
> sdj        0.00    0.00    5.33  114.67     26.67   928.50    15.92     0.09    0.77   12.75    0.21   0.59   7.07
> sdg        0.33    0.00  124.33   39.00  15537.33   488.83   196.24     0.46    2.69    3.53    0.00   2.05  33.47
> sdf        0.00   10.67   12.33  277.33     96.00  2481.83    17.80     0.88    3.05   14.27    2.55   0.63  18.27
> sde        0.00    0.00    3.00   32.67    381.33   280.33    37.10     0.10    2.84    7.11    2.45   1.68   6.00
> sdd        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdc        0.00    0.00    0.00   28.00      0.00   179.00    12.79     0.05    1.90    0.00    1.90   0.76   2.13
> sdl        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-0       0.00    0.00    0.00   32.33      0.00   130.67     8.08     0.07    2.14    0.00    2.14   2.10   6.80
> dm-1       0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdm        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdn        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdo        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdp        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            7.10    0.00    6.08   23.03    0.00   63.79
>
> Device:  rrqm/s  wrqm/s     r/s     w/s     rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdb        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sda        0.00    0.00    0.00    0.33      0.00     1.33     8.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdc        0.00    0.00    0.00   13.00      0.00   168.00    25.85     0.01    0.92    0.00    0.92   0.92   1.20
> sdd        0.00    0.00   21.67  142.33    181.33 24873.33   305.54    29.24  175.61   76.62  190.68   6.09  99.87
> sde        0.33    0.00  273.33   10.00  32257.33   242.67   229.41     1.45    5.11    5.25    1.20   3.13  88.67
> sdf        0.00    0.00    0.67   44.00      2.67   520.00    23.40     0.03    0.75   20.00    0.45   0.75   3.33
> sdg        0.00    0.00    2.00   13.33     68.00   170.67    31.13     0.24   13.30   70.00    4.80  12.61  19.33
> sdi        0.00    0.00    1.00   47.67      5.33   767.83    31.77     0.08    1.62    8.00    1.48   0.30   1.47
> sdh        0.00    0.00    1.00   21.00     54.67   264.00    28.97     0.06    1.94   10.67    1.52   0.61   1.33
> sdj        1.33    0.00  344.33   16.67  42037.33   212.00   234.07     1.50    4.18    4.32    1.28   2.66  95.87
> sdk        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-0       0.00    0.00    0.00    0.33      0.00     1.33     8.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-1       0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdl        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>
> I have disabled scrubbing to return the systems to a usable state
> (ceph osd set noscrub and ceph osd set nodeep-scrub).
>
> And here is how our iostats look now:
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           33.33    0.00    6.73    0.67    0.00   59.27
>
> Device:  rrqm/s  wrqm/s     r/s     w/s     rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdb        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sda        0.00    6.67    0.00   20.00      0.00    88.00     8.80     0.05    2.53    0.00    2.53   2.53   5.07
> sdi        0.00    0.00    1.67   89.33      9.33   520.00    11.63     0.03    0.34   15.20    0.06   0.34   3.07
> sdk        0.00    0.00    1.33  139.33      8.00  1203.67    17.23     0.08    0.56   13.00    0.44   0.15   2.13
> sdj        0.00    0.67    2.67  175.00    126.67  4583.33    53.02     0.02    0.12    4.00    0.06   0.11   1.87
> sdg        0.00    0.00    1.00  151.00      5.33  1288.17    17.02     0.03    0.23   13.33    0.14   0.17   2.53
> sdf        0.00    0.00    3.33   93.00     20.00   870.33    18.48     0.03    0.36    9.20    0.04   0.33   3.20
> sde        0.00    0.00    0.00   13.33      0.00    72.00    10.80     0.00    0.20    0.00    0.20   0.20   0.27
> sdd        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdc        0.00    0.00    1.33   26.00      8.00   182.67    13.95     0.02    0.68    9.00    0.26   0.63   1.73
> sdl        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-0       0.00    0.00    0.00   21.67      0.00    88.00     8.12     0.05    2.34    0.00    2.34   2.34   5.07
> dm-1       0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdm        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdn        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdo        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdp        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            8.40    0.00    3.65    4.41    0.00   83.55
>
> Device:  rrqm/s  wrqm/s     r/s     w/s     rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> sdb        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sda        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdc        0.00    0.00    0.00   62.67      0.00   778.00    24.83     0.18    2.89    0.00    2.89   0.28   1.73
> sdd        0.00   10.33    1.67  160.00     13.33  1163.33    14.56    23.88   41.49   22.40   41.69   2.41  38.93
> sde        0.00    0.00    0.00   50.33      0.00   601.17    23.89     0.11    2.09    0.00    2.09   0.21   1.07
> sdf        0.00    0.00    0.00  106.67      0.00  1260.67    23.64     0.41    3.81    0.00    3.81   0.27   2.93
> sdg        0.00    0.00    0.00   71.33      0.00   863.50    24.21     0.15    2.15    0.00    2.15   0.37   2.67
> sdi        0.00    0.00    0.00   23.67      0.00   366.67    30.99     0.01    0.45    0.00    0.45   0.45   1.07
> sdh        0.00    0.00    0.00   73.00      0.00   910.67    24.95     0.20    2.74    0.00    2.74   0.24   1.73
> sdj        0.00    0.00    0.00   82.67      0.00   899.67    21.77     0.28    3.39    0.00    3.39   0.23   1.87
> sdk        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-0       0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> dm-1       0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
> sdl        0.00    0.00    0.00    0.00      0.00     0.00     0.00     0.00    0.00    0.00    0.00   0.00   0.00
>
> Now, clearly we can't live with scrubbing off for very long, so what can I
> do to stop scrubbing from blocking IO? Or, if this were you, how would you
> reconfigure/rearchitect things?
>
> As an aside, has anyone used one of these to hold the journals, and would
> one be enough for 24 journals?
> http://www.amazon.com/FUSiON-iO-420GB-Solid-State-Drive/dp/B00DVMPXV0/ref=sr_1_1?ie=UTF8&qid=1391039407&sr=8-1&keywords=fusion-io
>
> Thanks!
> --
> Geraint Jones
> Director of Systems & Infrastructure
> Koding
> (We are hiring!)
> https://koding.com
> gera...@koding.com
> Phone (415) 653-0083
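To flesh out the CRUSH suggestion: the idea is to add a bucket level between osd and host for each group of OSDs that shares a journal device, then have replication choose leaves across those groups. Below is a minimal sketch using the standard decompile/recompile round trip; the bucket names, IDs, and weights (8 OSDs of weight 2.73 behind each of the three journal mirrors) are made up for illustration and would need adjusting to your layout.

# the three journal devices themselves would be md RAID1 pairs, e.g.
# (hypothetical device names):
# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdX /dev/sdY

ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# edit crush.txt as sketched below, then recompile and inject it:
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new

And in crush.txt, roughly (type list abbreviated; inserting a type
renumbers the ones above it):

# new bucket type between osd and host
type 0 osd
type 1 journalgroup
type 2 host
type 3 root

# one bucket per journal device, holding the OSDs behind it
journalgroup server1-jg0 {
        id -10
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 2.73
        item osd.1 weight 2.73
        # ... through osd.7
}

host server1 {
        id -2
        alg straw
        hash 0  # rjenkins1
        item server1-jg0 weight 21.84
        item server1-jg1 weight 21.84
        item server1-jg2 weight 21.84
}

rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        # place each replica under a different journal group
        step chooseleaf firstn 0 type journalgroup
        step emit
}

One caveat: chooseleaf at the journalgroup level no longer forces replicas onto different hosts, so with two hosts and size 2 you may prefer to leave the rule at type host (which already keeps replicas off the same journal) and use the journalgroup level purely to express the shared-journal failure domain.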
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com