"This same behavior can be seen when deleting an RBD that has 100,000 objects vs 200,000 objects, it takes twice as long"
Correction: it will take a squared amount of time (twice the objects means roughly four times as long), but that's not really the most important part of the response.

On Fri, Jun 30, 2017 at 4:24 PM David Turner <drakonst...@gmail.com> wrote:

> When you delete a snapshot, Ceph places the removed snapshot into a list
> in the OSD map and places the snapshot's objects into a snap_trim_q. Once
> those two things are done, the rbd command returns and you move on to the
> next snapshot. Snap trimming is an n^2 operation (like all deletes in
> Ceph), which means that if a queue of 100 objects takes 5 minutes to
> complete, a queue of 200 objects will take 20 minutes (exaggerated time
> frames to show the math). This same behavior can be seen when deleting an
> RBD that has 100,000 objects vs 200,000 objects, it takes twice as long
> (note that the object map mitigates this greatly by skipping any object
> that was never created, so the previous test would be easiest to
> reproduce by disabling the object map on the test RBDs).
>
> So paying attention to snapshot sizes as you clean them up is more
> important than how many snapshots you clean up. Being on Jewel, you don't
> really want to use osd_snap_trim_sleep, as it literally puts a sleep onto
> the main op threads of the OSD. In Hammer this setting was much more
> useful (if rather hacky), and in Luminous the entire process was revamped
> and (hopefully) fixed. Jewel is pretty much not viable for large
> quantities of snapshots, but there are ways to get through them.
>
> The following thread on the ML is one of the most informative on this
> problem in Jewel. The second link is the resumption of the thread months
> later, after the fix was scheduled for backporting into 10.2.8.
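[Editor's note: the quadratic scaling described above can be sketched with a little shell arithmetic. The 5-minute baseline is the exaggerated example from the message, not a real measurement.]

```shell
#!/bin/sh
# Quadratic scaling of snap trimming: doubling the queue length
# roughly quadruples the trim time, per the (exaggerated) example above.
base_objects=100
base_minutes=5
for n in 100 200 400; do
    # time grows with the square of the queue-length ratio
    ratio=$(( n / base_objects ))
    echo "$n objects: ~$(( base_minutes * ratio * ratio )) minutes"
done
# 100 objects: ~5 minutes
# 200 objects: ~20 minutes
# 400 objects: ~80 minutes
```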
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-January/015675.html
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-April/017697.html
>
> On Fri, Jun 30, 2017 at 4:02 PM Kenneth Van Alstyne <kvanalst...@knightpoint.com> wrote:
>
>> Hey folks:
>> I was wondering if the community can provide any advice. Over time, and
>> due to some external issues, we have managed to accumulate thousands of
>> snapshots of RBD images, which now need cleaning up. I recently ran a
>> "for" loop performing "rbd snap rm" on each snapshot sequentially,
>> waiting for each rbd command to finish before moving on to the next, of
>> course. Shortly after starting this, I began seeing thousands of slow
>> ops, and a few of our guest VMs became unresponsive, naturally.
>>
>> My questions are:
>> - Is this expected behavior?
>> - Is the background cleanup asynchronous from the "rbd snap rm" command?
>>   - If so, are there any OSD parameters I can set to reduce the impact
>>     on production?
>> - Would "rbd snap purge" be any different? I expect not, since
>>   fundamentally rbd is performing the same action that I do via the loop.
>>
>> Relevant details are as follows, though I'm not sure cluster size
>> *really* has any effect here:
>> - Ceph: version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
>> - 5 storage nodes, each with:
>>   - 10x 2TB 7200 RPM SATA spindles (for a total of 50 OSDs)
>>   - 2x Samsung MZ7LM240 SSDs (used as journals for the OSDs)
>>   - 64GB RAM
>>   - 2x Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz
>>   - 20Gbit LACP port channel via Intel X520 dual-port 10GbE NIC
>>
>> Let me know if I've missed something fundamental.
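[Editor's note: a paced variant of the cleanup loop described above can soften the impact on production by giving the OSDs' snap_trim_q time to drain between removals. This is a sketch, not a tested procedure: the pool/image names are hypothetical, and the pause interval is a starting guess that needs tuning per cluster.]

```shell
#!/bin/sh
# Paced snapshot cleanup: remove snapshots one at a time and pause
# between removals so the snap trim queue can drain before the next
# snapshot's objects are queued.

remove_snaps_paced() {
    pool_image=$1   # e.g. rbd/vm-disk-1 (hypothetical name)
    pause=$2        # seconds to wait between removals; tune per cluster
    # 'rbd snap ls' prints a header row, then one row per snapshot;
    # the second column is the snapshot name.
    for snap in $(rbd snap ls "$pool_image" | awk 'NR>1 {print $2}'); do
        rbd snap rm "$pool_image@$snap"
        sleep "$pause"
    done
}

# Example invocation (uncomment to run against a real cluster):
# remove_snaps_paced rbd/vm-disk-1 60
```

Watching `ceph -s` for slow requests between iterations, and lengthening the pause when they appear, keeps the loop from outrunning the trimming.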
>> Thanks,
>>
>> --
>> Kenneth Van Alstyne
>> Systems Architect
>> Knight Point Systems, LLC
>> 1775 Wiehle Avenue Suite 101 | Reston, VA 20190
>> c: 228-547-8045  f: 571-266-3106
>> www.knightpoint.com
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com