"This same behavior can be seen when deleting an RBD that has 100,000
objects vs 200,000 objects, it takes twice as long"

Correction: the time scales quadratically, so doubling the object count makes it
take roughly four times as long, not just twice as long. But that's not really
the most important part of the response.

On Fri, Jun 30, 2017 at 4:24 PM David Turner <drakonst...@gmail.com> wrote:

> When you delete a snapshot, Ceph places the removed snapshot into a list
> in the OSD map and places the objects in the snapshot into a snap_trim_q.
> Once those 2 things are done, the RBD command returns and you are moving
> onto the next snapshot.  The snap_trim_q is an n^2 operation (like all
> deletes in Ceph), which means that if the queue has 100 objects on it and
> takes 5 minutes to complete, then having 200 objects in the queue will take
> roughly 20 minutes (exaggerated time frames to illustrate the math).  This same behavior can
> be seen when deleting an RBD that has 100,000 objects vs 200,000 objects,
> it takes twice as long (note that object map mitigates this greatly by
> ignoring any object that hasn't been created, so the previous test would be
> easiest to duplicate by disabling the object map on the test RBDs).
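>
> (For reference, a rough sketch of how one might set up that comparison; the
> pool/image name below is just a placeholder, and fast-diff has to come off
> before object-map since it depends on it:)
>
>   # Check which features are currently enabled on the test image.
>   rbd info rbd/test-image
>
>   # Disable fast-diff first (it depends on object-map), then object-map.
>   rbd feature disable rbd/test-image fast-diff
>   rbd feature disable rbd/test-image object-map
>
>   # ...run the timed deletes here...
>
>   # Re-enable the features and rebuild the object map afterwards.
>   rbd feature enable rbd/test-image object-map
>   rbd feature enable rbd/test-image fast-diff
>   rbd object-map rebuild rbd/test-image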
>
> So paying attention to snapshot sizes as you clean them up is more
> important than how many snapshots you clean up.  Being on Jewel, you don't
> really want to use osd_snap_trim_sleep as it literally puts a sleep onto
> the main op threads for the OSD.  In Hammer this setting was much more
> useful (albeit rather hacky), and in Luminous the entire process was revamped
> and (hopefully) fixed.  Jewel is pretty much not viable for large
> quantities of snapshots, but there are ways to get through them.
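>
> (For completeness, this is roughly how that setting gets changed at runtime;
> the 0.05 value is purely illustrative, and per the above it is a blunt
> instrument on Jewel since the sleep lands on the main op threads:)
>
>   # Check the current value via the admin socket on one OSD host.
>   ceph daemon osd.0 config get osd_snap_trim_sleep
>
>   # Inject a sleep (in seconds) between snap trim operations on all OSDs.
>   ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05'
>
>   # Return to the default of 0 once the cleanup is done.
>   ceph tell osd.* injectargs '--osd_snap_trim_sleep 0'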
>
> The following thread on the ML is one of the most informative on this
> problem in Jewel.  The second link is where the thread resumed months
> later, after the fix was scheduled for backporting into 10.2.8.
>
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-January/015675.html
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-April/017697.html
>
> On Fri, Jun 30, 2017 at 4:02 PM Kenneth Van Alstyne <
> kvanalst...@knightpoint.com> wrote:
>
>> Hey folks:
>>         I was wondering if the community can provide any advice — over
>> time and due to some external issues, we have managed to accumulate
>> thousands of snapshots of RBD images, which are now in need of cleaning
>> up.  I have recently attempted to roll through a “for” loop to perform a
>> “rbd snap rm” on each snapshot, sequentially, waiting until the rbd command
>> finishes before moving onto the next one, of course.  I noticed that
>> shortly after starting this, I started seeing thousands of slow ops and a
>> few of our guest VMs became unresponsive, naturally.
>>
>> My questions are:
>>         - Is this expected behavior?
>>         - Is the background cleanup asynchronous from the “rbd snap rm”
>> command?
>>                 - If so, are there any OSD parameters I can set to reduce
>> the impact on production?
>>         - Would “rbd snap purge” be any different?  I expect not, since
>> fundamentally, rbd is performing the same action that I do via the loop.
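>>
>> (For illustration, the loop is something along these lines; the pool/image
>> name is a placeholder, and the pause between removals is just a knob I could
>> add if pacing would help:)
>>
>>   IMAGE=rbd/vm-disk-01   # placeholder image spec
>>   # Assumes the second column of "rbd snap ls" (after the header) is the name.
>>   rbd snap ls "$IMAGE" | awk 'NR > 1 {print $2}' | while read SNAP; do
>>       rbd snap rm "$IMAGE@$SNAP"
>>       sleep 60   # illustrative pause so trimming can drain between removals
>>   done
>>
>>   # "rbd snap purge $IMAGE" would remove all snapshots of the image in one
>>   # command, but presumably queues the same background trim work.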
>>
>> Relevant details are as follows, though I’m not sure cluster size
>> *really* has any effect here:
>>         - Ceph: version 10.2.5 (c461ee19ecbc0c5c330aca20f7392c9a00730367)
>>         - 5 storage nodes, each with:
>>                 - 10x 2TB 7200 RPM SATA Spindles (for a total of 50 OSDs)
>>                 - 2x Samsung MZ7LM240 SSDs (used as journal for the OSDs)
>>                 - 64GB RAM
>>                 - 2x Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz
>>                 - 20GBit LACP Port Channel via Intel X520 Dual Port 10GbE
>> NIC
>>
>> Let me know if I’ve missed something fundamental.
>>
>> Thanks,
>>
>> --
>> Kenneth Van Alstyne
>> Systems Architect
>> Knight Point Systems, LLC
>> Service-Disabled Veteran-Owned Business
>> 1775 Wiehle Avenue Suite 101 | Reston, VA 20190
>> c: 228-547-8045 f: 571-266-3106
>> www.knightpoint.com
>> DHS EAGLE II Prime Contractor: FC1 SDVOSB Track
>> GSA Schedule 70 SDVOSB: GS-35F-0646S
>> GSA MOBIS Schedule: GS-10F-0404Y
>> ISO 20000 / ISO 27001 / CMMI Level 3
>>
>