Hello Paul,

On Fri, Jul 7, 2023 at 5:13 PM Paul Mezzanini <pfm...@rit.edu> wrote:

> I recently got mclock going literally an order of magnitude faster.  I
> would love to claim I found all the options myself but I collected the
> knowledge of what knobs I needed to turn from here.
>

Significant usability and design improvements have been made to the mclock
scheduler in the upstream Reef release, and they should soon be available in
Quincy as well. One of the major goals is to reduce the number of knobs to
tune and achieve more hands-free operation. Existing releases already went
part of the way by eliminating the need to tune the sleep and
operation-specific cost options.

Here are some of the major improvements (from Reef release notes) that
should help:

   1. The balanced profile is set as the default mClock profile because it
   represents a compromise between prioritizing client IO or recovery IO.
   Users can then choose either the high_client_ops profile to prioritize
   client IO or the high_recovery_ops profile to prioritize recovery IO.
   2. QoS parameters like reservation and limit are now specified in terms
   of a fraction (range: 0.0 to 1.0) of the OSD’s IOPS capacity.
   3. The cost parameters - osd_mclock_cost_per_io_usec_* and
   osd_mclock_cost_per_byte_usec_* have been removed. The cost of an operation
   is now determined internally using the random IOPS and maximum sequential
   bandwidth capability of the OSD’s underlying device.
   4. The random IOPS capacity is determined using 'osd bench' as before,
   but unrealistic results are now discarded: if the measurement crosses the
   threshold governed by osd_mclock_iops_capacity_threshold_[hdd|ssd], a
   reasonable default is used instead. Users can still override the default
   IOPS capacity if it is not accurate, and the thresholds are configurable
   too. The maximum sequential bandwidth is defined by
   osd_mclock_max_sequential_bandwidth_[hdd|ssd], which is set to a
   reasonable default and may likewise be modified if not accurate. Together,
   these changes account for measurement inaccuracies and give the user good
   control over specifying accurate OSD characteristics.
   5. Degraded object recovery is given higher priority when compared to
   misplaced object recovery because degraded objects present a data safety
   issue not present with objects that are merely misplaced. Therefore,
   backfilling operations with the balanced and high_client_ops mClock
   profiles may progress slower than what was seen with the
   WeightedPriorityQueue (WPQ) scheduler. For faster recovery and backfills,
   the 'high_recovery_ops' profile with modified QoS parameters would help.
   6. The QoS allocations in all the mClock profiles are optimized based on
   the above fixes and enhancements.
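
To make items 2 and 4 a bit more concrete, here is a rough Python sketch of
the idea. It is not the actual OSD code: the function names are made up for
illustration, the numbers in the usage lines are placeholder values rather
than Ceph's defaults, and any special meaning a particular setting may have
in Ceph is not modelled.

```python
def effective_iops_capacity(measured_iops, threshold_iops, default_iops):
    """Pick the IOPS capacity to use for an OSD (hypothetical helper).

    A benchmark result that is non-positive or above the threshold is
    treated as unrealistic, and the default is used instead.
    """
    if measured_iops <= 0 or measured_iops > threshold_iops:
        return default_iops
    return measured_iops


def qos_in_iops(capacity_iops, reservation_frac, limit_frac):
    """Convert fractional reservation/limit values into IOPS (hypothetical helper)."""
    for name, frac in (("reservation", reservation_frac), ("limit", limit_frac)):
        if not 0.0 <= frac <= 1.0:
            raise ValueError(f"{name} must be within 0.0 to 1.0, got {frac}")
    return capacity_iops * reservation_frac, capacity_iops * limit_frac


# Placeholder numbers, chosen only for illustration.
too_high = effective_iops_capacity(900.0, threshold_iops=500.0, default_iops=315.0)
plausible = effective_iops_capacity(250.0, threshold_iops=500.0, default_iops=315.0)
reservation, limit = qos_in_iops(plausible, reservation_frac=0.5, limit_frac=1.0)
print(too_high, plausible, reservation, limit)
```

The point is only that the reservation and limit now scale with whatever
capacity the OSD ends up using, so correcting an inaccurate capacity value
adjusts the QoS allocations without touching them individually.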

Please see the latest upstream documentation for more details:
https://docs.ceph.com/en/reef/rados/configuration/mclock-config-ref/

The recommendation is to upgrade when feasible and provide your feedback,
questions and suggestions.

-Sridhar
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
