The branch, master has been updated
via 611b4d6de72 docs-xml/manpages: doc for 'vfs_aio_ratelimit' module
via 73f01f826ab s3:selftest: test vfs_aio_ratelimit module
via 9597a30e840 vfs_aio_ratelimit: rate-limiting module for async I/O
from cad337062f4 s4:torture/smb2: add smb2.bench.write test
https://git.samba.org/?p=samba.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit 611b4d6de728d7919ec244fe90823510b209b608
Author: Shachar Sharon <[email protected]>
Date: Thu Aug 14 17:01:16 2025 +0300
docs-xml/manpages: doc for 'vfs_aio_ratelimit' module
Documentation for newly introduced async-I/O rate-limiting module.
Signed-off-by: Shachar Sharon <[email protected]>
Reviewed-by: Avan Thakkar <[email protected]>
Reviewed-by: Anoop C S <[email protected]>
Reviewed-by: Gunther Deschner <[email protected]>
Autobuild-User(master): Anoop C S <[email protected]>
Autobuild-Date(master): Sun Jan 18 07:23:19 UTC 2026 on atb-devel-224
commit 73f01f826abd294f454053c5b8f604a168e5fb37
Author: Shachar Sharon <[email protected]>
Date: Thu Sep 4 10:45:18 2025 +0300
s3:selftest: test vfs_aio_ratelimit module
Test VFS aio_ratelimit module: ensure that a (read) delay is indeed
injected.
Signed-off-by: Shachar Sharon <[email protected]>
Reviewed-by: Avan Thakkar <[email protected]>
Reviewed-by: Anoop C S <[email protected]>
Reviewed-by: Gunther Deschner <[email protected]>
commit 9597a30e840a59c56dab8dde8909370a98331c6a
Author: Shachar Sharon <[email protected]>
Date: Sun Aug 10 11:42:42 2025 +0300
vfs_aio_ratelimit: rate-limiting module for async I/O
A new stackable module to allow rate-limiting functionality for async
I/O operations. When the number of IOPS or bytes-per-sec exceeds a
user-defined threshold, inject a delay before allowing an operation to
complete, yielding an implicit throughput ceiling. Uses a token-based
algorithm to calculate the actual delay.
Pair-Programmed-With: Avan Thakkar <[email protected]>
Signed-off-by: Shachar Sharon <[email protected]>
Reviewed-by: Avan Thakkar <[email protected]>
Reviewed-by: Anoop C S <[email protected]>
Reviewed-by: Gunther Deschner <[email protected]>
-----------------------------------------------------------------------
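Note on the token-based algorithm mentioned in commit 9597a30e840: the
following is a minimal, stand-alone sketch of the idea, not the module's
code. It assumes a single limit (the module tracks IOPS and bytes
separately), and the names struct bucket, clampf, bucket_fill, bucket_take
and bucket_delay_usec are hypothetical. The arithmetic follows
calc_fill_tokens, ratelimiter_take_tokens and ratelimiter_calc_delay as they
appear in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified bucket holding one limit only. */
struct bucket {
	float tokens;     /* current balance; negative means over the limit */
	float tokens_max; /* equals the configured per-second limit */
	float tokens_min; /* -tokens_max, the deepest deficit that is tracked */
};

static float clampf(float v, float lo, float hi)
{
	return (v < lo) ? lo : ((v > hi) ? hi : v);
}

/* Refill in proportion to elapsed time: one full second adds tokens_max. */
static void bucket_fill(struct bucket *b, int64_t elapsed_usec)
{
	float fill = ((float)elapsed_usec * b->tokens_max) / 1000000.0f;

	b->tokens = clampf(b->tokens + fill, b->tokens_min, b->tokens_max);
}

/* Consume tokens: 1 per operation for IOPS, nbytes for bandwidth. */
static void bucket_take(struct bucket *b, float cost)
{
	b->tokens = clampf(b->tokens - cost, b->tokens_min, b->tokens_max);
}

/* The delay scales linearly with the deficit, up to delay_sec_max seconds. */
static int64_t bucket_delay_usec(const struct bucket *b, int64_t delay_sec_max)
{
	float ratio;

	assert(b->tokens_min < 0.0f);
	if (b->tokens >= 0.0f) {
		return 0;
	}
	ratio = b->tokens / b->tokens_min; /* in (0, 1] */
	return (int64_t)(ratio * 1000000.0f) * delay_sec_max;
}
```

With a limit of 10 and delay_sec_max of 10, five back-to-back operations
leave a balance of -5, which is half of the tracked deficit and so maps to
half of the maximal delay. Half a second of idle time refills five tokens
and brings the delay back to zero.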
Summary of changes:
docs-xml/manpages/vfs_aio_ratelimit.8.xml | 155 ++++++
docs-xml/wscript_build | 1 +
selftest/target/Samba3.pm | 11 +
source3/modules/vfs_aio_ratelimit.c | 761 +++++++++++++++++++++++++++++
source3/modules/wscript_build | 8 +
source3/script/tests/test_aio_ratelimit.sh | 91 ++++
source3/selftest/tests.py | 9 +
source3/wscript | 3 +-
8 files changed, 1038 insertions(+), 1 deletion(-)
create mode 100644 docs-xml/manpages/vfs_aio_ratelimit.8.xml
create mode 100644 source3/modules/vfs_aio_ratelimit.c
create mode 100755 source3/script/tests/test_aio_ratelimit.sh
Changeset truncated at 500 lines:
diff --git a/docs-xml/manpages/vfs_aio_ratelimit.8.xml b/docs-xml/manpages/vfs_aio_ratelimit.8.xml
new file mode 100644
index 00000000000..43d3e695c08
--- /dev/null
+++ b/docs-xml/manpages/vfs_aio_ratelimit.8.xml
@@ -0,0 +1,155 @@
+<?xml version="1.0" encoding="iso-8859-1"?>
+<!DOCTYPE refentry PUBLIC "-//Samba-Team//DTD DocBook V4.2-Based Variant V1.0//EN" "http://www.samba.org/samba/DTD/samba-doc">
+<refentry id="vfs_aio_ratelimit.8">
+
+<refmeta>
+ <refentrytitle>vfs_aio_ratelimit</refentrytitle>
+ <manvolnum>8</manvolnum>
+ <refmiscinfo class="source">Samba</refmiscinfo>
+ <refmiscinfo class="manual">System Administration tools</refmiscinfo>
+ <refmiscinfo class="version">&doc.version;</refmiscinfo>
+</refmeta>
+
+<refnamediv>
+ <refname>vfs_aio_ratelimit</refname>
+ <refpurpose>Implement async-I/O rate-limiting for Samba</refpurpose>
+</refnamediv>
+
+<refsynopsisdiv>
+ <cmdsynopsis>
+ <command>vfs objects = aio_ratelimit</command>
+ </cmdsynopsis>
+</refsynopsisdiv>
+
+<refsect1>
+ <title>DESCRIPTION</title>
+
+ <para>This VFS module is part of the
+ <citerefentry><refentrytitle>samba</refentrytitle>
+ <manvolnum>7</manvolnum></citerefentry> suite.</para>
+
+ <para>The <command>aio_ratelimit</command> VFS module enables run-time
+ rate-limiting on specific shares by enforcing an upper limit on async I/O
+ operations. An administrator may define this limit as operations
+ per-second or bytes-per-second. When one of those limits is exceeded,
+ a delay value (in milliseconds) is calculated based on current I/O load
+ and injected to async I/O operations, yielding an implicit throughput
+ ceiling.
+ </para>
+
+ <para>
+ This module operates only on asynchronous VFS READ/WRITE operations.
+ </para>
+
+ <para>This module is stackable.</para>
+</refsect1>
+
+<refsect1>
+ <title>CONFIGURATION</title>
+
+ <para>Straightforward use:</para>
+
+<programlisting>
+ <smbconfsection name="[share]"/>
+ <smbconfoption name="path">/path/to/share</smbconfoption>
+ <smbconfoption name="vfs objects">aio_ratelimit</smbconfoption>
+</programlisting>
+
+</refsect1>
+
+<refsect1>
+ <title>OPTIONS</title>
+
+ <variablelist>
+ <varlistentry>
+ <term>aio_ratelimit:read_iops_limit = count</term>
+ <listitem>
+ <para>
+ Upper limit of READ operations-per-second before
+ injecting delays. Zero value implies no limit.
+ </para>
+ <para>Default: 0, Max: 1000000</para>
+ <para>Example: aio_ratelimit:read_iops_limit = 1000</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>aio_ratelimit:read_bw_limit = count</term>
+ <listitem>
+ <para>
+ Upper limit of READ bandwidth (bytes-per-second) before
+ injecting delays. Zero value implies no limit.
+ </para>
+ <para>Default: 0, Max: 1T</para>
+ <para>Example: aio_ratelimit:read_bw_limit = 1000000</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>aio_ratelimit:read_delay_max = seconds</term>
+ <listitem>
+ <para>
+ Maximal allowed delay value, in seconds, for READ.
+ </para>
+ <para>Default: 30, Max: 300</para>
+ <para>Example: aio_ratelimit:read_delay_max = 15</para>
+ </listitem>
+ </varlistentry>
+
+
+ <varlistentry>
+ <term>aio_ratelimit:write_iops_limit = count</term>
+ <listitem>
+ <para>
+ Upper limit of WRITE operations-per-second before
+ injecting delays. Zero value implies no limit.
+ </para>
+ <para>Default: 0, Max: 1000000</para>
+ <para>Example: aio_ratelimit:write_iops_limit = 1000</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>aio_ratelimit:write_bw_limit = count</term>
+ <listitem>
+ <para>
+ Upper limit of WRITE bandwidth (bytes-per-second)
+ before injecting delays. Zero value implies no limit.
+ </para>
+ <para>Default: 0, Max: 1T</para>
+ <para>Example: aio_ratelimit:write_bw_limit = 1000000</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>aio_ratelimit:write_delay_max = seconds</term>
+ <listitem>
+ <para>
+ Maximal allowed delay value, in seconds, for WRITE.
+ </para>
+ <para>Default: 30, Max: 300</para>
+ <para>Example: aio_ratelimit:write_delay_max = 20</para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+</refsect1>
+
+<refsect1>
+ <title>VERSION</title>
+
+ <para>This man page is part of version &doc.version; of the Samba suite.
+ </para>
+</refsect1>
+
+<refsect1>
+ <title>AUTHOR</title>
+
+ <para>The original Samba software and related utilities
+ were created by Andrew Tridgell. Samba is now developed
+ by the Samba Team as an Open Source project similar
+ to the way the Linux kernel is developed.</para>
+
+</refsect1>
+
+</refentry>
diff --git a/docs-xml/wscript_build b/docs-xml/wscript_build
index 42833a964c0..5d231ca1624 100644
--- a/docs-xml/wscript_build
+++ b/docs-xml/wscript_build
@@ -73,6 +73,7 @@ vfs_module_manpages = ['vfs_acl_tdb',
'vfs_acl_xattr',
'vfs_aio_fork',
'vfs_aio_pthread',
+ 'vfs_aio_ratelimit',
'vfs_io_uring',
'vfs_audit',
'vfs_btrfs',
diff --git a/selftest/target/Samba3.pm b/selftest/target/Samba3.pm
index dc6f7314a5d..9a059b86f38 100755
--- a/selftest/target/Samba3.pm
+++ b/selftest/target/Samba3.pm
@@ -3756,6 +3756,17 @@ sub provision($$)
comment = smb username is [%U]
guest ok = yes
+[aio_ratelimit]
+ comment = Testing aio_ratelimit
+ path = $shrdir
+ vfs objects = aio_ratelimit
+ aio_ratelimit: read_iops_limit = 10
+ aio_ratelimit: read_bw_limit = 100000
+ aio_ratelimit: read_delay_max = 10
+ aio_ratelimit: write_iops_limit = 100
+ aio_ratelimit: write_bw_limit = 100000
+ aio_ratelimit: write_delay_max = 10
+
include = $aliceconfdir/%U.conf
";
diff --git a/source3/modules/vfs_aio_ratelimit.c b/source3/modules/vfs_aio_ratelimit.c
new file mode 100644
index 00000000000..6ebc0114c02
--- /dev/null
+++ b/source3/modules/vfs_aio_ratelimit.c
@@ -0,0 +1,761 @@
+/*
+ * Asynchronous I/O rate-limiting VFS module.
+ *
+ * Copyright (c) 2025 Shachar Sharon <[email protected]>
+ * Copyright (c) 2025 Avan Thakkar <[email protected]>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 3 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ Token-based rate-limiter implemented as a stackable Samba VFS module. For each
+ Samba share a user may define READ/WRITE thresholds in terms of IOPS or bytes
+ per second. If one of those thresholds is exceeded along the asynchronous
+ I/O path, a delay is injected before sending back a reply to the caller,
+ thus enforcing a rate-limit ceiling.
+
+ An example smb.conf segment (a zero value means the option is ignored):
+
+ [share]
+ vfs objects = aio_ratelimit ...
+ aio_ratelimit: read_iops_limit = 2000
+ aio_ratelimit: read_bw_limit = 2000000
+ aio_ratelimit: write_iops_limit = 0
+ aio_ratelimit: write_bw_limit = 1000000
+ ...
+
+ Upon successful completion of an async I/O request, tokens are produced based
+ on the time elapsed since the previous request, and tokens are consumed based
+ on the actual I/O size. When the current token balance is negative, a delay is
+ calculated and injected into the in-flight request. The delay value
+ (microseconds) is calculated based on the current token deficit.
+ */
+
+#include "includes.h"
+#include "lib/util/time.h"
+#include "lib/util/tevent_unix.h"
+
+#undef DBGC_CLASS
+#define DBGC_CLASS DBGC_VFS
+
+/* Default and maximal delay values, in seconds */
+#define DELAY_SEC_DEF (30L)
+#define DELAY_SEC_MAX (300L)
+
+/* Maximal value for iops_limit */
+#define IOPS_LIMIT_MAX (1000000L)
+
+/* Maximal value for bw_limit */
+#define BYTES_LIMIT_MAX (1L << 40)
+
+/* Module type-name in smb.conf & debug logging */
+#define MODULE_NAME "aio_ratelimit"
+
+/* Token-based rate-limiter control state */
+struct ratelimiter {
+ const char *oper;
+ struct timespec ts_base;
+ struct timespec ts_last;
+ int64_t iops_limit;
+ int64_t iops_total;
+ float iops_tokens;
+ float iops_tokens_max;
+ float iops_tokens_min;
+ int64_t bw_limit;
+ int64_t bytes_total;
+ float bytes_tokens;
+ float bytes_tokens_max;
+ float bytes_tokens_min;
+ int64_t delay_sec_max;
+ int snum;
+};
+
+/* In-memory rate-limiting entry per connection */
+struct vfs_aio_ratelimit_config {
+ struct ratelimiter rd_ratelimiter;
+ struct ratelimiter wr_ratelimiter;
+};
+
+static float maxf(float x, float y)
+{
+ return MAX(x, y);
+}
+
+static float minf(float x, float y)
+{
+ return MIN(x, y);
+}
+
+static struct timespec time_now(void)
+{
+ struct timespec ts;
+
+ clock_gettime_mono(&ts);
+ return ts;
+}
+
+static int64_t time_diff(const struct timespec *now,
+ const struct timespec *prev)
+{
+ return nsec_time_diff(now, prev) / 1000; /* usec */
+}
+
+static void ratelimiter_init(struct ratelimiter *rl,
+ int snum,
+ const char *oper_name,
+ int64_t iops_limit,
+ int64_t bw_limit,
+ int64_t delay_sec_max)
+{
+ ZERO_STRUCTP(rl);
+ rl->oper = oper_name;
+ rl->iops_total = 0;
+ rl->iops_limit = iops_limit;
+ rl->iops_tokens = 0.0;
+ rl->iops_tokens_max = (float)rl->iops_limit;
+ rl->iops_tokens_min = -rl->iops_tokens_max;
+ rl->bytes_total = 0;
+ rl->bw_limit = bw_limit;
+ rl->bytes_tokens = 0.0;
+ rl->bytes_tokens_max = (float)rl->bw_limit;
+ rl->bytes_tokens_min = -rl->bytes_tokens_max;
+ rl->delay_sec_max = delay_sec_max;
+ rl->snum = snum;
+
+ DBG_DEBUG("[%s snum:%d %s] init ratelimiter:"
+ " iops_limit=%" PRId64 " bw_limit=%" PRId64
+ " delay_sec_max=%" PRId64 "\n",
+ MODULE_NAME,
+ rl->snum,
+ rl->oper,
+ rl->iops_limit,
+ rl->bw_limit,
+ rl->delay_sec_max);
+}
+
+static bool ratelimiter_enabled(const struct ratelimiter *rl)
+{
+ return (rl->delay_sec_max > 0) &&
+ ((rl->iops_limit > 0) || (rl->bw_limit > 0));
+}
+
+static void ratelimiter_renew_tokens(struct ratelimiter *rl)
+{
+ if (rl->iops_limit > 0) {
+ rl->iops_tokens = rl->iops_tokens_max;
+ }
+ if (rl->bw_limit > 0) {
+ rl->bytes_tokens = rl->bytes_tokens_max;
+ }
+}
+
+static void ratelimiter_take_tokens(struct ratelimiter *rl, int64_t nbytes)
+{
+ if (rl->iops_limit > 0) {
+ rl->iops_tokens = maxf(rl->iops_tokens - 1.0,
+ rl->iops_tokens_min);
+ }
+ if (rl->bw_limit > 0) {
+ rl->bytes_tokens = maxf(rl->bytes_tokens - (float)nbytes,
+ rl->bytes_tokens_min);
+ }
+}
+
+static void ratelimiter_give_bw_tokens(struct ratelimiter *rl, int64_t nbytes)
+{
+ if (rl->bw_limit > 0) {
+ rl->bytes_tokens = minf(rl->bytes_tokens + (float)nbytes,
+ rl->bytes_tokens_max);
+ }
+}
+
+static float calc_fill_tokens(float tokens_max, int64_t dif_usec)
+{
+ return ((float)(dif_usec)*tokens_max) / 1000000.0f;
+}
+
+static void ratelimiter_fill_tokens(struct ratelimiter *rl, int64_t dif_usec)
+{
+ float fill;
+
+ if (rl->iops_limit > 0) {
+ fill = calc_fill_tokens(rl->iops_tokens_max, dif_usec);
+ rl->iops_tokens = minf(rl->iops_tokens + fill,
+ rl->iops_tokens_max);
+ }
+ if (rl->bw_limit > 0) {
+ fill = calc_fill_tokens(rl->bytes_tokens_max, dif_usec);
+ rl->bytes_tokens = minf(rl->bytes_tokens + fill,
+ rl->bytes_tokens_max);
+ }
+}
+
+static float calc_delay_usec(float tokens, float tokens_min)
+{
+ return (tokens * 1000000.0f) / tokens_min;
+}
+
+static uint32_t ratelimiter_calc_delay(const struct ratelimiter *rl)
+{
+ float iops_delay_usec = 0.0;
+ float bytes_delay_usec = 0.0;
+ int64_t delay_usec = 0;
+
+ /* Calculate delay for 1-second window */
+ if ((rl->iops_limit > 0) && (rl->iops_tokens < 0.0)) {
+ iops_delay_usec = calc_delay_usec(rl->iops_tokens,
+ rl->iops_tokens_min);
+ }
+ if ((rl->bw_limit > 0) && (rl->bytes_tokens < 0.0)) {
+ bytes_delay_usec = calc_delay_usec(rl->bytes_tokens,
+ rl->bytes_tokens_min);
+ }
+ /* Normalize delay within valid span */
+ delay_usec = (int64_t)maxf(iops_delay_usec, bytes_delay_usec);
+ return (uint32_t)(delay_usec * rl->delay_sec_max);
+}
+
+static bool ratelimiter_need_renew(const struct ratelimiter *rl,
+ const struct timespec *now)
+{
+ time_t sec_dif = 0;
+
+ if (rl->ts_base.tv_sec == 0) {
+ /* First time */
+ DBG_DEBUG("[%s snum:%d %s] init\n",
+ MODULE_NAME,
+ rl->snum,
+ rl->oper);
+ return true;
+ }
+ sec_dif = (now->tv_sec - rl->ts_last.tv_sec);
+ if (sec_dif >= 60) {
+ /* Force renew after 1 minute of idle time */
+ DBG_DEBUG("[%s snum:%d %s] idle sec_dif=%ld\n",
+ MODULE_NAME,
+ rl->snum,
+ rl->oper,
+ (long)sec_dif);
+ return true;
+ }
+ sec_dif = (now->tv_sec - rl->ts_base.tv_sec);
+ if (sec_dif >= 1200) {
+ /* Force renew every 20 minutes to avoid skew */
+ DBG_DEBUG("[%s snum:%d %s] renew sec_dif=%ld\n",
+ MODULE_NAME,
+ rl->snum,
+ rl->oper,
+ (long)sec_dif);
+ return true;
+ }
+ return false;
+}
+
+static void ratelimiter_dbg(const struct ratelimiter *rl,
+ int64_t nbytes,
+ int64_t tdiff_usec,
+ uint32_t delay_usec)
+{
+ if (rl->iops_limit > 0) {
+ DBG_DEBUG("[%s snum:%d %s]"
+ " iops_total=%" PRId64 " iops_limit=%" PRId64
+ " iops_tokens_max=%.2f iops_tokens=%.2f"
+ " tdiff_usec=%" PRId64 " delay_usec=%" PRIu32 " \n",
+ MODULE_NAME,
+ rl->snum,
+ rl->oper,
+ rl->iops_total,
+ rl->iops_limit,
+ rl->iops_tokens_max,
+ rl->iops_tokens,
+ tdiff_usec,
+ delay_usec);
+ }
+ if (rl->bw_limit > 0) {
+ DBG_DEBUG("[%s snum:%d %s]"
+ " bytes_total=%" PRId64 " bw_limit=%" PRId64
+ " bytes_tokens_max=%.2f bytes_tokens=%.2f"
+ " nbytes=%" PRId64 " tdiff_usec=%" PRId64
+ " delay_usec=%" PRIu32 " \n",
+ MODULE_NAME,
+ rl->snum,
+ rl->oper,
+ rl->bytes_total,
+ rl->bw_limit,
+ rl->bytes_tokens_max,
+ rl->bytes_tokens,
+ nbytes,
+ tdiff_usec,
--
Samba Shared Repository
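
Note on ratelimiter_need_renew in the diff above: the function forces a
token renewal in three cases (first use, 60 seconds of idle time, 1200
seconds since the base timestamp). The sketch below restates those checks
as a stand-alone classifier so the order of the checks is easy to see. The
enum and the function name classify_renew are hypothetical and are not part
of the module; the thresholds are the ones in the diff.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical labels for the three renewal triggers. */
enum renew_reason {
	RENEW_NONE,
	RENEW_FIRST,    /* base timestamp not set yet */
	RENEW_IDLE,     /* 60 seconds or more since the last request */
	RENEW_PERIODIC  /* 1200 seconds or more since the base timestamp */
};

/* Timestamps are whole seconds taken from a monotonic clock. */
static enum renew_reason classify_renew(time_t base_sec,
					time_t last_sec,
					time_t now_sec)
{
	if (base_sec == 0) {
		return RENEW_FIRST;
	}
	/* A monotonic clock never runs backwards. */
	assert(now_sec >= last_sec);
	if ((now_sec - last_sec) >= 60) {
		return RENEW_IDLE;
	}
	if ((now_sec - base_sec) >= 1200) {
		return RENEW_PERIODIC;
	}
	return RENEW_NONE;
}
```

The idle check runs before the periodic one, so a share that has been quiet
for a minute is reported as idle even if the base timestamp is also older
than 20 minutes.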