Re: [PATCH v2] btrfs: use customized batch size for total_bytes_pinned

ethanlien Thu, 12 Jul 2018 19:28:31 -0700

Omar Sandoval 於 2018-07-13 06:19 寫到:

On Wed, Jul 11, 2018 at 11:59:36PM +0800, Ethan Lien wrote:
In commit b150a4f10d878 ("Btrfs: use a percpu to keep track ofpossiblypinned bytes") we use total_bytes_pinned to track how many bytes wearegoing to free in this transaction. When we are close to ENOSPC, wecheck itand know if we can make the allocation by commit the currenttransaction.
For every data/metadata extent we are going to free, we add
total_bytes_pinned in btrfs_free_extent() and btrfs_free_tree_block(),andrelease it in unpin_extent_range() when we finish the transaction. Sothis
is a variable we frequently update but rarely read - just the suitable
use of percpu_counter. But in previous commit we updatetotal_bytes_pinned
by default 32 batch size, making every update essentially a spin lock
protected update. Since every spin lock/unlock operation involvessyncinga globally used variable and some kind of barrier in a SMP system,this ismore expensive than using total_bytes_pinned as a simple atomic64_t.So
fix this by using a customized batch size. Since we only read
total_bytes_pinned when we are close to ENOSPC and fail to alloc newchunk,we can use a really large batch size and have nearly no penalty inmost
cases.
[Test]
We test the patch on a 4-cores x86 machine:
1. falloate a 16GiB size test file.
2. take snapshot (so all following writes will be cow write).
3. run a 180 sec, 4 jobs, 4K random write fio on test file.
We also add a temporary lockdep class on percpu_counter's spin lockused
by total_bytes_pinned to track lock_stat.


[Results]
unpatched:
lock_stat version 0.4
-----------------------------------------------------------------------
                              class name    con-bounces    contentions
waittime-min waittime-max waittime-total waittime-avgacq-bouncesacquisitions holdtime-min holdtime-max holdtime-totalholdtime-avg
               total_bytes_pinned_percpu:            82             82
0.21 0.61 29.46 0.36298340635973 0.09 11.01 173476.250.27
patched:
lock_stat version 0.4
-----------------------------------------------------------------------
                              class name    con-bounces    contentions
waittime-min waittime-max waittime-total waittime-avgacq-bouncesacquisitions holdtime-min holdtime-max holdtime-totalholdtime-avg
               total_bytes_pinned_percpu:             1              1
0.62 0.62 0.62 0.621360131542 0.14 9.61 11016.900.35
[Analysis]
Since the spin lock only protect a single in-memory variable, the
contentions (number of lock acquisitions that had to wait) in both
unpatched and patched version are low. But when we see acquisitionsandacq-bounces, we get much lower counts in patched version. Here themost
important metric is acq-bounces. It means how many times the lock get
transferred between different cpus, so the patch can really recude
cacheline bouncing of spin lock (also the global counter ofpercpu_counter)
in a SMP system.

Fixes: b150a4f10d878 ("Btrfs: use a percpu to keep track of possibly
pinned bytes")

Signed-off-by: Ethan Lien <ethanl...@synology.com>
---

V2:
        Rewrite commit comments.
        Add lock_stat test.
        Pull dirty_metadata_bytes out to a separate patch.

 fs/btrfs/ctree.h       |  1 +
fs/btrfs/extent-tree.c | 46++++++++++++++++++++++++++++--------------
 2 files changed, 32 insertions(+), 15 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 118346aceea9..df682a521635 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -422,6 +422,7 @@ struct btrfs_space_info {
         * time the transaction commits.
         */
        struct percpu_counter total_bytes_pinned;
+       s32 total_bytes_pinned_batch;
Can this just be a constant instead of adding it to space_info?


Yes constant is better here, I'll resend it, thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2] btrfs: use customized batch size for total_bytes_pinned

Reply via email to