On Wed, Jun 07, 2017 at 02:01:46PM +0300, Timofey Titovets wrote:
> 2017-06-07 12:45 GMT+03:00 Filipe Manana <fdman...@gmail.com>:
> > On Wed, Jun 7, 2017 at 1:58 AM, Timofey Titovets <nefelim...@gmail.com> 
> > wrote:
> >> In worst case code do 4 comparison,
> >> just add some new checks to switch check branch faster
> >> now in worst case code do 3 comparison
> >
> > Some comments below.
> >
> >>
> >> Signed-off-by: Timofey Titovets <nefelim...@gmail.com>
> >> ---
> >>  fs/btrfs/ctree.c | 20 +++++++++++---------
> >>  1 file changed, 11 insertions(+), 9 deletions(-)
> >>
> >> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> >> index a3a75f1de..318379531 100644
> >> --- a/fs/btrfs/ctree.c
> >> +++ b/fs/btrfs/ctree.c
> >> @@ -460,15 +460,17 @@ __tree_mod_log_insert(struct btrfs_fs_info *fs_info, 
> >> struct tree_mod_elem *tm)
> >>         while (*new) {
> >>                 cur = rb_entry(*new, struct tree_mod_elem, node);
> >>                 parent = *new;
> >> -               if (cur->logical < tm->logical)
> >> -                       new = &((*new)->rb_left);
> >> -               else if (cur->logical > tm->logical)
> >> -                       new = &((*new)->rb_right);
> >> -               else if (cur->seq < tm->seq)
> >> -                       new = &((*new)->rb_left);
> >> -               else if (cur->seq > tm->seq)
> >> -                       new = &((*new)->rb_right);
> >> -               else
> >> +               if (cur->logical != tm->logical) {
> >> +                       if (cur->logical < tm->logical)
> >> +                               new = &((*new)->rb_left);
> >
> > So now when cur->logical is less then tm->logical, we do 2 comparisons
> > instead of 1.
> > Is this case much less common always? If it were it could be a good change.
> > What guarantees that this change will give better performance for
> > common workloads? What guarantees you the example I just gave isn't
> > typical for most/many common workloads?
> >
> > And most importantly, did you actually do some benchmarks that justify
> > this change? Is this a hot code path that would benefit from such
> > changes?
> > (Since it's the tree mod log, seldom used, I really doubt it).
> >
> > Same comments apply to all the other similar patches you have sent.
> 
> Nope, i just lack of expirience to do such benchmarks,
> same for understand all btrfs code, it's difficult to say that path is 
> hot/cold.

So if you call the patchset cleanups and then claim some
performance-related improvements, you can expect to be called out to
provide numbers or an analysis.

I've checked the resulting assembly of "Btrfs: backref_comp decrease max
compare count" that does similar conversion of the sequence of ifs.
Surprise surprise, the assembly does not differ regarding number of
branches. This at mininimum can be done before any profiling to see if
this makes sense.

If the assembly stays the same, the code readability may be preferred,
which is one of the points of cleanup patches. In that case I prefer the
current code over your change, but it may be just personal preference
and because I'm used to reading the "linear ifs" pattern.

> I just did reread all btrfs code, analize that i understand and trying
> fix places that looks questionably.
> 
> In most cases i can explain why this can help (IMHO):
> 01 - __compare_inode_defrag - we compare inodes, by my expirience
> in most distros and setups btrfs have 1-2 subvolumes (root_ids), so in
> most time root_id1 == root_id2
> 02 - backref_comp - same
> 03 - ref_node_cmp - same
> 04 - ref_tree_add - just a cleanup, compiller do the same
> https://godbolt.org/g/Ww93ws
> 05 - add_all_parents - just a cleanup, compiller do the same
> https://godbolt.org/g/srSGeW
> 07 - end_workqueue_bio - if i understand correctly this code executed
> on every write,
> so by convertion to switch case will save several cpu cycles on every write
> 
> For that function (06 -  __tree_mod_log_insert):
> i do the same as for 01-03 "by inertia" - as i just not know all code,
> i just assume that as this fuction work with rb tree, values will be
> equally distributed

>From my point above, I'll not merge patches 1, 2, 3 and 6. Patches 4 and 5
seem ok.

Patch 7 does not result in the equivalent code, as reported by the
kbuild robot. But converting the ifs to switch is desired in
end_workqueue_bio, namely the second branch as it's not clear how are
some of the BTRFS_WQ_ENDIO_* values handled.

If you still care, please resend 4 and 5. The changelog does not need to
repeat the code, more describe that the change simplifies the code to
make it more readable.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to