On 07/02/2017 14:57, Mateusz Guzik wrote:
On Sun, Feb 05, 2017 at 03:17:46PM +0000, Alexey Dokuchaev wrote:
On Sun, Feb 05, 2017 at 04:00:06AM +0100, Mateusz Guzik wrote:
For instance, plugging an unused variable, a memory leak, doing a
lockless check first etc. are all pretty standard and unless there is
something unusual going on (e.g. complicated circumstances leading to a
leak) there is not much to explain. In particular, I don't see why
anyone would explain why leaks are bad on each commit plugging one.
Right; these (unused variable, resource leaks) usually do not warrant
elaborate explanation.

[ Some linefeeds below were trimmed for brevity ]
The gist is as follows: there are plenty of cases where the kernel wants
to atomically replace the value of a particular variable. Sometimes,
like in this commit, we want to bump the counter by 1, but only if the
current value is not 0. For that we need to read the value, see if it is
0 and if not, try to replace what we read with what we read + 1. We
cannot just increment as the value could have changed to 0 in the
meantime.
But this also means that multiple cpus doing the same operation on the
same variable will trip on each other - one will succeed while the rest
will have to retry.
Prior to this commit, each retry attempt would explicitly re-read the
value. This induces cache coherency traffic slowing everyone down.
amd64 has the nice property of giving us the value it found eleminating
the need to explicitly re-read it. There is similar story on i386 and
sparc.
Other architectures may also benefit from this, but that I did not
benchmark.

In short[,] under contention atomic_fcmpset is going to be faster than
atomic_cmpset.
I did not benchmark this particular change, but a switch of the sort
easily gives 10%+ in microbenchmarks on amd64.
That said, while one can argue this optimizes the code, it really
depessimizes it as something of the sort should have been already
employed.
Given the above, IMHO it's quite far from an obvious or of manpage-lookup
thing, and thus requires proper explanation in the commit log.

If the aformenteiond explanation is necessary, the place for it is in
the man page. There are already several commits with fcmpset and there
will be more to come. I don't see why any of them would convey the
information.

The details of why performance under contention of atomic_fcmpset is better than atomic_cmpset, a manpage would be nice.

All I'm suggesting is, while one could guess this may be a performance or possibly a compatibility thing, the reason is not obvious, so a small piece of detail on why the change was done should always be included.

For this one something like the following would be nice:

Switch fget_unlocked to atomic_fcmpset

Improve performance under contention by switching fget_unlocked to
use atomic_fcmpset.

With small piece of additional information, its clear the reason for the change (why) was to improve performance and anyone who wants more detail on why this would be the case can research it via a manpage or other resources, wouldn't you agree?

    Regards
    Steve
_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Reply via email to