On 04/25/12 01:50, Evan Clinton wrote:
- The v5 code doesn't actually make use of the kuser helper barriers
in its versions of opal_atomic_cmpset_acq/rel.
Quote the documentation, __kuser_cmpxchg "already includes memory
barriers as needed," so the code using it shouldn't need anything
extra.
Fair enough - but could you put a comment to this effect into the patch?
Tweaked the patch, should be a little nicer now.
Thank you.
- The patch does not do a runtime test for kernel helper version. While
normally not a problem, this could cause tricky to debug issues if
running on top of old kernels (as in "executing uninitialized
memory" tricky).
I'm not familiar enough with the codebase to know where it should go,
and I suspect the obvious check based on the __kuser_helper_version
will not be available on kernels much older than the ones with the
features we need anyway. We'd only have a problem on kernels older
than 2.6.15, which was released in January of 2006, and I'm only
confident we'd have __kuser_help_version from 2.6.12 onward.
Hmm, true enough...
But I'm still not too happy about even the very unlikely risk of
something executing "random stuff" depending on kernel version.
For now, could you update the comments to that effect that:
"These kernel functions are available on kernel versions 2.6.15 and
greater" + ", and running this software on earlier versions will result
in undefined behaviour."
?
- This patch does not include out-of-line assembly versions
(in opal/asm/base) for the atomic operations. However I am not sure
if this causes any practical problems.
As far as I can tell, this isn't an issue, since no actual new
assembly is added or changed.
It doesn't cause any problems with existing code - so what I'm really
saying is that I don't know that this won't cause issues specifically
for these new targets. Unless Jeff objects, I'm happy.
- If 64-bit atomics are desirable, these are actually possible on most
ARMv6 platforms (including the Raspberry PI), so a configure test on
whether LDREXD assembles without errors could be used to detect this.
I'd add another case to the architecture detection in the config
script, but I'm not sure what strings to match on (i.e. uname outputs)
for detecting presence of LDREXD, and I'm not aware of any particular
benefit to having the 64bit functions.
What I'm suggesting is not to parse information out of the build host,
but rather using the configured toolchain and options and try to
assemble the 64-bit atomic instructions. This would make it easy to
rebuild a generic ARMv6 package to support 64-bit atomics by just adding
-march=armv6zk to CFLAGS.
One use-case for the 64-bit atomics is in lockless list algorithms where
they let you update two pointers or a pointer and a semaphore
atomically. Whether such code would realistically use Open MPI
primitives, I'm not the right person to say.
But this is not something that needs doing for this patch to be
accepted. If you can add the two comments, I'm happy for this to go in.
/
Leif
-- IMPORTANT NOTICE: The contents of this email and any attachments are
confidential and may also be privileged. If you are not the intended recipient,
please notify the sender immediately and do not disclose the contents to any
other person, use it for any purpose, or store or copy the information in any
medium. Thank you.