Hi Evan,

And just to add to the delay, I've been off sick...

First of all - thanks for the patch!

The patch adds support for currently unsupported platforms, without actually
changing any code paths for currently supported platforms. So from that
perspective, I would not object strongly to it being applied as is.

However, I do have a few minor comments on the code:
- The v5 code doesn't actually make use of the kuser helper barriers
  in its versions of opal_atomic_cmpset_acq/rel.
- The line
  +#if (OMPI_GCC_INLINE_ASSEMBLY || (OPAL_ASM_ARM_VERSION < 6))
  is correct, but does my head in. Could another define like
  OPAL_ARM_KUSER_BARRIERS or similar be added by the barrier definition
  and used here instead, to improve readability?
- Could you change the line
  +#if (OPAL_ASM_ARM_VERSION >= 6 && OMPI_GCC_INLINE_ASSEMBLY)
  to
  +#if (OMPI_GCC_INLINE_ASSEMBLY && (OPAL_ASM_ARM_VERSION >= 6))
  again for readability?

And a few higher-level comments/suggestions:
- The patch does not do a runtime test for kernel helper version. While
  normally not a problem, this could cause tricky to debug issues if 
  running on top of old kernels (as in "executing uninitialized
  memory" tricky).
  Not sure what the best way to hook such a test in would be though.
- This patch does not include out-of-line assembly versions
  (in opal/asm/base) for the atomic operations. However I am not sure
  if this causes any practical problems.
- If 64-bit atomics are desirable, these are actually possible on most
  ARMv6 platforms (including the Raspberry PI), so a configure test on
  whether LDREXD assembles without errors could be used to detect this.

Longer-term, I'd like to migrate to using the new GCC __atomic* intrinsics
(in 4.7 - http://gcc.gnu.org/wiki/Atomic/GCCMM). However, the old __sync*
intrinsic were a bit heavyweight so until 4.7 is ubiquitous I prefer to keep
the inline asm, and now the kuser calls.

/
        Leif

> -----Original Message-----
> From: Jeffrey Squyres [mailto:jsquy...@cisco.com]
> Sent: 19 April 2012 16:21
> To: Open MPI Developers; Leif Lindholm
> Subject: Re: [OMPI devel] [PATCH] Open MPI on ARMv5
> 
> Thanks Evan!
> 
> (sorry for the delay in replying -- I was on vacation all last week and
> I'm *still* catching up...)
> 
> Lief -- does this look good to you?
> 
> 
> On Apr 13, 2012, at 11:13 PM, Evan Clinton wrote:
> 
> > At present Open MPI only supports ARMv7 processors.  Attached is a
> > patch against current trunk (r26270) that extends the atomic
> > operations and memory barriers code to work with ARMv5 and ARMv6
> ones,
> > too.
> >
> > For v6, the only changes were to use "mcr p15, 0, r0, c7, c10, 5"
> > instead of the unavailable DMB instruction, and to disable the 64 bit
> > compare-exchange function (which I understand is not vital for Open
> > MPI on 32 bit platforms?).  For v5, it was a bit trickier; the
> > processor lacks nice memory barrier instructions or proper atomic
> > operations.  Fortunately, the Linux kernel offers several helper
> > functions on ARM, and I've used those here.
> >
> > The changes build and pass all of the assembly-related tests in the
> > test folder and the hello world examples run on my "armv5tel" box
> > running Debian with Linux 2.6.32-5.  It should also run fine on ARMv6
> > boxes, and presumably v4, but I don't have either to test on.
> >
> > Documentation for the Linux kernel helper functions:
> >
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob;f=D
> ocumentation/arm/kernel_user_helpers.txt
> >
> > I've sent in a contributor agreement so there should be no IP
> problems.
> >
> > Hopefully this is useful,
> > Evan Clinton
> > <ompi_armv5.diff>_______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 




Reply via email to