The old opal_atomic_cmpset_32 worked:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
        SMPLOCK "cmpxchgl %1,%2 \n\t"
                "sete %0 \n\t"
        : "=qm" (ret)
        : "q"(newval), "m"(*addr), "a"(oldval)
        : "memory");
    return (int)ret;
}
The new opal_atomic_cmpset_32 fails:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
        SMPLOCK "cmpxchgl %3,%4 \n\t"
                "sete %0 \n\t"
        : "=qm" (ret), "=a" (oldval), "=m" (*addr)
        : "q"(newval), "m"(*addr), "1"(oldval));
    return (int)ret;
}
**However**, if you put back the "memory" clobber line (the 3rd :), it works:

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
    unsigned char ret;
    __asm__ __volatile__ (
        SMPLOCK "cmpxchgl %3,%4 \n\t"
                "sete %0 \n\t"
        : "=qm" (ret), "=a" (oldval), "=m" (*addr)
        : "q"(newval), "m"(*addr), "1"(oldval)
        : "memory");
    return (int)ret;
}
This works in a test case for pathcc, gcc, icc, pgcc, SUN Studio cc, and open64 (pathscale lineage - which also fails with 1.4.1).
Also, the SMPLOCK above is defined as "lock; " - the ";" is a GNU as statement delimiter - is that right? It seems to work with or without the ";".
Also, a question - I see you generate, via perl, another "lock" asm file which you put into opal/asm/generated/<whatever, e.g. atomic-amd64-linux.s> and stick into libasm - what you generate there hasn't changed across 1.4 -> 1.4.1 -> svn trunk, correct?
DM
On Tue, 9 Feb 2010, Jeff Squyres wrote:
Perhaps someone with a pathscale compiler support contract can investigate this
with them.
Have them contact us if they want/need help understanding our atomics; we're
happy to explain, etc. (the atomics are fairly localized to a small part of
OMPI).
On Feb 9, 2010, at 11:42 AM, Mostyn Lewis wrote:
All,
FWIW, Pathscale is dying in the new atomics in 1.4.1 (and svn trunk) - actually looping - from gdb:
opal_progress_event_users_decrement () at
../.././opal/include/opal/sys/atomic_impl.h:61
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
Current language: auto; currently asm
(gdb) where
#0 opal_progress_event_users_decrement () at
../.././opal/include/opal/sys/atomic_impl.h:61
#1 0x0000000000000001 in ?? ()
#2 0x00002aec4cf6a5e0 in ?? ()
#3 0x00000000000000eb in ?? ()
#4 0x00002aec4cfb57e0 in ompi_mpi_init () at
../.././ompi/runtime/ompi_mpi_init.c:818
#5 0x00007fff5db3bd58 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) list
56 {
57 int32_t oldval;
58
59 do {
60 oldval = *addr;
61 } while (0 == opal_atomic_cmpset_32(addr, oldval, oldval - delta));
62 return (oldval - delta);
63 }
64 #endif /* OPAL_HAVE_ATOMIC_SUB_32 */
65
(gdb)
DM
On Tue, 9 Feb 2010, Jeff Squyres wrote:
FWIW, I have had terrible luck with the pathscale compiler over the years.
Repeated attempts to get support from them -- even when I was a paying customer
-- resulted in no help (e.g., a pathCC bug with the OMPI C++ bindings that I
filed years ago was never resolved).
Is this compiler even supported anymore? I.e., is there a support department
somewhere that you have a hope of getting any help from?
I can't say for sure, of course, but if MPI hello world hangs, it smells like a compiler
bug. You might want to attach to "hello world" in a debugger and see where
it's hung. You might need to compile OMPI with debugging symbols to get any meaningful
information.
** NOTE: My personal feelings about the pathscale compiler suite do not reflect
anyone else's feelings in the Open MPI community. Perhaps someone could change
my mind someday, but *I* have personally given up on this compiler. :-(
On Feb 8, 2010, at 2:38 AM, Rafael Arco Arredondo wrote:
Hello,
It does work with version 1.4. This is the hello world that hangs with
1.4.1:
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int node, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &node);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello World from Node %d of %d.\n", node, size);
    MPI_Finalize();
    return 0;
}
On Tue, 26-01-2010 at 03:57 -0500, Åke Sandgren wrote:
1 - Do you have problems with openmpi 1.4 too? (I don't, haven't built
1.4.1 yet)
2 - There is a bug in the pathscale compiler with -fPIC and -g that
generates incorrect dwarf2 data so debuggers get really confused and
will have BIG problems debugging the code. I'm chasing them to get a
fix...
3 - Do you have example code that has problems?
--
Rafael Arco Arredondo
Centro de Servicios de Informática y Redes de Comunicaciones
Universidad de Granada
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/