[OMPI devel] [PATCH] Open MPI on ARMv5

2012-04-13 Thread Evan Clinton
At present Open MPI only supports ARMv7 processors. Attached is a patch against current trunk (r26270) that extends the atomic operations and memory barriers code to work with ARMv5 and ARMv6 ones, too. For v6, the only changes were to use "mcr p15, 0, r0, c7, c10, 5" instead of the unavailable D

Re: [OMPI devel] Non-zero exit status

2012-04-13 Thread Ralph Castain
Did you have the param set? I found some missing code in the orted errmgr that contributed to it, but unless you had set the param in your test, there is no way it would abort no matter how many procs exit with non-zero status. I'm guessing you have that param set in your test due to our earlier

Re: [OMPI devel] Non-zero exit status

2012-04-13 Thread TERRY DONTJE
I could see that if fewer than N processes exit with a non-zero exit code, ORTE may choose not to abort the job. However, if all N processes have exited or aborted, I expect everything to clean up and mpirun to exit. It does not do that at the moment, which I think is what is causing most of th

[OMPI devel] Non-zero exit status

2012-04-13 Thread Ralph Castain
This has come up again because some of the MTT tests depend on a specific behavior when a process exits with a non-zero status - in this case, they expect ORTE to abort the job. At some point, the default had been switched to NOT abort the job if a process exited with a non-zero status. So I'll

Re: [OMPI devel] [EXTERNAL] Re: r26255 has made openib unusable on Solaris platforms

2012-04-13 Thread Ralph Castain
I don't know about "drama", but people did clearly explain to you why this approach was unacceptable. You simply cannot cross-link at the component level. If you need something from the opal/mca/memory framework, you have to get it from the framework level. Doesn't seem that hard a concept to g

Re: [OMPI devel] [EXTERNAL] Re: r26255 has made openib unusable on Solaris platforms

2012-04-13 Thread Mike Dubman
Too much drama - we will fix it to detect hook availability at the configure stage, which will bring your life back to normal. The problem is not the Mellanox hardware but the Intel PCI bus implementation, which charges extra latency if buffers are not aligned. The patch is a workaround for this problem and help t

Re: [OMPI devel] [EXTERNAL] Re: r26255 has made openib unusable on Solaris platforms

2012-04-13 Thread TERRY DONTJE
On 4/13/2012 12:06 PM, Barrett, Brian W wrote: r26255 is awful as a patch. It doesn't work on any non-Linux platform, which is unpleasant. But worse, what does it possibly accomplish? In codes other than benchmarks, there's no advantage to aligning the pointer to 32 or 64 byte boundaries, as

Re: [OMPI devel] RTE node allocation component

2012-04-13 Thread Ralph Castain
Looks like you are using an old version - the trunk RAS has changed a bit. I'll shortly be implementing further changes to support dynamic allocation requests that might be relevant here as well. Adding job data to the RAS base isn't a good idea - remember, multiple jobs can be launching at the

Re: [OMPI devel] [EXTERNAL] Re: r26255 has made openib unusable on Solaris platforms

2012-04-13 Thread Barrett, Brian W
r26255 is awful as a patch. It doesn't work on any non-Linux platform, which is unpleasant. But worse, what does it possibly accomplish? In codes other than benchmarks, there's no advantage to aligning the pointer to 32 or 64 byte boundaries, as the malloced buffer very rarely is exactly what is

[OMPI devel] RTE node allocation component

2012-04-13 Thread Alex Margolin
Hi, The next component I'm writing is a component for allocating nodes to run the processes of an MPI job. Suppose I have a "getbestnode" executable which not only tells me the best location for spawning a new process, but it also reserves the space (for some time), so that every time I run it I

Re: [OMPI devel] r26255 has made openib unusable on Solaris platforms

2012-04-13 Thread TERRY DONTJE
I am thinking MEMORY_LINUX_PTMALLOC2 is probably the right define to key off of, but it is really going to look gross ifdef'ing out the lines that access the Linux memory module. One other idea I have is to create a dummy __malloc_hook in the Solaris memory module, but might there be ot

[OMPI devel] r26255 has made openib unusable on Solaris platforms

2012-04-13 Thread TERRY DONTJE
r26255 forces the use of __malloc_hook, which is implemented in opal/mca/memory/linux. However, that component is not compiled into the library when built on Solaris, which causes a "referenced symbol not found" error when libmpi tries to load the openib btl. I am looking at how to fix this now, but if someone has a