Re: [OMPI devel] [OMPI svn] svn:open-mpi r14444

2007-04-25 Thread George Bosilca
In fact, if you find any internal Open MPI code that calls the MPI
functions directly, you should raise a flag. It's a design choice:
the MPI API is for the users, and when you need to call any MPI-like
function from inside ompi you need to take another path. In the
collectives we're using the PML macros. One of the reasons is
performance: we know that the arguments for the functions we call are
correct, so there is no reason to check them. For this particular
case, the collective modules never call the PML with the request set
to NULL; instead we use MPI_REQUEST_NULL.
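To illustrate why a null-request handle is safer than a NULL pointer on an unchecked internal path, here is a minimal sketch in C. It assumes the null handle points at a real, statically allocated object, so its type field can be read without a NULL test. All names here (request_t, REQUEST_NULL, internal_start, the REQ_KIND_* values) are hypothetical and are not the Open MPI identifiers.

```c
#include <stddef.h>

/* Hypothetical request kinds; illustrative names only. */
typedef enum { REQ_KIND_NULL, REQ_KIND_PML } req_kind_t;

typedef struct {
    req_kind_t kind;   /* which layer owns this request */
    int started;       /* set to 1 once the request has been started */
} request_t;

/* A real object standing in for "no request" (assumption: the null
 * handle is an address of a valid object, never a NULL pointer). */
static request_t request_null_obj = { REQ_KIND_NULL, 0 };
#define REQUEST_NULL (&request_null_obj)

/* Internal start path: no NULL test, because internal callers pass
 * REQUEST_NULL instead of NULL. Returns the number of requests started. */
static size_t internal_start(size_t count, request_t **reqs)
{
    size_t started = 0;
    for (size_t i = 0; i < count; i++) {
        request_t *r = reqs[i];
        if (r->kind != REQ_KIND_PML) {
            continue;  /* sentinel or foreign request: inspect, then skip */
        }
        r->started = 1;
        started++;
    }
    return started;
}
```

With an array such as { &a, REQUEST_NULL, &b }, the loop reads the sentinel's kind field, skips it, and starts only a and b. Passing a NULL pointer instead would dereference NULL, which is why the caller has to guarantee it never does so.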


  george.


On Apr 24, 2007, at 2:46 PM, Josh Hursey wrote:


Actually, after doing a quick grep through the code base it seems
that the tuned collectives use the PML start interface without going
through the MPI-level call (as far as I could tell). Since I don't
know the full impact of such a change, I'm not going to make it, and
will leave it for someone more knowledgeable in those systems.
Someone else can better ensure proper testing of the impact of this
change.

Sorry,
Josh

On Apr 20, 2007, at 4:05 PM, Josh Hursey wrote:


Yeah I was not actually sure what the standard said about passing an
array of requests and having one of the elements be NULL. This just
seemed like a subtle bug when I was looking through the code.

Taking a quick look at the check in mpi/c/startall.c it seems that we
do check for this case there and error out if any element is NULL, so
I agree that we can just remove this from the file. If no one gets to
it before tomorrow sometime (or there are objections) then I'll take
the NULL check out.

Cheers,
Josh

On Apr 20, 2007, at 3:28 PM, George Bosilca wrote:


I think the NULL test is a leftover from long ago. At one point in the
past we decided that all MPI-related tests have to be done outside the
PML functions (i.e. in the MPI layer). The test for request == NULL is
present in start.c and startall.c. Anywhere else (i.e. where we use the
pml_start call internally) we can make sure that this doesn't happen.
Therefore, the test can be safely removed from the startall function.
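As a rough sketch of the split described above (argument tests in the user-facing layer, none in the internal one), something along these lines, in C. The names api_startall, internal_startall, sketch_request_t and the SKETCH_* codes are made up for illustration; this is not the actual code in start.c, startall.c or the PML.

```c
#include <stddef.h>

/* Hypothetical return codes for this sketch only. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_REQUEST = 1 };

typedef struct {
    int active;   /* set to 1 once the request has been started */
} sketch_request_t;

/* Internal layer: trusts its callers, so it performs no argument tests. */
static int internal_startall(size_t count, sketch_request_t **reqs)
{
    for (size_t i = 0; i < count; i++) {
        reqs[i]->active = 1;
    }
    return SKETCH_SUCCESS;
}

/* User-facing layer: validates the whole array before starting anything,
 * then hands off to the unchecked internal function. */
static int api_startall(size_t count, sketch_request_t **reqs)
{
    if (reqs == NULL && count > 0) {
        return SKETCH_ERR_REQUEST;
    }
    for (size_t i = 0; i < count; i++) {
        if (reqs[i] == NULL) {
            return SKETCH_ERR_REQUEST;   /* error out on any NULL element */
        }
    }
    return internal_startall(count, reqs);
}
```

User code only ever reaches internal_startall through api_startall, so the tests run once per user call. Internal callers that build their own request arrays can call internal_startall directly and skip the tests, provided they never put a NULL in the array.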

   george.


On Fri, 20 Apr 2007, jjhur...@osl.iu.edu wrote:


Author: jjhursey
Date: 2007-04-20 13:17:11 EDT (Fri, 20 Apr 2007)
New Revision: 1
URL: https://svn.open-mpi.org/trac/ompi/changeset/1

Log:
Check for NULL before trying to use the variable.


Text files modified:
  trunk/ompi/mca/pml/ob1/pml_ob1_start.c | 4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)

Modified: trunk/ompi/mca/pml/ob1/pml_ob1_start.c
===================================================================
--- trunk/ompi/mca/pml/ob1/pml_ob1_start.c  (original)
+++ trunk/ompi/mca/pml/ob1/pml_ob1_start.c  2007-04-20 13:17:11 EDT (Fri, 20 Apr 2007)
@@ -32,11 +32,11 @@

for(i=0; ireq_type) {
continue;
}
-if(NULL == pml_request)
-continue;

/* If the persistent request is currently active - obtain the
 * request lock and verify the status is incomplete. if the
___
svn mailing list
s...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/svn



"We must accept finite disappointment, but we must never lose
infinite hope."
   Martin Luther King

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/









Re: [OMPI devel] Fancy ORTE/MPIRUN bugs

2007-04-25 Thread Aurelien Bouteiller


All bugs occur on 32-bit Intel architecture under Mac OS X using
gcc 4.2



The other one occurs when running an MPI program without mpirun


As of r14440, I'm unable to replicate, but it could have been one of
those getting lucky issues.  Can you see if the problem is still
occurring?




Unfortunately my Mac is away for repair at AppleCare. I installed
everything on a Linux box but I am unable to reproduce the error.
Apart from the OS, most of the software is the same version, except
that I use gcc 4.1.2 instead of gcc 4.2 (testing). I will see whether
moving to gcc 4.1.2 (stable) fixes the bug as soon as my Mac is back.


My Mac is back. I am unable to reproduce the bug with either gcc 4.0.1
or 4.1.2. It might be a bug caused by a regression in gcc 4.2,
as it is still unstable.


Aurelien