[OMPI devel] Bug (wrong LB?) when using cascading derived data types

2009-03-02 Thread Markus Blatt
Hi,

I already posted this accidentally on the users list, but as it seems
like a bug, I suppose this list is more appropriate.

In one of my applications I am using cascaded derived MPI datatypes
created with MPI_Type_struct. One of these types is used to send just
a part (one MPI_CHAR) of a struct consisting of an int followed by two
chars, i.e., the int at the beginning is/should be ignored.

This works fine if I use this data type on its own. 

Unfortunately I need to send another struct that contains an int and
the int-char-char struct from above. Again I construct a custom MPI
data type for this.

When sending this cascaded data type, it seems that the offset of the
char in the inner custom type is disregarded on the receiving end, and
the received data ('1') is stored in the first int instead of the
following char.

I have tested this code with both LAM and MPICH. There it worked as
expected (storing the '1' in the first char).

The last two lines of the output of the attached test case read

received global=10 attribute=0 (local=1 public=0)
received  attribute=1 (local=100 public=0)

for Open MPI instead of

received global=10 attribute=1 (local=100 public=0)
received  attribute=1 (local=100 public=0)

for LAM and MPICH.

This problem occurs with versions 1.3-2 and 1.2.7~rc2-2 of Open MPI
on Debian.
At first sight it seemed correlated with bug #1677, but as that should
have been fixed since 1.2.9, it is probably something else.


Cheers,

Markus



#include "mpi.h"
#include <iostream>

struct LocalIndex
{
  int local_;
  char attribute_;
  char public_;
};


struct IndexPair
{
  int global_;
  LocalIndex local_;
};


int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  int rank, size;

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  if(size<2)
    {
      std::cerr<<"no procs has to be >2"<<std::endl;
      /* [remainder of the attached test case truncated in the archive] */

ompi_info.tgz
Description: GNU Unix tar archive


[OMPI devel] 1.3.1rc2 has been released

2009-03-02 Thread Jeff Squyres

A bunch of ORTE and VT fixes are the big differences between rc1 and rc2.

Please test 1.3.1rc2 ASAP.  Thanks!

http://www.open-mpi.org/software/ompi/v1.3/

--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] Bug (wrong LB?) when using cascading derived data types

2009-03-02 Thread George Bosilca

Markus,

You're right, there was a problem in the code. I'll pass on the gory  
details of the why and how. The problem is now fixed by commit r20674.  
It will be in the next release.


  Thanks,
george.

On Mar 2, 2009, at 10:04 , Markus Blatt wrote:


[quoted message trimmed; the full report appears above]

___

devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




[OMPI devel] calling sendi earlier in the PML

2009-03-02 Thread Eugene Loh
I'm on the verge of giving up moving the sendi call in the PML.  I will 
try one or two last things, including this e-mail asking for feedback.


The idea is that when a BTL goes over a very low-latency interconnect 
(like sm), we really want to shave off whatever we can from the software 
stack.  One way of doing so is to use a "send-immediate" function, which 
a few BTLs (like sm) provide.  The problem is avoiding a bunch of 
overhead introduced by the PML before checking for a "sendi()" call.


Currently, the PML does something like this:

   for ( btl = ... ) {
       if ( SUCCESS == btl->sendi() ) return SUCCESS;
       if ( SUCCESS == btl->send() ) return SUCCESS;
   }
   return ERROR;

That is, it round-robins over all available BTLs, trying sendi() and 
then send() for each one.  If any sendi or send completes 
successfully, we exit the loop successfully.


The problem is that this loop is buried several function calls deep in 
the PML.  Before execution reaches this point, the PML has initialized 
a large "send request" data structure while traversing a (to me) 
complicated call graph of functions.  This introduces a lot of overhead 
that negates much of the speedup we might hope to see from the sendi 
function.  That overhead is unnecessary for a sendi call, but necessary 
for a send call.  I've tried reorganizing the code to defer as much of 
that work as possible -- performing that overhead only if it's needed 
for a send call -- but I've gotten brain cramp every time I've tried 
this reorganization.


I think these are the options:

Option A) Punt!

Option B) Have someone more familiar with the PML make these changes.

Option C) Have Eugene keep working at this because he'll learn more 
about the PML and it's good for his character.


Option D) Go to a strategy in which all BTLs are tried for sendi before 
any of them is tried for a send.  The code would look like this:


   for ( BTL = ... ) if ( SUCCESS == btl_sendi() ) return SUCCESS;
   for ( BTL = ... ) if ( SUCCESS == btl_send() ) return SUCCESS;
   return ERROR;

The reason this is so much easier to achieve is that we can put that 
first loop way up high in the PML (as soon as a send enters the PML, 
avoiding all that expensive overhead) and leave the second loop several 
layers down, where it is today.  George is against this new loop 
structure because he thinks round robin selection of BTLs is most fair 
and distributes the load over BTLs as evenly as possible.  (In contrast, 
the proposed loop would favor BTLs with sendi functions.)  It seems to 
me, however, that favoring BTLs that have sendi functions is exactly the 
right thing to do!  I'm not even convinced that the conditions he's 
worried about are that common:  multiple eager BTLs to poll, one has a 
sendi, and that sendi is not very good or that BTL is getting overloaded.


Anyhow, I like Option D, but George does not.

Option E) Go to a strategy in which the next BTL is tested for a sendi 
function.  If there is one, use it.  If not, just continue with the 
usual heavyweight PML procedure.  This feels a little hackish to me, but 
it means that most of the time that sendi can be called, the 
heavyweight PML overhead will be avoided, while at the same time "fair" 
round-robin polling over the BTLs is maintained.


I'll proceed with Option C for the time being.  If I don't announce 
success or surrender in the next few days, please write to me at the 
insane asylum.


Re: [OMPI devel] ompi v1.3 compilation problem on ia64/gcc/rhel4.7

2009-03-02 Thread Jeff Squyres

Disregard -- it looks like the VT guys have fixed this issue.

Can you test 1.3.1rc2 or later?


On Feb 24, 2009, at 2:02 AM, Mike Dubman wrote:

I searched for similar problems reported to the list and have not  
found any (only icc-related ones, which are irrelevant).

Which previously discussed problems are you referring to?

regards

Mike


On Thu, Feb 19, 2009 at 3:04 PM, Jeff Squyres wrote:
Could this pertain to the other itanium compilation problems that  
were discussed (and not yet resolved) earlier?




On Feb 19, 2009, at 3:52 AM, Mike Dubman wrote:


Hello guys,

We have a compilation problem with ompi v1.3 on Itanium ia64 + gcc +  
rhel 4.7.
It seems that vt_pform_linux.c:46 includes asm/intrinsics.h, which is  
unavailable on rhel47/ia64 in /usr/include/asm but is part of the  
kernel-headers rpm

(in /usr/src/kernels/2.6.9-78.EL-ia64/include/asm-ia64/)


We compile ompi v1.3 from the srpm with:

configure_options="--define 'configure_options --enable-orterun-prefix-by-default --with-openib --enable-mpirun-prefix-by-default'"
rpmbuild_options="--define 'install_in_opt 1' --define 'use_default_rpm_opt_flags 0' --define 'ofed 1' --define 'mflags -j4' --define '_vendor Voltaire' --define 'packager Voltaire'"
rpmbuild --rebuild $configure_options $rpmbuild_options /path/to/openmpi_v1.3_src.rpm


and getting the following error:

tlib/otf/otflib -D_GNU_SOURCE -DBINDIR=\"/opt/openmpi/1.3/bin\" -DDATADIR=\"/opt/openmpi/1.3/share\" -DRFG -DVT_BFD -DVT_MEMHOOK -DVT_IOWRAP -MT vt_pform_linux.o -MD -MP -MF .deps/vt_pform_linux.Tpo -c -o vt_pform_linux.o vt_pform_linux.c

vt_pform_linux.c:46:31: asm/intrinsics.h: No such file or directory
vt_pform_linux.c: In function `vt_pform_wtime':
vt_pform_linux.c:172: error: `_IA64_REG_AR_ITC' undeclared (first use in this function)
vt_pform_linux.c:172: error: (Each undeclared identifier is reported only once
vt_pform_linux.c:172: error: for each function it appears in.)
make[5]: *** [vt_pform_linux.o] Error 1
make[5]: *** Waiting for unfinished jobs
mv -f .deps/vt_otf_trc.Tpo .deps/vt_otf_trc.Po
make[5]: *** Waiting for unfinished jobs
mv -f .deps/vt_otf_gen.Tpo .deps/vt_otf_gen.Po
mv -f .deps/vt_iowrap.Tpo .deps/vt_iowrap.Po
make[5]: Leaving directory `/tmp/buildopenmpi-30371/BUILD/openmpi-1.3/ompi/contrib/vt/vt/vtlib'
make[4]: Leaving directory `/tmp/buildopenmpi-30371/BUILD/openmpi-1.3/ompi/contrib/vt/vt'
*** [all-recursive] Error 1


Please suggest.

Thanks







Re: [OMPI devel] PML Start error?

2009-03-02 Thread George Bosilca
Right, this should be reinitialized at the beginning of each loop.  
However, the current code works fine; it only calls  
ompi_convertor_set_position twice if the condition is true. That  
function checks whether the current position matches the requested  
one, and does nothing if that is the case.


  george.

On Feb 28, 2009, at 09:26 , Jeff Squyres wrote:


Looks that way to me, too.

On Feb 27, 2009, at 5:34 PM, Eugene Loh wrote:

I'm looking at pml_ob1_start.c.  It loops over requests and starts  
them.  It makes some decision about whether an old request can be  
reused or if a new one must be allocated/initialized.  So, there is  
a variable named reuse_old_request.  It's initialized to "true",  
but if a new request must be alloced/inited, then it's set to false.


The thing is, this variable is initialized to true only once, at  
entry to the function and outside the loop over requests.  This  
strikes me as wrong.  It appears that if ever the variable is set  
to false, it will remain so until the end of the function.  I would  
think the intent is for the variable to be reset to true at the  
start of every iteration.


Yes/no?