Re: [OMPI users] Whether to use the IB BTL or not

2015-01-05 Thread Jeff Squyres (jsquyres)
In addition to what Howard said, there are actually two other metrics that 
are used as well: each BTL exports a priority and an exclusivity value.

IIRC (it's been a while since I've looked at this code), the PML gathers up all 
BTL modules that claim that they can communicate between a pair of peers, and 
uses all the modules that share the same, highest exclusivity rating.

TCP has a lower exclusivity rating than OS-bypass BTLs (such as usnic and 
openib).  Hence, even if TCP is available between a pair of peers, OS-bypass 
transports will be preferred over TCP.
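
To make the exclusivity rule concrete, here is a minimal Python sketch of the 
selection step as described above (the description is from memory, so treat 
this as an approximation). The BtlModule class, the select_btls function, and 
the numeric exclusivity values are all hypothetical; they are not Open MPI's 
actual data structures or defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BtlModule:
    """Hypothetical stand-in for a BTL module (not Open MPI's real struct)."""
    name: str
    exclusivity: int  # illustrative value only
    reachable: bool   # does the module claim it can reach this peer?


def select_btls(modules):
    """Keep every reachable module that shares the highest exclusivity."""
    candidates = [m for m in modules if m.reachable]
    if not candidates:
        return []
    top = max(m.exclusivity for m in candidates)
    return [m for m in candidates if m.exclusivity == top]


# Made-up exclusivity numbers: the OS-bypass transports rank above TCP.
peer_modules = [
    BtlModule("tcp", exclusivity=100, reachable=True),
    BtlModule("openib", exclusivity=1024, reachable=True),
    BtlModule("usnic", exclusivity=1024, reachable=False),
]
selected = [m.name for m in select_btls(peer_modules)]
print(selected)
```

With these made-up numbers, TCP is reachable but is dropped because openib 
has the higher exclusivity; TCP would only be chosen if no higher-exclusivity 
module claimed the peer.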



On Dec 23, 2014, at 4:04 PM, Howard Pritchard  wrote:

> Hi Gary,
> 
> The decision occurs within the MPI processes themselves (during the call to 
> MPI_Init), so after the orte daemons have started on the nodes.
> 
> The BTLs report their "latency" and "bandwidth" up the stack to the PML/BML 
> layer, which then decides, based on these metrics, which BTL to use to 
> send/recv messages between any given pair of MPI ranks.
> 
> Hope this helps,
> 
> Howard
> 
> 
> 
> 
> 2014-12-23 13:29 GMT-07:00 Gary Jackson :
> 
> I'm not having any trouble getting it to start, and it's definitely using the 
> openib btl. I was just wondering how it decided whether the openib btl was 
> appropriate before going down the btl list to tcp when all mpirun gets is a 
> hostname and no other information about connectivity on the remote end. For 
> instance, is this determined before or after orted runs on the remote end?
> 
> On 12/23/14, 2:18 PM, Howard Pritchard wrote:
> Hello Gary,
> 
> It depends on how Open MPI was built, and on the MCA parameters passed
> to the job via settings in an MCA params conf file, the mpirun command
> line, or environment variables.  If you have mxm (MLNX) or psm
> (qlogic/intel) installed on the system where your Open MPI was built,
> you may actually be using one of those via the MTL path.
> 
> Try
> 
> mpirun -np 2 -H hosta,hostb -mca btl self,vader,openib ./your_favorite_test
> 
> That should force Open MPI to try using openib between the pair of
> hosts.  Note that you don't need "vader" on the command line
> if you are running only one MPI rank per node.
> 
> Howard
> 
> 
> 
> 
> 2014-12-23 11:48 GMT-07:00 Gary Jackson :
> 
> 
> How does Open MPI decide whether to use the IB BTL between a given
> pair of hosts, assuming there is an IB interface available?
> 
> --
> Gary
> _
> users mailing list
> us...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/12/26063.php
> 
> 
> -- 
> Gary


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-05 Thread Diego Avesani
Dear Gilles,

Thanks, thanks a lot.
Now it is clearer.

Again, thanks a lot

Diego


Re: [OMPI users] OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-05 Thread Gilles Gouaillardet
Diego,

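The compiler likely added some padding after %ip so that the data is aligned 
on 128 bits.

You need two dummies in case the compiler adds some padding at the end of the 
type: the distance between dummy(1) and dummy(2) is the true extent of one 
element, trailing padding included.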
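
As a rough illustration of both points, the following Python sketch uses 
ctypes to lay out a C-style analogue of tParticle. It is only an analogy: the 
Particle class is hypothetical, it assumes double-precision reals, and C 
layout rules need not match what a Fortran compiler does (so the offsets will 
generally differ from the 16/48/112 seen in the thread). It shows that the 
displacement of the second field can exceed the size of the integer, and that 
the extent taken from two consecutive array elements can exceed the sum of 
the field sizes.

```python
import ctypes


class Particle(ctypes.Structure):
    """Hypothetical C-layout analogue of tParticle (double-precision reals)."""
    _fields_ = [
        ("ip", ctypes.c_int),
        ("RP", ctypes.c_double * 2),
        ("QQ", ctypes.c_double * 2),
    ]


# Two consecutive elements, playing the role of dummy(1) and dummy(2).
dummy = (Particle * 2)()

# Absolute addresses of each field of the first element ...
base = ctypes.addressof(dummy[0])
absolute = [base + getattr(Particle, name).offset for name in ("ip", "RP", "QQ")]

# ... turned into displacements relative to the start of the element.
displacements = [addr - base for addr in absolute]

# True extent of one element: distance between consecutive array elements.
extent = ctypes.addressof(dummy[1]) - ctypes.addressof(dummy[0])

# Bytes actually occupied by the fields, with no padding counted.
payload = ctypes.sizeof(ctypes.c_int) + 4 * ctypes.sizeof(ctypes.c_double)

print("displacements:", displacements)
print("extent:", extent, "payload:", payload)
```

Any gap between displacements[1] and the size of the integer is alignment 
padding, and any difference between extent and payload is padding that a 
resized datatype has to account for when sending arrays of such elements.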

Cheers,

Gilles

Diego Avesani wrote:
>Dear Gilles, Dear all,
>
>thanks, thanks a lot.
>
>Could you explain it to me, please?
>
>I mean, when I print displacements I get:
>
>displacements(0)= 6922656
>displacements(1)= 0
>displacements(2)= 16
>displacements(3)= 48
>displacements(4)= 112
>
>Why do I have 16 spaces in displacements(2), I have only an integer in 
>dummy%ip?
>
>Why do you use dummy(1) and dummy(2)?
>
>Thanks a lot
>
>Diego
>
>
>On 5 January 2015 at 02:44, Gilles Gouaillardet wrote:
>
>Diego,
>
>MPI_Get_address was invoked with parameters in the wrong order
>
>here is attached a fixed version
>
>Cheers,
>
>Gilles
>
>On 2015/01/05 2:32, Diego Avesani wrote:
>
>Dear Gilles, Dear all,
>
>It works. The only thing that is missing is:
>
>CALL MPI_Finalize(MPI%iErr)
>
>at the end of the program.
>
>Now, I have to test it by sending some data from one processor to another.
>I would like to ask if you could explain to me what you have done.
>I wrote in the program:
>
>   IF(MPI%myrank==1)THEN
>      WRITE(*,*) DISPLACEMENTS
>   ENDIF
>
>and the result is:
>
>   139835891001320  -139835852218120  -139835852213832
>  -139835852195016   8030673735967299609
>
>I am not able to understand it. Thanks a lot.
>
>In the attachment you can find the program
>
>Diego
>
>On 4 January 2015 at 12:10, Gilles Gouaillardet 
>< gilles.gouaillar...@gmail.com> wrote:
>
>Diego,
>
>here is an updated revision; I will double check tomorrow
>/* I did not test it yet, so forgive me if it does not compile/work */
>
>Cheers,
>
>Gilles
>
>On Sun, Jan 4, 2015 at 6:48 PM, Diego Avesani wrote:
>
>Dear Gilles, Dear all,
>
>in the attachment you can find the program.
>
>What do you mean by "remove mpi_get_address(dummy) from all displacements"?
>
>Thanks for all your help
>
>Diego
>
>On 3 January 2015 at 00:45, Gilles Gouaillardet 
>< gilles.gouaillar...@gmail.com> wrote:
>
>Diego,
>
>George gave you the solution.
>
>The snippet you posted has two mistakes:
>You did not remove mpi_get_address(dummy) from all displacements
>(see my previous reply)
>You pass incorrect values to mpi_type_create_resized
>
>Can you post a trimmed version of your program instead of a snippet?
>
>Gus is right about using double precision vs real and -r8
>
>Cheers,
>
>Gilles
>
>Diego Avesani wrote:
>
>Dear Gilles, Dear all,
>
>I have done all that to avoid padding with an integer, as suggested by
>George. I define tParticle as a common object. I am using the Intel Fortran
>compiler.
>
>George suggests:
>
>"" The displacements are relative to the beginning of your particle type.
>Thus the first one is not 0 but the displacement of “integer :: ip” due to
>the fact that the compiler is allowed to introduce gaps in order to better
>align.
>
>  DISPLACEMENTS(1)=MPI_GET_ADDRESS(dummy%ip)
>  DISPLACEMENTS(2)=MPI_GET_ADDRESS(dummy%RP[1])
>  DISPLACEMENTS(3)=MPI_GET_ADDRESS(dummy%QQ[1])
>
>and then remove the MPI_GET_ADDRESS(dummy) from all of them.
>
>3. After creating the structure type you need to resize it in order to
>correctly determine the span of the entire structure, and how an array of
>such structures is laid out in memory. Something like:
>
>MPI_TYPE_CREATE_RESIZED(old type, DISPLACEMENT(1),
>   MPI_GET_ADDRESS(dummy[2]) - MPI_GET_ADDRESS(dummy[1]), newt) ""
>
>What do you think? George, did I miss something?
>
>Thanks a lot
>
>Diego
>
>On 2 January 2015 at 12:51, Gilles Gouaillardet 
>< gilles.gouaillar...@gmail.com> wrote:
>
>Diego,
>
>First, I recommend you redefine tParticle and add a padding integer so
>everything is aligned.
>
>Before invoking MPI_Type_create_struct, you need to
>call MPI_Get_address(dummy, base, MPI%err)
>displacements = displacements - base
>
>MPI_Type_create_resized might be unnecessary if tParticle is aligned.
>And the lower bound should be zero.
>
>BTW, which compiler are you using? Is the tParticle object a common?
>IIRC, the Intel compiler aligns types automatically, but not commons, and
>that means MPI_Type_create_struct is not aligned as it should be most of
>the time.
>
>Cheers,
>
>Gilles
>
>Diego Avesani wrote:
>
>Dear all,
>
>I have a problem with MPI_Type_Create_Struct and MPI_TYPE_CREATE_RESIZED.
>
>I have this variable type:
>
>  TYPE tParticle
>     INTEGER :: ip
>     REAL    :: RP(2)
>     REAL    :: QQ(2)
>  ENDTYPE tParticle
>
>Then I define:
>
>Nstruct=3
>ALLOCATE(TYPES(Nstruct))
>ALLOCATE(LENGTHS(Nstruct))
>ALLOCATE(DISPLACEMENTS(Nstruct))
>!set the types
>TYPES(1) = MPI_INTEGER
>TYPES(2) = MPI_DOUBLE_PRECISION
>TYPES(3) = MPI_DOUBLE_PRECISION
>!set the lengths
>LENGTHS(1) = 1
>LENGTHS(2) = 2
>LENGTHS(3) = 2
>
>As kindly suggested by Nick Papior Andersen and George Bosilca some months
>ago, I checked the variable address to resize my struct variable to avoid

Re: [OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-05 Thread Diego Avesani
Dear Gilles, Dear all,
thanks, thanks a lot.

Could you explain it to me, please?

I mean, when I print displacements I get:

displacements(0)= 6922656
displacements(1)= 0
displacements(2)= 16
displacements(3)= 48
displacements(4)= 112

Why do I have 16 spaces in displacements(2), I have only an integer in
dummy%ip?
Why do you use dummy(1) and dummy(2)?

Thanks a lot


Diego


On 5 January 2015 at 02:44, Gilles Gouaillardet <
gilles.gouaillar...@iferc.org> wrote:

>  Diego,
>
> MPI_Get_address was invoked with parameters in the wrong order
>
> here is attached a fixed version
>
> Cheers,
>
> Gilles
>
> On 2015/01/05 2:32, Diego Avesani wrote:
>
> Dear Gilles, Dear all,
>
> It works. The only thing that is missing is:
>
> CALL MPI_Finalize(MPI%iErr)
>
> at the end of the program.
>
> Now, I have to test it by sending some data from one processor to another.
> I would like to ask if you could explain to me what you have done.
> I wrote in the program:
>
>    IF(MPI%myrank==1)THEN
>       WRITE(*,*) DISPLACEMENTS
>    ENDIF
>
> and the result is:
>
>    139835891001320  -139835852218120  -139835852213832
>   -139835852195016   8030673735967299609
>
> I am not able to understand it.
>
> Thanks a lot.
>
> In the attachment you can find the program
>
> Diego
>
>
> On 4 January 2015 at 12:10, Gilles Gouaillardet wrote:
>
>  Diego,
>
> here is an updated revision; I will double check tomorrow
> /* I did not test it yet, so forgive me if it does not compile/work */
>
> Cheers,
>
> Gilles
>
> On Sun, Jan 4, 2015 at 6:48 PM, Diego Avesani wrote:
>
>  Dear Gilles, Dear all,
>
> in the attachment you can find the program.
>
> What do you mean by "remove mpi_get_address(dummy) from all displacements"?
>
> Thanks for all your help
>
> Diego
>
>
> On 3 January 2015 at 00:45, Gilles Gouaillardet wrote:
>
>
>  Diego,
>
> George gave you the solution,
>
> The snippet you posted has two mistakes
> You did not remove mpi_get_address(dummy) from all displacements
> (See my previous reply)
> You pass incorrect values to mpi_type_create_resized
>
> Can you post a trimmed version of your program instead of a snippet ?
>
> Gus is right about using double precision vs real and -r8
>
> Cheers,
>
> Gilles
>
> Diego Avesani wrote:
>
> Dear Gilles, Dear all,
>
> I have done all that to avoid padding with an integer, as suggested by
> George.
> I define tParticle as a common object.
> I am using the Intel Fortran compiler.
>
> George suggests:
>
> "" The displacements are relative to the beginning of your particle type.
> Thus the first one is not 0 but the displacement of “integer :: ip” due to
> the fact that the compiler is allowed to introduce gaps in order to better
> align.
>
>   DISPLACEMENTS(1)=MPI_GET_ADDRESS(dummy%ip)
>   DISPLACEMENTS(2)=MPI_GET_ADDRESS(dummy%RP[1])
>   DISPLACEMENTS(3)=MPI_GET_ADDRESS(dummy%QQ[1])
>
> and then remove the MPI_GET_ADDRESS(dummy) from all of them.
>
> 3. After creating the structure type you need to resize it in order to
> correctly determine the span of the entire structure, and how an array of
> such structures is laid out in memory. Something like:
>
> MPI_TYPE_CREATE_RESIZED(old type, DISPLACEMENT(1),
>    MPI_GET_ADDRESS(dummy[2]) - MPI_GET_ADDRESS(dummy[1]), newt) ""
>
> What do you think?
> George, Did i miss something?
>
> Thanks a lot
>
>
>
> Diego
>
>
> On 2 January 2015 at 12:51, Gilles Gouaillardet wrote:
>
>
>  Diego,
>
> First, I recommend you redefine tParticle and add a padding integer so
> everything is aligned.
>
>
> Before invoking MPI_Type_create_struct, you need to
> call MPI_Get_address(dummy, base, MPI%err)
> displacements = displacements - base
>
> MPI_Type_create_resized might be unnecessary if tParticle is aligned
> And the lower bound should be zero.
>
> BTW, which compiler are you using?
> Is the tParticle object a common?
> IIRC, the Intel compiler aligns types automatically, but not commons, and
> that means MPI_Type_create_struct is not aligned as it should be most of
> the time.
>
> Cheers,
>
> Gilles
>
> Diego Avesani wrote:
>
> Dear all,
>
> I have a problem with MPI_Type_Create_Struct and
> MPI_TYPE_CREATE_RESIZED.
>
> I have this variable type:
>
>   TYPE tParticle
>      INTEGER :: ip
>      REAL    :: RP(2)
>      REAL    :: QQ(2)
>   ENDTYPE tParticle
>
> Then I define:
>
> Nstruct=3
> ALLOCATE(TYPES(Nstruct))
> ALLOCATE(LENGTHS(Nstruct))
> ALLOCATE(DISPLACEMENTS(Nstruct))
> !set the types
> TYPES(1) = MPI_INTEGER
> TYPES(2) = MPI_DOUBLE_PRECISION
> TYPES(3) = MPI_DOUBLE_PRECISION
> !set the lengths
> LENGTHS(1) = 1
> LENGTHS(2) = 2
> LENGTHS(3) = 2
>
> As kindly suggested by Nick Papior Andersen and George