[OMPI devel] Re: Re: Re: doubt on latency result with OpenMPI library

2014-03-27 Thread Wang,Yanfei(SYS)
Hi, 1. In the RoCE case, we cannot use OOB (via TCP socket) for RDMA connection setup. However, as far as I know, Mellanox HCAs supporting RoCE can make RDMA and TCP/IP work simultaneously. Can some other HCAs only run RoCE and normal Ethernet individually, so that OMPI cannot use OOB (like TCP soc
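
A quick way to tell which case applies is the port's link layer: RoCE ports report "Ethernet", and in Open MPI such ports need the RDMA-CM connection method, since the default OOB/TCP method only handles InfiniBand addressing. A minimal check, with mlx4_0 as an illustrative device name:

    ibv_devinfo -d mlx4_0 | grep link_layer
            link_layer:     Ethernet

Ports reporting "InfiniBand" here can use either connection method.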

Re: [OMPI devel] cleanup of rr_byobj

2014-03-27 Thread tmishima
I added two improvements. Please replace the previous patch file with this attached one, and take a look at it this weekend. 1. Add a pre-check for ORTE_ERR_NOT_FOUND so that the subsequent retry with byslot works correctly. Otherwise, the retry could fail, because some fields such as node->procs, node->slots_

Re: [OMPI devel] Missing MPI 3 definitions

2014-03-27 Thread Jeff Squyres (jsquyres)
Actually, I stand corrected -- we don't have MPI_Comm_set|get_info() at all. Oops. I'll go add them. On Mar 27, 2014, at 3:25 PM, "Jeff Squyres (jsquyres)" wrote: > Yoinks. Those look like oversights. > > I see set_info and get_info on the trunk, for example -- I think we just > forgot t

Re: [OMPI devel] Missing MPI 3 definitions

2014-03-27 Thread Jeff Squyres (jsquyres)
Yoinks. Those look like oversights. I see set_info and get_info on the trunk, for example -- I think we just forgot to bring them to v1.7/v1.8. On Mar 27, 2014, at 3:18 PM, Lisandro Dalcin wrote: > In 1.7.5, you guys bumped MPI_VERSION to 3 but forgot to add > definitions for the following c

[OMPI devel] Missing MPI 3 definitions

2014-03-27 Thread Lisandro Dalcin
In 1.7.5, you guys bumped MPI_VERSION to 3 but forgot to add definitions for the following constants: MPI_ERR_RMA_SHARED and MPI_WEIGHTS_EMPTY. Also, the following two functions are missing: MPI_Comm_set_info() and MPI_Comm_get_info(). PS: The two missing functions are trivial to provide; the first could
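
For reference, the two functions have simple MPI-3 signatures: int MPI_Comm_set_info(MPI_Comm comm, MPI_Info info) and int MPI_Comm_get_info(MPI_Comm comm, MPI_Info *info_used). A minimal usage sketch; the hint key "my_hint" is purely illustrative, and implementations may simply ignore unknown keys:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        /* Attach an info object to a communicator (MPI-3).
           The key "my_hint" is illustrative only. */
        MPI_Info info;
        MPI_Info_create(&info);
        MPI_Info_set(info, "my_hint", "true");
        MPI_Comm_set_info(MPI_COMM_WORLD, info);
        MPI_Info_free(&info);

        /* Read back whatever hints the communicator currently carries. */
        MPI_Info used;
        MPI_Comm_get_info(MPI_COMM_WORLD, &used);

        int flag;
        char value[MPI_MAX_INFO_VAL + 1];
        MPI_Info_get(used, "my_hint", MPI_MAX_INFO_VAL, value, &flag);
        printf("my_hint = %s\n", flag ? value : "(not set)");

        MPI_Info_free(&used);
        MPI_Finalize();
        return 0;
    }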

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-27 Thread Jeff Squyres (jsquyres)
On Mar 27, 2014, at 11:15 AM, "Wang,Yanfei(SYS)" wrote: > Normally we use rdma-cm to build the RDMA connection, then create queue pairs to do > RDMA data transmission, so what is the consideration for separating rdma-cm > connection setup and data transmission at the design stage? There's some history her

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-27 Thread Wang,Yanfei(SYS)
Normally we use rdma-cm to build the RDMA connection, then create queue pairs (QPs) to do RDMA data transmission, so what is the consideration for separating rdma-cm connection setup and data transmission at the design stage? Maybe this question is not well posed, but I hope for a response. Thanks Yanfei Sent from m

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-27 Thread Wang,Yanfei(SYS)
Hi, Thanks Ralph and Jeff. Do we have full documentation on these parameters, and further on the Open MPI transport design architecture? Please recommend a website or paper. Thanks Yanfei Sent from my iPad On March 27, 2014, at 10:10 PM, "Ralph Castain" <r...@open-mpi.org> wrote: Just one other po

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-27 Thread Ralph Castain
Just one other point to clarify - there is an apparent misunderstanding regarding the following MCA param: -mca btl_openib_cpc_include rdmacm This param has nothing to do with telling openib to use RDMA for communication. What it does is tell the openib BTL to use RDMA to establish the point-to-p
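
In other words, the connection method and the data path are selected independently. A sketch using the osu_latency benchmark from this thread (hosts file and process count as in the earlier posts):

    mpirun --hostfile hosts -np 2 --map-by node \
           --mca btl openib,sm,self \
           --mca btl_openib_cpc_include rdmacm \
           osu_latency

Here "--mca btl openib,sm,self" chooses the transports that carry MPI traffic, while "--mca btl_openib_cpc_include rdmacm" only controls how the openib BTL's queue pairs are established (RDMA-CM rather than the out-of-band TCP channel).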

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-27 Thread Wang,Yanfei(SYS)
It seems that all confusions have already been resolved, thanks Jeff! Thanks! Yanfei Sent from my iPad > On March 27, 2014, at 8:38 PM, "Jeff Squyres (jsquyres)" > wrote: > > Here are a few key facts that might help: > > 1. The hostfile has nothing to do with what network interfaces are used for > MPI

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-27 Thread Jeff Squyres (jsquyres)
Here are a few key facts that might help: 1. The hostfile has nothing to do with which network interfaces are used for MPI traffic. It is only used to specify which servers you launch on, regardless of which IP interface on that server you specify. 2. Which network interfaces are used is a combinati
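
As an illustration of both points, transport and interface selection are MCA parameters, not hostfile entries. A sketch, with the interface name eth0 purely illustrative:

    mpirun --hostfile hosts -np 2 --map-by node \
           --mca btl tcp,sm,self \
           --mca btl_tcp_if_include eth0 \
           osu_latency

The same parameter also accepts CIDR notation, e.g. "--mca btl_tcp_if_include 192.168.71.0/24" to pin TCP traffic to the 10G subnet from this thread.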

[OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-27 Thread Wang,Yanfei(SYS)
Hi, Update: if we explicitly assign --mca btl tcp,sm,self, the traffic goes over the 10G TCP/IP link instead of the 40G RDMA link, and the TCP/IP latency is 22us on average, which is reasonable. [root@bb-nsi-ib04 pt2pt]# mpirun --hostfile hosts -np 2 --map-by node --mca btl tcp,sm,self osu_latency # OSU
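
For comparison, forcing the RDMA path over the 40G link explicitly would look like this (a sketch, same hosts file):

    mpirun --hostfile hosts -np 2 --map-by node --mca btl openib,sm,self osu_latency

The two runs then differ only in the btl list, which is exactly what selects the wire.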

[OMPI devel] Re: doubt on latency result with OpenMPI library

2014-03-27 Thread Wang,Yanfei(SYS)
Hi, "--map-by node" does remove this trouble. --- Configuration: even when using the mpirun hostfile to try to steer traffic to the 10G TCP/IP network, the latency was still 5us in both situations! [root@bb-nsi-ib04 pt2pt]# cat /etc/hosts 192.168.71.3 ib03 192.168.71.4 ib04 [root@bb-nsi-ib04 pt2pt]# ifconfi

Re: [OMPI devel] doubt on latency result with OpenMPI library

2014-03-27 Thread Ralph Castain
Try adding "--map-by node" to your command line to ensure the procs really are running on separate nodes. On Thu, Mar 27, 2014 at 1:40 AM, Wang,Yanfei(SYS) wrote: > Hi, > > > > HW Test Topology: > > IP: 192.168.72.4/24 – 192.168.72.4/24, enable VLAN and RoCE > > IB03 server 40G port – 40G Ethe
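
With the default by-slot mapping, both ranks of a 2-process job can land on the first host, in which case the benchmark measures shared-memory latency rather than the network. A sketch of the corrected invocation (hosts file as in this thread):

    mpirun --hostfile hosts -np 2 --map-by node osu_latency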

[OMPI devel] doubt on latency result with OpenMPI library

2014-03-27 Thread Wang,Yanfei(SYS)
Hi, HW Test Topology: IP: 192.168.72.4/24 – 192.168.72.4/24, enable VLAN and RoCE. IB03 server 40G port – 40G Ethernet switch – IB04 server 40G port: configure it as the RoCE link. IP: 192.168.71.3/24 – 192.168.71.4/24. IB03 server 10G port – 10G Ethernet switch – IB04 server 10G port: configure

[OMPI devel] Re: Re: example/Hello_c.c : mpirun run failed on two physical nodes.

2014-03-27 Thread Wang,Yanfei(SYS)
Hi, I found that the server where OpenMPI is installed had iptables armed, so further communication between ib03 and ib04 was prohibited! mpirun now works fine across multiple servers with hello_c. Thanks Ralph! Thanks -Yanfei From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: March 26, 2014
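
For anyone hitting the same symptom (mpirun hangs or fails only across nodes), either stop the firewall on an isolated test cluster or allow the peer hosts through it. A sketch for the RHEL-era iptables service, using the addresses from this thread:

    # quick fix on an isolated test cluster
    service iptables stop

    # or, more narrowly, accept traffic from the peer nodes
    iptables -I INPUT -s 192.168.71.3 -j ACCEPT
    iptables -I INPUT -s 192.168.71.4 -j ACCEPT

Note that Open MPI's out-of-band and TCP BTL connections use dynamically assigned ports by default, so host-based rules like these are simpler than per-port rules.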