[OMPI devel] Re: Re: Re: Re: doubt on latency result with OpenMPI library

2014-03-28 Thread Wang,Yanfei(SYS)
Thanks Jeff! It's very helpful; I will read all responses in this thread again to understand your opinions in depth. Thanks, Yanfei -----Original Message----- From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff Squyres (jsquyres) Sent: March 28, 2014 19:18 To: Open MPI Developers Subject: Re: [OMPI devel] Re: Re: …

[OMPI devel] v1.8rc1 posted

2014-03-28 Thread Ralph Castain
Hi folks, we've put up a 1.8rc1 for testing - please beat it up!! http://www.open-mpi.org/software/ompi/v1.8/ Ralph
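A minimal sketch of fetching and building the rc for testing; the exact tarball filename and download path under that URL are assumptions here, as is the install prefix:

    # download and unpack the release candidate (filename assumed)
    wget http://www.open-mpi.org/software/ompi/v1.8/downloads/openmpi-1.8rc1.tar.bz2
    tar xjf openmpi-1.8rc1.tar.bz2 && cd openmpi-1.8rc1
    # standard build; prefix is arbitrary
    ./configure --prefix=$HOME/ompi-1.8rc1 && make -j4 && make install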

Re: [OMPI devel] Loading Open MPI from MPJ Express (Java) fails

2014-03-28 Thread Jeff Squyres (jsquyres)
On Mar 26, 2014, at 2:10 PM, Bibrak Qamar wrote: > 1) By heterogeneous do you mean Derived Datatypes? MPJ Express's buffering layer handles this. It flattens the data into a ByteBuffer. In this way the native device doesn't have to worry about Derived Datatypes (those things are handle…

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-28 Thread Jeff Squyres (jsquyres)
On Mar 28, 2014, at 1:17 PM, Joshua Ladd wrote: > So after all that: I think you shouldn't need to specify the connection manager MCA parameter at all; the openib BTL should choose the Right one for you. > [Josh] Nyet. See above. Ah. Shows how much I know. :-) Is there any way to make…
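For reference, the connection-manager parameter under discussion can be inspected with ompi_info; a sketch, assuming an OMPI 1.7/1.8-era install where MCA parameter levels are supported:

    # show the openib BTL's parameters, including btl_openib_cpc_include
    ompi_info --param btl openib --level 9 | grep cpc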

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-28 Thread Joshua Ladd
I also believe that for iWARP and RoCE, the RDMA CM will be chosen automatically, and UD CM will be automatically chosen for IB. [Josh] If you want to run OMPI over RoCE on Mellanox hardware, you must explicitly choose rdmacm with -mca btl openib,sm,self -mca btl_openib_cpc_include rdmacm - th…
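A complete command line along the lines Josh describes, for running over RoCE on Mellanox hardware; the MCA flags are from his message, while the host names and the benchmark binary are only placeholders:

    # force the RDMA CM connection manager for the openib BTL
    mpirun -np 2 --host node1,node2 \
        -mca btl openib,sm,self \
        -mca btl_openib_cpc_include rdmacm \
        ./osu_latency   # placeholder latency benchmark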

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-28 Thread Jeff Squyres (jsquyres)
On Mar 28, 2014, at 12:18 PM, "Shamis, Pavel" wrote: >> Technically you may set up a RoCE connection without RDMA CM. > The version of the RoCE support that I implemented (in an alternative MPI implementation) did it through the regular OOB channel. As I remember, the only difference is the…

Re: [OMPI devel] Re: Re: doubt on latency result with OpenMPI library

2014-03-28 Thread Shamis, Pavel
> On Mar 27, 2014, at 11:45 PM, "Wang,Yanfei(SYS)" wrote: >> 1. In RoCE, we cannot use OOB (via TCP socket) for the RDMA connection. > More specifically, RoCE QPs can only be made using the RDMA connection manager. Technically you may set up a RoCE connection without RDMA CM. The vers…

Re: [OMPI devel] system call failed during shared memory initialization with openmpi-1.8a1r31254

2014-03-28 Thread tmishima
Thanks Jeff. But I'm already offline today ... I cannot confirm it until Monday morning, sorry. Tetsuya > Ralph applied a bunch of CMRs to the v1.8 branch after the nightly tarball was made last night. > I just created a new nightly tarball that includes all of those CMRs: 1.8a1r31269. It s…

Re: [OMPI devel] system call failed during shared memory initialization with openmpi-1.8a1r31254

2014-03-28 Thread Jeff Squyres (jsquyres)
Ralph applied a bunch of CMRs to the v1.8 branch after the nightly tarball was made last night. I just created a new nightly tarball that includes all of those CMRs: 1.8a1r31269. It should have the fix for this error included in it. On Mar 28, 2014, at 6:50 AM, wrote: > Thanks Jeff. I…

Re: [OMPI devel] Re: Re: Re: doubt on latency result with OpenMPI library

2014-03-28 Thread Jeff Squyres (jsquyres)
On Mar 27, 2014, at 11:45 PM, "Wang,Yanfei(SYS)" wrote: > 1. In RoCE, we cannot use OOB (via TCP socket) for the RDMA connection. More specifically, RoCE QPs can only be made using the RDMA connection manager. > However, as I know, Mellanox HCAs supporting RoCE can make RDMA and TCP/IP work…

Re: [OMPI devel] system call failed during shared memory initialization with openmpi-1.8a1r31254

2014-03-28 Thread tmishima
Thanks Jeff. It seems to be really the latest one - ticket #4474. > On Mar 28, 2014, at 5:45 AM, wrote: >> A system call failed during shared memory initialization that should >> not have. It is likely that your…

Re: [OMPI devel] system call failed during shared memory initialization with openmpi-1.8a1r31254

2014-03-28 Thread Jeff Squyres (jsquyres)
On Mar 28, 2014, at 5:45 AM, wrote: > A system call failed during shared memory initialization that should > not have. It is likely that your MPI job will now either abort or > experience performance degradation. …

[OMPI devel] system call failed during shared memory initialization with openmpi-1.8a1r31254

2014-03-28 Thread tmishima
Hi all, I saw the error shown below with openmpi-1.8a1r31254. I've never seen it before with openmpi-1.7.5. The message implies it's related to vader, and I can stop it by excluding vader from the btl list with -mca btl ^vader. Could someone fix this problem? Tetsuya [mishima@manage openmpi]$ mpirun -n…
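The workaround Tetsuya mentions, spelled out as a full command line; the process count and application binary here are placeholders, while the ^vader exclusion is from his message:

    # exclude the vader BTL; another on-node transport (e.g. the sm BTL) is used instead
    mpirun -n 8 -mca btl ^vader ./a.out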