Hi,
I'm having a really strange error that is causing me some serious headaches.
I want to integrate OpenMPI version 1.1.1 from the OFED package version
1.1 with SGE version 6.0. With mvapich everything works, but with OpenMPI it does not ;(.
Here is my jobfile and error message:
#!/bin/csh -f
#$ -N MPI_Job
#$ -pe mpi
Hi Jeff,
many thanks for your reply.
> 1. You might want to update your version of Open MPI if possible; the
> v1.1.1 version is quite old. We have added many new bug fixes and
> features since v1.1.1 (including tight SGE integration). There is
> nothing special about the Open MPI that i
Markus Daene wrote:
> Hi.
>
> I think it is not necessary to specify the hosts via the hostfile when using SGE
> and OpenMPI; even $NSLOTS is not necessary. Just run
> mpirun executable; this works very well.
This produces the same error, but thanks for your suggestion. (For the
sake of intere
Markus Daene wrote:
>>> to your memory problem:
>>> I had similar problems when I specified the h_vmem option to use in SGE.
>>> Without SGE everything works, but starting with SGE gives such memory
>>> errors. You can easily check this with 'qconf -sc'. If you have used this
>>> option, try witho
Hi Pak,
> Jeff Squyres wrote:
2. I know little/nothing about SGE, but I'm assuming that you need to
have SGE pass the proper memory lock limits to new processes. In an
interactive login, you showed that the max limit is "8162952" -- you
might just want to make it unlimited, un
Jeff Squyres wrote:
>> Hmm, I've heard about conflicts with OMPI 1.2.x and OFED 1.1 (sorry no
>> reference here),
>
> I'm unaware of any problems with OMPI 1.2.x and OFED 1.1. I run OFED
> 1.1 on my cluster at Cisco and have many different versions of OMPI
> installed (1.2, trunk, etc.).
Sorry for the late reply, but I haven't had access to the machine at the weekend.
> I don't really know what this means. People have explained "loose"
> vs. "tight" integration to me before, but since I'm not an SGE user,
> the definitions always fall away.
I *assume* loosely coupled jobs are just
Pak Lui wrote:
> sad...@gmx.net wrote:
>> Sorry for the late reply, but I haven't had access to the machine at the weekend.
>>
>>> I don't really know what this means. People have explained "loose"
>>> vs. "tight" integration to me before, but since I'm not an SGE user,
>>> the definitions always
> Are you referring to this SEGV error here? I am assuming this is OMPI
> 1.1.1 so you are using rsh PLS to launch your executables (using loose
> integration).
Oops, I wanted to compile ompi 1.2.3 against OFED 1.1 and these are the
errors. This problem has nothing to do with the SGE anymore (J