This looks like a PSM problem (PSM is the layer that runs below Open MPI on
QLogic NICs).  You might need to contact QLogic tech support to find out how to
solve it.
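
Since the job runs fine from the command line but fails under Torque, it may help to compare the two environments before contacting support. The script below is only a diagnostic sketch, not a confirmed fix. The helper names check_device and psm_diag are hypothetical, the device path /dev/ipath is taken from the error message, and the idea that resource limits such as locked memory differ between an interactive shell and a Torque job is an assumption that this thread has not verified.

```shell
#!/bin/bash
# Diagnostic sketch: run once from an interactive shell and once inside a
# Torque job, then compare the two outputs.

# Hypothetical helper: report whether a device node exists and is
# readable and writable by the current user. Always returns 0.
check_device() {
    local dev="$1"
    if [ ! -e "$dev" ]; then
        echo "missing: $dev"
    elif [ -r "$dev" ] && [ -w "$dev" ]; then
        echo "ok: $dev"
    else
        echo "no-access: $dev"
    fi
}

# Hypothetical helper: print the environment details worth comparing.
psm_diag() {
    echo "host: $(uname -n)"
    echo "job id: ${PBS_JOBID:-none (not inside a Torque job)}"
    # Assumption: a lower limit inside the job than in the login shell
    # would be one thing worth reporting to support.
    echo "max locked memory: $(ulimit -l)"
    check_device /dev/ipath
    # List any Torque (tm) components that this Open MPI build reports.
    if command -v ompi_info >/dev/null 2>&1; then
        ompi_info | grep -w tm || echo "no tm components listed by ompi_info"
    else
        echo "ompi_info not found in PATH"
    fi
}

psm_diag
```

Placing the same lines in the PBS script just before the mpirun call, and running them once by hand on the compute node, gives two outputs to compare side by side. Any difference in the locked-memory limit or in access to the device would be useful detail to include in a support request.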


On Mar 29, 2012, at 11:26 AM, Raju wrote:

> Hi Ralph,
> 
> I recompiled OMPI with the --with-tm option, but I still see the same issue.
> I changed the input file as shown below. Please let me know what I need to
> tune and verify.
> 
> #!/bin/bash
> #PBS -N matmul
> #PBS -l nodes=1:ppn=1
> node=1
> ppn=1
> nprocs=`expr ${node} \* ${ppn}`
> export PSM_SHAREDCONTEXTS_MAX=16
> 
> mpirun -np ${nprocs} /home/khan/a.out < /home/khan/iter
> 
> Regards,
> Raju...
> 
> On Thu, Mar 29, 2012 at 8:49 PM, Raju <bra...@gmail.com> wrote:
> Hi Ralph,
> 
> Thanks for the very quick response. I am recompiling with the --with-tm
> option now; once it is done I will get back to you.
> 
> Thanks
> Raju..
> 
> 
> On Thu, Mar 29, 2012 at 8:29 PM, Ralph Castain <r...@open-mpi.org> wrote:
> One thing stands out right away: why are you specifying a hostfile? Did you 
> remember to configure OMPI with --with-tm so we launch via Torque? If not, 
> then you could hit issues as you are actually attempting to launch via ssh, 
> which has implications on a Torque-based system.
> 
> 
> On Mar 29, 2012, at 8:51 AM, Raju wrote:
> 
>> Hi Team,
>> 
>> I am using QLogic InfiniBand and Open MPI 1.5.3. I can run jobs from the
>> command line without any issues, but when I submit them through the Torque
>> scheduler I hit the issue below. The error file is attached.
>> 
>> Overview of the problem:
>> 
>> node1.ibab.ac.in.5910Driver initialization failure on /dev/ipath (err=23)
>> --------------------------------------------------------------------------
>> PSM was unable to open an endpoint. Please make sure that the network link is
>> active on the node and the hardware is functioning.
>>  
>>   Error: Failure in initializing endpoint
>>  
>> I went through the thread at
>> http://www.open-mpi.org/community/lists/users/2011/12/17888.php and
>> followed the solution there, but with no luck.
>> 
>> I set the value in my submit script with export PSM_SHAREDCONTEXTS_MAX=16
>> and submitted the job.
>> 
>> The sample input file is:
>> 
>> #!/bin/bash
>> #PBS -N matmul
>> #PBS -l nodes=1:ppn=1
>> node=1
>> ppn=1
>> nprocs=`expr ${node} \* ${ppn}`
>> echo "--- PBS_NODEFILE CONTENT ---"
>> cat $PBS_NODEFILE
>> export PSM_SHAREDCONTEXTS_MAX=16
>>  
>> mpirun -np ${nprocs} --hostfile $PBS_NODEFILE /home/khan/a.out < /home/khan/iter
>>  
>> Please let me know whether I am doing this correctly, and suggest how best
>> to proceed.
>> 
>> Regards,
>> Bhagya Raju K
>> <errfile.txt>_______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

