Re: [OMPI devel] problem in running MPI job through XGrid

2007-10-26 Thread Jinhui Qin
Hi Brian, Some good news and bad news. According to the information provided at http://www.open-mpi.org/faq/?category=running, I have enabled X11Forwarding on all remote nodes, added the path to mpirun ("/usr/local/bin") on all nodes, run "xhost +" on my localhost, and set the D
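[A minimal sketch of the setup steps described above, assuming OpenSSH on each node and a bash-style login shell; file locations and hostnames are assumptions, not confirmed by the thread:

    # On each remote node: enable X11 forwarding in the SSH daemon
    # (file location varies; often /etc/sshd_config on OS X of this era)
    X11Forwarding yes

    # On each node: put mpirun's directory on the default PATH,
    # e.g. in ~/.profile
    export PATH=/usr/local/bin:$PATH

    # On the local display host: allow incoming X connections
    # ("xhost +" disables access control entirely -- fine for a quick
    # test, but insecure)
    xhost +
]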

Re: [OMPI devel] problem in running MPI job through XGrid

2007-10-26 Thread Brian Barrett
XGrid does not forward X11 credentials, so you would have to set up an X11 environment yourself. Using ssh or a local starter does forward X11 credentials, which is why it works in that case. Brian On Oct 25, 2007, at 10:23 PM, Jinhui Qin wrote: Hi Brian, I got another problem in run
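[Since the XGrid starter does not forward X11 credentials, one workaround sketch (not something the thread confirms) is to open the display host's X server to the agents and push a DISPLAY value into the launched processes; Open MPI's mpirun exports environment variables with -x. The hostname and process count below are placeholders:

    # On the display host: allow the agents to connect (insecure; testing only)
    xhost +

    # Export a DISPLAY pointing back at the display host into each
    # launched process's environment
    mpirun -x DISPLAY=sib:0.0 -n 8 ~/openMPI_stuff/Hello
]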

Re: [OMPI devel] problem in running MPI job through XGrid

2007-10-26 Thread Jinhui Qin
Hi Brian, I got another problem in running an MPI job through XGrid. During execution this MPI job calls Xlib functions (e.g. XOpenDisplay()) to open an X window. The XOpenDisplay() call fails (returns NULL); it cannot open a display no matter how many processors that
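[Before blaming XOpenDisplay(), a quick diagnostic sketch is to print what DISPLAY each launched process actually inherits; if it is unset or points at an unreachable X server, XOpenDisplay() will return NULL:

    # Print hostname and DISPLAY as seen by each launched process
    mpirun -n 8 sh -c 'echo "$(hostname): DISPLAY=${DISPLAY:-unset}"'
]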

Re: [OMPI devel] problem in running MPI job through XGrid

2007-10-10 Thread Jinhui Qin
Hi Brian, I found the problem. It looks like Xgrid needs to do more work on fault tolerance. It seems that the Xgrid controller distributes jobs to the available agents only in a certain fixed order; if one of the agents has a problem communicating with the controller, all jobs fail, even when the
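[Given the observation that the controller hands work to agents in a fixed order, one crude way to isolate the failing agent (an assumption-laden sketch, not an Xgrid feature) is to grow the job one process at a time and note which run is the first to fail:

    # The run that starts failing implicates the agent added at that step
    for n in 1 2 3 4 5 6 7 8; do
        echo "=== $n processes ==="
        mpirun -n $n hostname
    done
]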

Re: [OMPI devel] problem in running MPI job through XGrid

2007-10-09 Thread Brian Barrett
On Oct 4, 2007, at 3:06 PM, Jinhui Qin wrote: sib:sharcnet$ mpirun -n 3 ~/openMPI_stuff/Hello Process 0.1.1 is unable to reach 0.1.2 for MPI communication. If you specified the use of a BTL component, you may have forgotten a component (such as "self") in the list of usable components. This i
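[The quoted error typically appears when the btl MCA parameter is restricted without including the loopback component; a hedged example (component names as in Open MPI 1.2, program path taken from the quoted command):

    # Restricting BTLs without "self" triggers the quoted error:
    mpirun --mca btl tcp -n 3 ~/openMPI_stuff/Hello       # broken

    # Always include "self", which a process uses to reach itself:
    mpirun --mca btl self,tcp -n 3 ~/openMPI_stuff/Hello
]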

[OMPI devel] problem in running MPI job through XGrid

2007-10-04 Thread Jinhui Qin
Hi, I have set up an Xgrid including one laptop and 7 Mac mini nodes (all are dual-core machines). I have also installed Open MPI (version 1.2.1) on all nodes. The laptop node (hostname: sib) has three roles: agent, controller, and client; all the other nodes are agents only. When I started "mpirun -
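[For a setup like this, a sanity-check sketch (assuming the Open MPI 1.2 build included its Xgrid launcher support) is to confirm the launcher component is present and that plain process launch works before debugging MPI communication itself:

    # Does this Open MPI build include an xgrid launcher component?
    ompi_info | grep -i xgrid

    # Can every agent launch a trivial non-MPI process?
    mpirun -n 8 hostname
]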