date:20091214

Re: [OMPI users] NFS and openmpi through different NICs

2009-12-14 Thread Bill Rankin

On 12/14/2009 11:11 PM, Dmitry Zaletnev wrote:
> Hi,
> is it possible to have NFS and openmpi running on different NICs?
Yes. Just make sure that the two subnets for the NICs don't overlap and that your routing tables are correct. As for channel bonding, I'll let someone who has actually used it

[OMPI users] NFS and openmpi through different NICs

2009-12-14 Thread Dmitry Zaletnev

Hi, is it possible to have NFS and openmpi running on different NICs? By the way, is it possible to have openmpi using multiple NICs without hardware support for bonding? Thank you in advance. -- Dmitry

Re: [OMPI users] Hanging vs Stopping behaviour in communication failures

2009-12-14 Thread Constantinos Makassikis

Jeff Squyres wrote: On Dec 9, 2009, at 3:47 AM, Constantinos Makassikis wrote: sometimes when running Open MPI jobs, the application hangs. By looking the output I get the following error message: [ic17][[34562,1],74][../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv

Re: [OMPI users] Hanging vs Stopping behaviour in communication failures

2009-12-14 Thread Jeff Squyres

On Dec 9, 2009, at 3:47 AM, Constantinos Makassikis wrote:
> sometimes when running Open MPI jobs, the application hangs. By looking the
> output I get the following error message:
>
> [ic17][[34562,1],74][../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv
>
> ] mca_btl_

Re: [OMPI users] checkpoint opempi-1.3.3+sge62

2009-12-14 Thread Reuti

Hi, no, I never tried Open MPI's checkpointing. But there are two Howto's from which you may get some ideas to integrate it with SGE: http://gridengine.sunsource.net/howto/checkpointing.html http://gridengine.sunsource.net/howto/APSTC-TB-2004-005.pdf (but Open MPI's checkpointing seems more

Re: [OMPI users] OpenMPI 1.4 RPM Spec file problem

2009-12-14 Thread Jeff Squyres

Jim and I iterated a bit off-list. Jim -- I committed a change to our specfile that makes it work for me. Before I release a 1.4-2 SRPM, could you give it a whirl?
http://www.open-mpi.org/~jsquyres/unofficial/
On Dec 9, 2009, at 6:41 PM, Jim Kusznir wrote:
> By the way, if I set build_a

[OMPI users] Disabling irqbalance service for better performance of MPI jobs

2009-12-14 Thread Rahul Nabar

I have already been using the processor and memory affinity options to bind the processes to specific cores. Does the presence of the irqbalance daemon matter? I saw some recommendation to disable this for a performance boost. Or is this irrelevant? I am running HPC jobs with no over- nor under-su

Re: [OMPI users] OpenMPI problem on Fedora Core 12

2009-12-14 Thread Ashley Pittman

On Sun, 2009-12-13 at 19:04 +0100, Gijsbert Wiesenekker wrote:
> The following routine gives a problem after some (not reproducible)
> time on Fedora Core 12. The routine is a CPU usage friendly version of
> MPI_Barrier.
There are some proposals for Non-blocking collectives before the MPI forum cu

Re: [OMPI users] OpenMPI problem on Fedora Core 12

2009-12-14 Thread Eugene Loh

Let's start with this: You generate non-blocking sends (MPI_Isend). Those sends are not completed anywhere. So, strictly speaking, they don't need to be executed. In practice, even if they are executed, they should be "completed" from the user program's point of view (MPI_Test, MPI_Wait, MP

Re: [OMPI users] checkpoint opempi-1.3.3+sge62

2009-12-14 Thread Sergio Díaz

Hi Reuti, Yes, I sent a job with SGE and I checkpointed the mpirun process, by hand, entering into the mpi master node. Then I killed the job with qdel and after that I did the ompi-restart. I will try to integrate with SGE creating a ckpt environment but I think that it could be a bit difficu

Re: [OMPI users] checkpoint opempi-1.3.3+sge62

2009-12-14 Thread Reuti

Hi, On 14.12.2009 at 17:05, Sergio Díaz wrote: I got a successful checkpoint with a fresh installation and without using the trunk. I can't understand why it is working now and before I could do a successful restart... Maybe there was something wrong in the openmpi installation and then the

Re: [OMPI users] checkpoint opempi-1.3.3+sge62

2009-12-14 Thread Sergio Díaz

Hi Josh, I got a successful checkpoint with a fresh installation and without using the trunk. I can't understand why it is working now and before I could do a successful restart... Maybe there was something wrong in the openmpi installation and then the metadata was created in the wrong way. I wi

12 matches
