On 12/14/2009 11:11 PM, Dmitry Zaletnev wrote:
> Hi,
> is it possible to have NFS and openmpi running on different NICs?
Yes. Just make sure that the two subnets for the NICs don't overlap and
that your routing tables are correct.
As for channel bonding, I'll let someone who has actually used it
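
The separate-subnet advice above can be put into practice with Open MPI's MCA interface-selection parameters, which pin MPI traffic to a named NIC and leave the other free for NFS. A minimal sketch, assuming eth0 carries NFS and eth1 carries MPI (substitute your actual device names):

```shell
# Keep MPI traffic off the NFS NIC by naming the MPI interface explicitly.
# eth0/eth1 are assumptions; substitute your actual device names.
mpirun --mca btl_tcp_if_include eth1 \
       --mca oob_tcp_if_include eth1 \
       -np 16 ./my_mpi_app
```

The inverse form, `--mca btl_tcp_if_exclude lo,eth0`, achieves the same separation by exclusion.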
Hi,
is it possible to have NFS and openmpi running on different NICs? By the way,
is it possible to have openmpi using multiple NICs without hardware support for
bonding?
Thank you in advance.
--
Dmitry
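
On the second question: Open MPI's TCP BTL opens one module per usable interface and stripes large messages across them, so no kernel-level channel bonding is required. A sketch, with the device names again being placeholders:

```shell
# No channel bonding required: list several interfaces and Open MPI's
# TCP BTL will stripe traffic across them. Device names are assumptions.
mpirun --mca btl_tcp_if_include eth1,eth2 -np 16 ./my_mpi_app
```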
Jeff Squyres wrote:
On Dec 9, 2009, at 3:47 AM, Constantinos Makassikis wrote:
sometimes when running Open MPI jobs, the application hangs. By looking at the
output, I get the following error message:
[ic17][[34562,1],74][../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv
On Dec 9, 2009, at 3:47 AM, Constantinos Makassikis wrote:
> sometimes when running Open MPI jobs, the application hangs. By looking at the
> output, I get the following error message:
>
> [ic17][[34562,1],74][../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv
>
> ] mca_btl_
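
An error from mca_btl_tcp_frag_recv like the one quoted usually means a peer's TCP connection dropped mid-transfer. One hedged first step is to raise the BTL verbosity so the failing peer and interface are visible; the verbosity level and interface name below are illustrative:

```shell
# Turn up TCP BTL logging to see which peer/interface the receive
# failure involves. Level 30 and eth1 are illustrative choices.
mpirun --mca btl_base_verbose 30 \
       --mca btl_tcp_if_include eth1 \
       -np 72 ./my_mpi_app
```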
Hi,
no, I never tried Open MPI's checkpointing. But there are two HOWTOs
from which you may get some ideas for integrating it with SGE:
http://gridengine.sunsource.net/howto/checkpointing.html
http://gridengine.sunsource.net/howto/APSTC-TB-2004-005.pdf (but Open
MPI's checkpointing seems more
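
The HOWTOs linked above hook a checkpointer into SGE through a "ckpt" object. A minimal sketch of creating one, where the object name and script paths are hypothetical placeholders rather than anything from the HOWTOs:

```shell
# Create a checkpoint environment in SGE; the object name and the
# script paths are hypothetical placeholders.
qconf -ackpt ompi_ckpt
# In the editor that opens, fields along these lines:
#   ckpt_name          ompi_ckpt
#   interface          APPLICATION-LEVEL
#   ckpt_command       /opt/scripts/do_ckpt.sh $job_id
#   migr_command       NONE
#   restart_command    /opt/scripts/do_restart.sh $job_id
#   clean_command      NONE
#   ckpt_dir           /var/spool/ckpt
```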
Jim and I iterated a bit off-list.
Jim -- I committed a change to our specfile that makes it work for me. Before
I release a 1.4-2 SRPM, could you give it a whirl?
http://www.open-mpi.org/~jsquyres/unofficial/
On Dec 9, 2009, at 6:41 PM, Jim Kusznir wrote:
> By the way, if I set build_a
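
Giving the updated specfile a whirl typically means rebuilding from the SRPM; a sketch, where the exact filename is an assumption based on the 1.4-2 version mentioned in the thread:

```shell
# Rebuild the unofficial SRPM; the filename is an assumption based on
# the 1.4-2 version mentioned above.
rpmbuild --rebuild openmpi-1.4-2.src.rpm
```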
I have already been using the processor and memory affinity options to
bind the processes to specific cores. Does the presence of the
irqbalance daemon matter? I saw some recommendation to disable this
for a performance boost. Or is this irrelevant?
I am running HPC jobs with no over- nor under-su
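
Even with ranks bound to cores, irqbalance can still steer interrupt handling onto those same cores, so the recommendation to disable it is about keeping IRQ work off the compute cores. Whether it helps is workload-dependent; a sketch of trying both, assuming RHEL-style service management (the mpirun parameter exists in the 1.4 series):

```shell
# Stop irqbalance for the duration of a benchmark run (RHEL-style
# init scripts assumed; re-enable afterwards if it helps nothing).
/etc/init.d/irqbalance stop

# Keep one process pinned per core via the paffinity MCA parameter.
mpirun --mca mpi_paffinity_alone 1 -np 8 ./my_mpi_app
```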
On Sun, 2009-12-13 at 19:04 +0100, Gijsbert Wiesenekker wrote:
> The following routine gives a problem after some (not reproducible)
> time on Fedora Core 12. The routine is a CPU usage friendly version of
> MPI_Barrier.
There are some proposals for non-blocking collectives before the MPI
forum cu
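
Until non-blocking collectives are standardized, a CPU-usage-friendly barrier of the kind Gijsbert describes can be sketched by hand: fan in to rank 0, fan out again, and poll each request with MPI_Test while sleeping between polls instead of blocking. All names and the tag below are illustrative, not Gijsbert's actual routine:

```c
/* Sketch of a "CPU-friendly" barrier: poll with MPI_Test and sleep
 * between polls instead of spinning inside a blocking wait.
 * Function names and the tag are illustrative placeholders. */
#include <mpi.h>
#include <time.h>

#define FRIENDLY_BARRIER_TAG 4711

static void wait_friendly(MPI_Request *req)
{
    int done = 0;
    struct timespec ts = { 0, 1000000 }; /* 1 ms between polls */
    while (!done) {
        MPI_Test(req, &done, MPI_STATUS_IGNORE);
        if (!done)
            nanosleep(&ts, NULL);
    }
}

void friendly_barrier(MPI_Comm comm)
{
    int rank, size, i;
    char token = 0;
    MPI_Request req;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    if (rank == 0) {
        /* Fan-in: collect a token from every other rank. */
        for (i = 1; i < size; i++) {
            MPI_Irecv(&token, 1, MPI_CHAR, i, FRIENDLY_BARRIER_TAG,
                      comm, &req);
            wait_friendly(&req);
        }
        /* Fan-out: release everyone. */
        for (i = 1; i < size; i++)
            MPI_Send(&token, 1, MPI_CHAR, i, FRIENDLY_BARRIER_TAG, comm);
    } else {
        MPI_Isend(&token, 1, MPI_CHAR, 0, FRIENDLY_BARRIER_TAG, comm, &req);
        wait_friendly(&req);
        MPI_Irecv(&token, 1, MPI_CHAR, 0, FRIENDLY_BARRIER_TAG, comm, &req);
        wait_friendly(&req);
    }
}
```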
Let's start with this: You generate non-blocking sends (MPI_Isend).
Those sends are not completed anywhere. So, strictly speaking, they
don't need to be executed. In practice, even if they are executed, they
should be "completed" from the user program's point of view (MPI_Test,
MPI_Wait, MP
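
The completion requirement described above, in its minimal form (buffer, peer, and function names are placeholders):

```c
/* Every MPI_Isend must eventually be completed with MPI_Wait or a
 * successful MPI_Test; otherwise the library is free never to
 * progress it. Identifiers here are placeholders. */
#include <mpi.h>

void send_and_complete(int *buf, int n, int dest, MPI_Comm comm)
{
    MPI_Request req;
    MPI_Isend(buf, n, MPI_INT, dest, /* tag */ 0, comm, &req);
    /* ... overlap useful computation here ... */
    MPI_Wait(&req, MPI_STATUS_IGNORE);  /* or poll with MPI_Test */
}
```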
Hi Reuti,
Yes, I submitted a job with SGE and checkpointed the mpirun process by
hand after logging into the MPI master node. Then I killed the job with
qdel, and after that I ran ompi-restart.
I will try to integrate it with SGE by creating a ckpt environment, but
I think that it could be a bit difficu
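
The by-hand sequence described here (checkpoint the mpirun, qdel the job, restart) maps onto Open MPI's ompi-checkpoint and ompi-restart tools. A sketch, where the PID and snapshot name are placeholders:

```shell
# On the node where mpirun runs (12345 is a placeholder PID):
ompi-checkpoint -v 12345

# After qdel has removed the job, restart from the global snapshot.
# The directory name follows Open MPI's default naming pattern but is
# shown here as a placeholder.
ompi-restart ompi_global_snapshot_12345.ckpt
```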
Hi,
On 14.12.2009 at 17:05, Sergio Díaz wrote:
I got a successful checkpoint with a fresh installation and without
using the trunk. I can't understand why it is working now and before I
could do a successful restart... Maybe there was something wrong
in the openmpi installation and then the
Hi Josh,
I got a successful checkpoint with a fresh installation and without
using the trunk. I can't understand why it is working now and before I
could do a successful restart... Maybe there was something wrong in the
openmpi installation and then the metadata was created in a wrong way.
I wi