Dear Josh,
This will really help a lot. Thank you for the support.
Best Regards,
Nguyen Toan
On Wed, Oct 26, 2011 at 9:20 PM, Josh Hursey <jjhur...@open-mpi.org> wrote:
> Since this would be a new feature for 1.4, we cannot move it since the
> 1.4 branch is for bug fixes only. How
Dear Josh,
Thank you. I will test the 1.7 trunk as you suggested.
Also, I would like to ask whether this interface could be added to OpenMPI 1.4.2,
because my applications mainly rely on this version.
Regards,
Nguyen Toan
On Wed, Oct 26, 2011 at 3:25 AM, Josh Hursey <jjhur...@open-mpi.org>
program
with OpenMPI or how to do that.
Any ideas would be much appreciated.
Regards,
Nguyen Toan
$ echo $LD_LIBRARY_PATH
>
> /cluster/sw/blcr/0.8.2/x86_64/gcc//lib:/cluster/sw/openmpi/1.5.3_ft/x86_64/gcc/lib:/opt/intel/Compiler/11.1/056/lib/intel64
>
> The library path seems to be OK, or should it look different? Do you have
> another idea?
> cheers
> roman
>
> ___
Hi Roman,
Did you try to checkpoint and restart with the parameter "-machinefile"? It
may work.
Regards,
Nguyen Toan
On Wed, Apr 6, 2011 at 7:05 PM, Hellmüller Roman <hro...@student.ethz.ch> wrote:
> Hi
>
> I'm trying to get fault tolerant ompi running on our cluste
Thanks Josh.
Actually, I also tested with the Himeno
benchmark <http://accc.riken.jp/assets/files/himenob_loadmodule/himenoBMT_c_mpi.lzh> and
got the same problem, so I think this could be a bug.
Hope this information also helps.
Regards,
Nguyen Toan
On Fri, Mar 4, 2011 at 12:04 AM, Joshua
Dear Josh,
Did you find the problem? I still cannot make any progress.
Hope to hear some good news from you.
Regards,
Nguyen Toan
On Sun, Feb 13, 2011 at 3:04 PM, Nguyen Toan <nguyentoan1...@gmail.com> wrote:
> Hi Josh,
>
> I tried the MCA parameter you mentioned but
Hi Josh,
I tried the MCA parameter you mentioned, but it did not help; the unknown
overhead still exists.
I have attached the output of 'ompi_info' for both versions 1.5 and 1.5.1.
I hope you can find the problem.
Thank you.
Regards,
Nguyen Toan
On Wed, Feb 9, 2011 at 11:08 PM, Joshua Hursey <jj
.
Do you have any other idea?
Regards,
Nguyen Toan
On Wed, Feb 9, 2011 at 12:41 AM, Joshua Hursey <jjhur...@open-mpi.org> wrote:
> There are a few reasons why this might be occurring. Did you build with the
> '--enable-ft-thread' option?
>
> If so, it looks like
Hi all,
I am using the latest version of OpenMPI (1.5.1) and BLCR (0.8.2).
I found that when running an application that uses MPI_Isend, MPI_Irecv and
MPI_Wait
with C/R enabled, i.e. using "-am ft-enable-cr", the application runtime is much
longer than the normal execution with mpirun (no checkpoint
> the 1.5 series install.
>
> On Dec 8, 2010, at 8:02 AM, Nguyen Toan wrote:
>
> > Dear all,
> >
> > I am having a problem while running mpirun with OpenMPI 1.5. I
> compiled OpenMPI 1.5 with BLCR 0.8.2 and OFED 1.4.1 as follows:
> >
> > ./configure \
>
Dear all,
I am having a problem while running mpirun with OpenMPI 1.5. I
compiled OpenMPI 1.5 with BLCR 0.8.2 and OFED 1.4.1 as follows:
./configure \
--with-ft=cr \
--enable-mpi-threads \
--with-blcr=/home/nguyen/opt/blcr \
--with-blcr-libdir=/home/nguyen/opt/blcr/lib \
Dear Josh,
I hope to see this new API soon. Anyway, I will try these critical section
functions in BLCR. Thank you for the support.
Best Regards,
Nguyen Toan
On Sat, Jul 17, 2010 at 6:34 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:
>
> On Jun 14, 2010, at 5:26 AM, Nguyen Toan wro
lication and system
configuration specific but in general is there any relationship between
"Others" and the number of processes or data size?
Thank you.
Best Regards,
Nguyen Toan
On Sat, Jul 17, 2010 at 6:25 AM, Josh Hursey <jjhur...@open-mpi.org> wrote:
> The amount of checkpo
Could somebody help, please? I am sorry to spam the mailing list, but I really need
your help.
Thanks in advance.
Best Regards,
Nguyen Toan
On Thu, Jul 8, 2010 at 1:25 AM, Nguyen Toan <nguyentoan1...@gmail.com> wrote:
> Hello everyone,
> I have a question concerning the checkpoint overhead
to the overall checkpoint overhead in
Open MPI. Is it because of the increase in coordination time for the checkpoint?
And what is included in the overall checkpoint overhead besides BLCR's
checkpoint overhead and coordination time?
Thank you.
Best Regards,
Nguyen Toan
int time (executing ompi-checkpoint), is there a way to let
OpenMPI wait until my_atomic_func() finishes its operation?
+ How does ompi-checkpoint operate to checkpoint MPI threads?
Regards,
Nguyen Toan
Hi all,
I finally figured out the answer. I just added the parameter "-machinefile
host" to the "ompi-restart" command and it restarted correctly. So is
OpenMPI unable to restart a multi-threaded application on one node?
Nguyen Toan
On Tue, Jun 8, 2010 at 12:07 AM, Nguy
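
The restart described in the message above can be sketched roughly as follows. This is only an illustration, not the exact command that was run: "host" is the machinefile name quoted in the message, the snapshot name is a placeholder, and the helper function and the RUN_RESTART switch are hypothetical names introduced here. By default the script only prints the command line so it can be checked first.

```shell
#!/bin/sh
# Sketch only. "host" is the machinefile name quoted in the message;
# the snapshot name below is a placeholder, not a real checkpoint.
MACHINEFILE="host"
SNAPSHOT="ompi_global_snapshot_XXXX.ckpt"

# Hypothetical helper: build the restart command line as a string
# so that it can be inspected before anything is executed.
build_restart_cmd() {
    printf 'ompi-restart -machinefile %s %s' "$1" "$2"
}

CMD=$(build_restart_cmd "$MACHINEFILE" "$SNAPSHOT")
echo "$CMD"

# Execute only when explicitly requested (RUN_RESTART=1, an assumed
# switch for this sketch) and when ompi-restart is actually installed.
if [ "${RUN_RESTART:-0}" = "1" ] && command -v ompi-restart >/dev/null 2>&1; then
    $CMD
fi
```

Setting RUN_RESTART=1 on a machine where Open MPI was built with checkpoint/restart support would run the printed command; otherwise nothing is executed.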
helps? Thank you very much.
Nguyen Toan
On Mon, Jun 7, 2010 at 11:51 PM, Nguyen Toan <nguyentoan1...@gmail.com> wrote:
> Hello everyone,
>
> I'm using OpenMPI 1.4.2 with BLCR 0.8.2 to test checkpointing on 2 nodes
> but it failed to restart (Segmentation fault).
> Here are t
as created successfully. However, it failed to restart using
ompi-restart:
"mpirun noticed that process rank 0 with PID 21242 on node rc014.local
exited on signal 11 (Segmentation fault)"
Did I miss something in the installation of OpenMPI?
Regards,
Nguyen Toan
>$ ompi-restart ompi_global_snapshot_10982.ckpt
>--
>mpirun noticed that process rank 1 with PID 11346 on node rc013.local
>exited on signal 11 (Segmentation fault).
>--
21 matches