Re: [OMPI users] Segfault in ucp_dt_pack function from UCX library 1.8.0 and 1.11.2 for large sized communications using both OpenMPI 4.0.3 and 4.1.2

2022-06-02 Thread Josh Hursey via users
pi/4.1.2+ucx-1.11.2/lib/openmpi/mca_coll_tuned.so(ompi_coll_tuned_alltoallv_intra_dec_fixed+0x42)
 [0x2ab3d4836da2]
Wed Jun  1 23:07:07 2022:# 013: 
/scinet/niagara/software/2022a/opt/gcc-11.2.0/openmpi/4.1.2+ucx-1.11.2/lib/libmpi.so.40(PMPI_Alltoallv+0x29)
 [0x2ab3bbc7bdf9]
Wed Jun  1 23:07:07 2022:# 014: 
/scinet/niagara/software/2022a/opt/gcc-11.2.0-openmpi-4.1.2+ucx-1.11.2/petsc-64bits/3.17.1/lib/libparmetis.so(libparmetis__gkMPI_Alltoallv+0x106)
 [0x2ab3bb0e1c06]
Wed Jun  1 23:07:07 2022:# 015: 
/scinet/niagara/software/2022a/opt/gcc-11.2.0-openmpi-4.1.2+ucx-1.11.2/petsc-64bits/3.17.1/lib/libparmetis.so(ParMETIS_V3_Mesh2Dual+0xdd6)
 [0x2ab3bb0f10b6]
Wed Jun  1 23:07:07 2022:# 016: 
/scinet/niagara/software/2022a/opt/gcc-11.2.0-openmpi-4.1.2+ucx-1.11.2/petsc-64bits/3.17.1/lib/libparmetis.so(ParMETIS_V3_PartMeshKway+0x100)
 [0x2ab3bb0f1ac0]

ParMETIS is compiled as part of PETSc 3.17.1 with 64-bit indices. Here are the
PETSc configure options:

--prefix=/scinet/niagara/software/2022a/opt/gcc-11.2.0-openmpi-4.1.2+ucx-1.11.2/petsc-64bits/3.17.1
COPTFLAGS=\"-O2 -march=native\"
CXXOPTFLAGS=\"-O2 -march=native\"
FOPTFLAGS=\"-O2 -march=native\"
--download-fftw=1
--download-hdf5=1
--download-hypre=1
--download-metis=1
--download-mumps=1
--download-parmetis=1
--download-plapack=1
--download-prometheus=1
--download-ptscotch=1
--download-scotch=1
--download-sprng=1
--download-superlu_dist=1
--download-triangle=1
--with-avx512-kernels=1
--with-blaslapack-dir=/scinet/intel/oneapi/2021u4/mkl/2021.4.0
--with-cc=mpicc
--with-cxx=mpicxx
--with-cxx-dialect=C++11
--with-debugging=0
--with-fc=mpifort
--with-mkl_pardiso-dir=/scinet/intel/oneapi/2021u4/mkl/2021.4.0
--with-scalapack=1
--with-scalapack-lib=\"[/scinet/intel/oneapi/2021u4/mkl/2021.4.0/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/oneapi/2021u4/mkl/2021.4.0/lib/intel64/libmkl_blacs_openmpi_lp64.so]\"
--with-x=0
--with-64-bit-indices=1
--with-memalign=64

and OpenMPI configure options:

'--prefix=/scinet/niagara/software/2022a/opt/gcc-11.2.0/openmpi/4.1.2+ucx-1.11.2'
'--enable-mpi-cxx'
'--enable-mpi1-compatibility'
'--with-hwloc=internal'
'--with-knem=/opt/knem-1.1.3.90mlnx1'
'--with-libevent=internal'
'--with-platform=contrib/platform/mellanox/optimized'
'--with-pmix=internal'
'--with-slurm=/opt/slurm'
'--with-ucx=/scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2'

I am then wondering:

1) Is the UCX library considered "stable" for production use with very large
problems?

2) Is there a way to "bypass" UCX at runtime?

3) Any idea for debugging this?
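
(For question 2, I imagine one could force a different PML at runtime, e.g.
something like "mpirun --mca pml ob1 --mca btl tcp,vader,self ..." or exclude
UCX with "--mca pml ^ucx", but I have not verified that this is the right way
to do it.)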

Of course, I do not yet have a "minimal reproducer" that triggers the bug, since it
happens only on "large" problems, but I think I could export the data for a
512-process reproducer that makes only the ParMETIS call...

Thanks for helping,

Eric


-- 


Eric Chamberland, ing., M. Ing


Professionnel de recherche


GIREF/Université Laval


(418) 656-2131 poste 41 22 42



-- 
Josh Hursey
IBM Spectrum MPI Developer


Re: [OMPI users] PRRTE DVM: how to specify rankfile per prun invocation?

2021-01-11 Thread Josh Hursey via users
The processes are bound, which is correct w.r.t. the rankfile but
contradicts --bind-to none:

    Mapping policy: PPR:NO_USE_LOCAL,NOOVERSUBSCRIBE
    Ranking policy: SLOT Binding policy: NONE
    Cpu set: N/A  PPR: 2:node  Cpus-per-rank: N/A  Cpu Type: CORE

    Data for node: nid03828     Num slots: 64   Max slots: 0 Num procs: 1
            Process jobid: [23033,1] App: 0 Process rank: 0 Bound: 
package[0][core:0]

    Data for node: nid03829     Num slots: 64   Max slots: 0    Num procs: 2
            Process jobid: [23033,1] App: 0 Process rank: 1 Bound: 
package[0][core:0]
            Process jobid: [23033,1] App: 0 Process rank: 2 Bound: 
package[0][core:1]

If I don't pass a rank file, I can achieve no binding:

    mpirun --bind-to none -n 3 --map-by ppr:2:node:DISPLAY ./mpitest
    ...
    Mapping policy: BYNODE:NO_USE_LOCAL,NOOVERSUBSCRIBE
    Ranking policy: SLOT Binding policy: NONE
    ...
    Process jobid: [11399,1] App: 0 Process rank: 0 Bound: N/A


Is there any reason for not supporting this not-bound configuration when
a rankfile is specified?


[1] 
https://stackoverflow.com/questions/32333785/how-to-provide-a-default-slot-list-in-openmpi-rankfile


-- 
Josh Hursey
IBM Spectrum MPI Developer


Re: [OMPI users] [ORTE] Connecting back to parent - Forcing tcp port

2021-01-07 Thread Josh Hursey via users
I posted a fix for the static ports issue (currently on the v4.1.x branch):
  https://github.com/open-mpi/ompi/pull/8339

If you have time do you want to give it a try and confirm that it fixes your 
issue?

Thanks,
Josh


On Tue, Dec 22, 2020 at 2:44 AM Vincent <boubl...@yahoo.co.uk> wrote:

On 18/12/2020 23:04, Josh Hursey wrote:

Vincent,

Thanks for the details on the bug. Indeed this is a case that seems to have 
been a problem for a little while now when you use static ports with ORTE (-mca 
oob_tcp_static_ipv4_ports option). It must have crept in when we refactored the 
internal regular expression mechanism for the v4 branches (and now that I look 
maybe as far back as v3.1). I just hit this same issue in the past day or so 
working with a different user.

Though I do not have a suggestion for a workaround at this time (sorry) I did 
file a GitHub Issue and am looking at this issue. With the holiday I don't know 
when I will have a fix, but you can watch the ticket for updates.
 
  https://github.com/open-mpi/ompi/issues/8304

In the meantime, you could try the v3.0 series release (which predates this 
change) or the current Open MPI master branch (which approaches this a little 
differently). The same command line should work in both. Both can be downloaded 
from the links below:
 
  https://www.open-mpi.org/software/ompi/v3.0/
 
  https://www.open-mpi.org/nightly/master/
 
 Hello Josh
 
 Thank you for considering the problem. I will certainly keep watching the 
ticket. However, there is nothing really urgent (to me anyway).

Regarding your command line, it looks pretty good:
 
  orterun --launch-agent /home/boubliki/openmpi/bin/orted -mca btl tcp --mca 
btl_tcp_port_min_v4 6706 --mca btl_tcp_port_range_v4 10 --mca 
oob_tcp_static_ipv4_ports 6705 -host node2:1 -np 1 /path/to/some/program arg1 
.. argn

I would suggest, while you are debugging this, that you use a program like 
/bin/hostname instead of a real MPI program. If /bin/hostname launches properly 
then move on to an MPI program. That will assure you that the runtime wired up 
correctly (oob/tcp), and then we can focus on the MPI side of the communication 
(btl/tcp). You will want to change "-mca btl tcp" to at least "-mca btl 
tcp,self" (or better "-mca btl tcp,vader,self" if you want shared memory). 
'self' is the loopback interface in Open MPI.
 
 Yes. This is actually what I did. I just wanted to keep the report generic,
without too much embellishment.
 But it is a useful reminder for new users, helping them to
understand the real purpose of each layer in an MPI implementation.

Is there a reason that you are specifying the --launch-agent to the orted? Is 
it installed in a different path on the remote nodes? If Open MPI is installed 
in the same location on all nodes then you shouldn't need that.
 
 I recompiled the sources, activating --enable-orterun-prefix-by-default when 
running ./configure. Of course, it helps :)
 
 Again, thank you.
 
 Kind regards
 
 Vincent.

Thanks,
 
Josh
 
 


-- 
Josh Hursey
IBM Spectrum MPI Developer


Re: [OMPI users] [ORTE] Connecting back to parent - Forcing tcp port

2020-12-18 Thread Josh Hursey via users
t the node info */
    565  error = "cannot construct daemon map for static ports - no 
node map info";
    566  goto error;
    567  }
    568  /* extract the node info from the environment and
    569   * build a nidmap from it - this will update the
    570   * routing plan as well
    571   */
    572  if (ORTE_SUCCESS != (ret = orte_regx.build_daemon_nidmap())) {
    573  ORTE_ERROR_LOG(ret);
    574  error = "construct daemon map from static ports";
    575  goto error;
    576  }
    577  /* be sure to update the routing tree so the initial "phone 
home"
    578   * to mpirun goes through the tree if static ports were enabled
    579   */
    580  orte_routed.update_routing_plan(NULL);
    581  /* routing can be enabled */
    582  orte_routed_base.routing_enabled = true;
    583  }
 boubliki@node1: ~/openmpi/src/openmpi-4.0.5>
 
 The debugger led me to print the element called orte_regx, which shows that the
function pointer called build_daemon_nidmap contains a NULL value, while line 572
is precisely the place that tries to call that method.
 (gdb) 
 Thread 1 "orted" received signal SIGSEGV, Segmentation fault.
 0x in ?? ()
 (gdb) bt
 #0  0x in ?? ()
 #1  0x7f76ae3fa585 in orte_ess_base_orted_setup () at 
base/ess_base_std_orted.c:572
 #2  0x7f76ae2662b4 in rte_init () at ess_env_module.c:149
 #3  0x7f76ae432645 in orte_init (pargc=pargc@entry=0x7ffe1c87a81c, 
pargv=pargv@entry=0x7ffe1c87a810, flags=flags@entry=2) at 
runtime/orte_init.c:271
 #4  0x7f76ae3e0bf0 in orte_daemon (argc=, argv=) at orted/orted_main.c:362
 #5  0x7f76acc976a3 in __libc_start_main () from /lib64/libc.so.6
 #6  0x0040111e in _start ()
 (gdb) p orte_regx
 $1 = {init = 0x0, nidmap_create = 0x7f76ab46c230 , nidmap_parse 
= 0x7f76ae4180b0 , 
   extract_node_names = 0x7f76ae41bd20 , 
encode_nodemap = 0x7f76ae418730 , 
   decode_daemon_nodemap = 0x7f76ae41a190 
, build_daemon_nidmap = 0x0, 
   generate_ppn = 0x7f76ae41b0f0 , parse_ppn = 
0x7f76ae41b760 , finalize = 0x0}

 (gdb)
 I suppose the orte_regx element has been initialized somewhere through an
inline function in [maybe] opal/class/opal_object.h, but I'm lost in the code (and
probably in some concurrency/multi-threading aspects), and in the end I can't even
figure out whether I'm using the MCA option correctly or facing a bug
in the core application.
 
     478 static inline opal_object_t *opal_obj_new(opal_class_t * cls)
     479 {
     480 opal_object_t *object;
     481 assert(cls->cls_sizeof >= sizeof(opal_object_t));
     482 
     483 #if OPAL_WANT_MEMCHECKER
     484 object = (opal_object_t *) calloc(1, cls->cls_sizeof);
     485 #else
     486 object = (opal_object_t *) malloc(cls->cls_sizeof);
     487 #endif
     488 if (opal_class_init_epoch != cls->cls_initialized) {
     489 opal_class_initialize(cls);
     490 }
     491 if (NULL != object) {
     492 object->obj_class = cls;
     493 object->obj_reference_count = 1;
     494 opal_obj_run_constructors(object);
     495 }
     496 return object;
     497 }
 
 Can you maybe (firstly) correct my understanding of which MCA option I should
use to get orted on the remote node to connect back to the TCP port I
specify?
 Or (worse) browse the code for a potential bug related to this functionality?
 
 Thank you
 
 Vincent
 
 
 


-- 
Josh Hursey
IBM Spectrum MPI Developer


Re: [OMPI users] [EXTERNAL] hwloc support for Power9/IBM AC922 servers

2019-04-16 Thread Josh Hursey
Our Spectrum MPI and JSM teams at IBM regularly use hwloc with and without
Open MPI on a variety of ppc64le hardware including the AC922 systems. I've
personally tested the 1.11 and 2.0 series and it is working well for me on
those systems.

Let us know if you encounter any issues.

-- Josh


On Tue, Apr 16, 2019 at 12:38 PM Hammond, Simon David via users <
users@lists.open-mpi.org> wrote:

> Hi Prentice,
>
> We are using OpenMPI and HWLOC on POWER9 servers. The topology information
> looks good from our initial use.
>
> Let me know if you need anything specifically.
>
> S.
>
> —
> Si Hammond
> Scalable Computer Architectures
> Sandia National Laboratories, NM
>
> > On Apr 16, 2019, at 11:28 AM, Prentice Bisbal via users <
> users@lists.open-mpi.org> wrote:
> >
> > OpenMPI Users,
> >
> > Are any of you using hwloc on Power9 hardware, specifically the IBM
> AC922 servers? If so, have you encountered any issues? I checked the
> documentation for the latest version (2.03), and found this:
> >
> >> Since it uses standard Operating System information, hwloc's support is
> mostly independant from the processor type
> >> (x86, powerpc, ...) and just relies on the Operating System support.
> >
> > and this:
> >
> >> To check whether hwloc works on a particular machine, just try to build
> it and run lstopo or lstopo-no-graphics.
> >> If some things do not look right (e.g. bogus or missing cache
> information
> >
> > We haven't bought any AC922 nodes yet, so I can't try that just yet. We
> are looking to purchase a small cluster, and want to make sure there are no
> known issues between the hardware and software before we make a purchase.
> >
> > Any feedback will be greatly appreciated.
> >
> > Thanks,
> >
> > Prentice
> >
> > ___
> > users mailing list
> > users@lists.open-mpi.org
> > https://lists.open-mpi.org/mailman/listinfo/users
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] openmpi 1.10.2 with the IBM lsf system

2017-09-15 Thread Josh Hursey
That line of code is here:

https://github.com/open-mpi/ompi/blob/v1.10.2/orte/mca/plm/lsf/plm_lsf_module.c#L346
(Unfortunately we didn't catch the rc from lsb_launch to see why it failed
- I'll fix that).

So it looks like LSF failed to launch our daemon on one or more remote
machines. This could be an LSF issue on one of the machines in your
allocation. One thing to try is a blaunch from the command line to launch
one process per node in your allocation (which is similar to what we are
trying to do in this function). I would expect that to fail as well, but it
might show you which machine is problematic.
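
For example, from within the job's allocation, something along these lines
(the host names are placeholders, and blaunch options can vary a bit between
LSF versions):

  blaunch hostA /bin/hostname
  blaunch hostB /bin/hostname

If one of those hangs or errors out, that host is a good place to start looking.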


On Fri, Sep 15, 2017 at 7:15 AM, Jing Gong  wrote:

> Hi,
>
>
> We tried to run a job of openfoam with 4480 cpus using the IBM LSF system
> but got the following error messages:
>
>
> ...
>
> [bs209:16251] [[25529,0],0] ORTE_ERROR_LOG: The specified application
> failed to start in file /software/OpenFOAM/ThirdParty-
> v1606+/openmpi-1.10.2/orte/mca/plm/lsf/plm_lsf_module.c at line 346
> [bs209: 16251] lsb_launch failed: 0
> ...
>
>
> The openfoam is built by openmpi 1.10.2 within its Thirdparty package and
> it works fine with around 2000 CPUs on the same cluster.
>
>
> Is the issue related to the LSF system? Are there any openmpi flags
> available to
>
> diagnose the problem ?
>
>
> Thanks a lot.
>
>
> Regards, Jing
>
>
>
>
>
> _______
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI in docker container

2017-03-11 Thread Josh Hursey
From the stack trace it looks like it's failing in the PSM2 MTL, which you
shouldn't need (or want?) in this scenario.

Try adding this additional MCA parameter to your command line:
 -mca pml ob1

That will force Open MPI's selection such that it avoids that component.
That might get you further along.


On Sat, Mar 11, 2017 at 7:49 AM, Ender GÜLER  wrote:

> Hi there,
>
> I try to use openmpi in a docker container. My host and container OS is
> CentOS 7 (7.2.1511 to be exact). When I try to run a simple MPI hello world
> application, the app core dumps every time with BUS ERROR. The OpenMPI
> version is 2.0.2 and I compiled it in the container. When I copied the
> installation from the container to the host, it runs without any problem.
>
> Have you ever tried to run OpenMPI and encountered a problem like this
> one? If so, what could be wrong? What should I do to find the root cause and
> solve the problem? The very same application can be run with IntelMPI in
> the container without any problem.
>
> I pasted the output of my mpirun command and its output below.
>
> [root@cn15 ~]# mpirun --allow-run-as-root -mca btl sm -np 2 -machinefile
> mpd.hosts ./mpi_hello.x
> [cn15:25287] *** Process received signal ***
> [cn15:25287] Signal: Bus error (7)
> [cn15:25287] Signal code: Non-existant physical address (2)
> [cn15:25287] Failing at address: 0x7fe2d0fbf000
> [cn15:25287] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7fe2d53e9100]
> [cn15:25287] [ 1] /lib64/libpsm2.so.2(+0x4b034)[0x7fe2d5a9a034]
> [cn15:25287] [ 2] /lib64/libpsm2.so.2(+0xc45f)[0x7fe2d5a5b45f]
> [cn15:25287] [ 3] /lib64/libpsm2.so.2(+0xc706)[0x7fe2d5a5b706]
> [cn15:25287] [ 4] /lib64/libpsm2.so.2(+0x10d60)[0x7fe2d5a5fd60]
> [cn15:25287] [ 5] /lib64/libpsm2.so.2(psm2_ep_open+0x41e)[0x7fe2d5a5e8de]
> [cn15:25287] [ 6] /opt/openmpi/2.0.2/lib/libmpi.
> so.20(ompi_mtl_psm2_module_init+0x1df)[0x7fe2d69b5d5b]
> [cn15:25287] [ 7] /opt/openmpi/2.0.2/lib/libmpi.so.20(+0x1b3249)[
> 0x7fe2d69b7249]
> [cn15:25287] [ 8] /opt/openmpi/2.0.2/lib/libmpi.
> so.20(ompi_mtl_base_select+0xc2)[0x7fe2d69b2956]
> [cn15:25287] [ 9] /opt/openmpi/2.0.2/lib/libmpi.so.20(+0x216c9f)[
> 0x7fe2d6a1ac9f]
> [cn15:25287] [10] /opt/openmpi/2.0.2/lib/libmpi.so.20(mca_pml_base_select+
> 0x29b)[0x7fe2d69f7566]
> [cn15:25287] [11] /opt/openmpi/2.0.2/lib/libmpi.
> so.20(ompi_mpi_init+0x665)[0x7fe2d687e0f4]
> [cn15:25287] [12] /opt/openmpi/2.0.2/lib/libmpi.so.20(MPI_Init+0x99)[
> 0x7fe2d68b1cb4]
> [cn15:25287] [13] ./mpi_hello.x[0x400927]
> [cn15:25287] [14] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fe2d5039b15]
> [cn15:25287] [15] ./mpi_hello.x[0x400839]
> [cn15:25287] *** End of error message ***
> [cn15:25286] *** Process received signal ***
> [cn15:25286] Signal: Bus error (7)
> [cn15:25286] Signal code: Non-existant physical address (2)
> [cn15:25286] Failing at address: 0x7fd4abb18000
> [cn15:25286] [ 0] /lib64/libpthread.so.0(+0xf100)[0x7fd4b3f56100]
> [cn15:25286] [ 1] /lib64/libpsm2.so.2(+0x4b034)[0x7fd4b4607034]
> [cn15:25286] [ 2] /lib64/libpsm2.so.2(+0xc45f)[0x7fd4b45c845f]
> [cn15:25286] [ 3] /lib64/libpsm2.so.2(+0xc706)[0x7fd4b45c8706]
> [cn15:25286] [ 4] /lib64/libpsm2.so.2(+0x10d60)[0x7fd4b45ccd60]
> [cn15:25286] [ 5] /lib64/libpsm2.so.2(psm2_ep_open+0x41e)[0x7fd4b45cb8de]
> [cn15:25286] [ 6] /opt/openmpi/2.0.2/lib/libmpi.
> so.20(ompi_mtl_psm2_module_init+0x1df)[0x7fd4b5522d5b]
> [cn15:25286] [ 7] /opt/openmpi/2.0.2/lib/libmpi.so.20(+0x1b3249)[
> 0x7fd4b5524249]
> [cn15:25286] [ 8] /opt/openmpi/2.0.2/lib/libmpi.
> so.20(ompi_mtl_base_select+0xc2)[0x7fd4b551f956]
> [cn15:25286] [ 9] /opt/openmpi/2.0.2/lib/libmpi.so.20(+0x216c9f)[
> 0x7fd4b5587c9f]
> [cn15:25286] [10] /opt/openmpi/2.0.2/lib/libmpi.so.20(mca_pml_base_select+
> 0x29b)[0x7fd4b5564566]
> [cn15:25286] [11] /opt/openmpi/2.0.2/lib/libmpi.
> so.20(ompi_mpi_init+0x665)[0x7fd4b53eb0f4]
> [cn15:25286] [12] /opt/openmpi/2.0.2/lib/libmpi.so.20(MPI_Init+0x99)[
> 0x7fd4b541ecb4]
> [cn15:25286] [13] ./mpi_hello.x[0x400927]
> [cn15:25286] [14] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7fd4b3ba6b15]
> [cn15:25286] [15] ./mpi_hello.x[0x400839]
> [cn15:25286] *** End of error message ***
> --
> mpirun noticed that process rank 1 with PID 0 on node cn15 exited on
> signal 7 (Bus error).
> ------
>
> Thanks in advance,
>
> Ender
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] fatal error with openmpi-2.1.0rc1 on Linux with Sun C

2017-02-27 Thread Josh Hursey
Drat! Thanks for letting us know. That fix was missed when we swept through
to create the PMIx v1.2.1 - which triggered the OMPI v2.1.0rc1. Sorry about
that :(

Jeff filed an Issue to track this here:
  https://github.com/open-mpi/ompi/issues/3048

I've filed a PR against PMIx to bring it into the next PMIx v1.2.2 here:
  https://github.com/pmix/master/pull/322

We'll followup on the Issue with the resolution tomorrow morning during the
OMPI developer's teleconf.


On Mon, Feb 27, 2017 at 8:05 AM, Siegmar Gross <
siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi,
>
> I tried to install openmpi-2.1.0rc1 on my "SUSE Linux Enterprise
> Server 12.2 (x86_64)" with Sun C 5.14. Unfortunately, "make"
> breaks with the following error. I had reported the same problem
> for openmpi-master-201702150209-404fe32. Gilles was able to solve
> the problem (https://github.com/pmix/master/pull/309).
>
> ...
>   CC   src/dstore/pmix_esh.lo
> "/export2/src/openmpi-2.1.0/openmpi-2.1.0rc1/opal/include/opal/sys/amd64/atomic.h",
> line 159: warning: parameter in inline asm statement unused: %3
> "/export2/src/openmpi-2.1.0/openmpi-2.1.0rc1/opal/include/opal/sys/amd64/atomic.h",
> line 205: warning: parameter in inline asm statement unused: %2
> "/export2/src/openmpi-2.1.0/openmpi-2.1.0rc1/opal/include/opal/sys/amd64/atomic.h",
> line 226: warning: parameter in inline asm statement unused: %2
> "/export2/src/openmpi-2.1.0/openmpi-2.1.0rc1/opal/include/opal/sys/amd64/atomic.h",
> line 247: warning: parameter in inline asm statement unused: %2
> "/export2/src/openmpi-2.1.0/openmpi-2.1.0rc1/opal/include/opal/sys/amd64/atomic.h",
> line 268: warning: parameter in inline asm statement unused: %2
> cc: Fatal error in /opt/sun/developerstudio12.5/lib/compilers/bin/acomp :
> Signal number = 139
> Makefile:1329: recipe for target 'src/dstore/pmix_esh.lo' failed
> make[4]: *** [src/dstore/pmix_esh.lo] Error 1
> make[4]: Leaving directory '/export2/src/openmpi-2.1.0/op
> enmpi-2.1.0rc1-Linux.x86_64.64_cc/opal/mca/pmix/pmix112/pmix'
> Makefile:1596: recipe for target 'all-recursive' failed
> make[3]: *** [all-recursive] Error 1
> make[3]: Leaving directory '/export2/src/openmpi-2.1.0/op
> enmpi-2.1.0rc1-Linux.x86_64.64_cc/opal/mca/pmix/pmix112/pmix'
> Makefile:1941: recipe for target 'all-recursive' failed
> make[2]: *** [all-recursive] Error 1
> make[2]: Leaving directory '/export2/src/openmpi-2.1.0/op
> enmpi-2.1.0rc1-Linux.x86_64.64_cc/opal/mca/pmix/pmix112'
> Makefile:2307: recipe for target 'all-recursive' failed
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory '/export2/src/openmpi-2.1.0/op
> enmpi-2.1.0rc1-Linux.x86_64.64_cc/opal'
> Makefile:1806: recipe for target 'all-recursive' failed
> make: *** [all-recursive] Error 1
> loki openmpi-2.1.0rc1-Linux.x86_64.64_cc 129
>
>
> Gilles, I would be grateful, if you can fix the problem for
> openmpi-2.1.0rc1 as well. Thank you very much for your help
> in advance.
>
>
> Kind regards
>
> Siegmar
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] More confusion about --map-by!

2017-02-23 Thread Josh Hursey
././.][./././././././././././.]
> > [somehost:102035] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket
> 0[core 5[hwt 0]]: [././././B/B/./././././.][./././././././././././.]
> > [somehost:102035] MCW rank 3 bound to socket 0[core 6[hwt 0]], socket
> 0[core 7[hwt 0]]: [././././././B/B/./././.][./././././././././././.]
> > [somehost:102035] MCW rank 4 bound to socket 0[core 8[hwt 0]], socket
> 0[core 9[hwt 0]]: [././././././././B/B/./.][./././././././././././.]
> > [somehost:102035] MCW rank 5 bound to socket 0[core 10[hwt 0]], socket
> 0[core 11[hwt 0]]: [././././././././././B/B][./././././././././././.]
> >
> >
> > ... whereas if I map by socket instead of slot, I achieve aim (1) but
> > fail on aim (2):
> >
> > $ mpirun -np 6 -map-by socket:PE=2 --bind-to core --report-bindings ./
> prog
> > [somehost:105601] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
> 0[core 1[hwt 0]]: [B/B/./././././././././.][./././././././././././.]
> > [somehost:105601] MCW rank 1 bound to socket 1[core 12[hwt 0]], socket
> 1[core 13[hwt 0]]: [./././././././././././.][B/B/./././././././././.]
> > [somehost:105601] MCW rank 2 bound to socket 0[core 2[hwt 0]], socket
> 0[core 3[hwt 0]]: [././B/B/./././././././.][./././././././././././.]
> > [somehost:105601] MCW rank 3 bound to socket 1[core 14[hwt 0]], socket
> 1[core 15[hwt 0]]: [./././././././././././.][././B/B/./././././././.]
> > [somehost:105601] MCW rank 4 bound to socket 0[core 4[hwt 0]], socket
> 0[core 5[hwt 0]]: [././././B/B/./././././.][./././././././././././.]
> > [somehost:105601] MCW rank 5 bound to socket 1[core 16[hwt 0]], socket
> 1[core 17[hwt 0]]: [./././././././././././.][././././B/B/./././././.]
> >
> >
> > Any ideas, please?
> >
> > Thanks,
> >
> > Mark
> > ___
> > users mailing list
> > users@lists.open-mpi.org
> > https://rfd.newmexicoconsortium.org/mailman/listinfo/users
> >
>
>
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>



-- 
Josh Hursey
IBM Spectrum MPI Developer
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] Can OMPI 1.8.8 or later support LSF 9.1.3 or 10.1?

2016-07-11 Thread Josh Hursey
IBM will be helping to support the LSF functionality in Open MPI. We don't
have any detailed documentation just yet, other than the FAQ on the Open
MPI site. However, the LSF components in Open MPI should be functional in
the latest releases. I've tested recently with LSF 9.1.3 and 10.1.

I pushed some changes to the Open MPI 1.10.3 release (and 2.0.0
pre-release) for affinity support in MPMD configurations. That was tested
on a machine with LSF 9.1.3. This uses the "-R" affinity options to
bsub - i.e., the affinity specification mechanism built into LSF. It worked as
I expected it to for the few configurations I tried.
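
For reference, an affinity request of that style looks roughly like the
following (only a sketch - the exact resource strings depend on your LSF
version and cluster configuration):

  bsub -n 4 -R "affinity[core(2)]" -R "span[ptile=2]" mpirun ./a.out

i.e., 4 slots, 2 per host, with 2 cores reserved and bound per task.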

I have not tested with v1.8 series since it's an older series. I would
suggest trying the 1.10.3 release (and the soon to be released 2.0.0) on
your system.


On Fri, Jul 8, 2016 at 1:20 PM, Gang Chen  wrote:

> Hi,
>
> I am wondering if there's integration test conducted with v1.8.8 and IBM
> LSF 9.1.3 or 10.1, especially the cpu affinity parts. Is there somewhere I
> can find detail info?
>
> Thanks,
> Gordon
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/07/29621.php
>


Re: [OMPI users] Fw: LSF's LSB_PJL_TASK_GEOMETRY + OpenMPI 1.10.2

2016-04-19 Thread Josh Hursey
Just an update for the list. Really only impacts folks running Open MPI
under LSF.


The LSB_PJL_TASK_GEOMETRY variable changes what lsb_getalloc() returns regarding the
allocation. It adjusts it to the mapping/ordering specified in that
environment variable. However, since it is not set by LSF when the job
starts, the LSB_AFFINITY_HOSTFILE will show a broader mapping/ordering. The
difference between these two requests is the core of the problem here.

Consider an LSB hostfile with the following:
=== LSB_AFFINITY_HOSTFILE ===
p10a33 0,1,2,3,4,5,6,7
p10a33 8,9,10,11,12,13,14,15
p10a33 16,17,18,19,20,21,22,23
p10a30 0,1,2,3,4,5,6,7
p10a30 8,9,10,11,12,13,14,15
p10a30 16,17,18,19,20,21,22,23
p10a58 0,1,2,3,4,5,6,7
p10a58 8,9,10,11,12,13,14,15
p10a58 16,17,18,19,20,21,22,23
=

This tells Open MPI to launch 3 processes per node with a particular set of
bindings - so 9 processes total.

export LSB_PJL_TASK_GEOMETRY="{(5)(4,3)(2,1,0)}"

The LSB_PJL_TASK_GEOMETRY variable (above) tells us to only launch 6
processes. So lsb_getalloc() will return to us (in ras_lsf_module.c) a list of
resources that match launching 6 processes. However, when we go to
rmaps_seq.c we tell it to pay attention to the LSB_AFFINITY_HOSTFILE. So it
tries to map 9 processes even though we set the slots on the nodes to a
total of 6. So eventually we get an oversubscription issue.

Interesting difference between 1.10.2 and 1.10.3rc1 - using the
LSB_AFFINITY_HOSTFILE, seen above.
In 1.10.2 RAS thinks it has the following allocation (with and without the
LSB_PJL_TASK_GEOMETRY set):
==   ALLOCATED NODES   ==
p10a33: slots=1 max_slots=0 slots_inuse=0 state=UP
=
In 1.10.3.rc1 RAS thinks it has the following allocation (with the
LSB_PJL_TASK_GEOMETRY set)
==   ALLOCATED NODES   ==
p10a33: slots=1 max_slots=0 slots_inuse=0 state=UP
p10a30: slots=2 max_slots=0 slots_inuse=0 state=UP
p10a58: slots=3 max_slots=0 slots_inuse=0 state=UP
=
In 1.10.3.rc1 RAS thinks it has the following allocation (without the
LSB_PJL_TASK_GEOMETRY set)
==   ALLOCATED NODES   ==
p10a33: slots=3 max_slots=0 slots_inuse=0 state=UP
p10a30: slots=3 max_slots=0 slots_inuse=0 state=UP
p10a58: slots=3 max_slots=0 slots_inuse=0 state=UP
=

The 1.10.3rc1 behavior is what I would expect to happen. The 1.10.2
behavior seems to be a bug when running under LSF.

The original error comes from trying to map 3 processes on each of the nodes
(since the affinity file wants to launch 9 processes) while the nodes have
a more restricted set of slots (due to the LSB_PJL_TASK_GEOMETRY variable).


I know a number of things have changed from 1.10.2 to 1.10.3 regarding how
we allocate/map. Ralph, do you know offhand what might have caused this
difference? It's not a big deal if not, just curious.


I'm working with Farid on some options to work around the issue for 1.10.2.
Open MPI 1.10.3 seems to be ok for basic LSF functionality (without the
LSB_PJL_TASK_GEOMETRY variable).

-- Josh


On Tue, Apr 19, 2016 at 8:57 AM, Josh Hursey  wrote:

> Farid,
>
> I have access to the same cluster inside IBM. I can try to help you track
> this down and maybe work up a patch with the LSF folks. I'll contact you
> off-list with my IBM address and we can work on this a bit.
>
> I'll post back to the list with what we found.
>
> -- Josh
>
>
> On Tue, Apr 19, 2016 at 5:06 AM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
>
>> On Apr 18, 2016, at 7:08 PM, Farid Parpia  wrote:
>> >
>> > I will try to put you in touch with someone in LSF development
>> immediately.
>>
>> FWIW: It would be great if IBM could contribute the fixes to this.  None
>> of us have access to LSF resources, and IBM is a core contributor to Open
>> MPI.
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2016/04/28963.php
>>
>
>


Re: [OMPI users] Fw: LSF's LSB_PJL_TASK_GEOMETRY + OpenMPI 1.10.2

2016-04-19 Thread Josh Hursey
Farid,

I have access to the same cluster inside IBM. I can try to help you track
this down and maybe work up a patch with the LSF folks. I'll contact you
off-list with my IBM address and we can work on this a bit.

I'll post back to the list with what we found.

-- Josh


On Tue, Apr 19, 2016 at 5:06 AM, Jeff Squyres (jsquyres)  wrote:

> On Apr 18, 2016, at 7:08 PM, Farid Parpia  wrote:
> >
> > I will try to put you in touch with someone in LSF development
> immediately.
>
> FWIW: It would be great if IBM could contribute the fixes to this.  None
> of us have access to LSF resources, and IBM is a core contributor to Open
> MPI.
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2016/04/28963.php
>


Re: [OMPI users] Missing -enable-crdebug option in configure step

2014-07-01 Thread Josh Hursey
The C/R Debugging feature (the ability to do reversible debugging or
backward stepping with gdb and/or DDT) was added on 8/10/2010 in the commit
below:
  https://svn.open-mpi.org/trac/ompi/changeset/23587

This feature never made it into a release so it was only ever available on
the trunk. However, since that time the C/R functionality has fallen into
disrepair. It is most likely broken in the trunk today.

There is an effort to bring back the checkpoint/restart functionality in
the Open MPI trunk. Once that is stable we might revisit bringing back this
feature if there is time and interest.

-- Josh



On Mon, Jun 30, 2014 at 8:35 AM, Ralph Castain  wrote:

> I don't recall ever seeing such an option in Open MPI - what makes you
> believe it should exist?
>
> On Jun 29, 2014, at 9:25 PM, Đỗ Mai Anh Tú  wrote:
>
> Hi all,
>
> I am trying to run the checkpoint/restart enabled debugging code in Open
> MPI. This requires configuring this option at the setup step:
>
> ./configure --with-ft=cr --enable-crdebug
>
> But no matter which version of Open MPI, I cannot find any option called
> --enable-crdebug (I have tried all versions from 1.5 to the newest one,
> 1.8.1). Could anyone help me figure out this problem? Does this option no
> longer exist, or has it been replaced by another term?
>
> I appreciate all your help. Thanks to all.
>
> --
> Đỗ Mai Anh Tú - Student ID 51104066
> Department of Computer Engineering
> Faculty of Computer Science and Engineering
> HCMC University of Technology.
> Viet Nam National University
>  ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/06/24729.php
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/06/24730.php
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey


Re: [OMPI users] Checkpointing an MPI application with OMPI

2013-02-05 Thread Josh Hursey
This is a bit late in the thread, but I wanted to add one more note.

The functionality that made it to v1.6 is fairly basic in terms of C/R
support in Open MPI. It supported a global checkpoint write, and (for a
time) a simple staged option (I think that is now broken).

In the trunk (about 3 years ago is when it was all committed) we extended
the support to allow the user a bit more control over how the checkpoint
files are managed (in addition to other features like automatic recovery,
process migration, and debugging support). These storage techniques allowed
the user to request that a local tmp disk be used to stage a checkpoint.
This allows BLCR to write to the local file system and the application to
continue running while the checkpoint is being moved. Open MPI would stage
it back to the global file system (there were some quality of service
controls, and compression options). This effort was to help alleviate some
of the load on the network file system during the checkpoint burst - since
we are using a fully coordinated approach. It helped quite a bit in the
experiments that I ran as part of my PhD.

Unfortunately, since that initial commit we have not been able to create a
release that includes those additional features. Most of the blame goes to
me for not having the resources to sustain support for them after
completing my PhD until now (as we start to prepare 1.7). So this has led
to the unfortunate, but realistic, situation where it will not be included
in 1.7 and is not available as a configuration option in the trunk (most of
the code is present, but it is known to not function correctly).

My hope is to bring the C/R support back in the future, but I cannot commit
to any specific date at this time. As an open-source project, we are always
looking for developers to help out. So if you (or anyone else on the list)
are interested in helping bring this support back I am willing to help
advise where I can.

Best,
Josh

On Wed, Jan 30, 2013 at 8:18 AM, Maxime Boissonneault <
maxime.boissonnea...@calculquebec.ca> wrote:

>  On 2013-01-29 21:02, Ralph Castain wrote:
>
>
>  On Jan 28, 2013, at 10:53 AM, Maxime Boissonneault <
> maxime.boissonnea...@calculquebec.ca> wrote:
>
> While our filesystem and management nodes are on UPS, our compute nodes
> are not. With on average one generic (power/cooling mostly) failure every one
> or two months, running for weeks is just asking for trouble. Add to
> that typical DIMM/CPU/networking failures (I estimated about 1 node goes
> down per day because of some sort of hardware failure, for a cluster of 960
> nodes), and with these numbers a job running on 32 nodes for 7 days has a ~35%
> chance of failing before it is done.
>
>
> I've been running this in my head all day - it just doesn't fit
> experience, which really bothered me. So I spent a little time running the
> calculation, and I came up with a number much lower (more like around 5%).
> I'm not saying my rough number is correct, but it is at least a little
> closer to what we see in the field.
>
>  Given that there are a lot of assumptions required when doing these
> calculations, I would like to suggest you conduct a very simply and quick
> experiment before investing tons of time on FT solutions. All you have to
> do is:
>
>  Thanks for the calculation. However, this is a cluster that I manage, I
> do not use it per se, and running such statistical jobs on a large part of
> the cluster for a long period of time is impossible. We do have the numbers
> however. The cluster has 960 nodes. We experience roughly one power or
> cooling failure per month or two months. Assuming one such failure per two
> months, if you run for 1 month, you have a 50% chance your job will be
> killed before it ends. If you run for 2 weeks, 25%, etc. These are very
> rough estimates obviously, but it is way more than 5%.
>
> In addition to that, we have a node failure rate of ~0.1%/day, meaning that out
> of 960 nodes, on average, one node will have a hardware failure every day. Most
> of the time, this is a failure of one of the DIMMs. Considering each node
> has 12 DIMMs of 2 GB of memory, it means a per-DIMM failure rate of ~0.0001 per
> day. I don't know if that's bad or not, but this is roughly what we have.
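>
> (As a rough back-of-the-envelope check of that 35%: a 32-node, 7-day job is
> 32 x 7 = 224 node-days; at ~0.1% failures per node per day that is about 0.22
> expected node failures, i.e. roughly a 20% chance of losing at least one node,
> and adding a ~10-15% chance of hitting a site-wide power/cooling event during
> that week brings the total to roughly one chance in three.)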
>
>  If it turns out you see power failure problems, then a simple, low-cost,
> ride-thru power stabilizer might be a good solution. Flywheels and
> capacitor-based systems can provide support for momentary power quality
> issues at reasonably low costs for a cluster of your size.
>
> I doubt there is anything low cost for a 330 kW system, and in any case,
> hardware upgrade is not an option since this a mid-life cluster. Again, as
> I said, the filesystem (2 x 500 TB lustre partitions) and the management
> nodes are on UPS, but there is no way to put the compute nodes on UPS.
>
>
>  If your node hardware is the problem, or you decide you do want/need to
> pursue an FT solution, then you might look at the OMPI-based solutions from

Re: [OMPI users] Live process migration

2012-12-12 Thread Josh Hursey
ompi-migrate is not in the 1.6 release. It is only available in the Open
MPI trunk.

On Tue, Dec 11, 2012 at 8:04 PM, Ifeanyi  wrote:

> Hi Josh,
>
> I can checkpoint but cannot migrate.
>
> when I type ~openmpi-1.6# ompi-migrate ...  I got this problem
> bash: ompi-migrate: command not found
>
> Please assist.
>
> Regards - Ifeanyi
>
>
>
> On Wed, Dec 12, 2012 at 3:19 AM, Josh Hursey wrote:
>
>> Process migration was implemented in Open MPI and working in the trunk a
>> couple of years ago. It has not been well maintained for a few years though
>> (hopefully that will change one day). So you can try it, but your results
>> may vary.
>>
>> Some details are at the link below:
>>   http://osl.iu.edu/research/ft/ompi-cr/tools.php#ompi-migrate
>>
>>  On Mon, Dec 10, 2012 at 10:39 PM, Ifeanyi wrote:
>>
>>>  Hi all,
>>>
>>> Just wondering if live process migration of processes is supported in
>>> open mpi?
>>>
>>> or any idea of how to do live migration of processes pls.
>>>
>>> Regards,
>>> Ifeanyi
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>>
>>
>> --
>> Joshua Hursey
>> Assistant Professor of Computer Science
>> University of Wisconsin-La Crosse
>> http://cs.uwlax.edu/~jjhursey
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey


Re: [OMPI users] BLCR + Qlogic infiniband

2012-12-11 Thread Josh Hursey
With that configure string, Open MPI should fail in configure if it does
not find the BLCR libraries. Note that this does not check to make sure the
BLCR is loaded as a module in the kernel (you will need to check that
manually).

The ompi_info command will also show you whether C/R is enabled: look for the
'blcr' 'crs' component in the listing at the end. That is probably
the best way to see if the build includes this support.
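
For example, a quick sanity check might look like this (exact output will vary
with your build and BLCR installation):

  shell$ ompi_info | grep crs
          # should list a 'blcr' component rather than just 'none'
  shell$ lsmod | grep blcr
          # should show the blcr (and blcr_imports) kernel modules loaded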


On Tue, Dec 4, 2012 at 4:43 AM, William Hay  wrote:

>
>
>
> On 28 November 2012 11:14, William Hay  wrote:
>
>> I'm trying to build openmpi with support for BLCR plus qlogic infiniband
>> (plus grid engine).  Everything seems to compile OK and checkpoints are
>> taken but whenever I try to restore a checkpoint I get the following error:
>> - do_mmap(, 2aaab18c7000, 1000, ...) failed:
>> ffea
>> - mmap failed: /dev/ipath
>> - thaw_threads returned error, aborting. -22
>> - thaw_threads returned error, aborting. -22
>> Restart failed: Invalid argument
>>
>> This occurs whether I specify psm or openib as the btl.
>>
>> This looks like the sort of thing I would expect to be handled by the
>> blcr supporting code in openmpi.  So I guess I have a couple ofquestions.
>> 1)Are Infiniband and BLCR support in openmpi compatible?
>> 2)Are there any special tricks necessary to get them working together.
>>
>> A third question occurred to me that may be relevant.  How do I verify
> that my openmpi install has blcr support built in?  I would have thought
> this would mean that either mpiexec or binaries built with mpicc would have
> libcr linked in.  However running ldd doesn't report this in either case.
>  I'm setting LD_PRELOAD to point to it but I would have thought openmpi
> would need to register a callback with blcr and it would be easier to do
> this if the library were linked in rather than trying to detect whether it
> has been LD_PRELOADed.  I'm building with the following options:
> ./configure --prefix=/home/ccaawih/openmpi-blcr --with-openib
> --without-psm --with-blcr=/usr --with-blcr-libdir=/usr/lib64 --with-ft=cr
> --enable-ft-thread --enable-mpi-threads --with-sge
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey


Re: [OMPI users] Live process migration

2012-12-11 Thread Josh Hursey
Process migration was implemented in Open MPI and working in the trunk a
couple of years ago. It has not been well maintained for a few years though
(hopefully that will change one day). So you can try it, but your results
may vary.

Some details are at the link below:
  http://osl.iu.edu/research/ft/ompi-cr/tools.php#ompi-migrate

On Mon, Dec 10, 2012 at 10:39 PM, Ifeanyi  wrote:

> Hi all,
>
> Just wondering if live process migration of processes is supported in open
> mpi?
>
> or any idea of how to do live migration of processes pls.
>
> Regards,
> Ifeanyi
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey


Re: [OMPI users] BLCR + Qlogic infiniband

2012-11-30 Thread Josh Hursey
The openib BTL and BLCR support in Open MPI were working about a year ago
(when I last checked). The psm BTL is not supported at the moment though.

From the error, I suspect that we are not fully closing the openib btl
driver before the checkpoint, so when we try to restart it is looking for
a resource that is no longer present. I created a ticket for us to
investigate further if you want to follow it:
  https://svn.open-mpi.org/trac/ompi/ticket/3417

Unfortunately, I do not know who is currently supporting that code path (I
might pick it back up at some point, but cannot promise anything in the
near future). But I will keep an eye on the ticket and see what I can do.
If it is what I think it is, then it should not take too much work to get
it working again.

-- Josh

On Wed, Nov 28, 2012 at 5:14 AM, William Hay  wrote:

> I'm trying to build openmpi with support for BLCR plus qlogic infiniband
> (plus grid engine).  Everything seems to compile OK and checkpoints are
> taken but whenever I try to restore a checkpoint I get the following error:
> - do_mmap(, 2aaab18c7000, 1000, ...) failed:
> ffea
> - mmap failed: /dev/ipath
> - thaw_threads returned error, aborting. -22
> - thaw_threads returned error, aborting. -22
> Restart failed: Invalid argument
>
> This occurs whether I specify psm or openib as the btl.
>
> This looks like the sort of thing I would expect to be handled by the blcr
> supporting code in openmpi.  So I guess I have a couple ofquestions.
> 1)Are Infiniband and BLCR support in openmpi compatible?
> 2)Are there any special tricks necessary to get them working together.
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey


Re: [OMPI users] MCA crs: none (MCA v2.0, API v2.0, Component v1.6.3)

2012-11-30 Thread Josh Hursey
Can you send the config.log and some of the other information described on:
  http://www.open-mpi.org/community/help/

-- Josh

On Wed, Nov 14, 2012 at 6:01 PM, Ifeanyi  wrote:

> Hi all,
>
> I got this message when I issued this command:
>
> root@node1:/home/abolap# ompi_info | grep crs
>  MCA crs: none (MCA v2.0, API v2.0, Component v1.6.3)
>
> The installation looks okay and I have reinstalled but still got the same
> issue.
>
> When I searched for the solution I found out that this is a bug which Josh
> has filed (https://svn.open-mpi.org/trac/ompi/ticket/2097) but I cannot
> see the solution or workaround.
>
> This is the initial post -
> http://www.digipedia.pl/usenet/thread/11269/6087/#post6031
>
> Please assist.
>
> Regards,
> Ifeanyi
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey


Re: [OMPI users] (no subject)

2012-11-30 Thread Josh Hursey
Pramoda,

That paper was exploring an application of a proposed extension to the MPI
standard for fault tolerance purposes. By default this proposed interface
is not provided by Open MPI. We have created a prototype version of Open
MPI that includes this extension, and it can be found at the following
website:
  http://fault-tolerance.org/

You should look at the interfaces in the new proposal (ULFM Specification)
since MPI_Comm_validate_rank is no longer part of the proposal. You can get
the same functionality through some of the new interfaces that replace it.
There are some examples on that website, and in the proposal that should
help you as well.
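
As a very rough sketch of how you might identify failed ranks with the MPIX_
prototype interfaces from the ULFM proposal (treat this as an illustration
only - names and error codes can differ between prototype versions):

  #include <mpi.h>
  #include <mpi-ext.h>   /* MPIX_ prototype interfaces from the ULFM proposal */
  #include <stdio.h>

  /* Report which ranks of 'comm' are currently known to have failed.
   * This plays the role that MPI_Comm_validate_rank played in the old paper. */
  static void report_failed_ranks(MPI_Comm comm)
  {
      MPI_Group failed_grp, comm_grp;
      int nfailed, i;

      MPIX_Comm_failure_ack(comm);                    /* acknowledge known failures */
      MPIX_Comm_failure_get_acked(comm, &failed_grp); /* group of failed processes  */

      MPI_Group_size(failed_grp, &nfailed);
      MPI_Comm_group(comm, &comm_grp);

      for (i = 0; i < nfailed; ++i) {
          int grp_rank = i, comm_rank;
          MPI_Group_translate_ranks(failed_grp, 1, &grp_rank, comm_grp, &comm_rank);
          printf("Rank %d of the communicator has failed\n", comm_rank);
      }

      MPI_Group_free(&comm_grp);
      MPI_Group_free(&failed_grp);
  }

You would typically set MPI_ERRORS_RETURN (or your own error handler) on the
communicator and call something like this after a communication call returns
MPIX_ERR_PROC_FAILED; MPIX_Comm_shrink() can then be used to build a working
communicator that excludes the failed ranks.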

Best,
Josh

On Mon, Nov 19, 2012 at 8:59 AM, sri pramoda  wrote:

>  Dear Sir,
> I am Pramoda, a PG scholar from Jadavpur University, India.
>  I've gone through the paper "Building a Fault Tolerant MPI Application:
>  A Ring Communication Example". In it I found the MPI_Comm_validate_rank
> command.
>  But I could not find this command in MPI. Hence I request you to please
> send me the implementation of this command.
> Thank you,
>   Pramoda.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey


Re: [OMPI users] Best way to map MPI processes to sockets?

2012-11-07 Thread Josh Hursey
In your desired ordering you have rank 0 on (socket,core) (0,0) and
rank 1 on (0,2). Is there an architectural reason for that? Meaning
are cores 0 and 1 hardware threads in the same core, or is there a
cache level (say L2 or L3) connecting cores 0 and 1 separate from
cores 2 and 3?

hwloc's lstopo should give you that information if you don't have that
information handy.

I am asking so that I might provide you with a potentially more
general solution than a rankfile.
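
In case a rankfile does turn out to be the right answer, a sketch of what it
might look like for the mapping in your table (the hostname is a placeholder,
and the rankfile syntax is "rank N=host slot=socket:core-range"; I have not
tested this exact file) would be:

    rank 0=nodeA slot=0:0-1
    rank 1=nodeA slot=0:2-3
    rank 2=nodeA slot=1:0-1
    rank 3=nodeA slot=1:2-3

launched with something like:

    mpirun -np 4 -rf myrankfile ./prog

so that each rank is bound to two cores for its two OpenMP threads.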

-- Josh


On Wed, Nov 7, 2012 at 12:25 PM, Blosch, Edwin L
 wrote:
> I am trying to map MPI processes to sockets in a somewhat compacted pattern
> and I am wondering the best way to do it.
>
>
>
> Say there are 2 sockets (0 and 1) and each processor has 4 cores (0,1,2,3)
> and I have 4 MPI processes, each of which will use 2 OpenMP processes.
>
>
>
> I’ve re-ordered my parallel work such that pairs of ranks (0,1 and 2,3)
> communicate more with each other than with other ranks.  Thus I think the
> best mapping would be:
>
>
>
> RANK   SOCKET   CORE
> 0      0        0
> 1      0        2
> 2      1        0
> 3      1        2
>
>
>
> My understanding is that --bysocket --bind-to-socket will give me ranks 0
> and 2 on socket 0 and ranks 1 and 3 on socket 1, not what I want.
>
>
>
> It looks like --cpus-per-proc might be what I want, i.e. seems like I might
> give the value 2.  But it was unclear to me whether I would also need to
> give --bysocket and the FAQ suggests this combination is untested.
>
>
>
> Maybe a rankfile is what I need?
>
>
>
> I would appreciate some advice on the easiest way to get this mapping.
>
>
>
> Thanks
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey



Re: [OMPI users] checkpoint problem

2012-07-28 Thread Josh Hursey
Currently you have to do as Reuti mentioned (use the queuing system,
or create a script). We do have a feature request ticket open for this
feature if you are interested in following the progress:
  https://svn.open-mpi.org/trac/ompi/ticket/1961

It has been open for a while, but the feature should not be difficult
to implement if someone was interested in taking a pass at it.
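
In the meantime, a small wrapper script along these lines should do the trick
(just a sketch - it assumes the job was started with "mpirun -am ft-enable-cr"
and that the PID of that mpirun is passed as the first argument):

    #!/bin/sh
    # Take a checkpoint of the running MPI job every 100 seconds
    # until the mpirun process goes away.
    MPIRUN_PID=$1
    while kill -0 "$MPIRUN_PID" 2>/dev/null ; do
        sleep 100
        ompi-checkpoint "$MPIRUN_PID"
    done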

-- Josh

On Mon, Jul 23, 2012 at 5:15 AM, Reuti  wrote:
> On 23.07.2012 at 10:02, CHEN Song wrote:
>
>> How can I create ckpt files regularly? I mean, take a checkpoint every 100
>> seconds. Are there any options to do this, or do I have to write a script myself?
>
> Yes, or use a queuing system which supports creation of a checkpoint in fixed 
> time intervals.
>
> -- Reuti
>
>
>> THANKS,
>>
>>
>>
>> ---
>> CHEN Song
>> R&D Department
>> National Supercomputer Center in Tianjin
>> Binhai New Area, Tianjin, China
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://www.joshuahursey.com



[OMPI users] Re: [OMPI users] Re: [OMPI users] Fault Tolerant Features in OpenMPI

2012-06-25 Thread Josh Hursey
The official support page for the C/R features is hosted by Indiana
University (linked from the Open MPI FAQs):
  http://osl.iu.edu/research/ft/ompi-cr/

The instructions probably need to be cleaned up (some of the release
references are not quite correct any longer). But the following should
give you a build of Open MPI with C/R support:
 shell$ ./configure --with-ft=cr --enable-opal-multi-threads

You will also need to enable it on the command line with mpirun:
 shell$ mpirun -am ft-enable-cr my-app
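
Once the job is running, checkpoints are taken and restarted with the command
line tools, along these lines (the snapshot name below is only an example of
the form the tools generate; see the tools page above for details):

 shell$ ompi-checkpoint <PID of mpirun>
 shell$ ompi-restart ompi_global_snapshot_<PID>.ckpt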

Best,
Josh

On Mon, Jun 25, 2012 at 6:21 AM, CHEN Song wrote:
> THANK YOU for your detailed answer.
>
> [quote]If you want a fault tolerance feature, such as automatic
> checkpoint/restart recovery, you will need to create a build of Open
> MPI with that feature enabled. There are instructions on the various
> links above about how to do so.[/quote]
>
>
> Could you give me some kind of official guide to enabling the C/R feature? I
> googled some articles but there seem to be problems with those methods.
>
> Best wishes.
>
> - Original message -
> From: "Open MPI Users" 
> To: "Open MPI Users" 
> Subject: [OMPI users] Re: [OMPI users] Re: Re: [OMPI users] 2012/06/18 14:35:07
> auto-saved draft
> Date: 2012/06/20 21:43:27, Wednesday
>
> You are correct that the Open MPI project combined the efforts of a
> few preexisting MPI implementations towards building a single,
> extensible MPI implementation with the best features of the prior MPI
> implementations. From the beginning of the project the Open MPI
> developer community has desired to provide a solid MPI 2 (soon MPI 3)
> compliant MPI implementation. Features outside of the MPI standard,
> such as fault tolerance, have been (and are) goals as well.
>
> The fault tolerance efforts in Open MPI have been mostly pursued by
> the research side of the community. As such, maintenance support for
> these features is often challenging and a point of frequent discussion
> in the core developer community. There are users for each of these
> fault tolerance features/techniques, so they are important to provide.
> Integrating these features into Open MPI without diminishing
> performance, scalability, and usability is often a delicate software
> engineering challenge. Per the prior comments on this thread, it can
> often lead to heated debate. :)
>
>
> In the Open MPI trunk and 1.6 release series there are a few fault
> tolerance features that you might be interested in, all with various
> degrees of functionality and support. Each of these features are
> advancements on the fault tolerance features from the LAM/MPI,
> MPICH-V, FT-MPI, and LA-MPI projects.
>
> Checkpoint/Restart support allows a user to manually (via a command
> line tool) checkpoint and restart an MPI application, migrate
> processes in the machine, and/or ask Open MPI to automatically restart
> failed processes on spare resources. Additionally, the application can
> use APIs to checkpoint/restart/migrate processes without using the
> command line tools. This C/R technique is similar to the feature
> provided by LAM/MPI, and was developed by Indiana University (for my
> PhD work). For more details see the link below:
> http://www.open-mpi.org/faq/?category=ft#cr-support
>
> Message logging support was added a while back by UTK, but I am
> uncertain about its current state. This technique is similar to the
> features provided by the MPICH-V project. For more details, I think
> the wiki page below describes the functionality:
> https://svn.open-mpi.org/trac/ompi/wiki/EventLog_CR
>
> The MPI Forum standardization body's Fault Tolerance Working Group has
> a proposal for application managed fault tolerance. In essence this is
> similar to the FT-MPI work, although the interface is quite a bit
> different. This feature is not yet in the Open MPI trunk, but you can
> find a beta release and more information at the link below:
> http://www.open-mpi.org/~jjhursey/projects/ft-open-mpi/
>
> End-to-end data reliability worked at one point in time, but I do not
> know if it is being maintained. This is similar to the fault tolerance
> features found in LA-MPI. For information about that project see the
> link below:
> http://www.open-mpi.org/faq/?category=ft#dr-support
>
> There are also research projects that are exploring other fault
> tolerance techniques above MPI, such as peer based checkpointing and
> replication. So far, these projects have tried to stay above the MPI
> layer for portability, and have not requested any specific extensions
> of Open MPI (maybe with the exception of the work in the MPI Forum,
> cited above). Below are links to two such projects, though there are
> many others out there:
> http://sourceforge.net/projects/scalablecr/
> http://prod.sandia.gov/techlib/access-control.cgi/2011/112488.pdf
>
>
> So that should give you an overview of the current state of fault
> tolerance techniques in Open MPI. To your question about what you can
> expect if a process crashes in your Open MPI job. By default,

Re: [OMPI users] checkpointing of NPB

2012-06-20 Thread Josh Hursey
Ifeanyi,

I am usually the one that responds to checkpoint/restart questions,
but unfortunately I do not have time to look into this issue at the
moment (and probably won't for at least a few more months). There are
a few other developers that work on the checkpoint/restart
functionality that might be able to more immediately help you.
Hopefully they will chime in.

At one point in time (about a year ago) I was able to
checkpoint/restart the NAS benchmarks (and other applications) without
issue. From the error message that you posted earlier, it seems that
something has broken in the 1.6 branch. Unfortunately, I do not have
any advice on an alternative branch to try. The C/R functionality in
the Open MPI trunk is known to be broken. There is a patch for the
trunk making its way through testing at the moment. Once that is
committed then you should be able to use the Open MPI trunk until
someone fixes the 1.6 branch.

Sorry I cannot be of much help. Hopefully others can assist.

-- Josh

On Tue, Jun 19, 2012 at 1:22 AM, Ifeanyi  wrote:
> Dear,
>
> Please help.
>
> I configured the open mpi and it can checkpoint HPL.
>
> However, whenever I want to checkpoint NAS parallel benchmark it kills the
> application without informative message.
>
> Please how do I configure the openmpi 1.6 to checkpoint NPB? I really need a
> help, I have been on this issue for the past few days without solution
>
> Regards,
> Ifeanyi
>
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


[OMPI users] Re: [OMPI users] Reply: Re: [OMPI users] 2012/06/18 14:35:07 Auto-saved draft

2012-06-20 Thread Josh Hursey
You are correct that the Open MPI project combined the efforts of a
few preexisting MPI implementations towards building a single,
extensible MPI implementation with the best features of its
predecessors. From the beginning of the project, the Open MPI
developer community has desired to provide a solid MPI 2 (soon MPI 3)
compliant MPI implementation. Features outside of the MPI standard,
such as fault tolerance, have been (and are) goals as well.

The fault tolerance efforts in Open MPI have been mostly pursued by
the research side of the community. As such, maintenance support for
these features is often challenging and a point of frequent discussion
in the core developer community. There are users for each of these
fault tolerance features/techniques, so they are important to provide.
Integrating these features into Open MPI without diminishing
performance, scalability, and usability is often a delicate software
engineering challenge. Per the prior comments on this thread, it can
often lead to heated debate. :)


In the Open MPI trunk and 1.6 release series there are a few fault
tolerance features that you might be interested in, all with various
degrees of functionality and support. Each of these features are
advancements on the fault tolerance features from the LAM/MPI,
MPICH-V, FT-MPI, and LA-MPI projects.

Checkpoint/Restart support allows a user to manually (via a command
line tool) checkpoint and restart an MPI application, migrate
processes in the machine, and/or ask Open MPI to automatically restart
failed processes on spare resources. Additionally, the application can
use APIs to checkpoint/restart/migrate processes without using the
command line tools. This C/R technique is similar to the feature
provided by LAM/MPI, and was developed by Indiana University (for my
PhD work). For more details see the link below:
  http://www.open-mpi.org/faq/?category=ft#cr-support

Message logging support was added a while back by UTK, but I am
uncertain about its current state. This technique is similar to the
features provided by the MPICH-V project. For more details, I think
the wiki page below describes the functionality:
  https://svn.open-mpi.org/trac/ompi/wiki/EventLog_CR

The MPI Forum standardization body's Fault Tolerance Working Group has
a proposal for application managed fault tolerance. In essence this is
similar to the FT-MPI work, although the interface is quite a bit
different. This feature is not yet in the Open MPI trunk, but you can
find a beta release and more information at the link below:
  http://www.open-mpi.org/~jjhursey/projects/ft-open-mpi/

End-to-end data reliability worked at one point in time, but I do not
know if it is being maintained. This is similar to the fault tolerance
features found in LA-MPI. For information about that project see the
link below:
  http://www.open-mpi.org/faq/?category=ft#dr-support

There are also research projects that are exploring other fault
tolerance techniques above MPI, such as peer based checkpointing and
replication. So far, these projects have tried to stay above the MPI
layer for portability, and have not requested any specific extensions
of Open MPI (maybe with the exception of the work in the MPI Forum,
cited above). Below are links to two such projects, though there are
many others out there:
  http://sourceforge.net/projects/scalablecr/
  http://prod.sandia.gov/techlib/access-control.cgi/2011/112488.pdf


So that should give you an overview of the current state of fault
tolerance techniques in Open MPI. As to your question about what you can
expect if a process crashes in your Open MPI job: by default, Open MPI
will kill your entire MPI job and the user will have to restart the
job from either the beginning of execution or from any checkpoint
files that the application has written. Open MPI defaults to killing
the entire MPI job since that is what is often expected by MPI
applications, as most use the default MPI error handler
MPI_ERRORS_ARE_FATAL:
  http://www.netlib.org/utk/papers/mpi-book/node177.html

Last I checked, the current Open MPI trunk will terminate the entire
job even if the user set MPI_ERRORS_RETURN on their communicators. A
reason for this is that the behavior of MPI after returning such an
error is undefined. The MPI Forum Fault Tolerance working group is
working to define this behavior. So if this is of interest see the MPI
Forum work cited above.
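
For reference, here is a minimal sketch of what selecting MPI_ERRORS_RETURN
looks like in application code (this only changes which error handler is
installed; as noted above, it does not by itself change what the current
trunk does after a failure):

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rc, rank;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* Replace the default MPI_ERRORS_ARE_FATAL handler so that errors
       * come back as return codes instead of aborting the whole job. */
      MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

      rc = MPI_Barrier(MPI_COMM_WORLD);
      if (MPI_SUCCESS != rc) {
          /* What may legally be done after an error is undefined in MPI
           * 2.2; that is the gap the Forum's FT working group is closing. */
          fprintf(stderr, "rank %d: MPI_Barrier returned error %d\n",
                  rank, rc);
      }

      MPI_Finalize();
      return 0;
  }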

If you want a fault tolerance feature, such as automatic
checkpoint/restart recovery, you will need to create a build of Open
MPI with that feature enabled. There are instructions on the various
links above about how to do so.


If you are particularly interested in one feature or have a strong use
case for a set of features, then that is important information for the
Open MPI developer community. It will help us as a project prioritize
the maintenance of the various features in Open MPI.


Best of luck,
Josh

On Wed, Jun 20, 2012 at 2:59 AM, 陈松  wrote:
> 

Re: [OMPI users] Ompi-restart failed and process migration

2012-04-24 Thread Josh Hursey
The ~/.openmpi/mca-params.conf file should contain the same
information on all nodes.

You can install Open MPI as root. However, we do not recommend that
you run Open MPI as root.

If the user $HOME directory is NFS mounted, then you can use an NFS
mounted directory to store your files. With this option you do not
need to use the local disk. For an NFS mounted directory you only need
to set:
  snapc_base_global_snapshot_dir=/path_to_NFS_directory/

If you need to stage the files then the following options are what you need.
  snapc_base_store_in_place=0
  snapc_base_global_snapshot_dir=/path_to_global_storage_dir/
  crs_base_snapshot_dir=/path_to_local_storage_dir/

As you start getting setup, I would recommend the NFS options to
reduce the number of variables that you need to worry about to get the
basic setup working.

-- Josh


On Tue, Apr 24, 2012 at 11:43 AM, kidd  wrote:
> Hi ,Thank you For your reply.
> I have some problem:
> Q1:  I setting 2 kinds  mac.para.conf
> (1) crs_base_snapshot_dir=/root/kidd_openMPI/Tmp
>   snapc_base_global_snapshot_dir=/root/kidd_openMPI/checkpoints
>
>  My Master : /root/kidd_openMPI   is My opempi-Installed Dir
> ,it is  Shared by NFS .
>  Do I have to mount  a   User_Account , Rather than a  dir  ?
>
>
>  (2) snapc_base_store_in_place=0
>   crs_base_snapshot_dir= /tmp/OmpiStore/local
>   snapc_base_global_snapshot_dir= /tmp/OmpiStore/global
>
> In this  case  ,I not use  NFS  in OmpiStore/local  &
> OmpiStore/local;
> is it right ?
>   (3)
>Do I setting .openmpi in all-Node ,or just seting on Master .
>
>   (4)  I install openmpi  in root ,should I move   to
> General-user-account ?
>
> 
> From: Josh Hursey 
> To: Open MPI Users 
> Date: 2012/4/24 (Tue) 10:58 PM
> 
> Subject: Re: [OMPI users] Ompi-restart failed and process migration
>
> On Tue, Apr 24, 2012 at 10:10 AM, kidd  wrote:
>> Hi ,Thank you For your reply.
>>  but I still failed. I must add -x  LD_LIBRARY_PATH
>> this is my  All Setting ;
>> 1) Master-Node(cuda07)  &  Slaves Node(cuda08) :
>>Configure:
>>./configure --prefix=/root/kidd_openMPI  --with-ft=cr
>> --enable-ft-thread  --with-blcr=/usr/local/BLCR
>>--with-blcr-libdir=/usr/local/BLCR/lib
>> --enable-mpirun-prefix-by-default
>>--enable-static --enable-shared  --enable-opal-progress-threads; make ;
>> make install;
>>
>>   (2) Path && LD_PATH:
>> #In /etc/profile
>>  ==>export PATH=$PATH:/usr/local/BLCR/bin ;
>>  ==>export  LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/BLCR/lib
>>#In ~/.bashrc
>> ==>export PATH=$PATH:/root/kidd_openMPI/bin
>> ==>export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/root/kidd_openMPI/lib
>>
>>(3) Compiler && Running:
>>   ==> ~/kidd_openMPI/NBody_TEST#  mpicc -o  TEST -DDEFSIZE=5000  \
>>   -DDEF_PROC=2 MPINbodyOMP.c
>>
>>   ==>   root@cuda07:~/kidd_openMPI/NBody_TEST# mpirun -hostfile Hosts
>> -np 2 TEST
>>
>>   TEST: error while loading shared libraries: libcr.so.0: cannot open
>> shared
>> object file: No such file or directory
>
>
> I still think the core problem is with the search path given this
> message. Open MPI is trying to load BLCR's libcr.so.0, and it is not
> finding the library in the LD_LIBRARY_PATH search path. Something is
> still off in the backend nodes. Try adding the BLCR
> PATH/LD_LIBRARY_PATH to your .bashrc instead of the profile.
>
>
>>
>>==> I make sure  Master and Slave  have  same Install and  same Path .
>>I  let slave-node  using cr_restart   restart a contextfile
>> ,the
>> contextfile checked by Master ,so
>>Blcr  can work;
>>but it still  cannot open shared object file->libcr.so.0:
>
>
> So BLCR is giving this error?
>
>>
>>   (4)  ifI pass  -x LD_LIBRARY_PATH
>>  ( local mount )
>> (4-1)My mca-params.conf(In Master )
>>  ==> snapc_base_store_in_place=0
>>  crs_base_snapshot_dir=/tmp/OmpiStore/local
>>  snapc_base_global_snapshot_dir=/tmp/OmpiStore/global
>>
>>   step 1: mpirun -hostfile Hosts -np 2 -x LD_LIBRARY_PATH -am
>> ft-enable-cr ./TEST
>>   step 2: ompi-checkpoint -term Pid ( I use another command)
>>   step 3:
>>cd  /tmp/OmpiStore/global
>>   ==> ompi-restartOmpi_Pid.ckpt .   (all process
>> Only Restart on Master

Re: [OMPI users] Ompi-restart failed and process migration

2012-04-24 Thread Josh Hursey
>  this is my command ==>
>  ompi-restart --mpirun_opts  -x  LD_LIBRARY_PATH  -hostfile Hosts \
>  ompi_global_snapshot_8873.ckpt/
>  but it is Error.


Use quotes around the mpirun specific options:
 ompi-restart --mpirun_opts  "-x  LD_LIBRARY_PATH"  -hostfile Hosts
ompi_global_snapshot_8873.ckpt
or
 ompi-restart --mpirun_opts  "-x  LD_LIBRARY_PATH -hostfile Hosts"
ompi_global_snapshot_8873.ckpt

-- Josh

>
>  thanks.
> 
> From: Josh Hursey 
> To: Open MPI Users 
> Date: 2012/4/24 (Tue) 3:23 AM
>
> Subject: Re: [OMPI users] Ompi-restart failed and process migration
>
> On Mon, Apr 23, 2012 at 2:45 PM, kidd  wrote:
>> Hi ,Thank you For your reply.
>>
>> I have some problems:
>> (1)
>> Now ,In the my platform , all nodes have the same path and
>> LD_LIBRARY_PATH.
>>  I set in .bashrc
>>
>> //
>> #BLCR
>> export PATH=$PATH:/usr/local/BLCR/bin
>> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/BLCR/lib
>> #openMPI
>> export PATH=$PATH:/root/kidd_openMPI/bin
>> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/root/kidd_openMPI/lib
>>
>> /---/
>> but ,when I  running  mpirun  , I have to add  " -x  LD_LIBRARY_PATH" ,or
>> it can't  run
>>  example:  mpirun -hostfile hosts  -np  2  ./TEST .
>>  Error Message==>
>> ./TEST: error while loading shared libraries: libcr.so.0: cannot open
>> shared
>> object file: No such file or directory
>
> It sounds like something is still not quite right with your
> environment and system setup. If you have set the PATH and
> LD_LIBRARY_PATH appropriately on all nodes then you should not have to
> pass the "-x LD_LIBRARY_PATH" option to mpirun. Additionally, the
> error you are seeing is from BLCR. That error seems to indicate that
> BLCR is not installed correctly on all nodes.
>
> Some things to look into (in this order):
> 1) Make sure that you have BLCR and Open MPI installed in the same
> location on all machines.
> 2) Make sure that BLCR works on all machines by checkpointing and
> restarting a single process program
> 3) Make sure that Open MPI works on all machines -without-
> checkpointing, and without passing the -x option.
> 4) Checkpoint/restart an MPI job
>
>
>>  (2)  BLCR need to unify linux-kernel  of all the Node ?
>>    Now ,I reset all  Node.(using Ubuntu 10.04)
>
> I do not understand what you are trying to ask here. Please rephrase.
>
>
>>  (3)
>>       Now , My porgram using  DLL . I implements some DLL  ,MPI-Program
>> calls DLLs .
>>   Ompi can check/Restart  Program contains  DLL ?
>
> I do not understand what you are trying to ask here. Please rephrase.
>
> -- Josh
>
>
>> 
>>
>> 
>> From: Josh Hursey 
>> To: Open MPI Users 
>> Date: 2012/4/23 (Mon) 10:51 PM
>> Subject: Re: [OMPI users] Ompi-restart failed and process migration
>>
>> I wonder if the LD_LIBRARY_PATH is not being set properly upon
>> restart. In your mpirun you pass the '-x LD_LIBRARY_PATH'.
>> ompi-restart will not pass that variable along for you, so if you are
>> using that to set the BLCR path this might be your problem.
>>
>> A couple solutions:
>> - have the PATH and LD_LIBRARY_PATH set the same on all nodes
>> - have ompi-restart pass the -x parameter to the underlying mpirun by
>> using the -mpirun_opts command line switch:
>>   ompi-restart --mpirun_opts "-x LD_LIBRARY_PATH" ...
>>
>> Yes. ompi-restart will let you checkpoint a process on one node and
>> restart it on another. You will have to restart the whole application
>> since the ompi-migration operation is not available in the 1.5 series.
>>
>> -- Josh
>>
>> On Sat, Apr 21, 2012 at 4:11 AM, kidd  wrote:
>>> Hi all,
>>> I have Some problems,I wana check/Restart Multiple process on 2 node.
>>>
>>>  My environment:
>>>  BLCR= 0.8.4   , openMPI= 1.5.5  , OS = ubuntu 11.04
>>> I have 2 Node :
>>>  N05(Master ,it have NFS shared file system),N07(slave
>>>  ,mount Master-Node).
>>>
>>>  My configure format=./configure --prefix=/root/kidd_openMPI
>>>  --with-ft=cr --enable-ft-thread  --with-blcr=/usr/local/BLCR
>>>  --with-blcr-libdir=/usr/local/BLCR/lib --enable-mpirun-prefix-by-default
>>>  --enable-static --enable-shared --enable-opa

Re: [OMPI users] Ompi-restart failed and process migration

2012-04-23 Thread Josh Hursey
On Mon, Apr 23, 2012 at 2:45 PM, kidd  wrote:
> Hi ,Thank you For your reply.
>
> I have some problems:
> (1)
> Now ,In the my platform , all nodes have the same path and LD_LIBRARY_PATH.
>  I set in .bashrc
> //
> #BLCR
> export PATH=$PATH:/usr/local/BLCR/bin
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/BLCR/lib
> #openMPI
> export PATH=$PATH:/root/kidd_openMPI/bin
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/root/kidd_openMPI/lib
> /---/
> but ,when I  running  mpirun  , I have to add  " -x  LD_LIBRARY_PATH" ,or
> it can't  run
>  example:  mpirun -hostfile hosts  -np  2  ./TEST .
>  Error Message==>
> ./TEST: error while loading shared libraries: libcr.so.0: cannot open shared
> object file: No such file or directory

It sounds like something is still not quite right with your
environment and system setup. If you have set the PATH and
LD_LIBRARY_PATH appropriately on all nodes then you should not have to
pass the "-x LD_LIBRARY_PATH" option to mpirun. Additionally, the
error you are seeing is from BLCR. That error seems to indicate that
BLCR is not installed correctly on all nodes.

Some things to look into (in this order):
 1) Make sure that you have BLCR and Open MPI installed in the same
location on all machines.
 2) Make sure that BLCR works on all machines by checkpointing and
restarting a single process program
 3) Make sure that Open MPI works on all machines -without-
checkpointing, and without passing the -x option.
 4) Checkpoint/restart an MPI job


>  (2)  BLCR need to unify linux-kernel  of all the Node ?
>    Now ,I reset all  Node.(using Ubuntu 10.04)

I do not understand what you are trying to ask here. Please rephrase.


>  (3)
>       Now , My porgram using  DLL . I implements some DLL  ,MPI-Program
> calls DLLs .
>   Ompi can check/Restart  Program contains  DLL ?

I do not understand what you are trying to ask here. Please rephrase.

-- Josh


> 
>
> 
> From: Josh Hursey 
> To: Open MPI Users 
> Date: 2012/4/23 (Mon) 10:51 PM
> Subject: Re: [OMPI users] Ompi-restart failed and process migration
>
> I wonder if the LD_LIBRARY_PATH is not being set properly upon
> restart. In your mpirun you pass the '-x LD_LIBRARY_PATH'.
> ompi-restart will not pass that variable along for you, so if you are
> using that to set the BLCR path this might be your problem.
>
> A couple solutions:
> - have the PATH and LD_LIBRARY_PATH set the same on all nodes
> - have ompi-restart pass the -x parameter to the underlying mpirun by
> using the -mpirun_opts command line switch:
>   ompi-restart --mpirun_opts "-x LD_LIBRARY_PATH" ...
>
> Yes. ompi-restart will let you checkpoint a process on one node and
> restart it on another. You will have to restart the whole application
> since the ompi-migration operation is not available in the 1.5 series.
>
> -- Josh
>
> On Sat, Apr 21, 2012 at 4:11 AM, kidd  wrote:
>> Hi all,
>> I have Some problems,I wana check/Restart Multiple process on 2 node.
>>
>>  My environment:
>>  BLCR= 0.8.4   , openMPI= 1.5.5  , OS = ubuntu 11.04
>> I have 2 Node :
>>  N05(Master ,it have NFS shared file system),N07(slave
>>  ,mount Master-Node).
>>
>>  My configure format=./configure --prefix=/root/kidd_openMPI
>>  --with-ft=cr --enable-ft-thread  --with-blcr=/usr/local/BLCR
>>  --with-blcr-libdir=/usr/local/BLCR/lib --enable-mpirun-prefix-by-default
>>  --enable-static --enable-shared --enable-opal-multi-threads;
>>
>>  I had also set  ~/.openmpi/mca-params.conf->
>>     crs_base_snapshot_dir=/root/kidd_openMPI/Tmp
>>     snapc_base_global_snapshot_dir=/root/kidd_openMPI/checkpoints.
>>
>> the dir->kidd_openMPI is my nfs shared dir.
>>
>>  My Command :
>>  1. mpicc -o TEST -DDEFSIZE=3000 -DDEF_PROC=2 -fopenmp MPIMatrix.c
>>
>>   2. mpirun -hostfile Hosts -am ft-enable-cr -x LD_LIBRARY_PATH
>>      -np 2 ./TEST .
>>
>>  I can restart process-0 on Master,but process-1 on N07 was failed.
>>
>>  I checked my Node,it does not install the prelink,
>>  so the error(restart-failed) is caused by other reasons.
>>
>>  Error Message-->
>>
>> --
>>   root@cuda05:~/kidd_openMPI/checkpoints#
>>  ompi-restart -hostfile Hosts ompi_global_snapshot_2892.ckpt/
>>
>> --

Re: [OMPI users] Ompi-restart failed and process migration

2012-04-23 Thread Josh Hursey
I wonder if the LD_LIBRARY_PATH is not being set properly upon
restart. In your mpirun you pass the '-x LD_LIBRARY_PATH'.
ompi-restart will not pass that variable along for you, so if you are
using that to set the BLCR path this might be your problem.

A couple solutions:
 - have the PATH and LD_LIBRARY_PATH set the same on all nodes
 - have ompi-restart pass the -x parameter to the underlying mpirun by
using the -mpirun_opts command line switch:
   ompi-restart --mpirun_opts "-x LD_LIBRARY_PATH" ...

Yes. ompi-restart will let you checkpoint a process on one node and
restart it on another. You will have to restart the whole application
since the ompi-migration operation is not available in the 1.5 series.

-- Josh

On Sat, Apr 21, 2012 at 4:11 AM, kidd  wrote:
> Hi all,
> I have Some problems,I wana check/Restart Multiple process on 2 node.
>
>  My environment:
>  BLCR= 0.8.4   , openMPI= 1.5.5  , OS = ubuntu 11.04
> I have 2 Node :
>  N05(Master ,it have NFS shared file system),N07(slave
>  ,mount Master-Node).
>
>  My configure format=./configure --prefix=/root/kidd_openMPI
>  --with-ft=cr --enable-ft-thread  --with-blcr=/usr/local/BLCR
>  --with-blcr-libdir=/usr/local/BLCR/lib --enable-mpirun-prefix-by-default
>  --enable-static --enable-shared --enable-opal-multi-threads;
>
>   I had also set  ~/.openmpi/mca-params.conf->
>     crs_base_snapshot_dir=/root/kidd_openMPI/Tmp
>     snapc_base_global_snapshot_dir=/root/kidd_openMPI/checkpoints.
>
> the dir->kidd_openMPI is my nfs shared dir.
>
>  My Command :
>   1. mpicc -o TEST -DDEFSIZE=3000 -DDEF_PROC=2 -fopenmp MPIMatrix.c
>
>   2. mpirun -hostfile Hosts -am ft-enable-cr -x LD_LIBRARY_PATH
>  -np 2 ./TEST .
>
>   I can restart process-0 on Master,but process-1 on N07 was failed.
>
>   I checked my Node,it does not install the prelink,
>   so the error(restart-failed) is caused by other reasons.
>
>   Error Message-->
>  --
>   root@cuda05:~/kidd_openMPI/checkpoints#
>   ompi-restart -hostfile Hosts ompi_global_snapshot_2892.ckpt/
>  --
>     Error: BLCR was not able to restart the process because exec failed.
>      Check the installation of BLCR on all of the machines in your
>      system. The following information may be of help:
>   Return Code : -1
>   BLCR Restart Command : cr_restart
>   Restart Command Line : cr_restart
>  /root/kidd_openMPI/checkpoints/ompi_global_snapshot_2892.ckpt/0/
>  opal_snapshot_1.ckpt/ompi_blcr_context.2704
>  --
>  --
>  Error: Unable to obtain the proper restart command to restart from the
>     checkpoint file (opal_snapshot_1.ckpt). Returned -1.
>     Check the installation of the blcr checkpoint/restart service
>     on all of the machines in your system.
>  ###
>  problem 2: I wana let MPI-process can migration to another Node.
>  if Ompi-Restart  Multiple-Node can be successful.
>  Can restart in another new node, rather than the original node?
>example:
>  checkpoint (node1,node2,node3),then restart(node1,node3,node4).
>  or just restart(node1,node3(2-process) ).
>
>    Please help me , thanks .
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] ompi-restart failed && ompi-migrate

2012-04-11 Thread Josh Hursey
The 1.5 series does not support process migration, so there is no
ompi-migrate option there. This was only contributed to the trunk (1.7
series). However, changes to the runtime environment over the past few
months have broken this functionality. It is currently unclear when
this will be repaired. We hope to have it fixed and functional again
before the first release of the 1.7 series.

As for your problem with ompi-restart, have you checked the prelink
option on all of your nodes, per:
  https://upc-bugs.lbl.gov/blcr/doc/html/FAQ.html#prelink

-- Josh

On Tue, Apr 10, 2012 at 11:14 PM, kidd  wrote:
> Hello !
> I had some  problems .
> This is My environment
>    BLCR= 0.8.4   , openMPI= 1.5.5  , OS= ubuntu 11.04
>    I have 2 Node : cuda05(Master ,it have NFS  file system)  , cuda07(slave
> ,mount Master)
>
>    I had also set  ~/.openmpi/mca-params.conf->
>  crs_base_snapshot_dir=/root/kidd_openMPI/Tmp
>  snapc_base_global_snapshot_dir=/root/kidd_openMPI/checkpoints
>
>   my configure format=
> ./configure --prefix=/root/kidd_openMPI --with-ft=cr --enable-ft-thread
>  --with-blcr=/usr/local/BLCR  --with-blcr-libdir=/usr/local/BLCR/lib
> --enable-mpirun-prefix-by-default
>  --enable-static --enable-shared  --enable-opal-multi-threads;
>
> problem 1:  ompi-restart  on multiple Node
>   command 01: mpirun -hostfile  Hosts -am ft-enable-cr  -x  LD_LIBRARY_PATH
> -np 2  ./TEST
>   command 02: ompi-restart  ompi_global_snapshot_2892.ckpt
>   -> I can checkpoint 2 process on multiples nodes ,but when I restart
> ,it can only restart on Master-Node.
>
>      command 03 : ompi-restart  -hostfile Hosts
> ompi_global_snapshot_2892.ckpt
>     ->Error Message .   I make sure BLCR  is OK.
> 
>
> --
>     root@cuda05:~/kidd_openMPI/checkpoints# ompi-restart -hostfile Hosts
> ompi_global_snapshot_2892.ckpt/
>
> --
>    Error: BLCR was not able to restart the process because exec failed.
>     Check the installation of BLCR on all of the machines in your
>    system. The following information may be of help:
>  Return Code : -1
>  BLCR Restart Command : cr_restart
>  Restart Command Line : cr_restart
> /root/kidd_openMPI/checkpoints/ompi_global_snapshot_2892.ckpt/0/opal_snapshot_1.ckpt/ompi_blcr_context.2704
> --
> --
> Error: Unable to obtain the proper restart command to restart from the
>    checkpoint file (opal_snapshot_1.ckpt). Returned -1.
>    Check the installation of the blcr checkpoint/restart service
>    on all of the machines in your system.
> 
>  problem 2: ompi-migrate i can't find .   How to use ompi-migrate ?
>
>   Please help me , thanks .
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] Segmentation fault when checkpointing

2012-03-29 Thread Josh Hursey
This is a bit of a non-answer, but can you try the 1.5 series (1.5.5
is the current release)? 1.4 is being phased out, and 1.5 will replace
it in the near future. 1.5 has a number of C/R-related fixes that
might help.

-- Josh

On Thu, Mar 29, 2012 at 1:12 PM, Linton, Tom  wrote:
> We have a legacy application that runs fine on our cluster using Intel MPI
> with hundreds of cores. We ported it to OpenMPI so that we could use BLCR
> and it runs fine but checkpointing is not working properly:
>
>
>
> 1. when we checkpoint with more than 1 core, each MPI rank reports a
> segmentation fault for the MPI job and the ompi-checkpoint command does not
> return. For example, with two cores we get:
>
> [tscco28017:16352] *** Process received signal ***
>
> [tscco28017:16352] Signal: Segmentation fault (11)
>
> [tscco28017:16352] Signal code: Address not mapped (1)
>
> [tscco28017:16352] Failing at address: 0x7fffef51
>
> [tscco28017:16353] *** Process received signal ***
>
> [tscco28017:16353] Signal: Segmentation fault (11)
>
> [tscco28017:16353] Signal code: Address not mapped (1)
>
> [tscco28017:16353] Failing at address: 0x7fffef51
>
> [tscco28017:16353] [ 0] /lib64/libpthread.so.0(+0xf5d0) [0x7698e5d0]
>
> [tscco28017:16353] [ 1] [0xf500b0]
>
> [tscco28017:16353] *** End of error message ***
>
> [tscco28017:16352] [ 0] /lib64/libpthread.so.0(+0xf5d0) [0x7698e5d0]
>
> [tscco28017:16352] [ 1] [0xf500b0]
>
> [tscco28017:16352] *** End of error message ***
>
> --
>
> mpirun noticed that process rank 1 with PID 16353 on node tscco28017 exited
> on signal 11 (Segmentation fault).
>
> --
>
> When I execute the TotalView debugger on a resulting core file (I assume
> it’s for the rank 0 process), Totalview reports a null frame pointer and the
> stack is trashed (gdb shows a backtrace with 30 frames but shows no debug
> info).
>
>
>
> 2. Checkpointing with 1 core on the legacy program works.
>
> 3. Checkpointing with a simple test program on 16 cores works.
>
>
>
>
>
> Can you suggest how to debug this problem?
>
>
>
> Some additional information:
>
>
>
> ·    I execute the program like this: mpirun -am ft-enable-cr -n 2
> -machinefile machines program inputfile
>
> ·    We are using Open MPI 1.4.4 with BLCR 0.8.4
>
> ·    OpenMPI and the application were both compiled on the same machine
> using the Intel icc 12.0.4 compiler
>
> ·    For the failing example, both MPI processes are running on cores on
> the same machine node.
>
> ·    I have attached “ompi_info.txt”
>
> ·    We’re running on a single Xeon 5150 node with Gigabit Ethernet.
>
> ·    [Reuti: previously I reported a problem involving illegal
> instructions but this turned out to be a build problem. Sorry I didn’t
> answer your response to my previous thread but I was having problems with
> accessing this email list at that time.]
>
>
>
> Thanks
>
> Tom
>
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] MPI_Barrier in Self-checkpointing call

2012-02-15 Thread Josh Hursey
When you receive that callback, the MPI library has been put in a quiescent
state. As such, it does not allow MPI communication until the checkpoint is
completely finished, so you cannot call barrier in the checkpoint callback.
Since Open MPI performs a coordinated checkpoint, you can assume that all
processes are calling the same callback at about the same time (the
coordination algorithm synchronizes them for you).

If you would like a notification callback before the quiescence protocol
you might want to look at the INC callbacks:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_inc_register_callback
They are available in the Open MPI trunk (v1.7). The
OMPI_CR_INC_PRE_CRS_PRE_MPI
callback will give you immediate notice, and you -should- be able to make
MPI calls in that callback. I have not tried it, but conceptually it should
work. If it does not, I can file a bug ticket and we can look into
addressing it.
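
For illustration, below is a minimal sketch of a SELF-CRS user checkpoint
callback that respects this restriction by doing only local work. The
signature follows the one quoted in your question; the file name, the state
variable, and the restart_cmd/return conventions are assumptions, so treat
it as a sketch rather than a reference implementation:

  #include <stdio.h>

  static double local_state[3];       /* hypothetical application data */

  int opal_crs_self_user_checkpoint(char **restart_cmd)
  {
      FILE *fp = fopen("my_app_state.dat", "w");  /* hypothetical file */
      if (NULL == fp) {
          return -1;                  /* assumed error convention */
      }
      fwrite(local_state, sizeof(local_state), 1, fp);
      fclose(fp);

      /* Do NOT call MPI_Barrier() or any other MPI routine here: MPI is
       * quiesced, and the coordinated checkpoint protocol has already
       * synchronized all processes. */

      *restart_cmd = NULL;            /* assumed: no special restart command */
      return 0;
  }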

-- Josh

On Wed, Feb 15, 2012 at 4:23 AM, Faisal Shahzad wrote:

>  Dear Group,
>
> I wanted to do a synchronization check with 'MPI_Barrier(MPI_COMM_WORLD)'
> in 'opal_crs_self_user_checkpoint(char **restart_cmd)' call. Although every
> process is present in this call, it fails to synchronize. Is there any
> reason why cant we use barrier?
> Thanks in advance.
>
> Kind regards,
> Faisal
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] Strange recursive "no" error message when compiling 1.5 series with fault tolerance enabled

2012-01-26 Thread Josh Hursey
It looks like Jeff beat me to it. The problem was a missing 'test' in
the configure script. I'm not sure how it crept in there, but the fix is
in the pipeline for the next 1.5 release. The following ticket tracks the
progress of this patch:
  https://svn.open-mpi.org/trac/ompi/ticket/2979

Thanks for the bug report!

-- Josh

On Thu, Jan 26, 2012 at 4:16 PM, Jeff Squyres  wrote:

> Doh!  That's a fun one.  Thanks for the report!
>
> I filed a fix; we'll get this in very shortly (looks like the fix is
> already on the trunk, but somehow got missed on the v1.5 branch).
>
>
> On Jan 26, 2012, at 3:42 PM, David Akin wrote:
>
> > I can build OpenMPI with FT on my system if I'm using 1.4 source, but
> > if I use any of the 1.5 series, I get hung in a strange "no" loop  at the
> > beginning of the compile (see below):
> >
> > + ./configure --build=x86_64-unknown-linux-gnu
> > --host=x86_64-unknown-linux-gnu --target=x86_64-redhat-linux-gnu
> > --program-prefix= --prefix=/usr/mpi/intel/openmpi-1.5-ckpt
> > --exec-prefix=/usr/mpi/intel/openmpi-1.5-ckpt
> > --bindir=/usr/mpi/intel/openmpi-1.5-ckpt/bin
> > --sbindir=/usr/mpi/intel/openmpi-1.5-ckpt/sbin
> > --sysconfdir=/usr/mpi/intel/openmpi-1.5-ckpt/etc
> > --datadir=/usr/mpi/intel/openmpi-1.5-ckpt/share
> > --includedir=/usr/mpi/intel/openmpi-1.5-ckpt/include
> > --libdir=/usr/mpi/intel/openmpi-1.5-ckpt/lib64
> > --libexecdir=/usr/mpi/intel/openmpi-1.5-ckpt/libexec
> > --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man
> > --infodir=/usr/share/info --enable-ft-thread --with-ft=cr
> > --enable-opal-multi-threads
> >
> > .
> > .
> > .
> >
> >
> 
> > == System-specific tests
> >
> 
> > checking checking for type of MPI_Offset... long long
> > checking checking for an MPI datatype for MPI_Offset... MPI_LONG_LONG
> > checking for _SC_NPROCESSORS_ONLN... yes
> > checking whether byte ordering is bigendian... no
> > checking for broken qsort... no
> > checking if word-sized integers must be word-size aligned... no
> > checking if C compiler and POSIX threads work as is... no
> > checking if C++ compiler and POSIX threads work as is... no
> > checking if F77 compiler and POSIX threads work as is... yes
> > checking if C compiler and POSIX threads work with -Kthread... no
> > checking if C compiler and POSIX threads work with -kthread... no
> > checking if C compiler and POSIX threads work with -pthread... yes
> > checking if C++ compiler and POSIX threads work with -Kthread... no
> > checking if C++ compiler and POSIX threads work with -kthread... no
> > checking if C++ compiler and POSIX threads work with -pthread... yes
> > checking for PTHREAD_MUTEX_ERRORCHECK_NP... yes
> > checking for PTHREAD_MUTEX_ERRORCHECK... yes
> > checking for working POSIX threads package... yes
> > checking if C compiler and Solaris threads work... no
> > checking if C++ compiler and Solaris threads work... no
> > checking if F77 compiler and Solaris threads work... no
> > checking for working Solaris threads package... no
> > checking for type of thread support... posix
> > checking if threads have different pids (pthreads on linux)... no
> > checking if want OPAL thread support... yes
> > checking if want fault tolerance thread... = no
> > = no
> > = no
> > = no
> > = no
> > = no
> > = no
> > = no
> > = no
> > = no
> > = no
> > = no
> > = no
> > .
> > .
> > .
> >
> >
> > The system just keeps repeating "no" over and over infinitely.
> >
> > I'm on RHEL6 2.6.32-220.2.1.el6.x86_64. I've tried the
> > following OpenMPI 1.5 series tarballs with the same results:
> >
> > openmpi-1.5.5rc1.tar.bz2
> > openmpi-1.5.5rc2r25765.tar.bz2
> > openmpi-1.5.5rc2r25773.tar.bz2
> >
> > Any guidance is appreciated.
> > Thanks!
> > Dave
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>


-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] Strange recursive "no" error message when compiling 1.5 series with fault tolerance enabled

2012-01-26 Thread Josh Hursey
Well that is awfully insistent. I have been able to reproduce the problem.
Upon initial inspection I don't see the bug, but I'll dig into it today and
hopefully have a patch in a bit. Below is a ticket for this bug:
  https://svn.open-mpi.org/trac/ompi/ticket/2980

I'll let you know what I find out.

-- Josh

On Thu, Jan 26, 2012 at 3:42 PM, David Akin  wrote:

> I can build OpenMPI with FT on my system if I'm using 1.4 source, but
> if I use any of the 1.5 series, I get hung in a strange "no" loop  at the
> beginning of the compile (see below):
>
> + ./configure --build=x86_64-unknown-linux-gnu
> --host=x86_64-unknown-linux-gnu --target=x86_64-redhat-linux-gnu
> --program-prefix= --prefix=/usr/mpi/intel/openmpi-1.5-ckpt
> --exec-prefix=/usr/mpi/intel/openmpi-1.5-ckpt
> --bindir=/usr/mpi/intel/openmpi-1.5-ckpt/bin
> --sbindir=/usr/mpi/intel/openmpi-1.5-ckpt/sbin
> --sysconfdir=/usr/mpi/intel/openmpi-1.5-ckpt/etc
> --datadir=/usr/mpi/intel/openmpi-1.5-ckpt/share
> --includedir=/usr/mpi/intel/openmpi-1.5-ckpt/include
> --libdir=/usr/mpi/intel/openmpi-1.5-ckpt/lib64
> --libexecdir=/usr/mpi/intel/openmpi-1.5-ckpt/libexec
> --localstatedir=/var --sharedstatedir=/var/lib --mandir=/usr/share/man
> --infodir=/usr/share/info --enable-ft-thread --with-ft=cr
> --enable-opal-multi-threads
>
> .
> .
> .
>
>
> 
> == System-specific tests
>
> 
> checking checking for type of MPI_Offset... long long
> checking checking for an MPI datatype for MPI_Offset... MPI_LONG_LONG
> checking for _SC_NPROCESSORS_ONLN... yes
> checking whether byte ordering is bigendian... no
> checking for broken qsort... no
> checking if word-sized integers must be word-size aligned... no
> checking if C compiler and POSIX threads work as is... no
> checking if C++ compiler and POSIX threads work as is... no
> checking if F77 compiler and POSIX threads work as is... yes
> checking if C compiler and POSIX threads work with -Kthread... no
> checking if C compiler and POSIX threads work with -kthread... no
> checking if C compiler and POSIX threads work with -pthread... yes
> checking if C++ compiler and POSIX threads work with -Kthread... no
> checking if C++ compiler and POSIX threads work with -kthread... no
> checking if C++ compiler and POSIX threads work with -pthread... yes
> checking for PTHREAD_MUTEX_ERRORCHECK_NP... yes
> checking for PTHREAD_MUTEX_ERRORCHECK... yes
> checking for working POSIX threads package... yes
> checking if C compiler and Solaris threads work... no
> checking if C++ compiler and Solaris threads work... no
> checking if F77 compiler and Solaris threads work... no
> checking for working Solaris threads package... no
> checking for type of thread support... posix
> checking if threads have different pids (pthreads on linux)... no
> checking if want OPAL thread support... yes
> checking if want fault tolerance thread... = no
> = no
> = no
> = no
> = no
> = no
> = no
> = no
> = no
> = no
> = no
> = no
> = no
> .
> .
> .
>
>
> The system just keeps repeating "no" over and over infinitely.
>
>  I'm on RHEL6 2.6.32-220.2.1.el6.x86_64. I've tried the
> following OpenMPI 1.5 series tarballs with the same results:
>
> openmpi-1.5.5rc1.tar.bz2
> openmpi-1.5.5rc2r25765.tar.bz2
> openmpi-1.5.5rc2r25773.tar.bz2
>
> Any guidance is appreciated.
> Thanks!
> Dave
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>


-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] MPI_Comm_create with unequal group arguments

2012-01-20 Thread Josh Hursey
For MPI_Comm_create -all- processes in the communicator must make the call,
not just those that are in the subgroups. The 2.2 standard states that
  "The function is collective and must be called by all processes in the
group of comm."
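
As an illustration, here is a minimal sketch of the pattern from the
original question (it assumes exactly 4 ranks; the variable names are
illustrative), with every rank of MPI_COMM_WORLD making the call:

  #include <mpi.h>

  int main(int argc, char **argv)
  {
      MPI_Group world_group, my_group;
      MPI_Comm  subcomm;
      int       rank;
      int       low[2]  = {0, 1};
      int       high[2] = {2, 3};

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_group(MPI_COMM_WORLD, &world_group);
      MPI_Group_incl(world_group, 2, (rank < 2) ? low : high, &my_group);

      /* Collective over MPI_COMM_WORLD: every rank must make this call,
       * even though the two subgroups are disjoint. */
      MPI_Comm_create(MPI_COMM_WORLD, my_group, &subcomm);

      MPI_Comm_free(&subcomm);
      MPI_Group_free(&my_group);
      MPI_Group_free(&world_group);
      MPI_Finalize();
      return 0;
  }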

However, this is a common misconception about the MPI_Comm_create
interface, and has encouraged the MPI Forum standardization body to
consider an additional interface that just requires those processes in the
specified groups to make the call.

If you are interested in that proposal, below are a few links that you might
find informative:
 - http://meetings.mpi-forum.org/secretary/2012/01/slides/ticket_286_jan_2012_presentation.pdf
 - https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/286

The ticket includes a link to an EuroMPI paper regarding the prototype, and
the specific language being proposed.

-- Josh

2012/1/20 Jens Jørgen Mortensen 

>  On 20-01-2012 15:26, Josh Hursey wrote:
>
> That behavior is permitted by the MPI 2.2 standard. It seems that our
> documentation is incorrect in this regard. I'll file a bug to fix it.
>
>  Just to clarify, in the MPI 2.2 standard in Section 6.4.2 (Communicator
> Constructors) under MPI_Comm_create it states:
> "Each process must call with a group argument that is a subgroup of the
> group associated with comm; this could be MPI_GROUP_EMPTY. The processes
> may specify different values for the group argument. If a process calls
> with a non-empty group then all processes in that group must call the
> function with the same group as argument, that is the same processes in the
> same order. Otherwise the call is erroneous."
>
>  Thanks for reporting the man page bug.
>
>
> Thanks for the quick reply.
>
> Is it also allowed to call MPI_Comm_create only on those processes that
> are in the sub-group?  This seems to work also.  Or must one always call
> MPI_Comm_create on all processes in comm - as the description says.
>
> Jens Jørgen
>
>
>
>  -- Josh
>
> 2012/1/20 Jens Jørgen Mortensen 
>
>> Hi!
>>
>> For a long time, I have been calling MPI_Comm_create(comm, group,
>> newcomm) with different values for group on the different processes of
>> comm.  In pseudo-code, I would create two sub-communicators from a world
>> with 4 ranks like this:
>>
>> if world.rank < 2:
>>comm = world.create([0, 1])
>> else:
>>comm = world.create([2, 3])
>>
>> Now I read from the MPI_Comm_create description that this way of calling
>> MPI_Comm_create is erroneous:
>>
>>  "The call is erroneous if not all group arguments have the same value"
>>
>>  http://www.open-mpi.org/doc/v1.4/man3/MPI_Comm_create.3.php#toc7
>>
>> So, I guess I have just been lucky that it has worked for me?  Or is it
>> OK to do what I do?
>>
>> Jens Jørgen
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>
>
>  --
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] MPI_Comm_create with unequal group arguments

2012-01-20 Thread Josh Hursey
That behavior is permitted by the MPI 2.2 standard. It seems that our
documentation is incorrect in this regard. I'll file a bug to fix it.

Just to clarify, in the MPI 2.2 standard in Section 6.4.2 (Communicator
Constructors) under MPI_Comm_create it states:
"Each process must call with a group argument that is a subgroup of the
group associated with comm; this could be MPI_GROUP_EMPTY. The processes
may specify different values for the group argument. If a process calls
with a non-empty group then all processes in that group must call the
function with the same group as argument, that is the same processes in the
same order. Otherwise the call is erroneous."

Thanks for reporting the man page bug.

-- Josh

2012/1/20 Jens Jørgen Mortensen 

> Hi!
>
> For a long time, I have been calling MPI_Comm_create(comm, group, newcomm)
> with different values for group on the different processes of comm.  In
> pseudo-code, I would create two sub-communicators from a world with 4 ranks
> like this:
>
> if world.rank < 2:
>comm = world.create([0, 1])
> else:
>comm = world.create([2, 3])
>
> Now I read from the MPI_Comm_create description that this way of calling
> MPI_Comm_create is erroneous:
>
>  "The call is erroneous if not all group arguments have the same value"
>
>  
> http://www.open-mpi.org/doc/v1.4/man3/MPI_Comm_create.3.php#toc7
>
> So, I guess I have just been lucky that it has worked for me?  Or is it OK
> to do what I do?
>
> Jens Jørgen
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>


-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] Checkpoint an MPI process

2012-01-20 Thread Josh Hursey
Rodrigo,

Open MPI has the ability to migrate a subset of processes (in the trunk -
though currently broken due to recent code movement, I'm slowing developing
the fix in my spare time). The current implementation only checkpoints the
migrating processes, but suspends all other processes during the migration
activity. There has been some work on providing more of a live migration
mechanism in Open MPI (where non-migrating processes are not suspended),
but I do not know the state of that work. The original work was integrated
into LAM/MPI by Chao Wang and Frank Mueller at North Carolina State
University and depended on some as-yet-unreleased features of BLCR.

Open MPI also has the ability to suspend a job via SIGSTOP/SIGCONT without
the need for checkpoint, but it applies to the whole job. A while back, I
enhanced that feature such that a checkpoint is established before the
SIGSTOP is processed, so that a user can terminate and restart the job if
they wish instead of just being able to SIGCONT.

So these features are not quite what you are looking for, but could be used
as a starting point for future development if someone were so motivated. A
short-term alternative is to use a virtual machine that provides the
migration functionality you are looking for, though at the additional cost
of a virtual machine interposition layer.

-- Josh

On Fri, Jan 20, 2012 at 8:31 AM, Rodrigo Oliveira  wrote:

> I appreciate your help.
>
> Indeed, it's better to create my own mechanism as mentioned
> Lloyd. Actually my application is a framework to stream processing
> (something like IBM System-S), in which I use Open MPI as communication
> layer and part of process management. One of this framework's features is
> to provide a dynamic load balance mechanism. In some situations I need to
> move processes between machines or temporally suspend their execution. To
> achieve this, I need a checkpoint/restart mechanism. It is the reason of my
> question.
>
> Thanks again.
>
>
> Rodrigo Silva Oliveira
> M.Sc. Student - Computer Science
> Universidade Federal de Minas Gerais
> www.dcc.ufmg.br/~rsilva
>
>
>
>
>
> On Thu, Jan 19, 2012 at 1:18 PM, Lloyd Brown  wrote:
>
>> Since you're looking for a function call, I'm going to assume that you
>> are writing this application, and it's not a pre-compiled, commercial
>> application.  Given that, it's going to be significantly better to have
>> an internal application checkpointing mechanism, where it serializes and
>> stores the data, etc., than to use an external, applicaiton-agnostic
>> checkpointing mechanism like BLCR or similar.  The application should be
>> aware of what data is important, how to most efficiently store it, etc.
>>  A generic library has to assume that everything is important, and store
>> it all.
>>
>> Don't get me wrong.  Libraries like BLCR are great for applications that
>> don't have that visibility, and even as a tool for the
>> application-internal checkpointing mechanism (where the application
>> deliberately interacts with the library to annotate what's important to
>> store, and how to do so, etc.).  But if you're writing the application,
>> you're better off to handle it internally, than externally.
>>
>> Lloyd Brown
>> Systems Administrator
>> Fulton Supercomputing Lab
>> Brigham Young University
>> http://marylou.byu.edu
>>
>> On 01/19/2012 08:05 AM, Josh Hursey wrote:
>> > Currently Open MPI only supports the checkpointing of the whole
>> > application. There has been some work on uncoordinated checkpointing
>> > with message logging, though I do not know the state of that work with
>> > regards to availability. That work has been undertaken by the University
>> > of Tennessee Knoxville, so maybe they can provide more information.
>> >
>> > -- Josh
>> >
>> > On Wed, Jan 18, 2012 at 3:24 PM, Rodrigo Oliveira
>> > mailto:rsilva.olive...@gmail.com>> wrote:
>> >
>> > Hi,
>> >
>> > I'd like to know if there is a way to checkpoint a specific process
>> > running under an mpirun call. In other words, is there a function
>> > CHECKPOINT(rank) in which I can pass the rank of the process I want
>> > to checkpoint? I do not want to checkpoint the entire application,
>> > but just one of its processes.
>> >
>> > Thanks
>> >
>> > ___
>> > users mailing list
>> > us...@open-mpi.org <mailto:us...@open-mpi.org>
>> > 

Re: [OMPI users] Checkpoint an MPI process

2012-01-19 Thread Josh Hursey
Currently Open MPI only supports the checkpointing of the whole
application. There has been some work on uncoordinated checkpointing with
message logging, though I do not know the state of that work with regards
to availability. That work has been undertaken by the University of
Tennessee Knoxville, so maybe they can provide more information.

-- Josh

On Wed, Jan 18, 2012 at 3:24 PM, Rodrigo Oliveira  wrote:

> Hi,
>
> I'd like to know if there is a way to checkpoint a specific process
> running under an mpirun call. In other words, is there a function
> CHECKPOINT(rank) in which I can pass the rank of the process I want to
> checkpoint? I do not want to checkpoint the entire application, but just
> one of its processes.
>
> Thanks
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] checkpointing on other transports

2012-01-17 Thread Josh Hursey
I have not tried to support an MTL with the checkpointing functionality, so
I do not have first-hand experience with those - just the OB1/BML/BTL stack.

The difficulty in porting to a new transport is really a function of how
the transport interacts with the checkpointer (e.g., BLCR). The draining
logic is handled above the PML level (in the CRCP framework), so the MTL
would only have to implement an ft_event() handler. The ft_event() handler
needs to (1) prepare the transport for checkpointing (the channel is known
to be clear at this point, but you may have to handle registered memory and
things like that), (2) continue operation after a checkpoint in the same
process image, and (3) restart the transport on recovery into a new
process image (usually something like reinitializing the driver).

The easiest way to implement these is to shut down the driver on checkpoint
prep (something like a finalize function) and reinitialize it in the
continue/restart phases (something like an init function). Depending on the
transport driver you might be able to do something better (like we do for
tcp and sm), but it is really transport driver specific.
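
To make the shape of that handler concrete, here is a rough, hypothetical
skeleton. The OPAL_CRS_* state names and the header path are written from
memory of the OPAL CRS framework, and the my_mtl_driver_*() calls are
made-up placeholders for whatever the real transport driver provides, so
treat this as a sketch rather than a drop-in template:

  #include "opal/mca/crs/crs.h"  /* assumed header for the OPAL_CRS_* states */

  /* Made-up driver hooks standing in for the real transport driver API. */
  static int my_mtl_driver_finalize(void) { return 0; }
  static int my_mtl_driver_init(void)     { return 0; }

  int my_mtl_component_ft_event(int state)
  {
      if (OPAL_CRS_CHECKPOINT == state) {
          /* (1) Prepare: the channel has already been drained by the CRCP
           *     layer; release registered memory and quiesce the driver. */
          return my_mtl_driver_finalize();
      } else if (OPAL_CRS_CONTINUE == state) {
          /* (2) Continue in the same process image: bring the driver back. */
          return my_mtl_driver_init();
      } else if (OPAL_CRS_RESTART == state) {
          /* (3) Restart in a new process image: full reinitialization. */
          return my_mtl_driver_init();
      }
      return 0;                       /* assumed success code */
  }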

If you decide to dig into this, let me know how it goes and if I can be of
further help.

-- Josh

On Thu, Jan 12, 2012 at 8:16 AM, Dave Love  wrote:

> What would be involved in adding checkpointing to other transports,
> specifically the PSM MTL?  Are there (likely to be?) technical
> obstacles, and would it be a lot of work if not?  I'm asking in case it
> would be easy, and we don't have to exclude QLogic from a procurement,
> given they won't respond about open-mpi support.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>


-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] segfault when resuming on different host

2011-12-29 Thread Josh Hursey
Often this type of problem is due to the 'prelink' option in Linux.
BLCR has a FAQ item that discusses this issue and how to resolve it:
  https://upc-bugs.lbl.gov/blcr/doc/html/FAQ.html#prelink

I would give that a try. If that does not help, then you might want to
try checkpointing a single (non-MPI) process on one node with BLCR and
restarting it on the other node. If that fails, then it is likely a
BLCR/system configuration issue that is the cause. If it does work,
then we can dig more into the Open MPI causes.

Let me know if disabling prelink works for you.

-- Josh

On Thu, Dec 29, 2011 at 1:19 PM, Lloyd Brown  wrote:
> Hi, all.
>
> I'm in the middle of testing some of the checkpoint/restart capabilities
> of OpenMPI with BLCR on our cluster.  I've been able to checkpoint and
> restart successfully when I restart on the same nodes as it was running
> previously.  But when I try to restart on a different host, I always get
> an error like this:
>
>> $ ompi-restart ompi_global_snapshot_15935.ckpt
>> --
>> mpirun noticed that process rank 1 with PID 15201 on node m5stage-1-2.local 
>> exited on signal 11 (Segmentation fault).
>> --
>
>
> Now, it's very possible that I've missed something during the setup, or
> that despite my failure to find it while searching the mailing list,
> that this is already answered somewhere, but none of the threads I could
> find seemed to apply (eg. cr_restart *is* installed, etc.).
>
> I'm attaching a tarball that contains the source code of the very-simple
> test application, as well as some example output of "ompi_info --all"
> and "ompi_info -v ompi full --parsable".  I don't know if this will be
> useful or not.
>
> This is being tested on CentOS v5.4 with BLCR v0.8.4.  I've seen this
> problem with OpenMPI v1.4.2, v1.4.4, and v1.5.4.
>
> If anyone has any ideas on what's going on, or how to best debug this,
> I'd love to hear about it.
>
> I don't mind doing the legwork too, but I'm just stumped where to go
> from here.  I have some core files, but I'm having trouble getting the
> symbols from the backtrace in gdb.  Maybe I'm doing it wrong.
>
>
> TIA,
>
> --
> Lloyd Brown
> Systems Administrator
> Fulton Supercomputing Lab
> Brigham Young University
> http://marylou.byu.edu
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] MPI_COMM_split hanging

2011-12-12 Thread Josh Hursey
For MPI_Comm_split, all processes in the input communicator (oldcomm
or MPI_COMM_WORLD in your case) must call the operation since it is
collective over the input communicator. In your program, rank 0 is not
calling the operation, so MPI_Comm_split is waiting for it to
participate.

If you want rank 0 to be excluded from any of the communicators,
you can give it a special color that is distinct from all other ranks.
Upon return from MPI_Comm_split, rank 0 will be given a new
communicator containing just one process, itself. If you do not
intend to use that communicator you can free it immediately
afterwards.
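
As a sketch based on the pseudo-code in your message (the variable names
mirror yours; the color given to rank 0 and the example groups_count value
are only illustrative assumptions), the split could look like:

  /* All ranks in MPI_COMM_WORLD call MPI_Comm_split.  Rank 0 (the
   * supervisor) passes its own distinct color, ends up alone in its
   * resulting communicator, and frees it immediately. */
  int my_rank, total_proc_count, cores_per_group, my_group, group_rank, color;
  int groups_count = 4;                    /* example value; use your own */
  MPI_Comm my_communicator;

  MPI_Comm_size(MPI_COMM_WORLD, &total_proc_count);
  MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
  cores_per_group = total_proc_count / groups_count;

  if (0 == my_rank) {
      color      = groups_count;           /* distinct from every group */
      group_rank = 0;
  } else {
      my_group   = my_rank / cores_per_group;
      group_rank = my_rank - my_group * cores_per_group;
      color      = my_group;
  }

  MPI_Comm_split(MPI_COMM_WORLD, color, group_rank, &my_communicator);

  if (0 == my_rank) {
      MPI_Comm_free(&my_communicator);     /* supervisor does not need it */
  }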

Hope that helps,
Josh


On Fri, Dec 9, 2011 at 6:52 PM, Gary Gorbet  wrote:
> I am attempting to split my application into multiple master+workers
> groups using MPI_COMM_split. My MPI revision is shown as:
>
> mpirun --tag-output ompi_info -v ompi full --parsable
> [1,0]:package:Open MPI root@build-x86-64 Distribution
> [1,0]:ompi:version:full:1.4.3
> [1,0]:ompi:version:svn:r23834
> [1,0]:ompi:version:release_date:Oct 05, 2010
> [1,0]:orte:version:full:1.4.3
> [1,0]:orte:version:svn:r23834
> [1,0]:orte:version:release_date:Oct 05, 2010
> [1,0]:opal:version:full:1.4.3
> [1,0]:opal:version:svn:r23834
> [1,0]:opal:version:release_date:Oct 05, 2010
> [1,0]:ident:1.4.3
>
> The basic problem I am having is that none of processor instances ever
> returns from the MPI_COMM_split call. I am pretty new to MPI and it is
> likely I am not doing things quite correctly. I'd appreciate some guidance.
>
> I am working with an application that has functioned nicely for a while
> now. It only uses a single MPI_COMM_WORLD communicator. It is standard
> stuff:  a master that hands out tasks to many workers, receives output
> and keeps track of workers that are ready to receive another task. The
> tasks are quite compute-intensive. When running a variation of the
> process that uses Monte Carlo iterations, jobs can exceed the 30 hours
> they are limited to. The MC iterations are independent of each other -
> adding random noise to an input - so I would like to run multiple
> iterations simultaneously so that 4 times the cores runs in a fourth of
> the time. This would entail a supervisor interacting with multiple
> master+workers groups.
>
> I had thought that I would just have to declare a communicator for each
> group so that broadcasts and syncs would work within a single group.
>
>   MPI_Comm_size( MPI_COMM_WORLD, &total_proc_count );
>   MPI_Comm_rank( MPI_COMM_WORLD, &my_rank );
>   ...
>   cores_per_group = total_proc_count / groups_count;
>   my_group = my_rank / cores_per_group;     // e.g., 0, 1, ...
>   group_rank = my_rank - my_group * cores_per_group;  // rank within a
> group
>   if ( my_rank == 0 )    continue;    // Do not create group for supervisor
>   MPI_Comm oldcomm = MPI_COMM_WORLD;
>   MPI_Comm my_communicator;    // Actually declared as a class variable
>   int sstat = MPI_Comm_split( oldcomm, my_group, group_rank,
>         &my_communicator );
>
> There is never a return from the above _split() call. Do I need to do
> something else to set this up? I would have expected perhaps a non-zero
> status return, but not that I would get no return at all. I would
> appreciate any comments or guidance.
>
> - Gary
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] Process Migration

2011-11-10 Thread Josh Hursey
Note that the "migrate me from my current node to node " scenario
is covered by the migration API exported by the C/R infrastructure, as
I noted earlier.
  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_migrate

The "move rank N to node " scenario could probably be added as an
extension of this interface (since you can do that via the command
line now) if that is what you are looking for.

-- Josh

On Thu, Nov 10, 2011 at 11:03 AM, Ralph Castain  wrote:
> So what you are looking for is an MPI extension API that let's you say
> "migrate me from my current node to node "? Or do you have a rank that
> is the "master" that would order "move rank N to node "?
> Either could be provided, I imagine - just want to ensure I understand what
> you need. Can you pass along a brief description of the syntax and
> functionality you would need?
>
> On Nov 10, 2011, at 8:27 AM, Mudassar Majeed wrote:
>
> Thank you for your reply. In our previous publication, we figured
> out that running more than one process per core and balancing the
> computational load considerably reduces the total execution time. You know
> the MPI_Graph_create function; we created another function, MPI_Load_create,
> that maps the processes onto cores such that the computational load is
> balanced across the cores. We were having some issues with an increase in
> communication cost due to rank rearrangement (due to MPI_Comm_split, with
> color=0), so in this research work we will see how we can balance both the
> computational load on each core and the communication load on each node. Those
> processes that communicate more will reside on the same node while keeping the
> computational load balanced over the cores. I solved this problem using ILP,
> but ILP takes time and can't be used at run time, so I am thinking about a
> heuristic. That's why I want to see whether it is possible to migrate a process
> from one core to another. Then I will see how good my heuristic will
> be.
>
> thanks
> Mudassar
>
> 
> From: Jeff Squyres 
> To: Mudassar Majeed ; Open MPI Users
> 
> Cc: Ralph Castain 
> Sent: Thursday, November 10, 2011 2:19 PM
> Subject: Re: [OMPI users] Process Migration
>
> On Nov 10, 2011, at 8:11 AM, Mudassar Majeed wrote:
>
>> Thank you for your reply. I am implementing a load balancing function for
>> MPI, that will balance the computation load and the communication both at a
>> time. So my algorithm assumes that all the cores may at the end get
>> different number of processes to run.
>
> Are you talking about over-subscribing cores?  I.e., putting more than 1 MPI
> process on each core?
>
> In general, that's not a good idea.
>
>> In the beginning (before that function will be called), each core will
>> have equal number of processes. So I am thinking either to start more
>> processes on each core (than needed) and run my function for load balancing
>> and then block the remaining processes (on each core). In this way I will be
>> able to achieve different number of processes per core.
>
> Open MPI spins aggressively looking for network progress.  For example, if
> you block in an MPI_RECV waiting for a message, Open MPI is actively banging
> on the CPU looking for network progress.  Because of this (and other
> reasons), you probably do not want to over-subscribe your processors
> (meaning: you probably don't want to put more than 1 process per core).
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] Process Migration

2011-11-10 Thread Josh Hursey
The MPI standard does not provide explicit support for process
migration. However, some MPI implementations (including Open MPI) have
integrated such support based on checkpoint/restart functionality. For
more information about the checkpoint/restart process migration
functionality in Open MPI see the links below:
  http://osl.iu.edu/research/ft/ompi-cr/
  http://osl.iu.edu/research/ft/ompi-cr/tools.php#ompi-migrate

I even implemented an MPI Extensions API to this functionality so you
can call it from within your application:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_migrate

These pieces of functionality are currently only available in the Open
MPI development trunk.

-- Josh

On Thu, Nov 10, 2011 at 8:19 AM, Jeff Squyres  wrote:
> On Nov 10, 2011, at 8:11 AM, Mudassar Majeed wrote:
>
>> Thank you for your reply. I am implementing a load balancing function for 
>> MPI, that will balance the computation load and the communication both at a 
>> time. So my algorithm assumes that all the cores may at the end get 
>> different number of processes to run.
>
> Are you talking about over-subscribing cores?  I.e., putting more than 1 MPI 
> process on each core?
>
> In general, that's not a good idea.
>
>> In the beginning (before that function will be called), each core will have 
>> equal number of processes. So I am thinking either to start more processes 
>> on each core (than needed) and run my function for load balancing and then 
>> block the remaining processes (on each core). In this way I will be able to 
>> achieve different number of processes per core.
>
> Open MPI spins aggressively looking for network progress.  For example, if 
> you block in an MPI_RECV waiting for a message, Open MPI is actively banging 
> on the CPU looking for network progress.  Because of this (and other 
> reasons), you probably do not want to over-subscribe your processors 
> (meaning: you probably don't want to put more than 1 process per core).
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] configure blcr errors openmpi 1.4.4

2011-10-31 Thread Josh Hursey
I wonder if the try_compile step is failing. Can you send a compressed
copy of your config.log from this build?

-- Josh

On Mon, Oct 31, 2011 at 10:04 AM,   wrote:
> Hi !
>
> I am trying to compile openmpi 1.4.4 with Torque, Infiniband and blcr
> checkpoint support on Puias Linux 6.x (a free derivative of RHEL 6.x).
>
> all packages of blcr  are installed  (including header files blcr-devel)
>
> even torque ..
>
> torque-libs,  openib libraries are found ..
>
> But, when  executing the ./configure script it does not find  the right
> libraries and header files even though they are located  under  the
> specified path.
>
>  ./configure --with-tm=/usr/local/ --with-openib
> --prefix=/usr/mpi/gcc/openmpi-1.4.4  --with-blcr-libdir=/usr/lib64/
> --with-blcr=/usr
>
>
> last few lines of output of   the ./configure run :
>
> "..
> checking for MCA component crs:blcr compile mode... dso
> checking --with-blcr value... sanity check ok (/usr)
> checking --with-blcr-libdir value... sanity check ok (/usr/lib64/)
> configure: WARNING: BLCR support requested but not found.  Perhaps you
> need to specify the location of the BLCR libraries.
> configure: error: Aborting.
> .."
>
>
> [root@gpu01 openmpi-1.4.4]# rpm -ql blcr-devel
> /usr/include/blcr_common.h
> /usr/include/blcr_errcodes.h
> /usr/include/blcr_ioctl.h
> /usr/include/blcr_proc.h
> /usr/include/libcr.h
> /usr/lib64/libcr.so
> /usr/lib64/libcr_omit.so
> /usr/lib64/libcr_run.so
> /usr/share/doc/blcr-devel-0.8.2
> ..."
>
> "..
> [root@gpu01 openmpi-1.4.4]# rpm  -ql blcr-libs
> /usr/lib64/libcr.so.0
> /usr/lib64/libcr.so.0.5.2
> /usr/lib64/libcr_omit.so.0
> /usr/lib64/libcr_omit.so.0.5.2
> /usr/lib64/libcr_run.so.0
> /usr/lib64/libcr_run.so.0.5.2
> /usr/share/doc/blcr-libs-0.8.2
> .."
>
>
> So, how can I set the right options for my requests in the
> configure script? What exactly is it looking for (concerning the BLCR files)?
>
> Any help is appreciated,
>
> Thanks and greetings from Salzburg/Austria/Europe
>
> Vlad Popa
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] Checkpoint from inside MPI program with OpenMPI 1.4.2 ?

2011-10-26 Thread Josh Hursey
Since this would be a new feature, we cannot move it to 1.4 because the
1.4 branch is for bug fixes only. However, we may be able to add it to
1.5. I filed a ticket if you want to track that progress:
  https://svn.open-mpi.org/trac/ompi/ticket/2895

-- Josh


On Tue, Oct 25, 2011 at 11:52 PM, Nguyen Toan  wrote:
> Dear Josh,
> Thank you. I will test the 1.7 trunk as you suggested.
> Also I want to ask if we can add this interface to OpenMPI 1.4.2,
> because my applications are mainly involved in this version.
> Regards,
> Nguyen Toan
> On Wed, Oct 26, 2011 at 3:25 AM, Josh Hursey  wrote:
>>
>> Open MPI (trunk/1.7 - not 1.4 or 1.5) provides an application level
>> interface to request a checkpoint of an application. This API is
>> defined on the following website:
>>  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_checkpoint
>>
>> This will behave the same as if you requested the checkpoint of the
>> job from the command line.
>>
>> -- Josh
>>
>> On Mon, Oct 24, 2011 at 12:37 PM, Nguyen Toan 
>> wrote:
>> > Dear all,
>> > I want to automatically checkpoint an MPI program with OpenMPI ( I'm
>> > currently using 1.4.2 version with BLCR 0.8.2),
>> > not by manually typing ompi-checkpoint command line from another
>> > terminal.
>> > So I would like to know if there is a way to call checkpoint function
>> > from
>> > inside an MPI program
>> > with OpenMPI or how to do that.
>> > Any ideas are very appreciated.
>> > Regards,
>> > Nguyen Toan
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>>
>>
>>
>> --
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] Checkpoint from inside MPI program with OpenMPI 1.4.2 ?

2011-10-25 Thread Josh Hursey
Open MPI (trunk/1.7 - not 1.4 or 1.5) provides an application level
interface to request a checkpoint of an application. This API is
defined on the following website:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_checkpoint

This will behave the same as if you requested the checkpoint of the
job from the command line.

-- Josh

On Mon, Oct 24, 2011 at 12:37 PM, Nguyen Toan  wrote:
> Dear all,
> I want to automatically checkpoint an MPI program with OpenMPI ( I'm
> currently using 1.4.2 version with BLCR 0.8.2),
> not by manually typing ompi-checkpoint command line from another terminal.
> So I would like to know if there is a way to call checkpoint function from
> inside an MPI program
> with OpenMPI or how to do that.
> Any ideas are very appreciated.
> Regards,
> Nguyen Toan
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] Question regarding mpirun options with ompi-restart

2011-10-18 Thread Josh Hursey
That option is only available on the trunk at the moment. I filed a
ticket to move the functionality to the 1.5 branch:
  https://svn.open-mpi.org/trac/ompi/ticket/2890

The workaround would be to take the appfile generated from
"ompi-restart --apponly ompi_snapshot...", and then run mpirun with
your options followed by the generated appfile.
 shell$ mpirun -np 10 -npernode 2 appfile_from_ompi_restart

Not as pretty, but it should work.

-- Josh

On Tue, Oct 18, 2011 at 11:28 AM, Faisal Shahzad  wrote:
> I am using openmpi v1.5.3
> regards,
> Faisal
>
>> Date: Tue, 18 Oct 2011 11:17:33 -0400
>> From: jjhur...@open-mpi.org
>> To: us...@open-mpi.org
>> Subject: Re: [OMPI users] Question regarding mpirun options with
>> ompi-restart
>>
>> That command line option may be only available on the trunk. What
>> version of Open MPI are you using?
>>
>> -- Josh
>>
>> On Tue, Oct 18, 2011 at 11:14 AM, Faisal Shahzad 
>> wrote:
>> > Hi,
>> > Thank you for your reply.
>> > I actually do not see option flag '--mpirun_opts' with 'ompi-restart
>> > --help'.
>> > Besides, I could only find 'mpirun_opts'
>> > in /ompi-trunk/orte/tools/orte-restart/
>> >
>> > https://svn.open-mpi.org/source/search?q=mpirun_opts+&defs=&refs=&path=&hist=
>> > Kind regards,
>> > Faisal
>> >> Date: Tue, 18 Oct 2011 10:01:25 -0400
>> >> From: jjhur...@open-mpi.org
>> >> To: us...@open-mpi.org
>> >> Subject: Re: [OMPI users] Question regarding mpirun options with
>> >> ompi-restart
>> >>
>> >> I'll preface my response with the note that I have not tried any of
>> >> those options with the C/R functionality. It should just work, but I
>> >> am not 100% certain. If it doesn't, let me know and I'll file a bug to
>> >> fix it.
>> >>
>> >> You can pass any mpirun option through ompi-restart by using the
>> >> --mpirun_opts option.
>> >> http://osl.iu.edu/research/ft/ompi-cr/tools.php#ompi-restart
>> >>
>> >> So something like:
>> >> shell$ ompi-restart --mpirun_opts "-npernode 2"
>> >> ompi-global-snapshot-1234
>> >>
>> >> -- Josh
>> >>
>> >> On Tue, Oct 18, 2011 at 7:45 AM, Faisal Shahzad 
>> >> wrote:
>> >> > Dear Group,
>> >> > I am using  openmpi/1.5.3 and using ompi-checkpoint to checkpoint my
>> >> > application. I use some mpirun option flags (-npernode, -npersocket,
>> >> > binding
>> >> > options etc. ) for mpirun. It works fine.
>> >> > My question is that is it possible to specify these mpirun options
>> >> > (-npernode, -npersocket, binding options etc. ) for ompi-restart?
>> >> > I will be thankful for your reply.
>> >> > Kind regards,
>> >> > Faisal
>> >> > ___
>> >> > users mailing list
>> >> > us...@open-mpi.org
>> >> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Joshua Hursey
>> >> Postdoctoral Research Associate
>> >> Oak Ridge National Laboratory
>> >> http://users.nccs.gov/~jjhursey
>> >>
>> >> ___
>> >> users mailing list
>> >> us...@open-mpi.org
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>>
>>
>>
>> --
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] Question regarding mpirun options with ompi-restart

2011-10-18 Thread Josh Hursey
That command line option may be only available on the trunk. What
version of Open MPI are you using?

-- Josh

On Tue, Oct 18, 2011 at 11:14 AM, Faisal Shahzad  wrote:
> Hi,
> Thank you for your reply.
> I actually do not see option flag '--mpirun_opts' with 'ompi-restart
> --help'.
> Besides, I could only find 'mpirun_opts'
> in /ompi-trunk/orte/tools/orte-restart/
> https://svn.open-mpi.org/source/search?q=mpirun_opts+&defs=&refs=&path=&hist=
> Kind regards,
> Faisal
>> Date: Tue, 18 Oct 2011 10:01:25 -0400
>> From: jjhur...@open-mpi.org
>> To: us...@open-mpi.org
>> Subject: Re: [OMPI users] Question regarding mpirun options with
>> ompi-restart
>>
>> I'll preface my response with the note that I have not tried any of
>> those options with the C/R functionality. It should just work, but I
>> am not 100% certain. If it doesn't, let me know and I'll file a bug to
>> fix it.
>>
>> You can pass any mpirun option through ompi-restart by using the
>> --mpirun_opts option.
>> http://osl.iu.edu/research/ft/ompi-cr/tools.php#ompi-restart
>>
>> So something like:
>> shell$ ompi-restart --mpirun_opts "-npernode 2" ompi-global-snapshot-1234
>>
>> -- Josh
>>
>> On Tue, Oct 18, 2011 at 7:45 AM, Faisal Shahzad 
>> wrote:
>> > Dear Group,
>> > I am using  openmpi/1.5.3 and using ompi-checkpoint to checkpoint my
>> > application. I use some mpirun option flags (-npernode, -npersocket,
>> > binding
>> > options etc. ) for mpirun. It works fine.
>> > My question is that is it possible to specify these mpirun options
>> > (-npernode, -npersocket, binding options etc. ) for ompi-restart?
>> > I will be thankful for your reply.
>> > Kind regards,
>> > Faisal
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>>
>>
>>
>> --
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] Question regarding mpirun options with ompi-restart

2011-10-18 Thread Josh Hursey
I'll preface my response with the note that I have not tried any of
those options with the C/R functionality. It should just work, but I
am not 100% certain. If it doesn't, let me know and I'll file a bug to
fix it.

You can pass any mpirun option through ompi-restart by using the
--mpirun_opts option.
  http://osl.iu.edu/research/ft/ompi-cr/tools.php#ompi-restart

So something like:
  shell$ ompi-restart --mpirun_opts "-npernode 2" ompi-global-snapshot-1234

-- Josh

On Tue, Oct 18, 2011 at 7:45 AM, Faisal Shahzad  wrote:
> Dear Group,
> I am using  openmpi/1.5.3 and using ompi-checkpoint to checkpoint my
> application. I use some mpirun option flags (-npernode, -npersocket, binding
> options etc. ) for mpirun. It works fine.
> My question is that is it possible to specify these mpirun options
> (-npernode, -npersocket, binding options etc. ) for ompi-restart?
> I will be thankful for your reply.
> Kind regards,
> Faisal
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] ompi-checkpoint problem on shared storage

2011-09-23 Thread Josh Hursey
It sounds like there is a race happening in the shutdown of the
processes. I wonder if the app is shutting down in a way that mpirun
does not quite like.

I have not tested the C/R functionality in the 1.4 series in a long
time. Can you give it a try with the 1.5 series, and see if there is
any variation? You might also try the trunk, but I have not tested it
recently enough to know if things are still working correctly or not
(have others?).

I'll file a ticket so we do not lose track of the bug. Hopefully we
will get to it soon.
  https://svn.open-mpi.org/trac/ompi/ticket/2872

Thanks,
Josh

On Fri, Sep 23, 2011 at 3:08 PM, Dave Schulz  wrote:
> Hi Everyone.
>
> I've been trying to figure out an issue with ompi-checkpoint/blcr.  The
> symptoms seem to be related to what filesystem the
> snapc_base_global_snapshot_dir is located on.
>
> I wrote a simple mpi program where rank 0 sends to 1, 1 to 2, etc. then the
> highest sends to 0. then it waits 1 sec and repeats.
>
> I'm using openmpi-1.4.3 and when I run "ompi-checkpoint --term
> " on the shared filesystems, the ompi-checkpoint returns a
> checkpoint reference, the worker processes go away, but the mpirun remains
> but is stuck (It dies right away if I run kill on it -- so it's responding
> to SIGTERM).  If I attach an strace to the mpirun, I get the following from
> strace forever:
>
> poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN},
> {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=0, events=POLLIN}], 6,
> 1000) = 0 (Timeout)
> poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN},
> {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=0, events=POLLIN}], 6,
> 1000) = 0 (Timeout)
> poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN},
> {fd=8, events=POLLIN}, {fd=10, events=POLLIN}, {fd=0, events=POLLIN}], 6,
> 1000) = 0 (Timeout)
>
> I'm running with:
> mpirun -machinefile machines -am ft-enable-cr ./mpiloop
> the "machines" file simply has the local hostname listed a few times.  I've
> tried 2 and 8.  I can try up to 24 as this node is a pretty big one if it's
> deemed useful.  Also, there's 256Gb of RAM.  And it's Opteron 6 core, 4
> socket if that helps.
>
>
> I initially installed this on a test system with only local harddisks and
> standard nfs on Centos 5.6 where everything worked as expected.  When I
> moved over to the production system things started breaking.  The filesystem
> is the major software difference.  The shared filesystems are Ibrix and that
> is where the above symptoms started to appear.
>
> I haven't even moved on to multi-node mpi runs as I can't even get this to
> work for any number of processes on the local machine except if I set the
> checkpoint directory to /tmp which is on a local xfs harddisk.  If I put the
> checkpoints on any shared directory, things fail.
>
> I've tried a number of *_verbose mca parameters and none of them seem to
> issue any messages at the point of checkpoint, only when I give-up and send
> kill `pidof mpirun` are there any further messages.
>
> openmpi is compiled with:
> ./configure --prefix=/global/software/openmpi-blcr
> --with-blcr=/global/software/blcr
> --with-blcr-libdir=/global/software/blcr/lib/ --with-ft=cr
> --enable-ft-thread --enable-mpi-threads --with-openib --with-tm
>
> and blcr only has a prefix to put it in /global/software/blcr otherwise it's
> vanilla.  Both are compiled with the default gcc.
>
> One final note is that occasionally it does succeed and terminate.  But it
> seems completely random.
>
> What I'm wondering is has anyone else seen symptoms like this -- especially
> where the mpirun doesn't quit after a checkpoint with --term but the worker
> processes do?
>
> Also, is there some sort of somewhat unusual filesystem semantic that our
> shared filesystem may not support that ompi/ompi-checkpoint is needing?
>
> Thanks for any insights you may have.
>
> -Dave
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] mpiexec option for node failure

2011-09-16 Thread Josh Hursey
Though I do not share George's pessimism about acceptance by the Open
MPI community, it has been slightly difficult to add such a
non-standard feature to the code base for various reasons.

At ORNL, I have been developing a prototype of the Run-Through
Stabilization proposal [2,3] for the MPI Forum Fault Tolerance Working
Group [1]. This would allow the application to continue running and using
MPI functions even though processes fail during execution. We have
been doing some limited alpha releases for some friendly application
developers desiring to play with the prototype for a while now. We are
hoping to do a more public beta release in the coming months. I'll
likely post a message to the ompi-devel list once it is ready.

-- Josh

[1] http://svn.mpi-forum.org/trac/mpi-forum-web/wiki/FaultToleranceWikiPage
[2] See PDF on 
https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ft/run_through_stabilization
[3] See PDF on 
https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ft/run_through_stabilization_2

On Thu, Sep 15, 2011 at 4:14 PM, George Bosilca  wrote:
> Rob,
>
> The Open MPI community did consider such an option, but deemed it 
> uninteresting. However, we (the UTK team) have a patched version supporting 
> several fault tolerant modes, including the one you described in your email. 
> If you are interested please contact me directly.
>
>  Thanks,
>    george.
>
>
> On Sep 12, 2011, at 20:43 , Ralph Castain wrote:
>
>> We don't have anything similar in OMPI. There are fault tolerance modes, but 
>> not like the one you describe.
>>
>> On Sep 12, 2011, at 5:52 PM, Rob Stewart wrote:
>>
>>> Hi,
>>>
>>> I have implemented a simple fault tolerant ping pong C program with MPI, 
>>> here: http://pastebin.com/7mtmQH2q
>>>
>>> MPICH2 offers a parameter with mpiexec:
>>> $ mpiexec -disable-auto-cleanup
>>>
>>> .. as described here: http://trac.mcs.anl.gov/projects/mpich2/ticket/1421
>>>
>>> It is fault tolerant in the respect that, when I ssh to one of the nodes in 
>>> the hosts file, and kill the relevant process, the MPI job is not 
>>> terminated. Simply, the ping will not prompt a pong from the dead node, but 
>>> the ping-pong runs forever on the remaining live nodes.
>>>
>>> Is such an feature available for openMPI, either via mpiexec or some other 
>>> means?
>>>
>>>
>>> --
>>> Rob Stewart
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] Question regarding SELF-checkpointing

2011-08-31 Thread Josh Hursey
That seems like a bug to me.

What version of Open MPI are you using? How have you setup the C/R
functionality (what MCA options do you have set, what command line
options are you using)? Can you send a small reproducing application
that we can test against?

That should help us focus in on the problem a bit.

-- Josh

On Wed, Aug 31, 2011 at 6:36 AM, Faisal Shahzad  wrote:
> Dear Group,
> I have an MPI program in which every process communicates with its
> neighbors. When SELF-checkpointing, every process writes to a separate file.
> The problem is that sometimes, after making a checkpoint, the program does not
> continue again. Having more processes makes this problem more severe.
> With just 1 process (no communication), SELF-checkpointing works normally
> with no problem.
> I have tried different '--mca btl' parameters (openib,tcp,sm,self), but
> problem persists.
> I would very much appreciate your support regarding it.
> Kind regards,
> Faisal
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] Related to project ideas in OpenMPI

2011-08-26 Thread Josh Hursey
There are some great comments in this thread. Process migration (like
many topics in systems) can get complex fast.

The Open MPI process migration implementation is checkpoint/restart
based (currently using BLCR), and uses an 'eager' style of migration.
This style of migration stops a process completely on the source
machine, checkpoints/terminates it, restarts it on the destination
machine, then rejoins it with the other running processes. I think the
only documentation that we have is at the webpage below (and my PhD
thesis, if you want the finer details):
  http://osl.iu.edu/research/ft/ompi-cr/

We have wanted to experiment with a 'pre-copy' or 'live' migration
style, but have not had the necessary support from the underlying
checkpointer or time to devote to making it happen. I think BLCR is
working on including the necessary pieces in a future release (there
are papers where a development version of BLCR has done this with
LAM/MPI). So that might be something of interest.

Process migration techniques can benefit from fault prediction and
'good' target destination selection. Fault prediction allows us to
move processes away from soon-to-fail locations, but it can be
difficult to accurately predict failures. Open MPI has some hooks in
the runtime layer that support 'sensors' which might help here. Good
target destination selection is equally complex, but the idea here is
to move processes to a machine where they can continue supporting the
efficient execution of the application. So this might mean moving to
the least loaded machine, or moving to a machine with other processes
to reduce interprocess communication (something like dynamic load
balancing).

So there are some ideas to get you started.

-- Josh

On Thu, Aug 25, 2011 at 12:06 PM, Rayson Ho  wrote:
> Don't know which SSI project you are referring to... I only know the
> OpenSSI project, and I was one of the first who subscribed to its
> mailing list (since 2001).
>
> http://openssi.org/cgi-bin/view?page=openssi.html
>
> I don't think those OpenSSI clusters are designed for tens of
> thousands of nodes, and not sure if it scales well to even a thousand
> nodes -- so IMO they have limited use for HPC clusters.
>
> Rayson
>
>
>
> On Thu, Aug 25, 2011 at 11:45 AM, Durga Choudhury  wrote:
>> Also, in 2005 there was an attempt to implement SSI (Single System
>> Image) functionality to the then-current 2.6.10 kernel. The proposal
>> was very detailed and covered most of the bases of task creation, PID
>> allocation etc across a loosely tied cluster (without using fancy
>> hardware such as RDMA fabric). Anybody knows if it was ever
>> implemented? Any pointers in this direction?
>>
>> Thanks and regards
>> Durga
>>
>>
>> On Thu, Aug 25, 2011 at 11:08 AM, Rayson Ho  wrote:
>>> Srinivas,
>>>
>>> There's also Kernel-Level Checkpointing vs. User-Level Checkpointing -
>>> if you can checkpoint an MPI task and restart it on a new node, then
>>> this is also "process migration".
>>>
>>> Of course, doing a checkpoint & restart can be slower than pure
>>> in-kernel process migration, but the advantage is that you don't need
>>> any kernel support, and can in fact do all of it in user-space.
>>>
>>> Rayson
>>>
>>>
>>> On Thu, Aug 25, 2011 at 10:26 AM, Ralph Castain  wrote:
 It also depends on what part of migration interests you - are you wanting 
 to look at the MPI part of the problem (reconnecting MPI transports, 
 ensuring messages are not lost, etc.) or the RTE part of the problem 
 (where to restart processes, detecting failures, etc.)?


 On Aug 24, 2011, at 7:04 AM, Jeff Squyres wrote:

> Be aware that process migration is a pretty complex issue.
>
> Josh is probably the best one to answer your question directly, but he's 
> out today.
>
>
> On Aug 24, 2011, at 5:45 AM, srinivas kundaram wrote:
>
>> I am a final-year grad student looking for my final-year project in 
>> OpenMPI. We are a group of 4 students.
>> I wanted to know about the "Process Migration" process of MPI processes 
>> in OpenMPI.
>> Can anyone suggest any ideas for a project related to process migration 
>> in OpenMPI or other topics in systems?
>>
>>
>>
>> regards,
>> Srinivas Kundaram
>> srinu1...@gmail.com
>> +91-8149399160
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


 ___
 users mailing list
 us...@open-mpi.org
 http://www.open-mpi.org/mailman/l

Re: [OMPI users] help regarding SELF checkpointing, c or c++

2011-08-01 Thread Josh Hursey
There should not be any issue is checkpointing a C++ vs C program
using the 'self' checkpointer. The self checkpointer just looks for a
particular function name to be present in the compiled program binary.
Something to try is to run 'nm' on the compiled C++ program and make
sure that the 'self' checkpointing functions are present in the
output.
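
One likely cause of the C vs. C++ difference (not confirmed here) is C++
name mangling hiding those symbols. A hedged sketch of a workaround,
assuming the default 'opal_crs_self_user_*' callback names from the
ompi-cr documentation -- verify the exact names and signatures for your
version:

  /* Sketch only: declare the self-checkpointer callbacks with C linkage so
   * their symbol names survive C++ compilation unmangled. The names and
   * signatures below are assumptions taken from the ompi-cr docs. */
  #ifdef __cplusplus
  extern "C" {
  #endif

  int opal_crs_self_user_checkpoint(char **restart_cmd)
  {
      (void)restart_cmd;            /* out-parameter, left untouched here */
      /* save application state to a file of your choosing */
      return 0;
  }

  int opal_crs_self_user_continue(void)
  {
      /* clean up after a checkpoint when execution simply continues */
      return 0;
  }

  int opal_crs_self_user_restart(void)
  {
      /* rebuild application state when restarting from a checkpoint */
      return 0;
  }

  #ifdef __cplusplus
  }
  #endif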

If the above does not help, please post a small reproducer program and
I can file a ticket and see if someone can take a look.

Thanks,
Josh

On Mon, Aug 1, 2011 at 5:16 AM, Faisal Shahzad  wrote:
> Dear Group,
> My question is: does SELF checkpointing work only with a C program, or also
> with a C++ program?
> I have a simple program written in C. It makes a self checkpoint (runs the
> callback functions) when I compile it with mpicc and checkpoint it during
> the run.
> But when I convert the same program to .cpp, compile it with mpiCC, and
> checkpoint it again, it makes a BLCR checkpoint and not a self checkpoint.
> Thanks in advance.
> Regards,
> Faisal
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey


Re: [OMPI users] [ompi-1.4.2] Infiniband issue on smoky @ ornl

2011-06-23 Thread Josh Hursey
I wonder if this is related to memory pinning. Can you try turning off
leave-pinned and see if the problem persists (this may affect
performance, but should avoid the crash):
  mpirun ... --mca mpi_leave_pinned 0 ...

Also it looks like Smoky has a slightly newer version of the 1.4
branch that you should try to switch to if you can. The following
command will show you all of the available installs on that machine:
  shell$ module avail ompi

For a list of supported compilers for that version try the 'show' option:
shell$ module show ompi/1.4.3
---
/sw/smoky/modulefiles-centos/ompi/1.4.3:

module-whatisThis module configures your environment to make Open
MPI 1.4.3 available.
Supported Compilers:
 pathscale/3.2.99
 pathscale/3.2
 pgi/10.9
 pgi/10.4
 intel/11.1.072
 gcc/4.4.4
 gcc/4.4.3
---

Let me know if that helps.

Josh


On Wed, Jun 22, 2011 at 4:16 AM, Mathieu Gontier
 wrote:
> Dear all,
>
> First of all, all my apologies because I post this message to both the bug
> and user mailing list. But for the moment, I do not know if it is a bug!
>
> I am running a CFD structured flow solver at ORNL, and I have an access to a
> small cluster (Smoky) using OpenMPI-1.4.2 with Infiniband by default.
> Recently we increased the size of our models, and since that time we have
> run into many infiniband related problems.  The most serious problem is a
> hard crash with the following error message:
>
> [smoky45][[60998,1],32][/sw/sources/ompi/1.4.2/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one]
> error creating qp errno says Cannot allocate memory
>
> If we force the solver to use ethernet (mpirun -mca btl ^openib) the
> computation works correctly, although very slowly (a single iteration takes
> ages). Do you have any idea what could be causing these problems?
>
> If it is due to a bug or a limitation in OpenMPI, do you think the version
> 1.4.3, the coming 1.4.4 or any 1.5 version could solve the problem? I read
> the release notes, but I did not see any obvious fix for my problem. The
> system administrator is ready to compile a new package for us, but I do not
> want to ask him to install too many of them.
>
> Thanks.
> --
>
> Mathieu Gontier
> skype: mathieu_gontier
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] Meaning of ./configure --with-ft=LAM option

2011-06-20 Thread Josh Hursey
When we started adding Checkpoint/Restart functionality to Open MPI,
we were hoping to provide a LAM/MPI-like interface to the C/R
functionality. So we added a configure option as a placeholder. The
'LAM' option was intended to help those transitioning from LAM/MPI to
Open MPI. However we never got around to providing the wrapper
functionality for this, so the 'LAM' option in Open MPI defaults to
the Open MPI 'cr' option.

-- Josh

On Mon, Jun 20, 2011 at 4:54 AM, Constantinos Makassikis
 wrote:
> Hi everyone !
>
> I have started playing a little bit with Open MPI 1.5.3 and
> came across the  " --with-ft=LAM " option in the ./configure script.
>
> However, I did not find its meaning anywhere in the doc ...
>
>
> Best regards,
>
> --
> Constantinos
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey



Re: [OMPI users] ompi-restart, ompi-ps problem

2010-07-16 Thread Josh Hursey
(Sorry for the late reply)

On Jun 7, 2010, at 4:48 AM, Nguyen Kim Son wrote:

> Hello,
> 
> I'm trying to get functions like orte-checkpoint, orte-restart, ... to work,
> but there are some errors that I don't have any clue about.
> 
> BLCR (0.8.2) apparently works fine, and I have installed openmpi 1.4.2 from 
> source with the blcr option. 
> The command
> mpirun -np 4  -am ft-enable-cr ./checkpoint_test
> seemed OK, but 
> orte-checkpoint --term PID_of_checkpoint_test (obtained after ps -ef | grep 
> mpirun)
> does not return and shows no errors!

You mean the PID of 'mpirun', right?

Does it checkpoint correctly without the '--term' argument?

Can you try the v1.5 release candidate to see if you have the same problem?
  http://www.open-mpi.org/software/ompi/v1.5/

What MCA parameters do you have set in your environment?

-- Josh
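
For reference, the sequence I had in mind looks roughly like this (the PID
and program name are illustrative):

  shell$ mpirun -np 4 -am ft-enable-cr ./checkpoint_test &
  shell$ ps -ef | grep mpirun              # note the PID of mpirun itself
  shell$ ompi-checkpoint -v 12345          # checkpoint; job keeps running
  shell$ ompi-checkpoint -v --term 12345   # checkpoint, then terminate the job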

> 
> Then, I checked with 
> ompi-ps
> this time, I obtain:
> oob-tcp: Communication retries exceeded.  Can not communicate with peer
> 
> Does anyone has the same problem?
> Any idea is welcomed!
> Thanks,
> Son.
> 
> 
> -- 
> -
> Son NGUYEN KIM  
> Antibes 06600
> Tel: 06 48 28 37 47 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] ompi-restart failed

2010-07-16 Thread Josh Hursey
Open MPI can restart multi-threaded applications on any number of nodes (I do 
this routinely in testing).

If you are still experiencing this problem (sorry for the late reply), can you 
send me the MCA parameters that you are using, command line, and a backtrace 
from the corefile generated by the application?

Those bits of information will help me narrow down what might be going wrong. 
You might also try testing against the v1.5 series or the development trunk to 
make sure that the problem is not just v1.4 specific.

-- Josh

On Jun 14, 2010, at 2:47 AM, Nguyen Toan wrote:

> Hi all,
> I finally figured out the answer. I just put the parameter "-machinefile 
> host" in the "ompi-restart" command and it restarted correctly. So is it 
> unable to restart multi-threaded application on 1 node in OpenMPI?
> 
> Nguyen Toan 
> 
> On Tue, Jun 8, 2010 at 12:07 AM, Nguyen Toan  wrote:
> Sorry, I just want to add 2 more things:
> + I tried configure with and without --enable-ft-thread but nothing changed
> + I also applied this patch for OpenMPI here and reinstalled but I got the 
> same error
> https://svn.open-mpi.org/trac/ompi/raw-attachment/ticket/2139/v1.4-preload-part1.diff
> 
> Somebody helps? Thank you very much.
> 
> Nguyen Toan
> 
> 
> On Mon, Jun 7, 2010 at 11:51 PM, Nguyen Toan  wrote:
> Hello everyone,
> 
> I'm using OpenMPI 1.4.2 with BLCR 0.8.2 to test checkpointing on 2 nodes but 
> it failed to restart (Segmentation fault).
> Here are the details concerning my problem:
> 
> + OS: Centos 5.4
> + OpenMPI configure:
> ./configure --with-ft=cr --enable-ft-thread --enable-mpi-threads \
> --with-blcr=/home/nguyen/opt/blcr 
> --with-blcr-libdir=/home/nguyen/opt/blcr/lib \
> --prefix=/home/nguyen/opt/openmpi \
> --enable-mpirun-prefix-by-default
> + mpirun -am ft-enable-cr -machinefile host ./test
> 
> I checkpointed the test program using "ompi-checkpoint -v -s PID" and the 
> checkpoint file was created successfully. However it failed to restart using 
> ompi-restart:
> "mpirun noticed that process rank 0 with PID 21242 on node rc014.local exited 
> on signal 11 (Segmentation fault)"
> 
> Did I miss something in the installation of OpenMPI?
>  
> Regards,
> Nguyen Toan
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] How to checkpoint atomic function in OpenMPI

2010-07-16 Thread Josh Hursey

On Jun 14, 2010, at 5:26 AM, Nguyen Toan wrote:

> Hi all,
> I have a MPI program as follows:
> ---
> int main(){
>MPI_Init();
>..
>for (i=0; i<1; i++) {
>   my_atomic_func();
>}
>...
>MPI_Finalize();
>return 0;
> }
> 
> 
> The runtime of this program is mostly spent in the loop, and 
> my_atomic_func() takes a fairly long time. 
> Here I want my_atomic_func() to execute atomically, but the checkpoint 
> (taken by running the ompi-checkpoint command) may land in the middle of 
> a my_atomic_func() operation, and hence ompi-restart may fail to restart 
> correctly.
> 
> So my question is:
> + At the checkpoint time (executing ompi-checkpoint), is there a way to let 
> OpenMPI wait until my_atomic_func()  finishes its operation?

We do not currently have an external function to declare a critical section 
during which a checkpoint should not be taken. I filed a ticket to make one 
available. The link is below if you would like to follow its progress:
  https://svn.open-mpi.org/trac/ompi/ticket/2487

I have an MPI Extension interface for C/R that I will be bringing into the 
trunk in the next few weeks. I should be able to extend it to include this 
feature. But I can't promise a deadline, just that I will update the ticket 
when it is available.

In the meantime, you might try using the BLCR interface to define critical 
sections. If you are using the C/R thread then this may work (though I have not 
tried it):
  cr_enter_cs()
  cr_leave_cs()
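
A rough sketch of that idea (assuming BLCR's libcr interface; compile
against the BLCR headers and link with -lcr):

  /* Sketch: hold off checkpoints while my_atomic_func() is running by
   * wrapping it in a BLCR critical section. Assumes BLCR's libcr API
   * (cr_init / cr_enter_cs / cr_leave_cs). */
  #include <libcr.h>

  extern void my_atomic_func(void);    /* the application's routine */

  static cr_client_id_t cr_id;

  void my_cr_setup(void)               /* call once after MPI_Init */
  {
      cr_id = cr_init();               /* register this thread with BLCR */
  }

  void my_guarded_call(void)
  {
      cr_enter_cs(cr_id);              /* checkpoint requests are deferred... */
      my_atomic_func();
      cr_leave_cs(cr_id);              /* ...and may be serviced again here  */
  }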

> + How does ompi-checkpoint operate to checkpoint MPI threads? 

We depend on the Checkpoint/Restart Service (e.g., BLCR) to capture the whole 
process image including all threads. So BLCR will capture the state of all 
threads when we take the process checkpoint.

-- Josh

> 
> Regards,
> Nguyen Toan
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Question on checkpoint overhead in Open MPI

2010-07-16 Thread Josh Hursey
The amount of checkpoint overhead is application and system configuration 
specific. So it is impossible to give you a good answer to how much checkpoint 
overhead to expect for your application and system setup.

BLCR is only used to capture the single process image. The coordination of the 
distributed checkpoint includes:
 - the time to initiate the checkpoint,
 - the time to marshall the network (we currently use an all-to-all bookmark 
exchange, similar to what LAM/MPI used),
 - Store the local checkpoints to stable storage,
 - Verify that all of the local checkpoints have been stored successfully, and
 - Return the handle to the end user.

The bulk of the time is spent saving the local checkpoints (a.k.a. snapshots) 
to stable storage. By default Open MPI saves directly to a globally mounted 
storage device. So the application is stalled until the checkpoint is complete 
(checkpoint overhead = checkpoint latency). You can also enable checkpoint 
staging, in which the application saves the checkpoint to a local disk, after 
which the local daemon stages the file back to stable storage while the 
application continues execution (checkpoint overhead << checkpoint latency).

If you are concerned with scaling, definitely look at the staging technique.
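
As a rough illustration of the two storage modes (the parameter names below
are taken from the ompi-cr documentation and should be treated as
assumptions; verify them with 'ompi_info --param snapc all' and
'ompi_info --param crs all' on your install):

  # ~/.openmpi/mca-params.conf  (illustrative values)
  # Default behavior: write snapshots directly to globally mounted storage
  snapc_base_global_snapshot_dir=/global/scratch/ckpt
  # Staged behavior: write to local disk first, then drain to stable storage
  snapc_base_store_in_place=0
  crs_base_snapshot_dir=/tmp/ckpt-local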

Does that help?

-- Josh

On Jul 7, 2010, at 12:25 PM, Nguyen Toan wrote:

> Hello everyone,
> I have a question concerning the checkpoint overhead in Open MPI, which is 
> the difference between the runtime of the application with and 
> without checkpointing.
> I observe that when the data size and the number of processes increase, the 
> runtime of BLCR is very small compared to the overall checkpoint overhead in 
> Open MPI. Is this because of the increase in coordination time for the checkpoint? 
> And what is included in the overall checkpoint overhead besides BLCR's 
> checkpoint overhead and the coordination time?
> Thank you.
> 
> Best Regards,
> Nguyen Toan
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Some Questions on Building OMPI on Linux Em64t

2010-05-26 Thread Josh Hursey

(Sorry for the delay, I missed the C/R question in the mail)

On May 25, 2010, at 9:35 AM, Jeff Squyres wrote:


On May 24, 2010, at 2:02 PM, Michael E. Thomadakis wrote:

| > 2) I have installed blcr V0.8.2 but when I try to build OMPI and I point
| > to the full installation it complains it cannot find it. Note that I build
| > BLCR with GCC but I am building OMPI with Intel compilers (V11.1)
|
| Can you be more specific here?

I pointed to the installation path for BLCR but configure complained that it
couldn't find it. If BLCR is only needed for checkpoint/restart then we can
live without it. Is BLCR needed for suspend/resume of MPI jobs?


You mean suspend with ctrl-Z?  If so, correct -- BLCR is *only* used  
for checkpoint/restart.  Ctrl-Z just uses the SIGTSTP functionality.


So BLCR is used for the checkpoint/restart functionality in Open MPI.  
We have a webpage with some more details and examples at the link below:

  http://osl.iu.edu/research/ft/ompi-cr/

You should be able to suspend/resume an Open MPI job using SIGSTOP/ 
SIGCONT without the C/R functionality. We have an FAQ item that talks  
about how to enable this functionality:

  http://www.open-mpi.org/faq/?category=running#suspend-resume
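
For example (the orte_forward_job_control parameter is the one that FAQ  
entry describes, to the best of my recollection; confirm it for your  
version):

  shell$ mpirun -np 4 -mca orte_forward_job_control 1 ./my_app &
  shell$ kill -TSTP <PID_of_mpirun>     # suspend all processes in the job
  shell$ kill -CONT <PID_of_mpirun>     # resume them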

You can combine the C/R and the SIGSTOP/SIGCONT functionality so that  
when you 'suspend' a job a checkpoint is taken and the process is  
stopped. You can continue the job by sending SIGCONT as normal.  
Additionally, this way if the job needs to be terminated for some  
reason (e.g., memory footprint, maintenance), it can be safely  
terminated and restarted from the checkpoint. I have a example of how  
this works at the link below:

  http://osl.iu.edu/research/ft/ompi-cr/examples.php#uc-ckpt-stop

As far as C/R integration with schedulers/resource managers, I know  
that the BLCR folks have been working with Torque to better integrate  
Open MPI+BLCR+Torque. If this is of interest, you might want to check  
with them on the progress of that project.


-- Josh



Re: [OMPI users] Using a rankfile for ompi-restart

2010-05-18 Thread Josh Hursey

(Sorry for the delay in replying, more below)

On Apr 8, 2010, at 1:34 PM, Fernando Lemos wrote:


Hello,


I've noticed that ompi-restart doesn't support the --rankfile option.
It only supports --hostfile/--machinefile. Is there any reason
--rankfile isn't supported?

Suppose you have a cluster without a shared file system. When one node
fails, you transfer its checkpoint to a spare node and invoke
ompi-restart. In 1.5, ompi-restart automagically handles this
situation (if you supply a hostfile) and is able to restart the
process, but I'm afraid it might not always be able to find the
checkpoints this way. If you could specify to ompi-restart where the
ranks are (and thus where the checkpoints are), then maybe restart
would always work (as long as you've specified the location of
the checkpoints correctly), or maybe ompi-restart would be faster.


We can easily add the --rankfile option to ompi-restart. I filed a  
ticket to add this option, and assess if there are other options that  
we should pass along (e.g., --npernode, --byhost). I should be able to  
fix this in the next week or so, but the ticket is linked below so you  
can follow the progress.

  https://svn.open-mpi.org/trac/ompi/ticket/2413

Most of the ompi-restart parameters are passed directly to the mpirun  
command. ompi-restart is mostly a wrapper around mpirun that is able  
to parse the metadata and create the appcontext file. I wonder if a  
more general parameter like '--mpirun-args ...' might make sense so  
users don't have to wait on me to expose the interface they need.


Dunno. What do you think? Should I create a '--mpirun-args' option,  
duplicate all of the mpirun command line parameters, or some  
combination of the two?


-- Josh



Regards,
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] OpenMPI Checkpoint/Restart is failed

2010-05-18 Thread Josh Hursey

(Sorry for the delay in replying, more below)

On Apr 12, 2010, at 6:36 AM, Hideyuki Jitsumoto wrote:


Hi Members,

I tried to use checkpoint/restart with Open MPI,
but I cannot get correct checkpoint data.
I prepared the execution environment as follows; the strings in () are the
names of the output files attached to the next e-mail (due to the mail size
limitation):

1. installed BLCR and checked BLCR is working correctly by "make  
check"

2. executed ./configure with some parameters on openMPI source dir
(config.output / config.log)
3. executed make and make install (make.output.2 / install.output.2)
4. confirmed that mca_crs_blcr.[la|so], mca_crs_self.[la|so] on
/${INSTALL_DIR}/lib/openmpi
5. make ~/.openmpi/mca-params.conf (mca-params.conf)
6. compiled NPB and executed with -am ft-enable-cr
7. invoked ompi-checkpoint 

As a result, I got the message "Checkpoint failed: no processes  
checkpointed."

(cr_test_cg)


It is unclear from the output what caused the checkpoint to fail. Can  
you turn on some verbose arguments and send me the output?


Put the following options in your ~/.openmpi/mca-params.conf:
#---
orte_debug_daemons=1
snapc_full_verbose=20
crs_base_verbose=10
opal_cr_verbose=10
#---




In addition, when I checked the open_info output, as in your demo movie, I  
got
"MCA crs: none (MCA v2.0, API v2.0, Component  
v1.4.1)" (open_info.output)


This is actually a known bug with ompi_info. I have a fix in the works  
for it, and it should be available soon. Until then, the ticket is linked  
below:

  https://svn.open-mpi.org/trac/ompi/ticket/2097



What should I do to get checkpointing working?
Any guidance in this regard would be highly appreciated.


Let's see what the verbose output tells us, and go from there. What  
version of BLCR are you using?


-- Josh



Thank you,
Hideyuki

--
Sincerely Yours,
Hideyuki Jitsumoto (jitum...@gsic.titech.ac.jp)
Tokyo Institute of Technology
Global Scientific Information and Computing center (Matsuoka Lab.)
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] (no subject)

2010-05-18 Thread Josh Hursey
The checkpoint operation itself is not tied to CPU  
utilization. Are you running with the C/R thread enabled? If not, then  
the checkpoint might be waiting until the process enters the MPI  
library.
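
If it is not enabled, a hedged reminder of how it is typically turned on  
(the configure flags appear elsewhere in these threads; the  
opal_cr_use_thread parameter name is an assumption you can verify with  
ompi_info):

  shell$ ./configure --with-ft=cr --enable-ft-thread --enable-mpi-threads ...
  shell$ mpirun -np 4 -am ft-enable-cr -mca opal_cr_use_thread 1 ./my_app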


Does the system emit an error message describing the error that it  
encountered?


The C/R support does require that all processes be between MPI_INIT  
and MPI_FINALIZE. It is difficult to guarantee that the job is between  
these two functions globally (there are race conditions to worry  
about). This might be causing the problem as well since if some of the  
processes have not passed through MPI_INIT then some of the support  
services might not be properly initialized.


Let me know what you find, and we can start looking at what might be  
causing this problem.


-- Josh

On May 11, 2010, at 5:35 PM,  wrote:


Hi

I am using Open MPI 1.3.4 with BLCR. Sometimes I run into a  
strange problem with the ompi-checkpoint command. Even though I see that  
all MPI processes (equal to the np argument) are running, the ompi- 
checkpoint command fails at times. I have always seen this failure  
when the spawned MPI processes are not fully running, i.e., these  
processes are not running above 90% CPU utilization. How do I ensure  
that the MPI processes are fully up and running before I issue ompi- 
checkpoint? Dynamically detecting whether the processes are  
utilizing above 90% of the CPU is not easy.


Are there any MCA parameters I can use to overcome this issue?

Thanks
Ananda

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] opal_cr_tmp_dir

2010-05-18 Thread Josh Hursey
When you defined them in your environment, did you prefix them with  
'OMPI_MCA_'? Open MPI looks for this prefix to identify which  
parameters are intended for it specifically.
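
For example, using the directories mentioned earlier in this thread (the  
snapshot handle is illustrative):

  shell$ export OMPI_MCA_opal_cr_tmp_dir=/home/ananda/OPAL
  shell$ export OMPI_MCA_orte_tmpdir_base=/home/ananda/ORTE
  shell$ ompi-restart ompi-global-snapshot-1234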


-- Josh

On May 12, 2010, at 11:09 PM,  > wrote:



Ralph

Defining these parameters in my environment also did not resolve the  
problem. Whenever I restart my program, the temporary files are  
getting stored in the default /tmp directory instead of the  
directory I had defined.


Thanks

Ananda

=

Subject: Re: [OMPI users] opal_cr_tmp_dir
From: Ralph Castain (rhc_at_[hidden])
Date: 2010-05-12 19:48:16

Define them in your environment prior to executing any of those  
commands.


On May 12, 2010, at 4:43 PM,  wrote:

> Ralph
>
> When you say manually, do you mean setting these parameters in the  
command line while calling mpirun, ompi-restart, and ompi- 
checkpoint? Or is there another way to set these parameters?

>
> Thanks
>
> Ananda
>
> ==
>
> Subject: Re: [OMPI users] opal_cr_tmp_dir
> From: Ralph Castain (rhc_at_[hidden])
> Date: 2010-05-12 18:09:17
>
> You shouldn't have to, but there may be a bug in the system. Try  
manually setting both envars and see if it fixes the problem.

>
> On May 12, 2010, at 3:59 PM,  wrote:
>
> > Ralph
> >
> > I have these parameters set in ~/.openmpi/mca-params.conf file
> >
> > $ cat ~/.openmpi/mca-params.conf
> >
> > orte_tmpdir_base = /home/ananda/ORTE
> >
> > opal_cr_tmp_dir = /home/ananda/OPAL
> >
> > $
> >
> >
> >
> > Should I be setting OMPI_MCA_opal_cr_tmp_dir?
> >
> >
> >
> > FYI, I am using openmpi 1.3.4 with blcr 0.8.2
> >
> >
> > Thanks
> >
> > Ananda
> >
> > =
> >
> > Subject: Re: [OMPI users] opal_cr_tmp_dir
> > From: Ralph Castain (rhc_at_[hidden])
> > Date: 2010-05-12 16:47:16
> >
> > ompi-restart just does a fork/exec of the mpirun, so it should  
get the param if it is in your environ. How are you setting it? Have  
you tried adding OMPI_MCA_opal_cr_tmp_dir= to your  
environment?

> >
> > On May 12, 2010, at 12:45 PM,  wrote:
> >
> > > Thanks Ralph.
> > >
> > > Another question. Even though I am setting opal_cr_tmp_dir to  
a directory other than /tmp while calling ompi-restart command, this  
setting is not getting passed to the mpirun command that gets  
generated by ompi-restart. How do I overcome this constraint?

> > >
> > >
> > >
> > > Thanks
> > >
> > > Ananda
> > >
> > > ==
> > >
> > > Subject: Re: [OMPI users] opal_cr_tmp_dir
> > > From: Ralph Castain (rhc_at_[hidden])
> > > Date: 2010-05-12 14:38:00
> > >
> > > Previous message: ananda.mudar_at_[hidden]: "[OMPI users]  
opal_cr_tmp_dir"
> > > In reply to: ananda.mudar_at_[hidden]: "[OMPI users]  
opal_cr_tmp_dir"

> > > It's a different MCA param: orte_tmpdir_base
> > >
> > > On May 12, 2010, at 12:33 PM,  wrote:
> > >
> > > > I am setting the MCA parameter “opal_cr_tmp_dir” to a  
directory other than /tmp while calling “mpirun”, “ompi-restart”,  
and “ompi-checkpoint” commands so that I don’t fill up /tmp  
filesystem. But I see that openmpi-sessions* directory is still  
getting created under /tmp. How do I overcome this problem so that  
openmpi-sessions* directory also gets created under the same  
directory I have defined for “opal_cr_tmp_dir”?

> > > >
> > > > Is there a way to clean up these temporary files after they are no longer needed?

> > > >
> > > > Thanks
> > > > Ananda
> > > >
> > > > ___
> > > > users mailing list
> > > > users_at_[hidden]
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users

Re: [OMPI users] ompi-restart fails with "found pid in use"

2010-05-18 Thread Josh Hursey
So I recently hit this same problem while doing some scalability  
testing. I experimented with adding the --no-restore-pid option, but  
found the same problem as you mention. Unfortunately, the problem is  
with BLCR, not Open MPI.


BLCR will restart the process with a new PID, but the value returned  
from getpid() is the old PID, not the new one. So when we connect the  
daemon and the newly restarted process they are exchanging an invalid  
PID. This eventually leads to ompi-checkpoint waiting for a PID to  
respond that may not exist on the machine.


I am working on a bug report for BLCR at the moment. Once it is fixed  
on that side, then I would be happy to add a -no-restore-pid like  
option to the Open MPI C/R system.


-- Josh

On May 14, 2010, at 11:34 AM,  > wrote:



Hi

I am using open mpi v1.3.4 with BLCR 0.8.2. I have been testing my openmpi based program on a 3-node cluster (each node is an Intel Nehalem based dual quad core) and I have been able to checkpoint and restart the program successfully multiple times.


Recently I moved to a 15 node cluster with the same configuration  
and I started seeing the problem with ompi-restart.


Ompi-checkpoint completes successfully and I terminate the program after that. I have ensured that there are no MPI processes left before I restart. When I restart using ompi-restart, I get an error while restarting a few of the MPI processes; the error message is “found pid 4185 in use; Restart failed: Device or Resource busy” (of course with different pid numbers). What I found was that when an MPI process was restarted, it got restarted on a different node than the one it was running on before termination, and it could not reuse its pid.


Unlike cr_restart (BLCR), ompi-restart doesn’t have an option, such as “--no-restore-pid”, to say not to reuse the same pid. Since ompi-restart in turn calls cr_restart, I tried to alias cr_restart to “cr_restart --no-restore-pid”. This actually made the “pid in use” problem go away and the process completes successfully. However, if I call ompi-checkpoint on the restarted Open MPI job, both the Open MPI job (all MPI processes) and the checkpoint command hang forever. I guess this is because ompi-restart now has a different set of pids compared to the actual pids that are running.


Long story short, I am stuck with this problem as I cannot get the  
original pids during restart.


I really appreciate if you have any other options to share with me  
which I can use to overcome this problem.


Thanks
Ananda

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] Hibernating/Wakeup MPI processes

2010-04-13 Thread Josh Hursey
So what you are looking for is checkpoint/restart support, which you  
can find some details about at the link below:

  http://osl.iu.edu/research/ft/ompi-cr/

Additionally, we relatively recently added the ability to checkpoint  
and 'stop' the application. This generates a usable checkpoint of the  
application then sends SIGSTOP. The processes can be continued with  
'SIGCONT', but they could also be killed (or otherwise removed from  
the system) and then later restarted from the checkpoint. Some details  
on this feature are at the link below:

  http://osl.iu.edu/research/ft/ompi-cr/examples.php#uc-ckpt-stop
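
A hedged sketch of that workflow (the exact option name is documented on the page above; the --stop flag shown here is an assumption, and <PID> stands for the PID of mpirun):

  shell$ ompi-checkpoint --stop -v <PID>
  # later: resume the stopped processes in place ...
  shell$ kill -CONT <application PIDs>
  # ... or kill the job and restart it from the saved checkpoint
  shell$ ompi-restart ompi_global_snapshot_<PID>.ckpt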

-- Josh

On Apr 13, 2010, at 10:28 AM, Ralph Castain wrote:

I believe that is called "checkpoint/restart" - see the FAQ page on  
that subject.


On Apr 13, 2010, at 7:30 AM, Hoelzlwimmer Andreas - S0810595005 wrote:


Hi,

I found in the FAQ that it is possible to suspend/resume MPI jobs.  
Would it also be possible to Hibernate the jobs (free the memory,  
serialize it to the hard drive) and continue/wake them up later,  
possibly at different locations?


cheers,
Andreas

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Segmentation fault (11)

2010-03-29 Thread Josh Hursey
I wonder if this is a bug with BLCR (since the segv stack is in the  
BLCR thread). Can you try an non-MPI version of this application that  
uses popen(), and see if BLCR properly checkpoints/restarts it?


If so, we can start to see what Open MPI might be doing to confuse  
things, but I suspect that this might be a bug with BLCR. Either way  
let us know what you find out.
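
A minimal sketch of such a BLCR-only test (popen_test.c here is assumed to be your popen() code with the MPI calls removed; context.<PID> is BLCR's default checkpoint file name):

  shell$ gcc popen_test.c -o popen_test
  shell$ cr_run ./popen_test &
  shell$ cr_checkpoint <PID of popen_test>
  shell$ cr_restart context.<PID of popen_test>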


Cheers,
Josh

On Mar 27, 2010, at 6:17 AM, jody wrote:


I'm not sure if this is the cause of your problems:
You define the constant BUFFER_SIZE, but in the code you use a  
constant called BUFSIZ...

Jody


On Fri, Mar 26, 2010 at 10:29 PM, Jean Potsam  
 wrote:

Dear All,
   I am having a problem with openmpi. I have installed openmpi 1.4 and blcr 0.8.1.


I have written a small mpi application as follows below:

###
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include
#include 
#include 

#define BUFFER_SIZE PIPE_BUF

char * getprocessid()
{
FILE * read_fp;
char buffer[BUFSIZ + 1];
int chars_read;
char * buffer_data="12345";
memset(buffer, '\0', sizeof(buffer));
  read_fp = popen("uname -a", "r");
 /*
  ...
 */
 return buffer_data;
}

int main(int argc, char ** argv)
{
  MPI_Status status;
 int rank;
   int size;
char * thedata;
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD,&size);
MPI_Comm_rank(MPI_COMM_WORLD,&rank);
 thedata=getprocessid();
 printf(" the data is %s", thedata);
MPI_Finalize();
}


I get the following result:

###
jean@sunn32:~$ mpicc pipetest2.c -o pipetest2
jean@sunn32:~$ mpirun -np 1 -am ft-enable-cr -mca btl ^openib   
pipetest2

[sun32:19211] *** Process received signal ***
[sun32:19211] Signal: Segmentation fault (11)
[sun32:19211] Signal code: Address not mapped (1)
[sun32:19211] Failing at address: 0x4
[sun32:19211] [ 0] [0xb7f3c40c]
[sun32:19211] [ 1] /lib/libc.so.6(cfree+0x3b) [0xb796868b]
[sun32:19211] [ 2] /usr/local/blcr/lib/libcr.so.0(cri_info_free 
+0x2a) [0xb7a5925a]

[sun32:19211] [ 3] /usr/local/blcr/lib/libcr.so.0 [0xb7a5ac72]
[sun32:19211] [ 4] /lib/libc.so.6(__libc_fork+0x186) [0xb7991266]
[sun32:19211] [ 5] /lib/libc.so.6(_IO_proc_open+0x7e) [0xb7958b6e]
[sun32:19211] [ 6] /lib/libc.so.6(popen+0x6c) [0xb7958dfc]
[sun32:19211] [ 7] pipetest2(getprocessid+0x42) [0x8048836]
[sun32:19211] [ 8] pipetest2(main+0x4d) [0x8048897]
[sun32:19211] [ 9] /lib/libc.so.6(__libc_start_main+0xe5) [0xb7912455]
[sun32:19211] [10] pipetest2 [0x8048761]
[sun32:19211] *** End of error message ***
#


However, if I compile the application using gcc, it works fine. The
problem arises with:

  read_fp = popen("uname -a", "r");

Does anyone has an idea how to resolve this problem?

Many thanks

Jean


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] questions about checkpoint/restart on multiple clusters of MPI

2010-03-29 Thread Josh Hursey


On Mar 29, 2010, at 11:53 AM, fengguang tian wrote:


hi

I have used the --term option, but mpirun is still hanging. It is the same whether I include the '/' or not. I am installing v1.4 to see whether the problems are still there. I tried, but some problems remain.


What configure options did you use when building Open MPI?



BTW, my MPI program reads some input files and generates some output files after some computation. It can be checkpointed, but when I restart it, some errors happen. Have you met this kind of problem?


Try putting the 'snapc_base_global_snapshot_dir' in the $HOME/.openmpi/ 
mca-params.conf file instead of just on the command line. Like:

snapc_base_global_snapshot_dir=/shared-dir/

I suspect that ompi-restart is looking in the wrong place for your  
checkpoint. By default it will search $HOME (since that is the default  
for snapc_base_global_snapshot_dir). If you put this parameter in the  
mca-params.conf file, then it is always set in any tool (mpirun/ompi- 
checkpoint/ompi-restart) to the specified value. So ompi-restart will  
search the correct location for the checkpoint files.


-- Josh



cheers
fengguang

On Mon, Mar 29, 2010 at 11:42 AM, Josh Hursey mpi.org> wrote:


On Mar 23, 2010, at 1:00 PM, Fernando Lemos wrote:

On Tue, Mar 23, 2010 at 12:55 PM, fengguang tian  
 wrote:


I use mpirun -np 50 -am ft-enable-cr --mca  
snapc_base_global_snapshot_dir

--hostfile .mpihostfile 
to store the global checkpoint snapshot into the shared
directory:/mirror,but the problems are still there,
when ompi-checkpoint, the mpirun is still not killed,it is hanging
there.

So the 'ompi-checkpoint' command does not finish? By default 'ompi- 
checkpoint' does not terminate the MPI job. If you pass the '--term'  
option to it, then it will.
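
For example (a minimal sketch; <PID> is the PID of the mpirun process, and the snapshot name is printed by ompi-checkpoint when it completes):

  shell$ ompi-checkpoint --term <PID>
  shell$ ompi-restart ompi_global_snapshot_<PID>.ckpt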




when doing ompi-restart, it shows:

mpiu@nimbus:/mirror$ ompi-restart ompi_global_snapshot_333.ckpt/
--
Error: The filename (ompi_global_snapshot_333.ckpt/) is invalid  
because

either you have not provided a filename
  or provided an invalid filename.
  Please see --help for usage.

--


Try removing the trailing '/' in the command. The current ompi- 
restart is not good about differentiating between :


 ompi_global_snapshot_333.ckpt
and

 ompi_global_snapshot_333.ckpt/


Have you tried OpenMPI 1.5? I got it to work with 1.5, but not with
1.4 (but then I didn't try 1.4 with a shared filesystem).

I would also suggest trying v1.4 or 1.5 to see if your problems  
persist with these versions.


-- Josh




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] question about checkpoint on cluster, mpirun doesn't work on cluster

2010-03-29 Thread Josh Hursey
Does this happen when you run without '-am ft-enable-cr' (so a no-C/R  
run)?


This will help us determine if your problem is with the C/R work or  
with the ORTE runtime. I suspect that there is something odd with your  
system that is confusing the runtime (so not a C/R problem).


Have you made sure to remove the previous versions of Open MPI from  
all machines on your cluster, before installing the new version?  
Sometimes problems like this come up because of mismatches in Open MPI  
versions on a machine.


-- Josh

On Mar 23, 2010, at 5:42 PM, fengguang tian wrote:


I met the same problem as described at this link: http://www.open-mpi.org/community/lists/users/2009/12/11374.php

In the link, they give a solution: use v1.4 open mpi instead of v1.3 open mpi. But I am using v1.7a1r22794 open mpi and met the same problem.

here is what I have done:
My cluster is composed of two machines: nimbus (master) and nimbus1 (slave). When I run mpirun -np 40 -am ft-enable-cr --hostfile .mpihostfile myapplication on nimbus, it doesn't work; it shows:

[nimbus1:21387] opal_os_dirpath_create: Error: Unable to create the  
sub-directory (/tmp/openmpi-sessions-mpiu@nimbus1_0/59759) of (/tmp/ 
openmpi-sessions-mpiu@nimbus1_0/59759/0/1), mkdir failed [1]
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: Error in file util/ 
session_dir.c at line 106
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: Error in file util/ 
session_dir.c at line 399
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: Error in file base/ 
ess_base_std_orted.c at line 301
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: A message is  
attempting to be sent to a process whose contact information is  
unknown in file rml_oob_send.c at line 104
[nimbus1:21387] [[59759,0],1] could not get route to  
[[INVALID],INVALID]
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: A message is  
attempting to be sent to a process whose contact information is  
unknown in file util/show_help.c at line 602
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: Error in file  
ess_env_module.c at line 143
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: A message is  
attempting to be sent to a process whose contact information is  
unknown in file rml_oob_send.c at line 104
[nimbus1:21387] [[59759,0],1] could not get route to  
[[INVALID],INVALID]
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: A message is  
attempting to be sent to a process whose contact information is  
unknown in file util/show_help.c at line 602
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: Error in file runtime/ 
orte_init.c at line 129
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: A message is  
attempting to be sent to a process whose contact information is  
unknown in file rml_oob_send.c at line 104
[nimbus1:21387] [[59759,0],1] could not get route to  
[[INVALID],INVALID]
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: A message is  
attempting to be sent to a process whose contact information is  
unknown in file util/show_help.c at line 602
[nimbus1:21387] [[59759,0],1] ORTE_ERROR_LOG: Error in file orted/ 
orted_main.c at line 355

--
A daemon (pid 10737) died unexpectedly with status 255 while  
attempting

to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed  
shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to  
have the

location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--


cheers
fengguang
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] questions about checkpoint/restart on multiple clusters of MPI

2010-03-29 Thread Josh Hursey


On Mar 23, 2010, at 1:00 PM, Fernando Lemos wrote:

On Tue, Mar 23, 2010 at 12:55 PM, fengguang tian  
 wrote:


I use mpirun -np 50 -am ft-enable-cr --mca  
snapc_base_global_snapshot_dir

--hostfile .mpihostfile 
to store the global checkpoint snapshot into the shared
directory:/mirror,but the problems are still there,
when ompi-checkpoint, the mpirun is still not killed,it is hanging
there.


So the 'ompi-checkpoint' command does not finish? By default 'ompi- 
checkpoint' does not terminate the MPI job. If you pass the '--term'  
option to it, then it will.




when doing ompi-restart, it shows:

mpiu@nimbus:/mirror$ ompi-restart ompi_global_snapshot_333.ckpt/
--
Error: The filename (ompi_global_snapshot_333.ckpt/) is invalid  
because

either you have not provided a filename
   or provided an invalid filename.
   Please see --help for usage.

--




Try removing the trailing '/' in the command. The current ompi-restart  
is not good about differentiating between :

  ompi_global_snapshot_333.ckpt
and
  ompi_global_snapshot_333.ckpt/



Have you tried OpenMPI 1.5? I got it to work with 1.5, but not with
1.4 (but then I didn't try 1.4 with a shared filesystem).


I would also suggest trying v1.4 or 1.5 to see if your problems  
persist with these versions.


-- Josh




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Meaning and the significance of MCA parameter "opal_cr_use_thread"

2010-03-29 Thread Josh Hursey

So the MCA parameter that you mention is explained at the link below:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-opal_cr_use_thread

This enables/disables the C/R thread a runtime if Open MPI was  
configured with C/R thread support:

  http://osl.iu.edu/research/ft/ompi-cr/api.php#conf-enable-ft-thread

The C/R thread enables asynchronous processing of checkpoint requests  
when the application process is not inside the MPI library. The  
purpose of this thread is to improve the responsiveness of the  
checkpoint operation. Without the thread, if the application is in a  
computation loop then the checkpoint will be delayed until the process  
enters the MPI library. With the thread enabled, the checkpoint will  
start in the C/R thread if the application is not in the MPI library.


The primary advantages of the C/R thread are:
 - Response time to the C/R request since the checkpoint is not  
delayed until the process enters the MPI library,
 - Asynchronous processing of the checkpoint while the application is  
executing outside the MPI library (improves the checkpoint overhead  
experienced by the process).


The primary disadvantage of the C/R thread is the additional  
processing task running in parallel with the application. If the C/R  
thread is polling too often it could slow down the main process by  
forcing frequent context switches between the C/R thread and the main  
execution thread. You can adjust the aggressiveness by adjusting the  
parameters at the link below:

  http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-opal_cr_thread_sleep_check
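
For example, a hedged sketch of throttling the thread via $HOME/.openmpi/mca-params.conf (1000 is only the value suggested elsewhere in this archive, not a tuned recommendation):

  # make the C/R thread poll less aggressively
  opal_cr_thread_sleep_wait=1000
  # or disable the C/R thread at runtime altogether
  # opal_cr_use_thread=0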

-- Josh

On Mar 24, 2010, at 11:24 AM,  > wrote:


The description for MCA parameter “opal_cr_use_thread” is very short  
at URL:  http://osl.iu.edu/research/ft/ompi-cr/api.php


Can someone explain the usefulness of enabling this parameter vs  
disabling it? In other words, what are pros/cons of disabling it?


 I found that this gets enabled automatically when the openmpi library is configured with the --enable-ft-thread option.


Thanks
Ananda

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] mpirun with -am ft-enable-cr option takes longer time on certain configurations

2010-03-29 Thread Josh Hursey


On Mar 20, 2010, at 11:14 PM,  > wrote:


I am observing a very strange performance issue with my openmpi  
program.


I have a compute-intensive openmpi based application that keeps the data in memory, processes the data and then dumps it to a GPFS parallel file system. The GPFS file system server is connected to a QDR infiniband switch from Voltaire.


If my cluster is connected to a DDR infiniband switch which in turn connects to the file system server on the QDR switch, I see that I can run my application under checkpoint/restart control (with -am ft-enable-cr), I can checkpoint (ompi-checkpoint) successfully, and the application completes after a few additional seconds.


If my cluster is connected to the same QDR switch that connects to the file system server, I see that my application takes close to 10x the time to complete if I run it under checkpoint/restart control (with -am ft-enable-cr). If I run the same application using a plain mpirun command (ie; without -am ft-enable-cr), it finishes within a minute.


The 10x slowdown is without taking a checkpoint, correct? If the  
checkpoint is taking up part of the bandwidth through the same switch  
you are communicating with, then you will see diminished performance  
until the checkpoint is fully established on the storage device(s).  
Many installations separate the communication and storage networks (or limit the bandwidth of one of them) to prevent one from unexpectedly diminishing the performance of the other, even outside of the C/R context.


However for a non-checkpointing run to be 10x slower is certainly not  
normal. Try playing with the C/R thread parameters (mentioned in a  
previous email) and see if that helps. If not, we might be able to try  
other things.


-- Josh 





I am using open mpi 1.3.4 and BLCR 0.8.2 for checkpointing

Are there any specific MCA parameters that I should tune to address  
this problem? Any other pointers will be really helpful.


Thanks
Anand

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] mpirun with -am ft-enable-cr option runs slow if hyperthreading is disabled

2010-03-29 Thread Josh Hursey


On Mar 22, 2010, at 4:41 PM,  > wrote:



Hi

If I run my compute-intensive openmpi based program using a regular invocation of mpirun (ie; mpirun --host  -np cores>), it gets completed in a few seconds, but if I run the same program with the “-am ft-enable-cr” option, the program takes 10x the time to complete.


If I enable hyperthreading on my cluster nodes and then call mpirun with the “-am ft-enable-cr” option, the program completes with only a few additional seconds compared to the normal mpirun!!


How can I improve the performance of mpirun with “-am ft-enable-cr”  
option when I disable hyperthreading on my cluster nodes? Any  
pointers will be really useful.


FYI, I am using openmpi 1.3.4 library and BLCR 0.8.2. Cluster nodes  
are Nehalem based nodes with  8 cores.


I have not done any performance studies focused on hyperthreading, so  
I can not say specifically what is happening.


The 10x slowdown is certainly unexpected (I don't see this in my  
testing). There usually is a small slowdown (few microseconds) because  
of the message tracking technique used to support the checkpoint  
coordination protocol. I suspect that the cause of your problem is the  
C/R thread which is probably too aggressive for your system. The  
improvement with hyperthreading may be that this thread is able to sit  
on one of the hardware threads and not completely steal the CPU from  
the main application.


You can change how aggressive the thread is by adjusting the two  
parameters below:

 http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-opal_cr_thread_sleep_check
 http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-opal_cr_thread_sleep_wait

I usually set the latter to:
 opal_cr_thread_sleep_wait=1000

Give that a try and let me know if that helps. You might also try to
upgrade to the 1.4 series, or even the upcoming v1.5.0 release and see  
if the problem persists there.
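
The parameter can also be passed on the mpirun command line; a minimal sketch (./app stands in for your application):

  shell$ mpirun -np 8 -am ft-enable-cr --mca opal_cr_thread_sleep_wait 1000 ./app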


-- Josh



Thanks
Anand

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] top command output shows huge CPU utilization when openmpi processes resume after the checkpoint

2010-03-29 Thread Josh Hursey


On Mar 21, 2010, at 12:58 PM, Addepalli, Srirangam V wrote:


Yes, we have seen this behavior too.

Another behavior I have seen is that one MPI process starts to  
show different elapsed time than its peers. Is it because  
checkpoint happened on behalf of this process?


R

From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On  
Behalf Of ananda.mu...@wipro.com [ananda.mu...@wipro.com]

Sent: Saturday, March 20, 2010 10:18 PM
To: us...@open-mpi.org
Subject: [OMPI users] top command output shows huge CPU utilization  
whenopenmpi processes resume after the checkpoint


When I checkpoint my openmpi application using ompi-checkpoint, I see that the top command suddenly shows some really huge numbers in the "CPU %" field, such as 150%, 200%, etc. After some time, these numbers come back to normal values under 100%. This happens exactly around the time the checkpoint is completed and the processes are resuming execution.


One cause for this type of CPU utilization is due to the C/R thread.  
During non-checkpoint/normal processing the thread is polling for a  
checkpoint fairly aggressively. You can change how aggressive the  
thread is by adjusting the two parameters below:

 http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-opal_cr_thread_sleep_check
 http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-opal_cr_thread_sleep_wait

I usually set the latter to:
 opal_cr_thread_sleep_wait=1000

You can also turn off the C/R thread, either by configure'ing without  
it, or disabling it at runtime by setting the 'opal_cr_use_thread'  
parameter to '0':

 http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-opal_cr_use_thread
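
For example, a minimal sketch of disabling it at runtime (./app stands in for your application):

  shell$ mpirun -np 8 -am ft-enable-cr --mca opal_cr_use_thread 0 ./app

or, equivalently, in $HOME/.openmpi/mca-params.conf:

  opal_cr_use_thread=0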


The CPU increase during the checkpoint may be due to both the Open MPI  
C/R thread, and the BLCR thread becoming active on the machine. You  
might try to determine whether this is BLCR's CPU utilization or Open  
MPI's by creating a single process application and watching the CPU  
utilization when checkpointing with BLCR. You may also want to look at  
the memory consumption of the process to make sure that there is  
enough for BLCR to run efficiently.


This may also be due to processes finished with the checkpoint waiting  
on other peer processes to finish. I don't think we have a good way to  
control how aggressively these waiting processes poll for completion  
of peers. If this becomes a problem we can look into adding a  
parameter similar to opal_cr_thread_sleep_wait to throttle the polling  
on the machine.


The disadvantage of making the various polling-for-completion loops less aggressive is that the checkpoint and/or application may stall for a little longer than necessary. But if this is acceptable to the user, then they can adjust the MCA parameters as necessary.




Another behavior I have seen is that one MPI process starts to show  
different elapsed time than its peers. Is it because checkpoint  
happened on behalf of this process?


Can you explain a bit more about what you mean by this? Neither Open  
MPI nor BLCR messes with the timer on the machine, so we are not  
changing it in any way. The process is 'stopped' briefly while BLCR  
takes the checkpoint, so this will extend the running time of the  
process. How much the running time is extended (a.k.a. checkpoint  
overhead) is determined by a bunch of things, but primarily by the  
storage device(s) that the checkpoint is being written to.




For your reference, I am using open mpi 1.3.4 and BLCR 0.8.2 for  
checkpointing.


It would be interesting to know if you see the same behavior with the  
trunk or v1.5 series of Open MPI.


Hope that helps,
Josh



Thanks
Anand



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] can torque support openmpi checkpoint?

2010-03-18 Thread Josh Hursey
I have not been working with the integration of Open MPI and Torque  
directly, so I cannot state how well this is supported. However, the  
BLCR folks have been working on a Torque/Open MPI/BLCR project for a  
while now, and have had some success. You might want to raise the  
question on the BLCR mailing list and see if they can give you an  
update on that project.


-- Josh

On Mar 10, 2010, at 2:58 AM, 马少杰 wrote:





2010-03-10
马少杰
Dear Sir:
  I can use openmpi with blcr to save checkpoints and restart my mpi applications. Now I want Torque to also support openmpi C/R. I know that Torque can support serial programs through blcr. Can Torque also support openmpi C/R? How should I do it?

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] change hosts to restart the checkpoint

2010-03-05 Thread Josh Hursey
This type of failure is usually due to prelink'ing being left enabled  
on one or more of the systems. This has come up multiple times on the  
Open MPI list, but is actually a problem between BLCR and the Linux  
kernel. BLCR has a FAQ entry on this that you will want to check out:

  https://upc-bugs.lbl.gov//blcr/doc/html/FAQ.html#prelink

If that does not work, then we can look into other causes.
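
For reference, a hedged sketch of the usual prelink workaround described in that FAQ (assuming a Red Hat style system; run it on every node, and treat the BLCR FAQ as the authoritative steps):

  # in /etc/sysconfig/prelink set:
  #   PRELINKING=no
  # then undo any prelinking already applied to the installed libraries:
  shell$ sudo /usr/sbin/prelink --undo --all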

-- Josh

On Mar 5, 2010, at 3:06 AM, 马少杰 wrote:





2010-03-05
马少杰
Dear Sir:
   I want to use openmpi and blcr to checkpoint. However, I want to restart the checkpoint on other hosts. For example, I run an mpi program using openmpi on host1 and host2, and I save the checkpoint file at an nfs shared path. Then I want to restart the job (ompi-restart -machinefile ma ompi_global_snapshot_15865.ckpt) on host3 and host4. The 4 hosts have the same hardware and software. If I change the hostnames (host3 and host4) in the machinefile, the following error always occurs:

 [node182:27278] *** Process received signal ***
[node182:27278] Signal: Segmentation fault (11)
[node182:27278] Signal code: Address not mapped (1)
[node182:27278] Failing at address: 0x3b81009530
[node182:27275] *** Process received signal ***
[node182:27275] Signal: Segmentation fault (11)
[node182:27275] Signal code: Address not mapped (1)
[node182:27275] Failing at address: 0x3b81009530
[node182:27274] *** Process received signal ***
[node182:27274] Signal: Segmentation fault (11)
[node182:27274] Signal code: Address not mapped (1)
[node182:27274] Failing at address: 0x3b81009530
[node182:27276] *** Process received signal ***
[node182:27276] Signal: Segmentation fault (11)
[node182:27276] Signal code: Address not mapped (1)
[node182:27276] Failing at address: 0x3b81009530
--
mpirun noticed that process rank 9 with PID 27973 on node node183  
exited on signal 11 (Segmentation fault).


  If I change the hostnames back to host1 and host2, it can restart successfully.


 my openmpi version is 1.3.4
 ./configure  --with-ft=cr --enable-mpi-threads --enable-ft-thread -- 
with-blcr=$dir --with-blcr-libdir=/$dir/lib --prefix=$dir_ompi -- 
enable-mpirun-prefix-by-default


 the command to run the mpi program is
mpirun -np 8 --am ft-enable-cr --mca opal_cr_use_thread 0  - 
machinefile ma ./cpi


vim $HOME/.openmpi/mca-params.conf
crs_base_snapshot_dir=/tmp/cr
snapc_base_global_snapshot_dir=/disk/cr


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] orte-checkpoint hangs

2010-02-25 Thread Josh Hursey

On Feb 10, 2010, at 9:45 AM, Addepalli, Srirangam V wrote:

> I am trying to test orte-checkpoint with an MPI job. It however hangs for all 
> jobs.  This is how the job is started:
> mpirun -np 8 -mca ft-enable cr /apps/nwchem-5.1.1/bin/LINUX64/nwchem 
> siosi6.nw 

This might be the problem, if it wasn't a typo. The command line flag is "-am 
ft-enable-cr" not "-mca ft-enable cr". The former activates a set of MCA 
parameters (in the AMCA file 'ft-enable-cr'). The latter should be ignored by 
the MCA system.

Give that a try and let us know if the behavior changes.
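
For example, the corrected launch line for the command quoted above would be:

  shell$ mpirun -np 8 -am ft-enable-cr /apps/nwchem-5.1.1/bin/LINUX64/nwchem siosi6.nw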

-- Josh

>> From another terminal i try the orte-checkpoint
> 
> ompi-checkpoint -v --term 9338
> [compute-19-12.local:09377] orte_checkpoint: Checkpointing...
> [compute-19-12.local:09377]  PID 9338
> [compute-19-12.local:09377]  Connected to Mpirun [[5009,0],0]
> [compute-19-12.local:09377]  Terminating after checkpoint
> [compute-19-12.local:09377] orte_checkpoint: notify_hnp: Contact Head Node 
> Process PID 9338
> [compute-19-12.local:09377] orte_checkpoint: notify_hnp: Requested a 
> checkpoint of jobid [INVALID]
> [compute-19-12.local:09377] orte_checkpoint: hnp_receiver: Receive a command 
> message.
> [compute-19-12.local:09377] orte_checkpoint: hnp_receiver: Status Update.
> 
> 
> Is there any way to debug the issue to get more information or log messages.
> 
> Rangam
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Torque+BCLR+OpenMPI

2010-02-25 Thread Josh Hursey
Anton,

I don't know if there usual or typical way of initiating a checkpoint amongst 
various resource managers. I know that the BLCR folks (I believe Eric Roman is 
heading this effort - CC'ed) have been investigating a tighter integration of 
Open MPI, BLCR and Torque. He might be able to give you a bit more guidance on 
this topic.

-- Josh

On Feb 10, 2010, at 11:54 PM, Anton Starikov wrote:

> Hi!
> I'm trying to implement checkpointing on our cluster, and I have an obvious 
> question.
> 
> I guess this was implemented many times by other users, so I would like it if 
> someone could share their experience with me.
> 
> With serial/multithreaded jobs it is kind of clear. But what about parallel ones?
> 
> We have "fat" 16-core nodes, so users run both OpenMP and MPI programs.
> 
> Shall I just perform some checks in my checkpointing script and call 
> ompi-checkpoint if, after those checks, I figure out that there is an MPI job?
> 
> What is the "usual" way?
> 
> Best,
> 
> Anton
> 
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Checkpoint/Restart error

2010-02-01 Thread Josh Hursey
Thanks for the bug report. There are a couple of places in the code  
that, in a sense, hard code '/tmp' as the temporary directory. It  
shouldn't be to hard to fix since there is a common function used in  
the code to discovery the 'true' temporary directory (which defaults  
to /tmp). Of course there might be other complexities once I dig into  
the problem.


I don't know when I will get to this, but I filed a ticket about this  
if you want to track it:

  https://svn.open-mpi.org/trac/ompi/ticket/2208

Thanks again,
Josh

On Jan 29, 2010, at 4:41 PM, Jazcek Braden wrote:


Josh,

I was following this thread as I had similar symptoms and discovered a
peculiar error. When I launch the program, openmpi follows the
$TMPDIR environment variable and puts the session information in the
$TMPDIR directory.  However ompi-checkpoint seems to be requiring the
sessions file to be in /tmp ignoring the $TMPDIR.  Is this by design
and what would I have to do to get it to obey the $TMPDIR environment
variable.

--
Jazcek Braden
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] checkpointing multi node and multi process applications

2010-01-25 Thread Josh Hursey
Actually, let me roll that back a bit. I was preparing a custom patch  
for the v1.4 series, and it seems that the code does not have the bug  
I mentioned. It is only the v1.5 and trunk that were effected by this.  
The v1.4 series should be fine.


I will still ask that the error message fix be brought over to the  
v1.4 branch, but it is unlikely to fix your problem. However it would  
be useful to know if upgrading to the trunk or v1.5 series fixes this  
problem. The v1.4 series has an old version of the file and metadata  
handling mechanisms, so I am encouraging people to move to the v1.5  
series if possible.


-- Josh

On Jan 25, 2010, at 3:33 PM, Josh Hursey wrote:

So while working on the error message, I noticed that the global  
coordinator was using the wrong path to investigate the checkpoint  
metadata. This particular section of code is not often used (which  
is probably why I could not reproduce). I just committed a fix to  
the Open MPI development trunk:

 https://svn.open-mpi.org/trac/ompi/changeset/22479

Additionally, I am asking for this to be brought over to the v1.4  
and v1.5 release branches:

 https://svn.open-mpi.org/trac/ompi/ticket/2195
 https://svn.open-mpi.org/trac/ompi/ticket/2196

It seems to solve the problem as I could reproduce it. Can you try  
the trunk (either SVN checkout or nightly tarball from tonight) and  
check if this solves your problem?


Cheers,
Josh

On Jan 25, 2010, at 12:14 PM, Josh Hursey wrote:

I am not able to reproduce this problem with the 1.4 branch using a  
hostfile, and node configuration like you mentioned.


I suspect that the error is caused by a failed local checkpoint.  
The error message is triggered when the global coordinator (located  
in 'mpirun') tries to read the metadata written by the application  
in the local snapshot. If the global coordinator cannot properly  
read the metadata, then it will print a variety of error messages  
depending on what is going wrong.


If these are the only two errors produced, then this typically  
means that the local metadata file has been found, but is empty/ 
corrupted. Can you send me the contents of the local checkpoint  
metadata file:
shell$ cat GLOBAL_SNAPSHOT_DIR/ompi_global_snapshot_YYY.ckpt/0/ 
opal_snapshot_0.ckpt/snapshot_meta.data


It should look something like:
-
#
# PID: 23915
# Component: blcr
# CONTEXT: ompi_blcr_context.23915
-

It may also help to see the following metadata file as well:
shell$ cat GLOBAL_SNAPSHOT_DIR/ompi_global_snapshot_YYY.ckpt/ 
global_snapshot_meta.data



If there are other errors printed by the process, that would  
potentially indicate a different problem. So if there are, let me  
know.


This error message should be a bit more specific about which  
process checkpoint is causing the problem, and what this usually indicates. I filed a bug to clean up the error:

https://svn.open-mpi.org/trac/ompi/ticket/2190

-- Josh

On Jan 21, 2010, at 8:27 AM, Jean Potsam wrote:


Hi Josh/all,

I have upgraded the openmpi to v 1.4  but still get the same error  
when I try executing the application on multiple nodes:


***
Error: expected_component: PID information unavailable!
Error: expected_component: Component Name information unavailable!
***

I am running my application from the node 'portal11' as follows:

mpirun -am ft-enable-cr -np 2 --hostfile hosts  myapp.

The file 'hosts' contains two host names: portal10, portal11.

I am triggering the checkpoint using ompi-checkpoint -v 'PID' from  
portal11.



I configured open mpi as follows:

#

./configure --prefix=/home/jean/openmpi/ --enable-picky --enable- 
debug --enable-mpi-profile --enable-mpi-cxx --enable-pretty-print- 
stacktrace --enable-binaries --enable-trace --enable-static=yes -- 
enable-debug --with-devel-headers=1 --with-mpi-param-check=always  
--with-ft=cr --enable-ft-thread --with-blcr=/usr/local/blcr/ -- 
with-blcr-libdir=/usr/local/blcr/lib --enable-mpi-threads=yes

#

Question:

what do you think can be wrong? Please instruct me on how to  
resolve this problem.


Thank you

Jean




--- On Mon, 11/1/10, Josh Hursey  wrote:

From: Josh Hursey 
Subject: Re: [OMPI users] checkpointing multi node and multi  
process applications

To: "Open MPI Users" 
Date: Monday, 11 January, 2010, 21:42


On Dec 19, 2009, at 7:42 AM, Jean Potsam wrote:

> Hi Everyone,
>I am trying to checkpoint an mpi  
application running on multiple nodes. However, I get some error  
messages when i trigger the checkpointing process.

>
> Error: expected_component: PID information unavailable!
> Error: expected_component: Component Name information unavailable!
>
> I am using  open mpi 1.3 and blcr 0.8.1

Can you try the v1.4 release and see if the problem persists?

>
> I ex

Re: [OMPI users] checkpointing multi node and multi process applications

2010-01-25 Thread Josh Hursey
So while working on the error message, I noticed that the global  
coordinator was using the wrong path to investigate the checkpoint  
metadata. This particular section of code is not often used (which is  
probably why I could not reproduce). I just committed a fix to the  
Open MPI development trunk:

  https://svn.open-mpi.org/trac/ompi/changeset/22479

Additionally, I am asking for this to be brought over to the v1.4 and  
v1.5 release branches:

  https://svn.open-mpi.org/trac/ompi/ticket/2195
  https://svn.open-mpi.org/trac/ompi/ticket/2196

It seems to solve the problem as I could reproduce it. Can you try the  
trunk (either SVN checkout or nightly tarball from tonight) and check  
if this solves your problem?


Cheers,
Josh

On Jan 25, 2010, at 12:14 PM, Josh Hursey wrote:

I am not able to reproduce this problem with the 1.4 branch using a  
hostfile, and node configuration like you mentioned.


I suspect that the error is caused by a failed local checkpoint. The  
error message is triggered when the global coordinator (located in  
'mpirun') tries to read the metadata written by the application in  
the local snapshot. If the global coordinator cannot properly read  
the metadata, then it will print a variety of error messages  
depending on what is going wrong.


If these are the only two errors produced, then this typically means  
that the local metadata file has been found, but is empty/corrupted.  
Can you send me the contents of the local checkpoint metadata file:
 shell$ cat GLOBAL_SNAPSHOT_DIR/ompi_global_snapshot_YYY.ckpt/0/ 
opal_snapshot_0.ckpt/snapshot_meta.data


It should look something like:
-
#
# PID: 23915
# Component: blcr
# CONTEXT: ompi_blcr_context.23915
-

It may also help to see the following metadata file as well:
shell$ cat GLOBAL_SNAPSHOT_DIR/ompi_global_snapshot_YYY.ckpt/ 
global_snapshot_meta.data



If there are other errors printed by the process, that would  
potentially indicate a different problem. So if there are, let me  
know.


This error message should be a bit more specific about which process  
checkpoint is causing the problem, and what this usually indicates. I filed a bug to clean up the error:

 https://svn.open-mpi.org/trac/ompi/ticket/2190

-- Josh

On Jan 21, 2010, at 8:27 AM, Jean Potsam wrote:


Hi Josh/all,

I have upgraded the openmpi to v 1.4  but still get the same error  
when I try executing the application on multiple nodes:


***
Error: expected_component: PID information unavailable!
Error: expected_component: Component Name information unavailable!
***

I am running my application from the node 'portal11' as follows:

mpirun -am ft-enable-cr -np 2 --hostfile hosts  myapp.

The file 'hosts' contains two host names: portal10, portal11.

I am triggering the checkpoint using ompi-checkpoint -v 'PID' from  
portal11.



I configured open mpi as follows:

#

./configure --prefix=/home/jean/openmpi/ --enable-picky --enable- 
debug --enable-mpi-profile --enable-mpi-cxx --enable-pretty-print- 
stacktrace --enable-binaries --enable-trace --enable-static=yes -- 
enable-debug --with-devel-headers=1 --with-mpi-param-check=always -- 
with-ft=cr --enable-ft-thread --with-blcr=/usr/local/blcr/ --with- 
blcr-libdir=/usr/local/blcr/lib --enable-mpi-threads=yes

#

Question:

what do you think can be wrong? Please instruct me on how to  
resolve this problem.


Thank you

Jean




--- On Mon, 11/1/10, Josh Hursey  wrote:

From: Josh Hursey 
Subject: Re: [OMPI users] checkpointing multi node and multi  
process applications

To: "Open MPI Users" 
Date: Monday, 11 January, 2010, 21:42


On Dec 19, 2009, at 7:42 AM, Jean Potsam wrote:

> Hi Everyone,
>I am trying to checkpoint an mpi  
application running on multiple nodes. However, I get some error  
messages when i trigger the checkpointing process.

>
> Error: expected_component: PID information unavailable!
> Error: expected_component: Component Name information unavailable!
>
> I am using  open mpi 1.3 and blcr 0.8.1

Can you try the v1.4 release and see if the problem persists?

>
> I execute my application as follows:
>
> mpirun -am ft-enable-cr -np 3 --hostfile hosts gol.
>
> My question:
>
> Does openmpi with blcr support checkpointing of multi node  
execution of mpi application? If so, can you provide me with some  
information on how to achieve this.


Open MPI is able to checkpoint a multi-node application (that's  
what it was designed to do). There are some examples at the link  
below:

 http://www.osl.iu.edu/research/ft/ompi-cr/examples.php

-- Josh

>
> Cheers,
>
> Jean.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/li

Re: [OMPI users] checkpointing multi node and multi process applications

2010-01-25 Thread Josh Hursey
I am not able to reproduce this problem with the 1.4 branch using a  
hostfile, and node configuration like you mentioned.


I suspect that the error is caused by a failed local checkpoint. The  
error message is triggered when the global coordinator (located in  
'mpirun') tries to read the metadata written by the application in the  
local snapshot. If the global coordinator cannot properly read the  
metadata, then it will print a variety of error messages depending on  
what is going wrong.


If these are the only two errors produced, then this typically means  
that the local metadata file has been found, but is empty/corrupted.  
Can you send me the contents of the local checkpoint metadata file:
  shell$ cat GLOBAL_SNAPSHOT_DIR/ompi_global_snapshot_YYY.ckpt/0/ 
opal_snapshot_0.ckpt/snapshot_meta.data


It should look something like:
-
#
# PID: 23915
# Component: blcr
# CONTEXT: ompi_blcr_context.23915
-

It may also help to see the following metadata file as well:
 shell$ cat GLOBAL_SNAPSHOT_DIR/ompi_global_snapshot_YYY.ckpt/ 
global_snapshot_meta.data



If there are other errors printed by the process, that would  
potentially indicate a different problem. So if there are, let me know.


This error message should be a bit more specific about which process  
checkpoint is causing the problem, and what this usually indicates. I filed a bug to clean up the error:

  https://svn.open-mpi.org/trac/ompi/ticket/2190

-- Josh

On Jan 21, 2010, at 8:27 AM, Jean Potsam wrote:


Hi Josh/all,

I have upgraded the openmpi to v 1.4  but still get the same error  
when I try executing the application on multiple nodes:


***
 Error: expected_component: PID information unavailable!
 Error: expected_component: Component Name information unavailable!
***

I am running my application from the node 'portal11' as follows:

mpirun -am ft-enable-cr -np 2 --hostfile hosts  myapp.

The file 'hosts' contains two host names: portal10, portal11.

I am triggering the checkpoint using ompi-checkpoint -v 'PID' from  
portal11.



I configured open mpi as follows:

#

./configure --prefix=/home/jean/openmpi/ --enable-picky --enable- 
debug --enable-mpi-profile --enable-mpi-cxx --enable-pretty-print- 
stacktrace --enable-binaries --enable-trace --enable-static=yes -- 
enable-debug --with-devel-headers=1 --with-mpi-param-check=always -- 
with-ft=cr --enable-ft-thread --with-blcr=/usr/local/blcr/ --with- 
blcr-libdir=/usr/local/blcr/lib --enable-mpi-threads=yes

#

Question:

what do you think can be wrong? Please instruct me on how to resolve  
this problem.


Thank you

Jean




--- On Mon, 11/1/10, Josh Hursey  wrote:

From: Josh Hursey 
Subject: Re: [OMPI users] checkpointing multi node and multi process  
applications

To: "Open MPI Users" 
Date: Monday, 11 January, 2010, 21:42


On Dec 19, 2009, at 7:42 AM, Jean Potsam wrote:

> Hi Everyone,
>I am trying to checkpoint an mpi  
application running on multiple nodes. However, I get some error  
messages when i trigger the checkpointing process.

>
> Error: expected_component: PID information unavailable!
> Error: expected_component: Component Name information unavailable!
>
> I am using  open mpi 1.3 and blcr 0.8.1

Can you try the v1.4 release and see if the problem persists?

>
> I execute my application as follows:
>
> mpirun -am ft-enable-cr -np 3 --hostfile hosts gol.
>
> My question:
>
> Does openmpi with blcr support checkpointing of multi node  
execution of mpi application? If so, can you provide me with some  
information on how to achieve this.


Open MPI is able to checkpoint a multi-node application (that's what  
it was designed to do). There are some examples at the link below:

  http://www.osl.iu.edu/research/ft/ompi-cr/examples.php

-- Josh

>
> Cheers,
>
> Jean.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Checkpoint/Restart error

2010-01-25 Thread Josh Hursey
I tested the 1.4.1 release, and everything worked fine for me (tested  
a few different configurations of nodes/environments).


The ompi-checkpoint error you cited is usually caused by one of two  
things:
 - The PID specified is wrong (which I don't think that is the case  
here)

 - The session directory cannot be found in /tmp.

So I think the problem is the latter. The session directory looks  
something like:

  /tmp/openmpi-sessions-USERNAME@LOCALHOST_0
Within this directory the mpirun process places its contact  
information. ompi-checkpoint uses this contact information to connect  
to the job. If it cannot find it, then it errors out. (We definitely  
need a better error message here. I filed a ticket [1]).


We usually do not recommend running Open MPI as a root user. So I  
would strongly recommend that you do not run as a root user.


With a regular user, check the location of the session directory. Make  
sure that it is in /tmp on the node where 'mpirun' and 'ompi- 
checkpoint' are run.
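
A quick way to check (a hedged sketch, assuming a bash-like shell) is to list the session directories on that node:

  shell$ ls -d /tmp/openmpi-sessions-*

An earlier report in this archive also notes that ompi-checkpoint looks in /tmp even when $TMPDIR points elsewhere, so it is worth confirming where mpirun actually created the directory.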


-- Josh

[1] https://svn.open-mpi.org/trac/ompi/ticket/2189

On Jan 25, 2010, at 5:48 AM, Andreea Costea wrote:


So? anyone? any clue?

Summarize:
- installed OpenMPI 1.4.1 on fresh Centos 5
- mpirun works but ompi-checkpoint throws this error:
ORTE_ERROR_LOG: Not found in file orte-checkpoint.c at line 405
- on another VM I have OpenMPI 1.3.3 installed. Checkpointing works fine as a guest user but gives the previously mentioned error as root. Both
root and guest show the same output after "param -all -all" except  
for the $HOME (which only matters for mca_component_path,  
mca_param_files, snapc_base_global_snapshot_dir)



Thanks,
Andreea


On Tue, Jan 19, 2010 at 9:01 PM, Andreea Costea > wrote:
I noticed one more thing. As I still have some VMs with OpenMPI version 1.3.3 installed, I started to use those machines until I fix the problem with 1.4.1. While checkpointing on one of these VMs, I realized that checkpointing as a guest works fine, but checkpointing as root outputs the same error as in 1.4.1: ORTE_ERROR_LOG: Not found in file orte-checkpoint.c at line 405


I logged the outputs of "ompi_info --param all all" which I run for  
root and for another user and the only differences were at these  
parameters:


mca_component_path
mca_param_files
snapc_base_global_snapshot_dir

All 3 params differ because of the $HOME.
One more thing: I don't have the directory $HOME/.openmpi

Ideas?

Thanks,
Andreea





On Tue, Jan 19, 2010 at 12:51 PM, Andreea Costea > wrote:
Well... I decided to install a fresh OS to be sure that there is no  
OpenMPI version conflict. So I formatted one of my VMs, did a fresh  
CentOS install, installed BLCR 0.8.2 and OpenMPI 1.4.1 and the  
result: the same. mpirun works but ompi-checkpoint has that error at  
line 405:


[[35906,0],0] ORTE_ERROR_LOG: Not found in file orte-checkpoint.c at  
line 405


As for the files remaining after uninstalling: Jeff, you were right. There are no files left, just some empty directories.


What might be the problem with that ORTE_ERROR_LOG error?

Thanks,
Andreea

On Fri, Jan 15, 2010 at 11:47 PM, Andreea Costea > wrote:

It's almost midnight here, so I left home, but I will try it tomorrow.
There were some directories left after "make uninstall". I will give  
more details tomorrow.


Thanks Jeff,
Andreea


On Fri, Jan 15, 2010 at 11:30 PM, Jeff Squyres   
wrote:

On Jan 15, 2010, at 8:07 AM, Andreea Costea wrote:

> - I wanted to update to version 1.4.1 and I uninstalled the previous version like this: make uninstall, and then manually deleted all the leftover files. The directory where I installed was /usr/local


I'll let Josh answer your CR questions, but I did want to ask about  
this point.  AFAIK, "make uninstall" removes *all* Open MPI files.   
For example:


-
[7:25] $ cd /path/to/my/OMPI/tree
[7:25] $ make install > /dev/null
[7:26] $ find /tmp/bogus/ -type f | wc
   646 646   28082
[7:26] $ make uninstall > /dev/null
[7:27] $ find /tmp/bogus/ -type f | wc
 0   0   0
[7:27] $
-

I realize that some *directories* are left in $prefix, but there  
should be no *files* left.  Are you seeing something different?


--
Jeff Squyres
jsquy...@cisco.com


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] checkpointing multi node and multi process applications

2010-01-11 Thread Josh Hursey


On Dec 19, 2009, at 7:42 AM, Jean Potsam wrote:


Hi Everyone,
   I am trying to checkpoint an MPI application
running on multiple nodes. However, I get some error messages when I
trigger the checkpointing process.


Error: expected_component: PID information unavailable!
Error: expected_component: Component Name information unavailable!

I am using  open mpi 1.3 and blcr 0.8.1


Can you try the v1.4 release and see if the problem persists?



I execute my application as follows:

mpirun -am ft-enable-cr -np 3 --hostfile hosts gol.

My question:

Does openmpi with blcr support checkpointing of multi node execution  
of mpi application? If so, can you provide me with some information  
on how to achieve this.


Open MPI is able to checkpoint a multi-node application (that's what  
it was designed to do). There are some examples at the link below:

  http://www.osl.iu.edu/research/ft/ompi-cr/examples.php
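
For reference, the basic multi-node sequence looks roughly like this
(a sketch only; the process count, hostfile and program name are
illustrative):

  mpirun -am ft-enable-cr -np 3 --hostfile hosts ./gol
  ompi-checkpoint <PID of mpirun>
  ompi-restart ompi_global_snapshot_<PID of mpirun>.ckpt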

-- Josh



Cheers,

Jean.

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] checkpoint opempi-1.3.3+sge62

2010-01-11 Thread Josh Hursey
h.po3117822 --app /home/cesga/sdiaz/ 
ompi_global_snapshot_12554.ckpt/restart-appfile
sdiaz13073  0.0  0.0 15988  508 pts/0Sl+  15:58   0:00   
|   \_ cr_restart /home/cesga/sdiaz/ 
ompi_global_snapshot_12554.ckpt/0/opal_snapshot_0.ckpt/ 
ompi_blcr_context.12558
sdiaz12558  0.2  0.0 99464 3616 pts/0Sl+  15:58   0:00   
|   \_ ./pi3



Sergio Díaz escribió:


Hi Josh

Here you go the file.

I will try to apply the trunk, but I think that I broke my
openmpi installation doing "something" and I don't know what :-( .
I was modifying the mca parameters...
When I send a job, the orted daemon spawned on the SLAVE host is
launched in a loop until it consumes all the reserved memory.
It is very strange, so I will compile it again, reproduce
the bug and then test the trunk.


Thanks a lot for the support and tickets opened.
Sergio


sdiaz30279  0.0  0.0  1888  560 ?Ds   12:54
0:00  \_ /opt/cesga/sge62/utilbin/lx24-x86/qrsh_starter /opt/ 
cesga/sge62/default/spool/compute
sdiaz30286  0.0  0.0 52772 1188 ?D12:54
0:00  \_ /bin/bash /opt/cesga/openmpi-1.3.3/bin/orted - 
mca ess env -mca orte_ess_jobid 219
sdiaz30322  0.0  0.0 52772 1188 ?S12:54
0:00  \_ /bin/bash /opt/cesga/openmpi-1.3.3/bin/orted
sdiaz30358  0.0  0.0 52772 1188 ?D12:54
0:00  \_ /bin/bash /opt/cesga/openmpi-1.3.3/bin/ 
orted
sdiaz30394  0.0  0.0 52772 1188 ?D12:54
0:00  \_ /bin/bash /opt/cesga/openmpi-1.3.3/ 
bin/orted
sdiaz30430  0.0  0.0 52772 1188 ?D12:54
0:00  \_ /bin/bash /opt/cesga/ 
openmpi-1.3.3/bin/orted
sdiaz30466  0.0  0.0 52772 1188 ?D12:54
0:00  \_ /bin/bash /opt/cesga/ 
openmpi-1.3.3/bin/orted
sdiaz30502  0.0  0.0 52772 1188 ?D12:54
0:00  \_ /bin/bash /opt/cesga/ 
openmpi-1.3.3/bin/orted
sdiaz30538  0.0  0.0 52772 1188 ?D12:54
0:00  \_ /bin/bash /opt/cesga/ 
openmpi-1.3.3/bin/orted
sdiaz30574  0.0  0.0 52772 1188 ?D12:54
0:00  \_ /bin/bash /opt/ 
cesga/openmpi-1.3.3/bin/orted





Josh Hursey escribió:



On Nov 12, 2009, at 10:54 AM, Sergio Díaz wrote:


Hi Josh,

You were right. The main problem was the /tmp. SGE uses a  
scratch directory in which the jobs have temporary files.  
After setting TMPDIR to /tmp, checkpointing works!
However, when I try to restart it... I get the following error
(see ERROR1). The -v option gives these lines (see ERROR2).


It is concerning that ompi-restart is segfault'ing when it  
errors out. The error message is being generated between the  
launch of the opal-restart starter command and when we try to  
exec(cr_restart). Usually the failure is related to a corruption  
of the metadata stored in the checkpoint.


Can you send me the file below:
 ompi_global_snapshot_28454.ckpt/0/opal_snapshot_0.ckpt/ 
snapshot_meta.data


I was able to reproduce the segv (at least I think it is the  
same one). We failed to check the validity of a string when we  
parse the metadata. I committed a fix to the trunk in r22290,  
and requested that the fix be moved to the v1.4 and v1.5  
branches. If you are interested in seeing when they get applied  
you can follow the following tickets:

  https://svn.open-mpi.org/trac/ompi/ticket/2140
  https://svn.open-mpi.org/trac/ompi/ticket/2141

Can you try the trunk to see if the problem goes away? The  
development trunk and v1.5 series have a bunch of improvements  
to the C/R functionality that were never brought over the v1.3/ 
v1.4 series.




I was trying to use ssh instead of rsh, but it was impossible. By
default it should use ssh and, if it finds a problem, fall back to
rsh. It seems that ssh doesn't work, because it always uses rsh.

If I change this MCA parameter, it still uses rsh.
If I set the OMPI_MCA_plm_rsh_disable_qrsh variable to 1, it tries to
use ssh and doesn't work. I got --> "bash: orted: command not
found" and the MPI process dies.
The command it tries to execute is the following, and I haven't yet
found the reason why it doesn't find orted, because I set
/etc/bashrc so that the right path is always set and I have the right
path in my application. (see ERROR4).


This seems like an SGE specific issue, so a bit out of my  
domain. Maybe others have suggestions here.


-- Josh 



Many thanks!,
Sergio

P.S. Sorry about these long emails. I am just trying to show you
useful information to help identify my problems.



ERROR 1
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Re: [OMPI users] problem restarting multiprocess mpi application

2010-01-11 Thread Josh Hursey


On Dec 13, 2009, at 3:57 PM, Kritiraj Sajadah wrote:


Dear All,
   I am running a simple mpi application which looks as  
follows:


##

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char **argv)
{
int rank,size;

MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
printf("Hello\n");
sleep(15);
printf("Hello again\n" );
sleep(15);
printf("Final Hello\n");
sleep(15);
printf("bye \n");
MPI_Finalize();
return 0;
}
#

When I run my application as follows, it checkpoints correctly, but
when I try to restart it, it gives the following errors:


##

ompi-restart ompi_global_snapshot_380.ckpt
Hello again
[sun06:00381] *** Process received signal ***
[sun06:00381] Signal: Bus error (7)
[sun06:00381] Signal code:  (2)
[sun06:00381] Failing at address: 0xae7cb054
[sun06:00381] [ 0] [0xb7f8640c]
[sun06:00381] [ 1] /home/raj/openmpisof/lib/libopen-pal.so. 
0(opal_progress+0x123) [0xb7b95456]
[sun06:00381] [ 2] /home/raj/openmpisof/lib/libopen-pal.so.0  
[0xb7bcb093]
[sun06:00381] [ 3] /home/raj/openmpisof/lib/libopen-pal.so.0  
[0xb7bcae97]
[sun06:00381] [ 4] /home/raj/openmpisof/lib/libopen-pal.so. 
0(opal_crs_blcr_checkpoint+0x187) [0xb7bca69b]
[sun06:00381] [ 5] /home/raj/openmpisof/lib/libopen-pal.so. 
0(opal_cr_inc_core+0xc3) [0xb7b970bd]
[sun06:00381] [ 6] /home/raj/openmpisof/lib/libopen-rte.so.0  
[0xb7cab06f]
[sun06:00381] [ 7] /home/raj/openmpisof/lib/libopen-pal.so. 
0(opal_cr_test_if_checkpoint_ready+0x129) [0xb7b96fca]
[sun06:00381] [ 8] /home/raj/openmpisof/lib/libopen-pal.so.0  
[0xb7b97698]

[sun06:00381] [ 9] /lib/libpthread.so.0 [0xb7ac4f3b]
[sun06:00381] [10] /lib/libc.so.6(clone+0x5e) [0xb7a4bbee]
[sun06:00381] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 399 on node sun06 exited  
on signal 7 (Bus error).

--
#


This could be caused by a variety of things, including a bad BLCR  
installation. :/


Are you sure that your application was between MPI_Init() and  
MPI_Finalize() when you checkpointed?



I am running it as follows:


mpirun -am ft-enable-cr -np 2 -mca btl ^openib -mca  
snapc_base_global_snapshot_dir /tmp mpisleepbas.




Try specifying the MCA parameters in your $HOME/.openmpi/mca- 
params.conf file.
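
For the options you are passing on the command line, that file would
contain something along these lines (a sketch; adjust to your setup):

  btl=^openib
  snapc_base_global_snapshot_dir=/tmp

That way ompi-restart picks up the same settings as mpirun.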




Once a checkpoint is taken, I have to copy it to the home directory
and try to restart it.


The manual movement of the checkpoint file is not currently supported.  
I filed a bug about it if you want to track it:

  https://svn.open-mpi.org/trac/ompi/ticket/2161



Please note that if I use -np 1, it works fine when I restart it.
The problem mainly appears when the application has more than one
process running.


Are the processes on the same machines or different machines?

-- Josh




Any help will be very appreciated


Raj






___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Problem with checkpointing multihosts, multiprocesses MPI application

2010-01-11 Thread Josh Hursey


On Dec 12, 2009, at 10:03 AM, Kritiraj Sajadah wrote:


Dear All,
I am trying to checkpoint an MPI application which has two
processes, each running on a separate host.


I run the application as follows:

raj@sun32:~$ mpirun -am ft-enable-cr -np 2 --hostfile sunhost -mca  
btl ^openib -mca snapc_base_global_snapshot_dir /tmp m.


Try setting the 'snapc_base_global_snapshot_dir' in your  
$HOME/.openmpi/mca-params.conf file instead of on the command line.  
This way it will be properly picked up by the ompi-restart commands.


See the link below for how to do this:
  http://www.osl.iu.edu/research/ft/ompi-cr/examples.php#uc-ckpt-global



and I trigger the checkpoint as follows:

raj@sun32:~$ ompi-checkpoint -v 30010


The following happens, displaying two errors while checkpointing the
application:



##
I am processor no 0 of a total of 2 procs on host sun32
I am processor no 1 of a total of 2 procs on host sun06
I am processo no 0 of a total of 2 procs on host sun32
I am processo no 1 of a total of 2 procs on host sun06

[sun32:30010] Error: expected_component: PID information unavailable!
[sun32:30010] Error: expected_component: Component Name information  
unavailable!


The only way this error could be generated when checkpointing (versus  
restarting) is if the Snapshot Coordinator failed to propagate the CRS  
component used so that it could be stored in the metadata. If this  
continues to happen try enabling debugging in the snapshot coordinator:

 mpirun -mca snapc_full_verbose 20 ...
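
With the options from your original command line that would be
something along the lines of:

  mpirun -mca snapc_full_verbose 20 -am ft-enable-cr -np 2 --hostfile sunhost -mca btl ^openib m

(with the snapshot directory now set in your mca-params.conf file as
suggested above).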



I am proceor no 1 of a total of 2 procs on host sun06
I am proceor no 0 of a total of 2 procs on host sun32
bye
bye





when I try to restart the application from the checkpointed file, I  
get the following:


raj@sun32:~$ ompi-restart ompi_global_snapshot_30010.ckpt
--
Error: The filename (opal_snapshot_1.ckpt) is invalid because either  
you have not provided a filename

  or provided an invalid filename.
  Please see --help for usage.

--
I am proceor no 0 of a total of 2 procs on host sun32
bye


This usually indicates that either:
 1) The local checkpoint directory (opal_snapshot_1.ckpt) is missing.  
So the global checkpoint is either corrupted, or the node where rank 1  
resided was not able to access the storage location (/tmp in your  
example).
 2) You moved the ompi_global_snapshot_30010.ckpt directory from /tmp  
to somewhere else. Currently, manually moving the global checkpoint  
directory is not supported.


-- Josh




I would very appreciate if you could give me some ideas on how to  
checkpoint and restart MPI application running on multiple hosts.


Thank you

Regards,

Raj



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Changing location where checkpoints are saved

2009-12-09 Thread Josh Hursey
I took a look at the checkpoint staging and preload functionality. It  
seems that the combination of the two is broken on the v1.3 and v1.4  
branches. I filed a bug about it so that it would not get lost:

  https://svn.open-mpi.org/trac/ompi/ticket/2139

I also attached a patch to partially fix the problem, but the actual  
fix is much more involved. I don't know when I'll get around to
finishing this bug fix for that branch. :(


However, the current development trunk and v1.5 are known to have a
working version of this feature. Can you try the trunk or v1.5 and see  
if this fixes the problem?


-- Josh

P.S. If you are interested, we have a slightly better version of the  
documentation, hosted at the link below:

  http://osl.iu.edu/research/ft/ompi-cr/

On Nov 18, 2009, at 1:27 PM, Constantinos Makassikis wrote:


Josh Hursey wrote:

(Sorry for the excessive delay in replying)

On Sep 30, 2009, at 11:02 AM, Constantinos Makassikis wrote:


Thanks for the reply!

Concerning the mca options for checkpointing:
- are verbosity options (e.g.: crs_base_verbose) limited to 0 and  
1 values ?
- in priority options (e.g.: crs_blcr_priority) do lower numbers  
indicate higher priority ?


By searching in the archives of the mailing list I found two  
interesting/useful posts:
- [1] http://www.open-mpi.org/community/lists/users/ 
2008/09/6534.php (for different checkpointing schemes)
- [2] http://www.open-mpi.org/community/lists/users/ 
2009/05/9385.php (for restarting)


Following indications given in [1], I tried to make each process
checkpoint itself in it local /tmp and centralize the resulting
checkpoints in /tmp or $HOME:

Excerpt from mca-params.conf:
-
snapc_base_store_in_place=0
snapc_base_global_snapshot_dir=/tmp or $HOME
crs_base_snapshot_dir=/tmp

COMMANDS used:
--
mpirun -n 2 -machinefile machines -am ft-enable-cr a.out
ompi-checkpoint mpirun_pid



OUTPUT of ompi-checkpoint -v 16753
--
[ic85:17044] orte_checkpoint: Checkpointing...
[ic85:17044] PID 17036
[ic85:17044] Connected to Mpirun [[42098,0],0]
[ic85:17044] orte_checkpoint: notify_hnp: Contact Head Node  
Process PID 17036
[ic85:17044] orte_checkpoint: notify_hnp: Requested a checkpoint  
of jobid [INVALID]
[ic85:17044] orte_checkpoint: hnp_receiver: Receive a command  
message.

[ic85:17044] orte_checkpoint: hnp_receiver: Status Update.
[ic85:17044] Requested - Global Snapshot  
Reference: (null)
[ic85:17044] orte_checkpoint: hnp_receiver: Receive a command  
message.

[ic85:17044] orte_checkpoint: hnp_receiver: Status Update.
[ic85:17044]   Pending - Global Snapshot  
Reference: (null)
[ic85:17044] orte_checkpoint: hnp_receiver: Receive a command  
message.

[ic85:17044] orte_checkpoint: hnp_receiver: Status Update.
[ic85:17044]   Running - Global Snapshot  
Reference: (null)
[ic85:17044] orte_checkpoint: hnp_receiver: Receive a command  
message.

[ic85:17044] orte_checkpoint: hnp_receiver: Status Update.
[ic85:17044] File Transfer - Global Snapshot  
Reference: (null)
[ic85:17044] orte_checkpoint: hnp_receiver: Receive a command  
message.

[ic85:17044] orte_checkpoint: hnp_receiver: Status Update.
[ic85:17044] Error - Global Snapshot  
Reference: ompi_global_snapshot_17036.ckpt




OUTPUT of MPIRUN


[ic85:17038] crs:blcr: blcr_checkpoint_peer: Thread finished with  
status 3
[ic86:20567] crs:blcr: blcr_checkpoint_peer: Thread finished with  
status 3

--
WARNING: Could not preload specified file: File already exists.

Fileset: /tmp/ompi_global_snapshot_17036.ckpt/0
Host: ic85

Will continue attempting to launch the process.

--
[ic85:17036] filem:rsh: wait_all(): Wait failed (-1)
[ic85:17036] [[42098,0],0] ORTE_ERROR_LOG: Error in  
file ../../../../../orte/mca/snapc/full/snapc_full_global.c at  
line 1054


This is a warning about creating the global snapshot directory  
(ompi_global_snapshot_17036.ckpt) for the first checkpoint (seq 0).  
It seems to indicate that the directory existed when the file  
gather started.


A couple things to check:
- Did you clean out the /tmp on all of the nodes with any files  
starting with "opal" or "ompi"?
- Does the error go away when you set  
(snapc_base_global_snapshot_dir=$HOME)?
- Could you try running against a v1.3 release? (I wonder if this  
feature has been broken on the trunk)


Let me know what you find. In the next couple days, I'll try to  
test the trunk again with this feature to make sure that it is  
still working on my test machines.


-- Josh

Hello Josh,

I have switched to v1.3 and re-run with  
snapc_base_global_snapshot_dir=/tmp or $HOME

with a clean /tmp.

In both cases I get the same e

Re: [OMPI users] checkpoint opempi-1.3.3+sge62

2009-12-09 Thread Josh Hursey
i+0x1e) [0x2a9557906e]
> [compute-3-18:28793] [ 4] /opt/cesga/openmpi-1.3.3/lib/libopen- 
pal.so.0(opal_finalize+0x36) [0x2a9556bcfa]

> [compute-3-18:28793] [ 5] opal-restart [0x40312a]
> [compute-3-18:28793] [ 6] /lib64/tls/libc.so.6(__libc_start_main 
+0xdb) [0x33bb61c3fb]

> [compute-3-18:28793] [ 7] opal-restart [0x40272a]
> [compute-3-18:28793] *** End of error message ***
>  
--
> mpirun noticed that process rank 0 with PID 28792 on node  
compute-3-18.local exited on signal 11 (Segmentation fault).
>  
--
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>



ERROR 2
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> [sdiaz@compute-3-18 ~]$ ompi-restart -v  
ompi_global_snapshot_28454.ckpt
>[compute-3-18.local:28941] Checking for the existence of (/home/ 
cesga/sdiaz/ompi_global_snapshot_28454.ckpt)
> [compute-3-18.local:28941] Restarting from file  
(ompi_global_snapshot_28454.ckpt)

> [compute-3-18.local:28941]   Exec in self
> ...
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>



ERROR3
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

>[sdiaz@compute-3-18 ~]$ ompi_info  --all|grep "plm_rsh_agent"
> How many plm_rsh_agent instances to invoke concurrently  
(must be > 0)
> MCA plm: parameter "plm_rsh_agent" (current value: "ssh :  
rsh", data source: default value, synonyms: pls_rsh_agent)
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>


ERROR4
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>/usr/bin/ssh -x compute-3-17.local  orted --debug-daemons -mca ess  
env -mca orte_ess_jobid 2152464384 -mca orte_ess_vpid 1 -mca  
orte_ess_num_procs 2 --hnp-uri >"2152464384.0;tcp:// 
192.168.4.143:59176" -mca mca_base_param_file_prefix ft-enable-cr - 
mca mca_base_param_file_path >/opt/cesga/openmpi-1.3.3/share/openmpi/ 
amca-param-sets:/home_no_usc/cesga/sdiaz/mpi_test -mca  
mca_base_param_file_path_force /home_no_usc/cesga/sdiaz/mpi_test
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>











Josh Hursey escribió:


On Nov 9, 2009, at 5:33 AM, Sergio Díaz wrote:



Hi Josh,

The OpenMPI version is 1.3.3.

The command ompi-ps doesn't work.

[root@compute-3-18 ~]# ompi-ps -j 2726959 -p 16241
[root@compute-3-18 ~]# ompi-ps -v -j 2726959 -p 16241
[compute-3-18.local:16254] orte_ps: Acquiring list of HNPs and  
setting contact info into RML...

[root@compute-3-18 ~]# ompi-ps -v -j 2726959
[compute-3-18.local:16255] orte_ps: Acquiring list of HNPs and  
setting contact info into RML...


[root@compute-3-18 ~]# ps uaxf | grep sdiaz
root 16260  0.0  0.0 51084  680 pts/0S+   13:38
0:00  \_ grep sdiaz
sdiaz16203  0.0  0.0 53164 1220 ?Ss   13:37
0:00  \_ -bash /opt/cesga/sge

Re: [OMPI users] Problem with mpirun -preload-binary option

2009-12-09 Thread Josh Hursey
I verified that the preload functionality works on the trunk. It seems  
to be broken on the v1.3/v1.4 branches. The version of this code has  
changed significantly between the v1.3/v1.4 and the trunk/v1.5  
versions. I filed a bug about this so it does not get lost:

  https://svn.open-mpi.org/trac/ompi/ticket/2139

Can you try this again with either the trunk or v1.5 to see if that  
helps with the preloading?


However, you need to fix the password-less login issue before anything
else will work. If mpirun is prompting you for a password, then it
will not work properly.


-- Josh

On Nov 12, 2009, at 3:50 PM, Qing Pang wrote:

Now that I have passwordless-ssh set up both directions, and  
verified working - I still have the same problem.
I'm able to run ssh/scp on both master and client nodes - (at this  
point, they are pretty much the same), without being asked for  
password. And mpirun works fine if I have the executable put in the  
same directory on both nodes.


But when I tried the preload-binary option, I still have the same  
problem - it asked me for the password of the node running mpirun,  
and then told me that scp failed.


---


Josh Wrote:

Though the --preload-binary option was created while building the  
checkpoint/restart functionality it does not depend on checkpoint/ 
restart function in any way (just a side effect of the initial  
development).


The problem you are seeing is a result of the computing environment  
setup of password-less ssh. The --preload-binary command uses  
'scp' (at the moment) to copy the files from the node running mpirun  
to the compute nodes. The compute nodes are the ones that call  
'scp', so you will need to setup password-less ssh in both directions.


-- Josh

On Nov 11, 2009, at 8:38 AM, Ralph Castain wrote:


I'm no expert on the preload-binary option - but I would suspect that
is the case given your observations.


That option was created to support checkpoint/restart, not for what
you are attempting to do. Like I said, you -should- be able to use  
it for that purpose, but I expect you may hit a few quirks like this  
along the way.


On Nov 11, 2009, at 9:16 AM, Qing Pang wrote:

> Thank you very much for your help! I believe I do have password- 
less
ssh set up, at least from master node to client node (desktop ->  
laptop in my case). If I type >ssh node1 on my desktop terminal, I  
am able to get to the laptop node without being asked for password.  
And as I mentioned, if I copy the example executable from desktop to  
the laptop node using scp, then I am able to run it from desktop  
using both nodes.

> Back to the preload-binary problem - I am asked for the password of
my master node - the node I am working on - not the remote client  
node. Do you mean that I should set up password-less ssh in both  
direction? Does the client node need to access master node through  
password-less ssh to make the preload-binary option work?

>
>
> Ralph Castain Wrote:
>
> It -should- work, but you need password-less ssh setup. See our FAQ
> for how to do that, if you are unfamiliar with it.
>
> On Nov 10, 2009, at 2:02 PM, Qing Pang wrote:
>
> I'm having problem getting the mpirun "preload-binary" option to  
work.

>>
>> I'm using ubutu8.10 with openmpi 1.3.3, nodes connected with

Ethernet cable.
>> If I copy the executable to client nodes using scp, then do  
mpirun,

everything works.

>>
>> But I really want to avoid the copying, so I tried the

-preload-binary option.

>>
>> When I typed the command on my master node as below (gordon- 
desktop

is my master node, and gordon-laptop is the client node):

>>
>>

--

>> gordon_at_gordon-desktop:~/Desktop/openmpi-1.3.3/examples$ mpirun
>> -machinefile machine.linux -np 2 --preload-binary $(pwd)/ 
hello_c.out

>>

--

>>
>> I got the following:
>>
>> gordon_at_gordon-desktop's password: (I entered my password here,
why am I asked for the password? I am working under this account  
anyway)

>>
>>
>> WARNING: Remote peer ([[18118,0],1]) failed to preload a file.
>>
>> Exit Status: 256
>> Local File:

/tmp/openmpi-sessions-gordon_at_gordon-laptop_0/18118/0/hello_c.out
>> Remote File: /home/gordon/Desktop/openmpi-1.3.3/examples/ 
hello_c.out

>> Command:
>> scp

gordon-desktop:/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
>> /tmp/openmpi-sessions-gordon_at_gordon-laptop_0/18118/0/ 
hello_c.out

>>
>> Will continue attempting to launch the process(es).
>>

--

>>

--

>> mpirun was unable to launch the specified application as it could

not access

>> or execute an executable:
>>
>> Executable: /home/gordon/D

Re: [OMPI users] ompi-restart using different nodes

2009-12-09 Thread Josh Hursey
So I tried to reproduce this problem today, and everything worked fine  
for me using the trunk. I haven't tested v1.3/v1.4 yet.


I tried checkpointing with one hostfile then restarting with each of  
the following:

 - No hostfile
 - a hostfile with completely different machines
 - a hostfile with the same machines in the opposite order


I suspect that the problem is not with Open MPI, but your system  
interacting with BLCR. Usually when people cannot restart on a  
different node they have problems with the 'prelink' feature on Linux.  
BLCR has a FAQ item on this:

  https://upc-bugs.lbl.gov//blcr/doc/html/FAQ.html#prelink

So if this is your problem then you will probably not be able to  
checkpoint a single process (non-MPI) application on one node and  
restart on another. Sorry I didn't mention it before, must have  
slipped my mind.
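
A rough way to test that outside of Open MPI (a sketch only, assuming a
standard BLCR install with cr_run, cr_checkpoint and cr_restart in your
PATH):

  # on node A; ./my_serial_test stands for any long-running non-MPI program
  cr_run ./my_serial_test &
  cr_checkpoint -f serial.ckpt $!
  # copy serial.ckpt to node B, then on node B:
  cr_restart serial.ckpt

If that restart fails on the second node, the prelink FAQ entry above
is the place to start.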


If this turns out to not be the problem, let me know and I'll take  
another look. Also send me any error messages that are displayed.


-- Josh


On Dec 8, 2009, at 1:39 PM, Jonathan Ferland wrote:

I did the same test using 1.3.4 and still have the same issue. I also
tried to use the tm interface instead of specifying the hostfile, with
the same result.


thanks,

Jonathan

Josh Hursey wrote:
Though I do not test this scenario (using hostfiles) very often, it  
used to work. The ompi-restart command takes a --hostfile (or -- 
machinefile) argument that is passed directly to the mpirun  
command. I wonder if something broke recently with this handoff. I  
can certainly checkpoint with one set of nodes/allocation and  
restart with another, but most/all of my testing occurs in a SLURM  
environment, so no need for an explicit hostfile.


I'll take a look to see if I can reproduce, but probably will not  
be until next week.


-- Josh

On Dec 2, 2009, at 9:54 AM, Jonathan Ferland wrote:


Hi,

I am trying to use BLCR checkpointing in mpi. I am currently able  
to run my application using some hostfile, checkpoint the run, and  
then restart the application using the same hostfile. The thing I  
would like to do is to restart the application with a different  
hostfile. But this leads to a segfault using 1.3.3.


Is it possible to restart the application using a different  
hostfile (we are using pbs to create the hostfile, so each new  
restart might be on different nodes), how can we do that? If no,  
do you plan to include this in a future release?


thanks

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--




--
Jonathan Ferland, analyste en calcul scientifique
RQCHP (Réseau québécois de calcul de haute performance)

bureau S-252, pavillon Roger-Gaudry, Université de Montréal
téléphone   : 514 343-6111 poste 8852
télécopieur : 514 343-2155
--

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] ompi-restart using different nodes

2009-12-02 Thread Josh Hursey
Though I do not test this scenario (using hostfiles) very often, it  
used to work. The ompi-restart command takes a --hostfile (or -- 
machinefile) argument that is passed directly to the mpirun command. I  
wonder if something broke recently with this handoff. I can certainly  
checkpoint with one set of nodes/allocation and restart with another,  
but most/all of my testing occurs in a SLURM environment, so no need  
for an explicit hostfile.
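
For reference, the usage is along the lines of

  ompi-restart --hostfile new_hosts ompi_global_snapshot_<PID>.ckpt

where new_hosts describes the nodes to restart on (names here are only
illustrative).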


I'll take a look to see if I can reproduce, but probably will not be  
until next week.


-- Josh

On Dec 2, 2009, at 9:54 AM, Jonathan Ferland wrote:


Hi,

I am trying to use BLCR checkpointing in mpi. I am currently able to  
run my application using some hostfile, checkpoint the run, and then  
restart the application using the same hostfile. The thing I would  
like to do is to restart the application with a different hostfile.  
But this leads to a segfault using 1.3.3.


Is it possible to restart the application using a different hostfile  
(we are using pbs to create the hostfile, so each new restart might  
be on different nodes), how can we do that? If no, do you plan to  
include this in a future release?


thanks

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] checkpoint opempi-1.3.3+sge62

2009-11-11 Thread Josh Hursey

On Nov 9, 2009, at 5:33 AM, Sergio Díaz wrote:

> Hi Josh,
> 
> The OpenMPI version is 1.3.3.
> 
> The command ompi-ps doesn't work.
> 
> [root@compute-3-18 ~]# ompi-ps -j 2726959 -p 16241
> [root@compute-3-18 ~]# ompi-ps -v -j 2726959 -p 16241
> [compute-3-18.local:16254] orte_ps: Acquiring list of HNPs and setting 
> contact info into RML...
> [root@compute-3-18 ~]# ompi-ps -v -j 2726959
> [compute-3-18.local:16255] orte_ps: Acquiring list of HNPs and setting 
> contact info into RML...
> 
> [root@compute-3-18 ~]# ps uaxf | grep sdiaz
> root 16260  0.0  0.0 51084  680 pts/0S+   13:38   0:00  \_ 
> grep sdiaz
> sdiaz16203  0.0  0.0 53164 1220 ?Ss   13:37   0:00  \_ -bash 
> /opt/cesga/sge62/default/spool/compute-3-18/job_scripts/2726959
> sdiaz16241  0.0  0.0 41028 2480 ?S13:37   0:00  \_ 
> mpirun -np 2 -am ft-enable-cr ./pi3
> sdiaz16242  0.0  0.0 36484 1840 ?Sl   13:37   0:00  
> \_ /opt/cesga/sge62/bin/lx24-x86/qrsh -inherit -nostdin -V compute-3-17.local 
>  orted -mca ess env -mca orte_ess_jobid 2769879040 -mca orte_ess_vpid 1 -mca 
> orte_ess_num_procs 2 --hnp-uri "2769879040.0;tcp://192.168.4.143:57010" -mca 
> mca_base_param_file_prefix ft-enable-cr -mca mca_base_param_file_path 
> /opt/cesga/openmpi-1.3.3/share/openmpi/amca-param-sets:/home_no_usc/cesga/sdiaz/mpi_test
>  -mca mca_base_param_file_path_force /home_no_usc/cesga/sdiaz/mpi_test
> sdiaz16245  0.1  0.0 99464 4616 ?Sl   13:37   0:00  
> \_ ./pi3
> 
> [root@compute-3-18 ~]# ompi-ps -n c3-18
> [root@compute-3-18 ~]# ompi-ps -n compute-3-18
> [root@compute-3-18 ~]# ompi-ps -n
> 
> There is no directory in /tmp on the node. However, if the application 
> is run without SGE, the directory is created.

This may be the core of the problem. ompi-ps and other command line tools 
(e.g., ompi-checkpoint) look for the Open MPI session directory in /tmp in 
order to find the connection information to connect to the mpirun process 
(internally called the HNP or Head Node Process).

Can you change the location of the temporary directory in SGE? The temporary 
directory is usually set via an environment variable (e.g., TMPDIR, or TMP). So 
removing the environment variable or setting it to /tmp might help.
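
For example, near the top of the SGE job script (sketch only; adjust to
your site's policy):

  export TMPDIR=/tmp
  mpirun -np 2 -am ft-enable-cr ./pi3

so that mpirun creates its session directory under /tmp, where the
command line tools expect to find it.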


> but if I do ompi-ps -j MPIRUN_PID, it seems to hang and I have to interrupt it. 
> Does it take a long time?

It should not take a long time. It is just querying the mpirun process for 
state information.

> what does the -j option of the ompi-ps command mean? It isn't related to a batch 
> system (like SGE, Condor...), is it?

The '-j' option allows the user to specify the Open MPI jobid. This is 
completely different than the jobid provided by the batch system. In general, 
users should not need to specify the -j option. It is useful when you have 
multiple Open MPI jobs, and want a summary of just one of them.

> 
> Thanks for the ticket. I will follow it.
> 
> Talking with Alan, I realized that only a few transport protocols are 
> supported, and maybe that is the problem. Currently, SGE is using qrsh to 
> spawn the MPI processes. I can change this protocol and use ssh. So, I'm going to 
> test it this afternoon and I will report the results to you.

Try 'ssh' and see if that helps. I suspect the problem is with the session 
directory location though.

> 
> Regards,
> Sergio
> 
> 
> Josh Hursey escribió:
>> 
>> On Oct 28, 2009, at 7:41 AM, Sergio Díaz wrote: 
>> 
>>> Hello, 
>>> 
>>> I have achieved the checkpoint of an easy program without SGE. Now, I'm 
>>> trying to do the integration openmpi+sge but I have some problems... When I 
>>> try to checkpoint the mpirun PID, I got an error similar to the error 
>>> I get when the PID doesn't exist. The example is below. 
>> 
>> I do not have any experience with the SGE environment, so I suspect that 
>> there may be something 'special' about the environment that is tripping up the 
>> ompi-checkpoint tool. 
>> 
>> First of all, what version of Open MPI are you using? 
>> 
>> Somethings to check: 
>>  - Does 'ompi-ps' work when your application is running? 
>>  - Is there an /tmp/openmpi-sessions-* directory on the node where mpirun is 
>> currently running? This directory contains information on how to connect to 
>> the mpirun process from an external tool, if it's missing then this could be 
>> the cause of the problem. 
>> 
>>> 
>>> Any ideas? 
>>> Does somebody have a script to do it automatically with SGE? For example I have one 
>>> to do checkpoint each X seconds with

Re: [OMPI users] mpirun noticed that process rank 1 ... exited on signal 13 (Broken pipe).

2009-11-11 Thread Josh Hursey

On Nov 6, 2009, at 7:59 AM, Kritiraj Sajadah wrote:

> Hi Everyone,
>  I have installed openmpi 1.3 and blcr 0.8.1 on my laptop (single 
> processor).
> 
> I am trying to checkpoint a small test application:
> 
> ###
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <mpi.h>
> 
> int main(int argc, char **argv)
> {
> int rank,size;
> MPI_Init(&argc, &argv);
> MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> MPI_Comm_size(MPI_COMM_WORLD, &size);
> printf("I am processor no %d of a total of %d procs \n", rank, size);
> system("sleep 10");
> printf("I am processor no %d of a total of %d procs \n", rank, size);
> system("sleep 10");
> printf("I am processor no %d of a total of %d procs \n", rank, size);
> system("sleep 10");
> printf("mpisleep bye \n");
> MPI_Finalize();
> return 0;
> }
> ###
> 
> I compile it as follows:
> 
> mpicc mpisleep.c -o mpisleep
> 
> and i run it as follows:
> 
> mpirun -am ft-enable-cr -np 2 mpisleep.
> 
> When I try checkpointing it (ompi-checkpoint -v 8118), it checkpoints fine, 
> but when I restart it, I get the following:
> 
> I am processor no 0 of a total of 2 procs 
> I am processor no 1 of a total of 2 procs 
> mpisleep bye 
> --
> mpirun noticed that process rank 1 with PID 8118 on node raj-laptop exited on 
> signal 13 (Broken pipe).
> --

Does the behavior change if you remove the 'system()' calls and replace them 
with 'sleep()'? The 'system()' call is a shorthand for fork/exec, and fork/exec has 
been known to cause problems when called by an MPI process.

Give that a try and let me know if it helps.

-- Josh

> 
> Any suggestions is very much appreciated
> 
> Raj
> 
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Problem with mpirun -preload-binary option

2009-11-11 Thread Josh Hursey
Though the --preload-binary option was created while building the 
checkpoint/restart functionality it does not depend on checkpoint/restart 
function in any way (just a side effect of the initial development).

The problem you are seeing is a result of the computing environment setup of 
password-less ssh. The --preload-binary command uses 'scp' (at the moment) to 
copy the files from the node running mpirun to the compute nodes. The compute 
nodes are the ones that call 'scp', so you will need to setup password-less ssh 
in both directions.
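
A quick sanity check in your setup would be something like (hostnames
taken from your output; just a sketch):

  # from gordon-desktop (the mpirun node):
  ssh gordon-laptop true
  # and from gordon-laptop back again:
  ssh gordon-desktop true

Neither command should prompt for a password.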

-- Josh

On Nov 11, 2009, at 8:38 AM, Ralph Castain wrote:

> I'm no expert on the preload-binary option - but I would suspect that is the 
> case given your observations.
> 
> That option was created to support checkpoint/restart, not for what you are 
> attempting to do. Like I said, you -should- be able to use it for that 
> purpose, but I expect you may hit a few quirks like this along the way.
> 
> On Nov 11, 2009, at 9:16 AM, Qing Pang wrote:
> 
>> Thank you very much for your help! I believe I do have password-less ssh set 
>> up, at least from master node to client node (desktop -> laptop in my case). 
>> If I type >ssh node1 on my desktop terminal, I am able to get to the laptop 
>> node without being asked for password. And as I mentioned, if I copy the 
>> example executable from desktop to the laptop node using scp, then I am able 
>> to run it from desktop using both nodes.
>> Back to the preload-binary problem - I am asked for the password of my 
>> master node - the node I am working on - not the remote client node. Do you 
>> mean that I should set up password-less ssh in both direction? Does the 
>> client node need to access master node through password-less ssh to make the 
>> preload-binary option work?
>> 
>> 
>> Ralph Castain Wrote:
>> 
>> It -should- work, but you need password-less ssh setup. See our FAQ
>> for how to do that, if you are unfamiliar with it.
>> 
>> On Nov 10, 2009, at 2:02 PM, Qing Pang wrote:
>> 
>> I'm having problem getting the mpirun "preload-binary" option to work.
>>> 
>>> I'm using ubutu8.10 with openmpi 1.3.3, nodes connected with Ethernet cable.
>>> If I copy the executable to client nodes using scp, then do mpirun, 
>>> everything works.
>>> 
>>> But I really want to avoid the copying, so I tried the -preload-binary 
>>> option.
>>> 
>>> When I typed the command on my master node as below (gordon-desktop is my 
>>> master node, and gordon-laptop is the client node):
>>> 
>>> --
>>> gordon_at_gordon-desktop:~/Desktop/openmpi-1.3.3/examples$  mpirun
>>> -machinefile machine.linux -np 2 --preload-binary $(pwd)/hello_c.out
>>> --
>>> 
>>> I got the following:
>>> 
>>> gordon_at_gordon-desktop's password:  (I entered my password here, why am I 
>>> asked for the password? I am working under this account anyway)
>>> 
>>> 
>>> WARNING: Remote peer ([[18118,0],1]) failed to preload a file.
>>> 
>>> Exit Status: 256
>>> Local  File: 
>>> /tmp/openmpi-sessions-gordon_at_gordon-laptop_0/18118/0/hello_c.out
>>> Remote File: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
>>> Command:
>>> scp  gordon-desktop:/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
>>> /tmp/openmpi-sessions-gordon_at_gordon-laptop_0/18118/0/hello_c.out
>>> 
>>> Will continue attempting to launch the process(es).
>>> --
>>> --
>>> mpirun was unable to launch the specified application as it could not access
>>> or execute an executable:
>>> 
>>> Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
>>> Node: node1
>>> 
>>> while attempting to start process rank 1.
>>> --
>>> 
>>> Had anyone succeeded with the 'preload-binary' option with the similar 
>>> settings? I assume this mpirun option should work when compiling openmpi 
>>> with default  options? Anything I need to set?
>>> 
>>> --qing
>>> 
>>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Question about checkpoint/restart protocol

2009-11-06 Thread Josh Hursey


On Nov 5, 2009, at 4:46 AM, Mohamed Adel wrote:


Dear Sergio,

Thank you for your reply. I've inserted the modules into the kernel  
and it all worked fine. But there is still a weird issue. I use the
command "mpirun -n 2 -am ft-enable-cr -H comp001 checkpoint-restart-
test" to start an MPI job. I then use "ompi-checkpoint PID" to
checkpoint the job, but ompi-checkpoint didn't respond and
mpirun produced the following.


--
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process.  Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption.  The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.

The process that invoked fork was:

 Local host:  comp001.local (PID 23514)
 MPI_COMM_WORLD rank: 0

If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--
[login01.local:21425] 1 more process has sent help message help-mpi- 
runtime.txt / mpi_init:warn-fork
[login01.local:21425] Set MCA parameter "orte_base_help_aggregate"  
to 0 to see all help / error messages


Note: if the -n option has a value of more than 1, then this error
occurs, but if the -n option has the value 1, then ompi-checkpoint
succeeds, mpirun produces the same message, and ompi-restart
fails with the message

[login01:21417] *** Process received signal ***
[login01:21417] Signal: Segmentation fault (11)
[login01:21417] Signal code: Address not mapped (1)
[login01:21417] Failing at address: (nil)
[login01:21417] [ 0] /lib64/libpthread.so.0 [0x32df20de70]
[login01:21417] [ 1] /home/mab/openmpi-1.3.3/lib/openmpi/ 
mca_crs_blcr.so [0x2b093509dfee]
[login01:21417] [ 2] /home/mab/openmpi-1.3.3/lib/openmpi/ 
mca_crs_blcr.so(opal_crs_blcr_restart+0xd9) [0x2b093509d251]

[login01:21417] [ 3] opal-restart [0x401c3e]
[login01:21417] [ 4] /lib64/libc.so.6(__libc_start_main+0xf4)  
[0x32dea1d8b4]

[login01:21417] [ 5] opal-restart [0x401399]
[login01:21417] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 21417 on node  
login01.local exited on signal 11 (Segmentation fault).

--

Any help with that will be appreciated?


I have not seen this behavior before. The first error is Open MPI  
warning you that one of your MPI processes is trying to use fork(), so  
you may want to make sure that your application is not using any system 
() or fork() function calls. Open MPI internally should not be using  
any of these functions from within the MPI library linked to the  
application.


When you reloaded the BLCR module, did you rebuild Open MPI and  
install it in a clean directory (not over the top of the old directory)?


Have you tried to checkpoint/restart an non-MPI process with BLCR on  
your system? This will help to rule out installation problems with BLCR.


I suspect that Open MPI is not building correctly, or something in  
your build environment is confusing/corrupting the build. Can you send  
me your config.log? It may help me pinpoint the problem if it is build
related.


-- Josh



Thanks in advance,
Mohamed Adel


From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On  
Behalf Of Sergio Díaz [sd...@cesga.es]

Sent: Thursday, November 05, 2009 11:38 AM
To: Open MPI Users
Subject: Re: [OMPI users] Question about checkpoint/restart protocol

Hi,

Did you load the BLCR modules before compiling OpenMPI?

Regards,
Sergio

Mohamed Adel escribió:

Dear OMPI users,

I'm a new OpenMPI user. I've configured openmpi-1.3.3 with those  
options "./configure --prefix=/home/mab/openmpi-1.3.3 --with-sge -- 
enable-ft-thread --with-ft=cr --enable-mpi-threads --enable-static  
--disable-shared --with-blcr=/home/mab/blcr-0.8.2/" then compiled  
and installed it successfully.
Now I'm trying to use the checkpoint/restart protocol. I run a  
program with the options "mpirun -n 2 -am ft-enable-cr -H localhost  
prime/checkpoint-restart-test" but I receive the following error:


*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[madel:28896] Abort before MPI_INIT completed successfully; not  
able to guarantee that all other processes were killed!

--
It looks like opal_init failed for some reason; your parallel  
process is

likely to abort.  There are many reasons that a parallel proce

Re: [OMPI users] checkpoint opempi-1.3.3+sge62

2009-11-06 Thread Josh Hursey


On Oct 28, 2009, at 7:41 AM, Sergio Díaz wrote:


Hello,

I have achieved the checkpoint of an easy program without SGE. Now,  
I'm trying to do the integration openmpi+sge but I have some  
problems... When I try to checkpoint the mpirun PID, I got an
error similar to the error I get when the PID doesn't exist. The
example is below.


I do not have any experience with the SGE environment, so I suspect  
that there may be something 'special' about the environment that is
tripping up the ompi-checkpoint tool.


First of all, what version of Open MPI are you using?

Somethings to check:
 - Does 'ompi-ps' work when your application is running?
 - Is there an /tmp/openmpi-sessions-* directory on the node where  
mpirun is currently running? This directory contains information on  
how to connect to the mpirun process from an external tool, if it's  
missing then this could be the cause of the problem.




Any ideas?
Does somebody have a script to do it automatically with SGE? For
example, I have one that does a checkpoint every X seconds with BLCR
for non-MPI jobs.
It is launched by SGE if you have configured the queue and the ckpt  
environment.


I do not know of any integration of the Open MPI checkpointing work  
with SGE at the moment.


As far as time triggered checkpointing, I have a feature ticket open  
about this:

  https://svn.open-mpi.org/trac/ompi/ticket/1961

It is not available yet, but in the works.
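
In the meantime, a small wrapper around ompi-checkpoint can
approximate it (sketch only; PID is the mpirun PID and INTERVAL the
period in seconds):

  while kill -0 "$PID" 2>/dev/null; do
      sleep "$INTERVAL"
      ompi-checkpoint "$PID"
  done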




Is it possible choose the name of the ckpt folder when you do the  
ompi-checkpoint? I can't find the option to do it.


Not at this time. Though I could see it as a useful feature, and  
shouldn't be too hard to implement. I filed a ticket if you want to  
follow the progress:

  https://svn.open-mpi.org/trac/ompi/ticket/2098

-- Josh




Regards,
Sergio




[sdiaz@compute-3-17 ~]$ ps auxf

root 20044  0.0  0.0  4468 1224 ?S13:28   0:00  \_  
sge_shepherd-2645150 -bg
sdiaz20072  0.0  0.0 53172 1212 ?Ss   13:28   0:00   
\_ -bash /opt/cesga/sge62/default/spool/compute-3-17/job_scripts/ 
2645150
sdiaz20112  0.2  0.0 41028 2480 ?S13:28
0:00  \_ mpirun -np 2 -am ft-enable-cr pi3
sdiaz20113  0.0  0.0 36484 1824 ?Sl   13:28
0:00  \_ /opt/cesga/sge62/bin/lx24-x86/qrsh -inherit - 
nostdin -V compute-3-18..
sdiaz20116  1.2  0.0 99464 4616 ?Sl   13:28
0:00  \_ pi3



[sdiaz@compute-3-17 ~]$ ompi-checkpoint 20112
[compute-3-17.local:20124] HNP with PID 20112 Not found!

[sdiaz@compute-3-17 ~]$ ompi-checkpoint -s 20112
[compute-3-17.local:20135] HNP with PID 20112 Not found!

[sdiaz@compute-3-17 ~]$ ompi-checkpoint -s --term 20112
[compute-3-17.local:20136] HNP with PID 20112 Not found!

[sdiaz@compute-3-17 ~]$ ompi-checkpoint --hnp-pid 20112
--
ompi-checkpoint PID_OF_MPIRUN
  Open MPI Checkpoint Tool

   -am Aggregate MCA parameter set file list
   -gmca|--gmca  
 Pass global MCA parameters that are  
applicable to
 all contexts (arg0 is the parameter name;  
arg1 is

 the parameter value)
-h|--helpThis help message
   --hnp-jobid This should be the jobid of the HNP whose
 applications you wish to checkpoint.
   --hnp-pid   This should be the pid of the mpirun whose
 applications you wish to checkpoint.
   -mca|--mca  
 Pass context-specific MCA parameters; they  
are
 considered global if --gmca is not used and  
only
 one context is specified (arg0 is the  
parameter

 name; arg1 is the parameter value)
-s|--status  Display status messages describing the  
progression

 of the checkpoint
   --termTerminate the application after checkpoint
-v|--verbose Be Verbose
-w|--nowait  Do not wait for the application to finish
 checkpointing before returning

--
[sdiaz@compute-3-17 ~]$ exit
logout
Connection to c3-17 closed.
[sdiaz@svgd mpi_test]$ ssh c3-18
Last login: Wed Oct 28 13:24:12 2009 from svgd.local
-bash-3.00$ ps auxf |grep sdiaz

sdiaz14412  0.0  0.0  1888  560 ?Ss   13:28   0:00   
\_ /opt/cesga/sge62/utilbin/lx24-x86/qrsh_starter /opt/cesga/sge62/ 
default/spool/compute-3-18/active_jobs/2645150.1/1.compute-3-18
sdiaz14419  0.0  0.0 35728 2260 ?S13:28
0:00  \_ orted -mca ess env -mca orte_ess_jobid 2295267328 - 
mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri  
2295267328.0;tcp://192.168.4.144:36596 -mca  
mca_base_param_file_prefix ft-enable-cr -mca  
mca_base_param_file_path /opt/cesga/openmpi-1.3.3/share/op

Re: [OMPI users] problems with checkpointing an mpi job

2009-11-06 Thread Josh Hursey


On Oct 30, 2009, at 1:35 PM, Hui Jin wrote:


Hi All,
I have a problem when trying to checkpoint an MPI job.
I will really appreciate it if you can help me fix the problem.
The BLCR package was installed successfully on the cluster.
I configured OpenMPI with the flags:
./configure --with-ft=cr --enable-ft-thread --enable-mpi-threads -- 
with-blcr=/usr/local --with-blcr-libdir=/usr/local/lib/

The installation looks correct. The Open MPI version is 1.3.3.

I got the following output when issuing ompi_info:

root@hec:/export/home/hjin/test# ompi_info | grep ft
   MCA rml: ftrm (MCA v2.0, API v2.0, Component v1.3.3)
root@hec:/export/home/hjin/test# ompi_info | grep crs
   MCA crs: none (MCA v2.0, API v2.0, Component v1.3.3)
It seems the MCA crs component is missing, but I have no idea how to get it.


This is an artifact of the way ompi_info searches for components. This  
came up before on the users list:

  http://www.open-mpi.org/community/lists/users/2009/09/10667.php

I filed a bug about this, if you want to track its progress:
  https://svn.open-mpi.org/trac/ompi/ticket/2097



To run a checkpointable application, I run:
mpirun -np 2 --host hec -am ft-enable-cr test_mpi

However, when trying to checkpoint from another terminal on the same
host, I get the following:

root@hec:~# ompi-checkpoint -v 29234
[hec:29243] orte_checkpoint: Checkpointing...
[hec:29243]  PID 29234
[hec:29243]  Connected to Mpirun [[46621,0],0]
[hec:29243] orte_checkpoint: notify_hnp: Contact Head Node Process  
PID 29234
[hec:29243] orte_checkpoint: notify_hnp: Requested a checkpoint of  
jobid [INVALID]

[hec:29243] orte_checkpoint: hnp_receiver: Receive a command message.
[hec:29243] orte_checkpoint: hnp_receiver: Status Update.
[hec:29243] Requested - Global Snapshot Reference:  
(null)

[hec:29243] orte_checkpoint: hnp_receiver: Receive a command message.
[hec:29243] orte_checkpoint: hnp_receiver: Status Update.
[hec:29243]   Pending - Global Snapshot Reference:  
(null)

[hec:29243] orte_checkpoint: hnp_receiver: Receive a command message.
[hec:29243] orte_checkpoint: hnp_receiver: Status Update.
[hec:29243]   Running - Global Snapshot Reference:  
(null)


There are some error messages at the terminal of the running
application:

--
Error: The process with PID 29236 is not checkpointable.
 This could be due to one of the following:
  - An application with this PID doesn't currently exist
  - The application with this PID isn't checkpointable
  - The application with this PID isn't an OPAL application.
 We were looking for the named files:
   /tmp/opal_cr_prog_write.29236
   /tmp/opal_cr_prog_read.29236
--
[hec:29234] local) Error: Unable to initiate the handshake with peer  
[[46621,1],1]. -1
[hec:29234] [[46621,0],0] ORTE_ERROR_LOG: Error in file  
snapc_full_global.c at line 567
[hec:29234] [[46621,0],0] ORTE_ERROR_LOG: Error in file  
snapc_full_global.c at line 1054


This means that either the MPI application did not respond to the  
checkpoint request in time, or that the application was not  
checkpointable for some other reason.


Some options to try:
 - Set the 'snapc_full_max_wait_time' MCA parameter to say 60, the  
default is 20 seconds before giving up. You can also set it to 0,  
which indicates to the runtime to wait indefinitely.

   shell$ mpirun -mca snapc_full_max_wait_time 60
 - Try cleaning out the /tmp directory on all of the nodes, maybe  
this has something to do with disks being full (though usually we  
would see other symptoms).
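
For example, on each node that would be something like (a sketch only;
double-check the paths and make sure no jobs are running before
deleting anything):

  rm -rf /tmp/openmpi-sessions-* /tmp/opal* /tmp/ompi*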


If that doesn't help, can you send me the config.log from your build  
of Open MPI. If those do not work, I would suspect that something in  
the configure of Open MPI might have gone wrong.


-- Josh







Does anyone have a hint on how to fix this problem?

Thanks,
Hui Jin

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] problem using openmpi with DMTCP

2009-11-06 Thread Josh Hursey

(Sorry for the excessive delay in replying)

I do not have any experience with the DMTCP project, so I can only  
speculate on what might be going on here. If you are using DMTCP to  
transparently checkpoint Open MPI you will need to make sure that you  
are not using any interconnect other than TCP.


If you are building an OPAL CRS component for DMTCP (actually you  
probably want their MTCP project which is just the local checkpoint/ 
restart service), then what you might be seeing are the TCP sockets  
that are left open across a checkpoint operation. As an optimization  
for checkpoint->continue we leave sockets open when we checkpoint.  
Since most checkpoint/restart services will skip over the socket fd  
(since they are not supported) and take the checkpoint we leave them  
open, and close them only on restart. I suspect that DMTCP is erroring  
out since it is trying to do something else with those fds.


You may want to try just using the MTCP project, or ask for a way to  
shut off the socket negotiation and just ignore the socket fds.


Let me know how it goes.

-- Josh

On Sep 28, 2009, at 9:55 AM, Kritiraj Sajadah wrote:


Dear All,
  I am trying to integrate DMTCP with openmpi. If I run a C
application, it works fine. But when I execute the program using
mpirun, it checkpoints the application but gives an error when
restarting the application.


#
[31007] WARNING at connection.cpp:303 in restore; REASON='JWARNING 
((_sockDomain == AF_INET || _sockDomain == AF_UNIX ) && _sockType ==  
SOCK_STREAM) failed'

id() = 2ab3f248-30933-4ac0d75a(99007)
_sockDomain = 10
_sockType = 1
_sockProtocol = 0
Message: socket type not yet [fully] supported
[31007] WARNING at connection.cpp:303 in restore; REASON='JWARNING 
((_sockDomain == AF_INET || _sockDomain == AF_UNIX ) && _sockType ==  
SOCK_STREAM) failed'

id() = 2ab3f248-30943-4ac0d75c(99007)
_sockDomain = 10
_sockType = 1
_sockProtocol = 0
Message: socket type not yet [fully] supported
[31013] WARNING at connection.cpp:87 in restartDup2; REASON='JWARNING 
(_real_dup2 ( oldFd, fd ) == fd) failed'

oldFd = 537
fd = 1
(strerror((*__errno_location ( = Bad file descriptor
[31013] WARNING at connectionmanager.cpp:627 in closeAll;  
REASON='JWARNING(_real_close ( i->second ) ==0) failed'

i->second = 537
(strerror((*__errno_location ( = Bad file descriptor
[31015] WARNING at connectionmanager.cpp:627 in closeAll;  
REASON='JWARNING(_real_close ( i->second ) ==0) failed'

i->second = 537
(strerror((*__errno_location ( = Bad file descriptor
[31017] WARNING at connectionmanager.cpp:627 in closeAll;  
REASON='JWARNING(_real_close ( i->second ) ==0) failed'

i->second = 537
(strerror((*__errno_location ( = Bad file descriptor
[31007] WARNING at connectionmanager.cpp:627 in closeAll;  
REASON='JWARNING(_real_close ( i->second ) ==0) failed'

i->second = 537
(strerror((*__errno_location ( = Bad file descriptor
MTCP: mtcp_restart_nolibc: mapping current version of /usr/lib/gconv/ 
gconv-modules.cache into memory;

 _not_ file as it existed at time of checkpoint.
 Change mtcp_restart_nolibc.c:634 and re-compile, if you want  
different behavior.
[31015] ERROR at connection.cpp:372 in restoreOptions;  
REASON='JASSERT(ret == 0) failed'

(strerror((*__errno_location ( = Invalid argument
fds[0] = 6
opt->first = 26
opt->second.size() = 4
Message: restoring setsockopt failed
Terminating...
#

Any suggestions is very welcomed.

regards,

Raj



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



