I found the problem, Howard. It has nothing to do with the Cray; it is a 
selection issue in the state framework.
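
For readers following the quoted log: the "Pack data mismatch" line is what a receiver reports when it unpacks a typed buffer and finds a different type tag than the one it asked for. The sketch below only illustrates that failure mode; it is not Open MPI code, and the class name, method names, and tag values are all hypothetical (they are not OPAL's real type codes).

```python
import struct

# Hypothetical type tags for illustration only; not OPAL's real type codes.
TAG_INT32 = 1
TAG_STRING = 2


class PackMismatch(Exception):
    """Raised when the next tag in the buffer is not the one requested."""


class TypedBuffer:
    """Toy buffer that stores a one-byte type tag in front of every value."""

    def __init__(self):
        self._data = bytearray()
        self._pos = 0

    def pack_int32(self, value):
        # Network byte order: 1-byte tag followed by a 4-byte signed integer.
        self._data += struct.pack("!Bi", TAG_INT32, value)

    def pack_string(self, text):
        raw = text.encode("utf-8")
        # 1-byte tag, 4-byte length, then the encoded bytes.
        self._data += struct.pack("!BI", TAG_STRING, len(raw)) + raw

    def _expect(self, tag):
        got = self._data[self._pos]
        if got != tag:
            raise PackMismatch(f"got type {got} when expecting type {tag}")
        self._pos += 1

    def unpack_int32(self):
        self._expect(TAG_INT32)
        (value,) = struct.unpack_from("!i", self._data, self._pos)
        self._pos += 4
        return value

    def unpack_string(self):
        self._expect(TAG_STRING)
        (length,) = struct.unpack_from("!I", self._data, self._pos)
        self._pos += 4
        raw = bytes(self._data[self._pos:self._pos + length])
        self._pos += length
        return raw.decode("utf-8")


if __name__ == "__main__":
    buf = TypedBuffer()
    buf.pack_string("orted")   # sender packs a string first
    buf.pack_int32(7)
    try:
        buf.unpack_int32()     # receiver wrongly expects an integer first
    except PackMismatch as exc:
        print(exc)             # got type 2 when expecting type 1
```

In this toy model the error appears whenever sender and receiver disagree about the order or type of the packed fields, which is the general shape of the message in the log below.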


> On Aug 21, 2015, at 7:37 AM, Howard Pritchard <hpprit...@gmail.com> wrote:
> 
> I will check if I can reproduce on NERSC systems.
> 
> ----------
> 
> sent from my smart phone so no good type.
> 
> Howard
> 
> On Aug 21, 2015 7:51 AM, "Ralph Castain" <r...@open-mpi.org> wrote:
> I’ll take a look at it
> 
> > On Aug 20, 2015, at 11:34 PM, Mark Santcroos <mark.santcr...@rutgers.edu> wrote:
> >
> > Hi all,
> >
> > I see the errors below on startup of orte-dvm on a Cray XE/XK hybrid.
> > Didn't track the commit that caused it yet, but maybe somebody has a clue 
> > from the error already.
> > Last known to work was on July 14. The 2.x branch works fine.
> >
> > Please let me know if this should be a ticket.
> >
> > Thanks
> >
> > Mark
> >
> >
> > marksant@nid25254:~> orte-dvm
> > VMURI: 2210136064.0;usock;tcp://10.128.99.109:52334
> > [nid25254:32107] OPAL dss:unpack: got type 110 when expecting type 9
> > [nid25254:32107] [[33724,0],0] ORTE_ERROR_LOG: Pack data mismatch in file 
> > ../../../../orte/mca/odls/base/odls_base_default_fns.c at line 261
> > marksant@nid25254:~> orte-dvm -d
> > [nid25254:32172] procdir: /tmp/openmpi-sessions-45504@nid25254_0/33659/0/0
> > [nid25254:32172] jobdir: /tmp/openmpi-sessions-45504@nid25254_0/33659/0
> > [nid25254:32172] top: openmpi-sessions-45504@nid25254_0
> > [nid25254:32172] tmp: /tmp
> > [nid25254:32172] sess_dir_cleanup: job session dir does not exist
> > [nid25254:32172] procdir: /tmp/openmpi-sessions-45504@nid25254_0/33659/0/0
> > [nid25254:32172] jobdir: /tmp/openmpi-sessions-45504@nid25254_0/33659/0
> > [nid25254:32172] top: openmpi-sessions-45504@nid25254_0
> > [nid25254:32172] tmp: /tmp
> > VMURI: 2205876224.0;usock;tcp://10.128.99.109:39208
> > [nid25254:32172] plm:alps: final top-level argv:
> > [nid25254:32172] plm:alps:     aprun -n 1 -N 1 -cc none -e 
> > PMI_NO_PREINITIALIZE=1 -e PMI_NO_FORK=1 -L 21959 orted -mca orte_debug 1 
> > --hnp-topo-sig 4N:2S:4L3:16L2:32L1:32C:32H:x86_64 -mca ess_base_jobid 
> > 2205876224 -mca ess_base_vpid 1 -mca ess_base_num_procs 2 -mca orte_hnp_uri 
> > 2205876224.0;usock;tcp://10.128.99.109:39208
> > [nid25254:32172] plm:alps: Set 
> > prefix:/u/sciteam/marksant/openmpi/installed/HEAD
> > [nid25254:32172] plm:alps: reset PATH: 
> > /u/sciteam/marksant/openmpi/installed/HEAD/bin:/u/sciteam/marksant/openmpi/installed/HEAD/bin:/u/sciteam/marksant/openmpi/tools/bin:/opt/cray/pmi/5.0.6-1.0000.10439.140.3.gem/bin:/opt/gcc/4.8.2/bin:/sw/xe/darshan/2.3.0/darshan-2.3.0_cle52/bin:/sw/admin/scripts:/sw/user/scripts:/sw/xe/altd/bin:/opt/moab/8.1/bin:/opt/moab/8.1/sbin:/opt/torque/5.0.2/sbin:/opt/torque/5.0.2/bin:/opt/cray/mpt/7.2.0/gni/bin:/opt/cray/craype/2.3.0/bin:/opt/cray/llm/default/bin:/opt/cray/llm/default/etc:/opt/cray/xpmem/0.1-2.0502.55507.3.2.gem/bin:/opt/cray/dmapp/7.0.1-1.0502.9501.5.211.gem/bin:/opt/cray/ugni/5.0-1.0502.9685.4.24.gem/bin:/opt/cray/udreg/2.3.2-1.0502.9275.1.25.gem/bin:/opt/cray/lustre-cray_gem_s/2.5_3.0.101_0.31.1_1.0502.8394.15.1-1.0502.19859.16.1/sbin:/opt/cray/lustre-cray_gem_s/2.5_3.0.101_0.31.1_1.0502.8394.15.1-1.0502.19859.16.1/bin:/opt/cray/alps/5.2.1-2.0502.9649.23.1.gem/sbin:/opt/cray/alps/5.2.1-2.0502.9649.23.1.gem/bin:/opt/cray/sdb/1.0-
> > 1.0502.55976.5.27.gem/bin:/opt/cray/nodestat/2.2-1.0502.53712.3.109.gem/bin:/opt/modules/3.2.10.3/bin:/u/sciteam/marksant/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/usr/lib/qt3/bin:/opt/cray/bin
> > [nid25254:32172] plm:alps: reset LD_LIBRARY_PATH: 
> > /u/sciteam/marksant/openmpi/installed/HEAD/lib:/u/sciteam/marksant/openmpi/installed/HEAD/lib:/opt/gcc/mpc/0.8.1/lib:/opt/gcc/mpfr/2.4.2/lib:/opt/gcc/gmp/4.3.2/lib:/opt/gcc/4.8.2/snos/lib64:/sw/xe/darshan/2.3.0/darshan-2.3.0_cle52/lib
> > [nid21959:01177] procdir: /tmp/openmpi-sessions-45504@nid21959_0/33659/0/1
> > [nid21959:01177] jobdir: /tmp/openmpi-sessions-45504@nid21959_0/33659/0
> > [nid21959:01177] top: openmpi-sessions-45504@nid21959_0
> > [nid21959:01177] tmp: /tmp
> > [nid21959:01177] sess_dir_cleanup: job session dir does not exist
> > [nid21959:01177] procdir: /tmp/openmpi-sessions-45504@nid21959_0/33659/0/1
> > [nid21959:01177] jobdir: /tmp/openmpi-sessions-45504@nid21959_0/33659/0
> > [nid21959:01177] top: openmpi-sessions-45504@nid21959_0
> > [nid21959:01177] tmp: /tmp
> > [nid25254:32172] [[33659,0],0] orted:comm:process_commands() Processing 
> > Command: ORTE_DAEMON_ADD_LOCAL_PROCS
> > [nid25254:32172] OPAL dss:unpack: got type 110 when expecting type 9
> > [nid25254:32172] [[33659,0],0] ORTE_ERROR_LOG: Pack data mismatch in file 
> > ../../../../orte/mca/odls/base/odls_base_default_fns.c at line 261
> > [nid25254:32172] [[33659,0],0] orted:comm:add_procs failed to launch on 
> > error Pack data mismatch
> > [nid25254:32172] [[33659,0],0] orted:comm:process_commands() Processing 
> > Command: ORTE_DAEMON_EXIT_CMD
> > [nid21959:01177] 
> > [[33659,0],1]:../../../../../orte/mca/errmgr/default_orted/errmgr_default_orted.c(251)
> >  updating exit status to 1
> > [nid25254:32172] sess_dir_finalize: proc session dir does not exist
> > [nid25254:32172] sess_dir_cleanup: job session dir does not exist
> > exiting with status 0
> > marksant@nid25254:~> [nid21959:01177] sess_dir_finalize: proc session dir 
> > does not exist
> > [nid21959:01177] sess_dir_cleanup: job session dir does not exist
> > exiting with status 1
> > Application 25938733 exit codes: 1
> > Application 25938733 resources: utime ~0s, stime ~1s, Rss ~21456, inblocks 
> > ~4629, outblocks ~104
> >
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> > Link to this post: http://www.open-mpi.org/community/lists/devel/2015/08/17781.php
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: http://www.open-mpi.org/community/lists/devel/2015/08/17782.php
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/08/17783.php
