Hi,

Building with --disable-pmix-dstore, my program now works fine inside
and outside the containers. When/if I have the time, I will investigate
the "shared memory segments" aspects.

Thanks,
John

On Fri, 2017-05-26 at 15:23 -0700, r...@open-mpi.org wrote:
You can also get around it by configuring OMPI with “--disable-pmix-dstore”


On May 26, 2017, at 3:02 PM, Howard Pritchard <hpprit...@gmail.com> wrote:
Hi John,

In the 2.1.x release stream a shared memory capability was introduced into
the PMIx component.

I know nothing about LXC containers, but it looks to me like there's some
issue when PMIx tries to create these shared memory segments.  I'd check to
see if there's something about your container configuration that is
preventing the creation of shared memory segments.
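
One way to act on the suggestion above is a small probe, run both inside and outside the container, that tries to create, size, and map a file-backed shared segment. This is only a sketch under assumptions: the helper name `can_create_shared_segment`, the 4 MiB size, and the two directories probed (the system temp directory and /dev/shm) are illustrative choices, and the thread does not say where or how PMIx actually places its segments.

```python
import mmap
import os
import tempfile


def can_create_shared_segment(directory, size=4 * 1024 * 1024):
    """Try to create, size, and map a file-backed shared segment.

    Hypothetical helper for illustration only; it does not reproduce
    what PMIx does internally. Returns (True, "") on success, or
    (False, reason) on failure.
    """
    try:
        fd, path = tempfile.mkstemp(prefix="shm_probe_", dir=directory)
    except OSError as exc:
        return False, f"cannot create file: {exc}"
    try:
        # Reserve the space up front where possible, so that a full or
        # restricted filesystem is reported as an error here.
        if hasattr(os, "posix_fallocate"):
            os.posix_fallocate(fd, 0, size)
        else:
            os.ftruncate(fd, size)
        with mmap.mmap(fd, size, flags=mmap.MAP_SHARED,
                       prot=mmap.PROT_READ | mmap.PROT_WRITE) as seg:
            seg[:5] = b"probe"
            ok = seg[:5] == b"probe"
        return ok, "" if ok else "read-back mismatch"
    except (OSError, ValueError) as exc:
        return False, f"cannot size or map segment: {exc}"
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Assumed candidate locations; adjust to the container's layout.
    for d in (tempfile.gettempdir(), "/dev/shm"):
        ok, reason = can_create_shared_segment(d)
        print(f"{d}: {'ok' if ok else 'FAILED: ' + reason}")
```

If the probe succeeds on the host but fails inside the container, that points at the container's filesystem or resource limits rather than at the MPI program itself.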

Howard


2017-05-26 15:18 GMT-06:00 John Marshall <john.marsh...@ssc-spc.gc.ca>:
Hi,

I have built Open MPI 2.1.1 with hpcx-1.8 and tried to run some MPI code under
Ubuntu 14.04 and LXC (1.x), but I get the following:

[ib7-bc2oo42-be10p16.science.gc.ca:16035] PMIX ERROR: OUT-OF-RESOURCE in file src/dstore/pmix_esh.c at line 1651
[ib7-bc2oo42-be10p16.science.gc.ca:16035] PMIX ERROR: OUT-OF-RESOURCE in file src/dstore/pmix_esh.c at line 1751
[ib7-bc2oo42-be10p16.science.gc.ca:16035] PMIX ERROR: OUT-OF-RESOURCE in file src/dstore/pmix_esh.c at line 1114
[ib7-bc2oo42-be10p16.science.gc.ca:16035] PMIX ERROR: OUT-OF-RESOURCE in file src/common/pmix_jobdata.c at line 93
[ib7-bc2oo42-be10p16.science.gc.ca:16035] PMIX ERROR: OUT-OF-RESOURCE in file src/common/pmix_jobdata.c at line 333
[ib7-bc2oo42-be10p16.science.gc.ca:16035] PMIX ERROR: OUT-OF-RESOURCE in file src/server/pmix_server.c at line 606

I do not get the same outside of the LXC container and my code runs fine.

I've looked for more info on these messages but could not find anything
helpful. Are these messages indicative of something missing in, or some
incompatibility with, the container?

When I build using 2.0.2, I do not have a problem running inside or outside of
the container.

Thanks,
John

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

