Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-03 Thread r...@open-mpi.org
Can you try a newer version of OMPI, say the 3.0.0 release? Just curious to 
know if we perhaps “fixed” something relevant.
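
Before retrying, it can also be worth confirming the build under test actually picked up TORQUE (tm) support. A rough check, assuming `ompi_info` from that build is on the PATH (the grep pattern is an assumption about its output format):

```shell
# Hedged sketch: a --with-tm build is expected to list "tm" plm/ras
# components in ompi_info's component listing.
if command -v ompi_info >/dev/null 2>&1; then
  ompi_info | grep -i ' tm '
else
  echo "ompi_info not on PATH"
fi
```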


> On Oct 3, 2017, at 5:33 PM, Anthony Thyssen  wrote:
> 
> FYI...
> 
> The problem is discussed further in 
> 
> Redhat Bugzilla: Bug 1321154 - numa enabled torque don't work
>https://bugzilla.redhat.com/show_bug.cgi?id=1321154 
> 
> 
> I'd seen this previously, as it required me to add "num_node_boards=1" to
> each node in /var/lib/torque/server_priv/nodes to get torque to at least
> work. Specifically, by munging $PBS_NODES (which comes out correct) into a
> host list containing the correct "slot=" counts. But of course, now that I
> have compiled OpenMPI using "--with-tm", that should not have been needed,
> as it is now ignored by OpenMPI in a Torque-PBS environment.
> 
> However, ever since "NUMA" support was added to the Torque RPMs, it has
> caused the current problems, which are still continuing. The last action is
> a new EPEL "test" version (August 2017), which I will try shortly.
> 
> Thank you for your help, though I am still open to suggestions for a
> replacement.
> 
>   Anthony Thyssen ( System Programmer ) >
>  --
>Encryption... is a powerful defensive weapon for free people.
>It offers a technical guarantee of privacy, regardless of who is
>running the government... It's hard to think of a more powerful,
>less dangerous tool for liberty.--  Esther Dyson
>  --
> 
> 
> 
> On Wed, Oct 4, 2017 at 9:02 AM, Anthony Thyssen  > wrote:
> Thank you Gilles.  At least I now have something to follow through with.
> 
> As a FYI, the torque is the pre-built version from the Redhat Extras (EPEL) 
> archive.
> torque-4.2.10-10.el7.x86_64
> 
> Normally pre-built packages have no problems, but not in this case.
> 
> 
> 
> 
> On Tue, Oct 3, 2017 at 3:39 PM, Gilles Gouaillardet  > wrote:
> Anthony,
> 
> 
> we had a similar issue reported some time ago (e.g. Open MPI ignores torque
> allocation),
> 
> and after quite some troubleshooting, we ended up with the same behavior 
> (e.g. pbsdsh is not working as expected).
> 
> see https://www.mail-archive.com/users@lists.open-mpi.org/msg29952.html 
>  for the 
> last email.
> 
> 
> from an Open MPI point of view, I would consider the root cause to be your
> torque install.
> 
> this case was reported at 
> http://www.clusterresources.com/pipermail/torqueusers/2016-September/018858.html
>  
> 
> 
> and no conclusion was reached.
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> 
> On 10/3/2017 2:02 PM, Anthony Thyssen wrote:
> The stdin and stdout are saved to separate channels.
> 
> It is interesting that the output from pbsdsh is node21.emperor 5 times, even
> though $PBS_NODES lists the 5 individual nodes.
> 
> Attached are the two compressed files, as well as the pbs_hello batch used.
> 
> Anthony Thyssen ( System Programmer )   >>
>  --
>   There are two types of encryption:
> One that will prevent your sister from reading your diary, and
> One that will prevent your government.   -- Bruce Schneier
>  --
> 
> 
> 
> 
> On Tue, Oct 3, 2017 at 2:39 PM, Gilles Gouaillardet    >> wrote:
> 
> Anthony,
> 
> 
> in your script, can you
> 
> 
> set -x
> 
> env
> 
> pbsdsh hostname
> 
> mpirun --display-map --display-allocation --mca ess_base_verbose
> 10 --mca plm_base_verbose 10 --mca ras_base_verbose 10 hostname
> 
> 
> and then compress and send the output ?
> 
> 
> Cheers,
> 
> 
> Gilles
> 
> 
> On 10/3/2017 1:19 PM, Anthony Thyssen wrote:
> 
> I noticed that too.  Though the submitting host for torque is
> a different host (main head node, "shrek"),  "node21" is the
> host that torque runs the batch script (and the mpirun
> command) it being the first node in the "dualcore" resource group.
> 
> Adding option...
> 
> It fixed the hostname in the allocation map, though had no
> effect on the outcome.  The allocation is still simply ignored.
> 
> 

Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-03 Thread Anthony Thyssen
FYI...

The problem is discussed further in

Redhat Bugzilla: Bug 1321154 - numa enabled torque don't work
   https://bugzilla.redhat.com/show_bug.cgi?id=1321154

I'd seen this previously, as it required me to add "num_node_boards=1" to each
node in /var/lib/torque/server_priv/nodes to get torque to at least work.
Specifically, by munging $PBS_NODES (which comes out correct) into a host list
containing the correct "slot=" counts. But of course, now that I have compiled
OpenMPI using "--with-tm", that should not have been needed, as it is now
ignored by OpenMPI in a Torque-PBS environment.
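
For context, the manual workaround described above can be sketched as follows. The node names are illustrative, and a real job script would read "$PBS_NODEFILE" instead of the example file:

```shell
# Collapse per-CPU entries of a TORQUE nodefile into an Open MPI
# hostfile with "slots=" counts (illustrative node names).
cat > nodefile.example <<'EOF'
node21.emperor
node21.emperor
node22.emperor
node22.emperor
EOF
sort nodefile.example | uniq -c \
  | awk '{printf "%s slots=%d\n", $2, $1}' > hostfile.example
cat hostfile.example
# prints:
# node21.emperor slots=2
# node22.emperor slots=2
```

With working tm support, none of this should be necessary, since mpirun reads the allocation directly from TORQUE.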

However, ever since "NUMA" support was added to the Torque RPMs, it has caused
the current problems, which are still continuing. The last action is a new
EPEL "test" version (August 2017), which I will try shortly.

Thank you for your help, though I am still open to suggestions for a
replacement.

  Anthony Thyssen ( System Programmer )
 --
   Encryption... is a powerful defensive weapon for free people.
   It offers a technical guarantee of privacy, regardless of who is
   running the government... It's hard to think of a more powerful,
   less dangerous tool for liberty.--  Esther Dyson
 --



On Wed, Oct 4, 2017 at 9:02 AM, Anthony Thyssen 
wrote:

> Thank you Gilles.  At least I now have something to follow through with.
>
> As a FYI, the torque is the pre-built version from the Redhat Extras
> (EPEL) archive.
> torque-4.2.10-10.el7.x86_64
>
> Normally pre-built packages have no problems, but not in this case.
>
>
>
>
> On Tue, Oct 3, 2017 at 3:39 PM, Gilles Gouaillardet 
> wrote:
>
>> Anthony,
>>
>>
>> we had a similar issue reported some time ago (e.g. Open MPI ignores
>> torque allocation),
>>
>> and after quite some troubleshooting, we ended up with the same behavior
>> (e.g. pbsdsh is not working as expected).
>>
>> see https://www.mail-archive.com/users@lists.open-mpi.org/msg29952.html
>> for the last email.
>>
>>
>> from an Open MPI point of view, I would consider the root cause to be
>> your torque install.
>>
>> this case was reported at http://www.clusterresources.com/pipermail/torqueusers/2016-September/018858.html
>>
>> and no conclusion was reached.
>>
>>
>> Cheers,
>>
>>
>> Gilles
>>
>>
>> On 10/3/2017 2:02 PM, Anthony Thyssen wrote:
>>
>>> The stdin and stdout are saved to separate channels.
>>>
>>> It is interesting that the output from pbsdsh is node21.emperor 5 times,
>>> even though $PBS_NODES is the 5 individual nodes.
>>>
>>> Attached are the two compressed files, as well as the pbs_hello batch
>>> used.
>>>
>>> Anthony Thyssen ( System Programmer )>> >
>>>  ---
>>> ---
>>>   There are two types of encryption:
>>> One that will prevent your sister from reading your diary, and
>>> One that will prevent your government.   -- Bruce Schneier
>>>  ---
>>> ---
>>>
>>>
>>>
>>>
>>> On Tue, Oct 3, 2017 at 2:39 PM, Gilles Gouaillardet >> > wrote:
>>>
>>> Anthony,
>>>
>>>
>>> in your script, can you
>>>
>>>
>>> set -x
>>>
>>> env
>>>
>>> pbsdsh hostname
>>>
>>> mpirun --display-map --display-allocation --mca ess_base_verbose
>>> 10 --mca plm_base_verbose 10 --mca ras_base_verbose 10 hostname
>>>
>>>
>>> and then compress and send the output ?
>>>
>>>
>>> Cheers,
>>>
>>>
>>> Gilles
>>>
>>>
>>> On 10/3/2017 1:19 PM, Anthony Thyssen wrote:
>>>
>>> I noticed that too.  Though the submitting host for torque is
>>> a different host (main head node, "shrek"),  "node21" is the
>>> host that torque runs the batch script (and the mpirun
>>> command) it being the first node in the "dualcore" resource
>>> group.
>>>
>>> Adding option...
>>>
>>> It fixed the hostname in the allocation map, though had no
>>> effect on the outcome.  The allocation is still simply ignored.
>>>
>>> ===8<
>>> PBS Job Number   9000
>>> PBS batch run on node21.emperor
>>> Time it was started  2017-10-03_14:11:20
>>> Current Directory/net/shrek.emperor/home/shrek/anthony
>>> Submitted work dir   /home/shrek/anthony/mpi-pbs
>>> Number of Nodes  5
>>> Nodefile List   /var/lib/torque/aux//9000.shrek.emperor
>>> node21.emperor
>>> node25.emperor
>>> node24.emperor
>>> node23.emperor
>>> node22.emperor
>>> ---
>>>
>>> ==  

Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-03 Thread Anthony Thyssen
Thank you Gilles.  At least I now have something to follow through with.

As a FYI, the torque is the pre-built version from the Redhat Extras (EPEL)
archive.
torque-4.2.10-10.el7.x86_64

Normally pre-built packages have no problems, but not in this case.




On Tue, Oct 3, 2017 at 3:39 PM, Gilles Gouaillardet 
wrote:

> Anthony,
>
>
> we had a similar issue reported some time ago (e.g. Open MPI ignores
> torque allocation),
>
> and after quite some troubleshooting, we ended up with the same behavior
> (e.g. pbsdsh is not working as expected).
>
> see https://www.mail-archive.com/users@lists.open-mpi.org/msg29952.html
> for the last email.
>
>
> from an Open MPI point of view, I would consider the root cause to be
> your torque install.
>
> this case was reported at http://www.clusterresources.com/pipermail/torqueusers/2016-September/018858.html
>
> and no conclusion was reached.
>
>
> Cheers,
>
>
> Gilles
>
>
> On 10/3/2017 2:02 PM, Anthony Thyssen wrote:
>
>> The stdin and stdout are saved to separate channels.
>>
>> It is interesting that the output from pbsdsh is node21.emperor 5 times,
>> even though $PBS_NODES lists the 5 individual nodes.
>>
>> Attached are the two compressed files, as well as the pbs_hello batch
>> used.
>>
>> Anthony Thyssen ( System Programmer )> >
>>  ---
>> ---
>>   There are two types of encryption:
>> One that will prevent your sister from reading your diary, and
>> One that will prevent your government.   -- Bruce Schneier
>>  ---
>> ---
>>
>>
>>
>>
>> On Tue, Oct 3, 2017 at 2:39 PM, Gilles Gouaillardet > > wrote:
>>
>> Anthony,
>>
>>
>> in your script, can you
>>
>>
>> set -x
>>
>> env
>>
>> pbsdsh hostname
>>
>> mpirun --display-map --display-allocation --mca ess_base_verbose
>> 10 --mca plm_base_verbose 10 --mca ras_base_verbose 10 hostname
>>
>>
>> and then compress and send the output ?
>>
>>
>> Cheers,
>>
>>
>> Gilles
>>
>>
>> On 10/3/2017 1:19 PM, Anthony Thyssen wrote:
>>
>> I noticed that too.  Though the submitting host for torque is
>> a different host (main head node, "shrek"),  "node21" is the
>> host that torque runs the batch script (and the mpirun
>> command) it being the first node in the "dualcore" resource group.
>>
>> Adding option...
>>
>> It fixed the hostname in the allocation map, though had no
>> effect on the outcome.  The allocation is still simply ignored.
>>
>> ===8<
>> PBS Job Number   9000
>> PBS batch run on node21.emperor
>> Time it was started  2017-10-03_14:11:20
>> Current Directory/net/shrek.emperor/home/shrek/anthony
>> Submitted work dir   /home/shrek/anthony/mpi-pbs
>> Number of Nodes  5
>> Nodefile List   /var/lib/torque/aux//9000.shrek.emperor
>> node21.emperor
>> node25.emperor
>> node24.emperor
>> node23.emperor
>> node22.emperor
>> ---
>>
>> ==  ALLOCATED NODES  ==
>> node21.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
>> node25.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
>> node24.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
>> node23.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
>> node22.emperor: slots=1 max_slots=0 slots_inuse=0 state=UP
>> =
>> node21.emperor
>> node21.emperor
>> node21.emperor
>> node21.emperor
>> node21.emperor
>> ===8>
>>
>>   Anthony Thyssen ( System Programmer )
>> 
>> > >>
>>  ---
>> ---
>>The equivalent of an armoured car should always be used to
>>protect any secret kept in a cardboard box.
>>-- Anthony Thyssen, On the use of Encryption
>>  ---
>> ---
>>
>>
>>
>>
>> ___
>> users mailing list
>> users@lists.open-mpi.org 
>> https://lists.open-mpi.org/mailman/listinfo/users
>> 
>>
>>

Re: [OMPI users] OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread Jim Maas
Not sure exactly how, but I am back up and running with all cores. Thanks to
all. Hopefully the new binary for Ubuntu with open-mpi 3.0.0 won't be too far
away.

Very Best,
J

On 3 October 2017 at 21:06, r...@open-mpi.org  wrote:

> You can add it to the default MCA param file, if you want -
> /etc/openmpi-mca-params.conf
>
> On Oct 3, 2017, at 12:44 PM, Jim Maas  wrote:
>
> Thanks RHC, where do I put that so it will be in the environment?
>
> J
>
> On 3 October 2017 at 16:01, r...@open-mpi.org  wrote:
>
>> As Gilles said, we default to slots = cores, not HTs. If you want to
>> treat HTs as independent cpus, then you need to add
>> OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1 in your environment.
>>
>> On Oct 3, 2017, at 7:27 AM, Jim Maas  wrote:
>>
>> Tried this and got this error, and slots are available, nothing else is
>> running.
>>
>> > cl <- startMPIcluster(count=7)
>> 
>> --
>> There are not enough slots available in the system to satisfy the 7 slots
>> that were requested by the application:
>>   /usr/local/lib/R/bin/Rscript
>>
>> Either request fewer slots for your application, or make more slots
>> available
>> for use.
>> 
>> --
>> Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves =
>> count,  :
>>   MPI_ERR_SPAWN: could not spawn processes
>> >
>>
>> On 3 October 2017 at 15:07, Gilles Gouaillardet > d...@gmail.com> wrote:
>>
>>> Thanks, i will have a look at it.
>>>
>>> By default, a slot is a core, so there are 6 slots on your system.
>>> Could your app spawn 6 procs on top of the initial proc ? That would be
>>> 7 slots and there are only 6.
>>> What if you ask 5 slots only ?
>>>
>>> With some parameters i do not know off hand, you could either
>>> oversubscribe or use hyperthreads as slots. In both cases, 7 slots would be
>>> available.
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> Jim Maas  wrote:
>>> Thanks Gilles, relative noob here at this level, apologies if
>>> nonsensical!
>>>
>>> I removed previous versions of open mpi which were compiled from source
>>> using sudo make uninstall ...
>>> downloaded new open-mpi 3.0.0 in tar.gz
>>> configure --disable-dlopen
>>> sudo make install
>>>
>>>
>>> then ran sudo ldconfig
>>>
>>> updated R, downloaded R-3.4.2.tar.gz
>>> ./configure
>>> sudo make install
>>>
>>>
>>> Then run R from sudo
>>>
>>> sudo R
>>> once running
>>> install.packages("Rmpi")
>>> install.packages("doMPI")
>>>
>>> both of these load and test fine during install
>>>
>>> Then from R run
>>>
>>> rm(list=ls(all=TRUE))
>>> library(doMPI)
>>>
>>> ## load MPI cluster
>>> cl <- startMPIcluster(count=6)
>>>
>>>
>>> At this point it throws the error, doesn't find any of the slots.
>>>
>>> There is a precompiled version of Rmpi that installs an older version of
>>> open-mpi directly from Ubuntu, but I think the mpi version is an older one
>>> so I wanted to try using the new version.
>>>
>>>
>>> I use this 6-core (12-thread) machine as a test bed before uploading to a
>>> cluster. It is Ubuntu 16.04 Linux; the lstopo pdf is attached.
>>>
>>> Thanks,
>>>
>>> J
>>>
>>>
>>> On 3 October 2017 at 14:06, Gilles Gouaillardet >> d...@gmail.com> wrote:
>>>
 Hi Jim,

 can you please provide minimal instructions on how to reproduce the
 issue ?
 we know Open MPI, but I am afraid few or none of us know about Rmpi or
 doMPI.
 once you explain how to download and build these, and how to run the
 failing test,
 we'll be able to investigate that.

 also, can you describe your environment ?
 I assume one Ubuntu machine; can you please run
 lstopo
 on it and post the output ?

 did you use to have some specific settings in the system-wide conf
 file (e.g. /.../etc/openmpi-mca-params.conf) ?
 if yes, can you post these, the syntax might have changed in 3.0.0

 Cheers,

 Gilles

 On Tue, Oct 3, 2017 at 7:34 PM, Jim Maas  wrote:
 > I've used this for years, just updated open-mpi to 3.0.0 and reloaded
 R,
 > have reinstalled doMPI and thus Rmpi but when I try to use
 startMPICluster,
 > asking for 6 slots (there are 12 on this machine) I get this error.
 Where
 > can I start to debug it?
 >
 > Thanks
 > J
 > 
 --
 > There are not enough slots available in the system to satisfy the 6
 slots
 > that were requested by the application:
 >   /usr/lib/R/bin/Rscript
 >
 > Either request fewer slots for your application, or make more slots
 > available
 > for use.
 > 
 --
 > Error in mpi.comm.spawn(slave = rscript, slavearg = args, 

Re: [OMPI users] OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread r...@open-mpi.org
You can add it to the default MCA param file, if you want - 
/etc/openmpi-mca-params.conf
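
The file takes one parameter per line in "name = value" form; a minimal sketch, using the hwthreads parameter name given elsewhere in this thread:

```
# /etc/openmpi-mca-params.conf
hwloc_base_use_hwthreads_as_cpus = 1
```

Note the file uses the bare parameter name, without the `OMPI_MCA_` prefix that the environment-variable form requires.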

> On Oct 3, 2017, at 12:44 PM, Jim Maas  wrote:
> 
> Thanks RHC, where do I put that so it will be in the environment?
> 
> J
> 
> On 3 October 2017 at 16:01, r...@open-mpi.org  
> > wrote:
> As Gilles said, we default to slots = cores, not HTs. If you want to treat 
> HTs as independent cpus, then you need to add 
> OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1 in your environment.
> 
>> On Oct 3, 2017, at 7:27 AM, Jim Maas > > wrote:
>> 
>> Tried this and got this error, and slots are available, nothing else is 
>> running.
>> 
>> > cl <- startMPIcluster(count=7)
>> --
>> There are not enough slots available in the system to satisfy the 7 slots
>> that were requested by the application:
>>   /usr/local/lib/R/bin/Rscript
>> 
>> Either request fewer slots for your application, or make more slots available
>> for use.
>> --
>> Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves = count,  
>> : 
>>   MPI_ERR_SPAWN: could not spawn processes
>> > 
>> 
>> On 3 October 2017 at 15:07, Gilles Gouaillardet 
>> > wrote:
>> Thanks, i will have a look at it.
>> 
>> By default, a slot is a core, so there are 6 slots on your system.
>> Could your app spawn 6 procs on top of the initial proc ? That would be 7 
>> slots and there are only 6.
>> What if you ask 5 slots only ?
>> 
>> With some parameters i do not know off hand, you could either oversubscribe 
>> or use hyperthreads as slots. In both cases, 7 slots would be available.
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> Jim Maas > wrote:
>> Thanks Gilles, relative noob here at this level, apologies if nonsensical!
>> 
>> I removed previous versions of open mpi which were compiled from source 
>> using sudo make uninstall ...
>> downloaded new open-mpi 3.0.0 in tar.gz
>> configure --disable-dlopen
>> sudo make install
>> 
>> 
>> then ran sudo ldconfig
>> 
>> updated R, downloaded R-3.4.2.tar.gz
>> ./configure
>> sudo make install
>> 
>> 
>> Then run R from sudo
>> 
>> sudo R
>> once running 
>> install.packages("Rmpi")
>> install.packages("doMPI")
>> 
>> both of these load and test fine during install
>> 
>> Then from R run
>> 
>> rm(list=ls(all=TRUE))
>> library(doMPI)
>> 
>> ## load MPI cluster
>> cl <- startMPIcluster(count=6)
>> 
>> 
>> At this point it throws the error, doesn't find any of the slots.
>> 
>> There is a precompiled version of Rmpi that installs an older version of 
>> open-mpi directly from Ubuntu, but I think the mpi version is an older one 
>> so I wanted to try using the new version.
>> 
>> 
>> I use this 6-core (12-thread) machine as a test bed before uploading to a
>> cluster. It is Ubuntu 16.04 Linux; the lstopo pdf is attached.
>> 
>> Thanks,
>> 
>> J
>> 
>> 
>> On 3 October 2017 at 14:06, Gilles Gouaillardet 
>> > wrote:
>> Hi Jim,
>> 
>> can you please provide minimal instructions on how to reproduce the issue ?
>> we know Open MPI, but I am afraid few or none of us know about Rmpi or
>> doMPI.
>> once you explain how to download and build these, and how to run the
>> failing test,
>> we'll be able to investigate that.
>> 
>> also, can you describe your environment ?
>> I assume one Ubuntu machine; can you please run
>> lstopo
>> on it and post the output ?
>> 
>> did you use to have some specific settings in the system-wide conf
>> file (e.g. /.../etc/openmpi-mca-params.conf) ?
>> if yes, can you post these, the syntax might have changed in 3.0.0
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On Tue, Oct 3, 2017 at 7:34 PM, Jim Maas > > wrote:
>> > I've used this for years, just updated open-mpi to 3.0.0 and reloaded R,
>> > have reinstalled doMPI and thus Rmpi but when I try to use startMPICluster,
>> > asking for 6 slots (there are 12 on this machine) I get this error.  Where
>> > can I start to debug it?
>> >
>> > Thanks
>> > J
>> > --
>> > There are not enough slots available in the system to satisfy the 6 slots
>> > that were requested by the application:
>> >   /usr/lib/R/bin/Rscript
>> >
>> > Either request fewer slots for your application, or make more slots
>> > available
>> > for use.
>> > --
>> > Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves = count,
>> > :
>> >   MPI_ERR_SPAWN: could not spawn processes
>> > --
>> > Jim Maas
>> >
>> > jimmaasuk  

Re: [OMPI users] OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread Jim Maas
Thanks RHC, where do I put that so it will be in the environment?

J

On 3 October 2017 at 16:01, r...@open-mpi.org  wrote:

> As Gilles said, we default to slots = cores, not HTs. If you want to treat
> HTs as independent cpus, then you need to add 
> OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1
> in your environment.
>
> On Oct 3, 2017, at 7:27 AM, Jim Maas  wrote:
>
> Tried this and got this error, and slots are available, nothing else is
> running.
>
> > cl <- startMPIcluster(count=7)
> --
> There are not enough slots available in the system to satisfy the 7 slots
> that were requested by the application:
>   /usr/local/lib/R/bin/Rscript
>
> Either request fewer slots for your application, or make more slots
> available
> for use.
> --
> Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves =
> count,  :
>   MPI_ERR_SPAWN: could not spawn processes
> >
>
> On 3 October 2017 at 15:07, Gilles Gouaillardet  gouaillar...@gmail.com> wrote:
>
>> Thanks, i will have a look at it.
>>
>> By default, a slot is a core, so there are 6 slots on your system.
>> Could your app spawn 6 procs on top of the initial proc ? That would be 7
>> slots and there are only 6.
>> What if you ask 5 slots only ?
>>
>> With some parameters i do not know off hand, you could either
>> oversubscribe or use hyperthreads as slots. In both cases, 7 slots would be
>> available.
>>
>> Cheers,
>>
>> Gilles
>>
>> Jim Maas  wrote:
>> Thanks Gilles, relative noob here at this level, apologies if nonsensical!
>>
>> I removed previous versions of open mpi which were compiled from source
>> using sudo make uninstall ...
>> downloaded new open-mpi 3.0.0 in tar.gz
>> configure --disable-dlopen
>> sudo make install
>>
>>
>> then ran sudo ldconfig
>>
>> updated R, downloaded R-3.4.2.tar.gz
>> ./configure
>> sudo make install
>>
>>
>> Then run R from sudo
>>
>> sudo R
>> once running
>> install.packages("Rmpi")
>> install.packages("doMPI")
>>
>> both of these load and test fine during install
>>
>> Then from R run
>>
>> rm(list=ls(all=TRUE))
>> library(doMPI)
>>
>> ## load MPI cluster
>> cl <- startMPIcluster(count=6)
>>
>>
>> At this point it throws the error, doesn't find any of the slots.
>>
>> There is a precompiled version of Rmpi that installs an older version of
>> open-mpi directly from Ubuntu, but I think the mpi version is an older one
>> so I wanted to try using the new version.
>>
>>
>> I use this 6-core (12-thread) machine as a test bed before uploading to a
>> cluster. It is Ubuntu 16.04 Linux; the lstopo pdf is attached.
>>
>> Thanks,
>>
>> J
>>
>>
>> On 3 October 2017 at 14:06, Gilles Gouaillardet > gouaillar...@gmail.com> wrote:
>>
>>> Hi Jim,
>>>
>>> can you please provide minimal instructions on how to reproduce the
>>> issue ?
>>> we know Open MPI, but I am afraid few or none of us know about Rmpi or
>>> doMPI.
>>> once you explain how to download and build these, and how to run the
>>> failing test,
>>> we'll be able to investigate that.
>>>
>>> also, can you describe your environment ?
>>> I assume one Ubuntu machine; can you please run
>>> lstopo
>>> on it and post the output ?
>>>
>>> did you use to have some specific settings in the system-wide conf
>>> file (e.g. /.../etc/openmpi-mca-params.conf) ?
>>> if yes, can you post these, the syntax might have changed in 3.0.0
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On Tue, Oct 3, 2017 at 7:34 PM, Jim Maas  wrote:
>>> > I've used this for years, just updated open-mpi to 3.0.0 and reloaded
>>> R,
>>> > have reinstalled doMPI and thus Rmpi but when I try to use
>>> startMPICluster,
>>> > asking for 6 slots (there are 12 on this machine) I get this error.
>>> Where
>>> > can I start to debug it?
>>> >
>>> > Thanks
>>> > J
>>> > 
>>> --
>>> > There are not enough slots available in the system to satisfy the 6
>>> slots
>>> > that were requested by the application:
>>> >   /usr/lib/R/bin/Rscript
>>> >
>>> > Either request fewer slots for your application, or make more slots
>>> > available
>>> > for use.
>>> > 
>>> --
>>> > Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves =
>>> count,
>>> > :
>>> >   MPI_ERR_SPAWN: could not spawn processes
>>> > --
>>> > Jim Maas
>>> >
>>> > jimmaasuk  at gmail.com
>>> >
>>> >
>>>
>>
>>
>>
>> --
>>

Re: [OMPI users] Error building openmpi on Raspberry pi 2

2017-10-03 Thread Pavel Shamis
I'm building on ARMv8 (64bit kernel, ompi master) and so far no problems.
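
On a 32-bit ARMv7 board such as the Pi 2, the assembler errors below usually mean the compiler was not targeting ARMv7 when it assembled the atomics. A commonly suggested workaround, which is an assumption and not a confirmed fix from this thread, is to force the target at configure time:

```shell
# Assumed workaround (not verified here): build with an explicit ARMv7
# target so instructions like dmb/ldrexd/strexd are accepted.
./configure CFLAGS="-march=armv7-a" CXXFLAGS="-march=armv7-a"
make -j 4 && sudo make install
```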

On Wed, Sep 27, 2017 at 7:34 AM, Jeff Layton  wrote:

> I could never get OpenMPI < 2.x to build on a Pi 2. I ended up using the
> binary from the repos. Pi 3 is a different matter - I got that to build
> after a little experimentation :)
>
> Jeff
>
>
>
> On Wednesday, September 27, 2017 8:03 AM, Nathan Hjelm 
> wrote:
>
>
> Open MPI does not officially support ARM in the v2.1 series. Can you
> download a nightly tarball from https://www.open-mpi.org/nightly/master/ and
> see if it works for you?
>
> -Nathan
>
> > On Sep 26, 2017, at 7:32 PM, Faraz Hussain  wrote:
> >
> > I am receiving the make errors below on my pi 2:
> >
> > pi@pi001:~/openmpi-2.1.1 $ uname -a
> > Linux pi001 4.9.35-v7+ #1014 SMP Fri Jun 30 14:47:43 BST 2017 armv7l
> GNU/Linux
> >
> > pi@pi001:~/openmpi-2.1.1 $ make -j 4
> > .
> > .
> > .
> > .
> > make[2]: Entering directory '/home/pi/openmpi-2.1.1/opal/asm'
> >  CPPASatomic-asm.lo
> > atomic-asm.S: Assembler messages:
> > atomic-asm.S:7: Error: selected processor does not support ARM mode `dmb'
> > atomic-asm.S:15: Error: selected processor does not support ARM mode
> `dmb'
> > atomic-asm.S:23: Error: selected processor does not support ARM mode
> `dmb'
> > atomic-asm.S:55: Error: selected processor does not support ARM mode
> `dmb'
> > atomic-asm.S:70: Error: selected processor does not support ARM mode
> `dmb'
> > atomic-asm.S:86: Error: selected processor does not support ARM mode
> `ldrexd r4,r5,[r0]'
> > atomic-asm.S:91: Error: selected processor does not support ARM mode
> `strexd r1,r6,r7,[r0]'
> > atomic-asm.S:107: Error: selected processor does not support ARM mode
> `ldrexd r4,r5,[r0]'
> > atomic-asm.S:112: Error: selected processor does not support ARM mode
> `strexd r1,r6,r7,[r0]'
> > atomic-asm.S:115: Error: selected processor does not support ARM mode
> `dmb'
> > atomic-asm.S:130: Error: selected processor does not support ARM mode
> `ldrexd r4,r5,[r0]'
> > atomic-asm.S:135: Error: selected processor does not support ARM mode
> `dmb'
> > atomic-asm.S:136: Error: selected processor does not support ARM mode
> `strexd r1,r6,r7,[r0]'
> > Makefile:1743: recipe for target 'atomic-asm.lo' failed
> > make[2]: *** [atomic-asm.lo] Error 1
> > make[2]: Leaving directory '/home/pi/openmpi-2.1.1/opal/asm'
> > Makefile:2307: recipe for target 'all-recursive' failed
> > make[1]: *** [all-recursive] Error 1
> > make[1]: Leaving directory '/home/pi/openmpi-2.1.1/opal'
> > Makefile:1806: recipe for target 'all-recursive' failed
> > make: *** [all-recursive] Error 1
> >
> >
>
>
>
>
>
>

Re: [OMPI users] OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread r...@open-mpi.org
As Gilles said, we default to slots = cores, not HTs. If you want to treat HTs 
as independent cpus, then you need to add 
OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1 in your environment.
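
For reference, the environment-variable form can be set like this; the mpirun line is shown only as an illustrative per-run alternative:

```shell
# Per-session: export before launching the MPI job.
export OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1
echo "$OMPI_MCA_hwloc_base_use_hwthreads_as_cpus"   # prints: 1
# Per-run alternative (illustrative):
#   mpirun --mca hwloc_base_use_hwthreads_as_cpus 1 -np 7 ./app
```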

> On Oct 3, 2017, at 7:27 AM, Jim Maas  wrote:
> 
> Tried this and got this error, and slots are available, nothing else is 
> running.
> 
> > cl <- startMPIcluster(count=7)
> --
> There are not enough slots available in the system to satisfy the 7 slots
> that were requested by the application:
>   /usr/local/lib/R/bin/Rscript
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves = count,  : 
>   MPI_ERR_SPAWN: could not spawn processes
> > 
> 
> On 3 October 2017 at 15:07, Gilles Gouaillardet 
> > wrote:
> Thanks, i will have a look at it.
> 
> By default, a slot is a core, so there are 6 slots on your system.
> Could your app spawn 6 procs on top of the initial proc ? That would be 7 
> slots and there are only 6.
> What if you ask 5 slots only ?
> 
> With some parameters i do not know off hand, you could either oversubscribe 
> or use hyperthreads as slots. In both cases, 7 slots would be available.
> 
> Cheers,
> 
> Gilles
> 
> Jim Maas > wrote:
> Thanks Gilles, relative noob here at this level, apologies if nonsensical!
> 
> I removed previous versions of open mpi which were compiled from source using 
> sudo make uninstall ...
> downloaded new open-mpi 3.0.0 in tar.gz
> configure --disable-dlopen
> sudo make install
> 
> 
> then ran sudo ldconfig
> 
> updated R, downloaded R-3.4.2.tar.gz
> ./configure
> sudo make install
> 
> 
> Then run R from sudo
> 
> sudo R
> once running 
> install.packages("Rmpi")
> install.packages("doMPI")
> 
> both of these load and test fine during install
> 
> Then from R run
> 
> rm(list=ls(all=TRUE))
> library(doMPI)
> 
> ## load MPI cluster
> cl <- startMPIcluster(count=6)
> 
> 
> At this point it throws the error, doesn't find any of the slots.
> 
> There is a precompiled version of Rmpi that installs an older version of 
> open-mpi directly from Ubuntu, but I think the mpi version is an older one so 
> I wanted to try using the new version.
> 
> 
> I use this 6-core (12-thread) machine as a test bed before uploading to a
> cluster. It is Ubuntu 16.04 Linux; the lstopo pdf is attached.
> 
> Thanks,
> 
> J
> 
> 
> On 3 October 2017 at 14:06, Gilles Gouaillardet 
> > wrote:
> Hi Jim,
> 
> can you please provide minimal instructions on how to reproduce the issue ?
> we know Open MPI, but I am afraid few or none of us know about Rmpi or doMPI.
> once you explain how to download and build these, and how to run the
> failing test,
> we'll be able to investigate that.
> 
> also, can you describe your environment ?
> I assume one Ubuntu machine; can you please run
> lstopo
> on it and post the output ?
> 
> did you use to have some specific settings in the system-wide conf
> file (e.g. /.../etc/openmpi-mca-params.conf) ?
> if yes, can you post these, the syntax might have changed in 3.0.0
> 
> Cheers,
> 
> Gilles
> 
> On Tue, Oct 3, 2017 at 7:34 PM, Jim Maas  > wrote:
> > I've used this for years, just updated open-mpi to 3.0.0 and reloaded R,
> > have reinstalled doMPI and thus Rmpi but when I try to use startMPICluster,
> > asking for 6 slots (there are 12 on this machine) I get this error.  Where
> > can I start to debug it?
> >
> > Thanks
> > J
> > --
> > There are not enough slots available in the system to satisfy the 6 slots
> > that were requested by the application:
> >   /usr/lib/R/bin/Rscript
> >
> > Either request fewer slots for your application, or make more slots
> > available
> > for use.
> > --
> > Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves = count,
> > :
> >   MPI_ERR_SPAWN: could not spawn processes
> > --
> > Jim Maas
> >
> > jimmaasuk  at gmail.com 
> >
> >
> > ___
> > users mailing list
> > users@lists.open-mpi.org 
> > https://lists.open-mpi.org/mailman/listinfo/users 
> > 
> ___
> users mailing list
> users@lists.open-mpi.org 
> https://lists.open-mpi.org/mailman/listinfo/users 
> 
> 
> 
> 
> -- 
> 
> 

Re: [OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread Jim Maas
Tried this and got this error; slots are available and nothing else is
running.

> cl <- startMPIcluster(count=7)
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 7 slots
that were requested by the application:
  /usr/local/lib/R/bin/Rscript

Either request fewer slots for your application, or make more slots
available for use.
--------------------------------------------------------------------------
Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves = count, :
  MPI_ERR_SPAWN: could not spawn processes
>


Re: [OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread Jim Maas
Previously it worked fine when I asked for 12. I'm sure you are correct
that it is only 6 physical cores, but with hyperthreading it looks like 12;
the system monitor shows 12.

Thanks
J




-- 
Jim Maas
74 Turner Road
Norwich, Norfolk, UK.
NR2 4HB

jimmaasuk  at gmail.com
http://www.jamaas.com
+ 44 (0)771 985 8698

Re: [OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread Gilles Gouaillardet
Thanks, I will have a look at it.

By default, a slot is a core, so there are 6 slots on your system.
Does your app spawn 6 procs on top of the initial proc? That would require
7 slots, and there are only 6.
What if you ask for 5 slots only?

With some parameters I do not know off hand, you could either oversubscribe
or use hyperthreads as slots. In both cases, 7 slots would be available.

Cheers,

Gilles
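
The two approaches above can be sketched as follows; the flag names are
from Open MPI 3.x, and ./app is a placeholder for the real command:

```shell
# Sketch: two ways to make 7 slots available on a 6-core machine.
# "./app" is a placeholder; flag names are from Open MPI 3.x.

# 1) Allow more ranks than slots:
#      mpirun --oversubscribe -np 7 ./app
# 2) Count each hardware thread as a slot:
#      mpirun --use-hwthread-cpus -np 7 ./app

# What the OS actually sees (12 hardware threads on the machine described):
echo "hardware threads: $(nproc)"
```

Either flag changes what counts as a slot at launch time without rebuilding
anything.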


Re: [OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread Jim Maas
Thanks Gilles, relative noob here at this level, apologies if nonsensical!

I removed the previous versions of Open MPI (compiled from source) using
sudo make uninstall ...
downloaded the new openmpi-3.0.0 tar.gz
configure --disable-dlopen
sudo make install


then ran sudo ldconfig

updated R, downloaded R-3.4.2.tar.gz
./configure
sudo make install


Then ran R via sudo:

sudo R

and once running:
install.packages("Rmpi")
install.packages("doMPI")

Both of these load and test fine during install.

Then from R run

rm(list=ls(all=TRUE))
library(doMPI)

## load MPI cluster
cl <- startMPIcluster(count=6)


At this point it throws the error; it doesn't find any of the slots.

There is a precompiled Rmpi package that installs an older version of
Open MPI directly from Ubuntu, but I wanted to try the new version.


I use this 6-core (12-thread) machine as a test bed before uploading to a
cluster. It is Ubuntu 16.04 Linux; the lstopo pdf is attached.

Thanks,

J
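
The rebuild steps above can be followed by a few sanity checks to confirm
the new build is the one found at run time. This is a sketch only; the
/usr/local prefix is an assumption (configure's default):

```shell
# Post-install sanity checks (sketch; /usr/local is configure's default
# prefix and may differ on your system).
if command -v mpirun >/dev/null; then
    mpirun --version | head -n 1      # should report 3.0.0
    command -v mpirun                 # expect /usr/local/bin/mpirun
else
    echo "mpirun not on PATH"
fi
# After 'sudo ldconfig', libmpi should appear in the linker cache:
ldconfig -p 2>/dev/null | grep libmpi || echo "libmpi not in linker cache"
```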





-- 
Jim Maas
74 Turner Road
Norwich, Norfolk, UK.
NR2 4HB

jimmaasuk  at gmail.com
http://www.jamaas.com
+ 44 (0)771 985 8698


system.pdf
Description: Adobe PDF document

Re: [OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread Gilles Gouaillardet
Hi Jim,

Can you please provide minimal instructions on how to reproduce the issue?
We know Open MPI, but I am afraid few if any of us know about Rmpi or
doMPI. Once you explain how to download and build these, and how to run
the failing test, we'll be able to investigate.

Also, can you describe your environment? I assume one Ubuntu machine; can
you please run lstopo on it and post the output?

Did you have specific settings in the system-wide conf file (e.g.
/.../etc/openmpi-mca-params.conf)? If yes, can you post them? The syntax
might have changed in 3.0.0.

Cheers,

Gilles
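
The details requested above can be collected with something like the
following sketch; the MCA params path assumes a /usr/local install prefix:

```shell
# Collect the details asked for above (sketch; paths are assumptions).
nproc                                              # hardware threads
command -v lstopo-no-graphics >/dev/null \
  && lstopo-no-graphics                            # hwloc topology, text form
cat /usr/local/etc/openmpi-mca-params.conf 2>/dev/null \
  || echo "no site-wide MCA params file found"
```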



[OMPI users] upgraded mpi and R and now cannot find slots

2017-10-03 Thread Jim Maas
I've used this for years. I just updated Open MPI to 3.0.0 and reloaded R,
and have reinstalled doMPI (and thus Rmpi), but when I try to use
startMPIcluster, asking for 6 slots (there are 12 on this machine), I get
this error. Where can I start to debug it?

Thanks
J
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 6 slots
that were requested by the application:
  /usr/lib/R/bin/Rscript

Either request fewer slots for your application, or make more slots
available for use.
--------------------------------------------------------------------------
Error in mpi.comm.spawn(slave = rscript, slavearg = args, nslaves = count, :
  MPI_ERR_SPAWN: could not spawn processes
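
For reference, slot counts can also be declared explicitly through a
hostfile; this is a sketch, with the file name and the slots=12 count
assumed from the machine described above:

```shell
# Sketch: grant 12 slots on localhost explicitly via a hostfile.
cat > hostfile.txt <<'EOF'
localhost slots=12
EOF
# A launch would then look like:  mpirun --hostfile hostfile.txt -np 6 ./app
grep 'slots' hostfile.txt
```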
-- 
Jim Maas

jimmaasuk  at gmail.com
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users