> where you can see "ncpus = 1" (I still do not know why 4 lines were
> printed) - (total of 40 nodes) and each node has 1 CPU and 1 GPU!
>
> #PBS -l select=1:ncpus=8:mpiprocs=8
> aprun -n 4 p.sh ./ncpus.py

You can request 8 CPUs from the job scheduler, but if each node the script
runs on contains only one virtual/physical core, then cpu_count() will
return 1. If that CPU supports multi-threading, you would typically get 2.
For example, on my workstation:

--> egrep "processor|model name|core id" /proc/cpuinfo
processor  : 0
model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
core id    : 0
processor  : 1
model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
core id    : 1
processor  : 2
model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
core id    : 0
processor  : 3
model name : Intel(R) Core(TM) i3-4160 CPU @ 3.60GHz
core id    : 1

--> python3 -c "from sklearn.externals import joblib; print(joblib.cpu_count())"
4

It seems that in this situation, if you want to parallelize *independent*
sklearn calculations (e.g., varying the dataset or the random seed), you
would request the MPI processes through PBS as you have, but you would need
to wrap the sklearn computation in a function and then take care of
distributing that function call across the MPI processes (see the sketch at
the end of this message).

Then again, if the runs are independent, it is much easier to write a for
loop in a shell script that changes the dataset/seed and submits each run
to the job scheduler, letting the scheduler take care of the parallel
distribution. (I do this when performing 10+ independent runs of sklearn
modeling, where the models use multiple threads during their calculations;
in my case, SLURM then takes care of finding the available nodes to
distribute the work to.)

Hope this helps.

J.B.
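P.S. For concreteness, here is a minimal sketch of the "distribute the
function call across the MPI processes" idea. It assumes mpi4py is
installed; the seed list, the run_one() helper, and the model settings are
made up for illustration, not something from your setup:

# run_seeds.py (hypothetical name): independent sklearn fits, with one
# slice of the seeds handled per MPI rank.
from mpi4py import MPI
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

seeds = list(range(16))  # the independent runs to distribute

def run_one(seed):
    # One independent sklearn computation; return something small.
    X, y = make_classification(n_samples=500, random_state=seed)
    clf = RandomForestClassifier(random_state=seed).fit(X, y)
    return seed, clf.score(X, y)

# Round-robin assignment: rank r handles seeds r, r+size, r+2*size, ...
local = [run_one(s) for s in seeds[rank::size]]

# Collect everything on rank 0 and print a summary there.
gathered = comm.gather(local, root=0)
if rank == 0:
    for chunk in gathered:
        for seed, score in chunk:
            print("seed=%d  train accuracy=%.3f" % (seed, score))

You would launch it with something like "aprun -n 4 python run_seeds.py"
(or mpiexec/mpirun outside of PBS); each rank then fits its own models and
rank 0 gathers the results.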
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn