[issue32986] multiprocessing, default assumption of Pool size unhelpful

2019-02-25 Thread Eryk Sun


Change by Eryk Sun:


--
nosy: +eryksun




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Matt Harvey

Matt Harvey added the comment:

@njs your sketch in msg313406 looks good.  Probably better to go with 
OMP_NUM_THREADS than NCPUS. 
M

--




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Nathaniel Smith

Nathaniel Smith added the comment:

> You mean duplicating "nproc"'s logic in Python?

Yeah.

> If someone wants to do the grunt work of implementing/testing it...

Well, that's true of any bug fix / improvement :-). The logic isn't terribly 
complicated though, something roughly like:

import os

def parse_omp_envvar(env_value):
    # OMP_* variables may hold a comma-separated list (one value per nesting
    # level); only the first entry matters here.
    return int(env_value.strip().split(",")[0])

def estimate_cpus():
    limit = float("inf")
    if "OMP_THREAD_LIMIT" in os.environ:
        limit = parse_omp_envvar(os.environ["OMP_THREAD_LIMIT"])

    if "OMP_NUM_THREADS" in os.environ:
        cpus = parse_omp_envvar(os.environ["OMP_NUM_THREADS"])
    else:
        try:
            # Respects CPU affinity / cpusets where available (Linux).
            cpus = len(os.sched_getaffinity(os.getpid()))
        except (AttributeError, OSError):
            cpus = os.cpu_count()

    return min(cpus, limit)
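
For concreteness, a rough usage sketch (untested, and assuming the estimate_cpus() helper above): both multiprocessing.Pool and ProcessPoolExecutor already accept an explicit worker count, so the estimate can be applied directly by callers today:

import multiprocessing
from concurrent.futures import ProcessPoolExecutor

n = estimate_cpus()  # hypothetical helper sketched above

# Both pool types accept an explicit size, so no stdlib change is needed to opt in.
with multiprocessing.Pool(processes=n) as pool:
    results = pool.map(abs, range(-5, 5))

with ProcessPoolExecutor(max_workers=n) as executor:
    futures = [executor.submit(abs, i) for i in range(-5, 5)]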

> There's also the question of how that affects non-scientific workloads. 
> People can use thread pools or process pools for other purposes, such as 
> distributing (blocking) I/O.

We already have some heuristics for this: IIRC the thread pool executor 
defaults to cpu_count() * 5 threads (because Python threads are really intended 
for I/O-bound workloads), while the process pool executor and 
multiprocessing.Pool default to cpu_count() processes (because processes are 
better suited to CPU-bound workloads). Neither of these heuristics is perfect. 
But insofar as it makes sense at all to use the CPU count as part of the 
heuristic, it will surely work better to use a more accurate estimate of the 
available CPUs.
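
For illustration, the same heuristics could be expressed on top of the estimate (names here are hypothetical, not an existing stdlib API):

# Hypothetical sketch only: the current default-sizing heuristics, but with
# estimate_cpus() substituted for os.cpu_count().
def default_thread_workers():
    # Thread pools mostly service blocking I/O, hence the oversubscription factor.
    return estimate_cpus() * 5

def default_process_workers():
    # Process pools target CPU-bound work: one worker per available CPU.
    return estimate_cpus()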

--




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Nathaniel Smith

Nathaniel Smith added the comment:

I can't find any evidence that NCPUS is used by other batch schedulers (I 
looked at SLURM, Torque, and SGE). @M J Harvey, do you have any other examples 
of systems that use it?

--




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Antoine Pitrou

Antoine Pitrou added the comment:

> So this seems like a reasonable heuristic approach to me.

You mean duplicating "nproc"'s logic in Python?  If someone wants to do the 
grunt work of implementing/testing it...

There's also the question of how that affects non-scientific workloads. People 
can use thread pools or process pools for other purposes, such as distributing 
(blocking) I/O.

--




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Nathaniel Smith

Nathaniel Smith added the comment:

That stackoverflow thread points to the GNU coreutils 'nproc', which is an 
interesting compendium of knowledge about this problem.

It looks like their algorithm is roughly:

1. Determine how many CPUs this program *could* access, by going down a list of 
possible options and using the first that works: pthread_getaffinity_np -> 
sched_getaffinity -> GetProcessAffinityMask -> sysconf(_SC_NPROCESSORS_ONLN) 
-> some arcane stuff specific to HP-UX, IRIX, etc.

2. Parse the OMP_NUM_THREADS and OMP_THREAD_LIMIT envvars (this is not quite 
trivial, there's some handling of whitespace and commas and references to the 
OMP spec)

3. If OMP_NUM_THREADS is set, return min(OMP_NUM_THREADS, OMP_THREAD_LIMIT). 
Otherwise, return min(available_processors, OMP_THREAD_LIMIT).

Step (1) handles both the old affinity APIs, and also the cpuset system that 
docker uses. (From cpuset(7): "Cpusets are integrated with the 
sched_setaffinity(2) scheduling  affinity  mechanism".) Step (2) relies on the 
quasi-standard OMP_* envvars to let you choose something explicitly.
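
A quick way to see the effect of step (1), e.g. inside a container started with a restricted cpuset (a small sketch; os.sched_getaffinity is Linux-only):

import os

# cpu_count() reports every CPU the OS sees; sched_getaffinity(0) reports only
# the CPUs this process is allowed to run on (cpusets, taskset, docker --cpuset-cpus).
print("visible to the OS:", os.cpu_count())
print("usable by this process:", len(os.sched_getaffinity(0)))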

The PBS Pro docs say that they set both NCPUS and OMP_NUM_THREADS. See section 
6.1.7 of https://pbsworks.com/pdfs/PBSUserGuide14.2.pdf

So this seems like a reasonable heuristic approach to me.

--




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Note that to avoid any kind of environment variable-driven Denial of Service, 
we should probably ignore NCPUS when sys.flags.ignore_environment is set.
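
Something like this, as a minimal sketch (the helper name is made up, not a real API):

import os
import sys

def pool_size_hint():
    # Only consult NCPUS when the interpreter hasn't been told to ignore the
    # environment (python -E / -I sets sys.flags.ignore_environment).
    if not sys.flags.ignore_environment:
        try:
            return int(os.environ["NCPUS"])
        except (KeyError, ValueError):
            pass
    return os.cpu_count() or 1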

--




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Matt Harvey

Matt Harvey added the comment:

Hi,

No, using the affinity isn't useful to us because, in the general case, the batch 
system (PBS Pro in our case) isn't using cgroups or cpusets (it controls 
average CPU use by monitoring rusage of the process group).

Several other batch systems I've worked with either set NCPUS directly or have a 
method for site-specific customisation of the job's environment.

That doesn't preclude using the affinity as an alternative to os.cpu_count().

As @pitrou correctly observes, it's probably better to have a simple, 
well-signposted way for sysadmins to influence the pool default than to try 
to overload multiprocessing with complex heuristics.

--
nosy: +Matt Harvey




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Antoine Pitrou

Antoine Pitrou added the comment:

I don't think we want to hardcode special cases for each resource-limiting 
framework out there: some people care about Docker, others about cgroups, CPU 
affinity settings, etc.  NCPUS has the nice property that each of those 
frameworks can set it, and users or sysadmins can also override it easily.  
It's also trivially queried from Python.

--




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Matthew Rocklin

Matthew Rocklin added the comment:

I agree that this is a common issue.  We see it both when people use batch 
schedulers and when they use Docker containers.  I don't have enough 
experience with batch schedulers to verify that NCPUS is commonly set.  I would 
encourage people to also look at what Docker uses.  

After a quick (two minute) web search I couldn't find the answer, but I suspect 
that one exists.  I've raised a question on Stack Overflow here: 
https://stackoverflow.com/questions/49151296/how-many-cpus-does-my-docker-container-have
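
For what it's worth, one common answer for Docker is to read the CPU quota from the container's cgroup. A rough sketch for the usual cgroup v1 paths (an assumption on my part; cgroup v2 exposes the same information in cpu.max instead, and outside a limited container the files may be absent or report -1):

import math
import os

def cgroup_v1_cpu_limit():
    # quota/period are in microseconds; quota == -1 means "no limit".
    try:
        with open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us") as f:
            quota = int(f.read())
        with open("/sys/fs/cgroup/cpu/cpu.cfs_period_us") as f:
            period = int(f.read())
    except OSError:
        return None
    if quota <= 0 or period <= 0:
        return None
    return max(1, math.ceil(quota / period))

print(cgroup_v1_cpu_limit() or os.cpu_count())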

--




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Antoine Pitrou

Change by Antoine Pitrou:


--
type: behavior -> enhancement
versions:  -Python 3.6, Python 3.7




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Nathaniel Smith

Nathaniel Smith added the comment:

This is a duplicate of bpo-26692 and bpo-23530, and possibly others.

My impression is that len(os.sched_getaffinity(os.getpid())) is the Right 
Guess. Currently sched_getaffinity isn't implemented on Windows, but bpo-23530 
has some example code for how it could/should be implemented.
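
For reference, a rough ctypes sketch of what the Windows side might look like, assuming the GetProcessAffinityMask approach discussed in bpo-23530 (this only covers the first 64 logical processors in the current processor group):

import ctypes

def windows_affinity_count():
    # Windows-only: count the bits set in the process affinity mask.
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetCurrentProcess.restype = ctypes.c_void_p
    kernel32.GetProcessAffinityMask.argtypes = [
        ctypes.c_void_p,
        ctypes.POINTER(ctypes.c_size_t),  # DWORD_PTR is pointer-sized
        ctypes.POINTER(ctypes.c_size_t),
    ]
    process_mask = ctypes.c_size_t()
    system_mask = ctypes.c_size_t()
    if not kernel32.GetProcessAffinityMask(
            kernel32.GetCurrentProcess(),
            ctypes.byref(process_mask),
            ctypes.byref(system_mask)):
        raise ctypes.WinError(ctypes.get_last_error())
    return bin(process_mask.value).count("1")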

@M J Harvey: does this return the right thing for your batch jobs?

--
nosy: +njs




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-07 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Matt, what do you think about this proposal? Is NCPUS the right thing to look 
at?

--
nosy: +Matthew Rocklin
type:  -> behavior
versions: +Python 3.7, Python 3.8




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-02 Thread Ned Deily

Change by Ned Deily:


--
nosy: +davin, pitrou




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-02 Thread Roundup Robot

Change by Roundup Robot:


--
keywords: +patch
pull_requests: +5726
stage:  -> patch review




[issue32986] multiprocessing, default assumption of Pool size unhelpful

2018-03-02 Thread M J Harvey

New submission from M J Harvey:

Hi,

multiprocessing's default assumption about Pool size is os.cpu_count(), i.e. all 
the cores visible to the OS.

This is tremendously unhelpful when running multiprocessing code inside an HPC 
batch system (PBS Pro in my case), as there's no way to hint to the code that 
the number of CPUs actually allocated to it may be fewer.

It's quite tedious to have to explain this to every single person trying to use 
it.

Proposal: multiprocessing should look for a hint for the default Pool size in the 
environment variable "NCPUS", which most batch systems set. If that's not set, 
or its value is invalid, fall back to os.cpu_count() as before.

--
components: Library (Lib)
messages: 313150
nosy: M J Harvey
priority: normal
severity: normal
status: open
title: multiprocessing, default assumption of Pool size unhelpful
versions: Python 3.6
