On Wed, 25 Oct 2023, Ivan Krylov wrote:

Summary: at the end of this message is a link to an R package
implementing an interface for managing the use of execution units in R
packages. As a package maintainer, would you agree to use something
like this? Does it look sufficiently reasonable to become a part of R?
Read on for why I made these particular interface choices.

My understanding of the problem stated by Simon Urbanek and Uwe Ligges
[1,2] is that we need a way to set and distribute the CPU core
allowance between multiple packages that could be using very different
methods to achieve parallel execution on the local machine, including
threads and child processes. We could have multiple well-meaning
packages, each of them calling each other using a different parallelism
technology: imagine parallel::makeCluster(getOption('mc.cores'))
combined with parallel::mclapply(mc.cores = getOption('mc.cores')) and
with an OpenMP program that also spawns getOption('mc.cores') threads.
A parallel BLAS or custom multi-threading using std::thread could add
more fuel to the fire.


Hi Ivan,

  Generally, I like the idea. A few comments:

* From a package developer's point of view, I would prefer to have a clear idea of how many threads I can use. A core R function like "getMaxThreads()" or similar would be useful. What that function returns could be governed by a package.

In fact, it might be a good idea to allow several packages to implement "thread governors" for different situations.
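
A minimal sketch of how such a function might behave, assuming a governor package communicates through an R option. Neither getMaxThreads() nor the option name "max.threads" exists in R today; both are hypothetical. Only the fallback to the existing "mc.cores" option reflects current practice.

```r
# Hypothetical sketch: getMaxThreads() is not an existing R function, and
# the option name "max.threads" is an assumption made for illustration.
getMaxThreads <- function(default = 1L) {
  # A "thread governor" package could set this option.
  n <- getOption("max.threads")
  # Otherwise fall back to mc.cores, which the parallel package already uses.
  if (is.null(n)) n <- getOption("mc.cores", default)
  # Coerce to a single integer; anything unusable degrades to one thread.
  n <- suppressWarnings(as.integer(n)[1L])
  if (is.na(n) || n < 1L) 1L else n
}

options(mc.cores = 4)
getMaxThreads()
```

Falling back to a single thread on bad input is a deliberately conservative choice: a package that asks for its allowance should never receive a value it cannot use.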

* It would make sense to think through whether we want to allow package developers to call omp_set_num_threads() themselves, or whether R should do this.

This is hairier than you might think. Allowing it forces every package to call omp_set_num_threads() before each OpenMP block, because there is no way to know which package was called before.

Not allowing packages to call omp_set_num_threads() might make it difficult to use all the threads, and would force R to initialize OpenMP on startup.

* Speaking of initialization of OpenMP, I have seen situations where spawning some regular pthread threads and then initializing OpenMP forces all pthread threads to a single CPU.

I think this is because OpenMP sets thread affinity for all the process threads, but only distributes its own.

* This also raises the question of how affinity is managed. If you have called makeForkCluster() to create 10 R instances and each then uses 2 OpenMP threads, you do not want them all confined to only 2 CPU hardware threads instead of spread over 20.
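
The counting half of this problem can be sketched as follows, assuming a fixed core budget that is split evenly between forked workers. The helper splitCoreBudget() is hypothetical and not part of R or the parallel package, and it says nothing about affinity, which would still have to be handled at the OS or OpenMP level.

```r
# Hypothetical helper; not part of R or the parallel package.
# Splits a core budget evenly between worker processes so that
# workers * threads_per_worker never exceeds total_cores.
splitCoreBudget <- function(total_cores, n_workers) {
  stopifnot(total_cores >= 1, n_workers >= 1)
  total_cores <- as.integer(total_cores)
  # Never start more workers than there are cores in the budget.
  n_workers <- min(as.integer(n_workers), total_cores)
  list(workers = n_workers,
       threads_per_worker = total_cores %/% n_workers)
}

# 20 cores shared by 10 workers leaves 2 threads for each worker.
str(splitCoreBudget(20, 10))
```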

* From the user's perspective, it might be useful to be able to limit the number of threads per package by using patterns or regular expressions. Often, the reason for limiting the number of threads is to reduce memory usage.
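
One possible shape for such per-package limits, as a sketch only: the function threadLimitFor(), the rule format (a named integer vector whose names are regular expressions), and the example patterns are all assumptions made for illustration, not an existing R interface.

```r
# Hypothetical sketch: neither threadLimitFor() nor this rule format
# exists in R. Names of `rules` are regular expressions matched against
# the package name; the first matching rule wins.
threadLimitFor <- function(pkg, rules, default) {
  for (pattern in names(rules)) {
    if (grepl(pattern, pkg)) {
      # A rule can lower the allowance but never raise it above the default.
      return(min(rules[[pattern]], default))
    }
  }
  default
}

# Illustrative patterns only.
rules <- c("^data\\." = 2L, "BLAS|lapack" = 1L)
threadLimitFor("data.table", rules, default = 8L)
threadLimitFor("stats", rules, default = 8L)
```

Letting the first match win keeps the behaviour predictable when several patterns apply to the same package name.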

* Speaking of memory usage, glibc has parameters like MALLOC_ARENA_MAX that have a great impact on the memory usage of multithreaded programs. I usually set it to 1, but then I take extra care to make as few memory allocation calls as possible within individual threads.

best

Vladimir Dergachev

______________________________________________
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
