Hi Ivan,

Interesting package, and I'll provide more feedback later. For comparison,
I'd recommend looking at how GNU make does parallel processing. It uses the
concept of a job server and job slots. What I like about it is that it is
implemented at the OS level, because make needs to support interacting with
non-make processes. On Windows it uses a named semaphore, and on Unix-like
systems it uses named pipes or simple pipes to pass tokens around.
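To illustrate the token idea, here is a toy in-process simulation in R (make's real jobserver passes single-byte tokens through an OS pipe or semaphore so that unrelated processes can share slots; `make_jobserver` and its method names are invented for this sketch):

```r
## Toy simulation of the jobserver protocol: a fixed pool of tokens,
## take one before running a job, return it when the job finishes.
make_jobserver <- function(slots) {
  tokens <- slots
  list(
    acquire = function() {  # try to take a slot; FALSE means "wait"
      if (tokens > 0L) { tokens <<- tokens - 1L; TRUE } else FALSE
    },
    release = function() tokens <<- tokens + 1L,  # give the slot back
    free    = function() tokens
  )
}

js <- make_jobserver(2L)
js$acquire()  # TRUE: first slot granted
js$acquire()  # TRUE: second slot granted
js$acquire()  # FALSE: pool exhausted, the caller must wait
js$release()  # a job finished, its token goes back to the pool
```

The point of doing this at the OS level rather than in-process is exactly what you describe: any child process holding one end of the pipe can participate, whether or not it is make.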
https://www.gnu.org/software/make/manual/make.html#Job-Slots

Cheers,
Reed

On Wed, Oct 25, 2023 at 5:55 AM Ivan Krylov <krylov.r...@gmail.com> wrote:

> Summary: at the end of this message is a link to an R package
> implementing an interface for managing the use of execution units in R
> packages. As a package maintainer, would you agree to use something
> like this? Does it look sufficiently reasonable to become a part of R?
> Read on for why I made these particular interface choices.
>
> My understanding of the problem stated by Simon Urbanek and Uwe Ligges
> [1,2] is that we need a way to set and distribute the CPU core
> allowance between multiple packages that could be using very different
> methods to achieve parallel execution on the local machine, including
> threads and child processes. We could have multiple well-meaning
> packages calling each other, each using a different parallelism
> technology: imagine parallel::makeCluster(getOption('mc.cores'))
> combined with parallel::mclapply(mc.cores = getOption('mc.cores')) and
> with an OpenMP program that also spawns getOption('mc.cores') threads.
> A parallel BLAS or custom multi-threading using std::thread could add
> more fuel to the fire.
>
> Workarounds applied by package maintainers nowadays are both
> cumbersome (sometimes one has to talk to some package that lives
> downstream in the call stack and isn't even an explicit dependency,
> because it's the one responsible for the threads) and not really enough
> (most maintainers forget to restore the state after they are done, so a
> single example() may slow down the operations that follow).
>
> The problem is complicated by the fact that not every parallel
> operation can explicitly accept the CPU core limit as a parameter.
> For example, data.table's implicit parallelism is very convenient, and
> so are parallel BLASes (which don't have a standard interface to change
> the number of threads), so we shouldn't be prohibiting implicit
> parallelism.
>
> It's also not always obvious how to split the cores between the
> potentially parallel sections. While it's typically best to start with
> the outer loop (e.g. better to have 16 R processes solving relatively
> small linear algebra problems back to back than one R process spinning
> 15 of its 16 OpenBLAS threads in sched_yield()), it may be more
> efficient to give all 16 threads back to BLAS (and save on
> transferring the problems and solutions between processes) once the
> problems become large enough to give enough work to all of the cores.
>
> So as a user, I would like an interface that would both let me give all
> of the cores to the program if that's what I need (something like
> setCPUallowance(parallelly::availableCores())) _and_ let me be more
> detailed when necessary (something like setCPUallowance(overall = 7,
> packages = c(foobar = 1), BLAS = 2) to limit BLAS threads to 2,
> disallow parallelism in the foobar package because it wastes too much
> time, and limit R as a whole to 7 cores because I want to surf the 'net
> on the remaining one while the Monte Carlo simulation is going on). As
> a package developer, I'd rather not think about any of that and just
> use a function call like getCPUallowance() for the default number of
> cores in every situation.
>
> Can we implement such an interface? The main obstacle here is not being
> able to know when each parallel region begins and ends. Does the
> package call fork()? std::thread{}? Start a local mirai cluster? We
> have to trust (and verify during R CMD check) the package to create the
> given number of units of execution and tell us when they are done.
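As a toy sketch of the configuration side described above (hypothetical implementation; only the names setCPUallowance/getCPUallowance come from the proposal, the storage in an option is my assumption for illustration):

```r
## Hypothetical sketch: store the overall allowance plus per-package
## overrides in an option, and look them up on the developer side.
setCPUallowance <- function(overall, packages = integer(), BLAS = NA_integer_) {
  options(cpu.allowance = list(overall = overall,
                               packages = packages, BLAS = BLAS))
  invisible(NULL)
}

getCPUallowance <- function(package = NULL) {
  a <- getOption("cpu.allowance",
                 list(overall = 1L, packages = integer(), BLAS = NA_integer_))
  if (!is.null(package) && package %in% names(a$packages))
    a$packages[[package]]  # per-package limit set by the user
  else
    a$overall              # the default for everyone else
}

setCPUallowance(overall = 7, packages = c(foobar = 1), BLAS = 2)
getCPUallowance()          # 7: R as a whole
getCPUallowance("foobar")  # 1: parallelism effectively disallowed there
```

A real implementation would of course also have to propagate these limits to BLAS and to child processes, which the option mechanism alone cannot do.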
>
> The closest interface that I see being implementable is a system of
> tokens with reference semantics: getCPUallowance() returns a special
> object containing the number of tokens the caller is allowed to use and
> sets an environment variable with the remaining number of cores. Any R
> child processes pick up the number of cores from the environment
> variable. Any downstream calls to getCPUallowance(), aware of the
> tokens already handed out, return a reduced number of remaining CPU
> cores. Once the package is done executing a parallel section, it
> returns the CPU allowance back to R by calling something like
> close(token), which updates the internal allowance value (and the
> environment variable). (A finalizer can also be set on the tokens to
> ensure that CPU cores won't be lost.)
>
> Here's a package implementing this idea:
> <https://codeberg.org/aitap/R-CPUallowance>. Currently missing are
> terrible hacks to determine the BLAS type at runtime and resolve the
> necessary symbols to set the number of BLAS threads, depending on
> whether it's OpenBLAS, FlexiBLAS, MKL, or something else. Does it feel
> over-engineered? I hope that, even if not a good solution, this would
> let us move towards a unified solution that could just work™ on
> everything ranging from laptops to CRAN testing machines to HPCs.
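A stripped-down sketch of the token mechanics described above (my own toy version, not the code in the linked package; the counter, environment-variable name, and class name are assumptions):

```r
## Toy reference-semantics CPU tokens: getCPUallowance() hands out cores
## and decrements a shared counter; close() (or a finalizer, if the
## caller forgets) hands them back.
.allowance <- new.env()
.allowance$free <- 4L  # pretend the overall allowance is 4 cores

getCPUallowance <- function(want = .allowance$free) {
  granted <- min(want, .allowance$free)
  .allowance$free <- .allowance$free - granted
  Sys.setenv(R_CPU_ALLOWANCE = .allowance$free)  # child processes read this
  token <- new.env()        # environment => reference semantics
  token$granted <- granted
  class(token) <- "CPUtoken"
  # safety net: return the cores even if the caller never calls close()
  reg.finalizer(token, function(t) close(t))
  token
}

close.CPUtoken <- function(con, ...) {
  if (con$granted > 0L) {
    .allowance$free <- .allowance$free + con$granted
    con$granted <- 0L  # make close() idempotent for the finalizer
    Sys.setenv(R_CPU_ALLOWANCE = .allowance$free)
  }
  invisible(NULL)
}

tok <- getCPUallowance(3L)  # take 3 of the 4 cores
.allowance$free             # 1 core left for downstream callers
close(tok)                  # parallel section done, give them back
.allowance$free             # back to 4
```

Because the token is an environment, downstream code sees the reduced allowance immediately, and close() stays idempotent so the finalizer cannot double-return cores.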
>
> --
> Best regards,
> Ivan
>
> [1] https://stat.ethz.ch/pipermail/r-package-devel/2023q3/009484.html
>
> [2] https://stat.ethz.ch/pipermail/r-package-devel/2023q3/009513.html
>
> ______________________________________________
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel