It's difficult to comment without more detail: the number of workers, their relative speeds, the number of tasks, and the expected completion time of each task.
As an extreme example, say you have 4 workers (all of the same speed), two 15-minute tasks and sixteen 1-minute tasks. Depending on how this is scheduled, it will take between 15 and 19 minutes. In the worst case the sixteen 1-minute tasks are handed out first (4 minutes across 4 workers) and the two 15-minute tasks only start afterwards, giving 19 minutes.

Optimally:

Worker 1: 15
Worker 2: 15
Worker 3: 1 1 1 1 1 1 1 1
Worker 4: 1 1 1 1 1 1 1 1

Even then, after 8 minutes workers 3 and 4 will be idle, and they remain idle for the remaining 7 minutes until workers 1 and 2 finish.

I had a similar problem where I had fast and slow workers, and initially split the work into a number of tasks similar to the number of workers. This left an overhang similar to what you describe. In my case more granularity helped: I split the work into many tasks so that #tasks >> #workers.

On Thursday, April 7, 2016 at 2:21:28 AM UTC+10, Thomas Covert wrote:
> The manual suggests that pmap(f, lst) will dynamically "feed" elements of
> lst to the function f as each worker completes its previous assignment, and
> in my read of the code for pmap, this is indeed what it does.
>
> However, I have found that, in practice, many of the workers that I spin
> up for pmap tasks are idle for the last, say, half of the total time needed
> to complete the task. In my pmap usage, it is the case that the complexity
> of the workload varies across elements of lst, so that some elements should
> take a long time to compute (say, 15 minutes on a core of my machine) and
> others a short time (less than 1 minute). Knowing about this heterogeneity
> and observing this pattern of idle workers after about half of the work is
> done would normally lead me to think that pmap is scheduling workers ahead
> of time, not dynamically. Some workers will get "lucky" and have easier
> than average workload, and others are unlucky and have harder workload. At
> the end of the calculation, only the unlucky workers are still working.
> However, this isn't what pmap is doing, so I'm kinda confused.
>
> Am I crazy? The documentation for pmap says that it is scheduling tasks
> dynamically and I am pre-randomizing the order of work in lst so that
> worker 1 doesn't get easier tasks, in expectation, than worker N. Or is it
> more likely that I've got a bug somewhere?
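
To put numbers on the 4-worker example from my reply, here is a minimal sketch of a greedy dynamic scheduler, where each worker that becomes free takes the next item in list order. It is a simulation written in Python, not Julia's pmap, and the function name makespan is made up for illustration. It assumes all workers run at the same speed and, for the last case, that the long tasks can be split evenly into 1-minute pieces, which may not be true of your workload.

```python
import heapq

def makespan(durations, n_workers):
    """Total wall time when each free worker takes the next task in list order."""
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0] * n_workers
    heapq.heapify(free_at)
    finish = 0
    for d in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + d
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish

long_first = [15, 15] + [1] * 16    # long tasks handed out first
short_first = [1] * 16 + [15, 15]   # long tasks handed out last
fine_grained = [1] * 46             # same 46 minutes of work in 1-minute pieces

print(makespan(long_first, 4))      # 15
print(makespan(short_first, 4))     # 19
print(makespan(fine_grained, 4))    # 12
```

So even with fully dynamic scheduling, the order and the size of the tasks decide how long workers sit idle at the end. With a randomized order you would land somewhere between 15 and 19 minutes here, and finer-grained tasks are what bring the total close to the ideal 46 / 4 = 11.5 minutes.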