[julia-users] Re: pmap scheduling and idle workers near the "end" of a job

Thomas Covert Wed, 06 Apr 2016 16:24:06 -0700

Its hard to construct a MWE without the data I am using, but I can give a 
bit more detail.


There are 20 workers and about 4200 elements in lst, so using your 
terminology, I've got many more tasks than workers.  If lst were sorted by 
complexity (its not, I randomized the order), the complexity of item i is 
roughly cubic in i.

On Wednesday, April 6, 2016 at 4:29:16 PM UTC-5, Greg Plowman wrote:
>
>
> It's difficult to comment without knowing more detail about numbers of 
> workers, their relative speed, number of tasks and their expected 
> completion times.
>
> As an extreme example, say you have 4 workers (all of the same speed) and 
> 2x15-minute tasks and 16x1-minute tasks.
>
> Depending on how this is scheduled, this will take between 15 to 19 
> minutes. Optimally:
> Worker 1: 15
> Worker 2: 15
> Worker 3: 1 1 1 1 1 1 1 1 
> Worker 4: 1 1 1 1 1 1 1 1
>
> After 8 minutes, workers 3 and 4 will be idle, and remain idle for the 
> remaining 7 minutes before workers 1 and 2 finish.
>
> I had a similar problem where I had fast and slow workers, and initially 
> split the work into a number of tasks similar to the number of workers.
> This left an overhang similar to what you describe.
> In my case more granularity helped. Splitting into many tasks so that 
> #tasks >> #workers helped.
>
>
>
> On Thursday, April 7, 2016 at 2:21:28 AM UTC+10, Thomas Covert wrote:
>
>> The manual suggests that pmap(f, lst) will dynamically "feed" elements of 
>> lst to the function f as each worker completes its previous assignment, and 
>> in my read of the code for pmap, this is indeed what it does.
>>
>> However, I have found that, in practice, many of the workers that I spin 
>> up for pmap tasks are idle for the last, say, half of the total time needed 
>> to complete the task.  In my pmap usage, it is the case that the complexity 
>> of the workload varies across elements of lst, so that some elements should 
>> take a long time to compute (say, 15 minutes on a core of my machine) and 
>> others a short time (less than 1 minute).  Knowing about this heterogeneity 
>> and observing this pattern of idle workers after about half of the work is 
>> done would normally lead me to think that pmap is scheduling workers ahead 
>> of time, not dynamically.  Some workers will get "lucky" and have easier 
>> than average workload, and others are unlucky and have harder workload.  At 
>> the end of the calculation, only the unlucky workers are still working. 
>>  However, this isn't what pmap is doing, so I'm kinda confused. 
>>
>> Am I crazy?  The documentation for pmap says that it is scheduling tasks 
>> dynamically and I am pre-randomizing the order of work in lst so that 
>> worker 1 doesn't get easier tasks, in expectation, than worker N.  Or is it 
>> more likely that I've got a bug somewhere?
>>
>>
>>

[julia-users] Re: pmap scheduling and idle workers near the "end" of a job

Reply via email to