[julia-users] Re: pmap scheduling and idle workers near the "end" of a job

2016-04-06 Thread 'Greg Plowman' via julia-users
In that case, try sorting the tasks in descending order of complexity (i.e. 
start longest running tasks first).


On Thursday, April 7, 2016 at 9:23:56 AM UTC+10, Thomas Covert wrote:

> Its hard to construct a MWE without the data I am using, but I can give a 
> bit more detail.  
>
> There are 20 workers and about 4200 elements in lst, so using your 
> terminology, I've got many more tasks than workers.  If lst were sorted by 
> complexity (its not, I randomized the order), the complexity of item i is 
> roughly cubic in i.
>
> On Wednesday, April 6, 2016 at 4:29:16 PM UTC-5, Greg Plowman wrote:
>>
>>
>> It's difficult to comment without knowing more detail about numbers of 
>> workers, their relative speed, number of tasks and their expected 
>> completion times.
>>
>> As an extreme example, say you have 4 workers (all of the same speed) and 
>> 2x15-minute tasks and 16x1-minute tasks.
>>
>> Depending on how this is scheduled, this will take between 15 to 19 
>> minutes. Optimally:
>> Worker 1: 15
>> Worker 2: 15
>> Worker 3: 1 1 1 1 1 1 1 1 
>> Worker 4: 1 1 1 1 1 1 1 1
>>
>> After 8 minutes, workers 3 and 4 will be idle, and remain idle for the 
>> remaining 7 minutes before workers 1 and 2 finish.
>>
>> I had a similar problem where I had fast and slow workers, and initially 
>> split the work into a number of tasks similar to the number of workers.
>> This left an overhang similar to what you describe.
>> In my case more granularity helped. Splitting into many tasks so that 
>> #tasks >> #workers helped.
>>
>>
>>
>> On Thursday, April 7, 2016 at 2:21:28 AM UTC+10, Thomas Covert wrote:
>>
>>> The manual suggests that pmap(f, lst) will dynamically "feed" elements 
>>> of lst to the function f as each worker completes its previous assignment, 
>>> and in my read of the code for pmap, this is indeed what it does.
>>>
>>> However, I have found that, in practice, many of the workers that I spin 
>>> up for pmap tasks are idle for the last, say, half of the total time needed 
>>> to complete the task.  In my pmap usage, it is the case that the complexity 
>>> of the workload varies across elements of lst, so that some elements should 
>>> take a long time to compute (say, 15 minutes on a core of my machine) and 
>>> others a short time (less than 1 minute).  Knowing about this heterogeneity 
>>> and observing this pattern of idle workers after about half of the work is 
>>> done would normally lead me to think that pmap is scheduling workers ahead 
>>> of time, not dynamically.  Some workers will get "lucky" and have easier 
>>> than average workload, and others are unlucky and have harder workload.  At 
>>> the end of the calculation, only the unlucky workers are still working. 
>>>  However, this isn't what pmap is doing, so I'm kinda confused. 
>>>
>>> Am I crazy?  The documentation for pmap says that it is scheduling tasks 
>>> dynamically and I am pre-randomizing the order of work in lst so that 
>>> worker 1 doesn't get easier tasks, in expectation, than worker N.  Or is it 
>>> more likely that I've got a bug somewhere?
>>>
>>>
>>>

[julia-users] Re: pmap scheduling and idle workers near the "end" of a job

2016-04-06 Thread Thomas Covert
Its hard to construct a MWE without the data I am using, but I can give a 
bit more detail.  

There are 20 workers and about 4200 elements in lst, so using your 
terminology, I've got many more tasks than workers.  If lst were sorted by 
complexity (its not, I randomized the order), the complexity of item i is 
roughly cubic in i.

On Wednesday, April 6, 2016 at 4:29:16 PM UTC-5, Greg Plowman wrote:
>
>
> It's difficult to comment without knowing more detail about numbers of 
> workers, their relative speed, number of tasks and their expected 
> completion times.
>
> As an extreme example, say you have 4 workers (all of the same speed) and 
> 2x15-minute tasks and 16x1-minute tasks.
>
> Depending on how this is scheduled, this will take between 15 to 19 
> minutes. Optimally:
> Worker 1: 15
> Worker 2: 15
> Worker 3: 1 1 1 1 1 1 1 1 
> Worker 4: 1 1 1 1 1 1 1 1
>
> After 8 minutes, workers 3 and 4 will be idle, and remain idle for the 
> remaining 7 minutes before workers 1 and 2 finish.
>
> I had a similar problem where I had fast and slow workers, and initially 
> split the work into a number of tasks similar to the number of workers.
> This left an overhang similar to what you describe.
> In my case more granularity helped. Splitting into many tasks so that 
> #tasks >> #workers helped.
>
>
>
> On Thursday, April 7, 2016 at 2:21:28 AM UTC+10, Thomas Covert wrote:
>
>> The manual suggests that pmap(f, lst) will dynamically "feed" elements of 
>> lst to the function f as each worker completes its previous assignment, and 
>> in my read of the code for pmap, this is indeed what it does.
>>
>> However, I have found that, in practice, many of the workers that I spin 
>> up for pmap tasks are idle for the last, say, half of the total time needed 
>> to complete the task.  In my pmap usage, it is the case that the complexity 
>> of the workload varies across elements of lst, so that some elements should 
>> take a long time to compute (say, 15 minutes on a core of my machine) and 
>> others a short time (less than 1 minute).  Knowing about this heterogeneity 
>> and observing this pattern of idle workers after about half of the work is 
>> done would normally lead me to think that pmap is scheduling workers ahead 
>> of time, not dynamically.  Some workers will get "lucky" and have easier 
>> than average workload, and others are unlucky and have harder workload.  At 
>> the end of the calculation, only the unlucky workers are still working. 
>>  However, this isn't what pmap is doing, so I'm kinda confused. 
>>
>> Am I crazy?  The documentation for pmap says that it is scheduling tasks 
>> dynamically and I am pre-randomizing the order of work in lst so that 
>> worker 1 doesn't get easier tasks, in expectation, than worker N.  Or is it 
>> more likely that I've got a bug somewhere?
>>
>>
>>

[julia-users] Re: pmap scheduling and idle workers near the "end" of a job

2016-04-06 Thread 'Greg Plowman' via julia-users

It's difficult to comment without knowing more detail about numbers of 
workers, their relative speed, number of tasks and their expected 
completion times.

As an extreme example, say you have 4 workers (all of the same speed) and 
2x15-minute tasks and 16x1-minute tasks.

Depending on how this is scheduled, this will take between 15 to 19 
minutes. Optimally:
Worker 1: 15
Worker 2: 15
Worker 3: 1 1 1 1 1 1 1 1 
Worker 4: 1 1 1 1 1 1 1 1

After 8 minutes, workers 3 and 4 will be idle, and remain idle for the 
remaining 7 minutes before workers 1 and 2 finish.

I had a similar problem where I had fast and slow workers, and initially 
split the work into a number of tasks similar to the number of workers.
This left an overhang similar to what you describe.
In my case more granularity helped. Splitting into many tasks so that 
#tasks >> #workers helped.



On Thursday, April 7, 2016 at 2:21:28 AM UTC+10, Thomas Covert wrote:

> The manual suggests that pmap(f, lst) will dynamically "feed" elements of 
> lst to the function f as each worker completes its previous assignment, and 
> in my read of the code for pmap, this is indeed what it does.
>
> However, I have found that, in practice, many of the workers that I spin 
> up for pmap tasks are idle for the last, say, half of the total time needed 
> to complete the task.  In my pmap usage, it is the case that the complexity 
> of the workload varies across elements of lst, so that some elements should 
> take a long time to compute (say, 15 minutes on a core of my machine) and 
> others a short time (less than 1 minute).  Knowing about this heterogeneity 
> and observing this pattern of idle workers after about half of the work is 
> done would normally lead me to think that pmap is scheduling workers ahead 
> of time, not dynamically.  Some workers will get "lucky" and have easier 
> than average workload, and others are unlucky and have harder workload.  At 
> the end of the calculation, only the unlucky workers are still working. 
>  However, this isn't what pmap is doing, so I'm kinda confused. 
>
> Am I crazy?  The documentation for pmap says that it is scheduling tasks 
> dynamically and I am pre-randomizing the order of work in lst so that 
> worker 1 doesn't get easier tasks, in expectation, than worker N.  Or is it 
> more likely that I've got a bug somewhere?
>
>
>