Hi. Sorry for the late reply.

> Depends on what you mean by lazy task creation, gcc schedules
> tasks lazily if they aren't if (0), some data structure if created
> for them when encountering #pragma omp task directive, but I guess
> any implementation will do something like that.

By lazy task creation, I mean the following implementation:

- 1 CPU core has 1 worker
- 1 worker has 1 deque (LIFO)
- 1 deque has some tasks
- What a worker does:
  - Execute tasks from the head of its own deque
  - When task A encounters a "#pragma omp task" directive, the worker creates
    the new task and executes it immediately, pushing A onto the head of the
    deque. (A context switch occurs here.)
    This is the important point, because A can then move to other deques. (*)
  - Steal a task from the tail of the deque on another core when its own
    deque is empty
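
To make the deque discipline above concrete, here is a minimal
single-threaded sketch in C. It only models the ordering rules (the owner
works at the head, a thief takes from the tail, and a spawn parks the parent
so the child runs first). All names (deque, push_head, pop_head, steal_tail,
spawn_child, worker_next) are made up for illustration; this is not the
MassiveThreads or libgomp API, and it has no locking and no real context
switch.

```c
#include <assert.h>
#include <stddef.h>

#define DEQUE_CAP 64

/* Hypothetical task handle (non-negative). A real runtime would store a
 * saved execution context here so that a parked parent can be resumed. */
typedef int task_id;

typedef struct {
    task_id slot[DEQUE_CAP];
    size_t head;   /* one past the newest task; the owner works here */
    size_t tail;   /* index of the oldest task; thieves take from here */
} deque;

void deque_init(deque *d) { d->head = d->tail = 0; }

int deque_empty(const deque *d)
{
    assert(d->tail <= d->head);
    return d->head == d->tail;
}

/* Owner side: push onto the head. Returns 0, or -1 if the array is full.
 * Slots below tail are not reused in this sketch. */
int push_head(deque *d, task_id t)
{
    if (d->head == DEQUE_CAP)
        return -1;
    d->slot[d->head++] = t;
    return 0;
}

/* Owner side: pop the newest task (LIFO). Returns 0, or -1 if empty. */
int pop_head(deque *d, task_id *out)
{
    if (deque_empty(d))
        return -1;
    *out = d->slot[--d->head];
    return 0;
}

/* Thief side: take the oldest task from the tail. Returns 0, or -1 if empty. */
int steal_tail(deque *d, task_id *out)
{
    if (deque_empty(d))
        return -1;
    *out = d->slot[d->tail++];
    return 0;
}

/* Task `parent` hits a task directive: park the parent on the head of the
 * worker's own deque and return the child, which runs immediately.
 * Returns -1 if the parent could not be parked. */
task_id spawn_child(deque *own, task_id parent, task_id child)
{
    if (push_head(own, parent) != 0)
        return -1;
    return child;
}

/* Pick the next task: own head first, otherwise steal from a victim's tail. */
int worker_next(deque *own, deque *victim, task_id *out)
{
    if (pop_head(own, out) == 0)
        return 0;
    return steal_tail(victim, out);
}
```

If task 1 spawns task 2 and task 2 then spawns task 3 on the same worker, the
deque holds 1 at the tail and 2 at the head, so an idle worker steals task 1,
the oldest parked parent. This is the movement marked (*) above.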

A senior colleague of mine has already written a library that implements the
scheduling algorithm above.
(It is called `MassiveThreads', but the paper about it is written in
Japanese.)
MassiveThreads has shown that this algorithm makes OpenMP-task-like
workloads run fast.

With this implementation:
- Nested `task' directives are handled naturally.
- Suppose task A is in deque D, and task A1 is created in D when A
  encounters a `task' directive (see (*)).
  A1 runs as soon as it is created. So even if A goes on to execute
  functions that take a long time to finish, that work can be done after A
  has been stolen into a deque other than D.
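
As an illustration of nested `task' directives, here is a small hypothetical
example in C (a naive Fibonacci, not the test case discussed in this thread).
Every call to fib() is itself running inside a task when it creates two more,
so the directives nest to depth n. The names fib and fib_parallel are only
for this sketch. Which thread runs which task is up to the runtime. Under the
scheme described above, the child would start at once and the parent's
continuation would be what other workers steal.

```c
/* Naive Fibonacci: each call creates two child tasks, then waits for them. */
long fib(int n)
{
    long a, b;

    if (n < 2)
        return n;

    /* The encountering task plays the role of A, each child the role of A1. */
    #pragma omp task shared(a) firstprivate(n)
    a = fib(n - 1);

    #pragma omp task shared(b) firstprivate(n)
    b = fib(n - 2);

    /* Wait only for the two children created just above. */
    #pragma omp taskwait
    return a + b;
}

/* Start a thread team and let a single thread create the root of the tree. */
long fib_parallel(int n)
{
    long r = 0;

    #pragma omp parallel
    #pragma omp single
    r = fib(n);

    return r;
}
```

Built with gcc -fopenmp, the tasks may be spread over the thread team. Built
without it, the pragmas are ignored and the code runs serially with the same
result.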


Anyway, I'd like to read the materials that were referred to when the current
libgomp `task' implementation was written (so that I can read the code more
easily).

Do you know of any?

> What your testcase shows is not whether tasks are created lazily or not, but
> how good/poor #pragma omp taskwait implementation is.  And, for your testcase
> libgomp/task.c (GOMP_taskwait) definitely could be improved.  Currently it only
> tries to schedule in children that will be awaited by the current tasks and if
> there are no such children, goes to sleep, waiting for them to complete.
> Scheduling in random unrelated tasks is problematic, because the unrelated
> task might take too long to complete and delay the taskwait for way too long
> (note, gcc doesn't have untied tasks, all tasks are tied once they are scheduled
> onto some particular tasks - setcontext/swapcontext is quite fragile thing to do).
> But it is true it could very well schedule tasks that are taskwaited by tasks
> taskwaited by current task, and transitively further.  Plus, be able to temporarily
> awake such a sleeping thread if there are tasks it can transitively taskwait
> for, as if those don't complete, the current taskwait won't return.
> 
>       Jakub

--
Sho Nakatani
