Hey Sho,

> I totally agree with this point.
> Currently, I'm planning to implement tied tasks using the breadth-first
> scheduler described in section 3.1 of "Evaluation of OpenMP Task
> Scheduling Strategies" by the Nanos Group.
> http://www.sarc-ip.org/files/null/Workshop/1234128788173__TSchedStrat-iwomp08.pdf
>
> That is:
> * A team has one team queue which contains tasks.
> * A team has some (user-level) threads.
> * A thread can have one running task.
> * A thread has a private queue which contains tasks.
> * When a task is created, it is queued in the team queue.
> * Each thread steals tasks from the team queue and inserts them in its
>   private queue.
> * Once a tied task has been executed in a thread, it is queued only in
>   that thread's private queue when it encounters `taskwait'.
> * Each thread runs a task from its private queue.
>
> But I'm not sure how to achieve good load-balancing and what kind of
> cutoff strategy to take.
> As for load-balancing, I'll read the Nanos4 implementation and ask the
> Nanos Group about it.
> (Of course your advice will do :-) )
>
> As for cutoff, basically I can choose the `max-tasks' strategy or the
> `max-levels' strategy.
> When the number of tasks or recursion levels exceeds this value, the
> scheduler stops its work and executes each task as statements in a
> sequential program.
> But "Evaluation of OpenMP Task Scheduling Strategies" says which
> cutoff strategy is better differs from application to application.
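For reference, here is a minimal sketch of the queueing scheme you list above, just to pin down the rules. It is a single-threaded, round-robin simulation with no locking, not libgomp code: every name in it (struct task, struct queue, schedule_step, simulate, waits_left) is hypothetical, and "stealing" is simply a pop from the shared team queue. A real implementation would need synchronization on the team queue and real task bodies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TASKS   64
#define NUM_THREADS 2

/* Hypothetical task record (not libgomp's task structure). */
struct task {
    int id;
    int owner;       /* thread the task is tied to; -1 until first run */
    int waits_left;  /* simulated number of taskwait suspensions left */
    int done;
};

/* Fixed-capacity FIFO ring buffer of task pointers. */
struct queue {
    struct task *slot[MAX_TASKS];
    int head, count;
};

void push(struct queue *q, struct task *t) {
    assert(q->count < MAX_TASKS);
    q->slot[(q->head + q->count) % MAX_TASKS] = t;
    q->count++;
}

struct task *pop(struct queue *q) {
    if (q->count == 0)
        return NULL;
    struct task *t = q->slot[q->head];
    q->head = (q->head + 1) % MAX_TASKS;
    q->count--;
    return t;
}

struct queue team_queue;                 /* one per team */
struct queue private_queue[NUM_THREADS]; /* one per thread */

/* One scheduling step for thread tid; returns 1 if a task ran. */
int schedule_step(int tid) {
    struct queue *mine = &private_queue[tid];

    /* Private queue empty: take one task from the team queue. */
    if (mine->count == 0) {
        struct task *stolen = pop(&team_queue);
        if (stolen == NULL)
            return 0;
        push(mine, stolen);
    }

    struct task *t = pop(mine);
    if (t->owner == -1)
        t->owner = tid;          /* tie the task on first execution */
    assert(t->owner == tid);     /* a tied task never migrates */

    if (t->waits_left > 0) {
        t->waits_left--;
        push(mine, t);           /* suspended at taskwait: private queue only */
    } else {
        t->done = 1;
    }
    return 1;
}

/* Queue ntasks tasks, run all threads round-robin until idle,
   and return how many tasks completed. */
int simulate(int ntasks) {
    static struct task tasks[MAX_TASKS];
    int completed = 0;

    assert(ntasks >= 0 && ntasks <= MAX_TASKS);
    memset(&team_queue, 0, sizeof team_queue);
    memset(private_queue, 0, sizeof private_queue);

    for (int i = 0; i < ntasks; i++) {
        tasks[i] = (struct task){ .id = i, .owner = -1,
                                  .waits_left = i % 3, .done = 0 };
        push(&team_queue, &tasks[i]);  /* new tasks go to the team queue */
    }

    for (int progress = 1; progress; ) {
        progress = 0;
        for (int tid = 0; tid < NUM_THREADS; tid++)
            progress |= schedule_step(tid);
    }

    for (int i = 0; i < ntasks; i++)
        completed += tasks[i].done;
    return completed;
}
```

Calling simulate(n) from a main() runs n tasks to completion. The assert in schedule_step checks the tied-task rule: once a task has started on a thread, it only ever reappears in that thread's private queue.
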
Right, start with distributing the queues and then think about load
balancing. I would say don't worry too much about cut-offs at this point.
Finding a good cut-off strategy that works without drawbacks is pretty much
an open research problem. Just spawn the tasks and focus on efficient task
creation and scheduling. In my experience, going from a centralized to a
distributed task pool already makes a huge difference.

To get a better overview of other implementations, which you can compare to
libgomp, I recommend a couple of papers. For example:

- OpenMP Tasks in IBM XL Compilers, X. Teruel et al.
- Support for OpenMP Tasks in Nanos v4, X. Teruel et al.
- OpenMP 3.0 Tasking Implementation in OpenUH, C. Addison et al.
- A Runtime Implementation of OpenMP Tasks, J. LaGrone et al.

You should be able to find copies online. If not, let me know.

-Andreas