Hello again Andreas,
(I forgot to Cc the GCC ML, so I'm resending this email.)
> Right, start with distributing the queues and then think about load
> balancing.

OK.

> I would say don't worry too much about cut-offs at this point. Finding a
> good cut-off strategy that works without drawbacks is pretty much an
> open research problem. Just spawn the tasks and focus on efficient task
> creation and scheduling. In my experience, going from a centralized to a
> distributed task pool already makes a huge difference.

As you say, deciding on a cut-off strategy sounds pretty difficult.
I read "Evaluating OpenMP 3.0 Run Time Systems on Unbalanced Task Graphs"
and learned a little about Adaptive Task Cutoff, in which a good cut-off is
found from profiling data collected early in the program's execution.
I guess it's almost impossible to implement within the GSoC term, but
I'm interested in taking it on in the future.

> To get a better overview of other implementations, which you can compare
> to libgomp, I recommend a couple of papers. For example:
>
> - OpenMP Tasks in IBM XL Compilers, X. Teruel et al.
> - Support for OpenMP Tasks in Nanos v4, X. Teruel et al.
> - OpenMP 3.0 Tasking Implementation in OpenUH, C. Addison et al.
> - A Runtime Implementation of OpenMP Tasks, J. LaGrone et al.

Thanks. I'll read all of them ;-)

--
Sho Nakatani