On Tue, 8 Mar 2011 21:21:29 +0000 Andrew Doran <a...@netbsd.org> wrote:
> Your program makes life somewhat difficult for the scheduler
> because your jobs are short lived. It sees a number of threads in a
> tight loop of communication together, running short tasks, and so will
> try to run these on the same CPU.
>
> When you signal and unlock each task queue, it's likely to preempt the
> controlling thread and run the short job on the same CPU, so you get
> a sort of serializing effect because the controlling thread is off
> the CPU and can't signal the next worker thread to run.

OK, this probably explains why the CPUs are stuck waiting. This may be a
good strategy for executing batch jobs, but it seems to work poorly for
bursts of short parallel jobs. I imagine that in the near future, as the
number of CPU cores goes up, this type of concurrent programming will
become more common: a serial path followed by bursts of short parallel
sections.

> As each worker thread consumes CPU time it will be penalised by
> having its priority lowered below that of the controlling thread.
> This cancels the preemption effect, causing things to be queued up on
> the controlling CPU, thereby allowing remote CPUs to steal tasks from
> it and get a slice of the action.
>
> For some reason this distribution as a result of lowered priority is
> not happening, perhaps because of this test in sched_takecpu():
>
> 385		/* If CPU of this thread is idling - run there */
> 386		if (ci_rq->r_count == 0) {
>
> Can you try changing this to:
>
>	/*
>	 * If CPU of this thread is idling - run there.
>	 * XXXAD Test for PL_PPWAIT and special case for vfork()??
>	 */
>	if (ci->ci_data.cpu_onproc == ci->ci_data.cpu_idlelwp)

I've tried these changes, but unfortunately they make little to no
difference.