Hello Tejun Heo,

I've backported the schedule_on_each_cpu() "direct execution" patch onto 3.0.30-rt50,
and it fixed my problem.

The effective patch is attached below.

However, I do not understand why machine1 can expose the problem while machine2 cannot.

My guess is that it is related to the rt kernel's preemption level; could the difference be due to CPU performance?

What do you think about this?

Thank you~
diff -up linux-3.0.30-rt50/kernel/workqueue.c.bak linux-3.0.30-rt50/kernel/workqueue.c
--- linux-3.0.30-rt50/kernel/workqueue.c.bak    2013-06-08 19:09:06.801059232 +0800
+++ linux-3.0.30-rt50/kernel/workqueue.c        2013-06-08 19:09:15.680069626 +0800
@@ -1922,6 +1922,7 @@ static int worker_thread(void *__worker)
 
        /* tell the scheduler that this is a workqueue worker */
        worker->task->flags |= PF_WQ_WORKER;
+       smp_mb();
 woke_up:
        spin_lock_irq(&gcwq->lock);
 
@@ -2736,6 +2737,7 @@ EXPORT_SYMBOL(schedule_delayed_work_on);
 int schedule_on_each_cpu(work_func_t func)
 {
        int cpu;
+       int orig = -1;
        struct work_struct __percpu *works;
 
        works = alloc_percpu(struct work_struct);
@@ -2744,13 +2746,20 @@ int schedule_on_each_cpu(work_func_t fun
 
        get_online_cpus();
 
+       if(current->flags & PF_WQ_WORKER)
+               orig = raw_smp_processor_id();
+
        for_each_online_cpu(cpu) {
                struct work_struct *work = per_cpu_ptr(works, cpu);
 
                INIT_WORK(work, func);
-               schedule_work_on(cpu, work);
+               if(cpu != orig)
+                       schedule_work_on(cpu, work);
        }
 
+       if (orig >= 0)
+               func(per_cpu_ptr(works,orig));
+
        for_each_online_cpu(cpu)
                flush_work(per_cpu_ptr(works, cpu));
 
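For anyone following along: the deadlock the patch avoids can be illustrated with a small userspace analogy (not kernel code; all names here are illustrative). Each "CPU" gets a single-threaded work queue; if a work item itself calls schedule_on_each_cpu() and queues work back onto its own queue and then flushes, the worker waits on work that only it could run. The patch's fix is to execute the local CPU's work function directly:

```python
# Userspace sketch of the direct-execution fix in schedule_on_each_cpu().
# One single-threaded "workqueue" per CPU; names are illustrative only.
import queue
import threading

class WorkQueue:
    """A single worker thread draining a FIFO of work items."""
    def __init__(self):
        self.q = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            fn, done = self.q.get()
            fn()
            done.set()

    def schedule(self, fn):
        done = threading.Event()
        self.q.put((fn, done))
        return done            # caller can wait(), like flush_work()

NCPU = 4
queues = [WorkQueue() for _ in range(NCPU)]
results = []

def schedule_on_each_cpu(func, calling_cpu=None):
    events = []
    for cpu in range(NCPU):
        if cpu == calling_cpu:
            func(cpu)          # direct execution, as in the patch:
                               # never queue to our own (busy) worker
        else:
            events.append(queues[cpu].schedule(lambda c=cpu: func(c)))
    for e in events:
        e.wait()               # analogous to flush_work() on each cpu

# A work item running on CPU 0's worker calls schedule_on_each_cpu().
# Without the calling_cpu special case it would queue work onto CPU 0's
# own queue and then wait for it, i.e. a self-deadlock.
done = queues[0].schedule(
    lambda: schedule_on_each_cpu(results.append, calling_cpu=0))
done.wait()
print(sorted(results))  # [0, 1, 2, 3]
```

The PF_WQ_WORKER check in the patch plays the role of `calling_cpu` here: it detects "we are already a workqueue worker" and switches to running the local work synchronously.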
