Hi,

it turned out that ipipe_critical_enter is broken on SMP systems with
more than 2 CPUs: On one CPU, Linux may have acquired an rwlock for
reading when it gets preempted by the critical IPI. On some other CPU,
Linux may have entered write_lock_irq[save] before the IPI arrived. That
writer spins with IRQs off and thus never takes the IPI, while the
reader cannot drop its lock before the critical section is released. So
the reader will be stuck in __ipipe_do_critical_sync, the writer in
__write_lock_failed - forever. First seen on real silicon (roughly once
per few hundred boots), finally caught under KVM and nailed down.
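
To make the dependency chain explicit, here is a toy wait-for model of
the scenario. It is a sketch only, not I-pipe code: the three roles, the
array layout and all names are made up for illustration, and it assumes
that the CPU calling ipipe_critical_enter waits for every other CPU to
acknowledge the IPI before it proceeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical roles in the toy model, not I-pipe identifiers. */
enum { CPU_INITIATOR, CPU_READER, CPU_WRITER, NR_MODEL_CPUS };

#define NOT_WAITING (-1)

/*
 * waits_for[i] is the CPU that CPU i is blocked on, or NOT_WAITING.
 * Each CPU waits on at most one other CPU, so following the chain from
 * every start node for at most n steps finds any cycle.
 */
static bool has_wait_cycle(const int *waits_for, size_t n)
{
	for (size_t start = 0; start < n; start++) {
		int cur = (int)start;

		for (size_t step = 0; step < n && cur != NOT_WAITING; step++) {
			assert(cur >= 0 && (size_t)cur < n);
			cur = waits_for[cur];
			if (cur == (int)start)
				return true;
		}
	}
	return false;
}

/* The reported scenario, under the assumption stated in the lead-in. */
static bool reported_scenario_deadlocks(void)
{
	int waits_for[NR_MODEL_CPUS];

	waits_for[CPU_INITIATOR] = CPU_WRITER;  /* waits for the writer's ack */
	waits_for[CPU_WRITER] = CPU_READER;     /* spins, IRQs off, on the rwlock */
	waits_for[CPU_READER] = CPU_INITIATOR;  /* parked until critical exit */
	return has_wait_cycle(waits_for, NR_MODEL_CPUS);
}
```

The cycle needs three participants in this model, which would match the
observation that only systems with more than 2 CPUs are affected.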

Two approaches to resolve this issue come to my mind so far. The first
one is to restart the whole ipipe_critical_enter after some (how many?)
cycles of futile waiting. The other is to accept the critical IPI even
if the top-most domain is stalled (as it is while sitting in
write_lock_irq), but I'm not 100% sure that our optimistic IRQ mask will
always allow this when Linux is the top-most domain (I assume we can
safely require other domains to avoid such deadlocks by design).
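
The first approach could look roughly like the sketch below. It is a
user-space illustration under stated assumptions, not a patch: the
sync_ops hooks, critical_enter_retry and the toy_* helpers are all
hypothetical, and it assumes that an aborted attempt can release the
CPUs that are already parked so that a lock holder gets to run again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical hooks; the real I-pipe code has no such interface. */
struct sync_ops {
	void (*send_ipi)(void *ctx);    /* ask all other CPUs to park */
	bool (*all_acked)(void *ctx);   /* have all of them parked? */
	void (*abort_sync)(void *ctx);  /* release CPUs parked so far */
	void (*backoff)(void *ctx);     /* let them make some progress */
};

/* Returns the attempt number that succeeded, or -1 after max_attempts. */
static int critical_enter_retry(const struct sync_ops *ops, void *ctx,
				unsigned long max_spins, int max_attempts)
{
	assert(ops && max_attempts > 0);

	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		ops->send_ipi(ctx);
		for (unsigned long spin = 0; spin < max_spins; spin++)
			if (ops->all_acked(ctx))
				return attempt;
		/* Futile wait: undo this attempt, then start over. */
		ops->abort_sync(ctx);
		ops->backoff(ctx);
	}
	return -1;
}

/* Toy stand-in for the other CPUs, only there to exercise the loop. */
struct toy_state {
	bool reader_parked;   /* reader sits in the sync loop, lock held */
	bool writer_waiting;  /* writer spins with IRQs off, cannot ack */
};

static void toy_send_ipi(void *ctx)
{
	((struct toy_state *)ctx)->reader_parked = true;
}

static bool toy_all_acked(void *ctx)
{
	struct toy_state *s = ctx;

	return s->reader_parked && !s->writer_waiting;
}

static void toy_abort_sync(void *ctx)
{
	((struct toy_state *)ctx)->reader_parked = false;
}

static void toy_backoff(void *ctx)
{
	struct toy_state *s = ctx;

	/* The released reader drops its lock, so the writer can finish. */
	if (!s->reader_parked)
		s->writer_waiting = false;
}

static int toy_run(bool writer_waiting)
{
	static const struct sync_ops ops = {
		.send_ipi = toy_send_ipi,
		.all_acked = toy_all_acked,
		.abort_sync = toy_abort_sync,
		.backoff = toy_backoff,
	};
	struct toy_state s = { false, writer_waiting };

	return critical_enter_retry(&ops, &s, 1000, 3);
}
```

In this toy model, toy_run(true) gets through on the second attempt and
toy_run(false) on the first. The "how many?" question stays open here:
max_spins is just a parameter, and the model simply assumes that the
reader and writer make progress during the backoff, which real hardware
does not guarantee.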

Comments? Better ideas?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

_______________________________________________
Adeos-main mailing list
[email protected]
https://mail.gna.org/listinfo/adeos-main