Philippe Gerum wrote:
> On Fri, 2007-07-20 at 16:20 +0200, Jan Kiszka wrote:
>
>> OK, let's go through this another time, this time under the motto "get
>> the locking right". As a start (and a help for myself), here comes an
>> overview of the scheme the final version may expose - as long as there
>> are separate locks:
>>
>> gatekeeper_thread / xnshadow_relax:
>>     rpilock, followed by nklock
>>     (while xnshadow_relax puts both under irqsave...)
>>
>
> The relaxing thread must not be preempted in primary mode before it
> schedules out but after it has been linked to the RPI list, otherwise
> the root thread would benefit from a spurious priority boost. This said,
> in the UP case, we have no lock to contend for anyway, so the point of
> discussing whether we should have the rpilock or not is moot here.
>
>> xnshadow_unmap:
>>     nklock, then rpilock nested
>>
>
> This one is the hardest to solve.
>
>> xnshadow_start:
>>     rpilock, followed by nklock
>>
>> xnshadow_renice:
>>     nklock, then rpilock nested
>>
>> schedule_event:
>>     only rpilock
>>
>> setsched_event:
>>     nklock, followed by rpilock, followed by nklock again
>>
>> And then there is xnshadow_rpi_check which has to be fixed to:
>>     nklock, followed by rpilock (here was our lock-up bug)
>>
>
> rpilock -> nklock in fact.

Yes, I meant it the other way around: the invocation of
xnpod_renice_root() must be moved out of nklock - which should be
trivial, correct?

> The last lockup was rather likely due to the
> gatekeeper's dangerous nesting of nklock -> rpilock -> nklock.

This path - as one of three with this ordering - surely triggered the
bug. But given that the other two nestings of this kind are as yet
unresolvable, while our reversely ordered nesting in xnshadow_rpi_check
is resolvable, it is clear that the latter is the weak point. So far we
only have a fix for Mathias' test case, which stresses just a subset of
all rpilock paths appropriately.

>
>> That's a scheme which /should/ be safe. Unfortunately, I see no way to
>> get rid of the remaining nestings.
>>
>
> There is one, which consists of getting rid of the rpilock entirely. The
> purpose of such lock is to protect the RPI list when fixing the
> situation after a task migration in secondary mode triggered from the
> Linux side. Addressing the latter issue differently may solve the
> problem more elegantly than figuring out how to combine the two locks,
> or hammering the hot path with the nklock. Will look at this.

Even better! Looking forward.

Jan
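
The ordering fix discussed above for xnshadow_rpi_check (rpilock taken
first, nklock nested inside) can be sketched in user space as follows.
This is only a minimal model under stated assumptions, not the nucleus
code: the two pthread mutexes stand in for the real spinlocks, and
rpi_top_prio, root_prio, holding_nklock, try_lock_rpilock_ordered() and
rpi_check_model() are hypothetical names introduced for illustration.
The assignment to root_prio merely stands in for the effect of
xnpod_renice_root().

```c
#include <pthread.h>
#include <stdbool.h>

/* User-space stand-ins for the two nucleus locks (assumption). */
static pthread_mutex_t rpilock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t nklock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical state, for illustration only. */
static int rpi_top_prio;  /* protected by rpilock */
static int root_prio;     /* protected by nklock */

/* Per-thread flag used to detect the reversed nesting. */
static _Thread_local bool holding_nklock;

static void lock_nklock(void)
{
    pthread_mutex_lock(&nklock);
    holding_nklock = true;
}

static void unlock_nklock(void)
{
    holding_nklock = false;
    pthread_mutex_unlock(&nklock);
}

/*
 * Take rpilock only if this thread does not already hold nklock.
 * Returns false instead of nesting nklock -> rpilock.
 */
static bool try_lock_rpilock_ordered(void)
{
    if (holding_nklock)
        return false;
    pthread_mutex_lock(&rpilock);
    return true;
}

/*
 * Model of the fixed path: rpilock first, then nklock nested.
 * Returns false if the caller already holds nklock.
 */
static bool rpi_check_model(void)
{
    int prio;

    if (!try_lock_rpilock_ordered())
        return false;

    prio = rpi_top_prio;   /* read RPI state under rpilock */

    lock_nklock();
    root_prio = prio;      /* stand-in for renicing the root thread */
    unlock_nklock();

    pthread_mutex_unlock(&rpilock);
    return true;
}
```

Calling rpi_check_model() with no lock held updates root_prio; calling
it while holding nklock is refused instead of nesting in the reversed
order. The sketch covers only this one path: the nklock -> rpilock
nestings listed for xnshadow_unmap and xnshadow_renice would be
rejected by the guard, which is precisely the conflict the thread
leaves open.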
_______________________________________________
Xenomai-core mailing list
Xenomai-core@gna.org
https://mail.gna.org/listinfo/xenomai-core