Dear all,

To make a formal description of the bug/mismatch in the implemented
protocol(s), we prepared a ticket at https://devel.rtems.org/ticket/4742#no2,
which contains the bug description and a counterexample. I hope it gives us a
clear overview of the bug/mismatch that we found.
Best,
Junjie

Gedare Bloom <ged...@rtems.org> wrote on Wed, Sep 21, 2022, 23:50:
> CC: devel@rtems.org
>
> Preserving this discussion for posterity.
>
> On Fri, Sep 16, 2022 at 3:09 PM Jian-Jia Chen
> <jian-jia.c...@cs.uni-dortmund.de> wrote:
> >
> > Dear Gedare, dear Sebastian,
> >
> > I would like to clarify a few things.
> >
> > == historical perspectives ==
> >
> > First of all, let me briefly recall the history of PCP and its
> > extensions. PCP was proposed in 1989 with two advantages in
> > comparison to PIP:
> >
> > 1. Under PCP, access to resources in a nested manner is
> >    deadlock-free.
> > 2. Under PCP, a higher-priority job J_i is blocked by at most one
> >    lower-priority critical section whose priority ceiling has a
> >    higher priority than or the same priority as J_i.
> >
> > PCP therefore became an influential result. However, in its original
> > design, a lower-priority job only inherits the priority of a
> > higher-priority job that it blocks. It was later noted that the
> > original PCP design can be extended to the so-called immediate
> > ceiling priority protocol (ICPP), which promotes the priority of a
> > lower-priority job to its priority ceiling immediately after the
> > lock. This extension preserves the two important properties of PCP.
> >
> > Now, in our discussion, we should further clarify whether the RTEMS
> > implementation is the ICPP described above or not. If yes, then it
> > should be deadlock-free under any nested resource access sequence.
> > Otherwise, it is not the ICPP described above; you may want to call
> > it a "restrictive ICPP" or something alike.
> >
> > When we submitted the paper, one reviewer also asked whether the
> > issue we found is a bug or a feature. I believe that it is both,
> > because there is no documentation.
> > As long as there is no documentation of such a limitation, it is a
> > bug, since the implementation does not comply with the ICPP;
> > otherwise, it is a feature of the implemented "restrictive ICPP".
> >
> > == designers' perspectives ==
> >
> > Now, coming to the discussion of whether "acquiring mutexes in an
> > arbitrary priority order is a good application design" or not, I
> > think we also have to clarify two things:
> >
> > 1. The program's semantics of resource access: it is supposed to
> >    lock resource R1 (guarded by mutex M1) first and then
> >    potentially resource R2 (guarded by mutex M2) in a nested
> >    manner, depending on the condition needed.
> > 2. The resource priority ceiling: it is unrelated to the program
> >    context and is used only to ensure that there is no deadlock in
> >    any request order and that there is at most one blocking from
> >    the lower-priority jobs.
> >
> > So, (1) is about the context of the program(s) and (2) is about the
> > resource management scheme.
> >
> > One can of course enforce the sequence of locks, but as soon as the
> > designer changes the priority assignment of the tasks or adds a new
> > task, the programs may have to be rewritten and may not be able to
> > keep the same behavior. Therefore, I think the beauty of PCP and
> > ICPP for achieving the deadlock-free property is destroyed if the
> > mutex usage has to follow a strict order to ensure deadlock
> > freedom. In fact, under such a strict order there is no circular
> > waiting, so there is by definition no deadlock.
> >
> > == ending ==
> >
> > Furthermore, no matter which approach is taken, it is important to
> > document the assertions/assumptions that must hold for the
> > implemented ICPP to be correct.
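The two ICPP rules recalled above (ceilings derived from base priorities, and immediate promotion on lock) can be sketched in a few lines of Python. This is an illustrative model only, not code from the paper or from RTEMS; the priority values are made up, and here larger numbers denote higher priority:

```python
# Sketch of the ICPP rules (illustrative model; larger number = higher priority).

def ceiling(user_base_priorities):
    """ICPP: a resource's ceiling is the highest base priority among
    all tasks that may request the resource."""
    return max(user_base_priorities)

def lock(task_priority, resource_ceiling):
    """ICPP: on acquiring the lock, the task's priority is raised to the
    ceiling immediately, not only when it actually blocks a
    higher-priority job (which is the original PCP behavior)."""
    return max(task_priority, resource_ceiling)

# R1 is used by tasks with base priorities {3, 7}; R2 by tasks with {3, 5}.
c1 = ceiling([3, 7])   # ceiling of R1 is 7
c2 = ceiling([3, 5])   # ceiling of R2 is 5

# A task with base priority 3 locks R1 and then, nested, R2:
p = lock(3, c1)        # runs at priority 7 while holding R1
p = lock(p, c2)        # nested lock of R2 is granted; priority stays 7
print(c1, c2, p)       # 7 5 7
```

Note that in this full-ICPP model the nested lock of R2 is granted even though the task is currently running above R2's ceiling, because admission is decided from base priorities; that is exactly the point at issue later in this thread.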
> > Best Regards,
> > Jian-Jia Chen
> >
> > On Sep 16, 2022, at 9:47 AM, Kuan-Hsun Chen <c00...@gmail.com> wrote:
> >
> > ---------- Forwarded message ---------
> > From: Sebastian Huber <sebastian.hu...@embedded-brains.de>
> > Date: Fri, Sep 16, 2022 at 08:09
> > Subject: Re: [PATCH] icpp remedy
> > To: Gedare Bloom <ged...@rtems.org>, Kuan-Hsun Chen <c00...@gmail.com>
> > Cc: Strange369 <junjie....@tu-dortmund.de>,
> >     rtems-de...@rtems.org <devel@rtems.org>
> >
> > On 15.09.22 00:06, Gedare Bloom wrote:
> > > On Tue, Sep 13, 2022 at 12:42 PM Kuan-Hsun Chen <c00...@gmail.com> wrote:
> > >> Thanks for the prompt reply. Yes, I will guide Junjie to make a
> > >> ticket and go through the issue.
> > >>
> > >>> Is there a test case for this?
> > >> Yes, a test case is also ready to be reviewed and can be part of
> > >> the testsuite to test out ICPP (MrsP should also pass).
> > >>
> > >>> If it applies to 5 or 4.11, there needs to be another ticket to
> > >>> get the fix backported.
> > >> So, one ticket per release version? We only checked 5.1 in this
> > >> work. My intuitive guess is that if the functionality does not
> > >> change over the versions, the remedy should be similar. Let us
> > >> figure it out.
> > >>
> > >> On Tue, Sep 13, 2022 at 8:21 PM Joel Sherrill <j...@rtems.org> wrote:
> > >>>
> > >>> On Tue, Sep 13, 2022, 1:14 PM Kuan-Hsun Chen <c00...@gmail.com> wrote:
> > >>>> (Oh, I just noticed that the patch was submitted faster than my
> > >>>> mail to share this patch :P).
> > >>>
> > >>> No problem. That's why I asked what the issue was.
> > >>>
> > >>> It needs at least a much better log message. It needs a ticket,
> > >>> given the gravity of the issue. That ticket can explain the
> > >>> situation and include links. It probably needs a comment above
> > >>> the change explaining what's going on.
> > >>>
> > >>> Is there a test case for this?
> > >>>
> > >>> If it applies to 5 or 4.11, there needs to be another ticket to
> > >>> get the fix backported.
> > >>>
> > >>> Great catch. Just need to do a good job on the log, test,
> > >>> ticket(s), etc.
> > >>>
> > >>> --joel
> > >>>
> > >>>> Hi Joel,
> > >>>>
> > >>>> Let me clarify what happened here. In our recent study, we
> > >>>> adopted Frama-C to formally verify the implementation of ICPP
> > >>>> (also MrsP) and noted that the current flow may lead to a
> > >>>> deadlock. The patch is a simple remedy to fix the issue.
> > >>>>
> > >>>> The resource's ceiling is not checked against the thread's
> > >>>> base priority, but against its current dynamic priority
> > >>>> derived from the task's priority aggregation. However, a
> > >>>> resource's ceiling is required to be set to the highest base
> > >>>> priority of all tasks that request it. This mismatch may lead
> > >>>> to a deadlock by denying legitimate nested resource access if
> > >>>> resources are requested in descending order of priority, so
> > >>>> this adaptation is needed to resolve the issue. A similar
> > >>>> issue can also be found in MrsP due to its use of this
> > >>>> function. A detailed explanation can be found at the end of
> > >>>> Section V of the attached preprint, and a toy example is
> > >>>> provided there to reveal the issue.
> > >>>>
> > > By changing this to check the base priority instead of the
> > > dynamic priority, the mutex seize will now allow a task to
> > > acquire a ceiling mutex even though its dynamic priority is
> > > higher than the ceiling. As I recall, the specification to check
> > > against the base priority of tasks comes from the ICPP without
> > > nested lock behavior, and I'm not 100% sure where the requirement
> > > comes from that "a resource's ceiling is required to be set as
> > > the highest base priority of all tasks that are requesting it" --
> > > is that requirement proved anywhere for ICPP with nested
> > > resources? I'm a little concerned about making this kind of
> > > fundamental change deep within the (shared) locking protocol
> > > code, especially since it primarily seems to cater to a
> > > uniprocessor corner case.
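For concreteness, the base-versus-dynamic ceiling check discussed above can be sketched in Python. This is an illustrative model only, not RTEMS code: the task and ceiling values are made up, and the denial is modeled as a plain exception rather than a specific RTEMS status or blocking behavior. Lower numbers denote higher priority, as in RTEMS:

```python
# Toy model of the ceiling check under discussion (not RTEMS code).
# Lower number = higher priority, following the RTEMS convention.

class Task:
    def __init__(self, base_priority):
        self.base = base_priority      # base (assigned) priority
        self.current = base_priority   # current dynamic priority

def seize_dynamic(task, ceiling):
    """Check as described for the current flow: the ceiling is compared
    against the task's current dynamic priority."""
    if task.current < ceiling:         # task runs above the ceiling -> denied
        raise RuntimeError("ceiling violated")
    task.current = min(task.current, ceiling)  # immediate promotion

def seize_base(task, ceiling):
    """Check as in the proposed remedy: the ceiling is compared against
    the task's base priority."""
    if task.base < ceiling:
        raise RuntimeError("ceiling violated")
    task.current = min(task.current, ceiling)

# A task with base priority 5 locks M1 (ceiling 1) and then, nested,
# M2 (ceiling 3), i.e. mutexes seized in descending ceiling-priority order.
t = Task(5)
seize_dynamic(t, 1)      # granted: current priority promoted 5 -> 1
try:
    seize_dynamic(t, 3)  # denied: current priority 1 is above ceiling 3
except RuntimeError as e:
    print("dynamic check:", e)

t2 = Task(5)
seize_base(t2, 1)        # granted: promoted to 1
seize_base(t2, 3)        # granted: base priority 5 is not above ceiling 3
print("base check: nested seize granted at current priority", t2.current)
```

The sketch shows only the admission decision: the dynamic-priority variant refuses the legitimate nested seize that the base-priority variant grants, which is the mismatch the patch targets.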
> > > This is a tricky situation, and clearly a lot of work has
> > > already gone into discovering it. I think we need to discuss this
> > > patch a bit more. If I understood correctly, this deadlock can't
> > > happen if locks are always acquired in increasing priority order.
> > > Changing this code allows applications to acquire locks in
> > > arbitrary priority order. I'm not sure that is permissible in
> > > multiprocessor systems while still ensuring correctness of the
> > > locking protocol. It would therefore also be valid to state this
> > > as an explicit requirement for applications to make correct use
> > > of ICPP. The same thing is also true for MrsP, and for
> > > multiprocessor locking protocols in general I believe it is
> > > required for locks to be acquired in priority order? I'd be
> > > inclined to instead say that we should confirm that locks are
> > > acquired in the proper priority ordering as a way to avoid the
> > > identified deadlock situation.
> >
> > The check against the dynamic priority has been there for a long
> > time (at least since RTEMS 3.6.0). The question is how much policy
> > we want to enforce through such checks. I am not sure whether
> > acquiring mutexes in an arbitrary priority order is a good
> > application design. However, from an implementation point of view
> > we could remove the check completely. Each priority ceiling mutex
> > provides a priority node for the task, so it is independent of
> > other priority contributions. We also have dedicated deadlock
> > detection.
> >
> > > Either way this discussion/patch goes, we should document these
> > > issues in
> > > https://docs.rtems.org/branches/master/c-user/key_concepts.html#priorityceiling
> >
> > Yes.
> >
> > --
> > embedded brains GmbH
> > Herr Sebastian HUBER
> > Dornierstr. 4
> > 82178 Puchheim
> > Germany
> > email: sebastian.hu...@embedded-brains.de
> > phone: +49-89-18 94 741 - 16
> > fax: +49-89-18 94 741 - 08
> >
> > Register court: Amtsgericht München
> > Register number: HRB 157899
> > Managing directors authorized to represent: Peter Rasmussen, Thomas Dörfler
> > Our privacy policy can be found here:
> > https://embedded-brains.de/datenschutzerklaerung/
> > --
> > This mail was written on a mobile device. Any spelling mistakes are
> > fully intentional and to be understood as a generous gift.

--
Junjie Shi, M.Sc.
Technische Universität Dortmund
Informatik, Lehrstuhl 12
Design Automation of Embedded Systems
Otto-Hahn-Str. 16, E05
44227 Dortmund, Germany
Phone: +49 0231 755-6121
Fax: +49 0231 755-6116
_______________________________________________ devel mailing list devel@rtems.org http://lists.rtems.org/mailman/listinfo/devel