On Sat, Feb 07, 2026 at 08:01:23PM -0800, Douglas Anderson wrote: > The way the mailbox core behaves when you pass a NULL `mssg` parameter > to mbox_send_message() is a little questionable. Specifically, the > mailbox core stores the currently active message directly in its > `active_req` field. In at least two places it decides that if this > field is `NULL` then there is no active request. That means if `mssg` > is ever NULL it will cause the mailbox core to think is no active > request. The two places where it does this are: > > 1. When a client calls mbox_send_message(), if `active_req` is NULL > then it will call the mailbox controller to send the new message > even if the mailbox controller hasn't yet called mbox_chan_txdone() > on the previous (NULL) message. > 2. The mailbox core will never call the client's `tx_done()` callback > with a NULL message because `tx_tick()` returns early whenever the > message is NULL. > > Though the above doesn't look like it was a conscious design choice, > it does have the benefit of providing a simple way to assert an > edge-triggered interrupt to the remote processor on the other side of > the mailbox. Specifically: > > 1. Like a normal edge-triggered interrupt, if multiple edges arrive > before the interrupt is Acked they are coalesced. > 2. Like a normal edge-triggered interrupt, as long as the receiver > (the remote processor in this case) "Ack"s the interrupt _before_ > checking for work and the sender (the mailbox client in this case) > posts the interrupt _after_ adding new work then we can always be > certain that new work will be noticed. This assumes that the > mailbox client and remote processor have some out-of-band way to > communicate work and the mailbox is just being used as an > interrupt. > > Doing a `git grep -A1 mbox_send_message | grep NULL` shows 14 hits in > mainline today, but it's not 100% clear if all of those users are > relying on the benefits/quirks of the existing behavior. > > Since the current NULL `mssg` behavior is a bit questionable but has > some benefits, let's: > > 1. Deprecate the NULL behavior and print a warning. > 2. Add a new mbox_ring_doorbell() function that is very similar to the > existing NULL `mssg` case but a tad bit cleaner. > > The design of the new mbox_ring_doorbell() will be to maximize > compatibility with the old NULL `mssg` behavior. Specifically: > > * We'll still pass NULL to the mailbox controller to indicate a > doorbell. > * Doorbells will not be queued and won't have txdone. > * We'll call immediately into the mailbox controller when a doorbell > is posted. > > With the above, any mailbox clients that don't mix doorbells and > normal messages are intended to see no change in behavior when > switching to the new API. Using the new API, which officiall documents > that mbox_client_txdone() shouldn't be called for doorbells, does > allow us to remove those calls. > > There are two differences in behavior between the old sending a NULL > message and the new mbox_ring_doorbell(): > > 1. If the mailbox controller returned an error when trying to send a > NULL message, the old NULL message could have ended up being queued > up in the core's FIFO. Now we will just return the error. > 2. If a client rings a doorbell while a non-doorbell message is in > progress, previously NULL messages would have been "queued" in that > case and now doorbells will be immediately posted. > > I'm hoping that nobody was relying on either of the two differences. > In general holding NULL messages in the mailbox core's queue has odd > behavior and is hard to reason about. Hopefully it's reasonable to > assume nobody was doing this. > > As mentioned above, it should be noted that it's now documented that > "txdone" shouldn't be called (by both mailbox drivers and clients) for > doorbells. That being said, in most cases it won't hurt since the > mailbox core will ignore the bogus "txdone". The only case where it's > critical for a mailbox controller not to call "txdone" for a doorbell > is when a mailbox channel mixes normal messages and doorbells and > cares about the txdone callback. Specifically, when you ring a > doorbell and immediately send a normal message, if the controller > calls "txdone" for the doorbell it could look as if the normal message > finished before it should have. This issue also would have happened > with the old NULL `mssg`, though. > > Signed-off-by: Douglas Anderson <[email protected]>
I like how this cleans up the logic hacks, but perhaps even more so how it takes a step towards cleaning up the mailbox API when it comes to expectations between client and provider implementations. Reviewed-by: Bjorn Andersson <[email protected]> Regards, Bjorn

