On Mon, Jul 27, 2015 at 3:18 PM, Sudeep Holla <sudeep.ho...@arm.com> wrote: > On 27/07/15 04:26, Jassi Brar wrote: >> >>>> >>>>> we might end-up waiting >>>>> for atleast a jiffy even though the response for that message from the >>>>> remote is received via interrupt and processed in relatively smaller >>>>> time granularity. >>>>> >>>> That is wrong. >>> >>> >>> No see below. >>> >>>> If the controller supports TX interrupt it should set txdone_irq, >>>> which prevents polling i.e, controller driver calls mbox_chan_txdone. >>>> >>>> If the controller doesn't support TX interrupt but the client >>>> receives some ack packet, then the client should set knows_txdone and >>>> call mbox_client_txdone. Again you don't have to wait on polling. >>>> >>> >>> Sorry if I was not clear in the commit message, but I thought I did >>> mention TXDONE_BY_POLL. The case I am referring is definitely not >>> TXDONE_BY_IRQ or TXDONE_BY_ACK. >>> >> That statement is still wrong. The TXDONE_BY_POLL modifier does't make it >> right. >> > > I am fine to modify/clarify that statement. > >> Anyways, I see you meant the 3rd case of neither IRQ nor ACK. >> > > Yes the remote indicates by setting a flag in status register. > However, looking at the arm_scpi.c the protocol does support TXDONE_BY_ACK that is, every command has a reply packet telling if the command was successful or failure. When you receive a reply, obviously the command has already been received by the remote. Which is mbox_client.knows_txdone or TXDONE_BY_ACK.
>> It seems your remote doesn't send some protocol level 'ack' packet >> replying if the command was successfully executed or not. That means >> Linux can't differentiate successful execution of the command from a >> silent failure (remote still has to set the TX_done flag to make way >> for next messages). > > Agreed and again I confirm the remote processor in question just sets > the flag always and correctly and doesn't use a protocol ACK. > As I note above, the arm_scpi.c tells a different story. >>> 2. Because of that, under stress testing with multiple clients active at >>> a time, I am seeing the mailbox buffer overflows quite easily just >>> because it's blocked on Tx polling(almost 10x slower) and doesn't >>> process new requests though the remote can handle. >>> >> Yeah this situation may arise. The following fix is needed regardless, I >> think. >> > > IIUC it just triggers poll_txdone when chan->active_req is set, but that > will anyway happen through timer, no ? > No. It polls whenever a new message is queued so that we are not waiting on a completed but sleeping task. > Anyway it's not the main topic of discussion here and can be taken up > separately. > >> ============================ >> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c >> index 07fd507..a1dded9 100644 >> --- a/drivers/mailbox/mailbox.c >> +++ b/drivers/mailbox/mailbox.c >> @@ -64,7 +64,12 @@ static void msg_submit(struct mbox_chan *chan) >> >> spin_lock_irqsave(&chan->lock, flags); >> >> - if (!chan->msg_count || chan->active_req) >> + if (chan->active_req) { >> + err = 0; >> + goto exit; >> + } >> + >> + if (!chan->msg_count) >> goto exit; >> >> count = chan->msg_count; >> ============================ >> >>> Hope this clarifies the reasons for switching to hrtimer. >>> >> I am not against using hrtimer, just need to make sure we don't simply >> suppress the symptoms of wrong implementation. > > Agreed, and that's a valid concern. So far based on the testing and > benchmarking done so far, we don't think this patch is suppressing > anything incorrectly. > > If you still have concerns with this solution, please explain them here > so that we can discuss and come to conclusion and the issue is fixed. > I just replied on the patch where you set cl->knows_txdone = false; and which is not the case. We may use hrtimer for polling, but your platform doesn't have to rely on that. -jassi -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/