On 12/24/2025 6:15 AM, Stephan Gerhold wrote:
> On Tue, Dec 23, 2025 at 01:13:50AM -0800, Jingyi Wang wrote:
>> From: Gokul krishna Krishnakumar <[email protected]>
>>
>> Subsystems can be brought out of reset by entities such as bootloaders.
>> As the irq enablement could be later than subsystem bring up, the state
>> of subsystem should be checked by reading SMP2P bits and performing ping
>> test.
>>
>> A new qcom_pas_attach() function is introduced. if a crash state is
>> detected for the subsystem, rproc_report_crash() is called. If the
>> subsystem is ready either at the first check or within a 5-second timeout
>> and the ping is successful, it will be marked as "attached". The ready
>> state could be set by either ready interrupt or handover interrupt.
>>
>> If "early_boot" is set by kernel but "subsys_booted" is not completed
>> within the timeout, It could be the early boot feature is not supported
>> by other entities. In this case, the state will be marked as RPROC_OFFLINE
>> so that the PAS driver can load the firmware and start the remoteproc. As
>> the running state is set once attach function is called, the watchdog or
>> fatal interrupt received can be handled correctly.
>>
>> Signed-off-by: Gokul krishna Krishnakumar
>> <[email protected]>
>> Co-developed-by: Jingyi Wang <[email protected]>
>> Signed-off-by: Jingyi Wang <[email protected]>
>> ---
>> drivers/remoteproc/qcom_q6v5.c | 87 ++++++++++++++++++++++++++++++++-
>> drivers/remoteproc/qcom_q6v5.h | 11 ++++-
>> drivers/remoteproc/qcom_q6v5_adsp.c | 2 +-
>> drivers/remoteproc/qcom_q6v5_mss.c | 2 +-
>> drivers/remoteproc/qcom_q6v5_pas.c | 97
>> ++++++++++++++++++++++++++++++++++++-
>> drivers/remoteproc/qcom_q6v5_wcss.c | 2 +-
>> 6 files changed, 195 insertions(+), 6 deletions(-)
>>
>> [...]
>> diff --git a/drivers/remoteproc/qcom_q6v5_pas.c
>> b/drivers/remoteproc/qcom_q6v5_pas.c
>> index 52680ac99589..7e890e18dd82 100644
>> --- a/drivers/remoteproc/qcom_q6v5_pas.c
>> +++ b/drivers/remoteproc/qcom_q6v5_pas.c
>> [...]
>> @@ -434,6 +440,85 @@ static unsigned long qcom_pas_panic(struct rproc *rproc)
>> return qcom_q6v5_panic(&pas->q6v5);
>> }
>>
>> +static int qcom_pas_attach(struct rproc *rproc)
>> +{
>> + int ret;
>> + struct qcom_pas *pas = rproc->priv;
>> + bool ready_state;
>> + bool crash_state;
>> +
>> + pas->q6v5.running = true;
>> + ret = irq_get_irqchip_state(pas->q6v5.fatal_irq,
>> + IRQCHIP_STATE_LINE_LEVEL, &crash_state);
>> +
>> + if (ret)
>> + goto disable_running;
>> +
>> + if (crash_state) {
>> + dev_err(pas->dev, "Sub system has crashed before driver
>> probe\n");
>> + rproc_report_crash(rproc, RPROC_FATAL_ERROR);
>
> Have you tested this case? From quick review of the code in
> remoteproc_core.c I'm skeptical if this will work correctly:
>
> 1. Remoteproc is in RPROC_DETACHED state during auto boot
> 2. qcom_pas_attach() runs and calls rproc_report_crash(), then fails so
> RPROC_DETACHED state remains
> 3. rproc_crash_handler_work() sets RPROC_CRASHED and starts recovery
> 4. rproc_boot_recovery() calls rproc_stop()
> 5. rproc_stop() calls rproc_stop_subdevices(), but because the
> remoteproc was never attached, the subdevices were never started.
>
> In this situation, rproc_stop_subdevices() should not be called. I would
> expect you will need to make some minor changes to the remoteproc_core
> to support handling crashes during RPROC_DETACHED state.
>
> I might be reading the code wrong, but please make sure that you
> simulate this case at runtime and check that it works correctly.
>
Recheked this part, current path has some issue on recovery and subdev handling.
As in current code, rproc_report_crash is called in the ISR/callback, it may
take some effort to call it in this attach path.
>> + ret = -EINVAL;
>> + goto disable_running;
>> + }
>> +
>> + ret = irq_get_irqchip_state(pas->q6v5.ready_irq,
>> + IRQCHIP_STATE_LINE_LEVEL, &ready_state);
>> +
>> + if (ret)
>> + goto disable_running;
>> +
>> + enable_irq(pas->q6v5.handover_irq);
>> +
>> + if (unlikely(!ready_state)) {
>> + /* Set a 5 seconds timeout in case the early boot is delayed */
>> + ret = wait_for_completion_timeout(&pas->q6v5.subsys_booted,
>> +
>> msecs_to_jiffies(EARLY_ATTACH_TIMEOUT_MS));
>> +
>
> Again, have you tested this case?
>
> As I already wrote in v2, I don't see how this case will work reliably
> in practice. How do you ensure that the handover resources will be kept
> on during the Linux boot process until the remoteproc has completed
> booting?
>
> Also, above you enable the handover_irq. Let's assume a handover IRQ
> does come in while you are waiting here. Then q6v5_handover_interrupt()
> will call q6v5->handover(q6v5); to disable the handover resources
> (clocks, power domains), but you never enabled those. I would expect
> that you get some bad reference count warnings in the kernel log.
>
> I would still suggest dropping this code entirely. As far as I
> understand the response from Aiqun(Maria) Yu [1], there is no real use
> case for this on current platforms. If you want to keep this, you would
> need to vote for the handover resources during probe() (and perhaps
> more, this case is quite tricky).
>
> Please test all your changes carefully in v4.
>
Thanks for your detailed review, the handover resources was indeed
neglected in the design, we will re-evaluate this part.
> Thanks,
> Stephan
>
> [1]:
> https://lore.kernel.org/linux-arm-msm/[email protected]/
Thanks,
Jingyi