On 6/11/2026 5:10 PM, Stephan Gerhold wrote:
> On Thu, Jun 11, 2026 at 11:10:25AM +0800, Aiqun(Maria) Yu wrote:
>> On 5/22/2026 8:07 PM, Stephan Gerhold wrote:
>>> On Tue, May 19, 2026 at 12:24:23AM -0700, Jingyi Wang wrote:
>>>> Subsystems can be brought out of reset by entities such as bootloaders.
>>>> As the irq enablement could be later than subsystem bring up, the state
>>>> of subsystem should be checked by reading SMP2P bits.
>>>>
>>>> A new qcom_pas_attach() function is introduced. if a crash state is
>>>> detected for the subsystem, rproc_report_crash() is called. If the ready
>>>> state is detected, it will be marked as "attached", otherwise it could
>>>> be the early boot feature is not supported by other entities. In this
>>>> case, the state will be marked as RPROC_OFFLINE so that the PAS driver
>>>> can load the firmware and start the remoteproc.
>>>>
>>>> Co-developed-by: Gokul Krishna Krishnakumar
>>>> <[email protected]>
>>>> Signed-off-by: Gokul Krishna Krishnakumar
>>>> <[email protected]>
>>>> Signed-off-by: Jingyi Wang <[email protected]>
>>>
>>> Unfortunately, removing the ping-pong functionality that was present in
>>> previous patch versions makes the whole mechanism a lot more fragile.
>>> I'm not entirely sure if this has changed in SMP2P v2 or more recent
>>> firmware versions, but in my experience the SMP2P "ready" bit does not
>>> tell you if the remoteproc is actually running. The problem is that the
>>> "ready" bit is asserted by the remoteproc when the firmware is ready,
>>> but it is not cleared when you shutdown or forcibly stop the remoteproc.
>>>
>>> If this is still the case, you can easily reproduce that with the
>>> following test:
>>>
>>> 1. Start the system as usual and let it attach the remoteproc
>>> 2. Manually stop the remoteproc in sysfs (echo stop > state)
>>> 3. modprobe -r qcom_q6v5_pas
>>> 4. modprobe qcom_q6v5_pas
>>> 5. If the "ready" bit is still set, the driver will try attaching the
>>>    remoteproc, but it's actually not running. No recovery will happen.
>>>
>>> In this situation, it is very difficult to detect the correct remoteproc
>>> state without relying on an additional query mechanism like the
>>> ping-pong feature.
>>
>> This a valid use case and concern. We had a discussion with Bjorn, and
>> want to take this scenario into consideration of the separate robustness
>> improvement series[1].
>> Stephan could you agree to have the basic function in this series can be
>> go in firstly.
>>
>> [1]
>> https://lore.kernel.org/all/[email protected]/
>>
>>>
>>> You can make it a bit more reliable if you also check the status of the
>>> "stop-ack" bit. This would tell you if the remoteproc was cleanly
>>> stopped with the SMP2P "stop" mechanism. However, that will typically
>>> still not fix the case above since nowadays remoteprocs are typically
>>> stopped via the QMI qcom_sysmon and the "stop-ack" is not set in that
>>> case. I believe this might set the separate "shutdown-ack" bit though
>>> that is described for some SoCs, I never finished testing that.
>>>
>>> And even if you check both "stop-ack" and "shutdown-ack", that doesn't
>>> tell you if the remoteproc was forcibly killed using
>>> qcom_scm_pas_shutdown() without gracefully stopping it first. The ideal
>>> solution would be querying the PAS API to tell us if the remoteproc is
>>> actively running, but the last time I checked I was unfortunately not
>>> able to find a documented call that would tell us that.
>>
>> It is a state currently kernel don't know whether the remoteproc is
>> offline or crashed when ready==1 && error==0 && ping-pong==0 scenario.
>> If it is re-modprob, the software don't have any data and only the
>> firmware can tell us whether if it is active or not per my understanding.
>>
>> Maybe let's have this scenario and solution discussion in the other
>> series I mentioned before.
>>
>
> If you add a new feature upstream, you must make sure that it is
> reasonably robust and reliable. The other series is about generic
> limitations in the remoteproc subsystem, so I don't think you should
> move QC-specific parts over there as well (personally, I would have
> probably kept all of it in one series to make it easier to understand,
> but that's subjective).
>
> With the current firmware design, it's hard - probably impossible - to
> make the status detection perfectably reliable. I would therefore choose
> some reasonable compromise to start with. Given that Shawn (and actually
> me as well) would like to have attach working without firmware support
> for the ping-pong functionality, I think it would be reasonable to start
> with the basic detection scheme discussed above, i.e.
>
>   ready==1 && handover==1 && fatal==0 && stop-ack==0 && shutdown-ack==0
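
To make sure we mean the same thing, here is a rough sketch of that check in
plain C. It is only an illustration, not the code in the patchset: the struct,
the enum and classify_early_state() are hypothetical names made up for this
mail, and checking the fatal bit first is my assumption based on the commit
message (crash gets reported, ready gets attached, everything else is treated
as offline).

```c
#include <stdbool.h>

/* Hypothetical snapshot of the SMP2P bits discussed above. */
struct smp2p_bits {
	bool ready;
	bool handover;
	bool fatal;
	bool stop_ack;
	bool shutdown_ack;
};

/* Hypothetical result type; the real driver uses the rproc states. */
enum early_state {
	EARLY_OFFLINE,	/* load the firmware and start as usual */
	EARLY_ATTACHED,	/* looks like it is running, attach to it */
	EARLY_CRASHED,	/* fatal bit is set, report a crash */
};

static enum early_state classify_early_state(const struct smp2p_bits *b)
{
	/* Assumption: a fatal error wins over every other bit. */
	if (b->fatal)
		return EARLY_CRASHED;

	/* ready==1 && handover==1 && stop-ack==0 && shutdown-ack==0 */
	if (b->ready && b->handover && !b->stop_ack && !b->shutdown_ack)
		return EARLY_ATTACHED;

	/* Anything else: assume early boot was not done or it was stopped. */
	return EARLY_OFFLINE;
}
```

Note that this still cannot catch the case where the remoteproc was killed
with qcom_scm_pas_shutdown() and none of the ack bits were set.
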

The ready==1 and fatal==0 checks are already in the current patchset. I am
not sure about the handover state and need to double-check it. As far as I
understand, the stop-ack and shutdown-ack checks can be added.

>
> The ping-pong functionality could be added later for platforms that
> support it. It would be good to have the interrupts already defined in
> the device tree, so you can tweak the driver without making DT changes
> later.
>
> Thanks,
> Stephan

-- 
Thx and BRs,
Aiqun(Maria) Yu

