On Mon, Nov 12, 2018 at 06:46:26PM -0200, Martin Pieuchot wrote:
> On 11/11/18(Sun) 13:10, Ben Pye wrote:
> > Hey all,
> > 
> > In my quest for better OpenBSD support on my Chromebook 13 I have found
> > that sdmmc(4)'s current strategy for suspend/resume only really works
> > for removable storage. Upon a suspend it marks the device as dying, and
> > upon resume will detach the card and re-scan the bus. This makes sense
> > for removable SD cards as they may be removed whilst the device is
> > asleep, however for non-removable eMMC we can make the assumption that
> > the device will not be removed during sleep.
> > 
> > By assuming the device stays present over a sleep we can avoid the
> > detach and re-scan behaviour that sdmmc(4) currently uses. This is
> > important if you root storage device is an eMMC device /and/ you are
> > using softraid(4) as softraid(4) will not handle the detach/attach.
> > Despite this on my hardware at least suspend/resume works more reliably
> > if we can avoid this detach behaviour too, on this device we would
> > really have to detach before we enter sleep as the bus can break during
> > suspend and then the detach/reattach fails after resume meaning that my
> > sd0 device just disappears.
> > 
> > Instead for non-removable devices upon sleep we can deselect the card
> > but otherwise leave it configured. On resume we still perform a host
> > reset in sdhc(4) and then we can re-do the device initialisation in
> > sdmmc(4). During initialisation we use some of the values determined in
> > the initial sdmmc(4) init as the goal is to put the device into the same
> > state. We also hook up to the same sdmmc_function's etc so that the
> > child devices (scsibus etc) do not have to do anything special on a
> > resume. With this behaviour suspend/resume works reliably on my HP
> > Chromebook 13 and softraid for encryption continues to work.
> > 
> > There are a few unresolved questions however and that's why I'm posting
> > this here now:
> > 
> > 1. How do we know if a device is non-removable, I don't think an MMC
> > device indicates this, maybe we have to determine this by the
> > controller?
> 
> If you find a way to know that would be interesting.  However that
> doesn't mean we want to go this road.
> 
> I'd rephrase the question differently.  How do we know if a device,
> attached to a removable bus, has been removed or swapped while the
> machine was suspended?
> 
> Do you know how other OSes handle this case?  What kind of changes
> would be required to handle this in OpenBSD?  At which layer?  USB
> currently follows the same logic, so it could be solved there as
> well with the same stone.

Good point. As far as I can tell Linux attempts to keep the device
initialised, removable card or not. They check the CID and if it has
changed I think they are detaching it - so for example, a removable
card. 

Perhaps taking this approach would be better - if we could keep
removable devices attached if they haven't changed then it would be
possible to use them as your root device for example. This would have a
greater impact though and might have a more potential for breaking
other devices.

> 
> > 2. There are some error cases in sdmmc_mem_scan that are unhandled. They
> > shouldn't ever be hit, if they were it would indicate that something to
> > do with our non-removable card has changed over a suspend/resume cycle.
> 
> What should happen if such error occurs?

I think just propagating the error up to sdmmc_activate should be
sufficient as I believe this should result in the device and it's
children being detached and freed. In fact if we wanted to take this
approach for removable devices then this is where I suppose we would
notice that the device has changed.

> 
> > 3. I appear to hit an issue with some commands failing after a suspend
> > if I don't add a small delay to sdhc_wait_intr_cold, I haven't yet been
> > able to work out why this is. Specifically, the second time we read
> > EXT_CSD in sdmmc_mem_mmc_init fails.
> 
> That would be nice to investigate.  Such workarounds are a good start,
> but not a nice solution.

Understood. I just wanted to get it working for now so we could work out
when we should even be going down this code path.

Reply via email to