> On 13 Nov 2018, at 9:36 am, Ben Pye <[email protected]> wrote:
> 
> On Mon, Nov 12, 2018 at 06:46:26PM -0200, Martin Pieuchot wrote:
>> On 11/11/18(Sun) 13:10, Ben Pye wrote:
>>> Hey all,
>>> 
>>> In my quest for better OpenBSD support on my Chromebook 13 I have found
>>> that sdmmc(4)'s current strategy for suspend/resume only really works
>>> for removable storage. Upon a suspend it marks the device as dying, and
>>> upon resume will detach the card and re-scan the bus. This makes sense
>>> for removable SD cards as they may be removed whilst the device is
>>> asleep, however for non-removable eMMC we can make the assumption that
>>> the device will not be removed during sleep.
>>> 
>>> By assuming the device stays present over a sleep we can avoid the
>>> detach and re-scan behaviour that sdmmc(4) currently uses. This is
>>> important if you root storage device is an eMMC device /and/ you are
>>> using softraid(4) as softraid(4) will not handle the detach/attach.
>>> Despite this on my hardware at least suspend/resume works more reliably
>>> if we can avoid this detach behaviour too, on this device we would
>>> really have to detach before we enter sleep as the bus can break during
>>> suspend and then the detach/reattach fails after resume meaning that my
>>> sd0 device just disappears.
>>> 
>>> Instead for non-removable devices upon sleep we can deselect the card
>>> but otherwise leave it configured. On resume we still perform a host
>>> reset in sdhc(4) and then we can re-do the device initialisation in
>>> sdmmc(4). During initialisation we use some of the values determined in
>>> the initial sdmmc(4) init as the goal is to put the device into the same
>>> state. We also hook up to the same sdmmc_function's etc so that the
>>> child devices (scsibus etc) do not have to do anything special on a
>>> resume. With this behaviour suspend/resume works reliably on my HP
>>> Chromebook 13 and softraid for encryption continues to work.
>>> 
>>> There are a few unresolved questions however and that's why I'm posting
>>> this here now:
>>> 
>>> 1. How do we know if a device is non-removable, I don't think an MMC
>>> device indicates this, maybe we have to determine this by the
>>> controller?
>> 
>> If you find a way to know that would be interesting.  However that
>> doesn't mean we want to go this road.
>> 
>> I'd rephrase the question differently.  How do we know if a device,
>> attached to a removable bus, has been removed or swapped while the
>> machine was suspended?
>> 
>> Do you know how other OSes handle this case?  What kind of changes
>> would be required to handle this in OpenBSD?  At which layer?  USB
>> currently follows the same logic, so it could be solved there as
>> well with the same stone.
> 
> Good point. As far as I can tell Linux attempts to keep the device
> initialised, removable card or not. They check the CID and if it has
> changed I think they are detaching it - so for example, a removable
> card.

You could borrow bits of mpath(4), specifically the devid parts that are used 
to identify whether a device is the same or not. However, devids rely on a 
device providing unique identifiers, and we know that this isn't reliable.

I'd go a bit further than using properties of a device and and try to compare 
something on the disk that should be unique, eg, the duid from the disklabel.

> Perhaps taking this approach would be better - if we could keep
> removable devices attached if they haven't changed then it would be
> possible to use them as your root device for example. This would have a
> greater impact though and might have a more potential for breaking
> other devices.

I'd argue that the problem of / going away is greater than the other problems 
that will come up.

> 
>> 
>>> 2. There are some error cases in sdmmc_mem_scan that are unhandled. They
>>> shouldn't ever be hit, if they were it would indicate that something to
>>> do with our non-removable card has changed over a suspend/resume cycle.
>> 
>> What should happen if such error occurs?
> 
> I think just propagating the error up to sdmmc_activate should be
> sufficient as I believe this should result in the device and it's
> children being detached and freed. In fact if we wanted to take this
> approach for removable devices then this is where I suppose we would
> notice that the device has changed.
> 
>> 
>>> 3. I appear to hit an issue with some commands failing after a suspend
>>> if I don't add a small delay to sdhc_wait_intr_cold, I haven't yet been
>>> able to work out why this is. Specifically, the second time we read
>>> EXT_CSD in sdmmc_mem_mmc_init fails.
>> 
>> That would be nice to investigate.  Such workarounds are a good start,
>> but not a nice solution.
> 
> Understood. I just wanted to get it working for now so we could work out
> when we should even be going down this code path.

Reply via email to