Re: sdmmc(4) support for suspend/resume with eMMC devices
Mark Kettenis wrote: > > 1. How do we know if a device is non-removable, I don't think an MMC > > device indicates this, maybe we have to determine this by the > > controller? > > The hardware really doesn't know. It all depends how the controller > is actually wired up. You have to rely on the system firmware > (e.g. BIOS) to tell you. For ACPI systems the SD device namespace is > supposed to include a _RMV method that indicates whether the device is > removeable or not: > > https://docs.microsoft.com/en-us/windows-hardware/drivers/bringup/other-acpi-namespace-objects > > For OpenFirmware/FDT systems there is a "non-removable" property. > > Of course firmware can be buggy and lie to you... Right. But we could "assume" that the rootfs on sdmmc should be "not ejected" at DVACT_SUSPEND time, and resurrected at DVACT_RESUME time. By walking sdmmc -> ... scsibus .. sd ... to sc_dev, and comparing it against rootdv and "whoa, we hold this one". If it isn't there when you resume you crash, or funny enough, we could simply go back to sleep since it doesn't exist, hoping for it to be present when we resume next time.
Re: sdmmc(4) support for suspend/resume with eMMC devices
> Date: Sun, 11 Nov 2018 13:10:19 + > From: Ben Pye > > Hey all, > > In my quest for better OpenBSD support on my Chromebook 13 I have found > that sdmmc(4)'s current strategy for suspend/resume only really works > for removable storage. Upon a suspend it marks the device as dying, and > upon resume will detach the card and re-scan the bus. This makes sense > for removable SD cards as they may be removed whilst the device is > asleep, however for non-removable eMMC we can make the assumption that > the device will not be removed during sleep. > > By assuming the device stays present over a sleep we can avoid the > detach and re-scan behaviour that sdmmc(4) currently uses. This is > important if you root storage device is an eMMC device /and/ you are > using softraid(4) as softraid(4) will not handle the detach/attach. > Despite this on my hardware at least suspend/resume works more reliably > if we can avoid this detach behaviour too, on this device we would > really have to detach before we enter sleep as the bus can break during > suspend and then the detach/reattach fails after resume meaning that my > sd0 device just disappears. > > Instead for non-removable devices upon sleep we can deselect the card > but otherwise leave it configured. On resume we still perform a host > reset in sdhc(4) and then we can re-do the device initialisation in > sdmmc(4). During initialisation we use some of the values determined in > the initial sdmmc(4) init as the goal is to put the device into the same > state. We also hook up to the same sdmmc_function's etc so that the > child devices (scsibus etc) do not have to do anything special on a > resume. With this behaviour suspend/resume works reliably on my HP > Chromebook 13 and softraid for encryption continues to work. > > There are a few unresolved questions however and that's why I'm posting > this here now: > > 1. How do we know if a device is non-removable, I don't think an MMC > device indicates this, maybe we have to determine this by the > controller? The hardware really doesn't know. It all depends how the controller is actually wired up. You have to rely on the system firmware (e.g. BIOS) to tell you. For ACPI systems the SD device namespace is supposed to include a _RMV method that indicates whether the device is removeable or not: https://docs.microsoft.com/en-us/windows-hardware/drivers/bringup/other-acpi-namespace-objects For OpenFirmware/FDT systems there is a "non-removable" property. Of course firmware can be buggy and lie to you... > 2. There are some error cases in sdmmc_mem_scan that are unhandled. They > shouldn't ever be hit, if they were it would indicate that something to > do with our non-removable card has changed over a suspend/resume cycle. > > 3. I appear to hit an issue with some commands failing after a suspend > if I don't add a small delay to sdhc_wait_intr_cold, I haven't yet been > able to work out why this is. Specifically, the second time we read > EXT_CSD in sdmmc_mem_mmc_init fails. I believe that eMMC actually require a magic power-on sequence to fully reset themselves. Another posibility is that the bus width isn't properly reset. After a power-cycle the eMMC device should be operating in 1-bit mode again, but the sdmmc stack might think it is still operating in 4-bit or 8-bit mode. > In it's current state this patch almost certainly breaks the > suspend/resume behaviour for non-removable devices, this should be > easily resolved once we can determine which sdmmc(4) devices are > removable. > > Ben. > > > Index: sys/dev/sdmmc/sdhc.c > === > RCS file: /cvs/src/sys/dev/sdmmc/sdhc.c,v > retrieving revision 1.61 > diff -u -p -r1.61 sdhc.c > --- sys/dev/sdmmc/sdhc.c 6 Sep 2018 10:15:17 - 1.61 > +++ sys/dev/sdmmc/sdhc.c 11 Nov 2018 12:54:29 - > @@ -53,6 +53,7 @@ struct sdhc_host { > u_int8_t regs[14]; /* host controller state */ > u_int16_t intr_status; /* soft interrupt status */ > u_int16_t intr_error_status;/* soft error status */ > + u_int8_t vdd; /* current vdd */ > > bus_dmamap_t adma_map; > bus_dma_segment_t adma_segs[1]; > @@ -373,8 +374,13 @@ sdhc_activate(struct device *self, int a > for (n = 0; n < sc->sc_nhosts; n++) { > hp = sc->sc_host[n]; > (void)sdhc_host_reset(hp); > - for (i = 0; i < sizeof hp->regs; i++) > - HWRITE1(hp, i, hp->regs[i]); > + > + /* > + * XXX: We might still want this for non-removable > + * devices? > + */ > + //for (i = 0; i < sizeof hp->regs; i++) > + // HWRITE1(hp, i, hp->regs[i]); > } > rv = config_activate_children(self, act); > brea
Re: sdmmc(4) support for suspend/resume with eMMC devices
> Do you know how other OSes handle this case? What kind of changes > would be required to handle this in OpenBSD? At which layer? USB > currently follows the same logic, so it could be solved there as > well with the same stone. No USB is much much harder. If I understand correctly the entire bus needs to be probed, all devices powered up freshly, and consider hubs also...
Re: sdmmc(4) support for suspend/resume with eMMC devices
There are two types of SD cards. Ones that go away, and ones that don't. > You could borrow bits of mpath(4), specifically the devid parts that > are used to identify whether a device is the same or not. However, > devids rely on a device providing unique identifiers, and we know that > this isn't reliable. In regards to either, I still really dislike the mpath model. It is difficult to actually push a device to detach, therefore there is no unwinding. For instance vnodes stay around. I've objected to this "keep it around" approach since the beginning. Always, the problem was identifying do we want it to detach or not, but the mpath model is "keep it around". > I'd go a bit further than using properties of a device and and try to > compare something on the disk that should be unique, eg, the duid from > the disklabel. But that requires an IO. And new IO's must not happen, because once a new IO is done other queued IO's from before suspend will flow through the code paths, perhaps say doing a write to the wrong SD card...
Re: sdmmc(4) support for suspend/resume with eMMC devices
On 12/11/18(Mon) 23:36, Ben Pye wrote: > On Mon, Nov 12, 2018 at 06:46:26PM -0200, Martin Pieuchot wrote: > > On 11/11/18(Sun) 13:10, Ben Pye wrote: > > > Hey all, > > > > > > In my quest for better OpenBSD support on my Chromebook 13 I have found > > > that sdmmc(4)'s current strategy for suspend/resume only really works > > > for removable storage. Upon a suspend it marks the device as dying, and > > > upon resume will detach the card and re-scan the bus. This makes sense > > > for removable SD cards as they may be removed whilst the device is > > > asleep, however for non-removable eMMC we can make the assumption that > > > the device will not be removed during sleep. > > > > > > By assuming the device stays present over a sleep we can avoid the > > > detach and re-scan behaviour that sdmmc(4) currently uses. This is > > > important if you root storage device is an eMMC device /and/ you are > > > using softraid(4) as softraid(4) will not handle the detach/attach. > > > Despite this on my hardware at least suspend/resume works more reliably > > > if we can avoid this detach behaviour too, on this device we would > > > really have to detach before we enter sleep as the bus can break during > > > suspend and then the detach/reattach fails after resume meaning that my > > > sd0 device just disappears. > > > > > > Instead for non-removable devices upon sleep we can deselect the card > > > but otherwise leave it configured. On resume we still perform a host > > > reset in sdhc(4) and then we can re-do the device initialisation in > > > sdmmc(4). During initialisation we use some of the values determined in > > > the initial sdmmc(4) init as the goal is to put the device into the same > > > state. We also hook up to the same sdmmc_function's etc so that the > > > child devices (scsibus etc) do not have to do anything special on a > > > resume. With this behaviour suspend/resume works reliably on my HP > > > Chromebook 13 and softraid for encryption continues to work. > > > > > > There are a few unresolved questions however and that's why I'm posting > > > this here now: > > > > > > 1. How do we know if a device is non-removable, I don't think an MMC > > > device indicates this, maybe we have to determine this by the > > > controller? > > > > If you find a way to know that would be interesting. However that > > doesn't mean we want to go this road. > > > > I'd rephrase the question differently. How do we know if a device, > > attached to a removable bus, has been removed or swapped while the > > machine was suspended? > > > > Do you know how other OSes handle this case? What kind of changes > > would be required to handle this in OpenBSD? At which layer? USB > > currently follows the same logic, so it could be solved there as > > well with the same stone. > > Good point. As far as I can tell Linux attempts to keep the device > initialised, removable card or not. They check the CID and if it has > changed I think they are detaching it - so for example, a removable > card. If you go down this road you need a way to ensure no IO will reach the medium until you've finished to check & validate the CID. Or is there any other way to do it? That is my question, what other OSes do? :) Did you look at USB as well? > Perhaps taking this approach would be better - if we could keep > removable devices attached if they haven't changed then it would be > possible to use them as your root device for example. That would be great but, as usual, somebody has to do the work :)
Re: sdmmc(4) support for suspend/resume with eMMC devices
> On 13 Nov 2018, at 9:36 am, Ben Pye wrote: > > On Mon, Nov 12, 2018 at 06:46:26PM -0200, Martin Pieuchot wrote: >> On 11/11/18(Sun) 13:10, Ben Pye wrote: >>> Hey all, >>> >>> In my quest for better OpenBSD support on my Chromebook 13 I have found >>> that sdmmc(4)'s current strategy for suspend/resume only really works >>> for removable storage. Upon a suspend it marks the device as dying, and >>> upon resume will detach the card and re-scan the bus. This makes sense >>> for removable SD cards as they may be removed whilst the device is >>> asleep, however for non-removable eMMC we can make the assumption that >>> the device will not be removed during sleep. >>> >>> By assuming the device stays present over a sleep we can avoid the >>> detach and re-scan behaviour that sdmmc(4) currently uses. This is >>> important if you root storage device is an eMMC device /and/ you are >>> using softraid(4) as softraid(4) will not handle the detach/attach. >>> Despite this on my hardware at least suspend/resume works more reliably >>> if we can avoid this detach behaviour too, on this device we would >>> really have to detach before we enter sleep as the bus can break during >>> suspend and then the detach/reattach fails after resume meaning that my >>> sd0 device just disappears. >>> >>> Instead for non-removable devices upon sleep we can deselect the card >>> but otherwise leave it configured. On resume we still perform a host >>> reset in sdhc(4) and then we can re-do the device initialisation in >>> sdmmc(4). During initialisation we use some of the values determined in >>> the initial sdmmc(4) init as the goal is to put the device into the same >>> state. We also hook up to the same sdmmc_function's etc so that the >>> child devices (scsibus etc) do not have to do anything special on a >>> resume. With this behaviour suspend/resume works reliably on my HP >>> Chromebook 13 and softraid for encryption continues to work. >>> >>> There are a few unresolved questions however and that's why I'm posting >>> this here now: >>> >>> 1. How do we know if a device is non-removable, I don't think an MMC >>> device indicates this, maybe we have to determine this by the >>> controller? >> >> If you find a way to know that would be interesting. However that >> doesn't mean we want to go this road. >> >> I'd rephrase the question differently. How do we know if a device, >> attached to a removable bus, has been removed or swapped while the >> machine was suspended? >> >> Do you know how other OSes handle this case? What kind of changes >> would be required to handle this in OpenBSD? At which layer? USB >> currently follows the same logic, so it could be solved there as >> well with the same stone. > > Good point. As far as I can tell Linux attempts to keep the device > initialised, removable card or not. They check the CID and if it has > changed I think they are detaching it - so for example, a removable > card. You could borrow bits of mpath(4), specifically the devid parts that are used to identify whether a device is the same or not. However, devids rely on a device providing unique identifiers, and we know that this isn't reliable. I'd go a bit further than using properties of a device and and try to compare something on the disk that should be unique, eg, the duid from the disklabel. > Perhaps taking this approach would be better - if we could keep > removable devices attached if they haven't changed then it would be > possible to use them as your root device for example. This would have a > greater impact though and might have a more potential for breaking > other devices. I'd argue that the problem of / going away is greater than the other problems that will come up. > >> >>> 2. There are some error cases in sdmmc_mem_scan that are unhandled. They >>> shouldn't ever be hit, if they were it would indicate that something to >>> do with our non-removable card has changed over a suspend/resume cycle. >> >> What should happen if such error occurs? > > I think just propagating the error up to sdmmc_activate should be > sufficient as I believe this should result in the device and it's > children being detached and freed. In fact if we wanted to take this > approach for removable devices then this is where I suppose we would > notice that the device has changed. > >> >>> 3. I appear to hit an issue with some commands failing after a suspend >>> if I don't add a small delay to sdhc_wait_intr_cold, I haven't yet been >>> able to work out why this is. Specifically, the second time we read >>> EXT_CSD in sdmmc_mem_mmc_init fails. >> >> That would be nice to investigate. Such workarounds are a good start, >> but not a nice solution. > > Understood. I just wanted to get it working for now so we could work out > when we should even be going down this code path.
Re: sdmmc(4) support for suspend/resume with eMMC devices
On Mon, Nov 12, 2018 at 06:46:26PM -0200, Martin Pieuchot wrote: > On 11/11/18(Sun) 13:10, Ben Pye wrote: > > Hey all, > > > > In my quest for better OpenBSD support on my Chromebook 13 I have found > > that sdmmc(4)'s current strategy for suspend/resume only really works > > for removable storage. Upon a suspend it marks the device as dying, and > > upon resume will detach the card and re-scan the bus. This makes sense > > for removable SD cards as they may be removed whilst the device is > > asleep, however for non-removable eMMC we can make the assumption that > > the device will not be removed during sleep. > > > > By assuming the device stays present over a sleep we can avoid the > > detach and re-scan behaviour that sdmmc(4) currently uses. This is > > important if you root storage device is an eMMC device /and/ you are > > using softraid(4) as softraid(4) will not handle the detach/attach. > > Despite this on my hardware at least suspend/resume works more reliably > > if we can avoid this detach behaviour too, on this device we would > > really have to detach before we enter sleep as the bus can break during > > suspend and then the detach/reattach fails after resume meaning that my > > sd0 device just disappears. > > > > Instead for non-removable devices upon sleep we can deselect the card > > but otherwise leave it configured. On resume we still perform a host > > reset in sdhc(4) and then we can re-do the device initialisation in > > sdmmc(4). During initialisation we use some of the values determined in > > the initial sdmmc(4) init as the goal is to put the device into the same > > state. We also hook up to the same sdmmc_function's etc so that the > > child devices (scsibus etc) do not have to do anything special on a > > resume. With this behaviour suspend/resume works reliably on my HP > > Chromebook 13 and softraid for encryption continues to work. > > > > There are a few unresolved questions however and that's why I'm posting > > this here now: > > > > 1. How do we know if a device is non-removable, I don't think an MMC > > device indicates this, maybe we have to determine this by the > > controller? > > If you find a way to know that would be interesting. However that > doesn't mean we want to go this road. > > I'd rephrase the question differently. How do we know if a device, > attached to a removable bus, has been removed or swapped while the > machine was suspended? > > Do you know how other OSes handle this case? What kind of changes > would be required to handle this in OpenBSD? At which layer? USB > currently follows the same logic, so it could be solved there as > well with the same stone. Good point. As far as I can tell Linux attempts to keep the device initialised, removable card or not. They check the CID and if it has changed I think they are detaching it - so for example, a removable card. Perhaps taking this approach would be better - if we could keep removable devices attached if they haven't changed then it would be possible to use them as your root device for example. This would have a greater impact though and might have a more potential for breaking other devices. > > > 2. There are some error cases in sdmmc_mem_scan that are unhandled. They > > shouldn't ever be hit, if they were it would indicate that something to > > do with our non-removable card has changed over a suspend/resume cycle. > > What should happen if such error occurs? I think just propagating the error up to sdmmc_activate should be sufficient as I believe this should result in the device and it's children being detached and freed. In fact if we wanted to take this approach for removable devices then this is where I suppose we would notice that the device has changed. > > > 3. I appear to hit an issue with some commands failing after a suspend > > if I don't add a small delay to sdhc_wait_intr_cold, I haven't yet been > > able to work out why this is. Specifically, the second time we read > > EXT_CSD in sdmmc_mem_mmc_init fails. > > That would be nice to investigate. Such workarounds are a good start, > but not a nice solution. Understood. I just wanted to get it working for now so we could work out when we should even be going down this code path.
Re: sdmmc(4) support for suspend/resume with eMMC devices
On 11/11/18(Sun) 13:10, Ben Pye wrote: > Hey all, > > In my quest for better OpenBSD support on my Chromebook 13 I have found > that sdmmc(4)'s current strategy for suspend/resume only really works > for removable storage. Upon a suspend it marks the device as dying, and > upon resume will detach the card and re-scan the bus. This makes sense > for removable SD cards as they may be removed whilst the device is > asleep, however for non-removable eMMC we can make the assumption that > the device will not be removed during sleep. > > By assuming the device stays present over a sleep we can avoid the > detach and re-scan behaviour that sdmmc(4) currently uses. This is > important if you root storage device is an eMMC device /and/ you are > using softraid(4) as softraid(4) will not handle the detach/attach. > Despite this on my hardware at least suspend/resume works more reliably > if we can avoid this detach behaviour too, on this device we would > really have to detach before we enter sleep as the bus can break during > suspend and then the detach/reattach fails after resume meaning that my > sd0 device just disappears. > > Instead for non-removable devices upon sleep we can deselect the card > but otherwise leave it configured. On resume we still perform a host > reset in sdhc(4) and then we can re-do the device initialisation in > sdmmc(4). During initialisation we use some of the values determined in > the initial sdmmc(4) init as the goal is to put the device into the same > state. We also hook up to the same sdmmc_function's etc so that the > child devices (scsibus etc) do not have to do anything special on a > resume. With this behaviour suspend/resume works reliably on my HP > Chromebook 13 and softraid for encryption continues to work. > > There are a few unresolved questions however and that's why I'm posting > this here now: > > 1. How do we know if a device is non-removable, I don't think an MMC > device indicates this, maybe we have to determine this by the > controller? If you find a way to know that would be interesting. However that doesn't mean we want to go this road. I'd rephrase the question differently. How do we know if a device, attached to a removable bus, has been removed or swapped while the machine was suspended? Do you know how other OSes handle this case? What kind of changes would be required to handle this in OpenBSD? At which layer? USB currently follows the same logic, so it could be solved there as well with the same stone. > 2. There are some error cases in sdmmc_mem_scan that are unhandled. They > shouldn't ever be hit, if they were it would indicate that something to > do with our non-removable card has changed over a suspend/resume cycle. What should happen if such error occurs? > 3. I appear to hit an issue with some commands failing after a suspend > if I don't add a small delay to sdhc_wait_intr_cold, I haven't yet been > able to work out why this is. Specifically, the second time we read > EXT_CSD in sdmmc_mem_mmc_init fails. That would be nice to investigate. Such workarounds are a good start, but not a nice solution.
sdmmc(4) support for suspend/resume with eMMC devices
Hey all, In my quest for better OpenBSD support on my Chromebook 13 I have found that sdmmc(4)'s current strategy for suspend/resume only really works for removable storage. Upon a suspend it marks the device as dying, and upon resume will detach the card and re-scan the bus. This makes sense for removable SD cards as they may be removed whilst the device is asleep, however for non-removable eMMC we can make the assumption that the device will not be removed during sleep. By assuming the device stays present over a sleep we can avoid the detach and re-scan behaviour that sdmmc(4) currently uses. This is important if you root storage device is an eMMC device /and/ you are using softraid(4) as softraid(4) will not handle the detach/attach. Despite this on my hardware at least suspend/resume works more reliably if we can avoid this detach behaviour too, on this device we would really have to detach before we enter sleep as the bus can break during suspend and then the detach/reattach fails after resume meaning that my sd0 device just disappears. Instead for non-removable devices upon sleep we can deselect the card but otherwise leave it configured. On resume we still perform a host reset in sdhc(4) and then we can re-do the device initialisation in sdmmc(4). During initialisation we use some of the values determined in the initial sdmmc(4) init as the goal is to put the device into the same state. We also hook up to the same sdmmc_function's etc so that the child devices (scsibus etc) do not have to do anything special on a resume. With this behaviour suspend/resume works reliably on my HP Chromebook 13 and softraid for encryption continues to work. There are a few unresolved questions however and that's why I'm posting this here now: 1. How do we know if a device is non-removable, I don't think an MMC device indicates this, maybe we have to determine this by the controller? 2. There are some error cases in sdmmc_mem_scan that are unhandled. They shouldn't ever be hit, if they were it would indicate that something to do with our non-removable card has changed over a suspend/resume cycle. 3. I appear to hit an issue with some commands failing after a suspend if I don't add a small delay to sdhc_wait_intr_cold, I haven't yet been able to work out why this is. Specifically, the second time we read EXT_CSD in sdmmc_mem_mmc_init fails. In it's current state this patch almost certainly breaks the suspend/resume behaviour for non-removable devices, this should be easily resolved once we can determine which sdmmc(4) devices are removable. Ben. Index: sys/dev/sdmmc/sdhc.c === RCS file: /cvs/src/sys/dev/sdmmc/sdhc.c,v retrieving revision 1.61 diff -u -p -r1.61 sdhc.c --- sys/dev/sdmmc/sdhc.c6 Sep 2018 10:15:17 - 1.61 +++ sys/dev/sdmmc/sdhc.c11 Nov 2018 12:54:29 - @@ -53,6 +53,7 @@ struct sdhc_host { u_int8_t regs[14]; /* host controller state */ u_int16_t intr_status; /* soft interrupt status */ u_int16_t intr_error_status;/* soft error status */ + u_int8_t vdd; /* current vdd */ bus_dmamap_t adma_map; bus_dma_segment_t adma_segs[1]; @@ -373,8 +374,13 @@ sdhc_activate(struct device *self, int a for (n = 0; n < sc->sc_nhosts; n++) { hp = sc->sc_host[n]; (void)sdhc_host_reset(hp); - for (i = 0; i < sizeof hp->regs; i++) - HWRITE1(hp, i, hp->regs[i]); + + /* +* XXX: We might still want this for non-removable +* devices? +*/ + //for (i = 0; i < sizeof hp->regs; i++) + // HWRITE1(hp, i, hp->regs[i]); } rv = config_activate_children(self, act); break; @@ -420,6 +426,8 @@ sdhc_host_reset(sdmmc_chipset_handle_t s s = splsdmmc(); + hp->vdd = 0; + /* Disable all interrupts. */ HWRITE2(hp, SDHC_NINTR_SIGNAL_EN, 0); @@ -490,6 +498,14 @@ sdhc_bus_power(sdmmc_chipset_handle_t sc u_int8_t vdd; int s; + /* +* If the requested vdd is the same as current vdd return. +*/ + if (hp->vdd == ocr) + return 0; + + hp->vdd = ocr; + s = splsdmmc(); /* @@ -1104,6 +1120,9 @@ sdhc_wait_intr_cold(struct sdhc_host *hp { int status; + /* XXX: Need this delay for some commands after suspend, why? */ + delay(10); + mask |= SDHC_ERROR_INTERRUPT; timo = timo * tick; status = hp->intr_status; @@ -1283,6 +1302,8 @@ sdhc_dump_regs(struct sdhc_host *hp) HREAD4(hp, SDHC_PRESENT_STATE), SDHC_PRESENT_STATE_BITS); printf("0x%02x POWER_CTL:%x\n", SDHC_POWER_CTL, HRE