Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
On Fri, Feb 27, 2015 at 08:23:41PM +0200, Imre Deak wrote: We've checked today with Ville a few machines we've found from GEN2 to GEN5+. There was one Thinkpad x61s (GEN4) where I could reproduce the exact same problem and get rid of it using the same workaround. All the others were non-Lenovo (including GEN4) mobile and desktop platforms and none of these had the problem. I also tried it on a Lenovo IVB ultrabook, which also works fine. So the current theory is that it's related to GEN4 Thinkpad BIOSes, and so we need to change the condition to match that at least. If you need more ThinkPads to test with I have a few at home :) Kind regards, David ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
On Thu, Feb 26, 2015 at 09:01:27PM +0100, Daniel Vetter wrote: On Thu, Feb 26, 2015 at 08:50:48PM +0200, Imre Deak wrote: [snip] The problem seems to be that after the kernel puts the device into D3 the BIOS still tries to access it, or otherwise assumes that it's in D0. This is clearly bogus, since ACPI mandates that devices are put into D3 by the OSPM if they are not wake-up sources. In the future we want to unify more of the driver's runtime and system suspend paths, for example by skipping all the system suspend/hibernation hooks if the device is runtime suspended already. Accordingly for all other platforms the goal is still to properly power down the device during hibernation. [snip] If we see more of these troublesome machines we might need to apply an even bigger hammer like gen 5 or so. But as long as there's just 1 report I think this is the right approach. Bjorn, as soon as we have your tested-by that this work we can apply this and shovel it through the backporting machinery. Isn't there a risk that this will negatively impact machines using gen4 that *don't* have a buggy BIOS? Wouldn't a quirk tied to the laptop or BIOS version be better suited -- or possibly a parameter that can be passed to the driver, which would make it easier to test if others suffering from similar symptoms on other systems suffer from the same issue or not? Just my 2¢. Kind regards, David Weinehall ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
On Fri, 2015-02-27 at 14:15 +0200, David Weinehall wrote: On Thu, Feb 26, 2015 at 09:01:27PM +0100, Daniel Vetter wrote: On Thu, Feb 26, 2015 at 08:50:48PM +0200, Imre Deak wrote: [snip] The problem seems to be that after the kernel puts the device into D3 the BIOS still tries to access it, or otherwise assumes that it's in D0. This is clearly bogus, since ACPI mandates that devices are put into D3 by the OSPM if they are not wake-up sources. In the future we want to unify more of the driver's runtime and system suspend paths, for example by skipping all the system suspend/hibernation hooks if the device is runtime suspended already. Accordingly for all other platforms the goal is still to properly power down the device during hibernation. [snip] If we see more of these troublesome machines we might need to apply an even bigger hammer like gen 5 or so. But as long as there's just 1 report I think this is the right approach. Bjorn, as soon as we have your tested-by that this work we can apply this and shovel it through the backporting machinery. Isn't there a risk that this will negatively impact machines using gen4 that *don't* have a buggy BIOS? Wouldn't a quirk tied to the laptop or BIOS version be better suited -- or possibly a parameter that can be passed to the driver, which would make it easier to test if others suffering from similar symptoms on other systems suffer from the same issue or not? Just my 2¢. We've checked today with Ville a few machines we've found from GEN2 to GEN5+. There was one Thinkpad x61s (GEN4) where I could reproduce the exact same problem and get rid of it using the same workaround. All the others were non-Lenovo (including GEN4) mobile and desktop platforms and none of these had the problem. I also tried it on a Lenovo IVB ultrabook, which also works fine. So the current theory is that it's related to GEN4 Thinkpad BIOSes, and so we need to change the condition to match that at least. --Imre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
On to, 2015-02-26 at 10:34 +0100, Bjørn Mork wrote: Imre Deak imre.d...@intel.com writes: That patch fixes the problem, with only pci_set_power_state commented out. Do you still want me to try with pci_disable_device() commented out as well? No, but it would help if you could still try the two attached patch separately, without any of the previous workarounds. Based on the result, we'll follow up with a fix that adds for your specific platform either of the attached workarounds or simply avoids putting the device into D3 (corresponding to the patch you already tried). None of those patches made any difference. The laptop still hangs at power-off. Not really surprising, is it? Previous testing shows that the hang occurs at the pci_set_power_state(drm_dev-pdev, PCI_D3hot) call in the poweroff_late hook. It is hard to see how replacing it by an attempt to set D3cold, or adding any call after this point, could possibly change anything. The system is stil hanging at the pci_set_power_state() call. Judging from the blinking LED, I assume that it's not pci_set_power_state() that hangs the machine, but the hang happens in BIOS code. The generic pci-driver code will put the i915 device into PCI_D3hot for you, won't it? Why do you need to duplicate that in the driver, *knowing* that doing so breaks (at least some) systems? Letting the pci core put the device into D3 wouldn't get rid of the problem. It's putting the device into D3 in the first place what causes it. I honestly don't think this let's try some random code is the proper way to fix this bug (or any other bug for that matter). You need to start understanding the code you write, and the first step is by actually explaining the changes you make. We have a good understanding about the issue: the BIOS on your platform does something unexpected behind the back of the driver/kernel. In that sense the patches were not random, for instance the first one is based on an existing workaround for an issue that resembles quite a lot yours, see pci_pm_poweroff_noirq(). I also believe that you completely miss the fact that this bug has survived a full release cycle before you became aware of it, and the implications this has wrt other affected systems/users. I assume you understand that my system isn't one-of-a-kind, This means that there are other affected users with identical/similar systems. Now, if none of those users reported the bug to you (we all know why: Linux kernel development is currently limited by the available testing resources, NOT by the available developer resources), then how do you know that there aren't a number of other systems affected as well? Let me answer that for you: You don't. Which is why you must explain the mechanism triggering the bug, proving that it is a chipset/system specific issue. Because that's the only way you will *know* that you have solved the problem not only for me, but for all affected users. IMHO, the only safe and sane solution at the moment is the revert patch I posted. It's a simple fix, reverting back to the *known* working state before this regression was introduced. Then you can start over from there, trying to implement this properly. The current way is the proper one that we want for the generic case. The issue on your platform is the exception, so working around that is a sensible choice. Attached is the proposed fix for this issue. --Imre From 5c23657bc168db12a1ba100ab65fabd305c89c8a Mon Sep 17 00:00:00 2001 From: Imre Deak imre.d...@intel.com Date: Thu, 26 Feb 2015 18:38:53 +0200 Subject: [PATCH] drm/i915: gm45: work around hang during hibernation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Bjørn reported that his machine hang during hibernation and eventually bisected the problem to the following commit: commit da2bc1b9db3351addd293e5b82757efe1f77ed1d Author: Imre Deak imre.d...@intel.com Date: Thu Oct 23 19:23:26 2014 +0300 drm/i915: add poweroff_late handler The problem seems to be that after the kernel puts the device into D3 the BIOS still tries to access it, or otherwise assumes that it's in D0. This is clearly bogus, since ACPI mandates that devices are put into D3 by the OSPM if they are not wake-up sources. In the future we want to unify more of the driver's runtime and system suspend paths, for example by skipping all the system suspend/hibernation hooks if the device is runtime suspended already. Accordingly for all other platforms the goal is still to properly power down the device during hibernation. Reference: http://lists.freedesktop.org/archives/intel-gfx/2015-February/060633.html Reported-and-bisected-by: Bjørn Mork bj...@mork.no Signed-off-by: Imre Deak imre.d...@intel.com --- drivers/gpu/drm/i915/i915_drv.c | 27 ++- 1 file changed, 22 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.c
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
On Thu, Feb 26, 2015 at 08:50:48PM +0200, Imre Deak wrote: On to, 2015-02-26 at 10:34 +0100, Bjørn Mork wrote: Imre Deak imre.d...@intel.com writes: That patch fixes the problem, with only pci_set_power_state commented out. Do you still want me to try with pci_disable_device() commented out as well? No, but it would help if you could still try the two attached patch separately, without any of the previous workarounds. Based on the result, we'll follow up with a fix that adds for your specific platform either of the attached workarounds or simply avoids putting the device into D3 (corresponding to the patch you already tried). None of those patches made any difference. The laptop still hangs at power-off. Not really surprising, is it? Previous testing shows that the hang occurs at the pci_set_power_state(drm_dev-pdev, PCI_D3hot) call in the poweroff_late hook. It is hard to see how replacing it by an attempt to set D3cold, or adding any call after this point, could possibly change anything. The system is stil hanging at the pci_set_power_state() call. Judging from the blinking LED, I assume that it's not pci_set_power_state() that hangs the machine, but the hang happens in BIOS code. The generic pci-driver code will put the i915 device into PCI_D3hot for you, won't it? Why do you need to duplicate that in the driver, *knowing* that doing so breaks (at least some) systems? Letting the pci core put the device into D3 wouldn't get rid of the problem. It's putting the device into D3 in the first place what causes it. I honestly don't think this let's try some random code is the proper way to fix this bug (or any other bug for that matter). You need to start understanding the code you write, and the first step is by actually explaining the changes you make. We have a good understanding about the issue: the BIOS on your platform does something unexpected behind the back of the driver/kernel. In that sense the patches were not random, for instance the first one is based on an existing workaround for an issue that resembles quite a lot yours, see pci_pm_poweroff_noirq(). I also believe that you completely miss the fact that this bug has survived a full release cycle before you became aware of it, and the implications this has wrt other affected systems/users. I assume you understand that my system isn't one-of-a-kind, This means that there are other affected users with identical/similar systems. Now, if none of those users reported the bug to you (we all know why: Linux kernel development is currently limited by the available testing resources, NOT by the available developer resources), then how do you know that there aren't a number of other systems affected as well? Let me answer that for you: You don't. Which is why you must explain the mechanism triggering the bug, proving that it is a chipset/system specific issue. Because that's the only way you will *know* that you have solved the problem not only for me, but for all affected users. IMHO, the only safe and sane solution at the moment is the revert patch I posted. It's a simple fix, reverting back to the *known* working state before this regression was introduced. Then you can start over from there, trying to implement this properly. The current way is the proper one that we want for the generic case. The issue on your platform is the exception, so working around that is a sensible choice. Attached is the proposed fix for this issue. --Imre From 5c23657bc168db12a1ba100ab65fabd305c89c8a Mon Sep 17 00:00:00 2001 From: Imre Deak imre.d...@intel.com Date: Thu, 26 Feb 2015 18:38:53 +0200 Subject: [PATCH] drm/i915: gm45: work around hang during hibernation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Bjørn reported that his machine hang during hibernation and eventually bisected the problem to the following commit: commit da2bc1b9db3351addd293e5b82757efe1f77ed1d Author: Imre Deak imre.d...@intel.com Date: Thu Oct 23 19:23:26 2014 +0300 drm/i915: add poweroff_late handler The problem seems to be that after the kernel puts the device into D3 the BIOS still tries to access it, or otherwise assumes that it's in D0. This is clearly bogus, since ACPI mandates that devices are put into D3 by the OSPM if they are not wake-up sources. In the future we want to unify more of the driver's runtime and system suspend paths, for example by skipping all the system suspend/hibernation hooks if the device is runtime suspended already. Accordingly for all other platforms the goal is still to properly power down the device during hibernation. Reference: http://lists.freedesktop.org/archives/intel-gfx/2015-February/060633.html Reported-and-bisected-by: Bjørn Mork bj...@mork.no Signed-off-by: Imre Deak imre.d...@intel.com ---
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
On Thu, Feb 26, 2015 at 08:50:48PM +0200, Imre Deak wrote: On to, 2015-02-26 at 10:34 +0100, Bjørn Mork wrote: Imre Deak imre.d...@intel.com writes: That patch fixes the problem, with only pci_set_power_state commented out. Do you still want me to try with pci_disable_device() commented out as well? No, but it would help if you could still try the two attached patch separately, without any of the previous workarounds. Based on the result, we'll follow up with a fix that adds for your specific platform either of the attached workarounds or simply avoids putting the device into D3 (corresponding to the patch you already tried). None of those patches made any difference. The laptop still hangs at power-off. Not really surprising, is it? Previous testing shows that the hang occurs at the pci_set_power_state(drm_dev-pdev, PCI_D3hot) call in the poweroff_late hook. It is hard to see how replacing it by an attempt to set D3cold, or adding any call after this point, could possibly change anything. The system is stil hanging at the pci_set_power_state() call. Judging from the blinking LED, I assume that it's not pci_set_power_state() that hangs the machine, but the hang happens in BIOS code. The generic pci-driver code will put the i915 device into PCI_D3hot for you, won't it? Why do you need to duplicate that in the driver, *knowing* that doing so breaks (at least some) systems? Letting the pci core put the device into D3 wouldn't get rid of the problem. It's putting the device into D3 in the first place what causes it. I honestly don't think this let's try some random code is the proper way to fix this bug (or any other bug for that matter). You need to start understanding the code you write, and the first step is by actually explaining the changes you make. We have a good understanding about the issue: the BIOS on your platform does something unexpected behind the back of the driver/kernel. In that sense the patches were not random, for instance the first one is based on an existing workaround for an issue that resembles quite a lot yours, see pci_pm_poweroff_noirq(). I also believe that you completely miss the fact that this bug has survived a full release cycle before you became aware of it, and the implications this has wrt other affected systems/users. I assume you understand that my system isn't one-of-a-kind, This means that there are other affected users with identical/similar systems. Now, if none of those users reported the bug to you (we all know why: Linux kernel development is currently limited by the available testing resources, NOT by the available developer resources), then how do you know that there aren't a number of other systems affected as well? Let me answer that for you: You don't. Which is why you must explain the mechanism triggering the bug, proving that it is a chipset/system specific issue. Because that's the only way you will *know* that you have solved the problem not only for me, but for all affected users. IMHO, the only safe and sane solution at the moment is the revert patch I posted. It's a simple fix, reverting back to the *known* working state before this regression was introduced. Then you can start over from there, trying to implement this properly. The current way is the proper one that we want for the generic case. The issue on your platform is the exception, so working around that is a sensible choice. Attached is the proposed fix for this issue. --Imre From 5c23657bc168db12a1ba100ab65fabd305c89c8a Mon Sep 17 00:00:00 2001 From: Imre Deak imre.d...@intel.com Date: Thu, 26 Feb 2015 18:38:53 +0200 Subject: [PATCH] drm/i915: gm45: work around hang during hibernation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Bjørn reported that his machine hang during hibernation and eventually bisected the problem to the following commit: commit da2bc1b9db3351addd293e5b82757efe1f77ed1d Author: Imre Deak imre.d...@intel.com Date: Thu Oct 23 19:23:26 2014 +0300 drm/i915: add poweroff_late handler The problem seems to be that after the kernel puts the device into D3 the BIOS still tries to access it, or otherwise assumes that it's in D0. This is clearly bogus, since ACPI mandates that devices are put into D3 by the OSPM if they are not wake-up sources. In the future we want to unify more of the driver's runtime and system suspend paths, for example by skipping all the system suspend/hibernation hooks if the device is runtime suspended already. Accordingly for all other platforms the goal is still to properly power down the device during hibernation. Reference: http://lists.freedesktop.org/archives/intel-gfx/2015-February/060633.html Reported-and-bisected-by: Bjørn Mork bj...@mork.no Signed-off-by: Imre Deak imre.d...@intel.com ---
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
Ville Syrjälä ville.syrj...@linux.intel.com writes: @@ -651,7 +651,14 @@ static int i915_drm_suspend_late(struct drm_device *drm_dev) } pci_disable_device(drm_dev-pdev); -pci_set_power_state(drm_dev-pdev, PCI_D3hot); +/* + * During hibernation on some GM45 platforms the BIOS may try to access + * the device even though it's already in D3 and hang the machine. So + * leave the device in D0 on those platforms and hope the BIOS will + * power down the device properly. Please include the model of the known bad machine in this comment, to help future archaeologists. Here are some details: bjorn@nemi:~$ grep . /sys/class/dmi/id/{bios,product}* 2/dev/null /sys/class/dmi/id/bios_date:12/19/2011 /sys/class/dmi/id/bios_vendor:LENOVO /sys/class/dmi/id/bios_version:6EET55WW (3.15 ) /sys/class/dmi/id/product_name:2776LEG /sys/class/dmi/id/product_version:ThinkPad X301 Please let me know if you need some other data. Bjørn ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
Imre Deak imre.d...@intel.com writes: Attached is the proposed fix for this issue. --Imre From 5c23657bc168db12a1ba100ab65fabd305c89c8a Mon Sep 17 00:00:00 2001 From: Imre Deak imre.d...@intel.com Date: Thu, 26 Feb 2015 18:38:53 +0200 Subject: [PATCH] drm/i915: gm45: work around hang during hibernation I can confirm that this patch solves the problem for me. Thanks. Bjørn ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
Imre Deak imre.d...@intel.com writes: That patch fixes the problem, with only pci_set_power_state commented out. Do you still want me to try with pci_disable_device() commented out as well? No, but it would help if you could still try the two attached patch separately, without any of the previous workarounds. Based on the result, we'll follow up with a fix that adds for your specific platform either of the attached workarounds or simply avoids putting the device into D3 (corresponding to the patch you already tried). None of those patches made any difference. The laptop still hangs at power-off. Not really surprising, is it? Previous testing shows that the hang occurs at the pci_set_power_state(drm_dev-pdev, PCI_D3hot) call in the poweroff_late hook. It is hard to see how replacing it by an attempt to set D3cold, or adding any call after this point, could possibly change anything. The system is stil hanging at the pci_set_power_state() call. The generic pci-driver code will put the i915 device into PCI_D3hot for you, won't it? Why do you need to duplicate that in the driver, *knowing* that doing so breaks (at least some) systems? I honestly don't think this let's try some random code is the proper way to fix this bug (or any other bug for that matter). You need to start understanding the code you write, and the first step is by actually explaining the changes you make. I also believe that you completely miss the fact that this bug has survived a full release cycle before you became aware of it, and the implications this has wrt other affected systems/users. I assume you understand that my system isn't one-of-a-kind, This means that there are other affected users with identical/similar systems. Now, if none of those users reported the bug to you (we all know why: Linux kernel development is currently limited by the available testing resources, NOT by the available developer resources), then how do you know that there aren't a number of other systems affected as well? Let me answer that for you: You don't. Which is why you must explain the mechanism triggering the bug, proving that it is a chipset/system specific issue. Because that's the only way you will *know* that you have solved the problem not only for me, but for all affected users. IMHO, the only safe and sane solution at the moment is the revert patch I posted. It's a simple fix, reverting back to the *known* working state before this regression was introduced. Then you can start over from there, trying to implement this properly. Thanks, Bjørn ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
On ti, 2015-02-24 at 20:00 +0100, Bjørn Mork wrote: Imre Deak imre.d...@intel.com writes: The poweroff handlers undo the actions of the thaw handlers. As the original commit stated saving the registers is not needed there, but it's also not a big overhead and there should be no problem doing it. We are planning to optimize the hibernation sequence by for example not shutting down the display between freeze and thaw, and also getting rid of unnecessary steps at the power off phase. But before that we want to actually unify things rather than having special cases, as maintaining the special code paths caused already quite a lot of problems for us so far. That sounds like a worthy goal. I don't understand what you hope to achieve by having a poweroff_late hook, since there aren't really anything useful left to do at the point it is called, but if you want a dummy callback there for code structure reasons then fine. There are the following reasons for shutting down the device properly during hibernation: - ACPI mandates that the OSPM (the kernel in our case) puts all devices into D3 that are not wake-up sources (i915 is not) (Kudos to Ville for pointing this out) - Embedded panels have a well defined shutdown sequence. We don't have any good reason to not follow this, in fact for some panels the subsequent reinitialization could be problematic in case of a hard power-off. (Thanks to Jani for this info) In fact the only reason why we didn't put the device into PCI D3 before the patch you bisected, is kind of arbitrary. The PCI core puts every device into D3 unless its driver saves the device's PCI state on its own. Since before that patch we did save the state, but haven't put the device into D3 it stayed in D0. That was definitely not intentional and as such we depended on the BIOS to correct this for us afterwards. But you cannot just run around breaking stuff while slowly moving towards this goal over multiple releases... v3.19 is currently broken and that seems totally unnecessary. In any case: You should have noticed this problem while testing your patches. The breakage is 100% reproducible. Unfortunately I had to do a bisect to realize what you had done to the i915 driver, something I unfortunately didn't find time to do before v3.19 was released. But I do find it unnecessary to release with such bugs. Any attempt to exercise the code path you modified would have revealed the bug. We tested these patches on numerous platforms and haven't seen the issue you reported. Based on your feedback the current assumption is that this is a bug in your BIOS, which tries to access the device despite it being powered down. We're trying our best to test each change we commit on a big set of platforms, but - especially on older hardware with random BIOS versions/configurations - full coverage is not possible. So we depend on reports like yours a lot to fill in the gaps. Reverting the commit may hide some other issue, so before doing that could you try the following patch: http://lists.freedesktop.org/archives/intel-gfx/2015-February/060529.html Makes no difference. I assume that patch fixes an unrelated bug? The age and reported symptoms indicates so. Note that I am reporting a regression introduced after v3.18, while that seems to fix a bug introduced in v3.17. Both v3.17 and v3.18 (including v3.18.6), as well as earlier releases, work fine for me. Ok, thanks for trying. If with that you still get the hang could you try on top of that the patch below, first having only pci_set_power_state uncommented, then both pci_set_power_state and pci_disable_device uncommented? That patch fixes the problem, with only pci_set_power_state commented out. Do you still want me to try with pci_disable_device() commented out as well? No, but it would help if you could still try the two attached patch separately, without any of the previous workarounds. Based on the result, we'll follow up with a fix that adds for your specific platform either of the attached workarounds or simply avoids putting the device into D3 (corresponding to the patch you already tried). Thanks, Imre From fe8b90c976f14eab80cb6715d0405157163d316e Mon Sep 17 00:00:00 2001 From: Imre Deak imre.d...@intel.com Date: Wed, 25 Feb 2015 19:38:53 +0200 Subject: [PATCH] drm/i915: zero PCI_COMMAND at the end of hibernation Signed-off-by: Imre Deak imre.d...@intel.com --- drivers/gpu/drm/i915/i915_drv.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 4badb23..b226cc6 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -653,6 +653,8 @@ static int i915_drm_suspend_late(struct drm_device *drm_dev) pci_disable_device(drm_dev-pdev); pci_set_power_state(drm_dev-pdev, PCI_D3hot); + pci_write_config_word(drm_dev-pdev, PCI_COMMAND, 0); + return 0; } -- 2.1.0 From
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
Imre Deak imre.d...@intel.com writes: The poweroff handlers undo the actions of the thaw handlers. As the original commit stated saving the registers is not needed there, but it's also not a big overhead and there should be no problem doing it. We are planning to optimize the hibernation sequence by for example not shutting down the display between freeze and thaw, and also getting rid of unnecessary steps at the power off phase. But before that we want to actually unify things rather than having special cases, as maintaining the special code paths caused already quite a lot of problems for us so far. That sounds like a worthy goal. I don't understand what you hope to achieve by having a poweroff_late hook, since there aren't really anything useful left to do at the point it is called, but if you want a dummy callback there for code structure reasons then fine. But you cannot just run around breaking stuff while slowly moving towards this goal over multiple releases... v3.19 is currently broken and that seems totally unnecessary. In any case: You should have noticed this problem while testing your patches. The breakage is 100% reproducible. Unfortunately I had to do a bisect to realize what you had done to the i915 driver, something I unfortunately didn't find time to do before v3.19 was released. But I do find it unnecessary to release with such bugs. Any attempt to exercise the code path you modified would have revealed the bug. Reverting the commit may hide some other issue, so before doing that could you try the following patch: http://lists.freedesktop.org/archives/intel-gfx/2015-February/060529.html Makes no difference. I assume that patch fixes an unrelated bug? The age and reported symptoms indicates so. Note that I am reporting a regression introduced after v3.18, while that seems to fix a bug introduced in v3.17. Both v3.17 and v3.18 (including v3.18.6), as well as earlier releases, work fine for me. If with that you still get the hang could you try on top of that the patch below, first having only pci_set_power_state uncommented, then both pci_set_power_state and pci_disable_device uncommented? That patch fixes the problem, with only pci_set_power_state commented out. Do you still want me to try with pci_disable_device() commented out as well? Bjørn ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: fix failure to power off after hibernate
On ti, 2015-02-24 at 15:58 +0100, Bjørn Mork wrote: This fixes a bug where hibernation completes, but the system fails to power off after the image has been saved. Bisection lead to commit da2bc1b9db33 (drm/i915: add poweroff_late handler) which added a .poweroff_late hook pointing to the same function as the .freeze_late hook, without any justification or explanation why this would be correct, except for consistency. The assumption that this makes the power off handling identical to the S3 suspend and S4 freeze handling is completely bogus. As clearly documented in Documentation/power/devices.txt, the poweroff* hooks are part of the hibernation API along with the freeze* hooks. The phases involved in hibernation are: prepare, freeze, freeze_late, freeze_noirq, thaw_noirq, thaw_early, thaw, complete, prepare, poweroff, poweroff_late, poweroff_noirq With the above sequence in mind, it should be fairly obvious that you cannot save registers and disable the device in both the freeze_late and poweroff_late phases. Or as Documentation/power/devices.txt explains it: The poweroff, poweroff_late and poweroff_noirq callbacks should do essentially the same things as the suspend, suspend_late and suspend_noirq callbacks, respectively. The only notable difference is that they need not store the device register values, because the registers should already have been stored during the freeze, freeze_late or freeze_noirq phases. The consistency between suspend and hibernate was already in place, with freeze_late pointing to the same function as suspend_late. There is no need for any poweroff_late hook in this driver. The poweroff handlers undo the actions of the thaw handlers. As the original commit stated saving the registers is not needed there, but it's also not a big overhead and there should be no problem doing it. We are planning to optimize the hibernation sequence by for example not shutting down the display between freeze and thaw, and also getting rid of unnecessary steps at the power off phase. But before that we want to actually unify things rather than having special cases, as maintaining the special code paths caused already quite a lot of problems for us so far. Reverting the commit may hide some other issue, so before doing that could you try the following patch: http://lists.freedesktop.org/archives/intel-gfx/2015-February/060529.html If with that you still get the hang could you try on top of that the patch below, first having only pci_set_power_state uncommented, then both pci_set_power_state and pci_disable_device uncommented? Thanks, Imre diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c index 4badb23..dc91d4b 100644 --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -968,6 +968,23 @@ static int i915_pm_suspend_late(struct device *dev) return i915_drm_suspend_late(drm_dev); } +static int i915_pm_poweroff_late(struct device *dev) +{ + struct drm_device *drm_dev = dev_to_i915(dev)-dev; + struct drm_i915_private *dev_priv = drm_dev-dev_private; + int ret; + + ret = intel_suspend_complete(dev_priv); + + if (ret) + return ret; + + pci_disable_device(drm_dev-pdev); +// pci_set_power_state(drm_dev-pdev, PCI_D3hot); + + return 0; +} + static int i915_pm_resume_early(struct device *dev) { struct drm_device *drm_dev = dev_to_i915(dev)-dev; @@ -1535,7 +1552,7 @@ static const struct dev_pm_ops i915_pm_ops = { .thaw_early = i915_pm_resume_early, .thaw = i915_pm_resume, .poweroff = i915_pm_suspend, - .poweroff_late = i915_pm_suspend_late, + .poweroff_late = i915_pm_poweroff_late, .restore_early = i915_pm_resume_early, .restore = i915_pm_resume, ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx