Re: [PATCH 2/2] drm/nouveau: Queue hpd_work on (runtime) resume

2017-01-24 Thread Hans de Goede

Hi,

On 01/24/2017 02:00 AM, Mario Kleiner wrote:
> On 11/21/2016 05:50 PM, Hans de Goede wrote:
>> We need to call drm_helper_hpd_irq_event() on resume to properly detect
>> monitor connection / disconnection on some laptops, use hpd_work for
>> this to avoid deadlocks.
>
> Hi,
>
> This seems to introduce a hang of nouveau in 4.10-rc if the GPU is
> runtime resumed while no displays are connected at all.
>
> I get a permanent hang (I need to power cycle to recover) if I either:
>
> a) Boot a MacPro test machine which has two discrete cards, one Radeon
> and one GeForce, with the displays connected only to the Radeon, so the
> NVIDIA GPU has no displays connected during boot.
>
> b) On a gmux'ed MacBookPro 2010 (Intel + NVIDIA), switch to the Intel
> card via vgaswitcheroo (echo IGD > ...vgaswitcheroo/switch) and then,
> after nouveau has powered down the NVIDIA GPU, use
> (echo ON > ...vgaswitcheroo/switch) to power it up again, now with
> nothing connected to its outputs.
>
> I can prevent the hang if I boot with nouveau.runpm=0, connect displays
> to the NVIDIA GPU in case a), or remove the new schedule_work() for
> hpd_work in the nouveau_pmops_runtime_resume() function.
>
> Otherwise I get a hanging gpu-manager process on Ubuntu and this
> repeating in my kernel log:


Weird, Ben do you have any ideas what might be causing this ?

Regards,

Hans





Re: [PATCH 2/2] drm/nouveau: Queue hpd_work on (runtime) resume

2017-01-23 Thread Mario Kleiner

On 11/21/2016 05:50 PM, Hans de Goede wrote:

We need to call drm_helper_hpd_irq_event() on resume to properly detect
monitor connection / disconnection on some laptops, use hpd_work for
this to avoid deadlocks.



Hi,

This seems to introduce a hang of nouveau in 4.10-rc if the GPU is
runtime resumed while no displays are connected at all.

I get a permanent hang (I need to power cycle to recover) if I either:

a) Boot a MacPro test machine which has two discrete cards, one Radeon
and one GeForce, with the displays connected only to the Radeon, so the
NVIDIA GPU has no displays connected during boot.

b) On a gmux'ed MacBookPro 2010 (Intel + NVIDIA), switch to the Intel
card via vgaswitcheroo (echo IGD > ...vgaswitcheroo/switch) and then,
after nouveau has powered down the NVIDIA GPU, use
(echo ON > ...vgaswitcheroo/switch) to power it up again, now with
nothing connected to its outputs.

I can prevent the hang if I boot with nouveau.runpm=0, connect displays
to the NVIDIA GPU in case a), or remove the new schedule_work() for
hpd_work in the nouveau_pmops_runtime_resume() function.

Otherwise I get a hanging gpu-manager process on Ubuntu and this
repeating in my kernel log:


[  246.899424] INFO: task kworker/7:1:127 blocked for more than 120 seconds.
[  246.899476]   Tainted: G  I 4.9.0-rc8 #60
[  246.899511] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.899561] kworker/7:1 D0   127  2 0x
[  246.899573] Workqueue: pm pm_runtime_work
[  246.899576]  917bcbeb2680  917c0e8d 917c0d89
[  246.899582]  917c165da058 9c86866c7b70 8c920fab 917c0de3d360
[  246.899587]  0086 0de3d360 917c165da058
[  246.899593] Call Trace:
[  246.899601]  [] ? __schedule+0x2fb/0xb30
[  246.899605]  [] schedule+0x40/0x90
[  246.899608]  [] rpm_resume+0x14a/0x740
[  246.899614]  [] ? wake_atomic_t_function+0x60/0x60
[  246.899617]  [] pm_runtime_forbid+0x43/0x50
[  246.899678]  [] nouveau_pmops_runtime_suspend+0xc5/0xd0 [nouveau]
[  246.899684]  [] pci_pm_runtime_suspend+0x5d/0x190
[  246.899687]  [] ? pci_pm_runtime_resume+0xa0/0xa0
[  246.899690]  [] __rpm_callback+0x32/0x70
[  246.899693]  [] rpm_callback+0x24/0x80
[  246.899695]  [] ? pci_pm_runtime_resume+0xa0/0xa0
[  246.899698]  [] rpm_suspend+0x11e/0x6f0
[  246.899701]  [] pm_runtime_work+0x7b/0xc0
[  246.899707]  [] process_one_work+0x1f8/0x750
[  246.899710]  [] ? process_one_work+0x179/0x750
[  246.899713]  [] worker_thread+0x4b/0x4f0
[  246.899717]  [] ? preempt_count_sub+0x4c/0x80
[  246.899720]  [] ? process_one_work+0x750/0x750
[  246.899723]  [] kthread+0x102/0x120
[  246.899728]  [] ? trace_hardirqs_on_caller+0x16/0x1c0
[  246.899732]  [] ? kthread_park+0x60/0x60
[  246.899735]  [] ret_from_fork+0x2a/0x40
[  246.899738] INFO: lockdep is turned off.
[  246.899751] INFO: task gpu-manager:1173 blocked for more than 120 seconds.
[  246.899796]   Tainted: G  I 4.9.0-rc8 #60
[  246.899832] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  246.899880] gpu-manager D0  1173  1 0x
[  246.899884]  917bcbd38580  917c0e8d5180 917bcbced180
[  246.899889]  917c169da058 9c8688573ce0 8c920fab 917c0de3d360
[  246.899894]  0086 0de3d360 917c169da058
[  246.899899] Call Trace:
[  246.899903]  [] ? __schedule+0x2fb/0xb30
[  246.899906]  [] schedule+0x40/0x90
[  246.899909]  [] __pm_runtime_barrier+0x95/0x140
[  246.899913]  [] ? wake_atomic_t_function+0x60/0x60
[  246.899915]  [] pm_runtime_barrier+0x5b/0xc0
[  246.899919]  [] pci_config_pm_runtime_get+0x3b/0x60
[  246.899922]  [] pci_read_config+0x7c/0x260
[  246.899927]  [] sysfs_kf_bin_read+0x50/0x80
[  246.899929]  [] kernfs_fop_read+0xba/0x1c0
[  246.899934]  [] __vfs_read+0x28/0x130
[  246.899939]  [] ? security_file_permission+0x9e/0xc0
[  246.899942]  [] ? rw_verify_area+0x4e/0xc0
[  246.899945]  [] vfs_read+0x96/0x140
[  246.899948]  [] SyS_pread64+0x7a/0x90
[  246.899952]  [] entry_SYSCALL_64_fastpath+0x23/0xc6
[  246.899955]  [] ? trace_hardirqs_off_caller+0x1f/0xc0
[  246.899958] INFO: lockdep is turned off.
[  369.775936] INFO: task kworker/7:1:127 blocked for more than 120 seconds.
[  369.775989]   Tainted: G  I 4.9.0-rc8 #60
[  369.776024] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  369.776074] kworker/7:1 D0   127  2 0x
[  369.776086] Workqueue: pm pm_runtime_work
[  369.776090]  917bcbeb2680  917c0e8d 917c0d89
[  369.776096]  917c165da058 9c86866c7b70 8c920fab 917c0de3d360
[  369.776101]  0086 0de3d360 917c165da058
[  369.776106] Call Trace:
[  369.776114]  [] ? __schedule+0x2fb/0xb30

[PATCH 2/2] drm/nouveau: Queue hpd_work on (runtime) resume

2016-11-21 Thread Hans de Goede

We need to call drm_helper_hpd_irq_event() on resume to properly detect
monitor connection / disconnection on some laptops, use hpd_work for
this to avoid deadlocks.

Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 3100fd88..b564ab8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -692,7 +692,12 @@ nouveau_pmops_resume(struct device *dev)
 		return ret;
 	pci_set_master(pdev);

-	return nouveau_do_resume(drm_dev, false);
+	ret = nouveau_do_resume(drm_dev, false);
+
+	/* Monitors may have been connected / disconnected during suspend */
+	schedule_work(&nouveau_drm(drm_dev)->hpd_work);
+
+	return ret;
 }

 static int
@@ -766,6 +771,10 @@ nouveau_pmops_runtime_resume(struct device *dev)
 	nvif_mask(&device->object, 0x088488, (1 << 25), (1 << 25));
 	vga_switcheroo_set_dynamic_switch(pdev, VGA_SWITCHEROO_ON);
 	drm_dev->switch_power_state = DRM_SWITCH_POWER_ON;
+
+	/* Monitors may have been connected / disconnected during suspend */
+	schedule_work(&nouveau_drm(drm_dev)->hpd_work);
+
 	return ret;
 }

-- 
2.9.3
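
The commit message says the hotplug probe is queued as work "to avoid
deadlocks". As a rough userspace illustration of that general pattern
(not the nouveau code), the sketch below uses pthreads: a resume path
holds a lock and hands the probe to another thread instead of calling it
directly. Every name here (fake_dev, pm_lock, fake_resume, queue_hpd_work,
flush_hpd_work) is hypothetical, and the thread above does not say which
lock the real driver would deadlock on, so treat pm_lock purely as an
assumption made for the example.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these names exist in nouveau. */
struct fake_dev {
	pthread_mutex_t pm_lock;    /* assumed lock held across the resume transition */
	pthread_t worker;           /* stands in for the kernel workqueue thread */
	bool work_queued;           /* true while a worker thread is outstanding */
	bool resuming;              /* true only while fake_resume() holds pm_lock */
	bool probe_ran_under_lock;  /* set if the probe ever observed resuming == true */
	int probes_done;            /* number of completed hotplug probes */
};

static void fake_dev_init(struct fake_dev *dev)
{
	pthread_mutex_init(&dev->pm_lock, NULL);
	dev->work_queued = false;
	dev->resuming = false;
	dev->probe_ran_under_lock = false;
	dev->probes_done = 0;
}

/* Stand-in for the hotplug probe: it needs pm_lock before it can run. */
static void *hpd_work_fn(void *arg)
{
	struct fake_dev *dev = arg;

	pthread_mutex_lock(&dev->pm_lock);  /* waits until resume has released it */
	if (dev->resuming)
		dev->probe_ran_under_lock = true;
	dev->probes_done++;
	pthread_mutex_unlock(&dev->pm_lock);
	return NULL;
}

/* Stand-in for schedule_work(): run the probe later, on another thread. */
static int queue_hpd_work(struct fake_dev *dev)
{
	int err = pthread_create(&dev->worker, NULL, hpd_work_fn, dev);

	if (!err)
		dev->work_queued = true;
	return err;
}

/* Stand-in for a resume callback that holds pm_lock while it runs. */
static int fake_resume(struct fake_dev *dev)
{
	int err;

	pthread_mutex_lock(&dev->pm_lock);
	dev->resuming = true;
	/*
	 * Calling hpd_work_fn(dev) directly here would try to take pm_lock a
	 * second time from the same thread, which typically hangs with a
	 * default mutex. Queueing defers the probe until the lock is free.
	 */
	err = queue_hpd_work(dev);
	dev->resuming = false;
	pthread_mutex_unlock(&dev->pm_lock);
	return err;
}

/* Wait for the queued probe, if any, to finish. */
static void flush_hpd_work(struct fake_dev *dev)
{
	if (dev->work_queued) {
		pthread_join(dev->worker, NULL);
		dev->work_queued = false;
	}
}
```

A caller would run fake_dev_init(), then fake_resume(), then
flush_hpd_work(); afterwards probes_done is 1 and probe_ran_under_lock
is still false, because the probe only gets pm_lock once the resume path
has dropped it. The sketch shows only why deferral sidesteps a
self-deadlock; it does not model the runtime PM interaction reported in
the replies.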