Re: [Nouveau] Nouveau: kernel hang on Optimus+Intel+NVidia GeForce 1060m

2017-09-17 Thread Peter Wu
Hi Nicholas,

Increasing the timeout won't help as the dGPU is not powered/available. If you 
have an external monitor connected, the dGPU will stay powered on and not 
suspend (with a (temporary) effect similar to booting with nouveau.runpm=0).

The reason for the failure to restore power is still unknown, it seems related 
to the PCI core.

Kind regards,
Peter
https://lekensteyn.nl
(pardon my brevity, top-posting and formatting, sent from my phone)


On 13 September 2017 23:13:59 BST, Nicolas Mercier  
wrote:
>Thanks Tobias, I tried this but unfortunately the only effect was thta
>the
>boot was delayed by an additional 4 seconds :(
>The original timeout is at
>drivers/gpu/drm/nouveau/nvkm/subdev/secboot/ls_ucode_msgqueue.c
>I tried to increase that timeout, but it did not seem to make a
>difference
>either.
>
>I think I get this error less often when I have a cable plugged in the
>output of that card at boot, whereas I always get this error when I
>boot
>without a monitor attached to the card.
>
>On Wed, Sep 13, 2017 at 2:28 PM, Tobias Klausmann <
>tobias.johannes.klausm...@mni.thm.de> wrote:
>
>> Hi,
>>
>> the system fails to initialize your vbios using secureboot (i had a
>rare
>> chance to on my system to witness it again), for now i traced it to
>> acr_boot_falcon() in
>> "linux/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0148cdec.c" where
>it
>> throws -110 which is -ETIMEDOUT. You could try to increase the
>timeout
>> and see if it helps something, similar to the following:
>>
>>
>> diff --git a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
>> b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
>> index 77273b53672c..fc0cb187d80d 100644
>> --- a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
>> +++ b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
>> @@ -326,7 +326,7 @@ nvkm_msgqueue_post(struct nvkm_msgqueue *priv,
>enum
>> msgqueue_msg_priority prio,
>> int ret;
>>
>> if (wait_init &&
>!wait_for_completion_timeout(&priv->init_done,
>> -msecs_to_jiffies(1000)))
>> +msecs_to_jiffies(5000)))
>> return -ETIMEDOUT;
>>
>> queue = priv->func->cmd_queue(priv, prio);
>>
>> diff --git a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0137c63d.c
>> b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0137c63d.c
>> index fec0273158f6..c2ae525a0780 100644
>> --- a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0137c63d.c
>> +++ b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue_0137c63d.c
>> @@ -279,6 +279,7 @@ acr_boot_falcon(struct nvkm_msgqueue *priv, enum
>> nvkm_secboot_falcon falcon)
>> u32 flags;
>> u32 falcon_id;
>> } cmd;
>> +   const struct nvkm_subdev *subdev = priv->falcon->owner;
>>
>> memset(&cmd, 0, sizeof(cmd));
>>
>> @@ -290,7 +291,8 @@ acr_boot_falcon(struct nvkm_msgqueue *priv, enum
>> nvkm_secboot_falcon falcon)
>> nvkm_msgqueue_post(priv, MSGQUEUE_MSG_PRIORITY_HIGH,
>&cmd.hdr,
>> acr_boot_falcon_callback, &completed, true);
>>
>> -   if (!wait_for_completion_timeout(&completed,
>> msecs_to_jiffies(1000)))
>> +   nvkm_error(subdev, "waiting for timeout in acr_boot_falcon
>> (msgqueue_0137bca5)\n");
>> +   if (!wait_for_completion_timeout(&completed,
>> msecs_to_jiffies(5000)))
>> return -ETIMEDOUT;
>>
>> return 0;
>>
>>
>>
>> On 9/13/17 11:37 AM, Nicolas Mercier wrote:
>> > I am still looking for a solution. I have hacked around in the code
>> > and found out the following:
>> > - Nouveau prefers using PCIE power managemet over ACPI Optimus
>calls.
>> > I tried to force it to use Optimus ACPI calls, but there was an
>error
>> > calling the ACPI method so it bails out and uses PCIE PM anyway.
>> > - I tried to debug the PCIE pm states which internally uses ACPI to
>> > turn power on/off. I could print different statuses here and there.
>> > When the power is switched off, ACPI calls turn the power off then
>the
>> > kernel successfully puts the device in state D3Cold (also turning
>off
>> > power to the PCI Express port). When waking up, ACPI turns the
>power
>> > on, apparently successfully (Device [PEGP] transitioned to D0). But
>a
>> > read from the PCI bus to get the power state & other flags return
>> > 65535 (~0) and the kernel fails to set the device in D0 (although
>ACPI
>> > claims it is in D0)
>> > The call to pci_raw_set_power_state (in drivers/pci/pci.c) seems to
>> > fail because pci_read_config_word returns "~0" (and does not return
>> > any error code)
>> >
>> > I have tried different things; if I use pcie_port_pm=off, the
>NVidia
>> > card goes to state D3Hot (if I am not mistaken, its PCIE port is
>still
>> > powered) but that did not fix it. I tried to turn on or off
>different
>> > PCI/PCIexpress features such as hotplug, PM and so on. The only
>thing
>> > that works is that PM is fully disabled, which equals to the device
>> > not being powered off, so

Re: [Nouveau] [PATCH v2 0/7] Modernize vga_switcheroo by using device link for HDA

2018-03-05 Thread Peter Wu
1/0x80 [nouveau]
 nouveau_drm_unload+0x6b/0xd0 [nouveau]
 drm_dev_unregister+0x3c/0xe0
 drm_put_dev+0x2e/0x60
 nouveau_drm_device_remove+0x37/0x50 [nouveau]
 pci_device_remove+0x36/0xb0
 device_release_driver_internal+0x160/0x230
 driver_detach+0x3a/0x70
 bus_remove_driver+0x58/0xd0
 pci_unregister_driver+0x3b/0x90
 nouveau_drm_exit+0x15/0x432 [nouveau]
 SyS_delete_module+0x16c/0x230

Issue 8 - acpi: sleeping function in atomic context.
(Issue is likely not related to this patch set.)
At some point I also got a BUG, nouveau was already unloaded and I ran:
"echo 1 | tee /sys/bus/pci/devices/:01:00.{0,1}/remove"

BUG: sleeping function called from invalid context at 
/home/peter/linux/mm/slab.h:419
in_atomic(): 1, irqs_disabled(): 0, pid: 4844, name: kworker/3:4
INFO: lockdep is turned off.
CPU: 3 PID: 4844 Comm: kworker/3:4 Tainted: GW
4.15.0testing-00020-gb33d50c5c6ad #55
Hardware name: Notebook 
P65_P67RGRERA/P65_P67RGRERA, BIOS 1.05.16 05/16/2016
Workqueue: events_power_efficient srcu_invoke_callbacks
Call Trace:
 dump_stack+0x5f/0x86
 ___might_sleep+0x20c/0x240
 kmem_cache_alloc_trace+0x4d/0x230
 acpi_ut_evaluate_object+0x68/0x23c
 ? srcu_invoke_callbacks+0xa2/0x150
 acpi_rs_get_prt_method_data+0x42/0xa2
 acpi_get_irq_routing_table+0x70/0x9f
 ? __slab_free+0x11c/0x380
 acpi_pci_irq_find_prt_entry+0x83/0x330
 ? srcu_invoke_callbacks+0xa2/0x150
 acpi_pci_irq_lookup+0x27/0x2e0
 acpi_pci_irq_disable+0x45/0xb0
 pci_release_dev+0x29/0x60
 device_release+0x2d/0x80
 kobject_put+0xb7/0x190
 __device_link_free_srcu+0x32/0x40
 srcu_invoke_callbacks+0xba/0x150
 process_one_work+0x273/0x670
 worker_thread+0x4a/0x400
 kthread+0x100/0x140
 ? process_one_work+0x670/0x670
 ? kthread_create_worker_on_cpu+0x50/0x50
 ? do_syscall_64+0x56/0x1a0
 ? SyS_exit_group+0x10/0x10

Issue 9 - potential memory corruption.
At some point (possibly after issue 7, but I am not fully sure), I saw
an artifact in the text console which would persist even when switching
between consoles. It was gone after system sleep/resume. If I remember
correctly, it looked like something from a Xorg session which I killed
before.

That was the bad news, the good news:
- Loading nouveau and snd-hda-intel (in any order) while RPM is enabled
  and the port was in D3cold works.
- RPM interaction between audio and GPU seems good, when audio resumes,
  the GPU RPM counter increments, when audio suspends it decrements.
- As the GPU enters D3cold, I can observe significant power savings
  through /sys/class/power_supply/BAT0/ (no regressions here).
- In a default configuration I have no audio function (see also nouveau
  bug https://bugs.freedesktop.org/show_bug.cgi?id=75985) so most of the
  above issues should not occur.

Hope it helps, and if desired you can add:
Tested-by: Peter Wu 

For the following patches, you can also add my Reviewed-by:

vga_switcheroo: Update PCI current_state on power change
vga_switcheroo: Deduplicate power state tracking
vga_switcheroo: Use device link for HDA controller
vga_switcheroo: Let HDA autosuspend on mux change
drm/nouveau: Runtime suspend despite HDA being unbound

The two other PCI patches look fine as well.

Kind regards,
Peter

On Sat, Mar 03, 2018 at 10:53:24AM +0100, Lukas Wunner wrote:
> Modernize vga_switcheroo by using a device link to enforce a runtime PM
> dependency from an HDA controller to the GPU it's integrated into, v2.
> 
> Changes since v1:
> 
> - Replace patch [1/7] to use pci_save_state() / pci_restore_state()
>   for consistency between runtime PM code path of bound and unbound
>   devices. (Rafael, Bjorn)
> 
> - Patch [5/7]: Drop an unnecessary initialization. (Bjorn)
>   Rephrase error message on failed link addition for clarity.
> 
> Link to v1:
> https://www.spinics.net/lists/dri-devel/msg165889.html
> 
> Testing on more machines would be greatly appreciated, particularly
> Nvidia Optimus or AMD PowerXpress.
> 
> The series is based on 4.16-rc3.  To test it on 4.15, you need to
> cherry-pick 7506dc798993 and 2fa6d6cdaf28.  For your convenience
> I've pushed a 4.15-based branch to:
> https://github.com/l1k/linux/commits/switcheroo_devlink_v2
> 
> Minimal test procedure:
> 
> - Note well: Recent Optimus require that a Mini-DP or HDMI cable is
>   plugged in on boot for the HDA device to be present.
> 
> - Check that HDA, GPU and root port autosuspend when not in use:
>   cat /sys/bus/pci/devices/:01:00.1/power/runtime_status  # HDA
>   cat /sys/bus/pci/devices/:01:00.0/power/runtime_status  # GPU
>   cat /sys/bus/pci/devices/:00:01.0/power/runtime_status  # Root Port
> 
> - Check that all three autoresume when accessing the HDA:
>   hdajack

Re: [Nouveau] Rewriting Intel PCI bridge prefetch base address bits solves nvidia graphics issues

2018-08-24 Thread Peter Wu
like disabling MSI/interrupts before
suspend, setting the Enable Clock Power Management bit in PCI Express
Link Control and more, but applying these changes were so far not really
successful.

Some supporting files for that investigation are here:
https://github.com/Lekensteyn/acpi-stuff/tree/master/d3test

Karol noticed that by not setting the State in PMCSR to D3 for the
Nvidia GPU during runtime suspend, then the device would successfully
resume. However, based on traces using VFIO-PCI, it does not seem a good
solution as Windows does not behave like that.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] Rewriting Intel PCI bridge prefetch base address bits solves nvidia graphics issues

2018-08-28 Thread Peter Wu
On Tue, Aug 28, 2018 at 10:23:24AM +0800, Daniel Drake wrote:
> On Fri, Aug 24, 2018 at 11:42 PM, Peter Wu  wrote:
> > Are these systems also affected through runtime power management? For
> > example:
> >
> > modprobe nouveau# should enable runtime PM
> > sleep 6 # wait for runtime suspend to kick in
> > lspci -s1:  # runtime resume by reading PCI config space
> >
> > On laptops from about 2015-2016 with a GTX 9xxM this sequence results in
> > hangs on various laptops
> > (https://bugzilla.kernel.org/show_bug.cgi?id=156341).
> 
> This works fine here. I'm facing a different issue.

Just to be sure, after "sleep", do both devices report "suspended" in
/sys/bus/pci/devices/:00:1c.0/power/runtime_status
/sys/bus/pci/devices/:01:00.0/power/runtime_status

and was this reproduced with a recent mainline kernel with no special
cmdline options? The endlessm kernel on Github seems to have quite some
patches, one of them explicitly disable runtime PM:
https://github.com/endlessm/linux/commit/8b128b50cd6725eee2ae9025a1510a221d9b42f2

> >> After a lot of experimentation I found a workaround: during resume,
> >> set the value of PCI_PREF_BASE_UPPER32 to 0 on the parent PCI bridge.
> >> Easily done in drivers/pci/quirks.c. Now all nvidia stuff works fine.
> >
> > I am curious, how did you discover this? While this could work, perhaps
> > there are alternative workarounds/fixes?
> 
> Based on the observation that the following procedure works fine (note
> the addition of step 3):
> 
> 1. Boot
> 2. Suspend/resume
> 3. echo rescan > /sys/bus/pci/devices/:00:1c.0/rescan
> 4. Load nouveau driver
> 5. Start X
> 
> I worked through the rescan codepath until I had isolated the specific
> code which magically makes things work (in pci_bridge_check_ranges).
> 
> Having found that, step 3 in the above test procedure can be replaced
> with a simple:
>setpci -s 00:1c.0 0x28.l=0
> 
> > When you say "parent PCI" bridge, is that actually the device you see in
> > "lspci -tv"? On a Dell XPS 9560, the GPU is under a different device:
> >
> >   -[:00]-+-00.0  Intel Corporation Xeon E3-1200 v6/7th Gen Core 
> > Processor Host Bridge/DRAM Registers
> >  +-01.0-[01]00.0  NVIDIA Corporation GP107M [GeForce GTX 
> > 1050 Mobile]
> >
> >  00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3-1200 v5/E3-1500 
> > v5/6th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 05)
> 
> Yes, it's the parent bridge shown by lspci. The address of this varies
> from system to system.

Could you share some details:
- acpidump
- lspci -nnvvv
- BIOS version (from /sys/class/dmi/id/)
- kernel version (mainline?)

Perhaps there is some magic in the ACPI suspend or resume path that
causes this.

> >> 1. Is the Intel PCI bridge misbehaving here? Why does writing the same
> >> value of PCI_PREF_BASE_UPPER32 make any difference at all?
> >
> > At what point in the suspend code path did you insert this write? It is
> > possible that the write somehow acted as a fence/memory barrier?
> 
> static void quirk_pref_base_upper32(struct pci_dev *dev)
> {
>u32 pref_base_upper32;
>pci_read_config_dword(dev, PCI_PREF_BASE_UPPER32, &pref_base_upper32);
>pci_write_config_dword(dev, PCI_PREF_BASE_UPPER32, pref_base_upper32);
> }
> DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_INTEL,  0x9d10, 
> quirk_pref_base_upper32);
> 
> I don't think it's acting as a barrier. I tried changing this code to
> rewrite other registers such as PCI_PREF_MEMORY_BASE and that makes
> the bug come back.
> 
> >> 2. Who is responsible for saving and restoring PCI bridge
> >> configuration during suspend and resume? Linux? ACPI? BIOS?
> >
> > Not sure about PCI bridges, but at least for the PCI Express Capability
> > registers, it is in control of the OS when control is granted via the
> > ACPI _OSC method.
> 
> I guess you are referring to pci_save_pcie_state(). I can't see
> anything equivalent for the bridge registers.

Yes that would be the function, called via pci_save_state.

> > I recently compared PCI configuration space access and ACPI method
> > invocation using QEMU + VFIO with Linux 4.18, Windows 7 and Windows 10
> > (1803). There were differences like disabling MSI/interrupts before
> > suspend, setting the Enable Clock Power Management bit in PCI Express
> > Link Control and more, but applying these changes were so far not really
> > successful.
> 
> Interesting. Do you know any way that I could spy on Windows' acces

Re: [Nouveau] Rewriting Intel PCI bridge prefetch base address bits solves nvidia graphics issues

2018-08-30 Thread Peter Wu
On Thu, Aug 30, 2018 at 03:41:43PM +0800, Daniel Drake wrote:
> On Tue, Aug 28, 2018 at 5:57 PM, Peter Wu  wrote:
> > Just to be sure, after "sleep", do both devices report "suspended" in
> > /sys/bus/pci/devices/:00:1c.0/power/runtime_status
> > /sys/bus/pci/devices/:01:00.0/power/runtime_status
> >
> > and was this reproduced with a recent mainline kernel with no special
> > cmdline options? The endlessm kernel on Github seems to have quite some
> > patches, one of them explicitly disable runtime PM:
> > https://github.com/endlessm/linux/commit/8b128b50cd6725eee2ae9025a1510a221d9b42f2
> 
> Yes, I checked for this issue in the past and I'm certain that nouveau
> runtime pm works fine.
> 
> I also checked again now on X542UQ and the results are the same.
> nouveau can do runtime suspend/resume (confirmed by reading
> runtime_status) and then render 3D graphics OK. lspci is fine too. It
> is just S3 suspend that is affected. This was testing on Linux 4.18
> unmodified. I had to set nouveau runpm parameter to 1 for it to use
> runtime pm.
> 
> Also checked with Karol's patch, the S3 issue is still there. Seems
> like 2 different issues.
> 
> > Could you share some details:
> > - acpidump
> > - lspci -nnvvv
> > - BIOS version (from /sys/class/dmi/id/)
> > - kernel version (mainline?)
> 
> Linux 4.18 mainline
> BIOS version: X542UQ.202
> acpidump: 
> https://gist.githubusercontent.com/dsd/79352284d4adce14f30d70e94fad89f2/raw/ed9480e924be413fff567da2edd5a2a7a86619d0/gistfile1.txt
> pci: 
> https://gist.githubusercontent.com/dsd/79352284d4adce14f30d70e94fad89f2/raw/ed9480e924be413fff567da2edd5a2a7a86619d0/pci

Thanks, based on the \_SB.PCI0.HGOF implementation, it looks like this
model will not be affected by the runtime suspend issue (it sets the
"Link Disable" register which is known to work for other models).

As the BIOS date is not visible, can you also confirm that this message
is visible in dmesg?

nouveau: detected PR support, will not use DSM

FWIW, the latest BIOS version is 305, released at 2018/08/07:
https://www.asus.com/Laptops/ASUS-VivoBook-15-X542UQ/HelpDesk_BIOS/

> > Only non-bridge devices can be passed to a guest, but perhaps logging
> > access to the emulated bridge is already sufficient. The Prefetchable
> > Base Upper 32 Bits register is at offset 0x28.
> >
> > In a trace where the Nvidia device is disabled/enabled via Device
> > Manager, I see writes on the enable path:
> >
> > 2571@1535108904.593107:rp_write_config (ioh3420, @0x28, 0x0, len=0x4)
> >
> > For Linux, I only see one write at startup, none on runtime resume.
> > I did not test system sleep/resume. (disable/enable is arguably a bit
> > different from system s/r, you may want to do additional testing here.)
> 
> I managed to install Win10 Home under virt-manager with the nvidia
> device passed through.
> However the nvidia windows driver installer refuses to install, says:
> The NVIDIA graphics driver is not compatible with this version of Windows.
> This graphics driver could not find compatible graphics hardware.
> 
> One trick for similar sounding problems is to change hypervisor vendor
> ID but no luck here.

For laptops, it appears that you have to do at least two things:
- Ensure that the Subsystem Vendor/Product ID are set.
- Expose a _ROM ACPI method that provides VBIOS.

Perhaps you also need to provide a "_DSM" method that emulates at least
the "Optimus" interface for GUID a486d8f8-0bda-471b-a72b-6042a6b5bee0.

You probably lost interest here, but if you want to continue anyway this
is what allowed me to install the driver on the XPS 9560:
https://github.com/Lekensteyn/acpi-stuff/blob/master/d3test/fakedev.asl

If you adapt if for your environment, note:
- I have only tested this with the q35 machine type with an additional
  ioh3420 root port. See the XPS956/boot-vm script.
- The \_SB.PCI0.SE0 device should match the root port:
  cat /sys/bus/pci/devices/:00:1c.0/firmware_node/path
  (the SE0 name is chosen by QEMU.)
- The "NET" (\_SB.PCI0.SE0.NET) device name is arbitrary chosen by me,
  it currently assumes PCI address 01:00.0:
  Name (_ADR, 0x) // _ADR: Address (dev+fn only, 01:00.0)
- The _DSM method is copied from the XPS 9560 SSDT with external method
  references removed (focus on the code with "OPCI" true, the other two
  with NBCI and SGCI are irrelevant). One obvious difference with your
  SSDT is function 0x10, your OPVK ("Optimus Validation Key Object" is
  different and there is another "OPDR" check afterwards.

> I was going to check if I can monitor PCI bridge config space access
> even without the nvid

Re: [Nouveau] Rewriting Intel PCI bridge prefetch base address bits solves nvidia graphics issues

2018-09-05 Thread Peter Wu
On Wed, Sep 05, 2018 at 02:26:51PM +0800, Daniel Drake wrote:
> On Tue, Aug 28, 2018 at 5:57 PM, Peter Wu  wrote:
> > Only non-bridge devices can be passed to a guest, but perhaps logging
> > access to the emulated bridge is already sufficient. The Prefetchable
> > Base Upper 32 Bits register is at offset 0x28.
> >
> > In a trace where the Nvidia device is disabled/enabled via Device
> > Manager, I see writes on the enable path:
> >
> > 2571@1535108904.593107:rp_write_config (ioh3420, @0x28, 0x0, len=0x4)
> 
> Did you do anything special to get an emulated bridge included in this setup?

Yes, I followed instructions in QEMU's docs/pcie.txt and ended up with:

-device ioh3420,id=rp1,bus=pcie.0,addr=1c.0,port=1
-device 
vfio-pci,bus=rp1,host=01:00.0,rombar=0,x-pci-sub-vendor-id=0x1028,x-pci-sub-device-id=0x07be

(Subvendor/device IDs are from lspci -nnv).

> Folllowing the instructions at
> https://wiki.archlinux.org/index.php/PCI_passthrough_via_OVMF I can
> successfully pass through devices to windows running under
> virt-manager. In the nvidia GPU case I haven't got passed the driver
> installation failure, but I can pass through other devices OK and
> install their drivers.

After installing drivers, it would still not start. For that to work I
had to pass the VBIOS via an ACPI _ROM method:

-acpitable file=fakedev.aml
-fw_cfg name=opt/nl.lekensteyn/vfio-vbios,file=vbios.rom

These options were taken from:
https://github.com/Lekensteyn/acpi-stuff/blob/master/d3test/XPS9560/boot-vm

fakedev.asl source file and instructions to extract the VBIOS:
https://github.com/Lekensteyn/acpi-stuff/tree/master/d3test

> However I do not end up with any PCI-to-PCI bridges in this setup. The
> passed through device sits at address 00:08.0, parent is the PCI host
> bridge 00:00.0.
> 
> (I'm trying to spy if Windows appears to restore or reset the PCI
> bridge prefetch registers upon resume)

If you want to suspend the guest, note that Windows refuses suspend
with the default VGA adapter (see "devicequery /a"). Try the QXL adapter
with https://gitlab.freedesktop.org/spice/win32/qxl-wddm-dod

-vga qxl -device qemu-xhci -device usb-tablet

Not sure how well tested this is, I had to patch Linux to avoid an oops.
If I try this on Windows, it successfully suspends ("info status" in
QEMU monitor says "paused (suspended)"), but resume ends up with a black
screen...

Luckily, the important information is already logged. Windows 10 indeed
seems to write to "Prefetchable Base Upper 32 Bits" on resume[1].
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl

[1]: QEMU output (annotated with register names) for
./run-vm.sh -device usb-tablet -vga qxl /tmp/w10.qcow2 -trace 
rp_read_config,file=/dev/stdout -trace rp_write_config,file=/dev/stdout


NET._PS3
32481@1536163097.415976:rp_write_config  (ioh3420, @0x12c, 0x0, len=0x4)
 AER: Root Error Command
32481@1536163097.415999:rp_write_config  (ioh3420, @0xac, 0x0, len=0x2) 
 PCIE: Root Control
32481@1536163097.416008:rp_read_config  (ioh3420, @0xac, len=0x2) 0x0   
 PCIE: Root Control
32481@1536163097.416017:rp_read_config  (ioh3420, @0xa0, len=0x2) 0x0   
 PCIE: Link Control
32481@1536163097.416024:rp_write_config  (ioh3420, @0xb0, 0x1, len=0x4) 
 PCIE: Root Status
32481@1536163097.416057:rp_read_config  (ioh3420, @0x4, len=0x2) 0x506  
 Command
32481@1536163097.416066:rp_read_config  (ioh3420, @0xc, len=0x1) 0x0
 Cacheline Size
32481@1536163097.416073:rp_read_config  (ioh3420, @0xd, len=0x1) 0x0
 Latency Timer
32481@1536163097.416081:rp_read_config  (ioh3420, @0x3c, len=0x1) 0x0   
 Interrupt Line
32481@1536163097.416088:rp_read_config  (ioh3420, @0x19, len=0x1) 0x1   
 Secondary Bus Number
32481@1536163097.416095:rp_read_config  (ioh3420, @0x1a, len=0x1) 0x1   
 Subordiante Bus Number
32481@1536163097.416103:rp_read_config  (ioh3420, @0x3e, len=0x2) 0x2   
 Bridge Control
32481@1536163097.416129:rp_read_config  (ioh3420, @0x0, len=0x2) 0x8086 
 Device ID
32481@1536163097.416136:rp_read_config  (ioh3420, @0x2, len=0x2) 0x3420 
 Vendor ID
32481@1536163097.416143:rp_read_config  (ioh3420, @0x8, len=0x1) 0x2
 Revision
32481@1536163097.416150:rp_read_config  (ioh3420, @0x9, len=0x1) 0x0
 Class Code
32481@1536163097.416156:rp_read_config  (ioh3420, @0xa, len=0x1) 0x4
  

Re: [Nouveau] [PATCH] PCI: Reprogram bridge prefetch registers on resume

2018-09-07 Thread Peter Wu
On Fri, Sep 07, 2018 at 01:36:14PM +0800, Daniel Drake wrote:
<..>
> Thomas Martitz reports that this workaround also solves an issue where
> the AMD Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive
> after S3 suspend/resume.

Where was this claimed? It is not stated in the linked bug:
(https://bugs.freedesktop.org/show_bug.cgi?id=105760

> On resume, reprogram the PCI bridge prefetch registers, including the
> magic register mentioned above.
> 
> This matches Win10 behaviour, which also rewrites these registers
> during S3 resume (checked with qemu tracing).

Windows 10 unconditionally rewrites these registers (BAR, I/O Base +
Limit, Memory Base + Limit, etc. from top to bottom), see annotations:
https://www.spinics.net/lists/linux-pci/msg75856.html

Linux has a generic "restore" operation that works backwards from the
end of the PCI config space to the beginning, see
pci_restore_config_space. Do you have a dmesg where you see the
"restoring config space at offset" messages?

Would it be reasonable to unconditionally write these registers in
pci_restore_config_dword, like Windows does?

Kind regards,
Peter
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] PCI: Reprogram bridge prefetch registers on resume

2018-09-11 Thread Peter Wu
On Fri, Sep 07, 2018 at 05:26:47PM -0500, Bjorn Helgaas wrote:
> [+cc LKML, Dave, Luming]
> 
> On Fri, Sep 07, 2018 at 05:05:15PM +0200, Peter Wu wrote:
> > On Fri, Sep 07, 2018 at 01:36:14PM +0800, Daniel Drake wrote:
> > <..>
> > > Thomas Martitz reports that this workaround also solves an issue where
> > > the AMD Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive
> > > after S3 suspend/resume.
> > 
> > Where was this claimed? It is not stated in the linked bug:
> > (https://bugs.freedesktop.org/show_bug.cgi?id=105760
> > 
> > > On resume, reprogram the PCI bridge prefetch registers, including the
> > > magic register mentioned above.
> > > 
> > > This matches Win10 behaviour, which also rewrites these registers
> > > during S3 resume (checked with qemu tracing).
> > 
> > Windows 10 unconditionally rewrites these registers (BAR, I/O Base +
> > Limit, Memory Base + Limit, etc. from top to bottom), see annotations:
> > https://www.spinics.net/lists/linux-pci/msg75856.html
> > 
> > Linux has a generic "restore" operation that works backwards from the
> > end of the PCI config space to the beginning, see
> > pci_restore_config_space. Do you have a dmesg where you see the
> > "restoring config space at offset" messages?
> > 
> > Would it be reasonable to unconditionally write these registers in
> > pci_restore_config_dword, like Windows does?
> 
> That sounds reasonable to me.
> 
> We did write them unconditionally, prior to 04d9c1a1100b ("[PATCH]
> PCI: Improve PCI config space writeback") [1].  That commit apparently
> fixed suspend on some laptop.
> 
> But at that time, we restored the config space in order of dword 0, 1,
> 2, ... 15, which means we restored the command register before the
> BARs and windows, and it's conceivable that the problem was the
> ordering, not the rewriting of the same value.
> 
> Only a week later, we reversed the order with 8b8c8d280ab2 ("[PATCH]
> PCI: reverse pci config space restore order") [2], which seems like a
> good idea and even includes a spec reference.  I found similar
> language in the Intel ICH 10 datasheet, sec 14.1.3 [3].
> 
> So it seems reasonable to me to try writing them unconditionally in
> reverse order (the same order we use today).

Some fields are readonly (device/vendor ID), others have side-effects
(write-to-clear semantics like Master Data Parity Error in the Status
register).

So in addition to always write the registers, what about writing only a
subset? Following the logic applied by Windows, for Type-1:

- Restore memory-related stuff:
  - Restore BAR0 (offset 0x10, dword).
  - Restore BAR1 (offset 0x14, dword).
  - Restore I/O Base + Limit (offset 0x1c, WORD, not dword).
  - Restore Memory Base + Limit (offset 0x20, dword).
  - Restore Prefetchable Memory Base + Limit (offset 0x24, dword).
  - Restore Prefetchable Base Upper 32 bits (offset 0x28, dword).
(+wait for 500us, why?)
  - Restore Prefetchable Limit Upper 32 bits (offset 0x2c, dword).
(+wait for 500us, why?)
  - Restore I/O Base + Limit Upper 32 bits (offset 0x30, dword).
(+wait for 500us, why?)
  - Restore Expansion ROM Address (offset 0x38, dword).
- Restore at offset 0x3c:
  - Restore Interrupt Line (offset 0x3c, byte).
  - Restore Bridge Control (offset 0x3e, word).
(+read value again, why? Some kind of flush?)
- Restore at offset 0x18:
  - Restore Primary Bus Number (offset 0x18, byte).
  - Restore Secondary Bus Number (offset 0x19, byte).
  - Restore Subordinate Bus Number (offset 0x1a, byte).
  - (Restore Secondary Status (offset 0x1b, byte)? Not seen in Windows.)
- Restore at offset 0x0c:
  - Cacheline Size (offset 0x0c, byte).
  - Primary Latency Timer (offset 0x0d, byte).
- Restore at offset 0x04:
  - Restore command register.
(+wait for 500us)
  - (Actually Windows clears BM+Mem+I/O bits, then sets
PCIE.DeviceControl=7, PCIE.DeviceControl2=0 and then enables
BM+Mem+I/O again).
  - Restore (or write to clear?) Status (offset 0x06, word).
- Note that the following fields are omitted (mostly readonly):
  - Device ID + Vendor ID (offset 0x00, dword).
  - Revision ID + Class Code  (offset 0x08, dword).
  - Header Type + BIST (offset 0x0e, byte + byte).
  - Capabilities Pointer + Reserved (offset 0x34, dword).
  - Interrupt Pin (offset 0x3d, byte).

This is more complicated than the current loop, but the logic of writing
addresses first and then configuring others sound reasonable to me.

Kind regards,
Peter

> 
> [1] 
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=04d9c1a1100b
> [2] 
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8b8c8d280ab2
> [3] Intel ICH 10 IBL 373635
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v2] PCI: Reprogram bridge prefetch registers on resume

2018-09-12 Thread Peter Wu
On Wed, Sep 12, 2018 at 02:45:23PM +0800, Daniel Drake wrote:
> On 38+ Intel-based Asus products, the nvidia GPU becomes unusable
> after S3 suspend/resume. The affected products include multiple
> generations of nvidia GPUs and Intel SoCs. After resume, nouveau logs
> many errors such as:
> 
> fifo: fault 00 [READ] at 00555000 engine 00 [GR] client 04 
> [HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
> DRM: failed to idle channel 0 [DRM]
> 
> Similarly, the nvidia proprietary driver also fails after resume
> (black screen, 100% CPU usage in Xorg process). We shipped a sample
> to Nvidia for diagnosis, and their response indicated that it's a
> problem with the parent PCI bridge (on the Intel SoC), not the GPU.
> 
> Runtime suspend/resume works fine, only S3 suspend is affected.
> 
> We found a workaround: on resume, rewrite the Intel PCI bridge
> 'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32). In
> the cases that I checked, this register has value 0 and we just have to
> rewrite that value.
> 
> Linux already saves and restores PCI config space during suspend/resume,
> but this register was being skipped because upon resume, it already
> has value 0 (the correct, pre-suspend value).
> 
> Intel appear to have previously acknowledged this behaviour and the
> requirement to rewrite this register.
> https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23
> 
> Based on that, rewrite the prefetch register values even when that
> appears unnecessary.
> 
> We have confirmed this solution on all the affected models we have
> in-hands (X542UQ, UX533FD, X530UN, V272UN).
> 
> Additionally, this solves an issue where r8169 MSI-X interrupts were
> broken after S3 suspend/resume on Asus X441UAR. This issue was recently
> worked around in commit 7bb05b85bc2d ("r8169: don't use MSI-X on
> RTL8106e"). It also fixes the same issue on RTL6186evl/8111evl on an
> Aimfor-tech laptop that we had not yet patched. I suspect it will also
> fix the issue that was worked around in commit 7c53a722459c ("r8169:
> don't use MSI-X on RTL8168g").
> 
> Thomas Martitz reports that this change also solves an issue where
> the AMD Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive
> after S3 suspend/resume.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069
> Signed-off-by: Daniel Drake 
> ---
> 
> Notes:
> Replaces patch:
>PCI: add prefetch quirk to work around Asus/Nvidia suspend issues
> 
> Some of the more verbose info was moved to bugzilla:
> https://bugzilla.kernel.org/show_bug.cgi?id=201069
> 
> This patch is aimed at v4.19 (and maybe v4.18-stable); we may follow
> up with more intrusive improvements for v4.20+.
> 
> v2: reimplement the register restore within the existing
> pci_restore_config_space() code.
> 
>  drivers/pci/pci.c | 31 +--
>  1 file changed, 21 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 29ff9619b5fa..e1704100e72d 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1289,13 +1289,15 @@ int pci_save_state(struct pci_dev *dev)
>  EXPORT_SYMBOL(pci_save_state);
>  
>  static void pci_restore_config_dword(struct pci_dev *pdev, int offset,
> -  u32 saved_val, int retry)
> +  u32 saved_val, int retry, bool force)
>  {
>   u32 val;
>  
> - pci_read_config_dword(pdev, offset, &val);
> - if (val == saved_val)
> - return;
> + if (!force) {
> + pci_read_config_dword(pdev, offset, &val);
> + if (val == saved_val)
> + return;
> + }
>  
>   for (;;) {
>   pci_dbg(pdev, "restoring config space at offset %#x (was %#x, 
> writing %#x)\n",

"val" is uninitialized and used in this debug message.

To avoid adding another parameter and changing unrelated lines below,
what about:

force = pdev->header_type == PCI_HEADER_TYPE_BRIDGE &&
0x24 <= start && start <= 0x2c;

(or start == 0x28 || start == 0x24 || start == 0x28 if you want to be
very explicit). This would reduce in a smaller diff for this temporary
solution.

Kind regards,
Peter

> @@ -1313,25 +1315,34 @@ static void pci_restore_config_dword(struct pci_dev 
> *pdev, int offset,
>  }
>  
>  static void pci_restore_config_space_range(struct pci_dev *pdev,
> -int start, int end, int retry)
> +int start, int end, int retry,
> +bool force)
>  {
>   int index;
>  
>   for (index = end; index >= start; index--)
>   pci_restore_config_dword(pdev, 4 * index,
>pdev->saved_config_space[index],
> -  retry);
> +  retry, force);
>  }
>  
>  static void pci_restore_config_space(str

Re: [Nouveau] [PATCH v3] PCI: Reprogram bridge prefetch registers on resume

2018-09-13 Thread Peter Wu
On Thu, Sep 13, 2018 at 08:52:29AM +0200, Rafael J. Wysocki wrote:
> On Thu, Sep 13, 2018 at 5:37 AM Daniel Drake  wrote:
> >
> > On 38+ Intel-based Asus products, the nvidia GPU becomes unusable
> > after S3 suspend/resume. The affected products include multiple
> > generations of nvidia GPUs and Intel SoCs. After resume, nouveau logs
> > many errors such as:
> >
> > fifo: fault 00 [READ] at 00555000 engine 00 [GR] client 04
> >   [HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
> > DRM: failed to idle channel 0 [DRM]
> >
> > Similarly, the nvidia proprietary driver also fails after resume
> > (black screen, 100% CPU usage in Xorg process). We shipped a sample
> > to Nvidia for diagnosis, and their response indicated that it's a
> > problem with the parent PCI bridge (on the Intel SoC), not the GPU.
> >
> > Runtime suspend/resume works fine, only S3 suspend is affected.
> >
> > We found a workaround: on resume, rewrite the Intel PCI bridge
> > 'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32). In
> > the cases that I checked, this register has value 0 and we just have to
> > rewrite that value.
> >
> > Linux already saves and restores PCI config space during suspend/resume,
> > but this register was being skipped because upon resume, it already
> > has value 0 (the correct, pre-suspend value).
> >
> > Intel appear to have previously acknowledged this behaviour and the
> > requirement to rewrite this register.
> > https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23
> >
> > Based on that, rewrite the prefetch register values even when that
> > appears unnecessary.
> >
> > We have confirmed this solution on all the affected models we have
> > in-hands (X542UQ, UX533FD, X530UN, V272UN).
> >
> > Additionally, this solves an issue where r8169 MSI-X interrupts were
> > broken after S3 suspend/resume on Asus X441UAR. This issue was recently
> > worked around in commit 7bb05b85bc2d ("r8169: don't use MSI-X on
> > RTL8106e"). It also fixes the same issue on RTL6186evl/8111evl on an
> > Aimfor-tech laptop that we had not yet patched. I suspect it will also
> > fix the issue that was worked around in commit 7c53a722459c ("r8169:
> > don't use MSI-X on RTL8168g").
> >
> > Thomas Martitz reports that this change also solves an issue where
> > the AMD Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive
> > after S3 suspend/resume.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069
> > Signed-off-by: Daniel Drake 
> 
> Still
> 
> Reviewed-by: Rafael J. Wysocki 

Reviewed-By: Peter Wu 

> > ---
> >  drivers/pci/pci.c | 25 +
> >  1 file changed, 17 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 29ff9619b5fa..5d58220b6997 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -1289,12 +1289,12 @@ int pci_save_state(struct pci_dev *dev)
> >  EXPORT_SYMBOL(pci_save_state);
> >
> >  static void pci_restore_config_dword(struct pci_dev *pdev, int offset,
> > -u32 saved_val, int retry)
> > +u32 saved_val, int retry, bool force)
> >  {
> > u32 val;
> >
> > pci_read_config_dword(pdev, offset, &val);
> > -   if (val == saved_val)
> > +   if (!force && val == saved_val)
> > return;
> >
> > for (;;) {
> > @@ -1313,25 +1313,34 @@ static void pci_restore_config_dword(struct pci_dev 
> > *pdev, int offset,
> >  }
> >
> >  static void pci_restore_config_space_range(struct pci_dev *pdev,
> > -  int start, int end, int retry)
> > +  int start, int end, int retry,
> > +  bool force)
> >  {
> > int index;
> >
> > for (index = end; index >= start; index--)
> > pci_restore_config_dword(pdev, 4 * index,
> >  pdev->saved_config_space[index],
> > -retry);
> > +retry, force);
> >  }
> >
> >  static void pci_restore_config_space(struct pci_dev *pdev)
> >  {
> > if (pdev->hdr_type == PCI_HEADER_TYPE_NORMAL) {
> > -   pci_restore_config_

[Nouveau] lockdep splat while exiting PRIME

2014-06-08 Thread Peter Wu
Hi,

While trying PRIME, I got a lockdep warning after exiting glxgears. Is
it harmful? The command was:

DRI_PRIME=1 glxgears

Offload provider is a GT425M (NVC0), output sink is an Intel i5-460M.

Kind regards,
Peter

dmesg:
=
[ INFO: possible recursive locking detected ]
3.15.0-rc8-custom-00058-gd2cfd31 #1 Tainted: G   O 
-
X/25827 is trying to acquire lock:
 (&dev->struct_mutex){+.+.+.}, at: [] 
i915_gem_unmap_dma_buf+0x36/0xd0 [i915]

but task is already holding lock:
 (&dev->struct_mutex){+.+.+.}, at: [] 
drm_gem_object_handle_unreference_unlocked+0x105/0x130 [drm]

other info that might help us debug this:
 Possible unsafe locking scenario:
   CPU0
   
  lock(&dev->struct_mutex);
  lock(&dev->struct_mutex);

 *** DEADLOCK ***
 May be due to missing lock nesting notation
1 lock held by X/25827:
 #0:  (&dev->struct_mutex){+.+.+.}, at: [] 
drm_gem_object_handle_unreference_unlocked+0x105/0x130 [drm]

stack backtrace:
CPU: 1 PID: 25827 Comm: X Tainted: G   O  
3.15.0-rc8-custom-00058-gd2cfd31 #1
Hardware name: CLEVO CO.B7130   
/B7130   , BIOS 6.00 08/27/2010
 822588a0 880230767ae0 815f14da 880226594260
 880230767bb0 810a1461 30767bc0 880226594288
 880230767b00 880226594ae0 00464232 0001
Call Trace:
 [] dump_stack+0x4e/0x7a
 [] __lock_acquire+0x19d1/0x1ab0
 [] lock_acquire+0x95/0x130
 [] ? i915_gem_unmap_dma_buf+0x36/0xd0 [i915]
 [] ? i915_gem_unmap_dma_buf+0x36/0xd0 [i915]
 [] mutex_lock_nested+0x65/0x400
 [] ? i915_gem_unmap_dma_buf+0x36/0xd0 [i915]
 [] i915_gem_unmap_dma_buf+0x36/0xd0 [i915]
 [] dma_buf_unmap_attachment+0x4c/0x70
 [] drm_prime_gem_destroy+0x22/0x40 [drm]
 [] nouveau_gem_object_del+0x3e/0x60 [nouveau]
 [] drm_gem_object_free+0x2a/0x40 [drm]
 [] drm_gem_object_handle_unreference_unlocked+0x128/0x130 
[drm]
 [] drm_gem_handle_delete+0xba/0x110 [drm]
 [] drm_gem_close_ioctl+0x25/0x30 [drm]
 [] drm_ioctl+0x1e0/0x5f0 [drm]
 [] ? drm_gem_handle_create+0x40/0x40 [drm]
 [] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [] ? trace_hardirqs_on_caller+0x15d/0x200
 [] ? trace_hardirqs_on+0xd/0x10
 [] ? _raw_spin_unlock_irqrestore+0x42/0x80
 [] nouveau_drm_ioctl+0x65/0xa0 [nouveau]
 [] do_vfs_ioctl+0x2f0/0x4f0
 [] ? __fget+0xac/0xf0
 [] ? __fget+0x5/0xf0
 [] SyS_ioctl+0x81/0xa0
 [] system_call_fastpath+0x16/0x1b
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH] drm/nouveau: check function before using it

2016-05-14 Thread Peter Wu
Do not unconditionally invoke function 0x1B without checking for its
availability, it leads to an infinite loop on some firmware.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104791
Fixes: 5addcf0a5f0fad ("nouveau: add runtime PM support (v0.9)")
Signed-off-by: Peter Wu 
---
Hi,

I am not sure what exactly 0x1B is used for, it seems related to HDMI (Audio?).
Bit0 of Arg3 is "OPFL", bit 1 is "OPVL" according to some firmware.

On some laptops (P651RA) it sets "SGFL" (Switchable Graphics FLags?), on others
it has no side-effects but returns a value based on Arg3 (Clevo P155M).
A Medion P7624 stores values to "OPTF" (OPTimus Flags?) and if this flag is
true, \_SB.PCI0.PEG0.PEGP._ON will write zero to "NHDM".

While NOUVEAU_DSM_HAS_OPT is removed from this patch, I kept the flags return
value which can be useful for other probe outcomes (such as whether _PR3 is
found on the parent device).

Kinds regards,
Peter
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 41 +-
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index cdf5227..002371a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -45,6 +45,7 @@
 static struct nouveau_dsm_priv {
bool dsm_detected;
bool optimus_detected;
+   bool flags_func_detected;
acpi_handle dhandle;
acpi_handle rom_handle;
 } nouveau_dsm_priv;
@@ -58,7 +59,6 @@ bool nouveau_is_v1_dsm(void) {
 }
 
 #define NOUVEAU_DSM_HAS_MUX 0x1
-#define NOUVEAU_DSM_HAS_OPT 0x2
 
 #ifdef CONFIG_VGA_SWITCHEROO
 static const char nouveau_dsm_muid[] = {
@@ -110,9 +110,9 @@ static int nouveau_optimus_dsm(acpi_handle handle, int 
func, int arg, uint32_t *
  * requirements on the fourth parameter, so a private implementation
  * instead of using acpi_check_dsm().
  */
-static int nouveau_check_optimus_dsm(acpi_handle handle)
+static uint32_t nouveau_check_optimus_dsm(acpi_handle handle)
 {
-   int result;
+   uint32_t result;
 
/*
 * Function 0 returns a Buffer containing available functions.
@@ -123,9 +123,13 @@ static int nouveau_check_optimus_dsm(acpi_handle handle)
 
/*
 * ACPI Spec v4 9.14.1: if bit 0 is zero, no function is supported.
-* If the n-th bit is enabled, function n is supported
+* If the n-th bit is enabled, function n is supported.
+* Check for both bit zero and the NOUVEAU_DSM_OPTIMUS_CAPS since
+* some implementations return 0x8001 on invalid parameters.
 */
-   return result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS);
+   if (result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS))
+   return result;
+   return 0;
 }
 
 static int nouveau_dsm(acpi_handle handle, int func, int arg)
@@ -212,7 +216,7 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
.get_client_id = nouveau_dsm_get_client_id,
 };
 
-static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
+static int nouveau_dsm_pci_probe(struct pci_dev *pdev, uint32_t *optimus_funcs)
 {
acpi_handle dhandle;
int retval = 0;
@@ -228,10 +232,8 @@ static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
   1 << NOUVEAU_DSM_POWER))
retval |= NOUVEAU_DSM_HAS_MUX;
 
-   if (nouveau_check_optimus_dsm(dhandle))
-   retval |= NOUVEAU_DSM_HAS_OPT;
-
-   if (retval & NOUVEAU_DSM_HAS_OPT) {
+   *optimus_funcs = nouveau_check_optimus_dsm(dhandle);
+   if (*optimus_funcs) {
uint32_t result;
nouveau_optimus_dsm(dhandle, NOUVEAU_DSM_OPTIMUS_CAPS, 0,
&result);
@@ -252,7 +254,7 @@ static bool nouveau_dsm_detect(void)
struct acpi_buffer buffer = {sizeof(acpi_method_name), 
acpi_method_name};
struct pci_dev *pdev = NULL;
int has_dsm = 0;
-   int has_optimus = 0;
+   uint32_t optimus_funcs = 0;
int vga_count = 0;
bool guid_valid;
int retval;
@@ -268,30 +270,28 @@ static bool nouveau_dsm_detect(void)
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != 
NULL) {
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
+   retval = nouveau_dsm_pci_probe(pdev, &optimus_funcs);
if (retval & NOUVEAU_DSM_HAS_MUX)
has_dsm |= 1;
-   if (retval & NOUVEAU_DSM_HAS_OPT)
-   has_optimus = 1;
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
+   retval = nouveau_dsm_pci_probe(pdev, &optimu

[Nouveau] [PATCH v2] drm/nouveau: check function before using it

2016-05-14 Thread Peter Wu
Do not unconditionally invoke function 0x1B without checking for its
availability, it leads to an infinite loop on some firmware.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104791
Fixes: 5addcf0a5f0fad ("nouveau: add runtime PM support (v0.9)")
Signed-off-by: Peter Wu 
---
 v2: only write optimus_funcs when an Optimus DSM is found.
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 43 ++
 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index cdf5227..009712a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -45,6 +45,7 @@
 static struct nouveau_dsm_priv {
bool dsm_detected;
bool optimus_detected;
+   bool flags_func_detected;
acpi_handle dhandle;
acpi_handle rom_handle;
 } nouveau_dsm_priv;
@@ -58,7 +59,6 @@ bool nouveau_is_v1_dsm(void) {
 }
 
 #define NOUVEAU_DSM_HAS_MUX 0x1
-#define NOUVEAU_DSM_HAS_OPT 0x2
 
 #ifdef CONFIG_VGA_SWITCHEROO
 static const char nouveau_dsm_muid[] = {
@@ -110,9 +110,9 @@ static int nouveau_optimus_dsm(acpi_handle handle, int 
func, int arg, uint32_t *
  * requirements on the fourth parameter, so a private implementation
  * instead of using acpi_check_dsm().
  */
-static int nouveau_check_optimus_dsm(acpi_handle handle)
+static uint32_t nouveau_check_optimus_dsm(acpi_handle handle)
 {
-   int result;
+   uint32_t result;
 
/*
 * Function 0 returns a Buffer containing available functions.
@@ -123,9 +123,13 @@ static int nouveau_check_optimus_dsm(acpi_handle handle)
 
/*
 * ACPI Spec v4 9.14.1: if bit 0 is zero, no function is supported.
-* If the n-th bit is enabled, function n is supported
+* If the n-th bit is enabled, function n is supported.
+* Check for both bit zero and the NOUVEAU_DSM_OPTIMUS_CAPS since
+* some implementations return 0x8001 on invalid parameters.
 */
-   return result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS);
+   if (result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS))
+   return result;
+   return 0;
 }
 
 static int nouveau_dsm(acpi_handle handle, int func, int arg)
@@ -212,10 +216,11 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
.get_client_id = nouveau_dsm_get_client_id,
 };
 
-static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
+static int nouveau_dsm_pci_probe(struct pci_dev *pdev, uint32_t *optimus_funcs)
 {
acpi_handle dhandle;
int retval = 0;
+   uint32_t supported_funcs;
 
dhandle = ACPI_HANDLE(&pdev->dev);
if (!dhandle)
@@ -228,11 +233,10 @@ static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
   1 << NOUVEAU_DSM_POWER))
retval |= NOUVEAU_DSM_HAS_MUX;
 
-   if (nouveau_check_optimus_dsm(dhandle))
-   retval |= NOUVEAU_DSM_HAS_OPT;
-
-   if (retval & NOUVEAU_DSM_HAS_OPT) {
+   supported_funcs = nouveau_check_optimus_dsm(dhandle);
+   if (supported_funcs) {
uint32_t result;
+   *optimus_funcs = supported_funcs;
nouveau_optimus_dsm(dhandle, NOUVEAU_DSM_OPTIMUS_CAPS, 0,
&result);
dev_info(&pdev->dev, "optimus capabilities: %s, status %s%s\n",
@@ -252,7 +256,7 @@ static bool nouveau_dsm_detect(void)
struct acpi_buffer buffer = {sizeof(acpi_method_name), 
acpi_method_name};
struct pci_dev *pdev = NULL;
int has_dsm = 0;
-   int has_optimus = 0;
+   uint32_t optimus_funcs = 0;
int vga_count = 0;
bool guid_valid;
int retval;
@@ -268,30 +272,28 @@ static bool nouveau_dsm_detect(void)
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != 
NULL) {
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
+   retval = nouveau_dsm_pci_probe(pdev, &optimus_funcs);
if (retval & NOUVEAU_DSM_HAS_MUX)
has_dsm |= 1;
-   if (retval & NOUVEAU_DSM_HAS_OPT)
-   has_optimus = 1;
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
+   retval = nouveau_dsm_pci_probe(pdev, &optimus_funcs);
if (retval & NOUVEAU_DSM_HAS_MUX)
has_dsm |= 1;
-   if (retval & NOUVEAU_DSM_HAS_OPT)
-   has_optimus = 1;
}
 
/* find the optimus DSM or the old v1 DSM */
-   if (has_optimus == 1) {
+   if (optimus_funcs) {
acpi_get_name(nouveau_dsm_priv.dhandle, ACPI_FU

Re: [Nouveau] [PATCH] gpu/nouveau/nouveau_acpi.c: Fix Type Mismatch ACPI warning

2016-05-22 Thread Peter Wu
On Fri, May 20, 2016 at 02:22:57AM +, Marcos Souza wrote:
[..]
> > > I don't know if this is the right thing to do, I just looked at
> > intel_acpi.c to check how to use/check for ACPI Package.
> > > The patch below silenced the "type mismatch" warnings, and some of the
> > "evaluated _DSM" ones.
> > >
> > > If this is not the right approach, please let me know how to fix it, I
> > don't have knowledge in ACPI, but I really want to help.
> > >
> > >  drivers/gpu/drm/nouveau/nouveau_acpi.c | 14 +-
> > >  1 file changed, 1 insertion(+), 13 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > > index cdf5227..f04aef3 100644
> > > --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > > +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > > @@ -73,22 +73,10 @@ static const char nouveau_op_dsm_muid[] = {
> > >
> > >  static int nouveau_optimus_dsm(acpi_handle handle, int func, int arg,
> > uint32_t *result)
> > >  {
> > > - int i;
> > >   union acpi_object *obj;
> > > - char args_buff[4];
> > > - union acpi_object argv4 = {
> > > - .buffer.type = ACPI_TYPE_BUFFER,
> > > - .buffer.length = 4,
> > > - .buffer.pointer = args_buff
> > > - };
> > > -
> > > - /* ACPI is little endian, AABBCCDD becomes {DD,CC,BB,AA} */
> > > - for (i = 0; i < 4; i++)
> > > - args_buff[i] = (arg >> i * 8) & 0xFF;
> > > -
> > >   *result = 0;
> > >   obj = acpi_evaluate_dsm_typed(handle, nouveau_op_dsm_muid,
> > 0x0100,
> > > -   func, &argv4, ACPI_TYPE_BUFFER);
> > > +   func, NULL, ACPI_TYPE_PACKAGE);

This effectively removes the fourth parameter (actually, using a zero
as fourth argument), making the function useless.

> > The last parameter you give to `acpi_evaluate_dsm_typed()` is the return
> > type
> > you expect (see [3]), which will be a buffer if func is 0, and is
> > implementation dependent otherwise (see section 9.14.1 _DSM of [4]). So you
> > don’t want to change it to ACPI_TYPE_PACKAGE. If you look at the
> > implementation
> > of `acpi_evaluate_dsm()` (which is called by `acpi_evaluate_dsm_typed()`),
> > it
> > will automatically create a package for the 4th argument, if you pass it a
> > NULL
> > pointer (see [5]).
> >
> > [3]:
> > https://github.com/torvalds/linux/blob/46c13450624e36302547a2ac3695f2350fe7ffc3/include/acpi/acpi_bus.h#L69
> > [4]: http://www.acpi.info/DOWNLOADS/ACPI_5_Errata%20A.pdf
> > [5]:
> > https://github.com/torvalds/linux/blob/46c13450624e36302547a2ac3695f2350fe7ffc3/drivers/acpi/utils.c#L628
> 
> 
> Thanks for all the links. I'll read the docs and send a new version of the
> patch when it makes more sense instead of just replacing random things.

The warning is unavoidable, most firmware expect a Buffer for the fourth
argument (counting from 0, this is Arg3). Excerpt from the DSM method on
a recent Skylake laptop (Clevo P651, but this format is found on many
other models from various manufacturers too):

If ((Arg2 == 0x1A))
{
CreateField (Arg3, 0x18, 0x02, OMPR)
CreateField (Arg3, Zero, One, FLCH)
CreateField (Arg3, One, One, DVSR)
CreateField (Arg3, 0x02, One, DVSC)

(The sections below refer to the one in the ACPI 6.1 document that can
be found at http://uefi.org/specifications.)

The first parameter for CreateField is evaluated as buffer (sec 19.6.21).
According to 19.3.5.6 (Data Types and Type Conversions) an implicit
conversion to a Buffer is only possible from an Integer and String, a
Package does not belong to the possibilities.

Note that the return value may be an integer for unsupported revision
IDs or UUIDs (like 0x8002). These should be compatible with Buffers
though as stated above and acpi_check_dsm() can handle that case, but
unfortunately sets a Package as fourth argument and can therefore not be
used in nouveau.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH 3/4] drm/nouveau/acpi: check for function 0x1B before using it

2016-05-24 Thread Peter Wu
Do not unconditionally invoke function 0x1B without checking for its
availability, it leads to an infinite loop on some firmware.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104791
Fixes: 5addcf0a5f0fad ("nouveau: add runtime PM support (v0.9)")
Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 17 -
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 71d5e6a..df9f73e 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -45,6 +45,7 @@
 static struct nouveau_dsm_priv {
bool dsm_detected;
bool optimus_detected;
+   bool optimus_flags_detected;
acpi_handle dhandle;
acpi_handle rom_handle;
 } nouveau_dsm_priv;
@@ -212,7 +213,7 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
 };
 
 static void nouveau_dsm_pci_probe(struct pci_dev *pdev, bool *has_mux,
- bool *has_opt)
+ bool *has_opt, bool *has_opt_flags)
 {
acpi_handle dhandle;
bool supports_mux;
@@ -236,6 +237,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
bool *has_mux,
nouveau_dsm_priv.dhandle = dhandle;
*has_mux = supports_mux;
*has_opt = !!optimus_funcs;
+   *has_opt_flags = optimus_funcs & (1 << NOUVEAU_DSM_OPTIMUS_FLAGS);
 
if (optimus_funcs) {
uint32_t result;
@@ -255,6 +257,7 @@ static bool nouveau_dsm_detect(void)
struct pci_dev *pdev = NULL;
bool has_mux = false;
bool has_optimus = false;
+   bool has_optimus_flags = false;
int vga_count = 0;
bool guid_valid;
bool ret = false;
@@ -269,13 +272,15 @@ static bool nouveau_dsm_detect(void)
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != 
NULL) {
vga_count++;
 
-   nouveau_dsm_pci_probe(pdev, &has_mux, &has_optimus);
+   nouveau_dsm_pci_probe(pdev, &has_mux, &has_optimus,
+ &has_optimus_flags);
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
-   nouveau_dsm_pci_probe(pdev, &has_mux, &has_optimus);
+   nouveau_dsm_pci_probe(pdev, &has_mux, &has_optimus,
+ &has_optimus_flags);
}
 
/* find the optimus DSM or the old v1 DSM */
@@ -285,6 +290,7 @@ static bool nouveau_dsm_detect(void)
printk(KERN_INFO "VGA switcheroo: detected Optimus DSM method 
%s handle\n",
acpi_method_name);
nouveau_dsm_priv.optimus_detected = true;
+   nouveau_dsm_priv.optimus_flags_detected = has_optimus_flags;
ret = true;
} else if (vga_count == 2 && has_mux && guid_valid) {
acpi_get_name(nouveau_dsm_priv.dhandle, ACPI_FULL_PATHNAME,
@@ -317,8 +323,9 @@ void nouveau_switcheroo_optimus_dsm(void)
if (!nouveau_dsm_priv.optimus_detected)
return;
 
-   nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, NOUVEAU_DSM_OPTIMUS_FLAGS,
-   0x3, &result);
+   if (nouveau_dsm_priv.optimus_flags_detected)
+   nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, 
NOUVEAU_DSM_OPTIMUS_FLAGS,
+   0x3, &result);
 
nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, NOUVEAU_DSM_OPTIMUS_CAPS,
NOUVEAU_DSM_OPTIMUS_SET_POWERDOWN, &result);
-- 
2.8.2

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH 2/4] drm/nouveau/acpi: return supported DSM functions

2016-05-24 Thread Peter Wu
Return the set of supported functions to the caller. No functional
changes.

Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 45fa9b2..71d5e6a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -107,7 +107,7 @@ static int nouveau_optimus_dsm(acpi_handle handle, int 
func, int arg, uint32_t *
  * requirements on the fourth parameter, so a private implementation
  * instead of using acpi_check_dsm().
  */
-static int nouveau_check_optimus_dsm(acpi_handle handle)
+static int nouveau_dsm_get_optimus_functions(acpi_handle handle)
 {
int result;
 
@@ -122,7 +122,9 @@ static int nouveau_check_optimus_dsm(acpi_handle handle)
 * ACPI Spec v4 9.14.1: if bit 0 is zero, no function is supported.
 * If the n-th bit is enabled, function n is supported
 */
-   return result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS);
+   if (result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS))
+   return result;
+   return 0;
 }
 
 static int nouveau_dsm(acpi_handle handle, int func, int arg)
@@ -214,7 +216,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
bool *has_mux,
 {
acpi_handle dhandle;
bool supports_mux;
-   bool supports_opt;
+   int optimus_funcs;
 
dhandle = ACPI_HANDLE(&pdev->dev);
if (!dhandle)
@@ -225,17 +227,17 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
bool *has_mux,
 
supports_mux = acpi_check_dsm(dhandle, nouveau_dsm_muid, 0x0102,
  1 << NOUVEAU_DSM_POWER);
-   supports_opt = nouveau_check_optimus_dsm(dhandle);
+   optimus_funcs = nouveau_dsm_get_optimus_functions(dhandle);
 
/* Does not look like a Nvidia device. */
-   if (!supports_mux && !supports_opt)
+   if (!supports_mux && !optimus_funcs)
return;
 
nouveau_dsm_priv.dhandle = dhandle;
*has_mux = supports_mux;
-   *has_opt = supports_opt;
+   *has_opt = !!optimus_funcs;
 
-   if (supports_opt) {
+   if (optimus_funcs) {
uint32_t result;
nouveau_optimus_dsm(dhandle, NOUVEAU_DSM_OPTIMUS_CAPS, 0,
&result);
-- 
2.8.2

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH 1/4] drm/nouveau/acpi: ensure matching ACPI handle and supported functions

2016-05-24 Thread Peter Wu
Ensure that the returned set of supported DSM functions (MUX, Optimus)
match the ACPI handle that is set in nouveau_dsm_pci_probe.

As there are no machines with a MUX function on just one PCI device and
an Optimus on another, there should not be a functional impact. This
change however makes this implicit assumption more obvious.

Convert int to bool and rename has_dsm to has_mux while at it.

Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 55 ++
 1 file changed, 23 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index cdf5227..45fa9b2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -57,9 +57,6 @@ bool nouveau_is_v1_dsm(void) {
return nouveau_dsm_priv.dsm_detected;
 }
 
-#define NOUVEAU_DSM_HAS_MUX 0x1
-#define NOUVEAU_DSM_HAS_OPT 0x2
-
 #ifdef CONFIG_VGA_SWITCHEROO
 static const char nouveau_dsm_muid[] = {
0xA0, 0xA0, 0x95, 0x9D, 0x60, 0x00, 0x48, 0x4D,
@@ -212,26 +209,33 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
.get_client_id = nouveau_dsm_get_client_id,
 };
 
-static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
+static void nouveau_dsm_pci_probe(struct pci_dev *pdev, bool *has_mux,
+ bool *has_opt)
 {
acpi_handle dhandle;
-   int retval = 0;
+   bool supports_mux;
+   bool supports_opt;
 
dhandle = ACPI_HANDLE(&pdev->dev);
if (!dhandle)
-   return false;
+   return;
 
if (!acpi_has_method(dhandle, "_DSM"))
-   return false;
+   return;
+
+   supports_mux = acpi_check_dsm(dhandle, nouveau_dsm_muid, 0x0102,
+ 1 << NOUVEAU_DSM_POWER);
+   supports_opt = nouveau_check_optimus_dsm(dhandle);
 
-   if (acpi_check_dsm(dhandle, nouveau_dsm_muid, 0x0102,
-  1 << NOUVEAU_DSM_POWER))
-   retval |= NOUVEAU_DSM_HAS_MUX;
+   /* Does not look like a Nvidia device. */
+   if (!supports_mux && !supports_opt)
+   return;
 
-   if (nouveau_check_optimus_dsm(dhandle))
-   retval |= NOUVEAU_DSM_HAS_OPT;
+   nouveau_dsm_priv.dhandle = dhandle;
+   *has_mux = supports_mux;
+   *has_opt = supports_opt;
 
-   if (retval & NOUVEAU_DSM_HAS_OPT) {
+   if (supports_opt) {
uint32_t result;
nouveau_optimus_dsm(dhandle, NOUVEAU_DSM_OPTIMUS_CAPS, 0,
&result);
@@ -240,10 +244,6 @@ static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
 (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic power, " 
: "",
 (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios codec 
supported" : "");
}
-   if (retval)
-   nouveau_dsm_priv.dhandle = dhandle;
-
-   return retval;
 }
 
 static bool nouveau_dsm_detect(void)
@@ -251,11 +251,10 @@ static bool nouveau_dsm_detect(void)
char acpi_method_name[255] = { 0 };
struct acpi_buffer buffer = {sizeof(acpi_method_name), 
acpi_method_name};
struct pci_dev *pdev = NULL;
-   int has_dsm = 0;
-   int has_optimus = 0;
+   bool has_mux = false;
+   bool has_optimus = false;
int vga_count = 0;
bool guid_valid;
-   int retval;
bool ret = false;
 
/* lookup the MXM GUID */
@@ -268,32 +267,24 @@ static bool nouveau_dsm_detect(void)
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != 
NULL) {
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
-   if (retval & NOUVEAU_DSM_HAS_MUX)
-   has_dsm |= 1;
-   if (retval & NOUVEAU_DSM_HAS_OPT)
-   has_optimus = 1;
+   nouveau_dsm_pci_probe(pdev, &has_mux, &has_optimus);
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
-   if (retval & NOUVEAU_DSM_HAS_MUX)
-   has_dsm |= 1;
-   if (retval & NOUVEAU_DSM_HAS_OPT)
-   has_optimus = 1;
+   nouveau_dsm_pci_probe(pdev, &has_mux, &has_optimus);
}
 
/* find the optimus DSM or the old v1 DSM */
-   if (has_optimus == 1) {
+   if (has_optimus) {
acpi_get_name(nouveau_dsm_priv.dhandle, ACPI_FULL_PATHNAME,
&buffer);
printk(KERN_INFO "VGA switcheroo: detected Optimus DSM method 
%s handle\n",
acpi_method_name);
nouveau_dsm_priv.optimus_

[Nouveau] [PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-05-24 Thread Peter Wu
Since "PCI: Add runtime PM support for PCIe ports", the parent PCIe port
can be runtime-suspended which disables power resources via ACPI. This
is incompatible with DSM, resulting in a GPU device which is still in D3
and locks up the kernel on resume.

Mirror the behavior of Windows 8 and newer[1] (as observed via an AMLi
debugger trace) and stop using the DSM functions for D3cold when power
resources are available on the parent PCIe port.

 [1]: 
https://msdn.microsoft.com/windows/hardware/drivers/bringup/firmware-requirements-for-d3cold

Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 34 ++
 1 file changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index df9f73e..e469df7 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -46,6 +46,7 @@ static struct nouveau_dsm_priv {
bool dsm_detected;
bool optimus_detected;
bool optimus_flags_detected;
+   bool optimus_skip_dsm;
acpi_handle dhandle;
acpi_handle rom_handle;
 } nouveau_dsm_priv;
@@ -212,8 +213,26 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
.get_client_id = nouveau_dsm_get_client_id,
 };
 
+/* Firmware supporting Windows 8 or later do not use _DSM to put the device 
into
+ * D3cold, they instead rely on disabling power resources on the parent. */
+static bool nouveau_pr3_present(struct pci_dev *pdev)
+{
+   struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
+   struct acpi_device *ad;
+
+   if (!parent_pdev)
+   return false;
+
+   ad = ACPI_COMPANION(&parent_pdev->dev);
+   if (!ad)
+   return false;
+
+   return ad->power.flags.power_resources;
+}
+
 static void nouveau_dsm_pci_probe(struct pci_dev *pdev, bool *has_mux,
- bool *has_opt, bool *has_opt_flags)
+ bool *has_opt, bool *has_opt_flags,
+ bool *has_power_resources)
 {
acpi_handle dhandle;
bool supports_mux;
@@ -238,6 +257,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
bool *has_mux,
*has_mux = supports_mux;
*has_opt = !!optimus_funcs;
*has_opt_flags = optimus_funcs & (1 << NOUVEAU_DSM_OPTIMUS_FLAGS);
+   *has_power_resources = false;
 
if (optimus_funcs) {
uint32_t result;
@@ -247,6 +267,8 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
bool *has_mux,
 (result & OPTIMUS_ENABLED) ? "enabled" : "disabled",
 (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic power, " 
: "",
 (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios codec 
supported" : "");
+
+   *has_power_resources = nouveau_pr3_present(pdev);
}
 }
 
@@ -258,6 +280,7 @@ static bool nouveau_dsm_detect(void)
bool has_mux = false;
bool has_optimus = false;
bool has_optimus_flags = false;
+   bool has_power_resources = false;
int vga_count = 0;
bool guid_valid;
bool ret = false;
@@ -273,14 +296,14 @@ static bool nouveau_dsm_detect(void)
vga_count++;
 
nouveau_dsm_pci_probe(pdev, &has_mux, &has_optimus,
- &has_optimus_flags);
+ &has_optimus_flags, &has_power_resources);
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
nouveau_dsm_pci_probe(pdev, &has_mux, &has_optimus,
- &has_optimus_flags);
+ &has_optimus_flags, &has_power_resources);
}
 
/* find the optimus DSM or the old v1 DSM */
@@ -289,8 +312,11 @@ static bool nouveau_dsm_detect(void)
&buffer);
printk(KERN_INFO "VGA switcheroo: detected Optimus DSM method 
%s handle\n",
acpi_method_name);
+   if (has_power_resources)
+   pr_info("nouveau: detected PR support, will not use 
DSM\n");
nouveau_dsm_priv.optimus_detected = true;
nouveau_dsm_priv.optimus_flags_detected = has_optimus_flags;
+   nouveau_dsm_priv.optimus_skip_dsm = has_power_resources;
ret = true;
} else if (vga_count == 2 && has_mux && guid_valid) {
acpi_get_name(nouveau_dsm_priv.dhandle, ACPI_FULL_PATHNAME,
@@ -320,7 +346,7 @@ void nouveau_register_dsm_handler(void)
 void nouveau_switcheroo_optimus_dsm(void)
 {
  

[Nouveau] [PATCH 0/4] nouveau fixes for RPM/Optimus-related hangs

2016-05-24 Thread Peter Wu
Hi,

Here are two patches to fix an issue reported on kernel bugzilla (infinite loop
due to unchecked function) and a more important fix to fix hanging Optimus
machines when runtime PM is enabled (with pm/pci patches).

An older (obsolete) patch for the first issue was tested by the reporter:
https://bugzilla.kernel.org/show_bug.cgi?id=104791#c11
(it is replaced by "check for function 0x1B before using it").

The second issue will occur when:
 - A modern Optimus laptop is in use (designed for Windows 8 or newer).
 - nouveau runtime PM is enabled (1 or the default -1).
 - The patch "PCI: Add runtime PM support for PCIe ports" from Mika is pulled
   into v4.7 (or v4.8[1]?) via the pci/pm branch,
   
https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?h=pci/pm&id=8b71f5652eeac561acf883da01ab4810f763ee42
(see also the discussion for "[PATCH] PCI: Power on bridges before scanning new
devices" at http://article.gmane.org/gmane.linux.power-management.general/76411)

The first two patches are just refactoring to reduce code duplication (and
scratch an itch) and make the following patches possible. The next two patches
fix the problems reported above.

I intend to get these patches in 4.7 (or the first version where pci/pm gets
merged) to avoid a lockup when runpm is enabled. Note:
 - If the fourth patch is merged before/without Mika's PCIe port patch, then
   those modern Optimus machines above will not be put into D3cold.
 - If the fourth patch is not merged (or merged after Mika's patch), then under
   the above conditions the affected machine can lock up.
 - The three other patches are unrelated to this issue and can safely be merged.

Tested with:
 - Linux v4.6 + pci/pm + these four patches
 - Hardware: Clevo P651RA with acpi_osi="!Windows 2015" (the latter is a
   workaround for another PCIe issue).
 - Card is asleep, woke up with lspci, waited a bit and retried/suspended:
   - # lspci -nn >/dev/null; sleep 5
   - # lspci -nn >/dev/null; sleep 5; systemctl suspend
   - # lspci -nn >/dev/null; systemctl suspend

Kind regards,
Peter

 [1]: https://lkml.kernel.org/r/20160524211309.gh1...@lahna.fi.intel.com

Peter Wu (4):
  drm/nouveau/acpi: ensure matching ACPI handle and supported functions
  drm/nouveau/acpi: return supported DSM functions
  drm/nouveau/acpi: check for function 0x1B before using it
  drm/nouveau/acpi: fix lockup with PCIe runtime PM

 drivers/gpu/drm/nouveau/nouveau_acpi.c | 100 +
 1 file changed, 63 insertions(+), 37 deletions(-)

-- 
2.8.2

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 1/9] drm/nouveau: Don't leak runtime pm ref on driver unload

2016-05-26 Thread Peter Wu
On Tue, May 24, 2016 at 06:03:27PM +0200, Lukas Wunner wrote:
> nouveau_drm_load() calls pm_runtime_put() if nouveau_runtime_pm != 0,
> but nouveau_drm_unload() calls pm_runtime_get_sync() unconditionally.
> We therefore leak a runtime pm ref whenever nouveau is loaded with
> runpm=0 and then unloaded. The GPU will subsequently never runtime
> suspend even if nouveau is loaded again with runpm=1.
> 
> Fix by taking the runtime pm ref under the same condition that it was
> released on driver load.
> 
> Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
> Cc: Dave Airlie 
> Reported-by: Karol Herbst 
> Tested-by: Karol Herbst 
> Signed-off-by: Lukas Wunner 

Looks good, I tested this scenario:

ru(){ cat /sys/bus/pci/devices/\:01:00.0/power/runtime_usage;}
ru # reports 1
modprobe nouveau runpm=0
ru # reports 2
rmmod nouveau
ru # reports 1

Without runpm=0 the count drops to 0 in the second step and stays 0 in
the third step. After applying patch 2/9, this correctly reports 1 as
expected (this is the same as manually setting power/control to on).

Peter

> ---
>  drivers/gpu/drm/nouveau/nouveau_drm.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
> b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 11f8dd9..faf7438 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -498,7 +498,10 @@ nouveau_drm_unload(struct drm_device *dev)
>  {
>   struct nouveau_drm *drm = nouveau_drm(dev);
>  
> - pm_runtime_get_sync(dev->dev);
> + if (nouveau_runtime_pm != 0) {
> + pm_runtime_get_sync(dev->dev);
> + }
> +
>   nouveau_fbcon_fini(dev);
>   nouveau_accel_fini(drm);
>   nouveau_hwmon_fini(dev);
> -- 
> 2.8.1
> 
> ___
> Nouveau mailing list
> Nouveau@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau

-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-05-27 Thread Peter Wu
On Wed, May 25, 2016 at 04:55:35PM +0300, Mika Westerberg wrote:
> On Wed, May 25, 2016 at 12:53:01AM +0200, Peter Wu wrote:
> > Since "PCI: Add runtime PM support for PCIe ports", the parent PCIe port
> > can be runtime-suspended which disables power resources via ACPI. This
> > is incompatible with DSM, resulting in a GPU device which is still in D3
> > and locks up the kernel on resume.
> > 
> > Mirror the behavior of Windows 8 and newer[1] (as observed via an AMLi
> > debugger trace) and stop using the DSM functions for D3cold when power
> > resources are available on the parent PCIe port.
> > 
> >  [1]: 
> > https://msdn.microsoft.com/windows/hardware/drivers/bringup/firmware-requirements-for-d3cold
> > 
> > Signed-off-by: Peter Wu 
> > ---
> >  drivers/gpu/drm/nouveau/nouveau_acpi.c | 34 
> > ++
> >  1 file changed, 30 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
> > b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > index df9f73e..e469df7 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > @@ -46,6 +46,7 @@ static struct nouveau_dsm_priv {
> > bool dsm_detected;
> > bool optimus_detected;
> > bool optimus_flags_detected;
> > +   bool optimus_skip_dsm;
> > acpi_handle dhandle;
> > acpi_handle rom_handle;
> >  } nouveau_dsm_priv;
> > @@ -212,8 +213,26 @@ static const struct vga_switcheroo_handler 
> > nouveau_dsm_handler = {
> > .get_client_id = nouveau_dsm_get_client_id,
> >  };
> >  
> > +/* Firmware supporting Windows 8 or later do not use _DSM to put the 
> > device into
> > + * D3cold, they instead rely on disabling power resources on the parent. */
> > +static bool nouveau_pr3_present(struct pci_dev *pdev)
> > +{
> > +   struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
> > +   struct acpi_device *ad;
> 
> Nit: please call this adev instead of ad.

Will do.

> > +
> > +   if (!parent_pdev)
> > +   return false;
> > +
> > +   ad = ACPI_COMPANION(&parent_pdev->dev);
> > +   if (!ad)
> > +   return false;
> > +
> > +   return ad->power.flags.power_resources;
> 
> Is this sufficient to tell if the parent device has _PR3? I thought it
> returns true if it has power resources in general, not necessarily _PR3.
> 
> Otherwise this looks okay to me.

It is indeed set whenever there is any _PRx method. I wonder if it is
appropriate to access fields directly like this, perhaps this would be
more accurate (based on device_pm.c):

/* Check whether the _PR3 method is available. */
return adev->power.states[ACPI_STATE_D3_COLD].flags.valid;

I am also considering adding a check in case the pcieport driver does
not support D3cold via runtime PM, what do you think of this?

if (!parent_pdev)
return false;
/* If the PCIe port does not support D3cold via runtime PM, allow a
 * fallback to the Optimus DSM method to put the device in D3cold. */
if (parent_pdev->no_d3cold)
return false;

This is needed to avoid the regression reported in the cover letter, but
also allows pre-2015 systems to (still) have the D3cold possibility.


Out of curiosity I looked up an pre-2015 laptop (found Acer V5-573G,
apparently from November 2013, Windows 8.1) and extracted the ACPI
tables from the BIOS images. BIOS 2.28 (2014/05/13) introduces support
for power resources on the parent devicea(\_SB.PCI0.PEG0._PR3 and a
related NVP3 device) when _OSI("Windows 2013") is true. (This is added
as alternative for the old DSM interface.)

Maybe 2014 is also an appropriate cutoff date? I wonder if it is
feasible to detect firmware use of _OSI("Windows 2013") and use that
instead of the BIOS year.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-05-27 Thread Peter Wu
On Fri, May 27, 2016 at 02:01:39PM +0100, Emil Velikov wrote:
> Hi Peter,
> 
> On 24 May 2016 at 23:53, Peter Wu  wrote:
> > Since "PCI: Add runtime PM support for PCIe ports", the parent PCIe port
> > can be runtime-suspended which disables power resources via ACPI. This
> > is incompatible with DSM, resulting in a GPU device which is still in D3
> > and locks up the kernel on resume.
> >
> > Mirror the behavior of Windows 8 and newer[1] (as observed via an AMLi
> > debugger trace) and stop using the DSM functions for D3cold when power
> > resources are available on the parent PCIe port.
> >
> >  [1]: 
> > https://msdn.microsoft.com/windows/hardware/drivers/bringup/firmware-requirements-for-d3cold
> >
> > Signed-off-by: Peter Wu 
> > ---
> >  drivers/gpu/drm/nouveau/nouveau_acpi.c | 34 
> > ++
> >  1 file changed, 30 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
> > b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > index df9f73e..e469df7 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > @@ -46,6 +46,7 @@ static struct nouveau_dsm_priv {
> > bool dsm_detected;
> > bool optimus_detected;
> > bool optimus_flags_detected;
> > +   bool optimus_skip_dsm;
> > acpi_handle dhandle;
> > acpi_handle rom_handle;
> >  } nouveau_dsm_priv;
> > @@ -212,8 +213,26 @@ static const struct vga_switcheroo_handler 
> > nouveau_dsm_handler = {
> > .get_client_id = nouveau_dsm_get_client_id,
> >  };
> >
> > +/* Firmware supporting Windows 8 or later do not use _DSM to put the 
> > device into
> > + * D3cold, they instead rely on disabling power resources on the parent. */
> > +static bool nouveau_pr3_present(struct pci_dev *pdev)
> > +{
> > +   struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
> > +   struct acpi_device *ad;
> > +
> > +   if (!parent_pdev)
> > +   return false;
> > +
> > +   ad = ACPI_COMPANION(&parent_pdev->dev);
> > +   if (!ad)
> > +   return false;
> > +
> > +   return ad->power.flags.power_resources;
> > +}
> > +
> >  static void nouveau_dsm_pci_probe(struct pci_dev *pdev, bool *has_mux,
> > - bool *has_opt, bool *has_opt_flags)
> > + bool *has_opt, bool *has_opt_flags,
> > + bool *has_power_resources)
> >  {
> > acpi_handle dhandle;
> > bool supports_mux;
> > @@ -238,6 +257,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
> > bool *has_mux,
> > *has_mux = supports_mux;
> > *has_opt = !!optimus_funcs;
> > *has_opt_flags = optimus_funcs & (1 << NOUVEAU_DSM_OPTIMUS_FLAGS);
> > +   *has_power_resources = false;
> >
> > if (optimus_funcs) {
> > uint32_t result;
> > @@ -247,6 +267,8 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
> > bool *has_mux,
> >  (result & OPTIMUS_ENABLED) ? "enabled" : 
> > "disabled",
> >  (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic 
> > power, " : "",
> >  (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios 
> > codec supported" : "");
> > +
> > +   *has_power_resources = nouveau_pr3_present(pdev);
> > }
> >  }
> >
> > @@ -258,6 +280,7 @@ static bool nouveau_dsm_detect(void)
> > bool has_mux = false;
> > bool has_optimus = false;
> > bool has_optimus_flags = false;
> > +   bool has_power_resources = false;
> > int vga_count = 0;
> > bool guid_valid;
> > bool ret = false;
> > @@ -273,14 +296,14 @@ static bool nouveau_dsm_detect(void)
> > vga_count++;
> >
> > nouveau_dsm_pci_probe(pdev, &has_mux, &has_optimus,
> > - &has_optimus_flags);
> > + &has_optimus_flags, 
> > &has_power_resources);
> > }
> >
> > while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != 
> > NULL) {
> > vga_count++;
> >
> > nouveau_dsm_pci_probe

Re: [Nouveau] [PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-05-30 Thread Peter Wu
On Mon, May 30, 2016 at 11:48:34AM +0100, Emil Velikov wrote:
> On 27 May 2016 at 22:31, Peter Wu  wrote:
> > On Fri, May 27, 2016 at 02:01:39PM +0100, Emil Velikov wrote:
> >> Hi Peter,
> >>
> >> On 24 May 2016 at 23:53, Peter Wu  wrote:
> >> > Since "PCI: Add runtime PM support for PCIe ports", the parent PCIe port
> >> > can be runtime-suspended which disables power resources via ACPI. This
> >> > is incompatible with DSM, resulting in a GPU device which is still in D3
> >> > and locks up the kernel on resume.
> >> >
> >> > Mirror the behavior of Windows 8 and newer[1] (as observed via an AMLi
> >> > debugger trace) and stop using the DSM functions for D3cold when power
> >> > resources are available on the parent PCIe port.
> >> >
> >> >  [1]: 
> >> > https://msdn.microsoft.com/windows/hardware/drivers/bringup/firmware-requirements-for-d3cold
> >> >
> >> > Signed-off-by: Peter Wu 
> >> > ---
> >> >  drivers/gpu/drm/nouveau/nouveau_acpi.c | 34 
> >> > ++
> >> >  1 file changed, 30 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
> >> > b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> >> > index df9f73e..e469df7 100644
> >> > --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> >> > +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> >> > @@ -46,6 +46,7 @@ static struct nouveau_dsm_priv {
> >> > bool dsm_detected;
> >> > bool optimus_detected;
> >> > bool optimus_flags_detected;
> >> > +   bool optimus_skip_dsm;
> >> > acpi_handle dhandle;
> >> > acpi_handle rom_handle;
> >> >  } nouveau_dsm_priv;
> >> > @@ -212,8 +213,26 @@ static const struct vga_switcheroo_handler 
> >> > nouveau_dsm_handler = {
> >> > .get_client_id = nouveau_dsm_get_client_id,
> >> >  };
> >> >
> >> > +/* Firmware supporting Windows 8 or later do not use _DSM to put the 
> >> > device into
> >> > + * D3cold, they instead rely on disabling power resources on the 
> >> > parent. */
> >> > +static bool nouveau_pr3_present(struct pci_dev *pdev)
> >> > +{
> >> > +   struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
> >> > +   struct acpi_device *ad;
> >> > +
> >> > +   if (!parent_pdev)
> >> > +   return false;
> >> > +
> >> > +   ad = ACPI_COMPANION(&parent_pdev->dev);
> >> > +   if (!ad)
> >> > +   return false;
> >> > +
> >> > +   return ad->power.flags.power_resources;
> >> > +}
> >> > +
> >> >  static void nouveau_dsm_pci_probe(struct pci_dev *pdev, bool *has_mux,
> >> > - bool *has_opt, bool *has_opt_flags)
> >> > + bool *has_opt, bool *has_opt_flags,
> >> > + bool *has_power_resources)
> >> >  {
> >> > acpi_handle dhandle;
> >> > bool supports_mux;
> >> > @@ -238,6 +257,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev 
> >> > *pdev, bool *has_mux,
> >> > *has_mux = supports_mux;
> >> > *has_opt = !!optimus_funcs;
> >> > *has_opt_flags = optimus_funcs & (1 << 
> >> > NOUVEAU_DSM_OPTIMUS_FLAGS);
> >> > +   *has_power_resources = false;
> >> >
> >> > if (optimus_funcs) {
> >> > uint32_t result;
> >> > @@ -247,6 +267,8 @@ static void nouveau_dsm_pci_probe(struct pci_dev 
> >> > *pdev, bool *has_mux,
> >> >  (result & OPTIMUS_ENABLED) ? "enabled" : 
> >> > "disabled",
> >> >  (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic 
> >> > power, " : "",
> >> >  (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios 
> >> > codec supported" : "");
> >> > +
> >> > +   *has_power_resources = nouveau_pr3_present(pdev);
> >> > }
> >> >  }
> >> >
> >> > @@ -258,6 +280,7 @@ static bool

Re: [Nouveau] [PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-05-30 Thread Peter Wu
On Mon, May 30, 2016 at 12:57:09PM +0300, Mika Westerberg wrote:
> +Rafael
> 
> On Fri, May 27, 2016 at 01:10:37PM +0200, Peter Wu wrote:
> > On Wed, May 25, 2016 at 04:55:35PM +0300, Mika Westerberg wrote:
> > > On Wed, May 25, 2016 at 12:53:01AM +0200, Peter Wu wrote:
> > > > Since "PCI: Add runtime PM support for PCIe ports", the parent PCIe port
> > > > can be runtime-suspended which disables power resources via ACPI. This
> > > > is incompatible with DSM, resulting in a GPU device which is still in D3
> > > > and locks up the kernel on resume.
> > > > 
> > > > Mirror the behavior of Windows 8 and newer[1] (as observed via an AMLi
> > > > debugger trace) and stop using the DSM functions for D3cold when power
> > > > resources are available on the parent PCIe port.
> > > > 
> > > >  [1]: 
> > > > https://msdn.microsoft.com/windows/hardware/drivers/bringup/firmware-requirements-for-d3cold
> > > > 
> > > > Signed-off-by: Peter Wu 
> > > > ---
> > > >  drivers/gpu/drm/nouveau/nouveau_acpi.c | 34 
> > > > ++
> > > >  1 file changed, 30 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
> > > > b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > > > index df9f73e..e469df7 100644
> > > > --- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > > > +++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
> > > > @@ -46,6 +46,7 @@ static struct nouveau_dsm_priv {
> > > > bool dsm_detected;
> > > > bool optimus_detected;
> > > > bool optimus_flags_detected;
> > > > +   bool optimus_skip_dsm;
> > > > acpi_handle dhandle;
> > > > acpi_handle rom_handle;
> > > >  } nouveau_dsm_priv;
> > > > @@ -212,8 +213,26 @@ static const struct vga_switcheroo_handler 
> > > > nouveau_dsm_handler = {
> > > > .get_client_id = nouveau_dsm_get_client_id,
> > > >  };
> > > >  
> > > > +/* Firmware supporting Windows 8 or later do not use _DSM to put the 
> > > > device into
> > > > + * D3cold, they instead rely on disabling power resources on the 
> > > > parent. */
> > > > +static bool nouveau_pr3_present(struct pci_dev *pdev)
> > > > +{
> > > > +   struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
> > > > +   struct acpi_device *ad;
> > > 
> > > Nit: please call this adev instead of ad.
> > 
> > Will do.
> > 
> > > > +
> > > > +   if (!parent_pdev)
> > > > +   return false;
> > > > +
> > > > +   ad = ACPI_COMPANION(&parent_pdev->dev);
> > > > +   if (!ad)
> > > > +   return false;
> > > > +
> > > > +   return ad->power.flags.power_resources;
> > > 
> > > Is this sufficient to tell if the parent device has _PR3? I thought it
> > > returns true if it has power resources in general, not necessarily _PR3.
> > > 
> > > Otherwise this looks okay to me.
> > 
> > It is indeed set whenever there is any _PRx method. I wonder if it is
> > appropriate to access fields directly like this, perhaps this would be
> > more accurate (based on device_pm.c):
> > 
> > /* Check whether the _PR3 method is available. */
> > return adev->power.states[ACPI_STATE_D3_COLD].flags.valid;
> > 
> > I am also considering adding a check in case the pcieport driver does
> > not support D3cold via runtime PM, what do you think of this?
> > 
> > if (!parent_pdev)
> > return false;
> > /* If the PCIe port does not support D3cold via runtime PM, allow a
> >  * fallback to the Optimus DSM method to put the device in D3cold. */
> > if (parent_pdev->no_d3cold)
> > return false;
> > 
> > This is needed to avoid the regression reported in the cover letter, but
> > also allows pre-2015 systems to (still) have the D3cold possibility.
> 
> The _DSM method with 0 as index parameter should return a bit field
> telling which functions are supported. Sane BIOS disables that
> particular function if it detects Windows 8 and newer. Have you checked
> if that's the case?
> 
> Then you can call _DSM only if it is supported and otherwise expect the
> parent device's power resourc

Re: [Nouveau] [PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-05-30 Thread Peter Wu
On Mon, May 30, 2016 at 04:09:09PM +0300, Mika Westerberg wrote:
...
> > > > > > +
> > > > > > +   if (!parent_pdev)
> > > > > > +   return false;
> > > > > > +
> > > > > > +   ad = ACPI_COMPANION(&parent_pdev->dev);
> > > > > > +   if (!ad)
> > > > > > +   return false;
> > > > > > +
> > > > > > +   return ad->power.flags.power_resources;
> > > > > 
> > > > > Is this sufficient to tell if the parent device has _PR3? I thought it
> > > > > returns true if it has power resources in general, not necessarily 
> > > > > _PR3.
> > > > > 
> > > > > Otherwise this looks okay to me.
> > > > 
> > > > It is indeed set whenever there is any _PRx method. I wonder if it is
> > > > appropriate to access fields directly like this, perhaps this would be
> > > > more accurate (based on device_pm.c):
> > > > 
> > > > /* Check whether the _PR3 method is available. */
> > > > return adev->power.states[ACPI_STATE_D3_COLD].flags.valid;
> > > > 
> > > > I am also considering adding a check in case the pcieport driver does
> > > > not support D3cold via runtime PM, what do you think of this?
> > > > 
> > > > if (!parent_pdev)
> > > > return false;
> > > > /* If the PCIe port does not support D3cold via runtime PM, allow a
> > > >  * fallback to the Optimus DSM method to put the device in D3cold. 
> > > > */
> > > > if (parent_pdev->no_d3cold)
> > > > return false;
> > > > 
> > > > This is needed to avoid the regression reported in the cover letter, but
> > > > also allows pre-2015 systems to (still) have the D3cold possibility.
> > > 
> > > The _DSM method with 0 as index parameter should return a bit field
> > > telling which functions are supported. Sane BIOS disables that
> > > particular function if it detects Windows 8 and newer. Have you checked
> > > if that's the case?
> > > 
> > > Then you can call _DSM only if it is supported and otherwise expect the
> > > parent device's power resources to turn off power when runtime
> > > suspended.
> > 
> > The _DSM methods (for the Nvidia device) are often still included and
> > functions are reported as supported. I guess that vendors just check
> > whether it is working and do not bother removing legacy functions. The
> > Acer case below seems exceptional.
> > 
> > I suggested the no_d3cold check such that DSM can still be called even
> > though the runtime PM on the PCIe port does nothing.
> 
> Somehow it does not feel right to poke parent device's fields directly.
> 
> What if you just check if it has the method like:
> 
>   bool no_dsm = acpi_has_method(parent_adev->handle, "_PR3");
> 
> That should follow what Windows is doing.

Checking for _PR3 was the intention, but it seems that the ACPI core
does not really store it somewhere. Your check should be simple enough,
I'll use that in the next version.

Do you have any suggestions for the case where the pcieport driver
refuses to put the bridge in D3 (because the BIOS is too old)? In that
case the nouveau driver needs to fallback to the DSM method (but not
when runtime PM is deliberately disabled by writing control=on).
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 1/9] drm/nouveau: Don't leak runtime pm ref on driver unload

2016-05-30 Thread Peter Wu
On Sun, May 29, 2016 at 05:50:06PM +0200, Lukas Wunner wrote:
> Hi Peter,
> 
> On Fri, May 27, 2016 at 03:07:33AM +0200, Peter Wu wrote:
> > On Tue, May 24, 2016 at 06:03:27PM +0200, Lukas Wunner wrote:
> > > nouveau_drm_load() calls pm_runtime_put() if nouveau_runtime_pm != 0,
> > > but nouveau_drm_unload() calls pm_runtime_get_sync() unconditionally.
> > > We therefore leak a runtime pm ref whenever nouveau is loaded with
> > > runpm=0 and then unloaded. The GPU will subsequently never runtime
> > > suspend even if nouveau is loaded again with runpm=1.
> > > 
> > > Fix by taking the runtime pm ref under the same condition that it was
> > > released on driver load.
> > > 
> > > Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)")
> > > Cc: Dave Airlie 
> > > Reported-by: Karol Herbst 
> > > Tested-by: Karol Herbst 
> > > Signed-off-by: Lukas Wunner 
> > 
> > Looks good, I tested this scenario:
> > 
> > ru(){ cat /sys/bus/pci/devices/\:01:00.0/power/runtime_usage;}
> > ru # reports 1
> > modprobe nouveau runpm=0
> > ru # reports 2
> > rmmod nouveau
> > ru # reports 1
> > 
> > Without runpm=0 the count drops to 0 in the second step and stays 0 in
> > the third step. After applying patch 2/9, this correctly reports 1 as
> > expected (this is the same as manually setting power/control to on).
> 
> How exactly did you reach the situation where the root port didn't wake
> up when you tried to load nouveau again? (IRC conversation this week.)

Ensure that the pci/pm patches are applied, then:

 0. Unload nouveau (I have blacklisted it for testing).
 1. Enable rpm for the root port and children (control = auto).
 2. Verify in the kernel logs that the devices are sleeping:
pcieport :00:01.0: power state changed by ACPI to D3cold
 3. (Optional, to rule out issues with delays:) Disable rpm for the
Nvidia device (control = on).
 4. modprobe nouveau.

The above test with v4.6 + 4 pci/pm patches (8b71f565) gives:

50.245795 MXM: GUID detected in BIOS
50.245948nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.GFX0._DSM] at AML address c9013b11 length 492
50.246016 ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch - 
Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
50.246044nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.GFX0._DSM] at AML address c9013b11 length 492
50.246110nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.PEG0.PEGP._DSM] at AML address c9018297 length 1F
50.246256 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch 
- Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
50.246289nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.PEG0.PEGP._DSM] at AML address c9018297 length 1F
50.246443 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type mismatch 
- Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
50.246457nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.PEG0.PEGP._DSM] at AML address c9018297 length 1F
50.246932 pci :01:00.0: optimus capabilities: enabled, status dynamic 
power, hda bios codec supported
50.247005 VGA switcheroo: detected Optimus DSM method \_SB_.PCI0.PEG0.PEGP 
handle
50.247084nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.PEG0.PG00._ON] at AML address c901086e length 11D
50.390140 pcieport :00:01.0: power state changed by ACPI to D0
50.491893nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.PEG0._DSW] at AML address c9010a2d length 1D
50.492285 pcieport :00:01.0: PME# disabled
50.492583 nouveau :01:00.0: unknown chipset ()
50.492687 nouveau: probe of :01:00.0 failed with error -12
50.501990nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.PEG0._S0W] at AML address c9010a8e length 2
50.502403 pcieport :00:01.0: PME# enabled
50.502601nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.PEG0._DSW] at AML address c9010a2d length 1D
50.513005nseval-0227 ns_evaluate   :  Execute method 
[\_SB.PCI0.PEG0.PG00._OFF] at AML address c9010994 length 6D
50.533258 pcieport :00:01.0: power state changed by ACPI to D3cold

(Note that this patch is not included.) When nouveau is operating
normally, I see that _PS0 is also called (which does not happen above).

If you think that mixing power resources with DSM causes this issue, I
also tried to apply my power resources work for nouveau but it gives the
same prob

Re: [Nouveau] [PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-05-31 Thread Peter Wu
On Tue, May 31, 2016 at 11:43:56AM +0300, Mika Westerberg wrote:
> On Mon, May 30, 2016 at 06:13:51PM +0200, Peter Wu wrote:
> > Do you have any suggestions for the case where the pcieport driver
> > refuses to put the bridge in D3 (because the BIOS is too old)? In that
> > case the nouveau driver needs to fallback to the DSM method (but not
> > when runtime PM is deliberately disabled by writing control=on).
> 
> Do you know what Windows does then? I think we should do the same if
> possible.

If the BIOS is too old, then it probably does not have _PR3 objects nor
calls to _OSI("Windows 2013"). See below.

> If user has disabled runtime PM from the root port deliberately, there
> might be good reason to do so. Why we want to fallback to something that
> could cause problems? I mean _DSM on such systems is probably not that
> much tested because everybody runs Windows 8+ and using standard ACPI
> power resources.

I agree that when runtime PM on the root port is disabled (control=on),
then there should be no fallback to DSM. For devices without _PR3 it is
clear that DSM will always be used (if available).

In other cases (where _PR3 is available) we can distinguish:
 - pre-Windows 8 machines. I have never seen this combination. Firmware
   writers seems to prefer sticking to reference code which did not use
   power resources before.
 - Machines targeting Windows 8 or newer. (Note that there exist
   machines with Windows 8 support that do not have _PR3, DSM is used in
   that case.)

If Windows 7 is running on a Windows 8 machine, PR3 will not be used
anyway. If the Linux kernel claims support for Windows 8, but does not
use PR3, then we are probably approaching an untested area. So far
firmware seems fine with using *only* DSM *or* PR3, but at least my
laptop gets confused when you use both at the same time.

The latter happens on pci/pm (8b71f565) without other patches:

 1. nouveau invokes _DSM and _PS3, device is put in D3cold.
 2. pcieport driver calls PG00._OFF (PG00 is returned by _PR3).
 3. Wake up Nvidia device (e.g. by power=on).
 4. This will trigger PG00._ON (via pcieport) and _PS0 (via nouveau).
 5. Nvidia card is not really ready (observed via "restoring config
space at offset ... (was 0x, writing ...)", a soft lockup
and RCU stall after that requiring a reboot to recover).

nouveau could be patched not to invoke DSM when PR3 is detected
(proposal is ready) but will keep the device powered on in these cases:
 - nouveau is patched, but pci/pm patches are not.
 - PR3 is supported but due to the cutoff date (2015) it is not used.
 - Boot option pcie_port_pm=off.
 - runtime PM is disabled for pcieport (should be fine).


There is a wealth of acpidumps on Launchpad bug 752542
(https://bugs.launchpad.net/bugs/752542). Search for example for
comments in early 2015 or before, those will likely be machine from 2014
or before.

Interesting to see is the _PR3 method of a HP Envy TS 15 (11/20/2014):

Method (_PR3, 0, NotSerialized) {
If (\_OSI ("Windows 2013")) {
Return (Package (0x01) {
\NVP3
})
} Else {
Return (Package (0x00) {})
}
}

(Note for self: just checking for the _PR3 handle in the nouveau patch
is apparently not sufficient, it must really be evaluated.)

Other machines with _PR3:
 - Dell Inspiron 3543 (11/04/2014), comment 757.
 - Dell XPS 15 9530 (03/28/2014), comment 711.
 - Novatech 15.6 NSPIRE Laptop (01/20/2014), comment 695.
 - Lenovo ThinkPad T440p (10/27/2013), comment 659.

There were many models from 2013 without _PR3 method but still checking
for _OSI("Windows 2013"). Maybe some heuristics based on _PR3 would be
more helpful than just a cutoff date?
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 1/9] drm/nouveau: Don't leak runtime pm ref on driver unload

2016-05-31 Thread Peter Wu
On Tue, May 31, 2016 at 01:34:43PM +0200, Lukas Wunner wrote:
> On Mon, May 30, 2016 at 07:03:46PM +0200, Peter Wu wrote:
> > On Sun, May 29, 2016 at 05:50:06PM +0200, Lukas Wunner wrote:
> > > How exactly did you reach the situation where the root port didn't wake
> > > up when you tried to load nouveau again? (IRC conversation this week.)
> > 
> > Ensure that the pci/pm patches are applied, then:
> > 
> >  0. Unload nouveau (I have blacklisted it for testing).
> >  1. Enable rpm for the root port and children (control = auto).
> >  2. Verify in the kernel logs that the devices are sleeping:
> > pcieport :00:01.0: power state changed by ACPI to D3cold
> >  3. (Optional, to rule out issues with delays:) Disable rpm for the
> > Nvidia device (control = on).
> >  4. modprobe nouveau.
> > 
> > The above test with v4.6 + 4 pci/pm patches (8b71f565) gives:
> > 
> > 50.245795 MXM: GUID detected in BIOS
> > 50.245948nseval-0227 ns_evaluate   :  Execute method 
> > [\_SB.PCI0.GFX0._DSM] at AML address c9013b11 length 492
> > 50.246016 ACPI Warning: \_SB.PCI0.GFX0._DSM: Argument #4 type mismatch 
> > - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
> > 50.246044nseval-0227 ns_evaluate   :  Execute method 
> > [\_SB.PCI0.GFX0._DSM] at AML address c9013b11 length 492
> > 50.246110nseval-0227 ns_evaluate   :  Execute method 
> > [\_SB.PCI0.PEG0.PEGP._DSM] at AML address c9018297 length 1F
> > 50.246256 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type 
> > mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
> > 50.246289nseval-0227 ns_evaluate   :  Execute method 
> > [\_SB.PCI0.PEG0.PEGP._DSM] at AML address c9018297 length 1F
> > 50.246443 ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM: Argument #4 type 
> > mismatch - Found [Buffer], ACPI requires [Package] (20160108/nsarguments-95)
> > 50.246457nseval-0227 ns_evaluate   :  Execute method 
> > [\_SB.PCI0.PEG0.PEGP._DSM] at AML address c9018297 length 1F
> > 50.246932 pci :01:00.0: optimus capabilities: enabled, status 
> > dynamic power, hda bios codec supported
> > 50.247005 VGA switcheroo: detected Optimus DSM method 
> > \_SB_.PCI0.PEG0.PEGP handle
> > 50.247084nseval-0227 ns_evaluate   :  Execute method 
> > [\_SB.PCI0.PEG0.PG00._ON] at AML address c901086e length 11D
> > 50.390140 pcieport :00:01.0: power state changed by ACPI to D0
> > 50.491893nseval-0227 ns_evaluate   :  Execute method 
> > [\_SB.PCI0.PEG0._DSW] at AML address c9010a2d length 1D
> > 50.492285 pcieport :00:01.0: PME# disabled
> > 50.492583 nouveau :01:00.0: unknown chipset ()
> > 50.492687 nouveau: probe of :01:00.0 failed with error -12
> 
> I've tested this on a MacBook Pro, which does not have ACPI _PR3
> methods for the root port to which the discrete GPU is attached.
> The port can thus only suspend to D3hot, not D3cold.
> 
> Even without patch [2/9], when unloading nouveau and letting the
> root port go to D3hot, the port is subsequently correctly resumed
> to D0 when reloading nouveau.
> 
> So the issue that you're seeing without patch [2/9] seems to be
> specific to Optimus/_PR3 machines. If possible you should try to
> get it working without patch [2/9] because that patch is really
> optional (as I've written in the commit message). I'm totally
> unfamiliar with Optimus but maybe lspci could help to debug this?

Without 2/9 I can prevent the issue by writing "on" to
/sys/bus/pci/devices/:00:01.0/power/control (the PCIe port), but
that effectively gives the same result as applying 2/9.

The problem occurs when the power is lost (by putting the PCIe port in
D3cold). Maybe it is a bug in the PCI core that does not re-initialize
devices under the port, but since a workaround is available (2/9), I
will focus on other issues first. Maybe it is worth to mention this
issue in the commit message for 2/9 though.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-06-01 Thread Peter Wu
On Tue, May 31, 2016 at 02:20:26PM +0200, Lukas Wunner wrote:
> On Mon, May 30, 2016 at 06:13:51PM +0200, Peter Wu wrote:
> > Do you have any suggestions for the case where the pcieport driver
> > refuses to put the bridge in D3 (because the BIOS is too old)? In that
> > case the nouveau driver needs to fallback to the DSM method (but not
> > when runtime PM is deliberately disabled by writing control=on).
> 
> The BIOS cut-off date is meant to avoid issues when suspending ports
> on older chipsets. However if the port is used for an Optimus GPU
> and we can clearly identify that, and there's a _PR3 method provided,
> it's probably safe to say that the port is *intended* to be suspended.
> 
> So you may want to consider amending pci_bridge_d3_possible() to
> allow D3 for such ports regardless of the BIOS date, as I've done
> for Thunderbolt in this commit:
> https://github.com/l1k/linux/commit/3cb8549cd4e5

Then we have heuristics based on BIOS year, on whether it is TB or not,
and next to it whether it is an Optimus laptop? Maybe the PCI core needs
to export a function that allows drivers to override the detection if
this becomes more common.

> Not sure how to uniquely identify such ports though. Perhaps check
> if there's a device in slot 0 below the port which has
>   (pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY &&
>   (pdev->vendor == PCI_VENDOR_ID_NVIDIA ||
>pdev->vendor == PCI_VENDOR_ID_ATI)

Seems fragile, there are desktop setups satisfying this match.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-06-01 Thread Peter Wu
On Wed, Jun 01, 2016 at 12:28:47PM +0300, Mika Westerberg wrote:
> On Tue, May 31, 2016 at 01:02:31PM +0200, Peter Wu wrote:
> > On Tue, May 31, 2016 at 11:43:56AM +0300, Mika Westerberg wrote:
> > > On Mon, May 30, 2016 at 06:13:51PM +0200, Peter Wu wrote:
> > > > Do you have any suggestions for the case where the pcieport driver
> > > > refuses to put the bridge in D3 (because the BIOS is too old)? In that
> > > > case the nouveau driver needs to fallback to the DSM method (but not
> > > > when runtime PM is deliberately disabled by writing control=on).
> > > 
> > > Do you know what Windows does then? I think we should do the same if
> > > possible.
> > 
> > If the BIOS is too old, then it probably does not have _PR3 objects nor
> > calls to _OSI("Windows 2013"). See below.
> > 
> > > If user has disabled runtime PM from the root port deliberately, there
> > > might be good reason to do so. Why we want to fallback to something that
> > > could cause problems? I mean _DSM on such systems is probably not that
> > > much tested because everybody runs Windows 8+ and using standard ACPI
> > > power resources.
> > 
> > I agree that when runtime PM on the root port is disabled (control=on),
> > then there should be no fallback to DSM. For devices without _PR3 it is
> > clear that DSM will always be used (if available).
> > 
> > In other cases (where _PR3 is available) we can distinguish:
> >  - pre-Windows 8 machines. I have never seen this combination. Firmware
> >writers seems to prefer sticking to reference code which did not use
> >power resources before.
> >  - Machines targeting Windows 8 or newer. (Note that there exist
> >machines with Windows 8 support that do not have _PR3, DSM is used in
> >that case.)
> > 
> > If Windows 7 is running on a Windows 8 machine, PR3 will not be used
> > anyway. If the Linux kernel claims support for Windows 8, but does not
> > use PR3, then we are probably approaching an untested area. So far
> > firmware seems fine with using *only* DSM *or* PR3, but at least my
> > laptop gets confused when you use both at the same time.
> > 
> > The latter happens on pci/pm (8b71f565) without other patches:
> > 
> >  1. nouveau invokes _DSM and _PS3, device is put in D3cold.
> >  2. pcieport driver calls PG00._OFF (PG00 is returned by _PR3).
> >  3. Wake up Nvidia device (e.g. by power=on).
> >  4. This will trigger PG00._ON (via pcieport) and _PS0 (via nouveau).
> >  5. Nvidia card is not really ready (observed via "restoring config
> > space at offset ... (was 0x, writing ...)", a soft lockup
> > and RCU stall after that requiring a reboot to recover).
> > 
> > nouveau could be patched not to invoke DSM when PR3 is detected
> > (proposal is ready) but will keep the device powered on in these cases:
> >  - nouveau is patched, but pci/pm patches are not.
> >  - PR3 is supported but due to the cutoff date (2015) it is not used.
> >  - Boot option pcie_port_pm=off.
> >  - runtime PM is disabled for pcieport (should be fine).
> 
> Since using only _DSM has been the only method to power down the card
> currently inńLinux (even if the root port has had _PR3), and it has been
> working fine, why not stick with that when _DSM is supported?

Maybe it is not really working, people have been reporting memory
corruption[1] for example on certain Lenovo models that was gone after
hacking the bbswitch module to disable the root port:

https://bugs.freedesktop.org/show_bug.cgi?id=78530
https://github.com/Bumblebee-Project/bbswitch/issues/78
https://github.com/Bumblebee-Project/bbswitch/issues/115

I'll try to solicit some feedback from the affected people on these
patch series, whether it solves their memory corruption issue.

Dave also said "This fixes GPU auto powerdown on the Lenovo W541," when
he added PR3 support in https://patchwork.freedesktop.org/patch/76313/
So apparently it did not work with just DSM.

> In other words, something like this:
> 
>   nouveau_dsm_pci_probe()
>   {
>   ...
>   if (retval & (NOUVEAU_DSM_HAS_OPT | NOUVEAU_DSM_HAS_MUX)) {
>   /*
>* We have custom _DSM method to power down the card so
>* prevent the PCI core from transitioning the
>* card into D3cold.
>*/
>   pci_d3cold_disable(pdev);
>   }
>   }
> 
> (Not sure about those flags above, though).
> 
> Yes, it do

[Nouveau] [PATCH v2 1/4] drm/nouveau/acpi: ensure matching ACPI handle and supported functions

2016-07-07 Thread Peter Wu
Ensure that the returned set of supported DSM functions (MUX, Optimus)
match the ACPI handle that is set in nouveau_dsm_pci_probe.

As there are no machines with a MUX function on just one PCI device and
an Optimus on another, there should not be a functional impact. This
change however makes this implicit assumption more obvious.

Convert int to bool and rename has_dsm to has_mux while at it. Let the
caller set nouveau_dsm_priv.dhandle as needed.

 v2: pass dhandle to the caller.

Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 58 +++---
 1 file changed, 26 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index db76b94..886a67c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -57,9 +57,6 @@ bool nouveau_is_v1_dsm(void) {
return nouveau_dsm_priv.dsm_detected;
 }
 
-#define NOUVEAU_DSM_HAS_MUX 0x1
-#define NOUVEAU_DSM_HAS_OPT 0x2
-
 #ifdef CONFIG_VGA_SWITCHEROO
 static const char nouveau_dsm_muid[] = {
0xA0, 0xA0, 0x95, 0x9D, 0x60, 0x00, 0x48, 0x4D,
@@ -212,26 +209,33 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
.get_client_id = nouveau_dsm_get_client_id,
 };
 
-static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
+static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle 
*dhandle_out,
+ bool *has_mux, bool *has_opt)
 {
acpi_handle dhandle;
-   int retval = 0;
+   bool supports_mux;
+   bool supports_opt;
 
dhandle = ACPI_HANDLE(&pdev->dev);
if (!dhandle)
-   return false;
+   return;
 
if (!acpi_has_method(dhandle, "_DSM"))
-   return false;
+   return;
 
-   if (acpi_check_dsm(dhandle, nouveau_dsm_muid, 0x0102,
-  1 << NOUVEAU_DSM_POWER))
-   retval |= NOUVEAU_DSM_HAS_MUX;
+   supports_mux = acpi_check_dsm(dhandle, nouveau_dsm_muid, 0x0102,
+ 1 << NOUVEAU_DSM_POWER);
+   supports_opt = nouveau_check_optimus_dsm(dhandle);
 
-   if (nouveau_check_optimus_dsm(dhandle))
-   retval |= NOUVEAU_DSM_HAS_OPT;
+   /* Does not look like a Nvidia device. */
+   if (!supports_mux && !supports_opt)
+   return;
 
-   if (retval & NOUVEAU_DSM_HAS_OPT) {
+   *dhandle_out = dhandle;
+   *has_mux = supports_mux;
+   *has_opt = supports_opt;
+
+   if (supports_opt) {
uint32_t result;
nouveau_optimus_dsm(dhandle, NOUVEAU_DSM_OPTIMUS_CAPS, 0,
&result);
@@ -240,10 +244,6 @@ static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
 (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic power, " 
: "",
 (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios codec 
supported" : "");
}
-   if (retval)
-   nouveau_dsm_priv.dhandle = dhandle;
-
-   return retval;
 }
 
 static bool nouveau_dsm_detect(void)
@@ -251,11 +251,11 @@ static bool nouveau_dsm_detect(void)
char acpi_method_name[255] = { 0 };
struct acpi_buffer buffer = {sizeof(acpi_method_name), 
acpi_method_name};
struct pci_dev *pdev = NULL;
-   int has_dsm = 0;
-   int has_optimus = 0;
+   acpi_handle dhandle = NULL;
+   bool has_mux = false;
+   bool has_optimus = false;
int vga_count = 0;
bool guid_valid;
-   int retval;
bool ret = false;
 
/* lookup the MXM GUID */
@@ -268,32 +268,26 @@ static bool nouveau_dsm_detect(void)
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != 
NULL) {
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
-   if (retval & NOUVEAU_DSM_HAS_MUX)
-   has_dsm |= 1;
-   if (retval & NOUVEAU_DSM_HAS_OPT)
-   has_optimus = 1;
+   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus);
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
-   if (retval & NOUVEAU_DSM_HAS_MUX)
-   has_dsm |= 1;
-   if (retval & NOUVEAU_DSM_HAS_OPT)
-   has_optimus = 1;
+   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus);
}
 
/* find the optimus DSM or the old v1 DSM */
-   if (has_optimus == 1) {
+   if (has_optimus) {
+   nouveau_dsm_priv.dhandle = dhandle;
acpi_get_name(nouveau_dsm_priv.dhandle,

[Nouveau] [PATCH v2 3/4] drm/nouveau/acpi: check for function 0x1B before using it

2016-07-07 Thread Peter Wu
Do not unconditionally invoke function 0x1B without checking for its
availability, it leads to an infinite loop on some firmware.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104791
Fixes: 5addcf0a5f0fad ("nouveau: add runtime PM support (v0.9)")
Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 572ac30..ad273ad 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -45,6 +45,7 @@
 static struct nouveau_dsm_priv {
bool dsm_detected;
bool optimus_detected;
+   bool optimus_flags_detected;
acpi_handle dhandle;
acpi_handle rom_handle;
 } nouveau_dsm_priv;
@@ -212,7 +213,8 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
 };
 
 static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle 
*dhandle_out,
- bool *has_mux, bool *has_opt)
+ bool *has_mux, bool *has_opt,
+ bool *has_opt_flags)
 {
acpi_handle dhandle;
bool supports_mux;
@@ -236,6 +238,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
*dhandle_out = dhandle;
*has_mux = supports_mux;
*has_opt = !!optimus_funcs;
+   *has_opt_flags = optimus_funcs & (1 << NOUVEAU_DSM_OPTIMUS_FLAGS);
 
if (optimus_funcs) {
uint32_t result;
@@ -256,6 +259,7 @@ static bool nouveau_dsm_detect(void)
acpi_handle dhandle = NULL;
bool has_mux = false;
bool has_optimus = false;
+   bool has_optimus_flags = false;
int vga_count = 0;
bool guid_valid;
bool ret = false;
@@ -270,13 +274,15 @@ static bool nouveau_dsm_detect(void)
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != 
NULL) {
vga_count++;
 
-   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus);
+   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus,
+ &has_optimus_flags);
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
-   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus);
+   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus,
+ &has_optimus_flags);
}
 
/* find the optimus DSM or the old v1 DSM */
@@ -287,6 +293,7 @@ static bool nouveau_dsm_detect(void)
printk(KERN_INFO "VGA switcheroo: detected Optimus DSM method 
%s handle\n",
acpi_method_name);
nouveau_dsm_priv.optimus_detected = true;
+   nouveau_dsm_priv.optimus_flags_detected = has_optimus_flags;
ret = true;
} else if (vga_count == 2 && has_mux && guid_valid) {
nouveau_dsm_priv.dhandle = dhandle;
@@ -320,8 +327,9 @@ void nouveau_switcheroo_optimus_dsm(void)
if (!nouveau_dsm_priv.optimus_detected)
return;
 
-   nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, NOUVEAU_DSM_OPTIMUS_FLAGS,
-   0x3, &result);
+   if (nouveau_dsm_priv.optimus_flags_detected)
+   nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, 
NOUVEAU_DSM_OPTIMUS_FLAGS,
+   0x3, &result);
 
nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, NOUVEAU_DSM_OPTIMUS_CAPS,
NOUVEAU_DSM_OPTIMUS_SET_POWERDOWN, &result);
-- 
2.9.0

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH v2 2/4] drm/nouveau/acpi: return supported DSM functions

2016-07-07 Thread Peter Wu
Return the set of supported functions to the caller. No functional
changes.

Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 886a67c..572ac30 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -107,7 +107,7 @@ static int nouveau_optimus_dsm(acpi_handle handle, int 
func, int arg, uint32_t *
  * requirements on the fourth parameter, so a private implementation
  * instead of using acpi_check_dsm().
  */
-static int nouveau_check_optimus_dsm(acpi_handle handle)
+static int nouveau_dsm_get_optimus_functions(acpi_handle handle)
 {
int result;
 
@@ -122,7 +122,9 @@ static int nouveau_check_optimus_dsm(acpi_handle handle)
 * ACPI Spec v4 9.14.1: if bit 0 is zero, no function is supported.
 * If the n-th bit is enabled, function n is supported
 */
-   return result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS);
+   if (result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS))
+   return result;
+   return 0;
 }
 
 static int nouveau_dsm(acpi_handle handle, int func, int arg)
@@ -214,7 +216,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
 {
acpi_handle dhandle;
bool supports_mux;
-   bool supports_opt;
+   int optimus_funcs;
 
dhandle = ACPI_HANDLE(&pdev->dev);
if (!dhandle)
@@ -225,17 +227,17 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
 
supports_mux = acpi_check_dsm(dhandle, nouveau_dsm_muid, 0x0102,
  1 << NOUVEAU_DSM_POWER);
-   supports_opt = nouveau_check_optimus_dsm(dhandle);
+   optimus_funcs = nouveau_dsm_get_optimus_functions(dhandle);
 
/* Does not look like a Nvidia device. */
-   if (!supports_mux && !supports_opt)
+   if (!supports_mux && !optimus_funcs)
return;
 
*dhandle_out = dhandle;
*has_mux = supports_mux;
-   *has_opt = supports_opt;
+   *has_opt = !!optimus_funcs;
 
-   if (supports_opt) {
+   if (optimus_funcs) {
uint32_t result;
nouveau_optimus_dsm(dhandle, NOUVEAU_DSM_OPTIMUS_CAPS, 0,
&result);
-- 
2.9.0

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH v2 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-07-07 Thread Peter Wu
Since "PCI: Add runtime PM support for PCIe ports", the parent PCIe port
can be runtime-suspended which disables power resources via ACPI. This
is incompatible with DSM, resulting in a GPU device which is still in D3
and locks up the kernel on resume (on a Clevo P651RA, GTX965M).

Mirror the behavior of Windows 8 and newer[1] (as observed via an AMLi
debugger trace) and stop using the DSM functions for D3cold when power
resources are available on the parent PCIe port.

pci_d3cold_disable() is not used because on some machines, the old DSM
method is broken. On a Lenovo T440p (GT 730M) memory and disk corruption
would occur, but that is fixed with this patch[2].

 [1]: 
https://msdn.microsoft.com/windows/hardware/drivers/bringup/firmware-requirements-for-d3cold
 [2]: 
https://github.com/Bumblebee-Project/bbswitch/issues/78#issuecomment-223549072

 v2: simply check directly for _PR3. Added affected machines.

Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 33 +
 1 file changed, 29 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index ad273ad..38a6445 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -46,6 +46,7 @@ static struct nouveau_dsm_priv {
bool dsm_detected;
bool optimus_detected;
bool optimus_flags_detected;
+   bool optimus_skip_dsm;
acpi_handle dhandle;
acpi_handle rom_handle;
 } nouveau_dsm_priv;
@@ -212,9 +213,26 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
.get_client_id = nouveau_dsm_get_client_id,
 };
 
+/* Firmware supporting Windows 8 or later do not use _DSM to put the device 
into
+ * D3cold, they instead rely on disabling power resources on the parent. */
+static bool nouveau_pr3_present(struct pci_dev *pdev)
+{
+   struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
+   struct acpi_device *parent_adev;
+
+   if (!parent_pdev)
+   return false;
+
+   parent_adev = ACPI_COMPANION(&parent_pdev->dev);
+   if (!parent_adev)
+   return false;
+
+   return acpi_has_method(parent_adev->handle, "_PR3");
+}
+
 static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle 
*dhandle_out,
  bool *has_mux, bool *has_opt,
- bool *has_opt_flags)
+ bool *has_opt_flags, bool *has_pr3)
 {
acpi_handle dhandle;
bool supports_mux;
@@ -239,6 +257,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
*has_mux = supports_mux;
*has_opt = !!optimus_funcs;
*has_opt_flags = optimus_funcs & (1 << NOUVEAU_DSM_OPTIMUS_FLAGS);
+   *has_pr3 = false;
 
if (optimus_funcs) {
uint32_t result;
@@ -248,6 +267,8 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
 (result & OPTIMUS_ENABLED) ? "enabled" : "disabled",
 (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic power, " 
: "",
 (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios codec 
supported" : "");
+
+   *has_pr3 = nouveau_pr3_present(pdev);
}
 }
 
@@ -260,6 +281,7 @@ static bool nouveau_dsm_detect(void)
bool has_mux = false;
bool has_optimus = false;
bool has_optimus_flags = false;
+   bool has_power_resources = false;
int vga_count = 0;
bool guid_valid;
bool ret = false;
@@ -275,14 +297,14 @@ static bool nouveau_dsm_detect(void)
vga_count++;
 
nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus,
- &has_optimus_flags);
+ &has_optimus_flags, &has_power_resources);
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus,
- &has_optimus_flags);
+ &has_optimus_flags, &has_power_resources);
}
 
/* find the optimus DSM or the old v1 DSM */
@@ -292,8 +314,11 @@ static bool nouveau_dsm_detect(void)
&buffer);
printk(KERN_INFO "VGA switcheroo: detected Optimus DSM method 
%s handle\n",
acpi_method_name);
+   if (has_power_resources)
+   pr_info("nouveau: detected PR support, will not use 
DSM\n");
nouveau_dsm_priv.optimus_detected = true;
nouve

[Nouveau] [PATCH v2 0/4] nouveau RPM fixes for Optimus

2016-07-07 Thread Peter Wu
Hi,

Here are two patches to fix an issue reported on kernel bugzilla (infinite loop
due to unchecked function) and a more important fix to fix hanging Optimus
machines when runtime PM is enabled (with pm/pci patches).

See the first version[1] for a background on the fixed problems. This is the
second revision of incorporating feedback from Emil Velikov (patch 1), Mika
Westerberg (patch 4). Patches 2 and 3 are unchanged.
The previous patchset had R-b from Hans de Goede, I think they are still valid.

Noteworthy is that the fourth patch now checks directly for _PR3. The commit
message is updated to emphasize that memory/disk corruption is fixed for some
machines.


This patchset can be merged before or after the pci/pm changes[2] (expected to
be merged for 4.8), see the original posting[1] for consequences. I have tested
it on top of v4.7-rc5. To make patch four work properly, Lukas' RPM refcounting
patches should be included. A similar (open/new) RPM refcounting issue in
snd-hda-intel should also be fixed. Otherwise the bridge will not really sleep.

There is another minor patch for nouveau_pr3_present, but it is not included
here because it depends on visibility of pci_bridge_d3_possible(). I'll send a
separate mail for this to linux-pci.

Kind regards,
Peter

 [1]: https://lists.freedesktop.org/archives/nouveau/2016-May/025116.html
 [2]: https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/?h=pci/pm

Peter Wu (4):
  drm/nouveau/acpi: ensure matching ACPI handle and supported functions
  drm/nouveau/acpi: return supported DSM functions
  drm/nouveau/acpi: check for function 0x1B before using it
  drm/nouveau/acpi: fix lockup with PCIe runtime PM

 drivers/gpu/drm/nouveau/nouveau_acpi.c | 103 +
 1 file changed, 66 insertions(+), 37 deletions(-)

-- 
2.9.0

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH] drm/nouveau/fbcon: fix deadlock with FBIOPUT_CON2FBMAP

2016-07-12 Thread Peter Wu
The FBIOPUT_CON2FBMAP ioctl takes a console_lock(). When this is called
while nouveau was runtime suspended, a deadlock would occur due to
nouveau_fbcon_set_suspend also trying to obtain console_lock().

Fix this by delaying the drm_fb_helper_set_suspend call. Based on the
i915 code (which was done for performance reasons though).

Cc: Chris Wilson 
Cc: Daniel Vetter 
Signed-off-by: Peter Wu 
---
Tested on top of v4.7-rc5, the deadlock is gone.
---
 drivers/gpu/drm/nouveau/nouveau_drm.c   |  4 +--
 drivers/gpu/drm/nouveau/nouveau_drv.h   |  1 +
 drivers/gpu/drm/nouveau/nouveau_fbcon.c | 54 -
 drivers/gpu/drm/nouveau/nouveau_fbcon.h |  2 +-
 4 files changed, 50 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 11f8dd9..f9a2c10 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -552,7 +552,7 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
 
if (dev->mode_config.num_crtc) {
NV_INFO(drm, "suspending console...\n");
-   nouveau_fbcon_set_suspend(dev, 1);
+   nouveau_fbcon_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
NV_INFO(drm, "suspending display...\n");
ret = nouveau_display_suspend(dev, runtime);
if (ret)
@@ -635,7 +635,7 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
NV_INFO(drm, "resuming display...\n");
nouveau_display_resume(dev, runtime);
NV_INFO(drm, "resuming console...\n");
-   nouveau_fbcon_set_suspend(dev, 0);
+   nouveau_fbcon_set_suspend(dev, FBINFO_STATE_RUNNING, false);
}
 
return 0;
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 822a021..a743d19 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -147,6 +147,7 @@ struct nouveau_drm {
struct nouveau_channel *channel;
struct nvkm_gpuobj *notify;
struct nouveau_fbdev *fbcon;
+   struct work_struct fbdev_suspend_work;
struct nvif_object nvsw;
struct nvif_object ntfy;
struct nvif_notify flip;
diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c 
b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
index d1f248f..089156a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
@@ -492,19 +492,53 @@ static const struct drm_fb_helper_funcs 
nouveau_fbcon_helper_funcs = {
.fb_probe = nouveau_fbcon_create,
 };
 
+static void nouveau_fbcon_suspend_worker(struct work_struct *work)
+{
+   nouveau_fbcon_set_suspend(container_of(work,
+  struct nouveau_drm,
+  fbdev_suspend_work)->dev,
+ FBINFO_STATE_RUNNING,
+ true);
+}
+
 void
-nouveau_fbcon_set_suspend(struct drm_device *dev, int state)
+nouveau_fbcon_set_suspend(struct drm_device *dev, int state, bool synchronous)
 {
struct nouveau_drm *drm = nouveau_drm(dev);
-   if (drm->fbcon) {
-   console_lock();
-   if (state == FBINFO_STATE_RUNNING)
-   nouveau_fbcon_accel_restore(dev);
-   drm_fb_helper_set_suspend(&drm->fbcon->helper, state);
+   if (!drm->fbcon)
+   return;
+
+   if (synchronous) {
+   /* Flush any pending work to turn the console on, and then
+* wait to turn it off. It must be synchronous as we are
+* about to suspend or unload the driver.
+*
+* Note that from within the work-handler, we cannot flush
+* ourselves, so only flush outstanding work upon suspend!
+*/
if (state != FBINFO_STATE_RUNNING)
-   nouveau_fbcon_accel_save_disable(dev);
-   console_unlock();
+   flush_work(&drm->fbdev_suspend_work);
+   console_lock();
+   } else {
+   /*
+* The console lock can be pretty contented on resume due
+* to all the printk activity.  Try to keep it out of the hot
+* path of resume if possible.  This also prevents a deadlock
+* with FBIOPUT_CON2FBMAP.
+*/
+   WARN_ON(state != FBINFO_STATE_RUNNING);
+   if (!console_trylock()) {
+   schedule_work(&drm->fbdev_suspend_work);
+   return;
+   }
}
+
+   if (state == FBINFO_STATE_RUNNING)
+   nouveau_fbcon_accel_restore(dev);
+   drm_fb_helper_set_suspend(&drm->fbcon->helper, state);
+   i

Re: [Nouveau] [PATCH] drm/nouveau/fbcon: fix deadlock with FBIOPUT_CON2FBMAP

2016-07-12 Thread Peter Wu
On Tue, Jul 12, 2016 at 09:16:22PM +0200, Lukas Wunner wrote:
> On Tue, Jul 12, 2016 at 06:49:34PM +0200, Peter Wu wrote:
> > The FBIOPUT_CON2FBMAP ioctl takes a console_lock(). When this is called
> > while nouveau was runtime suspended, a deadlock would occur due to
> > nouveau_fbcon_set_suspend also trying to obtain console_lock().
> > 
> > Fix this by delaying the drm_fb_helper_set_suspend call. Based on the
> > i915 code (which was done for performance reasons though).
> > 
> > Cc: Chris Wilson 
> > Cc: Daniel Vetter 
> > Signed-off-by: Peter Wu 
> > ---
> > Tested on top of v4.7-rc5, the deadlock is gone.
> 
> Hm, how did you trigger this deadlock?
> 
> Thanks,
> Lukas

Here is a small Python script with hardcoded values:

#!/usr/bin/env python3
# see drivers/video/fbdev/core/fbmem.c ->
# drivers/video/console/fbcon.c for FB_EVENT_SET_CONSOLE_MAP
import array, fcntl
FBIOPUT_CON2FBMAP = 0x4610
console, framebuffer = 6, 1
with open("/dev/fb0") as f:
info = array.array('I', [console, framebuffer])
fcntl.ioctl(f, FBIOPUT_CON2FBMAP, info)

Ensure that the nouveau card is sleeping, then invoke:

python3 con2fbmap.py

If you check /proc/`pidof python3`/stack or the dmesg spew 120 seconds
later, you will see a trace like this on a kernel without this patch:

[   60.738089] snd_hda_intel :01:00.1: Disabling via vga_switcheroo
[   60.739810] nouveau :01:00.0: DRM: suspending console...
[   60.740090] nouveau :01:00.0: DRM: suspending display...
[   60.740581] nouveau :01:00.0: DRM: evicting buffers...
[   60.740718] nouveau :01:00.0: DRM: waiting for kernel channels to go 
idle...
[   60.741096] nouveau :01:00.0: DRM: suspending client object trees...
[   60.748015] nouveau :01:00.0: DRM: suspending kernel object tree...
[   62.598156] nouveau :01:00.0: power state changed by ACPI to D3cold
[   66.883880] nouveau :01:00.0: power state changed by ACPI to D0
[   66.883987] nouveau :01:00.0: restoring config space at offset 0x4 (was 
0x100403, writing 0x100407)
[   66.884017] nouveau :01:00.0: calling nv_msi_ht_cap_quirk_leaf+0x0/0x30
[   66.884032] nouveau :01:00.0: DRM: resuming kernel object tree...
[   66.995505] nouveau :01:00.0: priv: GPC0: 419df4  (1f40820e)
[   66.995512] nouveau :01:00.0: priv: GPC1: 419df4  (1f40820e)
[   67.014829] nouveau :01:00.0: DRM: resuming client object trees...
[   67.014905] nouveau :01:00.0: DRM: resuming display...
[   67.014962] nouveau :01:00.0: DRM: resuming console...
[  240.619840] INFO: task con2fb:482 blocked for more than 120 seconds.
[  240.619844]   Not tainted 4.7.0-rc1kasan-00011-g5c72d90 #2
[  240.619845] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[  240.619858] con2fb  D 880769467378 25464   482447 0x
[  240.619864]  880769467378 845eb340 83ed1708 
00ff880769467330
[  240.619868]  8807762a06e0 8807762a0708 88077629fdd8 
88077629fdc0
[  240.619872]  880772bc6200 88076f221880 88076946 
ed00ed28c001
[  240.619874] Call Trace:
[  240.619881]  [] schedule+0x95/0x1b0
[  240.619885]  [] schedule_timeout+0x3d0/0x8b0
[  240.619889]  [] ? usleep_range+0xe0/0xe0
[  240.619894]  [] ? debug_lockdep_rcu_enabled+0x77/0x90
[  240.619897]  [] ? mark_held_locks+0xc8/0x120
[  240.619901]  [] ? _raw_spin_unlock_irq+0x2c/0x30
[  240.619904]  [] ? trace_hardirqs_on_caller+0x3f9/0x580
[  240.619907]  [] __down+0xff/0x1d0
[  240.619911]  [] ? ww_mutex_unlock+0x270/0x270
[  240.619925]  [] ? _dev_info+0xc2/0xf0
[  240.619929]  [] down+0x63/0x80
[  240.619933]  [] console_lock+0x1e/0x70
[  240.620012]  [] nouveau_fbcon_set_suspend+0x71/0x390 
[nouveau]
[  240.620085]  [] nouveau_do_resume+0x2e2/0x380 [nouveau]
[  240.620157]  [] nouveau_pmops_runtime_resume+0xce/0x210 
[nouveau]
[  240.620163]  [] ? pci_restore_standard_config+0x70/0x70
[  240.620167]  [] pci_pm_runtime_resume+0x130/0x220
[  240.620171]  [] ? pci_restore_standard_config+0x70/0x70
[  240.620175]  [] __rpm_callback+0x62/0xe0
[  240.620179]  [] ? pci_restore_standard_config+0x70/0x70
[  240.620182]  [] rpm_callback+0x168/0x210
[  240.620186]  [] ? pci_restore_standard_config+0x70/0x70
[  240.620189]  [] rpm_resume+0xbc3/0x1880
[  240.620193]  [] ? 
pm_runtime_autosuspend_expiration+0x60/0x60
[  240.620196]  [] ? __pm_runtime_resume+0x6a/0xa0
[  240.620200]  [] __pm_runtime_resume+0x78/0xa0
[  240.620270]  [] nouveau_fbcon_open+0xd0/0x120 [nouveau]
[  240.620274]  [] con2fb_acquire_newinfo+0xc7/0x2c0
[  240.620277]  [] set_con2fb_map+0x728/0xcb0
[  240.620281]  [] fbcon_event_notify+0xaac/0x1f90
[  240.620285]  [] notifier_call_chain+0xc9/0x130
[  240.620288]  [] __blocking_notifier_call_chain+0x70/0xb0
[  240.620292]  [] blocking_notifier_cal

Re: [Nouveau] [PATCH] drm/nouveau/fbcon: fix deadlock with FBIOPUT_CON2FBMAP

2016-07-13 Thread Peter Wu
On Wed, Jul 13, 2016 at 11:54:49AM +0200, Daniel Vetter wrote:
> On Tue, Jul 12, 2016 at 06:49:34PM +0200, Peter Wu wrote:
> > The FBIOPUT_CON2FBMAP ioctl takes a console_lock(). When this is called
> > while nouveau was runtime suspended, a deadlock would occur due to
> > nouveau_fbcon_set_suspend also trying to obtain console_lock().
> > 
> > Fix this by delaying the drm_fb_helper_set_suspend call. Based on the
> > i915 code (which was done for performance reasons though).
> > 
> > Cc: Chris Wilson 
> > Cc: Daniel Vetter 
> > Signed-off-by: Peter Wu 
> > ---
> > Tested on top of v4.7-rc5, the deadlock is gone.
> 
> If we bother with this, it should imo be moved into the drm_fb_helper.c
> function drm_fb_helper_set_suspend(). But this also smells like some kind
> of bad duct-tape. I think Lukas is working on some other rpm vs. fbdev
> deadlocks, maybe we could fix them all with one proper fix? I've made some
> comments on Lukas' last patch series.

This patch is only needed for drivers that use console_lock (for
drm_fb_helper_set_suspend) in their runtime resume functions.
Lukas posted fixes for runtime PM reference leaks, those are different
from this deadlock (see
https://lists.freedesktop.org/archives/dri-devel/2016-July/113005.html
for a backtrace for this issue).

The deadlock could also be avoided if the device backing the fbcon is
somehow runtime-resumed outside the lock, but that feels like a larger
hack that does not seem easy.

The i915 patch was done to reduce resume time (due to console_lock
contention), that feature seems useful to all other drivers too even if
the deadlock is fixed in a different way.

My current plan is to move stuff out of the lock and allow (just)
resuming the console to be delayed.  Some drivers (nouveau,
radeon/amdgpu, i915) do unnecessary stuff under the console lock:

 - nouveau: I *think* that cleraing/setting FBINFO_HWACCEL_DISABLED
   (nouveau_fbcon_accel_restore) is safe outside the lock as the fb is
   already suspended before clearing/after setting the flag.
 - radeon: since the console is suspended, I don't think that that all
   of the code is radeon_resume_kms is really needed.
 - amdgpu: same as radeon. Btw, console_lock is leaked on an error path.
 - i915: I think that clearing the fb memory can be done outside the
   lock too as the console is suspended.

Please correct me if my assumptions are flawed.

> Besides this, when fixing a deadlock pls provide more details about the
> precise callchain and the locks involved in the deadlock. If you
> discovered this using lockdep, then just add the entire lockdep splat to
> the commit message. Otherwise there's lots of guesswork involved here.
> -Daniel

There was no lockdep splat, it was triggered via the ioctl in the commit
message. I'll include the verbose trace from the previous mail in the
next proposed patch to reduce hunting though.

Peter

> > ---
> >  drivers/gpu/drm/nouveau/nouveau_drm.c   |  4 +--
> >  drivers/gpu/drm/nouveau/nouveau_drv.h   |  1 +
> >  drivers/gpu/drm/nouveau/nouveau_fbcon.c | 54 
> > -
> >  drivers/gpu/drm/nouveau/nouveau_fbcon.h |  2 +-
> >  4 files changed, 50 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
> > b/drivers/gpu/drm/nouveau/nouveau_drm.c
> > index 11f8dd9..f9a2c10 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> > @@ -552,7 +552,7 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
> >  
> > if (dev->mode_config.num_crtc) {
> > NV_INFO(drm, "suspending console...\n");
> > -   nouveau_fbcon_set_suspend(dev, 1);
> > +   nouveau_fbcon_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
> > NV_INFO(drm, "suspending display...\n");
> > ret = nouveau_display_suspend(dev, runtime);
> > if (ret)
> > @@ -635,7 +635,7 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
> > NV_INFO(drm, "resuming display...\n");
> > nouveau_display_resume(dev, runtime);
> > NV_INFO(drm, "resuming console...\n");
> > -   nouveau_fbcon_set_suspend(dev, 0);
> > +   nouveau_fbcon_set_suspend(dev, FBINFO_STATE_RUNNING, false);
> > }
> >  
> > return 0;
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
> > b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > index 822a021..a743d19 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_drv.h
> > +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > @@ -147,6 +147,7 @@ struct nouveau_drm {
> > struct 

Re: [Nouveau] vga_switcheroo audio runpm

2016-07-14 Thread Peter Wu
On Sun, Jul 10, 2016 at 03:20:13PM +0200, Lukas Wunner wrote:
> Hi Peter,
> 
> > [12:42] Lekensteyn: Should the video card always be powered up when a
> >   register is read from the HDMI audio controller? Or would it be
> >   better to leave the video card suspended and just fail the HDA
> >   register reads?
> 
> It should probably be powered up.

Seems sensible, a video signal is apparently also required if you want
to play audio.

> > [12:43] Lekensteyn: I'm trying to figure out how vga_switcheroo should
> handle this, maybe l1k has an idea?
> > [12:48] Lekensteyn: The current implemented policy is that the video card
> >   dictates the power state and that the HDMI audio device has to
> >   adapt (even if it is seemingly in use)
> 
> This current architecture seems to have come about historically (Dave
> Airlie first implemented vga_switcheroo with manual power control,
> then added runtime pm in a second step).
> 
> It may also be motivated by the fact that the driver core is currently
> not supporting dependencies between devices beyond the hierarchical
> parent/child relationship.
> 
> Rafael Wysocki (cc:) posted a series in January addressing the latter
> problem with so-called "device links". The series was reposted recently
> by Marek Szyprowski:
> https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1170031.html
> 
> The vga_switcheroo audio handling should probably be reworked to use such
> "device links". The result would be that the GPU won't runtime suspend
> as long as a ref is held for the audio card. The audio card needs to then
> make sure that it releases any refs if it has nothing to do.
> 
> The "device links" series seems to impose more restrictions than we
> actually need here, it seems to require that the "supplier" is bound
> to a driver before the "consumer" may probe. IOW nouveau needs to be
> bound before snd_hda_audio can probe. I'm not sure if that additional
> (unnecessary) restriction is a problem.

The device links feature looks promising. My initial guess that it is OK
to wait for nouveau to become available (as supplier), the audio port
(as consumer) probably does not work anyway without a video signal.

> I've recently tried to add runtime pm for muxed laptops to vga_switcheroo,
> this is something that Pierre Moreau (cc:) has expressed an interest in
> for his MacBook Pro. I came across a major roadblock in the form of
> vga_switcheroo_set_dynamic_switch(). That function is called from the
> DRM driver when the GPU runtime suspends and resumes. It takes the
> vgasr_mutex. The problem is, if the user initiates a switch of the mux,
> that mutex is already taken in vga_switcheroo_debugfs_write(). If the
> GPU we're switching to is asleep, it'll wake up for the switch and
> we'll get a deadlock when the DRM driver subsequently calls
> vga_switcheroo_set_dynamic_switch().
> 
> I would like to get rid of vga_switcheroo_set_dynamic_switch() to solve
> this. The function has two purposes: Number one, vga_switcheroo updates
> its internal power state representation for the GPU. That is actually
> pointless for driver power control because we can always query the
> driver core for this information (e.g. pm_runtime_suspended()).
> The second purpose is to switch the audio client on and off. If we would
> use a "device link" to express the dependency between the audio device
> and the GPU, we wouldn't need this. So moving to "device links" is
> a prerequisite to make runtime pm work for muxed laptops.

This internal state is likely a historical artifact due to the manual
control (ON / OFF) that was needed in the past. I have recently tried to
draw the dependencies on paper and the suspend/resume to dynamic switch
flow is not the prettiest ;) Using runtime pm would probably make the
dependencies clearer in a logical way.

> If you want to take a stab at using "device links" for vga_switcheroo
> then please go ahead as I'm swamped with other work. (I've reverse-
> engineered retrieval of Apple device properties from EFI this week,
> which is needed for Thunderbolt.) Let me know if you have any questions
> or need stuff reviewed or whatever. Since the "device links" series
> hasn't landed yet, it's still possible I think to ask for feature
> requests should it not work for this particular use case. The
> linux...@vger.kernel.org mailing list is the right place to inquire
> about the series.
> 
> Thanks for raising this important question.

I'll give this a go after finishing the PR3 nouveau patches and fixing
some locking issues.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau/fbcon: fix deadlock with FBIOPUT_CON2FBMAP

2016-07-15 Thread Peter Wu
On Wed, Jul 13, 2016 at 04:57:19PM +0200, Daniel Vetter wrote:
> On Wed, Jul 13, 2016 at 02:40:50PM +0200, Peter Wu wrote:
> > On Wed, Jul 13, 2016 at 11:54:49AM +0200, Daniel Vetter wrote:
> > > On Tue, Jul 12, 2016 at 06:49:34PM +0200, Peter Wu wrote:
> > > > The FBIOPUT_CON2FBMAP ioctl takes a console_lock(). When this is called
> > > > while nouveau was runtime suspended, a deadlock would occur due to
> > > > nouveau_fbcon_set_suspend also trying to obtain console_lock().
> > > > 
> > > > Fix this by delaying the drm_fb_helper_set_suspend call. Based on the
> > > > i915 code (which was done for performance reasons though).
> > > > 
> > > > Cc: Chris Wilson 
> > > > Cc: Daniel Vetter 
> > > > Signed-off-by: Peter Wu 
> > > > ---
> > > > Tested on top of v4.7-rc5, the deadlock is gone.
> > > 
> > > If we bother with this, it should imo be moved into the drm_fb_helper.c
> > > function drm_fb_helper_set_suspend(). But this also smells like some kind
> > > of bad duct-tape. I think Lukas is working on some other rpm vs. fbdev
> > > deadlocks, maybe we could fix them all with one proper fix? I've made some
> > > comments on Lukas' last patch series.
> > 
> > This patch is only needed for drivers that use console_lock (for
> > drm_fb_helper_set_suspend) in their runtime resume functions.
> > Lukas posted fixes for runtime PM reference leaks, those are different
> > from this deadlock (see
> > https://lists.freedesktop.org/archives/dri-devel/2016-July/113005.html
> > for a backtrace for this issue).
> > 
> > The deadlock could also be avoided if the device backing the fbcon is
> > somehow runtime-resumed outside the lock, but that feels like a larger
> > hack that does not seem easy.
> > 
> > The i915 patch was done to reduce resume time (due to console_lock
> > contention), that feature seems useful to all other drivers too even if
> > the deadlock is fixed in a different way.
> 
> I might have imagined something, but I thought Lukas is already working on
> some rpm vs. vga_switcheroo inversions. But looking again this was on the
> audio side.
> 
> I think the proper solution for fbcon would be for the fbdev emulation to
> also hold a runtime pm references when the console is in use.

nouveau does this by adding a fb_open and fb_release function that calls
pm_runtime_get and pm_runtime_put respectively (and is the only driver
doing this). So that is why it causes the console_lock issue for
nouveau, but not for others.

> This should already happen correctly for vblank, the more tricky part
> is fbdev mmap and fbcon:
> 
> - I have no idea, but there should be a way to intercept fbdev userspace
>   mmaps and make sure that as long as an app has the fbdev mmapped we
>   don't runtime suspend. No one really should be doing that (at least for
>   normal setups), hence this shouldn't result in unsafe.

mmap normally needs a fd right? When the chardev /dev/fbX is opened, the
fb_open function will be called. So nouveau should not have this issue
with mmap/read/write to a fb while the device is suspended.

(This RPM thing was added in f231976c2e89 ("drm/nouveau/fbcon: take
runpm reference when userspace has an open fd"), maybe it is not a bad
idea for other drivers with RPM support either.)

> - For fbcon, we could suspend it in the dpms off callbacks (maybe with a
>   timer), and resume it only when enabling fbcon again - fbcon needs to
>   redraw anyway on dpms on.

Would this guarantee that the fb is suspended before/during suspend (of
the graphics device) and resumed somewhere during/after resume?

Suspending the fb also has the effect that reads/writes to /dev/fbN
fail, maybe that is a bit too restricted since the framebuffer is just
accessible until the device is suspended.

(Hmm, fb_read/fb_write does not seem to do any locking...)

>   Another solution for fbcon might be to untangle the suspend/resume stuff
>   and protect it by something else than console_lock. But that means
>   fixing up fbcon locking horror shows.

console_lock seems needed for some code down the call stack, removing it
risks some blow ups.

Some archaeology: this locking problem was introduced with 054430e773c9
("fbcon: fix locking harder"). In the past fb_set_suspend also took the
fb_info lock but that was removed in 9e769ff3f585 ("fb: avoid possible
deadlock caused by fb_set_suspend").

Peter

> > My current plan is to move stuff out of the lock and allow (just)
> > resuming the console to be delayed.  Some drivers (nouveau,
> > radeon/amdgpu, i915) do unnecessary stuff under the console lock:
> > 
&g

Re: [Nouveau] [PATCH] drm/nouveau/fbcon: fix deadlock with FBIOPUT_CON2FBMAP

2016-07-15 Thread Peter Wu
On Wed, Jul 13, 2016 at 06:17:47PM +0100, Chris Wilson wrote:
> On Tue, Jul 12, 2016 at 06:49:34PM +0200, Peter Wu wrote:
> > The FBIOPUT_CON2FBMAP ioctl takes a console_lock(). When this is called
> > while nouveau was runtime suspended, a deadlock would occur due to
> > nouveau_fbcon_set_suspend also trying to obtain console_lock().
> > 
> > Fix this by delaying the drm_fb_helper_set_suspend call. Based on the
> > i915 code (which was done for performance reasons though).
> > 
> > Cc: Chris Wilson 
> > Cc: Daniel Vetter 
> > Signed-off-by: Peter Wu 
> > ---
> > Tested on top of v4.7-rc5, the deadlock is gone.
> > ---
> >  drivers/gpu/drm/nouveau/nouveau_drm.c   |  4 +--
> >  drivers/gpu/drm/nouveau/nouveau_drv.h   |  1 +
> >  drivers/gpu/drm/nouveau/nouveau_fbcon.c | 54 
> > -
> >  drivers/gpu/drm/nouveau/nouveau_fbcon.h |  2 +-
> >  4 files changed, 50 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
> > b/drivers/gpu/drm/nouveau/nouveau_drm.c
> > index 11f8dd9..f9a2c10 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> > @@ -552,7 +552,7 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
> >  
> > if (dev->mode_config.num_crtc) {
> > NV_INFO(drm, "suspending console...\n");
> > -   nouveau_fbcon_set_suspend(dev, 1);
> > +   nouveau_fbcon_set_suspend(dev, FBINFO_STATE_SUSPENDED, true);
> > NV_INFO(drm, "suspending display...\n");
> > ret = nouveau_display_suspend(dev, runtime);
> > if (ret)
> > @@ -635,7 +635,7 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
> > NV_INFO(drm, "resuming display...\n");
> > nouveau_display_resume(dev, runtime);
> > NV_INFO(drm, "resuming console...\n");
> > -   nouveau_fbcon_set_suspend(dev, 0);
> > +   nouveau_fbcon_set_suspend(dev, FBINFO_STATE_RUNNING, false);
> > }
> >  
> > return 0;
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
> > b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > index 822a021..a743d19 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_drv.h
> > +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > @@ -147,6 +147,7 @@ struct nouveau_drm {
> > struct nouveau_channel *channel;
> > struct nvkm_gpuobj *notify;
> > struct nouveau_fbdev *fbcon;
> > +   struct work_struct fbdev_suspend_work;
> > struct nvif_object nvsw;
> > struct nvif_object ntfy;
> > struct nvif_notify flip;
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c 
> > b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
> > index d1f248f..089156a 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
> > @@ -492,19 +492,53 @@ static const struct drm_fb_helper_funcs 
> > nouveau_fbcon_helper_funcs = {
> > .fb_probe = nouveau_fbcon_create,
> >  };
> >  
> > +static void nouveau_fbcon_suspend_worker(struct work_struct *work)
> > +{
> > +   nouveau_fbcon_set_suspend(container_of(work,
> > +  struct nouveau_drm,
> > +  fbdev_suspend_work)->dev,
> > + FBINFO_STATE_RUNNING,
> > + true);
> > +}
> > +
> >  void
> > -nouveau_fbcon_set_suspend(struct drm_device *dev, int state)
> > +nouveau_fbcon_set_suspend(struct drm_device *dev, int state, bool 
> > synchronous)
> >  {
> > struct nouveau_drm *drm = nouveau_drm(dev);
> > -   if (drm->fbcon) {
> > -   console_lock();
> > -   if (state == FBINFO_STATE_RUNNING)
> > -   nouveau_fbcon_accel_restore(dev);
> > -   drm_fb_helper_set_suspend(&drm->fbcon->helper, state);
> > +   if (!drm->fbcon)
> > +   return;
> > +
> > +   if (synchronous) {
> > +   /* Flush any pending work to turn the console on, and then
> > +* wait to turn it off. It must be synchronous as we are
> > +* about to suspend or unload the driver.
> > +*
> > +* Note that from within the work-handler, we cannot flush
> > +* ourselves, so only flush outstanding work upon suspend!
> > +*/
> > if (state != FBINFO_STATE_RUNNING)
> &g

[Nouveau] [PATCH v3 1/4] drm/nouveau/acpi: ensure matching ACPI handle and supported functions

2016-07-15 Thread Peter Wu
Ensure that the returned set of supported DSM functions (MUX, Optimus)
match the ACPI handle that is set in nouveau_dsm_pci_probe.

As there are no machines with a MUX function on just one PCI device and
an Optimus on another, there should not be a functional impact. This
change however makes this implicit assumption more obvious.

Convert int to bool and rename has_dsm to has_mux while at it. Let the
caller set nouveau_dsm_priv.dhandle as needed.

 v2: pass dhandle to the caller.

Reviewed-by: Hans de Goede 
Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 58 +++---
 1 file changed, 26 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index db76b94..886a67c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -57,9 +57,6 @@ bool nouveau_is_v1_dsm(void) {
return nouveau_dsm_priv.dsm_detected;
 }
 
-#define NOUVEAU_DSM_HAS_MUX 0x1
-#define NOUVEAU_DSM_HAS_OPT 0x2
-
 #ifdef CONFIG_VGA_SWITCHEROO
 static const char nouveau_dsm_muid[] = {
0xA0, 0xA0, 0x95, 0x9D, 0x60, 0x00, 0x48, 0x4D,
@@ -212,26 +209,33 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
.get_client_id = nouveau_dsm_get_client_id,
 };
 
-static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
+static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle 
*dhandle_out,
+ bool *has_mux, bool *has_opt)
 {
acpi_handle dhandle;
-   int retval = 0;
+   bool supports_mux;
+   bool supports_opt;
 
dhandle = ACPI_HANDLE(&pdev->dev);
if (!dhandle)
-   return false;
+   return;
 
if (!acpi_has_method(dhandle, "_DSM"))
-   return false;
+   return;
 
-   if (acpi_check_dsm(dhandle, nouveau_dsm_muid, 0x0102,
-  1 << NOUVEAU_DSM_POWER))
-   retval |= NOUVEAU_DSM_HAS_MUX;
+   supports_mux = acpi_check_dsm(dhandle, nouveau_dsm_muid, 0x0102,
+ 1 << NOUVEAU_DSM_POWER);
+   supports_opt = nouveau_check_optimus_dsm(dhandle);
 
-   if (nouveau_check_optimus_dsm(dhandle))
-   retval |= NOUVEAU_DSM_HAS_OPT;
+   /* Does not look like a Nvidia device. */
+   if (!supports_mux && !supports_opt)
+   return;
 
-   if (retval & NOUVEAU_DSM_HAS_OPT) {
+   *dhandle_out = dhandle;
+   *has_mux = supports_mux;
+   *has_opt = supports_opt;
+
+   if (supports_opt) {
uint32_t result;
nouveau_optimus_dsm(dhandle, NOUVEAU_DSM_OPTIMUS_CAPS, 0,
&result);
@@ -240,10 +244,6 @@ static int nouveau_dsm_pci_probe(struct pci_dev *pdev)
 (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic power, " 
: "",
 (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios codec 
supported" : "");
}
-   if (retval)
-   nouveau_dsm_priv.dhandle = dhandle;
-
-   return retval;
 }
 
 static bool nouveau_dsm_detect(void)
@@ -251,11 +251,11 @@ static bool nouveau_dsm_detect(void)
char acpi_method_name[255] = { 0 };
struct acpi_buffer buffer = {sizeof(acpi_method_name), 
acpi_method_name};
struct pci_dev *pdev = NULL;
-   int has_dsm = 0;
-   int has_optimus = 0;
+   acpi_handle dhandle = NULL;
+   bool has_mux = false;
+   bool has_optimus = false;
int vga_count = 0;
bool guid_valid;
-   int retval;
bool ret = false;
 
/* lookup the MXM GUID */
@@ -268,32 +268,26 @@ static bool nouveau_dsm_detect(void)
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != 
NULL) {
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
-   if (retval & NOUVEAU_DSM_HAS_MUX)
-   has_dsm |= 1;
-   if (retval & NOUVEAU_DSM_HAS_OPT)
-   has_optimus = 1;
+   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus);
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
-   retval = nouveau_dsm_pci_probe(pdev);
-   if (retval & NOUVEAU_DSM_HAS_MUX)
-   has_dsm |= 1;
-   if (retval & NOUVEAU_DSM_HAS_OPT)
-   has_optimus = 1;
+   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus);
}
 
/* find the optimus DSM or the old v1 DSM */
-   if (has_optimus == 1) {
+   if (has_optimus) {
+   nouveau_dsm_priv.dhandle = dhandle;
acpi_get_name(nouveau_dsm_priv.dhandl

[Nouveau] [PATCH v3 0/4] nouveau RPM fixes for Optimus (final)

2016-07-15 Thread Peter Wu
Hi,

Here are two patches to fix an issue reported on kernel bugzilla (infinite loop
due to unchecked function) and a more important fix to fix hanging Optimus
machines when runtime PM is enabled (with pm/pci patches).

These are the final patches targeting v4.8. Changes compared to v2[1]:
collected R-b from Hans and Mika and fixed a minor comment style issue.

I recommend it to be merged before the pci/pm patches[2], otherwise there is a
window where newer Nvidia Optimus laptops might fail to runtime resume and/or
lock up.  Once the pci/pm branch is merged I will propose another patch to
improve reliability[3].

Known issue with patch 4: when a Nvidia HDMI audio function is present, the
bridge will not suspend and hence the Nvidia card will still be powered. Fixing
this properly will require more work[4], until then you can kill the audio
device and make runtime PM work properly:

echo 1 > /sys/bus/pci/devices/:01:00.1/remove

Kind regards,
Peter

 [1]: https://lists.freedesktop.org/archives/nouveau/2016-July/025519.html
 [2]: https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/?h=pci/pm
 [3]: http://www.spinics.net/lists/linux-pci/msg52601.html
 [4]: https://lists.freedesktop.org/archives/dri-devel/2016-July/112759.html

Peter Wu (4):
  drm/nouveau/acpi: ensure matching ACPI handle and supported functions
  drm/nouveau/acpi: return supported DSM functions
  drm/nouveau/acpi: check for function 0x1B before using it
  drm/nouveau/acpi: fix lockup with PCIe runtime PM

 drivers/gpu/drm/nouveau/nouveau_acpi.c | 105 +
 1 file changed, 68 insertions(+), 37 deletions(-)

-- 
2.9.0

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH v3 2/4] drm/nouveau/acpi: return supported DSM functions

2016-07-15 Thread Peter Wu
Return the set of supported functions to the caller. No functional
changes.

Reviewed-by: Hans de Goede 
Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 886a67c..572ac30 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -107,7 +107,7 @@ static int nouveau_optimus_dsm(acpi_handle handle, int 
func, int arg, uint32_t *
  * requirements on the fourth parameter, so a private implementation
  * instead of using acpi_check_dsm().
  */
-static int nouveau_check_optimus_dsm(acpi_handle handle)
+static int nouveau_dsm_get_optimus_functions(acpi_handle handle)
 {
int result;
 
@@ -122,7 +122,9 @@ static int nouveau_check_optimus_dsm(acpi_handle handle)
 * ACPI Spec v4 9.14.1: if bit 0 is zero, no function is supported.
 * If the n-th bit is enabled, function n is supported
 */
-   return result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS);
+   if (result & 1 && result & (1 << NOUVEAU_DSM_OPTIMUS_CAPS))
+   return result;
+   return 0;
 }
 
 static int nouveau_dsm(acpi_handle handle, int func, int arg)
@@ -214,7 +216,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
 {
acpi_handle dhandle;
bool supports_mux;
-   bool supports_opt;
+   int optimus_funcs;
 
dhandle = ACPI_HANDLE(&pdev->dev);
if (!dhandle)
@@ -225,17 +227,17 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
 
supports_mux = acpi_check_dsm(dhandle, nouveau_dsm_muid, 0x0102,
  1 << NOUVEAU_DSM_POWER);
-   supports_opt = nouveau_check_optimus_dsm(dhandle);
+   optimus_funcs = nouveau_dsm_get_optimus_functions(dhandle);
 
/* Does not look like a Nvidia device. */
-   if (!supports_mux && !supports_opt)
+   if (!supports_mux && !optimus_funcs)
return;
 
*dhandle_out = dhandle;
*has_mux = supports_mux;
-   *has_opt = supports_opt;
+   *has_opt = !!optimus_funcs;
 
-   if (supports_opt) {
+   if (optimus_funcs) {
uint32_t result;
nouveau_optimus_dsm(dhandle, NOUVEAU_DSM_OPTIMUS_CAPS, 0,
&result);
-- 
2.9.0

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH v3 3/4] drm/nouveau/acpi: check for function 0x1B before using it

2016-07-15 Thread Peter Wu
Do not unconditionally invoke function 0x1B without checking for its
availability, it leads to an infinite loop on some firmware.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=104791
Fixes: 5addcf0a5f0fad ("nouveau: add runtime PM support (v0.9)")
Reviewed-by: Hans de Goede 
Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index 572ac30..ad273ad 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -45,6 +45,7 @@
 static struct nouveau_dsm_priv {
bool dsm_detected;
bool optimus_detected;
+   bool optimus_flags_detected;
acpi_handle dhandle;
acpi_handle rom_handle;
 } nouveau_dsm_priv;
@@ -212,7 +213,8 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
 };
 
 static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle 
*dhandle_out,
- bool *has_mux, bool *has_opt)
+ bool *has_mux, bool *has_opt,
+ bool *has_opt_flags)
 {
acpi_handle dhandle;
bool supports_mux;
@@ -236,6 +238,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
*dhandle_out = dhandle;
*has_mux = supports_mux;
*has_opt = !!optimus_funcs;
+   *has_opt_flags = optimus_funcs & (1 << NOUVEAU_DSM_OPTIMUS_FLAGS);
 
if (optimus_funcs) {
uint32_t result;
@@ -256,6 +259,7 @@ static bool nouveau_dsm_detect(void)
acpi_handle dhandle = NULL;
bool has_mux = false;
bool has_optimus = false;
+   bool has_optimus_flags = false;
int vga_count = 0;
bool guid_valid;
bool ret = false;
@@ -270,13 +274,15 @@ static bool nouveau_dsm_detect(void)
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != 
NULL) {
vga_count++;
 
-   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus);
+   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus,
+ &has_optimus_flags);
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
-   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus);
+   nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus,
+ &has_optimus_flags);
}
 
/* find the optimus DSM or the old v1 DSM */
@@ -287,6 +293,7 @@ static bool nouveau_dsm_detect(void)
printk(KERN_INFO "VGA switcheroo: detected Optimus DSM method 
%s handle\n",
acpi_method_name);
nouveau_dsm_priv.optimus_detected = true;
+   nouveau_dsm_priv.optimus_flags_detected = has_optimus_flags;
ret = true;
} else if (vga_count == 2 && has_mux && guid_valid) {
nouveau_dsm_priv.dhandle = dhandle;
@@ -320,8 +327,9 @@ void nouveau_switcheroo_optimus_dsm(void)
if (!nouveau_dsm_priv.optimus_detected)
return;
 
-   nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, NOUVEAU_DSM_OPTIMUS_FLAGS,
-   0x3, &result);
+   if (nouveau_dsm_priv.optimus_flags_detected)
+   nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, 
NOUVEAU_DSM_OPTIMUS_FLAGS,
+   0x3, &result);
 
nouveau_optimus_dsm(nouveau_dsm_priv.dhandle, NOUVEAU_DSM_OPTIMUS_CAPS,
NOUVEAU_DSM_OPTIMUS_SET_POWERDOWN, &result);
-- 
2.9.0

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH v3 4/4] drm/nouveau/acpi: fix lockup with PCIe runtime PM

2016-07-15 Thread Peter Wu
Since "PCI: Add runtime PM support for PCIe ports", the parent PCIe port
can be runtime-suspended which disables power resources via ACPI. This
is incompatible with DSM, resulting in a GPU device which is still in D3
and locks up the kernel on resume (on a Clevo P651RA, GTX965M).

Mirror the behavior of Windows 8 and newer[1] (as observed via an AMLi
debugger trace) and stop using the DSM functions for D3cold when power
resources are available on the parent PCIe port.

pci_d3cold_disable() is not used because on some machines, the old DSM
method is broken. On a Lenovo T440p (GT 730M) memory and disk corruption
would occur, but that is fixed with this patch[2].

 [1]: 
https://msdn.microsoft.com/windows/hardware/drivers/bringup/firmware-requirements-for-d3cold
 [2]: 
https://github.com/Bumblebee-Project/bbswitch/issues/78#issuecomment-223549072

 v2: simply check directly for _PR3. Added affected machines.
 v3: fixed block comment coding style.

Reviewed-by: Mika Westerberg 
Signed-off-by: Peter Wu 
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 35 ++
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index ad273ad..f2ad17a 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -46,6 +46,7 @@ static struct nouveau_dsm_priv {
bool dsm_detected;
bool optimus_detected;
bool optimus_flags_detected;
+   bool optimus_skip_dsm;
acpi_handle dhandle;
acpi_handle rom_handle;
 } nouveau_dsm_priv;
@@ -212,9 +213,28 @@ static const struct vga_switcheroo_handler 
nouveau_dsm_handler = {
.get_client_id = nouveau_dsm_get_client_id,
 };
 
+/*
+ * Firmware supporting Windows 8 or later do not use _DSM to put the device 
into
+ * D3cold, they instead rely on disabling power resources on the parent.
+ */
+static bool nouveau_pr3_present(struct pci_dev *pdev)
+{
+   struct pci_dev *parent_pdev = pci_upstream_bridge(pdev);
+   struct acpi_device *parent_adev;
+
+   if (!parent_pdev)
+   return false;
+
+   parent_adev = ACPI_COMPANION(&parent_pdev->dev);
+   if (!parent_adev)
+   return false;
+
+   return acpi_has_method(parent_adev->handle, "_PR3");
+}
+
 static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle 
*dhandle_out,
  bool *has_mux, bool *has_opt,
- bool *has_opt_flags)
+ bool *has_opt_flags, bool *has_pr3)
 {
acpi_handle dhandle;
bool supports_mux;
@@ -239,6 +259,7 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
*has_mux = supports_mux;
*has_opt = !!optimus_funcs;
*has_opt_flags = optimus_funcs & (1 << NOUVEAU_DSM_OPTIMUS_FLAGS);
+   *has_pr3 = false;
 
if (optimus_funcs) {
uint32_t result;
@@ -248,6 +269,8 @@ static void nouveau_dsm_pci_probe(struct pci_dev *pdev, 
acpi_handle *dhandle_out
 (result & OPTIMUS_ENABLED) ? "enabled" : "disabled",
 (result & OPTIMUS_DYNAMIC_PWR_CAP) ? "dynamic power, " 
: "",
 (result & OPTIMUS_HDA_CODEC_MASK) ? "hda bios codec 
supported" : "");
+
+   *has_pr3 = nouveau_pr3_present(pdev);
}
 }
 
@@ -260,6 +283,7 @@ static bool nouveau_dsm_detect(void)
bool has_mux = false;
bool has_optimus = false;
bool has_optimus_flags = false;
+   bool has_power_resources = false;
int vga_count = 0;
bool guid_valid;
bool ret = false;
@@ -275,14 +299,14 @@ static bool nouveau_dsm_detect(void)
vga_count++;
 
nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus,
- &has_optimus_flags);
+ &has_optimus_flags, &has_power_resources);
}
 
while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_3D << 8, pdev)) != NULL) 
{
vga_count++;
 
nouveau_dsm_pci_probe(pdev, &dhandle, &has_mux, &has_optimus,
- &has_optimus_flags);
+ &has_optimus_flags, &has_power_resources);
}
 
/* find the optimus DSM or the old v1 DSM */
@@ -292,8 +316,11 @@ static bool nouveau_dsm_detect(void)
&buffer);
printk(KERN_INFO "VGA switcheroo: detected Optimus DSM method 
%s handle\n",
acpi_method_name);
+   if (has_power_resources)
+   pr_info("nouveau: detected PR support, will not use 
DSM\n&qu

Re: [Nouveau] [PATCH v3 0/4] nouveau RPM fixes for Optimus (final)

2016-07-15 Thread Peter Wu
On Fri, Jul 15, 2016 at 12:10:23PM -0400, Ilia Mirkin wrote:
> On Fri, Jul 15, 2016 at 9:12 AM, Peter Wu  wrote:
> > Hi,
> >
> > Here are two patches to fix an issue reported on kernel bugzilla (infinite 
> > loop
> > due to unchecked function) and a more important fix to fix hanging Optimus
> > machines when runtime PM is enabled (with pm/pci patches).
> >
> > These are the final patches targeting v4.8. Changes compared to v2[1]:
> > collected R-b from Hans and Mika and fixed a minor comment style issue.
> >
> > I recommend it to be merged before the pci/pm patches[2], otherwise there 
> > is a
> > window where newer Nvidia Optimus laptops might fail to runtime resume 
> > and/or
> > lock up.  Once the pci/pm branch is merged I will propose another patch to
> > improve reliability[3].
> >
> > Known issue with patch 4: when a Nvidia HDMI audio function is present, the
> > bridge will not suspend and hence the Nvidia card will still be powered. 
> > Fixing
> 
> That's basically all optimus gpu's, right? Anything GT21x+ has a HDMI
> audio subfunction, and prior to that, the nvidia gpu tended to be the
> only gpu, or hard-muxed.
> 
> If that's the case, that's pretty much a non-starter, IMO.

For some reason the audio function tends to disappear/hide, so maybe it
is not as problematic as it appears (see
https://bugs.freedesktop.org/show_bug.cgi?id=75985). For my laptop I
also had to runtime suspend/resume before lspci -H1 shows the device,
loading with runpm=0 didn't return my HDMI audio device.

The powered on issue will also only appear on devices produced in 2013
and newer that happen to have this ACPI _PR3 ACPI method (which is quite
common for new machines supporting Windows 8 though).

For these newer laptops, after the pci/pm merge and after a patch like
http://www.spinics.net/lists/linux-pci/msg52601.html, the user can
revert to the old DSM method by booting with pcie_port_pm=off which will
retain the current behavior.

The advantage of this patch is that it fixes memory corruption on some
devices. The risk is that the card stays on because the audio subsystem
needs some more work.  FWIW, I was working on some patches that properly
suspended in presence of the HDA controller, but somehow the audio
device was not properly resumed resulting in "no AFG or MFG node found"
and "snd_hda_intel :01:00.1: no codecs initialized".
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v3 0/4] nouveau RPM fixes for Optimus (final)

2016-07-15 Thread Peter Wu
On Fri, Jul 15, 2016 at 12:41:49PM -0400, Ilia Mirkin wrote:
> On Fri, Jul 15, 2016 at 12:36 PM, Peter Wu  wrote:
> > On Fri, Jul 15, 2016 at 12:10:23PM -0400, Ilia Mirkin wrote:
> >> On Fri, Jul 15, 2016 at 9:12 AM, Peter Wu  wrote:
> >> > Hi,
> >> >
> >> > Here are two patches to fix an issue reported on kernel bugzilla 
> >> > (infinite loop
> >> > due to unchecked function) and a more important fix to fix hanging 
> >> > Optimus
> >> > machines when runtime PM is enabled (with pm/pci patches).
> >> >
> >> > These are the final patches targeting v4.8. Changes compared to v2[1]:
> >> > collected R-b from Hans and Mika and fixed a minor comment style issue.
> >> >
> >> > I recommend it to be merged before the pci/pm patches[2], otherwise 
> >> > there is a
> >> > window where newer Nvidia Optimus laptops might fail to runtime resume 
> >> > and/or
> >> > lock up.  Once the pci/pm branch is merged I will propose another patch 
> >> > to
> >> > improve reliability[3].
> >> >
> >> > Known issue with patch 4: when a Nvidia HDMI audio function is present, 
> >> > the
> >> > bridge will not suspend and hence the Nvidia card will still be powered. 
> >> > Fixing
> >>
> >> That's basically all optimus gpu's, right? Anything GT21x+ has a HDMI
> >> audio subfunction, and prior to that, the nvidia gpu tended to be the
> >> only gpu, or hard-muxed.
> >>
> >> If that's the case, that's pretty much a non-starter, IMO.
> >
> > For some reason the audio function tends to disappear/hide, so maybe it
> > is not as problematic as it appears (see
> > https://bugs.freedesktop.org/show_bug.cgi?id=75985). For my laptop I
> 
> I'm aware of that bug. I believe this is an exceedingly rare scenario
> or it would have been reported a lot more.
> 
> > also had to runtime suspend/resume before lspci -H1 shows the device,
> > loading with runpm=0 didn't return my HDMI audio device.
> 
> Hm ok. Do you have the same laptop as the reporter of that bug?

Nope, I have a Clevo P651RA (GTX965M). That reporter has a Dell XPS 15,
but it also seems present for the Lenovo ThinkPad T420s (see comment on
bug), Asus N56VZ, MSI GT60 2PE, Dell L502x (Launchpad 1377653), Asus
G46vw (Ask Ubuntu user). There is another AU report for a GT 525M
(laptop brand/model unknown).

Maybe there are more affected users, but then they did not notice it
because they did not use HDMI audio.

> >
> > The powered on issue will also only appear on devices produced in 2013
> > and newer that happen to have this ACPI _PR3 ACPI method (which is quite
> > common for new machines supporting Windows 8 though).
> >
> > For these newer laptops, after the pci/pm merge and after a patch like
> > http://www.spinics.net/lists/linux-pci/msg52601.html, the user can
> > revert to the old DSM method by booting with pcie_port_pm=off which will
> > retain the current behavior.
> >
> > The advantage of this patch is that it fixes memory corruption on some
> > devices. The risk is that the card stays on because the audio subsystem
> > needs some more work.  FWIW, I was working on some patches that properly
> > suspended in presence of the HDA controller, but somehow the audio
> > device was not properly resumed resulting in "no AFG or MFG node found"
> > and "snd_hda_intel :01:00.1: no codecs initialized".
> 
> Does this restriction (runpm being broken in presence of the audio
> subfunction) only affect devices with _PR3? If so, that's a lot more
> palatable - I bet Windows 8+ is in an era when the display-less thing
> became more popular, and thus less likely to affect a ton of people.

Yes it only affects those devices with _PR3.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v3 0/4] nouveau RPM fixes for Optimus (final)

2016-07-15 Thread Peter Wu
On Fri, Jul 15, 2016 at 06:54:27PM +0200, Peter Wu wrote:
> On Fri, Jul 15, 2016 at 12:41:49PM -0400, Ilia Mirkin wrote:
> > On Fri, Jul 15, 2016 at 12:36 PM, Peter Wu  wrote:
> > > On Fri, Jul 15, 2016 at 12:10:23PM -0400, Ilia Mirkin wrote:
> > >> On Fri, Jul 15, 2016 at 9:12 AM, Peter Wu  wrote:
> > >> > Hi,
> > >> >
> > >> > Here are two patches to fix an issue reported on kernel bugzilla 
> > >> > (infinite loop
> > >> > due to unchecked function) and a more important fix to fix hanging 
> > >> > Optimus
> > >> > machines when runtime PM is enabled (with pm/pci patches).
> > >> >
> > >> > These are the final patches targeting v4.8. Changes compared to v2[1]:
> > >> > collected R-b from Hans and Mika and fixed a minor comment style issue.
> > >> >
> > >> > I recommend it to be merged before the pci/pm patches[2], otherwise 
> > >> > there is a
> > >> > window where newer Nvidia Optimus laptops might fail to runtime resume 
> > >> > and/or
> > >> > lock up.  Once the pci/pm branch is merged I will propose another 
> > >> > patch to
> > >> > improve reliability[3].
> > >> >
> > >> > Known issue with patch 4: when a Nvidia HDMI audio function is 
> > >> > present, the
> > >> > bridge will not suspend and hence the Nvidia card will still be 
> > >> > powered. Fixing
> > >>
> > >> That's basically all optimus gpu's, right? Anything GT21x+ has a HDMI
> > >> audio subfunction, and prior to that, the nvidia gpu tended to be the
> > >> only gpu, or hard-muxed.
> > >>
> > >> If that's the case, that's pretty much a non-starter, IMO.
> > >
> > > For some reason the audio function tends to disappear/hide, so maybe it
> > > is not as problematic as it appears (see
> > > https://bugs.freedesktop.org/show_bug.cgi?id=75985). For my laptop I
> > 
> > I'm aware of that bug. I believe this is an exceedingly rare scenario
> > or it would have been reported a lot more.
> > 
> > > also had to runtime suspend/resume before lspci -H1 shows the device,
> > > loading with runpm=0 didn't return my HDMI audio device.
> > 
> > Hm ok. Do you have the same laptop as the reporter of that bug?
> 
> Nope, I have a Clevo P651RA (GTX965M). That reporter has a Dell XPS 15,
> but it also seems present for the Lenovo ThinkPad T420s (see comment on
> bug), Asus N56VZ, MSI GT60 2PE, Dell L502x (Launchpad 1377653), Asus
> G46vw (Ask Ubuntu user). There is another AU report for a GT 525M
> (laptop brand/model unknown).
> 
> Maybe there are more affected users, but then they did not notice it
> because they did not use HDMI audio.
> 
> > >
> > > The powered on issue will also only appear on devices produced in 2013
> > > and newer that happen to have this ACPI _PR3 ACPI method (which is quite
> > > common for new machines supporting Windows 8 though).
> > >
> > > For these newer laptops, after the pci/pm merge and after a patch like
> > > http://www.spinics.net/lists/linux-pci/msg52601.html, the user can
> > > revert to the old DSM method by booting with pcie_port_pm=off which will
> > > retain the current behavior.
> > >
> > > The advantage of this patch is that it fixes memory corruption on some
> > > devices. The risk is that the card stays on because the audio subsystem
> > > needs some more work.  FWIW, I was working on some patches that properly
> > > suspended in presence of the HDA controller, but somehow the audio
> > > device was not properly resumed resulting in "no AFG or MFG node found"
> > > and "snd_hda_intel :01:00.1: no codecs initialized".
> > 
> > Does this restriction (runpm being broken in presence of the audio
> > subfunction) only affect devices with _PR3? If so, that's a lot more
> > palatable - I bet Windows 8+ is in an era when the display-less thing
> > became more popular, and thus less likely to affect a ton of people.
> 
> Yes it only affects those devices with _PR3.

I downloaded all .tar.gz files from the big Launchpad bug that collects
DSDTs (and more recently also dmidecode/lspci) and ran an analysis. The
result (limited to files which actually had a lspci and dmidecode file):

 - 111 Nvidia video devices
 - 20 out of these have an audio device.
 - 18 use _PR3, 93 use DSM (or gmux).
 - Exactly zero use _PR3 and hav

Re: [Nouveau] [PATCH v3 0/4] nouveau RPM fixes for Optimus (final)

2016-07-27 Thread Peter Wu
Ping, would it be possible to get some acks and merge it for 4.8?
Current -next is broken (on modern laptops as expected) and these series
fix the issues according to an IRC report.

The audio issue mentioned below should not give issues, modern laptops
do not seem to expose the audio device by default
(https://bugs.freedesktop.org/show_bug.cgi?id=75985). On Windows the
audio device only appears after inserting the HDMI/miniDP cable (my
laptop has no other connectors), enabling the card for rendering
purposes has no effect on the availability of the audio device.

Kind regards,
Peter

On Fri, Jul 15, 2016 at 03:12:14PM +0200, Peter Wu wrote:
> Hi,
> 
> Here are two patches to fix an issue reported on kernel bugzilla (infinite 
> loop
> due to unchecked function) and a more important fix to fix hanging Optimus
> machines when runtime PM is enabled (with pm/pci patches).
> 
> These are the final patches targeting v4.8. Changes compared to v2[1]:
> collected R-b from Hans and Mika and fixed a minor comment style issue.
> 
> I recommend it to be merged before the pci/pm patches[2], otherwise there is a
> window where newer Nvidia Optimus laptops might fail to runtime resume and/or
> lock up.  Once the pci/pm branch is merged I will propose another patch to
> improve reliability[3].
> 
> Known issue with patch 4: when a Nvidia HDMI audio function is present, the
> bridge will not suspend and hence the Nvidia card will still be powered. 
> Fixing
> this properly will require more work[4], until then you can kill the audio
> device and make runtime PM work properly:
> 
> echo 1 > /sys/bus/pci/devices/:01:00.1/remove
> 
> Kind regards,
> Peter
> 
>  [1]: https://lists.freedesktop.org/archives/nouveau/2016-July/025519.html
>  [2]: https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/?h=pci/pm
>  [3]: http://www.spinics.net/lists/linux-pci/msg52601.html
>  [4]: https://lists.freedesktop.org/archives/dri-devel/2016-July/112759.html
> 
> Peter Wu (4):
>   drm/nouveau/acpi: ensure matching ACPI handle and supported functions
>   drm/nouveau/acpi: return supported DSM functions
>   drm/nouveau/acpi: check for function 0x1B before using it
>   drm/nouveau/acpi: fix lockup with PCIe runtime PM
> 
>  drivers/gpu/drm/nouveau/nouveau_acpi.c | 105 
> +
>  1 file changed, 68 insertions(+), 37 deletions(-)
> 
> -- 
> 2.9.0
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH] drm/nouveau/acpi: use DSM if bridge does not support D3cold

2016-08-25 Thread Peter Wu
Even if PR3 support is available on the bridge, it will not be used if
the PCI layer considers it unavailable (i.e. on all laptops from 2013
and 2014). Ensure that this condition is checked to allow a fallback to
the Optimus DSM for device poweroff.

Initially I wanted to call pci_d3cold_enable before checking bridge_d3
(in case the user changed d3cold_allowed), but that is such an unlikely
case and likely fragile anyway. The current patch is suggested by Mika
in http://www.spinics.net/lists/linux-pci/msg52599.html

Cc: Mika Westerberg 
Signed-off-by: Peter Wu 
---
Hi,

This idea is floating since July, but was blocked by the PCI D3cold patches.
Since these got into -rc1, here is the promised follow-up patch for v4.8.

Without this patch, some laptops from 2013 and 2014 will regress in v4.8 and
suck more power.

Kind regards,
Peter
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index f2ad17a..dc57b62 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -225,6 +225,17 @@ static bool nouveau_pr3_present(struct pci_dev *pdev)
if (!parent_pdev)
return false;
 
+   if (!parent_pdev->bridge_d3) {
+   /*
+* Parent PCI bridge is currently not power managed.
+* Since userspace can change these afterwards to be on
+* the safe side we stick with _DSM and prevent usage of
+* _PR3 from the bridge.
+*/
+   pci_d3cold_disable(pdev);
+   return false;
+   }
+
parent_adev = ACPI_COMPANION(&parent_pdev->dev);
if (!parent_adev)
return false;
-- 
2.9.3

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] Acer Aspire V7-582PG (Haswell, GTX 750M) fails to power off GPU via Power Resources

2016-10-26 Thread Peter Wu
Hi PCI/ACPI PM experts,

Since Linux 4.8, nouveau switched to rely on the PCIe port driver to
transition to D3cold. This however does not happen for an Acer Aspire
V7-582PG (Haswell, NVIDIA GTX 750M) from Rick.

Any idea why? acpidump, lspci, dmesg and other details can be found in
the linked bug below.

Kind regards,
Peter

On Wed, Oct 26, 2016 at 10:42:07PM +, bugzilla-dae...@freedesktop.org wrote:
> https://bugs.freedesktop.org/show_bug.cgi?id=98398
> 
> --- Comment #11 from Peter Wu  ---
> So 4.7 and before used the "DSM" method on runtime-suspend:
> - \_SB.PCI0.RP05.PEGP._DSM would be invoked to enable Optimus
> - \_SB.PCI0.RP05.PEGP._PS3 is then invoked which would enter D3cold
> (note, this method is still used in 4.8 on older laptops or with the
> pcie_pm_port=off kernel option)
> 
> Since 4.8, _DSM is not called anymore by nouveau (when support from the PCI
> core is detected) and this sequence should instead happen:
> - \_SB.PCI0.RP05.PEGP._PS3 (does nothing besides updating _STA)
> - PCIe core removes power for the PCIe port since all its children are in
>   D3 and are willing to transition to D3cold. It does so by invoking
>   \NVP3._OFF (where \NVP3 is the power resource from \_SB.PCI0.RP05._PR3)
> 
> That is how I think it should work in theory, but on Ricks laptop running
> 4.8.4,
> /sys/bus/devices/:1c.4/firmware_node/ does not have power_resources_D0
> devices (which I do have on my own laptop for :01:0).
> 
> The SSDT1 of Rick's Acer laptop shows this structure:
> 
> If (\_OSI ("Windows 2013"))
> {
> Scope (\_SB.PCI0.RP05)
> {
> //...
> Name (_PR0, Package (0x01)  // _PR0: Power Resources for D0
> {
> NVP3
> })
> Name (_PR2, Package (0x01)  // _PR2: Power Resources for D2
> {
> NVP2
> })
> Name (_PR3, Package (0x01)  // _PR3: Power Resources for D3hot
> {
> NVP3
> })
> // ...
> Method (_PS0, 0, NotSerialized)  // _PS0: Power State 0
> {
> }
> 
> Method (_PS3, 0, NotSerialized)  // _PS3: Power State 3
> {
> }
> }
> 
> Name (MSD3, Zero)
> PowerResource (NVP3, 0x00, 0x)
> {
> Name (_STA, One)  // _STA: Status
> // ...
> 
> Method (_ON, 0, NotSerialized)  // _ON_: Power On
> {
> // ...
> }
> 
> Method (_OFF, 0, NotSerialized)  // _OFF: Power Off
> {
> // ...
> }
> }
> 
> The dmesg does show "ACPI: Power Resource [NVP3] (on)", so I guess that the
> methods are found. It is a mystery to me why the "power_resources_Dx" files 
> are
> not created, possibly breaking PM.
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] Acer Aspire V7-582PG (Haswell, GTX 750M) fails to power off GPU via Power Resources

2016-10-27 Thread Peter Wu
On Thu, Oct 27, 2016 at 11:17:48AM +0300, Mika Westerberg wrote:
> On Thu, Oct 27, 2016 at 12:56:41AM +0200, Peter Wu wrote:
> > Hi PCI/ACPI PM experts,
> > 
> > Since Linux 4.8, nouveau switched to rely on the PCIe port driver to
> > transition to D3cold. This however does not happen for an Acer Aspire
> > V7-582PG (Haswell, NVIDIA GTX 750M) from Rick.
> > 
> > Any idea why? acpidump, lspci, dmesg and other details can be found in
> > the linked bug below.
> 
> > 
> > Kind regards,
> > Peter
> > 
> > On Wed, Oct 26, 2016 at 10:42:07PM +, bugzilla-dae...@freedesktop.org 
> > wrote:
> > > https://bugs.freedesktop.org/show_bug.cgi?id=98398
> > > 
> > > --- Comment #11 from Peter Wu  ---
> > > So 4.7 and before used the "DSM" method on runtime-suspend:
> > > - \_SB.PCI0.RP05.PEGP._DSM would be invoked to enable Optimus
> > > - \_SB.PCI0.RP05.PEGP._PS3 is then invoked which would enter D3cold
> > > (note, this method is still used in 4.8 on older laptops or with the
> > > pcie_pm_port=off kernel option)
> > > 
> > > Since 4.8, _DSM is not called anymore by nouveau (when support from the 
> > > PCI
> > > core is detected) and this sequence should instead happen:
> > > - \_SB.PCI0.RP05.PEGP._PS3 (does nothing besides updating _STA)
> > > - PCIe core removes power for the PCIe port since all its children are in
> > >   D3 and are willing to transition to D3cold. It does so by invoking
> > >   \NVP3._OFF (where \NVP3 is the power resource from \_SB.PCI0.RP05._PR3)
> > > 
> > > That is how I think it should work in theory, but on Ricks laptop running
> > > 4.8.4,
> > > /sys/bus/devices/:1c.4/firmware_node/ does not have power_resources_D0
> > > devices (which I do have on my own laptop for :01:0).
> > > 
> > > The SSDT1 of Rick's Acer laptop shows this structure:
> > > 
> > > If (\_OSI ("Windows 2013"))
> > > {
> > > Scope (\_SB.PCI0.RP05)
> > > {
> > > //...
> > > Name (_PR0, Package (0x01)  // _PR0: Power Resources for D0
> > > {
> > > NVP3
> > > })
> > > Name (_PR2, Package (0x01)  // _PR2: Power Resources for D2
> > > {
> > > NVP2
> > > })
> > > Name (_PR3, Package (0x01)  // _PR3: Power Resources for D3hot
> > > {
> > > NVP3
> > > })
> > > // ...
> > > Method (_PS0, 0, NotSerialized)  // _PS0: Power State 0
> > > {
> > > }
> > > 
> > > Method (_PS3, 0, NotSerialized)  // _PS3: Power State 3
> > > {
> > > }
> > > }
> > > 
> > > Name (MSD3, Zero)
> > > PowerResource (NVP3, 0x00, 0x)
> > > {
> > > Name (_STA, One)  // _STA: Status
> > > // ...
> > > 
> > > Method (_ON, 0, NotSerialized)  // _ON_: Power On
> > > {
> > > // ...
> > > }
> > > 
> > > Method (_OFF, 0, NotSerialized)  // _OFF: Power Off
> > > {
> > > // ...
> > > }
> > > }
> > > 
> > > The dmesg does show "ACPI: Power Resource [NVP3] (on)", so I guess that 
> > > the
> > > methods are found. It is a mystery to me why the "power_resources_Dx" 
> > > files are
> > > not created, possibly breaking PM.
> 
> The ASL code looks right to me (except for the NVP2 which never set _STA
> to 0 but should not affect here).
> 
> I wonder what does /sys/bus/pci/devices/:00:1c.4/firmware_node/path 
> contain?

The value is as expected, \_SB.PCI0.RP05:

/sys/bus/pci/devices/:00:1c.4/firmware_node/path:\_SB_.PCI0.RP05
/sys/bus/pci/devices/:00:1c.4/firmware_node/power_state:D3hot
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] Acer Aspire V7-582PG (Haswell, GTX 750M) fails to power off GPU via Power Resources

2016-10-27 Thread Peter Wu
bsystem -> 
../../../../../../../bus/acpi
-rw-r--r-- 1 root root 4096 26 okt 22:31 uevent

./LNXVIDEO:00/device:3a/power:
totaal 0
-rw-r--r-- 1 root root 4096 26 okt 22:31 async
-rw-r--r-- 1 root root 4096 26 okt 22:31 autosuspend_delay_ms
-rw-r--r-- 1 root root 4096 26 okt 22:31 control
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_active_kids
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_active_time
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_enabled
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_status
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_suspended_time
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_usage

./LNXVIDEO:00/input:
totaal 0
drwxr-xr-x 6 root root 0 26 okt 22:31 input4

./LNXVIDEO:00/input/input4:
totaal 0
drwxr-xr-x 2 root root0 26 okt 22:31 capabilities
lrwxrwxrwx 1 root root0 26 okt 22:31 device -> ../../../LNXVIDEO:00
drwxr-xr-x 3 root root0 26 okt 22:31 event4
drwxr-xr-x 2 root root0 26 okt 22:31 id
-r--r--r-- 1 root root 4096 26 okt 22:31 modalias
-r--r--r-- 1 root root 4096 26 okt 22:31 name
-r--r--r-- 1 root root 4096 26 okt 22:31 phys
drwxr-xr-x 2 root root0 26 okt 22:31 power
-r--r--r-- 1 root root 4096 26 okt 22:31 properties
lrwxrwxrwx 1 root root0 26 okt 22:10 subsystem -> 
../../../../../../../../class/input
-rw-r--r-- 1 root root 4096 26 okt 22:31 uevent
-r--r--r-- 1 root root 4096 26 okt 22:31 uniq

./LNXVIDEO:00/input/input4/capabilities:
totaal 0
-r--r--r-- 1 root root 4096 26 okt 22:31 abs
-r--r--r-- 1 root root 4096 26 okt 22:31 ev
-r--r--r-- 1 root root 4096 26 okt 22:31 ff
-r--r--r-- 1 root root 4096 26 okt 22:31 key
-r--r--r-- 1 root root 4096 26 okt 22:31 led
-r--r--r-- 1 root root 4096 26 okt 22:31 msc
-r--r--r-- 1 root root 4096 26 okt 22:31 rel
-r--r--r-- 1 root root 4096 26 okt 22:31 snd
-r--r--r-- 1 root root 4096 26 okt 22:31 sw

./LNXVIDEO:00/input/input4/event4:
totaal 0
-r--r--r-- 1 root root 4096 26 okt 22:31 dev
lrwxrwxrwx 1 root root0 26 okt 22:31 device -> ../../input4
drwxr-xr-x 2 root root0 26 okt 22:31 power
lrwxrwxrwx 1 root root0 26 okt 22:10 subsystem -> 
../../../../../../../../../class/input
-rw-r--r-- 1 root root 4096 26 okt 22:31 uevent

./LNXVIDEO:00/input/input4/event4/power:
totaal 0
-rw-r--r-- 1 root root 4096 26 okt 22:31 async
-rw-r--r-- 1 root root 4096 26 okt 22:31 autosuspend_delay_ms
-rw-r--r-- 1 root root 4096 26 okt 22:31 control
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_active_kids
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_active_time
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_enabled
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_status
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_suspended_time
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_usage

./LNXVIDEO:00/input/input4/id:
totaal 0
-r--r--r-- 1 root root 4096 26 okt 22:31 bustype
-r--r--r-- 1 root root 4096 26 okt 22:31 product
-r--r--r-- 1 root root 4096 26 okt 22:31 vendor
-r--r--r-- 1 root root 4096 26 okt 22:31 version

./LNXVIDEO:00/input/input4/power:
totaal 0
-rw-r--r-- 1 root root 4096 26 okt 22:31 async
-rw-r--r-- 1 root root 4096 26 okt 22:31 autosuspend_delay_ms
-rw-r--r-- 1 root root 4096 26 okt 22:31 control
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_active_kids
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_active_time
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_enabled
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_status
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_suspended_time
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_usage

./LNXVIDEO:00/power:
totaal 0
-rw-r--r-- 1 root root 4096 26 okt 22:31 async
-rw-r--r-- 1 root root 4096 26 okt 22:31 autosuspend_delay_ms
-rw-r--r-- 1 root root 4096 26 okt 22:31 control
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_active_kids
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_active_time
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_enabled
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_status
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_suspended_time
-r--r--r-- 1 root root 4096 26 okt 22:31 runtime_usage

./power:
totaal 0
-rw-r--r-- 1 root root 4096 26 okt 22:11 async
-rw-r--r-- 1 root root 4096 26 okt 22:11 autosuspend_delay_ms
-rw-r--r-- 1 root root 4096 26 okt 22:11 control
-r--r--r-- 1 root root 4096 26 okt 22:11 runtime_active_kids
-r--r--r-- 1 root root 4096 26 okt 22:11 runtime_active_time
-r--r--r-- 1 root root 4096 26 okt 22:11 runtime_enabled
-r--r--r-- 1 root root 4096 26 okt 22:11 runtime_status
-r--r--r-- 1 root root 4096 26 okt 22:11 runtime_suspended_time
-r--r--r-- 1 root root 40

Re: [Nouveau] Acer Aspire V7-582PG (Haswell, GTX 750M) fails to power off GPU via Power Resources

2016-10-27 Thread Peter Wu
On Thu, Oct 27, 2016 at 12:55:08PM +0300, Mika Westerberg wrote:
> On Thu, Oct 27, 2016 at 09:42:28AM +, Rick Kerkhof wrote:
> >No, there are not. Here is the recursive directory listing:
> 
> Are you able to try the following patch and send me dmesg (or attach it
> to that bug)? It should show if the ACPI core even tries to add those
> power resources.

So Rick has tested this patch now on top of 4.8.4 (mainline fails to
boot due to a kbuild issue which I reported elsewhere), but the output
is empty. That seems to indicate that flags.power_resources is unset.

Given that _PS3 exists and is indeed a package with some elements, it
seems that acpi_extract_power_resources is failing. Note that in the
SSDT, the power resource NVP3 was referenced before it was defined,
could that result in this enumeration failure? Relevant SSDT excerpt:

Scope (\_SB.PCI0.RP05)
{
Name (_PR3, Package (0x01)  // _PR3: Power Resources for D3hot
{
NVP3
})
// ...
}

PowerResource (NVP3, 0x00, 0x)

Kind regards,
Peter

> diff --git a/drivers/acpi/power.c b/drivers/acpi/power.c
> index fcd4ce6f78d5..af9c3e15dd74 100644
> --- a/drivers/acpi/power.c
> +++ b/drivers/acpi/power.c
> @@ -444,6 +444,9 @@ void acpi_power_add_remove_device(struct acpi_device 
> *adev, bool add)
>   if (!adev->power.flags.power_resources)
>   return;
>  
> + acpi_handle_info(adev->handle, "Adding power resources for %s\n",
> +  dev_name(&adev->dev));
> +
>   for (state = ACPI_STATE_D0; state <= ACPI_STATE_D3_HOT; state++)
>   acpi_power_expose_hide(adev,
>  &adev->power.states[state].resources,
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] Acer Aspire V7-582PG (Haswell, GTX 750M) fails to power off GPU via Power Resources

2016-10-28 Thread Peter Wu
On Fri, Oct 28, 2016 at 11:56:30AM +0300, Mika Westerberg wrote:
> On Thu, Oct 27, 2016 at 06:06:48PM +0200, Peter Wu wrote:
> > On Thu, Oct 27, 2016 at 12:55:08PM +0300, Mika Westerberg wrote:
> > > On Thu, Oct 27, 2016 at 09:42:28AM +, Rick Kerkhof wrote:
> > > >No, there are not. Here is the recursive directory listing:
> > > 
> > > Are you able to try the following patch and send me dmesg (or attach it
> > > to that bug)? It should show if the ACPI core even tries to add those
> > > power resources.
> > 
> > So Rick has tested this patch now on top of 4.8.4 (mainline fails to
> > boot due to a kbuild issue which I reported elsewhere), but the output
> > is empty. That seems to indicate that flags.power_resources is unset.
> 
> Is it completely empty or is it empty just for RP05? It should print out
> all devices with power resources.

\NVP2 and \NVP3 are the only power resources under RP05 and defined in
SSDT1, there are no others.

Kind regards,
Peter

> > Given that _PS3 exists and is indeed a package with some elements, it
> > seems that acpi_extract_power_resources is failing. Note that in the
> > SSDT, the power resource NVP3 was referenced before it was defined,
> > could that result in this enumeration failure? Relevant SSDT excerpt:
> > 
> > Scope (\_SB.PCI0.RP05)
> > {
> > Name (_PR3, Package (0x01)  // _PR3: Power Resources for D3hot
> > {
> > NVP3
> > })
> > // ...
> > }
> > 
> > PowerResource (NVP3, 0x00, 0x)
> 
> That and the fact that they come from an SSDT instead of DSDT may cause
> this. However, I'm not expert in ACPICA so adding Bob and Lv if they
> have ideas.
> 
> Bob, Lv, the bug in question is: 
> https://bugs.freedesktop.org/show_bug.cgi?id=98398
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] Acer Aspire V7-582PG (Haswell, GTX 750M) fails to power off GPU via Power Resources

2016-10-28 Thread Peter Wu
On Fri, Oct 28, 2016 at 02:19:07PM +0300, Mika Westerberg wrote:
> On Fri, Oct 28, 2016 at 01:09:30PM +0200, Peter Wu wrote:
> > On Fri, Oct 28, 2016 at 11:56:30AM +0300, Mika Westerberg wrote:
> > > On Thu, Oct 27, 2016 at 06:06:48PM +0200, Peter Wu wrote:
> > > > On Thu, Oct 27, 2016 at 12:55:08PM +0300, Mika Westerberg wrote:
> > > > > On Thu, Oct 27, 2016 at 09:42:28AM +, Rick Kerkhof wrote:
> > > > > >No, there are not. Here is the recursive directory listing:
> > > > > 
> > > > > Are you able to try the following patch and send me dmesg (or attach 
> > > > > it
> > > > > to that bug)? It should show if the ACPI core even tries to add those
> > > > > power resources.
> > > > 
> > > > So Rick has tested this patch now on top of 4.8.4 (mainline fails to
> > > > boot due to a kbuild issue which I reported elsewhere), but the output
> > > > is empty. That seems to indicate that flags.power_resources is unset.
> > > 
> > > Is it completely empty or is it empty just for RP05? It should print out
> > > all devices with power resources.
> > 
> > \NVP2 and \NVP3 are the only power resources under RP05 and defined in
> > SSDT1, there are no others.
> 
> We should probably add a debug print before checking
> flags.power_resources just to be sure the patch is correctly applied.

It was correctly applied. I did some testing with QEMU, it seems that
the \_OSI check is problematic. Removing it makes things work again.

To reproduce:

 1. Build the kernel with the attached config (minconfig with ACPI and
printk stuff enabled) and debug patch.
 2. Compile the attached ASL file to AML (iasl ssdt1.dsl).
(You can remove the If(\_OSI(...)) check to see the difference.
 3. Boot QEMU and watch the dmesg for the debug prints.

Alternatively, just boot QEMU with this SSDT and check against any Linux
distro and inspect /sys/bus/pci/devices/:00:01.3/firmware_node/.

Tested with Linux v4.8.5, QEMU 2.7.0, iasl 20160831. The SSDT structure
is adapted from Ricks SSDT1 file and modified for the 00:01.3 device in
QEMU. The QEMU command I used was:

    qemu-system-x86_64 -m 1G -M pc,accel=kvm -nographic \
-kernel arch/x86/boot/bzImage -serial file:/dev/stdout \
-acpitable file=ssdt1.aml \
-append 'console=ttyS0 acpi.aml_debug_output=1'
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
/*
 * Intel ACPI Component Architecture
 * AML/ASL+ Disassembler version 20160831-64
 * Copyright (c) 2000 - 2016 Intel Corporation
 * 
 * Disassembling to symbolic ASL+ operators
 *
 * Disassembly of ssdt1.dat, Wed Oct 26 21:43:27 2016
 *
 * Original Table Header:
 * Signature"SSDT"
 * Length   0x2CA9 (11433)
 * Revision 0x01
 * Checksum 0x8F
 * OEM ID   "ACRSYS"
 * OEM Table ID "ACRPRDCT"
 * OEM Revision 0x1000 (4096)
 * Compiler ID  "1025"
 * Compiler Version 0x0004 (262144)
 */
DefinitionBlock ("", "SSDT", 1, "ACRSYS", "ACRPRDCT", 0x1000)
{
External (\_SB.PCI0.PX13, DeviceObj)

If (\_OSI ("Windows 2013"))
{
Scope (\_SB.PCI0.PX13)
{
Name (_PR0, Package (0x01)  // _PR0: Power Resources for D0
{
NVP3
})
Name (_PR2, Package (0x01)  // _PR2: Power Resources for D2
{
NVP2
})
Name (_PR3, Package (0x01)  // _PR3: Power Resources for D3hot
{
NVP3
})
Method (_S0W, 0, NotSerialized)  // _S0W: S0 Device Wake State
{
Debug = "_S0W"
// Return(3)
Return(4)
}

Method (_DSW, 3, NotSerialized)  // _DSW: Device Sleep Wake
{
}

Method (_PS0, 0, NotSerialized)  // _PS0: Power State 0
{
Debug = "_PS0"
}

Method (_PS3, 0, NotSerialized)  // _PS3: Power State 3
{
Debug = "_PS3"
}
}

PowerResource (NVP3, 0x00, 0x)
{
Name (_STA, One)  // _STA: Status

Method (_ON, 0, NotSerialized)  // _ON_: Power On
{
Debug = "NVP3._ON"
_STA = One
}

Method (_OFF, 0, NotSerialized)  // _OFF: Power Off
{
Debug = "NVP3._OFF"
_STA = Zero
}
}

PowerResource (NVP2, 0x00, 0x)
{
Name (_STA, One)  // _STA: Status
Method (_ON, 0, NotSeriali

[Nouveau] [PATCH] drm/nouveau/acpi: fix check for power resources support

2016-10-28 Thread Peter Wu
Check whether the kernel really supports power resources for a device,
otherwise the power might not be removed when the device is runtime
suspended (DSM should still work in these cases where PR does not).

Link: https://bugs.freedesktop.org/show_bug.cgi?id=98398
Fixes: 692a17dcc292 ("drm/nouveau/acpi: fix lockup with PCIe runtime PM")
Signed-off-by: Peter Wu 
---
Hi,

Maybe Cc: stable (4.8+)?

Compile-tested only. Rick, can you test if this patch fixes the problem for you?

This check was actually in the original patch, but it was changed:
https://lists.freedesktop.org/archives/nouveau/2016-May/025125.html
Re-adding the check as suggested by Mika.

Kind regards,
Peter
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index dc57b62..193573d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -240,7 +240,8 @@ static bool nouveau_pr3_present(struct pci_dev *pdev)
if (!parent_adev)
return false;
 
-   return acpi_has_method(parent_adev->handle, "_PR3");
+   return parent_adev->power.flags.power_resources &&
+   acpi_has_method(parent_adev->handle, "_PR3");
 }
 
 static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle 
*dhandle_out,
-- 
2.10.1

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] Acer Aspire V7-582PG (Haswell, GTX 750M) fails to power off GPU via Power Resources

2016-10-30 Thread Peter Wu
On Sat, Oct 29, 2016 at 12:42:34AM +, Zheng, Lv wrote:
> Hi, Mika
> 
> > From: Mika Westerberg [mailto:mika.westerb...@linux.intel.com]
> > Subject: Re: Acer Aspire V7-582PG (Haswell, GTX 750M) fails to power off 
> > GPU via Power Resources
> > 
> > On Thu, Oct 27, 2016 at 06:06:48PM +0200, Peter Wu wrote:
> > > On Thu, Oct 27, 2016 at 12:55:08PM +0300, Mika Westerberg wrote:
> > > > On Thu, Oct 27, 2016 at 09:42:28AM +, Rick Kerkhof wrote:
> > > > >No, there are not. Here is the recursive directory listing:
> > > >
> > > > Are you able to try the following patch and send me dmesg (or attach it
> > > > to that bug)? It should show if the ACPI core even tries to add those
> > > > power resources.
> > >
> > > So Rick has tested this patch now on top of 4.8.4 (mainline fails to
> > > boot due to a kbuild issue which I reported elsewhere), but the output
> > > is empty. That seems to indicate that flags.power_resources is unset.
> > 
> > Is it completely empty or is it empty just for RP05? It should print out
> > all devices with power resources.
> > 
> > > Given that _PS3 exists and is indeed a package with some elements, it
> > > seems that acpi_extract_power_resources is failing. Note that in the
> > > SSDT, the power resource NVP3 was referenced before it was defined,
> > > could that result in this enumeration failure? Relevant SSDT excerpt:
> > >
> > > Scope (\_SB.PCI0.RP05)
> > > {
> > > Name (_PR3, Package (0x01)  // _PR3: Power Resources for D3hot
> > > {
> > > NVP3
> > > })
> > > // ...
> > > }
> > >
> > > PowerResource (NVP3, 0x00, 0x)
> 
> Looks wrong order to me.
> 
> However, _PR3 is a package, for AML opcode that contains PkgLength grammar 
> primitive, forward reference may be OK (for example Method).
> DefPackage := PackageOp PkgLength NumElements PackageElementList
> DefMethod := MethodOp PkgLength NameString MethodFlags TermList
> As when it is encountered, AML interpreter may only saves entire packaged 
> object.
> 
> I need to test behaviors around Package with qemu but I don't have 
> environment now.
> Can you help to give it a try?
> By adding customize an ssdt with the above code putting under root scope,
> DefinitionBlock ("ssdt-package.aml", "SSDT", 2, "INTEL ", "PACKAGE ", 
> 0x3000)
> {
> Scope (\)
> {
> Name (_PR3, Package (0x01) { NVP3 })
> }
> PowerResource (NVP3, 0x00, 0x) {}
> }
> Running Windows images on qemu with this ssdt appended "-acpitable 
> file=ssdt-package.aml".
> To see if blue screen can be resulted.
> 
> Thanks
> Lv

Testing this code with Windows 10 gives a BSOD, changing the order of
the PowerResource and Scope does not make a difference. If I take my
previous SSDT and change the ordering of NVP3 definition vs _PR3 use,
there is no change.

Kind regards,
Peter

> > 
> > That and the fact that they come from an SSDT instead of DSDT may cause
> > this. However, I'm not expert in ACPICA so adding Bob and Lv if they
> > have ideas.
> > 
> > Bob, Lv, the bug in question is: 
> > https://bugs.freedesktop.org/show_bug.cgi?id=98398
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [PATCH v2] drm/nouveau/acpi: fix check for power resources support

2016-10-31 Thread Peter Wu
Check whether the kernel really supports power resources for a device,
otherwise the power might not be removed when the device is runtime
suspended (DSM should still work in these cases where PR does not).

This is a workaround for a problem where ACPICA and Windows 10 differ in
behavior. ACPICA does not correctly enumerate power resources within a
conditional block (due to delayed execution of such blocks) and as a
result power_resources is set to false even if _PR3 exists.

Fixes: 692a17dcc292 ("drm/nouveau/acpi: fix lockup with PCIe runtime PM")
Link: https://bugs.freedesktop.org/show_bug.cgi?id=98398
Reported-and-tested-by: Rick Kerkhof 
Reviewed-by: Mika Westerberg 
Signed-off-by: Peter Wu 
---
 v2: collected tags from Rick and Mika; added ACPICA note as requested by Mika

I suggest Cc: stable (if the maintainer is OK with that?)
---
 drivers/gpu/drm/nouveau/nouveau_acpi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_acpi.c 
b/drivers/gpu/drm/nouveau/nouveau_acpi.c
index dc57b62..193573d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_acpi.c
+++ b/drivers/gpu/drm/nouveau/nouveau_acpi.c
@@ -240,7 +240,8 @@ static bool nouveau_pr3_present(struct pci_dev *pdev)
if (!parent_adev)
return false;
 
-   return acpi_has_method(parent_adev->handle, "_PR3");
+   return parent_adev->power.flags.power_resources &&
+   acpi_has_method(parent_adev->handle, "_PR3");
 }
 
 static void nouveau_dsm_pci_probe(struct pci_dev *pdev, acpi_handle 
*dhandle_out,
-- 
2.10.1

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v2] drm/nouveau/acpi: fix check for power resources support

2016-11-01 Thread Peter Wu
On Tue, Nov 01, 2016 at 09:24:23AM -0400, Alex Deucher wrote:
> On Tue, Nov 1, 2016 at 12:55 AM, Dave Airlie  wrote:
> > On 1 November 2016 at 08:48, Peter Wu  wrote:
> >> Check whether the kernel really supports power resources for a device,
> >> otherwise the power might not be removed when the device is runtime
> >> suspended (DSM should still work in these cases where PR does not).
> >>
> >> This is a workaround for a problem where ACPICA and Windows 10 differ in
> >> behavior. ACPICA does not correctly enumerate power resources within a
> >> conditional block (due to delayed execution of such blocks) and as a
> >> result power_resources is set to false even if _PR3 exists.
> >>
> >> Fixes: 692a17dcc292 ("drm/nouveau/acpi: fix lockup with PCIe runtime PM")
> >> Link: https://bugs.freedesktop.org/show_bug.cgi?id=98398
> >> Reported-and-tested-by: Rick Kerkhof 
> >> Reviewed-by: Mika Westerberg 
> >> Signed-off-by: Peter Wu 
> >
> > I've appled it this and cc'ed stable to drm-fixes.
> >
> > Are we going to get ACPICA fixed?

The ACPI folks are aware of the problem, see this thread and its
follow-ups:
http://www.spinics.net/lists/linux-acpi/msg70050.html

> Looks like we may have hit this on radeon/amdgpu as well:
> https://bugs.freedesktop.org/show_bug.cgi?id=98505
> 
> Alex

That log seems to suggest that resume fails while this nouveau issue is
related to suspend not powering off a device. The root cause might be
different.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v2] gr: fallback to legacy paths during firmware lookup

2016-11-05 Thread Peter Wu
On Fri, Nov 04, 2016 at 06:36:17PM +0900, Alexandre Courbot wrote:
> Look for firmware files using the legacy ("nouveau/nvxx_fuc") path
> if they cannot be found in the new, "official" path. User setups were
> broken by the switch, which is bad.
> 
> There are only 4 firmware files we may want to look up that way, so
> hardcode them into the lookup function. All new firmware files should
> use the standard "nvidia//gr/" path.
> 
> Fixes: 8539b37acef7 ("drm/nouveau/gr: use NVIDIA-provided external firmwares")
> Signed-off-by: Alexandre Courbot 
> ---
> Changes since v1:
> * Moved file name translation into the legacy function
> 
>  drm/nouveau/nvkm/engine/gr/gf100.c | 53 
> +++---
>  1 file changed, 49 insertions(+), 4 deletions(-)
> 
> diff --git a/drm/nouveau/nvkm/engine/gr/gf100.c 
> b/drm/nouveau/nvkm/engine/gr/gf100.c
> index eccdee04107d..ed45f923442f 100644
> --- a/drm/nouveau/nvkm/engine/gr/gf100.c
> +++ b/drm/nouveau/nvkm/engine/gr/gf100.c
> @@ -1756,6 +1756,53 @@ gf100_gr_ = {
>  };
>  
>  int
> +gf100_gr_ctor_fw_legacy(struct gf100_gr *gr, const char *fwname,
> + struct gf100_gr_fuc *fuc, int ret)
> +{
> + struct nvkm_subdev *subdev = &gr->base.engine.subdev;
> + struct nvkm_device *device = subdev->device;
> + const struct firmware *fw;
> + char f[32];
> +
> + /* see if this firmware has a legacy path */
> + if (!strcmp(fwname, "fecs_inst"))
> + fwname = "fuc409c";
> + else if (!strcmp(fwname, "fecs_data"))
> + fwname = "fuc409d";
> + else if (!strcmp(fwname, "gpccs_inst"))
> + fwname = "fuc41ac";
> + else if (!strcmp(fwname, "gpccs_data"))
> + fwname = "fuc41ad";
> + else
> + fwname = NULL;
> +
> + /* nope, let's just return the error we got */
> + if (!fwname) {
> + nvkm_error(subdev, "failed to load %s\n", fwname);

Due to the rename from legacy_fwname -> fwname, this can be NULL.
What about replacing the else branch above by this block?

Kind regards,
Peter

> + return ret;
> + }
> +
> + /* yes, try to load from the legacy path */
> + nvkm_debug(subdev, "%s: falling back to legacy path\n", fwname);
> +
> + snprintf(f, sizeof(f), "nouveau/nv%02x_%s", device->chipset, fwname);
> + ret = request_firmware(&fw, f, device->dev);
> + if (ret) {
> + snprintf(f, sizeof(f), "nouveau/%s", fwname);
> + ret = request_firmware(&fw, f, device->dev);
> + if (ret) {
> + nvkm_error(subdev, "failed to load %s\n", fwname);
> + return ret;
> + }
> + }
> +
> + fuc->size = fw->size;
> + fuc->data = kmemdup(fw->data, fuc->size, GFP_KERNEL);
> + release_firmware(fw);
> + return (fuc->data != NULL) ? 0 : -ENOMEM;
> +}
> +
> +int
>  gf100_gr_ctor_fw(struct gf100_gr *gr, const char *fwname,
>struct gf100_gr_fuc *fuc)
>  {
> @@ -1765,10 +1812,8 @@ gf100_gr_ctor_fw(struct gf100_gr *gr, const char 
> *fwname,
>   int ret;
>  
>   ret = nvkm_firmware_get(device, fwname, &fw);
> - if (ret) {
> - nvkm_error(subdev, "failed to load %s\n", fwname);
> - return ret;
> - }
> + if (ret)
> + return gf100_gr_ctor_fw_legacy(gr, fwname, fuc, ret);
>  
>   fuc->size = fw->size;
>   fuc->data = kmemdup(fw->data, fuc->size, GFP_KERNEL);
> -- 
> 2.10.0
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau: Drop superfluous DRM_SWITCH_POWER_DYNAMIC_OFF checks

2016-11-08 Thread Peter Wu
On Tue, Nov 08, 2016 at 12:57:00PM +0100, Lukas Wunner wrote:
> nouveau's ->suspend and ->resume callbacks are currently skipped if the
> device's status is either DRM_SWITCH_POWER_OFF (powered off by
> vga_switcheroo manual power control) or DRM_SWITCH_POWER_DYNAMIC_OFF
> (runtime suspended).
> 
> In the former case this makes sense since the device is powered off
> behind the PM core's back:  It will try to execute the ->suspend and
> ->resume callbacks upon system sleep, not knowing that the device is
> inaccessible.  Therefore the callbacks have to become no-ops.
> 
> However the latter case doesn't make any sense because the PM core
> never calls the ->suspend and ->resume callbacks of runtime suspended
> devices:  Such devices are either runtime resumed before going to system
> sleep (see call to pm_runtime_resume() in drivers/pci/pci-driver:
> pci_pm_suspend()) or they are left runtime suspended over the entire
> system suspend/resume process (search for "direct_complete" in
> drivers/base/power/main.c).
> 
> Consequently the DRM_SWITCH_POWER_DYNAMIC_OFF checks are superfluous.
> Drop them.
> 
> Signed-off-by: Lukas Wunner 

It is better to rely on the official documentation rather than the
implementation. Luckily, Documentation/power/pci.txt supports the claim:

2.4.1. System Suspend

When the system is going into a sleep state in which the contents of memory 
will
be preserved, such as one of the ACPI sleep states S1-S3, the phases are:

prepare, suspend, suspend_noirq.
[..]
The pci_pm_prepare() routine first puts the device into the "fully 
functional"
state with the help of pm_runtime_resume(). [..]

So indeed we can be sure that the device is runtime-resumed before
suspend. System resume is not documented explicitly, but it seems
reasonable that the device is not runtime-suspended between system
suspend and resume.

Reviewed-by: Peter Wu 

> ---
>  drivers/gpu/drm/nouveau/nouveau_drm.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
> b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index 9876e6f..d91d092 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -666,8 +666,7 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
>   struct drm_device *drm_dev = pci_get_drvdata(pdev);
>   int ret;
>  
> - if (drm_dev->switch_power_state == DRM_SWITCH_POWER_OFF ||
> - drm_dev->switch_power_state == DRM_SWITCH_POWER_DYNAMIC_OFF)
> + if (drm_dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>   return 0;
>  
>   ret = nouveau_do_suspend(drm_dev, false);
> @@ -688,8 +687,7 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
>   struct drm_device *drm_dev = pci_get_drvdata(pdev);
>   int ret;
>  
> - if (drm_dev->switch_power_state == DRM_SWITCH_POWER_OFF ||
> - drm_dev->switch_power_state == DRM_SWITCH_POWER_DYNAMIC_OFF)
> + if (drm_dev->switch_power_state == DRM_SWITCH_POWER_OFF)
>   return 0;
>  
>   pci_set_power_state(pdev, PCI_D0);
> -- 
> 2.10.1
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] acpi: video: Move ACPI_VIDEO_NOTIFY_* defines to acpi/video.h

2016-11-10 Thread Peter Wu
On Wed, Nov 09, 2016 at 06:15:56PM +0100, Hans de Goede wrote:
> acpi_video.c passed the ACPI_VIDEO_NOTIFY_* defines as type code to
> acpi_notifier_call_chain(). Move these defines to acpi/video.h so
> that acpi_notifier listeners can check the type code using these
> defines.
> 
> Signed-off-by: Hans de Goede 

Reviewed-by: Peter Wu 

> ---
>  drivers/acpi/acpi_video.c | 11 ---
>  include/acpi/video.h  | 11 +++
>  2 files changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
> index c5557d0..201292e 100644
> --- a/drivers/acpi/acpi_video.c
> +++ b/drivers/acpi/acpi_video.c
> @@ -43,17 +43,6 @@
>  
>  #define ACPI_VIDEO_BUS_NAME  "Video Bus"
>  #define ACPI_VIDEO_DEVICE_NAME   "Video Device"
> -#define ACPI_VIDEO_NOTIFY_SWITCH 0x80
> -#define ACPI_VIDEO_NOTIFY_PROBE  0x81
> -#define ACPI_VIDEO_NOTIFY_CYCLE  0x82
> -#define ACPI_VIDEO_NOTIFY_NEXT_OUTPUT0x83
> -#define ACPI_VIDEO_NOTIFY_PREV_OUTPUT0x84
> -
> -#define ACPI_VIDEO_NOTIFY_CYCLE_BRIGHTNESS   0x85
> -#define  ACPI_VIDEO_NOTIFY_INC_BRIGHTNESS0x86
> -#define ACPI_VIDEO_NOTIFY_DEC_BRIGHTNESS 0x87
> -#define ACPI_VIDEO_NOTIFY_ZERO_BRIGHTNESS0x88
> -#define ACPI_VIDEO_NOTIFY_DISPLAY_OFF0x89
>  
>  #define MAX_NAME_LEN 20
>  
> diff --git a/include/acpi/video.h b/include/acpi/video.h
> index 4536bd3..bfe484d 100644
> --- a/include/acpi/video.h
> +++ b/include/acpi/video.h
> @@ -30,6 +30,17 @@ struct acpi_device;
>  #define ACPI_VIDEO_DISPLAY_LEGACY_PANEL   0x0110
>  #define ACPI_VIDEO_DISPLAY_LEGACY_TV  0x0200
>  
> +#define ACPI_VIDEO_NOTIFY_SWITCH 0x80
> +#define ACPI_VIDEO_NOTIFY_PROBE  0x81
> +#define ACPI_VIDEO_NOTIFY_CYCLE  0x82
> +#define ACPI_VIDEO_NOTIFY_NEXT_OUTPUT0x83
> +#define ACPI_VIDEO_NOTIFY_PREV_OUTPUT0x84
> +#define ACPI_VIDEO_NOTIFY_CYCLE_BRIGHTNESS   0x85
> +#define ACPI_VIDEO_NOTIFY_INC_BRIGHTNESS 0x86
> +#define ACPI_VIDEO_NOTIFY_DEC_BRIGHTNESS 0x87
> +#define ACPI_VIDEO_NOTIFY_ZERO_BRIGHTNESS0x88
> +#define ACPI_VIDEO_NOTIFY_DISPLAY_OFF0x89
> +
>  enum acpi_backlight_type {
>   acpi_backlight_undef = -1,
>   acpi_backlight_none = 0,
> -- 
> 2.9.3
> 
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH] drm/nouveau: Intercept ACPI_VIDEO_NOTIFY_PROBE

2016-11-10 Thread Peter Wu
+ * This should be dropped once that is merged.
> + */
> +#ifndef ACPI_VIDEO_NOTIFY_PROBE
> +#define ACPI_VIDEO_NOTIFY_PROBE  0x81
> +#endif
> +
> +static void
> +nouveau_display_acpi_work(struct work_struct *work)
> +{
> + struct nouveau_drm *drm = container_of(work, typeof(*drm), acpi_work);
> +
> + pm_runtime_get_sync(drm->dev->dev);
> +
> + drm_helper_hpd_irq_event(drm->dev);
> +
> + pm_runtime_mark_last_busy(drm->dev->dev);
> + pm_runtime_put_sync(drm->dev->dev);

Nothing depends on the device being suspended immediately, I guess you
can drop _sync and also use:

pm_runtime_put_autosuspend

Kind regards,
Peter

> +}
> +
> +static int
> +nouveau_display_acpi_ntfy(struct notifier_block *nb, unsigned long val,
> +   void *data)
> +{
> + struct nouveau_drm *drm = container_of(nb, typeof(*drm), acpi_nb);
> + struct acpi_bus_event *info = data;
> +
> + if (!strcmp(info->device_class, ACPI_VIDEO_CLASS)) {
> + if (info->type == ACPI_VIDEO_NOTIFY_PROBE) {
> + /*
> +  * This may be the only indication we receive of a
> +  * connector hotplug on a runtime suspended GPU,
> +  * schedule acpi_work to check.
> +  */
> + schedule_work(&drm->acpi_work);
> +
> + /* acpi-video should not generate keypresses for this */
> + return NOTIFY_BAD;
> + }
> + }
> +
> + return NOTIFY_DONE;
> +}
> +#endif
> +
>  int
>  nouveau_display_init(struct drm_device *dev)
>  {
> @@ -537,6 +589,12 @@ nouveau_display_create(struct drm_device *dev)
>   }
>  
>   nouveau_backlight_init(dev);
> +#ifdef CONFIG_ACPI
> + INIT_WORK(&drm->acpi_work, nouveau_display_acpi_work);
> + drm->acpi_nb.notifier_call = nouveau_display_acpi_ntfy;
> + register_acpi_notifier(&drm->acpi_nb);
> +#endif
> +
>   return 0;
>  
>  vblank_err:
> @@ -552,6 +610,9 @@ nouveau_display_destroy(struct drm_device *dev)
>  {
>   struct nouveau_display *disp = nouveau_display(dev);
>  
> +#ifdef CONFIG_ACPI
> + unregister_acpi_notifier(&nouveau_drm(dev)->acpi_nb);
> +#endif
>   nouveau_backlight_exit(dev);
>   nouveau_display_vblank_fini(dev);
>  
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
> b/drivers/gpu/drm/nouveau/nouveau_drv.h
> index 822a021..71d4532 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drv.h
> +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
> @@ -37,6 +37,8 @@
>   *  - implemented limited ABI16/NVIF interop
>   */
>  
> +#include 
> +
>  #include 
>  #include 
>  #include 
> @@ -161,6 +163,10 @@ struct nouveau_drm {
>   struct nvbios vbios;
>   struct nouveau_display *display;
>   struct backlight_device *backlight;
> +#ifdef CONFIG_ACPI
> + struct notifier_block acpi_nb;
> + struct work_struct acpi_work;
> +#endif
>  
>   /* power management */
>   struct nouveau_hwmon *hwmon;
> -- 
> 2.9.3
> 
> ___
> Nouveau mailing list
> Nouveau@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau

-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] Nvidia recent drivers

2017-01-27 Thread Peter Wu
It is already available (I have that dGPU myself). Newer kernels are 
recommended (Linux 4.8+), you need either xf86-video-modesetting driver (known 
working with xorg 1.18) or xf86-video-nouveau (from git).

Also note that many laptops with these card suffer from a problem that cause 
lockups when runpm kicks in. See 
https://bugzilla.kernel.org/show_bug.cgi?id=156341

Kind regards,
Peter
https://lekensteyn.nl
(pardon my brevity, top-posting and formatting, sent from my phone)


On 23 January 2017 20:50:34 CET, "Mr.ze"  wrote:
>Hi,
>When will Nvidia Geforce 965M nouveau driver be available?
>It's just that when I use linux, i have to use intel HD graphics :\
>
>Thanks,
>Ze.
>
>H
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] Discrete card is always off even if it is used

2017-01-27 Thread Peter Wu
Have you checked your Xorg.log? It should load the modesetting or nouveau 
driver for the dGPU.

Kind regards,
Peter
https://lekensteyn.nl
(pardon my brevity, top-posting and formatting, sent from my phone)


On 17 January 2017 17:56:47 CET, mich...@nectarine.it wrote:
>Hi,
>I have a computer with an integrated graphic card and a discrete
>graphic 
>card, namely:
>
># lspci | grep -E "VGA|3D"
>00:02.0 VGA compatible controller: Intel Corporation Broadwell-U 
>Integrated Graphics (rev 09)
>03:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 840M] (rev
>a2)
>
>This is confirmed by vgaswitcheroo:
>
>xray:/ # cat /sys/kernel/debug/vgaswitcheroo/switch
>0:IGD:+:Pwr::00:02.0
>1:DIS: :DynOff::03:00.0
>
>Some info:
>
>xray:/ # uname -a
>Linux xray 4.9.4-1.gd9de2ec-default #1 SMP PREEMPT Sun Jan 15 16:51:00 
>UTC 2017 (d9de2ec) x86_64 x86_64 x86_64 GNU/Linux
>
>xray:/var/log # rpm -qa | grep -i nouveau
>libdrm_nouveau2-2.4.68-1.4.x86_64
>libvdpau_nouveau-11.2.2-166.1.x86_64
>xf86-video-nouveau-1.0.12-1.5.x86_64
>Mesa-dri-nouveau-11.2.2-166.1.x86_64
>
>No NVIDIA drivers are used.
>
># lsmod | egrep "i915|nvidia"
>i915 1241088  4
>video  40960  5 
>dell_wmi,dell_laptop,int3406_thermal,nouveau,i915
>button 16384  2 nouveau,i915
>i2c_algo_bit   16384  2 nouveau,i915
>drm_kms_helper159744  2 nouveau,i915
>drm   360448  7 nouveau,i915,ttm,drm_kms_helper
>
>Now begins the strangeness
>
>xray:~> xrandr --listproviders
>Providers: number : 1
>Provider 0: id: 0x49 cap: 0xb, Source Output, Sink Output, Sink Offload
>
>crtcs: 4 outputs: 6 associated providers: 0 name:Intel
>
>Why just one provider? Should be also listed NVIDIA with nouveau.
>
>And now for some strangeness again:
>xray:~> glxgears -info
>Running synchronized to the vertical refresh.  The framerate should be
>approximately the same as the monitor refresh rate.
>GL_RENDERER   = Mesa DRI Intel(R) HD Graphics 5500 (Broadwell GT2)
>GL_VERSION= 3.0 Mesa 11.2.2
>GL_VENDOR = Intel Open Source Technology Center
>GL_EXTENSIONS = GL_ARB_multisample GL_EXT_abgr GL_EXT_bgra 
>GL_EXT_blend_color GL_EXT_blend_minmax GL_EXT_blend_subtract 
>GL_EXT_copy_texture GL_EXT_polygon_offset GL_EXT_subtexture 
>GL_EXT_texture_object GL_EXT_vertex_array GL_EXT_compiled_vertex_array 
>GL_EXT_texture GL_EXT_texture3D GL_IBM_rasterpos_clip 
>GL_ARB_point_parameters GL_EXT_draw_range_elements GL_EXT_packed_pixels
>
>GL_EXT_point_parameters GL_EXT_rescale_normal 
>GL_EXT_separate_specular_color GL_EXT_texture_edge_clamp 
>GL_SGIS_generate_mipmap GL_SGIS_texture_border_clamp 
>GL_SGIS_texture_edge_clamp GL_SGIS_texture_lod GL_ARB_framebuffer_sRGB 
>GL_ARB_multitexture GL_EXT_framebuffer_sRGB
>GL_IBM_multimode_draw_arrays 
>GL_IBM_texture_mirrored_repeat GL_3DFX_texture_compression_FXT1 
>GL_ARB_texture_cube_map GL_ARB_texture_env_add GL_ARB_transpose_matrix 
>GL_EXT_blend_func_separate GL_EXT_fog_coord GL_EXT_multi_draw_arrays 
>GL_EXT_secondary_color GL_EXT_texture_env_add 
>GL_EXT_texture_filter_anisotropic GL_EXT_texture_lod_bias 
>GL_INGR_blend_func_separate GL_NV_blend_square GL_NV_light_max_exponent
>
>GL_NV_texgen_reflection GL_NV_texture_env_combine4 GL_S3_s3tc 
>GL_SUN_multi_draw_arrays GL_ARB_texture_border_clamp 
>GL_ARB_texture_compression GL_EXT_framebuffer_object 
>GL_EXT_texture_compression_s3tc GL_EXT_texture_env_combine 
>GL_EXT_texture_env_dot3 GL_MESA_window_pos GL_NV_packed_depth_stencil 
>GL_NV_texture_rectangle GL_ARB_depth_texture GL_ARB_occlusion_query 
>GL_ARB_shadow GL_ARB_texture_env_combine GL_ARB_texture_env_crossbar 
>GL_ARB_texture_env_dot3 GL_ARB_texture_mirrored_repeat
>GL_ARB_window_pos 
>GL_EXT_stencil_two_side GL_EXT_texture_cube_map GL_NV_depth_clamp 
>GL_APPLE_packed_pixels GL_APPLE_vertex_array_object GL_ARB_draw_buffers
>
>GL_ARB_fragment_program GL_ARB_fragment_shader GL_ARB_shader_objects 
>GL_ARB_vertex_program GL_ARB_vertex_shader GL_ATI_draw_buffers 
>GL_ATI_texture_env_combine3 GL_ATI_texture_float GL_EXT_shadow_funcs 
>GL_EXT_stencil_wrap GL_MESA_pack_invert GL_NV_primitive_restart 
>GL_ARB_depth_clamp GL_ARB_fragment_program_shadow 
>GL_ARB_half_float_pixel GL_ARB_occlusion_query2 GL_ARB_point_sprite 
>GL_ARB_shading_language_100 GL_ARB_sync GL_ARB_texture_non_power_of_two
>
>GL_ARB_vertex_buffer_object GL_ATI_blend_equation_separate 
>GL_EXT_blend_equation_separate GL_OES_read_format 
>GL_ARB_color_buffer_float GL_ARB_pixel_buffer_object 
>GL_ARB_texture_compression_rgtc GL_ARB_texture_float 
>GL_ARB_texture_rectangle GL_EXT_packed_float GL_EXT_pixel_buffer_object
>
>GL_EXT_texture_compression_dxt1 GL_EXT_texture_compression_rgtc 
>GL_EXT_texture_rectangle GL_EXT_texture_sRGB 
>GL_EXT_texture_shared_exponent GL_ARB_framebuffer_object 
>GL_EXT_framebuffer_blit GL_EXT_framebuffer_multisample 
>GL_EXT_packed_depth_stencil GL_APPLE_object_purgeable 
>GL_ARB_vertex_array_object GL_ATI_separate_stencil GL_EXT_draw_buffers2
>
>GL_EXT_draw_instanced GL_

Re: [Nouveau] Nvidia recent drivers

2017-01-28 Thread Peter Wu
Hi Lukas,

On Sat, Jan 28, 2017 at 07:58:32AM +0100, Lukas Wunner wrote:
> On Fri, Jan 27, 2017 at 04:06:38PM +0100, Peter Wu wrote:
> > Also note that many laptops with these card suffer from a problem
> > that cause lockups when runpm kicks in.
> > See https://bugzilla.kernel.org/show_bug.cgi?id=156341
> 
> Peter, I notice you weren't cc'ed on the last few messages of the
> 'PCI: Revert "PCI: Add runtime PM support for PCIe ports"' thread,
> the similar bug https://bugzilla.kernel.org/show_bug.cgi?id=190861
> turned out to be caused by locking issues in nouveau which were
> fixed with 3846fd9b8600 in Linus' tree.  That commit swiftly got
> reverted because it lacked prerequisites, and replaced by the
> temporary fix cae9ff036eea + 15266ae38fe0.  These two will be
> reverted in 4.11 and replaced with the proper solution (which was
> 3846fd9b8600).  Long story short, the issue seems solved, the revert
> of PCIe port PM is dequeued.  Perhaps this also fixes the issues in
> the bugzilla entry you mentioned above, at least for some laptops?

Thanks for following up, but I am afraid it won't help since the other
problem is in ACPI or PCI while this fixes a locking problem (in 4.10,
current stable 4.9 was unaffected).

FYI, the fbcon patch mentioned above will also (temporary)
Related to the fbcon patch, there was an attempt to do something similar
last year, also due to locking issues (with console_lock):
https://lists.freedesktop.org/archives/dri-devel/2016-July/113087.html
The 15266ae38fe0 patch will probably fix that for the moment, but when
reverting it the problem returns.
-- 
Kind regards,
Peter Wu
https://lekensteyn.nl
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau