Re: [Nouveau] [PATCH v7] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2020-03-20 Thread Karol Herbst
On Fri, Mar 20, 2020 at 11:19 PM Bjorn Helgaas  wrote:
>
> On Tue, Mar 10, 2020 at 08:26:27PM +0100, Karol Herbst wrote:
> > Fixes the infamous 'runtime PM' bug many users are facing on Laptops with
> > Nvidia Pascal GPUs by skipping said PCI power state changes on the GPU.
> >
> > Depending on the kernel used, there might be messages like these in dmesg:
> >
> > "nouveau :01:00.0: Refused to change power state, currently in D3"
> > "nouveau :01:00.0: can't change power state from D3cold to D0 (config
> > space inaccessible)"
> > followed by backtraces of kernel crashes or timeouts within nouveau.
> >
> > It's still unknown why this issue exists, but this is a reliable workaround
> > and solves a very annoying issue for users having to choose between a
> > crashing kernel or higher power consumption of their Laptops.
>
> Thanks for the bugzilla link.  The bugzilla mentions lots of mailing
> list discussion.  Can you include links to some of that?
>
> IIUC this basically just turns off PCI power management for the GPU.
> Can you do that with something like the following?  I don't know
> anything about DRM, so I don't know where you could save the pm_cap,
> but I'm sure the driver could keep it somewhere.
>

Are you sure this would work? From a quick look over the PCI code, it looks
like a lot of code that we really need would be skipped, like the platform
code that turns off the GPU via ACPI. But I could also be misremembering
how all of that works. I can of course try it and see what the
effect of this patch would be. And would the parent bus even go into
D3hot if it knows one of its children is still in D0? Because that's
what the result of that would be as well, no? And I know that if the
bus stays in D0, it has a negative impact on power consumption.

Anyway, I will try that out; I am just not seeing how that would help.

>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
> index b65ae817eabf..2ad825e8891c 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_drm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
> @@ -618,6 +618,23 @@ nouveau_drm_device_fini(struct drm_device *dev)
>  	kfree(drm);
>  }
>
> +static void quirk_broken_nv_runpm(struct drm_device *drm_dev)
> +{
> +	struct pci_dev *pdev = drm_dev->pdev;
> +	struct pci_dev *bridge = pci_upstream_bridge(pdev);
> +
> +	if (!bridge || bridge->vendor != PCI_VENDOR_ID_INTEL)
> +		return;
> +
> +	switch (bridge->device) {
> +	case 0x1901:
> +		STASH->pm_cap = pdev->pm_cap;
> +		pdev->pm_cap = 0;
> +		NV_INFO(drm_dev, "Disabling PCI power management to avoid bug\n");
> +		break;
> +	}
> +}
> +
>  static int nouveau_drm_probe(struct pci_dev *pdev,
>  			     const struct pci_device_id *pent)
>  {
> @@ -699,6 +716,7 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
>  	if (ret)
>  		goto fail_drm_dev_init;
>
> +	quirk_broken_nv_runpm(drm_dev);
>  	return 0;
>
>  fail_drm_dev_init:
> @@ -735,6 +753,9 @@ nouveau_drm_remove(struct pci_dev *pdev)
>  {
>  	struct drm_device *dev = pci_get_drvdata(pdev);
>
> +	/* If we disabled PCI power management, restore it */
> +	if (STASH->pm_cap)
> +		pdev->pm_cap = STASH->pm_cap;
>  	nouveau_drm_device_remove(dev);
>  	pci_disable_device(pdev);
>  }
>

___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH v7] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2020-03-20 Thread Bjorn Helgaas
On Tue, Mar 10, 2020 at 08:26:27PM +0100, Karol Herbst wrote:
> Fixes the infamous 'runtime PM' bug many users are facing on Laptops with
> Nvidia Pascal GPUs by skipping said PCI power state changes on the GPU.
> 
> Depending on the kernel used, there might be messages like these in dmesg:
> 
> "nouveau :01:00.0: Refused to change power state, currently in D3"
> "nouveau :01:00.0: can't change power state from D3cold to D0 (config
> space inaccessible)"
> followed by backtraces of kernel crashes or timeouts within nouveau.
> 
> It's still unknown why this issue exists, but this is a reliable workaround
> and solves a very annoying issue for users having to choose between a
> crashing kernel or higher power consumption of their Laptops.

Thanks for the bugzilla link.  The bugzilla mentions lots of mailing
list discussion.  Can you include links to some of that?

IIUC this basically just turns off PCI power management for the GPU.
Can you do that with something like the following?  I don't know
anything about DRM, so I don't know where you could save the pm_cap,
but I'm sure the driver could keep it somewhere.


diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index b65ae817eabf..2ad825e8891c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -618,6 +618,23 @@ nouveau_drm_device_fini(struct drm_device *dev)
 	kfree(drm);
 }
 
+static void quirk_broken_nv_runpm(struct drm_device *drm_dev)
+{
+	struct pci_dev *pdev = drm_dev->pdev;
+	struct pci_dev *bridge = pci_upstream_bridge(pdev);
+
+	if (!bridge || bridge->vendor != PCI_VENDOR_ID_INTEL)
+		return;
+
+	switch (bridge->device) {
+	case 0x1901:
+		STASH->pm_cap = pdev->pm_cap;
+		pdev->pm_cap = 0;
+		NV_INFO(drm_dev, "Disabling PCI power management to avoid bug\n");
+		break;
+	}
+}
+
 static int nouveau_drm_probe(struct pci_dev *pdev,
 			     const struct pci_device_id *pent)
 {
@@ -699,6 +716,7 @@ static int nouveau_drm_probe(struct pci_dev *pdev,
 	if (ret)
 		goto fail_drm_dev_init;
 
+	quirk_broken_nv_runpm(drm_dev);
 	return 0;
 
 fail_drm_dev_init:
@@ -735,6 +753,9 @@ nouveau_drm_remove(struct pci_dev *pdev)
 {
 	struct drm_device *dev = pci_get_drvdata(pdev);
 
+	/* If we disabled PCI power management, restore it */
+	if (STASH->pm_cap)
+		pdev->pm_cap = STASH->pm_cap;
 	nouveau_drm_device_remove(dev);
 	pci_disable_device(pdev);
 }
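
To make the save/clear/restore idea easier to poke at outside the kernel, here
is a minimal userspace sketch. Everything prefixed mock_ is hypothetical: the
structs are stand-ins that carry only the fields the diff touches, and
mock_drv_state plays the role of the STASH placeholder. It is not nouveau
code and says nothing about where the driver should really keep the value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_VENDOR_INTEL 0x8086

/* Mock stand-in for struct pci_dev; only the fields used by the quirk. */
struct mock_pci_dev {
	uint16_t vendor;
	uint16_t device;
	uint8_t pm_cap;               /* PM capability offset, 0 if hidden/absent */
	struct mock_pci_dev *bridge;  /* upstream bridge, NULL if none */
};

/* Hypothetical per-device driver state standing in for STASH. */
struct mock_drv_state {
	uint8_t saved_pm_cap;
};

/* Hide the PM capability if the device sits behind the affected bridge. */
static void mock_quirk_apply(struct mock_pci_dev *pdev,
			     struct mock_drv_state *st)
{
	assert(pdev && st);
	st->saved_pm_cap = 0;

	if (!pdev->bridge || pdev->bridge->vendor != MOCK_VENDOR_INTEL)
		return;

	if (pdev->bridge->device == 0x1901) {
		st->saved_pm_cap = pdev->pm_cap;
		pdev->pm_cap = 0;
	}
}

/* Restore the PM capability on removal, if the quirk had hidden it. */
static void mock_quirk_undo(struct mock_pci_dev *pdev,
			    struct mock_drv_state *st)
{
	assert(pdev && st);
	if (st->saved_pm_cap) {
		pdev->pm_cap = st->saved_pm_cap;
		st->saved_pm_cap = 0;
	}
}
```

Calling mock_quirk_apply() at probe time and mock_quirk_undo() at remove time
mirrors the two call sites in the diff. A device behind any other bridge is
left untouched, so the undo step is a no-op for it.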


Re: [Nouveau] [PATCH 4/4] mm: check the device private page owner in hmm_range_fault

2020-03-20 Thread Jason Gunthorpe
On Mon, Mar 16, 2020 at 08:32:16PM +0100, Christoph Hellwig wrote:
> diff --git a/mm/hmm.c b/mm/hmm.c
> index cfad65f6a67b..b75b3750e03d 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -216,6 +216,14 @@ int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
>  		unsigned long end, uint64_t *pfns, pmd_t pmd);
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  
> +static inline bool hmm_is_device_private_entry(struct hmm_range *range,
> +		swp_entry_t entry)
> +{
> +	return is_device_private_entry(entry) &&
> +		device_private_entry_to_page(entry)->pgmap->owner ==
> +		range->dev_private_owner;
> +}

Thinking about this some more, does the locking work out here?

hmm_range_fault() runs with mmap_sem in read, and does not lock any of
the page table levels.

So it relies on accessing stale pte data being safe, and here we
introduce for the first time a page pointer dereference and a pgmap
dereference without any locking/refcounting.

The get_dev_pagemap() worked on the PFN and obtained a refcount, so it
created safety.

Is there some tricky reason this is safe, e.g. a DEVICE_PRIVATE page
cannot be removed from the vma without holding mmap_sem in write or
something?
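
To illustrate the kind of safety a refcount buys, here is a minimal userspace
sketch using C11 atomics. All mock_ names are hypothetical and this is not
the kernel's pgmap code. It only models "take a reference before reading
->owner"; it does not model how the pointer itself is reached safely, which
is the part get_dev_pagemap() handled by starting from the PFN.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock object with a refcount; refs == 0 means it is being torn down. */
struct mock_pgmap {
	atomic_int refs;
	const void *owner;
};

/* Take a reference only if the object is still live. */
static bool mock_pgmap_tryget(struct mock_pgmap *p)
{
	int old = atomic_load(&p->refs);

	while (old > 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&p->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference taken by mock_pgmap_tryget(). */
static void mock_pgmap_put(struct mock_pgmap *p)
{
	int old = atomic_fetch_sub(&p->refs, 1);

	assert(old > 0);
}

/* Compare owners while holding a reference, never on a dead object. */
static bool mock_owner_matches(struct mock_pgmap *p, const void *owner)
{
	bool match;

	if (!mock_pgmap_tryget(p))
		return false;
	match = (p->owner == owner);
	mock_pgmap_put(p);
	return match;
}
```

The owner comparison in the patch corresponds to the body of
mock_owner_matches() without the tryget/put pair around it, which is why the
question is whether something else already pins the page.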

Jason


Re: [Nouveau] [PATCH 3/4] mm: simplify device private page handling in hmm_range_fault

2020-03-20 Thread Jason Gunthorpe
On Thu, Mar 19, 2020 at 06:33:04PM -0700, Ralph Campbell wrote:

> > > +		.default_flags = dmirror_hmm_flags[HMM_PFN_VALID] |
> > > +				(write ? dmirror_hmm_flags[HMM_PFN_WRITE] : 0),
> > > +		.dev_private_owner = dmirror->mdevice,
> > > +	};
> > > +	int ret = 0;
> > 
> > > +static int dmirror_snapshot(struct dmirror *dmirror,
> > > +			    struct hmm_dmirror_cmd *cmd)
> > > +{
> > > +	struct mm_struct *mm = dmirror->mm;
> > > +	unsigned long start, end;
> > > +	unsigned long size = cmd->npages << PAGE_SHIFT;
> > > +	unsigned long addr;
> > > +	unsigned long next;
> > > +	uint64_t pfns[64];
> > > +	unsigned char perm[64];
> > > +	char __user *uptr;
> > > +	struct hmm_range range = {
> > > +		.pfns = pfns,
> > > +		.flags = dmirror_hmm_flags,
> > > +		.values = dmirror_hmm_values,
> > > +		.pfn_shift = DPT_SHIFT,
> > > +		.pfn_flags_mask = ~0ULL,
> > 
> > Same here, especially since this is snapshot
> > 
> > Jason
> 
> Actually, snapshot ignores pfn_flags_mask and default_flags.

Yes, so there is no reason to set them to anything other than 0.
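
As a small illustration of what that looks like in C: members left out of a
designated initializer are zero-initialized, so simply not naming the two
fields is enough. The struct below is a hypothetical mock that borrows a few
field names from the quoted patch; it is not the real struct hmm_range.

```c
#include <assert.h>
#include <stdint.h>

/* Mock range descriptor with a subset of the fields named in the patch. */
struct mock_range {
	uint64_t *pfns;
	uint64_t default_flags;
	uint64_t pfn_flags_mask;
	uint8_t pfn_shift;
};

/* Build a snapshot-style range, leaving the ignored fields unset. */
static struct mock_range mock_snapshot_range(uint64_t *pfns, uint8_t shift)
{
	assert(pfns);

	struct mock_range range = {
		.pfns = pfns,
		.pfn_shift = shift,
		/* default_flags and pfn_flags_mask are omitted, so both are 0 */
	};

	return range;
}
```

Dropping the explicit `.pfn_flags_mask = ~0ULL` line from the snapshot
initializer would have the same effect as writing `= 0`, with less noise.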

Jason