Re: [Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2019-10-16 Thread Bjorn Helgaas
On Wed, Oct 16, 2019 at 11:48:22PM +0200, Karol Herbst wrote:
> On Wed, Oct 16, 2019 at 11:37 PM Bjorn Helgaas  wrote:
> > On Wed, Oct 16, 2019 at 09:18:32PM +0200, Karol Herbst wrote:
> > > but setting the PCI_DEV_FLAGS_NO_D3 flag does prevent using the
> > > platform means of putting the device into D3cold, right? That's
> > > actually what should still happen, just the D3hot step should be
> > > skipped.
> >
> > If I understand correctly, when we put a device in D3cold on an ACPI
> > system, we do something like this:
> >
> >   pci_set_power_state(D3cold)
> >     if (PCI_DEV_FLAGS_NO_D3)
> >       return 0                                   <-- nothing at all if quirked
> >     pci_raw_set_power_state
> >       pci_write_config_word(PCI_PM_CTRL, D3hot)  <-- set to D3hot
> >     __pci_complete_power_transition(D3cold)
> >       pci_platform_power_transition(D3cold)
> >         platform_pci_set_power_state(D3cold)
> >           acpi_pci_set_power_state(D3cold)
> >             acpi_device_set_power(ACPI_STATE_D3_COLD)
> >               ...
> >                 acpi_evaluate_object("_OFF")     <-- set to D3cold
> >
> > I did not understand the connection with platform (ACPI) power
> > management from your patch.  It sounds like you want this entire path
> > except that you want to skip the PCI_PM_CTRL write?
> >
> 
> Exactly. I have been running with this workaround for a while now and have
> not had any failures since. The GPU gets turned off correctly and I see the
> same power savings, but the GPU can now be powered on again.
> 
> > That seems like something Rafael should weigh in on.  I don't know
> > why we set the device to D3hot with PCI_PM_CTRL before using the ACPI
> > methods, and I don't know what the effect of skipping that is.  It
> > seems a little messy to slice out this tiny piece from the middle, but
> > maybe it makes sense.
> >
> 
> As far as I know from talking with others about this in the past, Windows
> also does that before using ACPI calls, but maybe it has similar
> workarounds for certain Intel bridges as well? I am sure this affects
> more bridges than the one I am blacklisting here, but I would rather check
> each device before blacklisting all Kaby Lake and Skylake bridges (as
> those are the ones where this issue can be observed).

From a quick look at the ACPI spec, I didn't see conditions like "OSPM
must put PCI devices in D3hot before executing _OFF".  But obviously
there's *some* reason and I probably just missed it.

> Sadly we had no luck getting any information about such a workaround out
> of Nvidia or Intel.

I'm not surprised; it doesn't seem like we really have the details
needed to get to a root cause yet.  I think what we really need is a
PCIe analyzer trace to see what happens when the device "falls off the
bus".

Bjorn
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau

Re: [Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2019-10-16 Thread Karol Herbst
On Wed, Oct 16, 2019 at 11:37 PM Bjorn Helgaas  wrote:
>
> [+cc linux-acpi]
>
> On Wed, Oct 16, 2019 at 09:18:32PM +0200, Karol Herbst wrote:
> > but setting the PCI_DEV_FLAGS_NO_D3 flag does prevent using the
> > platform means of putting the device into D3cold, right? That's
> > actually what should still happen, just the D3hot step should be
> > skipped.
>
> If I understand correctly, when we put a device in D3cold on an ACPI
> system, we do something like this:
>
>   pci_set_power_state(D3cold)
>     if (PCI_DEV_FLAGS_NO_D3)
>       return 0                                   <-- nothing at all if quirked
>     pci_raw_set_power_state
>       pci_write_config_word(PCI_PM_CTRL, D3hot)  <-- set to D3hot
>     __pci_complete_power_transition(D3cold)
>       pci_platform_power_transition(D3cold)
>         platform_pci_set_power_state(D3cold)
>           acpi_pci_set_power_state(D3cold)
>             acpi_device_set_power(ACPI_STATE_D3_COLD)
>               ...
>                 acpi_evaluate_object("_OFF")     <-- set to D3cold
>
> I did not understand the connection with platform (ACPI) power
> management from your patch.  It sounds like you want this entire path
> except that you want to skip the PCI_PM_CTRL write?
>

Exactly. I have been running with this workaround for a while now and have
not had any failures since. The GPU gets turned off correctly and I see the
same power savings, but the GPU can now be powered on again.

> That seems like something Rafael should weigh in on.  I don't know
> why we set the device to D3hot with PCI_PM_CTRL before using the ACPI
> methods, and I don't know what the effect of skipping that is.  It
> seems a little messy to slice out this tiny piece from the middle, but
> maybe it makes sense.
>

As far as I know from talking with others about this in the past, Windows
also does that before using ACPI calls, but maybe it has similar
workarounds for certain Intel bridges as well? I am sure this affects
more bridges than the one I am blacklisting here, but I would rather check
each device before blacklisting all Kaby Lake and Skylake bridges (as
those are the ones where this issue can be observed).

Sadly we had no luck getting any information about such a workaround out
of Nvidia or Intel.

> > On Wed, Oct 16, 2019 at 9:14 PM Bjorn Helgaas  wrote:
> > >
> > > On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote:
> > > > Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher 
> > > > device
> > > > states.
> > > >
> > > > v2: convert to pci_dev quirk
> > > > put a proper technical explanation of the issue as a in-code comment
> > > > v3: disable it only for certain combinations of intel and nvidia 
> > > > hardware
> > > >
> > > > Signed-off-by: Karol Herbst 
> > > > Cc: Bjorn Helgaas 
> > > > Cc: Lyude Paul 
> > > > Cc: Rafael J. Wysocki 
> > > > Cc: Mika Westerberg 
> > > > Cc: linux-...@vger.kernel.org
> > > > Cc: linux...@vger.kernel.org
> > > > Cc: dri-de...@lists.freedesktop.org
> > > > Cc: nouveau@lists.freedesktop.org
> > > > ---
> > > >  drivers/pci/pci.c| 11 ++
> > > >  drivers/pci/quirks.c | 52 
> > > >  include/linux/pci.h  |  1 +
> > > >  3 files changed, 64 insertions(+)
> > > >
> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index b97d9e10c9cc..8e056eb7e6ff 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > > > @@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct 
> > > > pci_dev *dev)
> > > >   return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false;
> > > >  }
> > > >
> > > > +static inline bool parent_broken_child_pm(struct pci_dev *dev)
> > > > +{
> > > > + if (!dev->bus || !dev->bus->self)
> > > > + return false;
> > > > + return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm;
> > > > +}
> > > > +
> > > >  /**
> > > >   * pci_raw_set_power_state - Use PCI PM registers to set the power 
> > > > state of
> > > >   *given PCI device
> > > > @@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev 
> > > > *dev, pci_power_t state)
> > > >  || (state == PCI_D2 && !dev->d2_support))
> > > >   return -EIO;
> > > >
> > > > + /* check if the bus controller causes issues */
> > > > + if (state != PCI_D0 && parent_broken_child_pm(dev))
> > > > + return 0;
> > > > +
> > > >   pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, );
> > > >
> > > >   /*
> > > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > > index 44c4ae1abd00..c2f20b745dd4 100644
> > > > --- a/drivers/pci/quirks.c
> > > > +++ b/drivers/pci/quirks.c
> > > > @@ -5268,3 +5268,55 @@ static void 
> > > > quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev)
> > > >  DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1,
> > > > PCI_CLASS_DISPLAY_VGA, 8,
> > > > 

Re: [Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2019-10-16 Thread Bjorn Helgaas
[+cc linux-acpi]

On Wed, Oct 16, 2019 at 09:18:32PM +0200, Karol Herbst wrote:
> but setting the PCI_DEV_FLAGS_NO_D3 flag does prevent using the
> platform means of putting the device into D3cold, right? That's
> actually what should still happen, just the D3hot step should be
> skipped.

If I understand correctly, when we put a device in D3cold on an ACPI
system, we do something like this:

  pci_set_power_state(D3cold)
    if (PCI_DEV_FLAGS_NO_D3)
      return 0                                   <-- nothing at all if quirked
    pci_raw_set_power_state
      pci_write_config_word(PCI_PM_CTRL, D3hot)  <-- set to D3hot
    __pci_complete_power_transition(D3cold)
      pci_platform_power_transition(D3cold)
        platform_pci_set_power_state(D3cold)
          acpi_pci_set_power_state(D3cold)
            acpi_device_set_power(ACPI_STATE_D3_COLD)
              ...
                acpi_evaluate_object("_OFF")     <-- set to D3cold

I did not understand the connection with platform (ACPI) power
management from your patch.  It sounds like you want this entire path
except that you want to skip the PCI_PM_CTRL write?

That seems like something Rafael should weigh in on.  I don't know
why we set the device to D3hot with PCI_PM_CTRL before using the ACPI
methods, and I don't know what the effect of skipping that is.  It
seems a little messy to slice out this tiny piece from the middle, but
maybe it makes sense.
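
The difference between the two approaches discussed here can be modeled in a
few lines of userspace C. This is only a sketch: the enum, the struct and the
model_* names are made up for illustration and are not kernel APIs. It simply
records which steps of the D3cold path would run for an unquirked device, for
a device with PCI_DEV_FLAGS_NO_D3, and for a device hitting the check added by
the v3 patch.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum model_quirk {
	MODEL_QUIRK_NONE,       /* normal path */
	MODEL_QUIRK_NO_D3,      /* models PCI_DEV_FLAGS_NO_D3 */
	MODEL_QUIRK_SKIP_PMCSR, /* models the early return added by the v3 patch */
};

struct model_result {
	bool pmcsr_written; /* PCI_PM_CTRL was set to D3hot */
	bool acpi_off_run;  /* the platform (_OFF) step was reached */
};

/* Walk the D3cold path and record which steps would run. */
static struct model_result model_set_d3cold(enum model_quirk quirk)
{
	struct model_result res = { false, false };

	/* pci_set_power_state(): a NO_D3 device returns before anything happens */
	if (quirk == MODEL_QUIRK_NO_D3)
		return res;

	/* pci_raw_set_power_state(): the patch returns before the PMCSR write */
	if (quirk != MODEL_QUIRK_SKIP_PMCSR)
		res.pmcsr_written = true;

	/* __pci_complete_power_transition() -> ... -> _OFF */
	res.acpi_off_run = true;
	return res;
}
```

In this model only MODEL_QUIRK_SKIP_PMCSR reaches the _OFF step without the
PCI_PM_CTRL write, which is the behavior the patch is after.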

> On Wed, Oct 16, 2019 at 9:14 PM Bjorn Helgaas  wrote:
> >
> > On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote:
> > > Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher 
> > > device
> > > states.
> > >
> > > v2: convert to pci_dev quirk
> > > put a proper technical explanation of the issue as a in-code comment
> > > v3: disable it only for certain combinations of intel and nvidia hardware
> > >
> > > Signed-off-by: Karol Herbst 
> > > Cc: Bjorn Helgaas 
> > > Cc: Lyude Paul 
> > > Cc: Rafael J. Wysocki 
> > > Cc: Mika Westerberg 
> > > Cc: linux-...@vger.kernel.org
> > > Cc: linux...@vger.kernel.org
> > > Cc: dri-de...@lists.freedesktop.org
> > > Cc: nouveau@lists.freedesktop.org
> > > ---
> > >  drivers/pci/pci.c| 11 ++
> > >  drivers/pci/quirks.c | 52 
> > >  include/linux/pci.h  |  1 +
> > >  3 files changed, 64 insertions(+)
> > >
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index b97d9e10c9cc..8e056eb7e6ff 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> > > @@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct 
> > > pci_dev *dev)
> > >   return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false;
> > >  }
> > >
> > > +static inline bool parent_broken_child_pm(struct pci_dev *dev)
> > > +{
> > > + if (!dev->bus || !dev->bus->self)
> > > + return false;
> > > + return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm;
> > > +}
> > > +
> > >  /**
> > >   * pci_raw_set_power_state - Use PCI PM registers to set the power state 
> > > of
> > >   *given PCI device
> > > @@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev 
> > > *dev, pci_power_t state)
> > >  || (state == PCI_D2 && !dev->d2_support))
> > >   return -EIO;
> > >
> > > + /* check if the bus controller causes issues */
> > > + if (state != PCI_D0 && parent_broken_child_pm(dev))
> > > + return 0;
> > > +
> > >   pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, );
> > >
> > >   /*
> > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > > index 44c4ae1abd00..c2f20b745dd4 100644
> > > --- a/drivers/pci/quirks.c
> > > +++ b/drivers/pci/quirks.c
> > > @@ -5268,3 +5268,55 @@ static void 
> > > quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev)
> > >  DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1,
> > > PCI_CLASS_DISPLAY_VGA, 8,
> > > quirk_reset_lenovo_thinkpad_p50_nvgpu);
> > > +
> > > +/*
> > > + * Some Intel PCIe bridges cause devices to disappear from the PCIe bus 
> > > after
> > > + * those were put into D3cold state if they were put into a non D0 PCI PM
> > > + * device state before doing so.
> > > + *
> > > + * This leads to various issues which all manifest differently,
> > > + * but have the same root cause:
> > > + *  - AML code execution hits an infinite loop (as the code waits on device
> > > + *    memory to change).
> > > + *  - kernel crashes, as all pci reads return -1, which most code isn't 
> > > able
> > > + *to handle well enough.
> > > + *  - sudden shutdowns, as the kernel identified an unrecoverable error 
> > > after
> > > + *userspace tries to access the GPU.
> > > + *
> > > + * In all cases dmesg will contain at least one line like this:
> > > + * 'nouveau :01:00.0: Refused to change power state, currently in D3'
> > > + 

[Nouveau] [Bug 75985] [NVC1] HDMI audio device only visible after rescan

2019-10-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=75985

--- Comment #114 from Lukas Wunner  ---
(In reply to Przemysław Kopa from comment #113)
> (In reply to Lukas Wunner from comment #112)
> > Glad to hear. You don't seem to have any commits in the kernel so far. Would
> > you like to try and bake these changes into a proper patch? If not I'll
> > gladly create and submit the patch myself but mentoring someone else make
> > their first contribution is more beneficial to the community, hence my
> > question.
> 
> Lukas, could you please handle it this time? Sorry for not posting sooner.

Sure thing.

Just one question, you wrote that you had to add "HDA_CODEC_ENTRY(0x10de0403,
"GPU 0403 HDMI/DP", patch_nvhdmi)" to snd_hda_id_hdmi[] with the rationale that
the "PCI ID of my Nvidia HDA wasn't there".

This confuses me because the PCI device ID of the HDA controller is "0bea" and
"0403" are the 16 most significant bits of the PCI class ID.

HDA_CODEC_ENTRY() needs to match for the 32-bit HD audio vendor ID. Just to
double-check, could you execute "cat
/sys/bus/pci/devices/:01:00.1/hdaudioC1D0/vendor_id" and post the result
here? Is it really 0x10de0403? Thanks!
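
The distinction between the two IDs can be illustrated with a short C sketch.
The helper names are hypothetical. It assumes that the 32-bit HD audio vendor
ID carries the 16-bit vendor in its upper half and a 16-bit device part in its
lower half, and that the PCI class code is 24 bits wide, so its 16 most
significant bits are obtained by dropping the low byte.

```c
#include <stdint.h>

/* Hypothetical helpers, for illustration only. */

/* Upper 16 bits of the 32-bit HD audio vendor ID: the vendor. */
static uint16_t hda_id_vendor(uint32_t vendor_id)
{
	return (uint16_t)(vendor_id >> 16);
}

/* Lower 16 bits of the 32-bit HD audio vendor ID: the device part. */
static uint16_t hda_id_device(uint32_t vendor_id)
{
	return (uint16_t)(vendor_id & 0xffff);
}

/* 16 most significant bits of a 24-bit PCI class code. */
static uint16_t pci_class_top16(uint32_t class_code)
{
	return (uint16_t)((class_code >> 8) & 0xffff);
}
```

With these helpers, 0x10de0403 splits into vendor 0x10de and device part
0x0403, while a class code of 0x040300 also yields 0x0403, which is why the
value looks like a class ID rather than a codec ID.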

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2019-10-16 Thread Karol Herbst
but setting the PCI_DEV_FLAGS_NO_D3 flag does prevent using the
platform means of putting the device into D3cold, right? That's
actually what should still happen, just the D3hot step should be
skipped.

On Wed, Oct 16, 2019 at 9:14 PM Bjorn Helgaas  wrote:
>
> On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote:
> > Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher device
> > states.
> >
> > v2: convert to pci_dev quirk
> > put a proper technical explanation of the issue as a in-code comment
> > v3: disable it only for certain combinations of intel and nvidia hardware
> >
> > Signed-off-by: Karol Herbst 
> > Cc: Bjorn Helgaas 
> > Cc: Lyude Paul 
> > Cc: Rafael J. Wysocki 
> > Cc: Mika Westerberg 
> > Cc: linux-...@vger.kernel.org
> > Cc: linux...@vger.kernel.org
> > Cc: dri-de...@lists.freedesktop.org
> > Cc: nouveau@lists.freedesktop.org
> > ---
> >  drivers/pci/pci.c| 11 ++
> >  drivers/pci/quirks.c | 52 
> >  include/linux/pci.h  |  1 +
> >  3 files changed, 64 insertions(+)
> >
> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index b97d9e10c9cc..8e056eb7e6ff 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c
> > @@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct 
> > pci_dev *dev)
> >   return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false;
> >  }
> >
> > +static inline bool parent_broken_child_pm(struct pci_dev *dev)
> > +{
> > + if (!dev->bus || !dev->bus->self)
> > + return false;
> > + return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm;
> > +}
> > +
> >  /**
> >   * pci_raw_set_power_state - Use PCI PM registers to set the power state of
> >   *given PCI device
> > @@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev 
> > *dev, pci_power_t state)
> >  || (state == PCI_D2 && !dev->d2_support))
> >   return -EIO;
> >
> > + /* check if the bus controller causes issues */
> > + if (state != PCI_D0 && parent_broken_child_pm(dev))
> > + return 0;
> > +
> >   pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, );
> >
> >   /*
> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 44c4ae1abd00..c2f20b745dd4 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -5268,3 +5268,55 @@ static void 
> > quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev)
> >  DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1,
> > PCI_CLASS_DISPLAY_VGA, 8,
> > quirk_reset_lenovo_thinkpad_p50_nvgpu);
> > +
> > +/*
> > + * Some Intel PCIe bridges cause devices to disappear from the PCIe bus 
> > after
> > + * those were put into D3cold state if they were put into a non D0 PCI PM
> > + * device state before doing so.
> > + *
> > + * This leads to various issues which all manifest differently,
> > + * but have the same root cause:
> > + *  - AML code execution hits an infinite loop (as the code waits on device
> > + *    memory to change).
> > + *  - kernel crashes, as all pci reads return -1, which most code isn't 
> > able
> > + *to handle well enough.
> > + *  - sudden shutdowns, as the kernel identified an unrecoverable error 
> > after
> > + *userspace tries to access the GPU.
> > + *
> > + * In all cases dmesg will contain at least one line like this:
> > + * 'nouveau :01:00.0: Refused to change power state, currently in D3'
> > + * followed by a lot of nouveau timeouts.
> > + *
> > + * ACPI code writes bit 0x80 to the not documented PCI register 0x248 of 
> > the
> > + * PCIe bridge controller in order to power down the GPU.
> > + * Nonetheless, there are other code paths inside the ACPI firmware which 
> > use
> > + * other registers, which seem to work fine:
> > + *  - 0xbc bit 0x20 (publicly available documentation claims 'reserved')
> > + *  - 0xb0 bit 0x10 (link disable)
> > + * Changing the conditions inside the firmware by poking into the relevant
> > + * addresses does resolve the issue, but it seemed to be ACPI private 
> > memory
> > + * and not any device accessible memory at all, so there is no portable 
> > way of
> > + * changing the conditions.
> > + *
> > + * The only systems where this behavior can be seen are hybrid graphics 
> > laptops
> > + * with a secondary Nvidia Pascal GPU. It cannot be ruled out that this 
> > issue
> > + * only occurs in combination with listed Intel PCIe bridge controllers and
> > + * the mentioned GPUs or if it's only a hw bug in the bridge controller.
> > + *
> > + * But because this issue was NOT seen on laptops with an Nvidia Pascal GPU
> > + * and an Intel Coffee Lake SoC, there is a higher chance of there being a 
> > bug
> > + * in the bridge controller rather than in the GPU.
> > + *
> > + * This issue was not able to be reproduced on non laptop systems.
> > + */
> > +

Re: [Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2019-10-16 Thread Bjorn Helgaas
On Wed, Oct 16, 2019 at 04:44:49PM +0200, Karol Herbst wrote:
> Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher device
> states.
> 
> v2: convert to pci_dev quirk
> put a proper technical explanation of the issue as a in-code comment
> v3: disable it only for certain combinations of intel and nvidia hardware
> 
> Signed-off-by: Karol Herbst 
> Cc: Bjorn Helgaas 
> Cc: Lyude Paul 
> Cc: Rafael J. Wysocki 
> Cc: Mika Westerberg 
> Cc: linux-...@vger.kernel.org
> Cc: linux...@vger.kernel.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: nouveau@lists.freedesktop.org
> ---
>  drivers/pci/pci.c| 11 ++
>  drivers/pci/quirks.c | 52 
>  include/linux/pci.h  |  1 +
>  3 files changed, 64 insertions(+)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b97d9e10c9cc..8e056eb7e6ff 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev)
>  	return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false;
>  }
>  
> +static inline bool parent_broken_child_pm(struct pci_dev *dev)
> +{
> +	if (!dev->bus || !dev->bus->self)
> +		return false;
> +	return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm;
> +}
> +
>  /**
>   * pci_raw_set_power_state - Use PCI PM registers to set the power state of
>   *                           given PCI device
> @@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
>  	    || (state == PCI_D2 && !dev->d2_support))
>  		return -EIO;
>  
> +	/* check if the bus controller causes issues */
> +	if (state != PCI_D0 && parent_broken_child_pm(dev))
> +		return 0;
> +
>  	pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, );
>  
>  	/*
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 44c4ae1abd00..c2f20b745dd4 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5268,3 +5268,55 @@ static void quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev)
>  DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1,
>  			      PCI_CLASS_DISPLAY_VGA, 8,
>  			      quirk_reset_lenovo_thinkpad_p50_nvgpu);
> +
> +/*
> + * Some Intel PCIe bridges cause devices to disappear from the PCIe bus after
> + * those were put into D3cold state if they were put into a non D0 PCI PM
> + * device state before doing so.
> + *
> + * This leads to various issues which all manifest differently,
> + * but have the same root cause:
> + *  - AML code execution hits an infinite loop (as the code waits on device
> + *    memory to change).
> + *  - kernel crashes, as all pci reads return -1, which most code isn't able
> + *    to handle well enough.
> + *  - sudden shutdowns, as the kernel identified an unrecoverable error after
> + *    userspace tries to access the GPU.
> + *
> + * In all cases dmesg will contain at least one line like this:
> + * 'nouveau :01:00.0: Refused to change power state, currently in D3'
> + * followed by a lot of nouveau timeouts.
> + *
> + * ACPI code writes bit 0x80 to the not documented PCI register 0x248 of the
> + * PCIe bridge controller in order to power down the GPU.
> + * Nonetheless, there are other code paths inside the ACPI firmware which use
> + * other registers, which seem to work fine:
> + *  - 0xbc bit 0x20 (publicly available documentation claims 'reserved')
> + *  - 0xb0 bit 0x10 (link disable)
> + * Changing the conditions inside the firmware by poking into the relevant
> + * addresses does resolve the issue, but it seemed to be ACPI private memory
> + * and not any device accessible memory at all, so there is no portable way of
> + * changing the conditions.
> + *
> + * The only systems where this behavior can be seen are hybrid graphics laptops
> + * with a secondary Nvidia Pascal GPU. It cannot be ruled out that this issue
> + * only occurs in combination with listed Intel PCIe bridge controllers and
> + * the mentioned GPUs or if it's only a hw bug in the bridge controller.
> + *
> + * But because this issue was NOT seen on laptops with an Nvidia Pascal GPU
> + * and an Intel Coffee Lake SoC, there is a higher chance of there being a bug
> + * in the bridge controller rather than in the GPU.
> + *
> + * This issue was not able to be reproduced on non laptop systems.
> + */
> +
> +static void quirk_broken_nv_runpm(struct pci_dev *dev)
> +{
> + dev->broken_nv_runpm = 1;

Can you use the existing PCI_DEV_FLAGS_NO_D3 flag for this instead of
adding a new flag?

I would put the parent_broken_child_pm() logic here, if possible,
e.g., something like:

  struct pci_dev *bridge = pci_upstream_bridge(dev);

  if (bridge &&
      bridge->vendor == PCI_VENDOR_ID_INTEL && bridge->device == 0x1901)
    dev->dev_flags |= PCI_DEV_FLAGS_NO_D3;

> +}
> 
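
The matching logic suggested above can be tried outside the kernel with a
small model. This is a sketch under stated assumptions: struct model_pci_dev,
the MODEL_* constants and model_quirk_nv_runpm() are hypothetical stand-ins
and not the kernel definitions, and the flag bit is chosen arbitrarily for the
model. Only the bridge test itself follows the snippet above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
#define MODEL_VENDOR_ID_INTEL 0x8086
#define MODEL_DEV_FLAGS_NO_D3 (1u << 1) /* arbitrary bit chosen for the model */

struct model_pci_dev {
	uint16_t vendor;
	uint16_t device;
	unsigned int dev_flags;
	struct model_pci_dev *upstream; /* NULL for a device without a bridge */
};

/* Set the no-D3 flag only when the device sits below the 0x1901 Intel bridge. */
static void model_quirk_nv_runpm(struct model_pci_dev *dev)
{
	struct model_pci_dev *bridge = dev->upstream;

	if (bridge &&
	    bridge->vendor == MODEL_VENDOR_ID_INTEL && bridge->device == 0x1901)
		dev->dev_flags |= MODEL_DEV_FLAGS_NO_D3;
}
```

Keeping the bridge check inside the quirk means the flag is only set on the
affected combination, so no second per-device flag is needed in this model.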

Re: [Nouveau] [PATCH] drm: Generalized NV Block Linear DRM format mod

2019-10-16 Thread James Jones

On 10/15/19 8:42 AM, Daniel Vetter wrote:

On Tue, Oct 15, 2019 at 5:14 PM James Jones  wrote:


On 10/15/19 7:19 AM, Daniel Vetter wrote:

On Mon, Oct 14, 2019 at 03:13:21PM -0700, James Jones wrote:

Builds upon the existing NVIDIA 16Bx2 block linear
format modifiers by adding more "fields" to the
existing parameterized
DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK format modifier
macro that allow fully defining a unique-across-
all-NVIDIA-hardware bit layout using a minimal
set of fields and values.  The new modifier macro
DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D is
effectively backwards compatible with the existing
macro, introducing a superset of the previously
definable format modifiers.

Backwards compatibility has two quirks.  First,
the zero value for the "kind" field, which is
implied by the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK
macro, must be special cased in drivers and
assumed to map to the pre-Turing generic kind of
0xfe, since a kind of "zero" is reserved for
linear buffer layouts on all GPUs.

Second, it is assumed backwards compatibility
is only needed when running on Tegra GPUs, and
specifically Tegra GPUs prior to Xavier.  This
is based on two assertions:

-Tegra GPUs prior to Xavier used a slightly
   different raw bit layout than desktop GPUs,
   making it impossible to directly share block
   linear buffers between the two.

-Support for the existing block linear modifiers
   was incomplete, making them useful only for
   exporting buffers created by nouveau and
   importing them to Tegra DRM as framebuffers for
   scan out.  There was no support for adding
   framebuffers using format modifiers in nouveau,
   nor importing dma-buf/PRIME GEM objects into
   nouveau userspace drivers with modifiers in Mesa.

Hence it is assumed the prior modifiers were not
intended for use on desktop GPUs, and as a
corollary, were not intended to support sharing
block linear buffers across two different NVIDIA
GPUs.

Signed-off-by: James Jones 
---
   include/uapi/drm/drm_fourcc.h | 108 +++---
   1 file changed, 100 insertions(+), 8 deletions(-)

diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index 3feeaa3f987a..cc9853d42a24 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -497,7 +497,99 @@ extern "C" {
   #define DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED fourcc_mod_code(NVIDIA, 1)

   /*
- * 16Bx2 Block Linear layout, used by desktop GPUs, and Tegra K1 and later
+ * Generalized Block Linear layout, used by desktop GPUs starting with NV50/G80,
+ * and Tegra GPUs starting with Tegra K1.
+ *
+ * Pixels are arranged in Groups of Bytes (GOBs).  GOB size and layout varies
+ * based on the architecture generation.  GOBs themselves are then arranged in
+ * 3D blocks, with the block dimensions (in terms of GOBs) always being a power
+ * of two, and hence expressible as their log2 equivalent (E.g., "2" represents
+ * a block depth or height of "4").
+ *
+ * Chapter 20 "Pixel Memory Formats" of the Tegra X1 TRM describes this format
+ * in full detail.
+ *
+ *       Macro
+ * Bits  Param Description
+ * ----  ----- -----------------------------------------------------------------
+ *
+ *  3:0  h     log2(height) of each block, in GOBs.  Placed here for
+ *             compatibility with the existing
+ *             DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK()-based modifiers.
+ *
+ *  4:4  -     Must be 1, to indicate block-linear layout.  Necessary for
+ *             compatibility with the existing
+ *             DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK()-based modifiers.
+ *
+ *  8:5  -     Reserved (To support 3D-surfaces with variable log2(depth) block
+ *             size).  Must be zero.
+ *
+ *             Note there is no log2(width) parameter.  Some portions of the
+ *             hardware support a block width of two gobs, but it is impractical
+ *             to use due to lack of support elsewhere, and has no known
+ *             benefits.
+ *
+ * 11:9  -     Reserved (To support 2D-array textures with variable array stride
+ *             in blocks, specified via log2(tile width in blocks)).  Must be
+ *             zero.
+ *
+ * 19:12 k     Page Kind.  This value directly maps to a field in the page
+ *             tables of all GPUs >= NV50.  It affects the exact layout of bits
+ *             in memory and can be derived from the tuple
+ *
+ *               (format, GPU model, compression type, samples per pixel)
+ *
+ *             Where compression type is defined below.  If GPU model were
+ *             implied by the format modifier, format, or memory buffer, page
+ *             kind would not need to be included in the modifier itself, but
+ *             since the modifier should define the layout of the associated
+ *             memory buffer independent from any device or other context, it
+ *             must be included here.
+ *
+ *             To grandfather in prior block linear format modifiers to this
+ *             layout, 

[Nouveau] [PATCH v2] drm: Generalized NV Block Linear DRM format mod

2019-10-16 Thread James Jones
Builds upon the existing NVIDIA 16Bx2 block linear
format modifiers by adding more "fields" to the
existing parameterized
DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK format modifier
macro that allow fully defining a unique-across-
all-NVIDIA-hardware bit layout using a minimal
set of fields and values.  The new modifier macro
DRM_FORMAT_MOD_NVIDIA_BLOCK_LINEAR_2D is
effectively backwards compatible with the existing
macro, introducing a superset of the previously
definable format modifiers.

Backwards compatibility has two quirks.  First,
the zero value for the "kind" field, which is
implied by the DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK
macro, must be special cased in drivers and
assumed to map to the pre-Turing generic kind of
0xfe, since a kind of "zero" is reserved for
linear buffer layouts on all GPUs.

Second, it is assumed backwards compatibility
is only needed when running on Tegra GPUs, and
specifically Tegra GPUs prior to Xavier.  This
is based on two assertions:

-Tegra GPUs prior to Xavier used a slightly
 different raw bit layout than desktop GPUs,
 making it impossible to directly share block
 linear buffers between the two.

-Support for the existing block linear modifiers
 was incomplete, making them useful only for
 exporting buffers created by nouveau and
 importing them to Tegra DRM as framebuffers for
 scan out.  There was no support for adding
 framebuffers using format modifiers in nouveau,
 nor importing dma-buf/PRIME GEM objects into
 nouveau userspace drivers with modifiers in Mesa.

Hence it is assumed the prior modifiers were not
intended for use on desktop GPUs, and as a
corollary, were not intended to support sharing
block linear buffers across two different NVIDIA
GPUs.
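
As a rough illustration of the first quirk, the following C sketch decodes the
low bits of a modifier value using the field positions documented in the patch
(bits 3:0 for log2 of the block height, bits 19:12 for the page kind) and maps
a stored kind of zero to 0xfe. The nvmod_* helper names are hypothetical and
are not the functions added by this patch, and the sketch ignores the second
quirk (it does not check which GPU it is running on).

```c
#include <stdint.h>

/* Hypothetical helpers; the names are not from the patch. */

/* Bits 3:0: log2(block height in GOBs). */
static unsigned int nvmod_log2_height(uint64_t mod)
{
	return (unsigned int)(mod & 0xf);
}

/* Bits 19:12: page kind as stored in the modifier. */
static unsigned int nvmod_raw_kind(uint64_t mod)
{
	return (unsigned int)((mod >> 12) & 0xff);
}

/*
 * Effective page kind: a stored kind of zero comes from the legacy
 * 16BX2 macro and is treated as the pre-Turing generic kind 0xfe.
 */
static unsigned int nvmod_effective_kind(uint64_t mod)
{
	unsigned int kind = nvmod_raw_kind(mod);

	return kind ? kind : 0xfe;
}
```

A legacy value with only the block height and bit 4 set therefore decodes to
kind 0xfe, while a value with an explicit non-zero kind is returned unchanged.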

v2:
  - Added canonicalize helper function

Signed-off-by: James Jones 
---
 include/uapi/drm/drm_fourcc.h | 116 +++---
 1 file changed, 108 insertions(+), 8 deletions(-)

diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index 3feeaa3f987a..56c8fe30caab 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -497,7 +497,107 @@ extern "C" {
 #define DRM_FORMAT_MOD_NVIDIA_TEGRA_TILED fourcc_mod_code(NVIDIA, 1)
 
 /*
- * 16Bx2 Block Linear layout, used by desktop GPUs, and Tegra K1 and later
+ * Generalized Block Linear layout, used by desktop GPUs starting with NV50/G80,
+ * and Tegra GPUs starting with Tegra K1.
+ *
+ * Pixels are arranged in Groups of Bytes (GOBs).  GOB size and layout varies
+ * based on the architecture generation.  GOBs themselves are then arranged in
+ * 3D blocks, with the block dimensions (in terms of GOBs) always being a power
+ * of two, and hence expressible as their log2 equivalent (E.g., "2" represents
+ * a block depth or height of "4").
+ *
+ * Chapter 20 "Pixel Memory Formats" of the Tegra X1 TRM describes this format
+ * in full detail.
+ *
+ *       Macro
+ * Bits  Param Description
+ * ----  ----- -----------------------------------------------------------------
+ *
+ *  3:0  h     log2(height) of each block, in GOBs.  Placed here for
+ *             compatibility with the existing
+ *             DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK()-based modifiers.
+ *
+ *  4:4  -     Must be 1, to indicate block-linear layout.  Necessary for
+ *             compatibility with the existing
+ *             DRM_FORMAT_MOD_NVIDIA_16BX2_BLOCK()-based modifiers.
+ *
+ *  8:5  -     Reserved (To support 3D-surfaces with variable log2(depth) block
+ *             size).  Must be zero.
+ *
+ *             Note there is no log2(width) parameter.  Some portions of the
+ *             hardware support a block width of two gobs, but it is impractical
+ *             to use due to lack of support elsewhere, and has no known
+ *             benefits.
+ *
+ * 11:9  -     Reserved (To support 2D-array textures with variable array stride
+ *             in blocks, specified via log2(tile width in blocks)).  Must be
+ *             zero.
+ *
+ * 19:12 k     Page Kind.  This value directly maps to a field in the page
+ *             tables of all GPUs >= NV50.  It affects the exact layout of bits
+ *             in memory and can be derived from the tuple
+ *
+ *               (format, GPU model, compression type, samples per pixel)
+ *
+ *             Where compression type is defined below.  If GPU model were
+ *             implied by the format modifier, format, or memory buffer, page
+ *             kind would not need to be included in the modifier itself, but
+ *             since the modifier should define the layout of the associated
+ *             memory buffer independent from any device or other context, it
+ *             must be included here.
+ *
+ * 21:20 g     GOB Height and Page Kind Generation.  The height of a GOB changed
+ *             starting with Fermi GPUs.  Additionally, the mapping between page
+ *             kind and bit layout has changed at various points.
+ *
+ *               0 = Gob Height 8, Fermi - 

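The bit table in the (truncated) drm_fourcc.h comment above can be read as a
simple packing scheme. The following is only a rough sketch under stated
assumptions: the names sketch_pack_block_linear(), sketch_block_height_gobs()
and sketch_page_kind() are hypothetical and are not the macros the patch adds,
the vendor code that a real modifier carries in its upper bits is left out, and
only the fields visible before the truncation (h, the block-linear bit, k, g)
are handled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field positions, copied from the bit table in the comment. */
#define SKETCH_BLOCK_HEIGHT_SHIFT 0   /* bits 3:0,   "h": log2(block height) */
#define SKETCH_BLOCK_LINEAR_BIT   4   /* bit 4:      must be 1               */
#define SKETCH_PAGE_KIND_SHIFT    12  /* bits 19:12, "k": page kind          */
#define SKETCH_GOB_KIND_GEN_SHIFT 20  /* bits 21:20, "g": GOB height/kind gen */

/*
 * Pack h, k and g into the low bits of a modifier value.  The reserved
 * fields (bits 8:5 and 11:9) stay zero, as the comment requires.
 */
static uint64_t sketch_pack_block_linear(unsigned h, unsigned k, unsigned g)
{
	/* Reject values that would spill into a neighbouring field. */
	assert(h <= 0xf && k <= 0xff && g <= 0x3);

	return ((uint64_t)h << SKETCH_BLOCK_HEIGHT_SHIFT) |
	       (UINT64_C(1) << SKETCH_BLOCK_LINEAR_BIT) |
	       ((uint64_t)k << SKETCH_PAGE_KIND_SHIFT) |
	       ((uint64_t)g << SKETCH_GOB_KIND_GEN_SHIFT);
}

/* Block height in GOBs: the field stores log2(height), so "2" means 4. */
static unsigned sketch_block_height_gobs(uint64_t mod)
{
	return 1u << (mod & 0xf);
}

/* Extract the 8-bit page kind from bits 19:12. */
static unsigned sketch_page_kind(uint64_t mod)
{
	return (unsigned)((mod >> SKETCH_PAGE_KIND_SHIFT) & 0xff);
}
```

For example, h = 2, k = 0xfe, g = 0 packs to 0xfe012 in this sketch, and
sketch_block_height_gobs() recovers a block height of 4 GOBs from it.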
Re: [Nouveau] nouveau kernel module will not load on old Sony Vaio laptop with 8400M GT

2019-10-16 Thread Felix Miata
Karol Herbst composed on 2019-10-16 15:25 (UTC+0200):

> Felix Miata wrote:

>> is there anyone here who can help with:

>> https://bugs.freedesktop.org/show_bug.cgi?id=111853
>> nouveau kernel module won't load (not available) on Sony laptop with NVIDIA 
>> G86M
>> [GeForce 8400M GT] ID: 10de:0426

>> ???

> do you know if it used to work with older kernels? If yes, maybe a git
> bisect on the kernel could help

I've updated the bug to indicate 3.16.7 and 4.2.6 kernels will load the kernel
nouveau module without producing expected X results, along with Xorg.0.logs.
-- 
Evolution as taught in public schools is religion, not science.

 Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!

Felix Miata  ***  http://fm.no-ip.com/
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau

[Nouveau] [Bug 111853] nouveau kernel module won't load (not available) on Sony laptop with NVIDIA G86M [GeForce 8400M GT] ID: 10de:0426

2019-10-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111853

--- Comment #11 from Felix Miata  ---
Created attachment 145758
  --> https://bugs.freedesktop.org/attachment.cgi?id=145758&action=edit
Xorg.0.log from live Knoppix 7.6.1

# lsmod | sort | grep veau
drm_kms_helper         70712  1 nouveau
mxm_wmi                 1635  1 nouveau
nouveau              1033769  0
ttm                    60685  1 nouveau
wmi                     7363  2 mxm_wmi,nouveau
# inxi -c0 -GxxSM
System:Host: Microknoppix Kernel: 4.2.6-64 x86_64 (64 bit gcc: 5.2.1)
   Console: tty 3 dm: kdm Distro: Debian GNU/Linux stretch/sid
Machine:   System: Sony (portable) product: VGN-AR730E v: C3LR1E11 serial:
28272434-3101919
   Mobo: Sony model: VAIO Bios: Phoenix v: R2090J8 date: 02/26/2008
   Chassis: type: 10
Graphics:  Card: NVIDIA G86M [GeForce 8400M GT]
   bus-ID: 01:00.0 chip-ID: 10de:0426
   Display Server: X.org 1.17.3 drivers: vesa,nouveau (unloaded: fbdev)
   tty size: 80x25 Advanced Data: N/A for root out of X
# dmesg | grep ailed
[0.770863] acpi PNP0A08:00: _OSC failed (AE_SUPPORT); disabling ASPM
[0.849357] pci :01:00.0: BAR 6: failed to assign [mem size 0x0002
pref]
[1.607734] ACPI Error: Method parse/execution failed
[\_SB_.PCI0.SATA.PRT0._SDD] (Node 8800bf08f338), AE_NOT_FOUND
(20150619/psparse-536)
[1.609103] ACPI Error: Method parse/execution failed
[\_SB_.PCI0.SATA.PRT0._SDD] (Node 8800bf08f338), AE_NOT_FOUND
(20150619/psparse-536)
[   37.465961] systemd-udevd[2004]: Process '/usr/sbin/alsactl -E
HOME=/var/run/alsa restore 0' failed with exit code 99.
[   38.226541] systemd-udevd[2008]: Process '/sbin/crda' failed with exit code
249.
[   38.229603] systemd-udevd[2008]: Process '/sbin/crda' failed with exit code
249.
[   42.469065] systemd-udevd[2086]: Process '/usr/sbin/alsactl -E
HOME=/var/run/alsa restore 0' failed with exit code 99.
[   69.564168] systemd-logind[2229]: Failed to start user service, ignoring:
Unknown unit: user@1000.service
[  100.435441] uvcvideo: Failed to query (129) UVC probe control : -32 (exp.
26).
[  100.435443] uvcvideo: Failed to initialize the device (-5).

Again, nouveau loads, but X video is scrambled VESA, again with no
/dev/dri/card0 or /dev/fb0.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Nouveau] [Bug 111853] nouveau kernel module won't load (not available) on Sony laptop with NVIDIA G86M [GeForce 8400M GT] ID: 10de:0426

2019-10-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=111853

--- Comment #10 from Felix Miata  ---
Created attachment 145757
  --> https://bugs.freedesktop.org/attachment.cgi?id=145757&action=edit
Xorg.0.log from live LMDE 2 Betsy boot

# uname -a
Linux stresslinux 2.6.37.6-0.5-default #1 SMP 2011-04-25 21:48:33 +0200 x86_64
x86_64 x86_64 GNU/Linux
# lsmod | sort | grep veau
button                  6797  1 nouveau
drm                   229676  3 nouveau,ttm,drm_kms_helper
drm_kms_helper         36630  1 nouveau
i2c_algo_bit            6342  1 nouveau
nouveau               678496  1
ttm                    72581  1 nouveau
video                  15865  1 nouveau

But Stresslinux 0.7.106 has no X :-(

# uname -a
Linux mint 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt7-1 (2015-03-01) x86_64
GNU/Linux
# lsmod | sort | grep veau
button                 12944  1 nouveau
drm                   249955  3 ttm,drm_kms_helper,nouveau
drm_kms_helper         49210  1 nouveau
i2c_algo_bit           12751  1 nouveau
i2c_core               46012  7 drm,i2c_i801,drm_kms_helper,i2c_algo_bit,v4l2_common,nouveau,videodev
mxm_wmi                12515  1 nouveau
nouveau              1122419  0
ttm                    77862  1 nouveau
video                  18096  1 nouveau
wmi                    17339  2 mxm_wmi,nouveau
# inxi -GxxS
System:Host: mint Kernel: 3.16.0-4-amd64 x86_64 (64 bit gcc: 4.8.4)
   Desktop: MATE 1.8.1 (Gtk 3.14.5+4) dm: mdm Distro: LinuxMint 2 betsy
Graphics:  Card: NVIDIA G86M [GeForce 8400M GT] bus-ID: 01:00.0 chip-ID:
10de:0426
   Display Server: X.Org 1.16.4 drivers: fbdev,vesa,nouveau Resolution:
1024x768@61.00hz
   GLX Renderer: Gallium 0.4 on llvmpipe (LLVM 3.5, 128 bits)
   GLX Version: 3.0 Mesa 10.3.2 Direct Rendering: Yes

With this old live distro, nouveau loads, but there's no /dev/dri/card0 or
/dev/fb0, so it's stuck in VESA 1024x768.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Nouveau] [Bug 75985] [NVC1] HDMI audio device only visible after rescan

2019-10-16 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=75985

--- Comment #113 from Przemysław Kopa  ---
(In reply to Lukas Wunner from comment #112)
>
> Glad to hear. You don't seem to have any commits in the kernel so far. Would
> you like to try and bake these changes into a proper patch? If not I'll
> gladly create and submit the patch myself but mentoring someone else make
> their first contribution is more beneficial to the community, hence my
> question.

Lukas, could you please handle it this time? Sorry for not posting sooner.

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Nouveau] [PATCH v3] pci: prevent putting nvidia GPUs into lower device states on certain intel bridges

2019-10-16 Thread Karol Herbst
Fixes state transitions of Nvidia Pascal GPUs from D3cold into higher device
states.

v2: convert to pci_dev quirk
put a proper technical explanation of the issue as an in-code comment
v3: disable it only for certain combinations of intel and nvidia hardware

Signed-off-by: Karol Herbst 
Cc: Bjorn Helgaas 
Cc: Lyude Paul 
Cc: Rafael J. Wysocki 
Cc: Mika Westerberg 
Cc: linux-...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: dri-de...@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
---
 drivers/pci/pci.c| 11 ++
 drivers/pci/quirks.c | 52 
 include/linux/pci.h  |  1 +
 3 files changed, 64 insertions(+)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b97d9e10c9cc..8e056eb7e6ff 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -805,6 +805,13 @@ static inline bool platform_pci_bridge_d3(struct pci_dev *dev)
return pci_platform_pm ? pci_platform_pm->bridge_d3(dev) : false;
 }
 
+static inline bool parent_broken_child_pm(struct pci_dev *dev)
+{
+   if (!dev->bus || !dev->bus->self)
+   return false;
+   return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm;
+}
+
 /**
  * pci_raw_set_power_state - Use PCI PM registers to set the power state of
  *  given PCI device
@@ -850,6 +857,10 @@ static int pci_raw_set_power_state(struct pci_dev *dev, pci_power_t state)
   || (state == PCI_D2 && !dev->d2_support))
return -EIO;
 
+   /* check if the bus controller causes issues */
+   if (state != PCI_D0 && parent_broken_child_pm(dev))
+   return 0;
+
pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
 
/*
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 44c4ae1abd00..c2f20b745dd4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5268,3 +5268,55 @@ static void quirk_reset_lenovo_thinkpad_p50_nvgpu(struct pci_dev *pdev)
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, 0x13b1,
  PCI_CLASS_DISPLAY_VGA, 8,
  quirk_reset_lenovo_thinkpad_p50_nvgpu);
+
+/*
+ * Some Intel PCIe bridges cause devices to disappear from the PCIe bus after
+ * those were put into D3cold state if they were put into a non D0 PCI PM
+ * device state before doing so.
+ *
+ * This leads to various issues which all manifest differently,
+ * but have the same root cause:
+ *  - AML code execution hits an infinite loop (as the code waits on device
+ *    memory to change).
+ *  - kernel crashes, as all pci reads return -1, which most code isn't able
+ *    to handle well enough.
+ *  - sudden shutdowns, as the kernel identified an unrecoverable error after
+ *    userspace tries to access the GPU.
+ *
+ * In all cases dmesg will contain at least one line like this:
+ * 'nouveau :01:00.0: Refused to change power state, currently in D3'
+ * followed by a lot of nouveau timeouts.
+ *
+ * ACPI code writes bit 0x80 to the undocumented PCI register 0x248 of the
+ * PCIe bridge controller in order to power down the GPU.
+ * Nonetheless, there are other code paths inside the ACPI firmware which use
+ * other registers, which seem to work fine:
+ *  - 0xbc bit 0x20 (publicly available documentation claims 'reserved')
+ *  - 0xb0 bit 0x10 (link disable)
+ * Changing the conditions inside the firmware by poking into the relevant
+ * addresses does resolve the issue, but it seemed to be ACPI private memory
+ * and not any device accessible memory at all, so there is no portable way of
+ * changing the conditions.
+ *
+ * The only systems where this behavior can be seen are hybrid graphics laptops
+ * with a secondary Nvidia Pascal GPU. It is unclear whether this issue only
+ * occurs in combination with the listed Intel PCIe bridge controllers and the
+ * mentioned GPUs, or whether it is purely a hw bug in the bridge controller.
+ *
+ * But because this issue was NOT seen on laptops with an Nvidia Pascal GPU
+ * and an Intel Coffee Lake SoC, there is a higher chance of there being a bug
+ * in the bridge controller rather than in the GPU.
+ *
+ * This issue could not be reproduced on non-laptop systems.
+ */
+
+static void quirk_broken_nv_runpm(struct pci_dev *dev)
+{
+   dev->broken_nv_runpm = 1;
+}
+DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+ PCI_BASE_CLASS_DISPLAY, 16,
+ quirk_broken_nv_runpm);
+/* kaby lake */
+DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1901,
+   quirk_broken_nv_runpm);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index ac8a6c4e1792..903a0b3a39ec 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -416,6 +416,7 @@ struct pci_dev {
unsigned int __aer_firmware_first_valid:1;
unsigned int __aer_firmware_first:1;
unsigned int broken_intx_masking:1;  /* INTx masking can't be used 

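The check that the patch above adds to pci_raw_set_power_state() can be
modelled in userspace to make its conditions easier to follow. This is only a
sketch: struct mock_dev, struct mock_bus, enum mock_state and both functions
are hypothetical stand-ins for the kernel's pci_dev, pci_bus and pci_power_t,
and the model covers nothing beyond the "is the PCI_PM_CTRL write skipped?"
decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel's pci_dev / pci_bus. */
struct mock_bus;

struct mock_dev {
	struct mock_bus *bus;   /* bus the device sits on, may be NULL */
	bool broken_nv_runpm;   /* set by the quirk for matching hardware */
};

struct mock_bus {
	struct mock_dev *self;  /* bridge leading to this bus, NULL for a root bus */
};

enum mock_state { MOCK_D0, MOCK_D1, MOCK_D2, MOCK_D3HOT };

/* Mirrors parent_broken_child_pm(): both bridge and child must be flagged. */
static bool mock_parent_broken_child_pm(const struct mock_dev *dev)
{
	assert(dev != NULL);

	if (!dev->bus || !dev->bus->self)
		return false;
	return dev->bus->self->broken_nv_runpm && dev->broken_nv_runpm;
}

/* True when the PCI_PM_CTRL write would be skipped for this transition. */
static bool mock_skips_pm_ctrl_write(const struct mock_dev *dev,
				     enum mock_state state)
{
	return state != MOCK_D0 && mock_parent_broken_child_pm(dev);
}
```

In this model a flagged GPU behind a flagged bridge skips the write for every
target state except D0, while a device with no parent bridge, or with only one
of the two flags set, is unaffected.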
Re: [Nouveau] nouveau kernel module will not load on old Sony Vaio laptop with 8400M GT

2019-10-16 Thread Karol Herbst
do you know if it used to work with older kernels? If yes, maybe a git
bisect on the kernel could help

On Wed, Oct 16, 2019 at 12:48 AM Felix Miata  wrote:
>
> is there anyone here who can help with:
>
> https://bugs.freedesktop.org/show_bug.cgi?id=111853
> nouveau kernel module won't load (not available) on Sony laptop with NVIDIA 
> G86M
> [GeForce 8400M GT] ID: 10de:0426
>
> ???
> --
> Evolution as taught in public schools is religion, not science.
>
>  Team OS/2 ** Reg. Linux User #211409 ** a11y rocks!
>
> Felix Miata  ***  http://fm.no-ip.com/
> ___
> Nouveau mailing list
> Nouveau@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/nouveau
