[Nouveau] [Bug 92961] Xorg freezes (only mouse and ssh are still working)

2018-02-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92961

--- Comment #16 from Tulli0  ---
I have the same problem with Gnome 3 on Gentoo with nouveau driver.
The freezes is randomly and with web browser, mail client and so on.
When I shutdown the system for a reboot I can see the message that I attached
I search this file [gpfifogf100.c] on internet and I find this one patch:
https://patchwork.kernel.org/patch/9502079/

I restore the previous value of ret with:
"ret = -EBUSY;"
in the gpfifogf100.c and gpfifogk104.c
And the problem seam solved

This is the fourth day of test without a freeze.

-- 
You are receiving this mail because:
You are the assignee for the bug.___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [Bug 92961] Xorg freezes (only mouse and ssh are still working)

2018-02-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=92961

--- Comment #15 from Tulli0  ---
Created attachment 137309
  --> https://bugs.freedesktop.org/attachment.cgi?id=137309=edit
shutdown log when the system was freezed

-- 
You are receiving this mail because:
You are the assignee for the bug.___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


[Nouveau] [Bug 105045] System freeze with nouveau

2018-02-12 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=105045

--- Comment #9 from Sergey Tereschenko  ---
I built mesa-git from source, and gdm starting only in xorg mode.

When i try to run

> XDG_SESSION_TYPE=wayland exec dbus-run-session gnome-session

It fails with message "Connection to the bus can't be made"

weston is still running fine.

reverted back to 17.3.3

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers

2018-02-12 Thread Alex Deucher
On Mon, Feb 12, 2018 at 4:45 AM, Lukas Wunner  wrote:
> On Mon, Feb 12, 2018 at 09:03:26AM +, Mike Lothian wrote:
>> On 12 February 2018 at 03:39, Lukas Wunner  wrote:
>> > On Mon, Feb 12, 2018 at 12:35:51AM +, Mike Lothian wrote:
>> > > I've not been able to reproduce the original problem you're trying to
>> > > solve on amdgpu thats with or without your patch set and the above
>> > > "trigger" too
>> > >
>> > > Is anything else required to trigger it, I started multiple DRI_PRIME
>> > > glxgears, in parallel, serial waiting the 12 seconds and serial within
>> > > the 12 seconds and I couldn't reproduce it
>> >
>> > The discrete GPU needs to runtime suspend, that's the trigger,
>> > so no DRI_PRIME executables should be running.  Just let it
>> > autosuspend after boot.  Do you see "waiting 12 sec" messages
>> > in dmesg?  If not it's not autosuspending.
>>
>> Yes I'm seeing those messages, I'm just not seeing the hangs
>>
>> I've attached the dmesg in case you're interested
>
> Okay the reason you're not seeing deadlocks is because the output poll
> worker is not enabled.  And the output poll worker is not enabled
> because your discrete GPU doesn't have any outputs:
>
> [0.265568] [drm:dc_create] *ERROR* DC: Number of connectors is zero!
>
> The outputs are only polled if there are connectors which have the
> DRM_CONNECTOR_POLL_CONNECT or DRM_CONNECTOR_POLL_DISCONNECT flag set.
> And that only ever seems to be the case for VGA and DVI.
>
> We know based on bugzilla reports that hybrid graphics laptops do exist
> which poll outputs with radeon and nouveau.  If there are no laptops
> supported by amdgpu whose discrete GPU has polled connectors, then
> patch [5/5] would be unnecessary.  That is for Alex to decide.

Most hybrid laptops don't have display connectors on the dGPU and we
only use polling on analog connectors, so you are not likely to run
into this on recent laptops.  That said, I don't know if there is some
OEM system out there with a VGA port on the dGPU in a hybrid laptop.
I guess another option is to just disable polling on hybrid laptops.

Alex

>
> However that is very good to know, so thanks a lot for your testing
> efforts, much appreciated!
>
> Kind regards,
>
> Lukas
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [Intel-gfx] [PATCH 2/5] drm: Allow determining if current task is output poll worker

2018-02-12 Thread Lukas Wunner
On Mon, Feb 12, 2018 at 05:50:12PM +, Chris Wilson wrote:
> Quoting Lyude Paul (2018-02-12 17:46:11)
> > On Sun, 2018-02-11 at 10:38 +0100, Lukas Wunner wrote:
> > > Introduce a helper to determine if the current task is an output poll
> > > worker.
> > > 
> > > This allows us to fix a long-standing deadlock in several DRM drivers
> > > wherein the ->runtime_suspend callback waits for the output poll worker
> > > to finish and the worker in turn calls a ->detect callback which waits
> > > for runtime suspend to finish.  The ->detect callback is invoked from
> > > multiple call sites and waiting for runtime suspend to finish is the
> > > correct thing to do except if it's executing in the context of the
> > > worker.
> > > 
> > > Cc: Dave Airlie 
> > > Cc: Ben Skeggs 
> > > Cc: Alex Deucher 
> > > Signed-off-by: Lukas Wunner 
> > > ---
> > >  drivers/gpu/drm/drm_probe_helper.c | 14 ++
> > >  include/drm/drm_crtc_helper.h  |  1 +
> > >  2 files changed, 15 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_probe_helper.c
> > > b/drivers/gpu/drm/drm_probe_helper.c
> > > index 555fbe54d6e2..019881d15ce1 100644
> > > --- a/drivers/gpu/drm/drm_probe_helper.c
> > > +++ b/drivers/gpu/drm/drm_probe_helper.c
> > > @@ -653,6 +653,20 @@ static void output_poll_execute(struct work_struct
> > > *work)
> > >   schedule_delayed_work(delayed_work,
> > > DRM_OUTPUT_POLL_PERIOD);
> > >  }
> > >  
> > > +/**
> > > + * drm_kms_helper_is_poll_worker - is %current task an output poll 
> > > worker?
> > > + *
> > > + * Determine if %current task is an output poll worker.  This can be used
> > > + * to select distinct code paths for output polling versus other 
> > > contexts.
> > > + */
> > > +bool drm_kms_helper_is_poll_worker(void)
> > > +{
> > > + struct work_struct *work = current_work();
> > > +
> > > + return work && work->func == output_poll_execute;
> 
> What ensures that work is accessible? Does this need rcu_read_lock
> protection or more?

The work_struct exists as long this kthread is working on it.
Since this is called by the kthread itself, it is guaranteed to exist.
So there's no protection needed.

Thanks,

Lukas
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [Intel-gfx] [PATCH 2/5] drm: Allow determining if current task is output poll worker

2018-02-12 Thread Chris Wilson
Quoting Lyude Paul (2018-02-12 17:46:11)
> On Sun, 2018-02-11 at 10:38 +0100, Lukas Wunner wrote:
> > Introduce a helper to determine if the current task is an output poll
> > worker.
> > 
> > This allows us to fix a long-standing deadlock in several DRM drivers
> > wherein the ->runtime_suspend callback waits for the output poll worker
> > to finish and the worker in turn calls a ->detect callback which waits
> > for runtime suspend to finish.  The ->detect callback is invoked from
> > multiple call sites and waiting for runtime suspend to finish is the
> > correct thing to do except if it's executing in the context of the
> > worker.
> > 
> > Cc: Dave Airlie 
> > Cc: Ben Skeggs 
> > Cc: Alex Deucher 
> > Signed-off-by: Lukas Wunner 
> > ---
> >  drivers/gpu/drm/drm_probe_helper.c | 14 ++
> >  include/drm/drm_crtc_helper.h  |  1 +
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_probe_helper.c
> > b/drivers/gpu/drm/drm_probe_helper.c
> > index 555fbe54d6e2..019881d15ce1 100644
> > --- a/drivers/gpu/drm/drm_probe_helper.c
> > +++ b/drivers/gpu/drm/drm_probe_helper.c
> > @@ -653,6 +653,20 @@ static void output_poll_execute(struct work_struct
> > *work)
> >   schedule_delayed_work(delayed_work,
> > DRM_OUTPUT_POLL_PERIOD);
> >  }
> >  
> > +/**
> > + * drm_kms_helper_is_poll_worker - is %current task an output poll worker?
> > + *
> > + * Determine if %current task is an output poll worker.  This can be used
> > + * to select distinct code paths for output polling versus other contexts.
> > + */
> For this, it would be worth explicitly noting in the comments herethat this
> should be called by DRM drivers in order to prevent racing with hotplug
> polling workers, so that new drivers in the future can avoid implementing this
> race condition in their driver.
> 
> > +bool drm_kms_helper_is_poll_worker(void)
> > +{
> > + struct work_struct *work = current_work();
> > +
> > + return work && work->func == output_poll_execute;

What ensures that work is accessible? Does this need rcu_read_lock
protection or more?
-Chris
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 2/5] drm: Allow determining if current task is output poll worker

2018-02-12 Thread Lyude Paul
On Sun, 2018-02-11 at 10:38 +0100, Lukas Wunner wrote:
> Introduce a helper to determine if the current task is an output poll
> worker.
> 
> This allows us to fix a long-standing deadlock in several DRM drivers
> wherein the ->runtime_suspend callback waits for the output poll worker
> to finish and the worker in turn calls a ->detect callback which waits
> for runtime suspend to finish.  The ->detect callback is invoked from
> multiple call sites and waiting for runtime suspend to finish is the
> correct thing to do except if it's executing in the context of the
> worker.
> 
> Cc: Dave Airlie 
> Cc: Ben Skeggs 
> Cc: Alex Deucher 
> Signed-off-by: Lukas Wunner 
> ---
>  drivers/gpu/drm/drm_probe_helper.c | 14 ++
>  include/drm/drm_crtc_helper.h  |  1 +
>  2 files changed, 15 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_probe_helper.c
> b/drivers/gpu/drm/drm_probe_helper.c
> index 555fbe54d6e2..019881d15ce1 100644
> --- a/drivers/gpu/drm/drm_probe_helper.c
> +++ b/drivers/gpu/drm/drm_probe_helper.c
> @@ -653,6 +653,20 @@ static void output_poll_execute(struct work_struct
> *work)
>   schedule_delayed_work(delayed_work,
> DRM_OUTPUT_POLL_PERIOD);
>  }
>  
> +/**
> + * drm_kms_helper_is_poll_worker - is %current task an output poll worker?
> + *
> + * Determine if %current task is an output poll worker.  This can be used
> + * to select distinct code paths for output polling versus other contexts.
> + */
For this, it would be worth explicitly noting in the comments herethat this
should be called by DRM drivers in order to prevent racing with hotplug
polling workers, so that new drivers in the future can avoid implementing this
race condition in their driver.

> +bool drm_kms_helper_is_poll_worker(void)
> +{
> + struct work_struct *work = current_work();
> +
> + return work && work->func == output_poll_execute;
> +}
> +EXPORT_SYMBOL(drm_kms_helper_is_poll_worker);
> +
>  /**
>   * drm_kms_helper_poll_disable - disable output polling
>   * @dev: drm_device
> diff --git a/include/drm/drm_crtc_helper.h b/include/drm/drm_crtc_helper.h
> index 76e237bd989b..6914633037a5 100644
> --- a/include/drm/drm_crtc_helper.h
> +++ b/include/drm/drm_crtc_helper.h
> @@ -77,5 +77,6 @@ void drm_kms_helper_hotplug_event(struct drm_device *dev);
>  
>  void drm_kms_helper_poll_disable(struct drm_device *dev);
>  void drm_kms_helper_poll_enable(struct drm_device *dev);
> +bool drm_kms_helper_is_poll_worker(void);
>  
>  #endif
-- 
Cheers,
Lyude Paul
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers

2018-02-12 Thread Lyude Paul
This patchset is a no-brainer for me. I'm ashamed to say I think I might have
seen this issue before, but polling gets used so rarely nowadays it slipped
under the rug by accident.

Anyway, for the whole series  (except patch #2, which I will respond to with a
short comment in just a moment):

Reviewed-by: Lyude Paul 

On Sun, 2018-02-11 at 10:38 +0100, Lukas Wunner wrote:
> Fix a deadlock on hybrid graphics laptops that's been present since 2013:
> 
> DRM drivers poll connectors in 10 sec intervals.  The poll worker is
> stopped on ->runtime_suspend with cancel_delayed_work_sync().  However
> the poll worker invokes the DRM drivers' ->detect callbacks, which call
> pm_runtime_get_sync().  If the poll worker starts after runtime suspend
> has begun, pm_runtime_get_sync() will wait for runtime suspend to finish
> with the intention of runtime resuming the device afterwards.  The result
> is a circular wait between poll worker and autosuspend worker.
> 
> I'm seeing this deadlock so often it's not funny anymore. I've also found
> 3 nouveau bugzillas about it and 1 radeon bugzilla. (See patch [3/5] and
> [4/5].)
> 
> One theoretical solution would be to add a flag to the ->detect callback
> to tell it that it's running in the poll worker's context.  In that case
> it doesn't need to call pm_runtime_get_sync() because the poll worker is
> only enabled while runtime active and we know that ->runtime_suspend
> waits for it to finish.  But this approach would require touching every
> single DRM driver's ->detect hook.  Moreover the ->detect hook is called
> from numerous other places, both in the DRM library as well as in each
> driver, making it necessary to trace every possible call chain and check
> if it's coming from the poll worker or not.  I've tried to do that for
> nouveau (see the call sites listed in the commit message of patch [3/5])
> and concluded that this approach is a nightmare to implement.
> 
> Another reason for the infeasibility of this approach is that ->detect
> is called from within the DRM library without driver involvement, e.g.
> via DRM's sysfs interface.  In those cases, pm_runtime_get_sync() would
> have to be called by the DRM library, but the DRM library is supposed to
> stay generic, wheras runtime PM is driver-specific.
> 
> So instead, I've come up with this much simpler solution which gleans
> from the current task's flags if it's a workqueue worker, retrieves the
> work_struct and checks if it's the poll worker.  All that's needed is
> one small helper in the workqueue code and another small helper in the
> DRM library.  This solution lends itself nicely to stable-backporting.
> 
> The patches for radeon and amdgpu are compile-tested only, I only have a
> MacBook Pro with an Nvidia GK107 to test.  To test the patches, add an
> "msleep(12*1000);" at the top of the driver's ->runtime_suspend hook.
> This ensures that the poll worker runs after ->runtime_suspend has begun.
> Wait 12 sec after the GPU has begun runtime suspend, then check
> /sys/bus/pci/devices/:01:00.0/power/runtime_status.  Without this
> series, the status will be stuck at "suspending" and you'll get hung task
> errors in dmesg after a few minutes.
> 
> i915, malidp and msm "solved" this issue by not stopping the poll worker
> on runtime suspend.  But this results in the GPU bouncing back and forth
> between D0 and D3 continuously.  That's a total no-go for GPUs which
> runtime suspend to D3cold since every suspend/resume cycle costs a
> significant amount of time and energy.  (i915 and malidp do not seem
> to acquire a runtime PM ref in the ->detect callbacks, which seems
> questionable.  msm however does and would also deadlock if it disabled
> the poll worker on runtime suspend.  cc += Archit, Liviu, intel-gfx)
> 
> Please review.  Thanks,
> 
> Lukas
> 
> Lukas Wunner (5):
>   workqueue: Allow retrieval of current task's work struct
>   drm: Allow determining if current task is output poll worker
>   drm/nouveau: Fix deadlock on runtime suspend
>   drm/radeon: Fix deadlock on runtime suspend
>   drm/amdgpu: Fix deadlock on runtime suspend
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 58 +---
>  drivers/gpu/drm/drm_probe_helper.c | 14 +
>  drivers/gpu/drm/nouveau/nouveau_connector.c| 18 +--
>  drivers/gpu/drm/radeon/radeon_connectors.c | 74 +
> -
>  include/drm/drm_crtc_helper.h  |  1 +
>  include/linux/workqueue.h  |  1 +
>  kernel/workqueue.c | 16 ++
>  7 files changed, 132 insertions(+), 50 deletions(-)
> 
-- 
Cheers,
Lyude Paul
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers

2018-02-12 Thread Imre Deak
On Sun, Feb 11, 2018 at 10:38:28AM +0100, Lukas Wunner wrote:
> [...]
> i915, malidp and msm "solved" this issue by not stopping the poll worker
> on runtime suspend.  But this results in the GPU bouncing back and forth
> between D0 and D3 continuously.  That's a total no-go for GPUs which
> runtime suspend to D3cold since every suspend/resume cycle costs a
> significant amount of time and energy.  (i915 and malidp do not seem
> to acquire a runtime PM ref in the ->detect callbacks, which seems
> questionable.  msm however does and would also deadlock if it disabled
> the poll worker on runtime suspend.  cc += Archit, Liviu, intel-gfx)

In i915 polling is on during runtime suspend only if there are outputs
without hotplug interrupt support. A special case is when an output has
working HPD interrupts when in D0, but no interrupts when runtime
suspended. For these we start polling (from a scheduled work) in the
runtime suspend hook and stop it in the runtime resume hook (again from
a scheduled work).

Imo whether to leave polling on or not for the above purpose is a policy
question (configurable with the drm_kms_helper.poll param).

--Imre

> 
> Please review.  Thanks,
> 
> Lukas
> 
> Lukas Wunner (5):
>   workqueue: Allow retrieval of current task's work struct
>   drm: Allow determining if current task is output poll worker
>   drm/nouveau: Fix deadlock on runtime suspend
>   drm/radeon: Fix deadlock on runtime suspend
>   drm/amdgpu: Fix deadlock on runtime suspend
> 
>  drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c | 58 +---
>  drivers/gpu/drm/drm_probe_helper.c | 14 +
>  drivers/gpu/drm/nouveau/nouveau_connector.c| 18 +--
>  drivers/gpu/drm/radeon/radeon_connectors.c | 74 
> +-
>  include/drm/drm_crtc_helper.h  |  1 +
>  include/linux/workqueue.h  |  1 +
>  kernel/workqueue.c | 16 ++
>  7 files changed, 132 insertions(+), 50 deletions(-)
> 
> -- 
> 2.15.1
> 
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau


Re: [Nouveau] [PATCH 0/5] Fix deadlock on runtime suspend in DRM drivers

2018-02-12 Thread Lukas Wunner
On Mon, Feb 12, 2018 at 09:03:26AM +, Mike Lothian wrote:
> On 12 February 2018 at 03:39, Lukas Wunner  wrote:
> > On Mon, Feb 12, 2018 at 12:35:51AM +, Mike Lothian wrote:
> > > I've not been able to reproduce the original problem you're trying to
> > > solve on amdgpu thats with or without your patch set and the above
> > > "trigger" too
> > >
> > > Is anything else required to trigger it, I started multiple DRI_PRIME
> > > glxgears, in parallel, serial waiting the 12 seconds and serial within
> > > the 12 seconds and I couldn't reproduce it
> >
> > The discrete GPU needs to runtime suspend, that's the trigger,
> > so no DRI_PRIME executables should be running.  Just let it
> > autosuspend after boot.  Do you see "waiting 12 sec" messages
> > in dmesg?  If not it's not autosuspending.
> 
> Yes I'm seeing those messages, I'm just not seeing the hangs
> 
> I've attached the dmesg in case you're interested

Okay the reason you're not seeing deadlocks is because the output poll
worker is not enabled.  And the output poll worker is not enabled
because your discrete GPU doesn't have any outputs:

[0.265568] [drm:dc_create] *ERROR* DC: Number of connectors is zero!

The outputs are only polled if there are connectors which have the
DRM_CONNECTOR_POLL_CONNECT or DRM_CONNECTOR_POLL_DISCONNECT flag set.
And that only ever seems to be the case for VGA and DVI.

We know based on bugzilla reports that hybrid graphics laptops do exist
which poll outputs with radeon and nouveau.  If there are no laptops
supported by amdgpu whose discrete GPU has polled connectors, then
patch [5/5] would be unnecessary.  That is for Alex to decide.

However that is very good to know, so thanks a lot for your testing
efforts, much appreciated!

Kind regards,

Lukas
___
Nouveau mailing list
Nouveau@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/nouveau