Re: Possible resource leak in Weston

2020-04-23 Thread Guillermo Rodriguez
Hi Pekka,

El jue., 23 abr. 2020 a las 11:43, Pekka Paalanen
() escribió:
>
> On Wed, 22 Apr 2020 16:45:42 +0200
> Guillermo Rodriguez  wrote:
>
> > Hi all,
> >
> > I am investigating something that looks like a resource leak in
> > Weston. I first saw the problem in an application involving Gstreamer,
> > which would run out of fds after a number of iterations (~1000).
> > However I have also been able to reproduce it without using Gstreamer.
> >
> > This is the scenario:
> >
> > I have a wayland application that uses Gstreamer to display video. The
> > application creates the display connection (wl_display_connect) and a
> > top-level surface; these last for the lifetime of the application.
> > Every time the application needs to show a video fragment, it builds a
> > Gstreamer pipeline, and passes the display and surface handles to
> > Gstreamer's waylandsink.
> >
> > This ends up calling the gst_wl_display_new_existing function [1],
> > which registers a listener for the wayland registry and binds to some
> > interfaces [2]. Binding to the wl_shell interface results in a call to
> > weston_desktop_wl_shell_bind -> weston_desktop_client_create where
> > some resources are created, including, among others, a ping timer.
> >
> > When Gstreamer is done showing the video, it cleans up and releases
> > all resources. As part of this process it calls wl_shell_destroy [3],
> > however this does NOT result in a call to
> > weston_desktop_client_destroy, and so the associated resources are not
> > released. In fact, the resources are only released when the
> > application disconnects (wl_display_disconnect) and exits. At this
> > point, all "pending" calls to weston_desktop_client_destroy are made.
> >
> > The original symptoms (application running out of fds) are only
> > visible with wayland < 1.18.0. This is because up to 1.18.0, one
> > timerfd was being used for each ping timer. This was changed in this
> > commit [4]. However even if the symptoms are less visible, the issue
> > still exists in Weston.
> >
> > Does this make sense?
>
> Hi,
>
> yes, it does make sense, I think.
>
> > In case I am not overlooking something and this is indeed a Weston
> > issue, any hints on how to fix it?
>
> It is mostly a Weston issue, but there are a couple Wayland protocol
> issues as well:
>
> - wl_registry does not have a destroy request defined, hence the
>   compositor cannot know when a client destroys a wl_registry. The
>   compositor side of a wl_registry object can only be reaped when the
>   client disconnects. Fixing this is cumbersome.
>
> - wl_shell interface does not have a destroy request either. However,
>   adding one would be much easier that with wl_registry. OTOH, wl_shell
>   is deprecated, so even if we add one, I doubt the clients still using
>   it would get fixed.
>
> The above are fundamental but relatively minor leaks during a client's
> lifetime in the compositor. Solving them should not be necessary for
> solving your immediate problem, but they would be needed for a complete
> solution rather than just pushing the failure point even further out.

The lack of a destroy request for wl_shell seems to be the root of the
problem in my case.
Each time my application needs to show a video fragment it builds a
Gstreamer pipeline, Gstreamer's waylandsink binds to the wl_shell
client, and this results in the creation of a new ping timer. Once the
video stops the timer resources are not released. After a number of
iterations of this process, the application cannot show video anymore.

You mention that "solving this should not be necessary for solving my
immediate problem", but I am not sure why you say that. If the
wl_shell objects (and associated resources) are being leaked, how do I
work around this issue?

>
> There is also a peculiarity with the wl_shell_surface interface: it
> does not have a destroy request defined, but the protocol object is
> documented to be destroyed when the wl_surface is destroyed. You might
> want to check libweston-desktop actually implements this, and that the
> client also does destroy the wl_surface.
>
> In the early days of Wayland, we missed to add destroy requests to many
> interfaces which we assumed would only be instantiated once in a
> client's lifetime. That's why the oldest levels of the protocol
> interfaces are inconsistent wrt. freeing.
>
> The rest is issues with libweston-desktop or weston in general. I'm not
> too familiar with libweston-desktop, but I have some ideas.
>
> To me it looks like libweston-desktop hangs too much data (including
> the ping timer) on the client. E.g. the ping timer is not needed if
> there is nothing to ping, so the timer could perhaps be torn down
> earlier and created on demand. Note however, that while wl_shell pings
> on the wl_shell_surface, xdg-shell pings on the xdg_wm_base which is
> not a per-surface object. Fortunately xdg_wm_base does have a destroy
> request, so hooking up to that is simple.
>
> I assume that 

Re: Possible resource leak in Weston

2020-04-23 Thread Pekka Paalanen
On Wed, 22 Apr 2020 16:45:42 +0200
Guillermo Rodriguez  wrote:

> Hi all,
> 
> I am investigating something that looks like a resource leak in
> Weston. I first saw the problem in an application involving Gstreamer,
> which would run out of fds after a number of iterations (~1000).
> However I have also been able to reproduce it without using Gstreamer.
> 
> This is the scenario:
> 
> I have a wayland application that uses Gstreamer to display video. The
> application creates the display connection (wl_display_connect) and a
> top-level surface; these last for the lifetime of the application.
> Every time the application needs to show a video fragment, it builds a
> Gstreamer pipeline, and passes the display and surface handles to
> Gstreamer's waylandsink.
> 
> This ends up calling the gst_wl_display_new_existing function [1],
> which registers a listener for the wayland registry and binds to some
> interfaces [2]. Binding to the wl_shell interface results in a call to
> weston_desktop_wl_shell_bind -> weston_desktop_client_create where
> some resources are created, including, among others, a ping timer.
> 
> When Gstreamer is done showing the video, it cleans up and releases
> all resources. As part of this process it calls wl_shell_destroy [3],
> however this does NOT result in a call to
> weston_desktop_client_destroy, and so the associated resources are not
> released. In fact, the resources are only released when the
> application disconnects (wl_display_disconnect) and exits. At this
> point, all "pending" calls to weston_desktop_client_destroy are made.
> 
> The original symptoms (application running out of fds) are only
> visible with wayland < 1.18.0. This is because up to 1.18.0, one
> timerfd was being used for each ping timer. This was changed in this
> commit [4]. However even if the symptoms are less visible, the issue
> still exists in Weston.
> 
> Does this make sense?

Hi,

yes, it does make sense, I think.

> In case I am not overlooking something and this is indeed a Weston
> issue, any hints on how to fix it?

It is mostly a Weston issue, but there are a couple Wayland protocol
issues as well:

- wl_registry does not have a destroy request defined, hence the
  compositor cannot know when a client destroys a wl_registry. The
  compositor side of a wl_registry object can only be reaped when the
  client disconnects. Fixing this is cumbersome.

- wl_shell interface does not have a destroy request either. However,
  adding one would be much easier that with wl_registry. OTOH, wl_shell
  is deprecated, so even if we add one, I doubt the clients still using
  it would get fixed.

The above are fundamental but relatively minor leaks during a client's
lifetime in the compositor. Solving them should not be necessary for
solving your immediate problem, but they would be needed for a complete
solution rather than just pushing the failure point even further out.

There is also a peculiarity with the wl_shell_surface interface: it
does not have a destroy request defined, but the protocol object is
documented to be destroyed when the wl_surface is destroyed. You might
want to check libweston-desktop actually implements this, and that the
client also does destroy the wl_surface.

In the early days of Wayland, we missed to add destroy requests to many
interfaces which we assumed would only be instantiated once in a
client's lifetime. That's why the oldest levels of the protocol
interfaces are inconsistent wrt. freeing.

The rest is issues with libweston-desktop or weston in general. I'm not
too familiar with libweston-desktop, but I have some ideas.

To me it looks like libweston-desktop hangs too much data (including
the ping timer) on the client. E.g. the ping timer is not needed if
there is nothing to ping, so the timer could perhaps be torn down
earlier and created on demand. Note however, that while wl_shell pings
on the wl_shell_surface, xdg-shell pings on the xdg_wm_base which is
not a per-surface object. Fortunately xdg_wm_base does have a destroy
request, so hooking up to that is simple.

I assume that 'struct weston_desktop_client' is really supposed to be
per-client, and not e.g. per wl_shell object. All wl_shell objects are
totally equivalent, so ensuring this would be nice. I'm not sure if all
xdg_wm_base objects are intended to be totally equivalent as well,
because it contains the ping/pong messages, and I don't how it is
supposed to work if a client has multiple xdg_wm_base objects, e.g. can
they be used interchangeably to respond to pings.


Thanks,
pq

> 
> Thank you,
> 
> Guillermo Rodriguez
> 
>  [1]: 
> https://github.com/GStreamer/gst-plugins-bad/blob/master/ext/wayland/wldisplay.c#L295
>  [2]: 
> https://github.com/GStreamer/gst-plugins-bad/blob/master/ext/wayland/wldisplay.c#L204
>  [3]: 
> https://github.com/GStreamer/gst-plugins-bad/blob/master/ext/wayland/wldisplay.c#L91
>  [4]: 
> https://github.com/wayland-project/wayland/commit/60a8d29ad8526c18cc670612e2c94f443873011a
> 

Re: Chrome Remote Desktop and Wayland

2020-04-23 Thread Jonas Ã…dahl
On Wed, Apr 22, 2020 at 04:34:01PM -0400, Ray Strode wrote:
> Hi,
> 
> On Tue, Apr 21, 2020 at 8:21 AM Benjamin Berg  wrote:
> > Yes, I agree that "user" is very similar. However, it cannot currently
> > convey any information about whether a graphical session is already
> > running or whether it is capable of spanning multiple logind sessions.
> why does that information need to be conveyed? who would consume it?
> 
> > > I don't think gnome-session needs to do much beyond what it's already
> > > doing.  It just needs to make sure systemd is told what services to
> > > start, if they aren't already started (which target to reach)
> > Well, you need some mechanism to attach the new session/seat to the
> > running graphical session and then watch for it to be released again
> > And yes, the session leader process could be the one taking care of
> > that.
> So to be clear, I've been advocating for mutter/gnome-shell to do the
> attaching *themselves*. We don't need any new api for them to do
> that.
> 
> If mutter is already running, it just needs to set up a sd_login_monitor
> to detect when a new session for the user show up. As soon as a
> new session is registered with logind, that new session is announced.
> 
> mutter can then check the session type of the newly registered session
> and see if it needs to add a new output (to e.g. pipewire or to kms).
> 
> This can be done using api that exists today (well we need a new
> session type, right now there's only tty, x11, wayland, mir, unspecified,
> we need pipewire too)

Can't the remote login session still be "wayland", but without being
able to be drm master? We still need API to manage the pipewire streams
(add/remove/change virtual outputs, inject input, i.e. something like
org.gnome.Mutter.RemoteDesktop/ScreenCast); would adding a new session
type bring us anything useful? We wouldn't just implicitly spit out a
pipewire stream for some made up output, I assume we still want it to be
managed some how by the remote desktop service.


Jonas

> 
> > But, the other part of this is that the situation is confusing. Right
> > now we assign "user" processes to one of the logind sessions by doing a
> > best guess. That works great as long as the user has one graphical
> > logind session. But, if this graphical session starts spanning multiple
> > logind sessions, then the choice becomes more relevant as each of the
> > sessions might for example have a different "Active" state.
> Right, mutter shouldn't be guessing which session to use. it should use
> all of them! it may use some logic to figure out where window go (e.g,
> last that went active wins), but all active sessions should get chrome.
> 
> > So, having something that represents the combination of all of them
> > could bypass that problem in an elegant way. We would never need to
> > "guess" the session, we would just always return the combined "user"
> > session. This user session would for example be considered "Active" as
> > lone as any of the underlying logind sessions are active.
> Totally agree with these words, but I also think that's precisely what a
> "user" is in logind. i don't see any advantage to having something between
> user and session that's basically the same as user.
> 
> > Sorry, I meant the desktop here.
> ah okay.
> 
> > So if KDE and GNOME have incompatible
> > implementation then we may run into odd error scenarios should the user
> > try to change the session type while they are logged in already.
> You're saying a user wants to log into KDE on X11 and GNOME on wayland
> at the same time, and both use systemd --user?
> 
> First, I think that's a fairly niche use case that cause problems for the user
> (e.g., what if they try to run firefox in both?)
> 
> But I don't think there's anything that would prevent it from working.
> 
> logind lets a display server query whether or not a session is KDE or GNOME.
> See sd_session_get_desktop . Each display server, should clearly only
> manage sessions registered for their own desktop.
> 
> > > >  3. we would likely get different implementations with varying degree
> > > > of brokenness.
> > > Not sure what you're saying here, can you reiterate?
> > I think the above captures it well enough.
> I still don't understand. Is this in the context of KDE and GNOME ?
> 
> --Ray
> ___
> wayland-devel mailing list
> wayland-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/wayland-devel
___
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel