Re: why not flow control in wl_connection_flush?

2024-03-02 Thread jleivent
On Fri, 1 Mar 2024 11:59:36 +0200
Pekka Paalanen  wrote:

> ...
> The real problem here, though, is: how do you architect your app or
> toolkit so that it can stop and postpone what it is doing with Wayland
> when you detect that the socket is not draining fast enough? You might
> be calling into Wayland using libraries that do not support this.
> 
> Returning to the main event loop is the natural place to check and
> postpone, but this whole issue stems from the reality that apps do not
> return to their main loop often enough or do not know how to hold off
> sending even in the main loop.

I am concluding from this discussion that clients cannot be counted on
to be constructed so that they avoid causing problems if they attempt
to send too fast.

I think I may add an option to wl_connection_flush in my copy of
libwayland so that I can turn on client waiting on flush from an env
var.  It looks like the change would be pretty small.  Unless you
think it would be worth making this an MR on its own?
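
For concreteness, the change I have in mind is roughly the following
(a sketch only, not the actual libwayland internals; the
WAYLAND_BLOCKING_FLUSH name is made up):

#include <errno.h>
#include <poll.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Sketch: where wl_connection_flush currently gives up when sendmsg()
 * fails with EAGAIN, optionally wait for the socket to drain and
 * retry.  WAYLAND_BLOCKING_FLUSH is a made-up variable name. */
static ssize_t
send_with_optional_wait(int fd, struct msghdr *msg)
{
    static int block = -1;

    if (block < 0)
        block = getenv("WAYLAND_BLOCKING_FLUSH") != NULL;

    for (;;) {
        ssize_t len = sendmsg(fd, msg, MSG_DONTWAIT | MSG_NOSIGNAL);

        if (len >= 0 || errno != EAGAIN || !block)
            return len;

        /* Socket buffer is full: block until the compositor
         * drains it, then try again. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
            return -1;
    }
}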

If the client is single-threaded, this will cause the whole client to
wait, which probably won't be a problem, considering the type of
clients that might try to be that fast.

If the client isn't single-threaded, then it may cause a thread to wait
that the client doesn't expect to wait, which could be a problem for
that client, admittedly.



Re: why not flow control in wl_connection_flush?

2024-03-01 Thread Pekka Paalanen
On Thu, 29 Feb 2024 16:32:26 -0500
jleivent  wrote:

> On Tue, 27 Feb 2024 11:01:18 +0200
> Pekka Paalanen  wrote:
> 
> 
> > > But suppose I'm writing a client that has the possibility of
> > > sending a rapid stream of msgs to the compositor that might be,
> > > depending on what other clients are doing, too fast for the
> > > compositor to handle, and I'd like to implement some flow control.
> > > I don't want the connection to the compositor to sever or for the
> > > condition to cause memory consumption without any ability for me to
> > > find out about and control the situation.  Especially if I could
> > > slow down that rapid stream of updates without too high a cost to
> > > what my client is trying to accomplish.
> > > 
> > > Is there a way I could do that?
> > 
> > Get the Wayland fd with wl_display_get_fd() and poll it for writable.
> > If it's not writable, you're sending too fast.  
> 
> That's what I was looking for!  I think... Maybe?
> 
> > 
> > That's what any client should always do. Usually it would be prompted
> > by wl_display_flush() erroring out with EAGAIN as your cue to start
> > polling for writable. It's even documented.  
> 
> But, calling wl_display_flush too often is bad for throughput, right?

I don't think it is if you're not calling it in a busy-loop? Have a
look through its implementation.

If you mean calling after every single request sending, then yeah, I
would not do that, even though I don't think it would have measurable
performance impact. Power consumption impact might be measurable.

There is a wl_display mutex though, so if you hammer it from many
threads, I guess it could hurt.

> Isn't it better to allow the ring buffer to determine itself when to
> flush based on being full, and batch send many msgs?  Obviously
> sometimes the client has nothing more to send (for a while), so
> wl_display_flush then makes sense.  But in this case, it does have more
> to send and wants to know if it should attempt to do so or hold back.

Normally, your program returns to its main event loop relatively fast.
Definitely between each drawn frame if we consider just one wl_surface
at a time. That is the natural place and granularity to flush. You
must flush before polling the Wayland fd for readable, because
otherwise you might be waiting forever for replies that never come
because the compositor never received your requests in the first place.
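
One iteration of that looks roughly like this (a simplified sketch
built on the documented wl_display_prepare_read() API; handling of a
failed flush is omitted):

#include <poll.h>
#include <wayland-client.h>

/* One iteration of a typical main loop: dispatch anything already
 * queued, flush pending requests, then sleep until the compositor
 * sends something. */
static int
dispatch_once(struct wl_display *display)
{
    struct pollfd pfd = {
        .fd = wl_display_get_fd(display),
        .events = POLLIN,
    };

    while (wl_display_prepare_read(display) != 0)
        wl_display_dispatch_pending(display);

    /* Flush before polling for readable, otherwise the compositor
     * may never see the requests whose replies we wait for. */
    wl_display_flush(display);

    if (poll(&pfd, 1, -1) < 0) {
        wl_display_cancel_read(display);
        return -1;
    }

    if (wl_display_read_events(display) < 0)
        return -1;

    return wl_display_dispatch_pending(display);
}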

The automatic flush on the libwayland send buffer (which is 4 kB so
far) getting full is just to buy you time to do your work and return to
your main event loop to flush and sleep. This flush is the problematic
one, because if it fails, there is no recovery at this time. We cannot
return a "hold up, do something else for a while instead of sending
requests", and there is no place to put the current request, so the
only way out is to disconnect.

The socket buffer usually defaulting to something like 200 kB is a
saviour, I believe.
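
If you want to see what your system actually grants, you can ask the
kernel (a sketch; on Linux the reported value includes kernel
bookkeeping overhead):

#include <stdio.h>
#include <sys/socket.h>
#include <wayland-client.h>

/* Sketch: query the kernel send-buffer size of the Wayland socket. */
static void
print_sndbuf(struct wl_display *display)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(wl_display_get_fd(display), SOL_SOCKET, SO_SNDBUF,
                   &size, &len) == 0)
        printf("socket send buffer: %d bytes\n", size);
}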

> I could instead wait for the display fd to be writable before
> attempting each msg send.  But the display fd may be writable merely
> because the ring buffer hasn't tried flushing yet.  But the ring buffer
> could have less than enough space for the msg I'm about to send.  And
> the socket buffer could have very little space - just enough for it to
> say it's writable.
> 
> Which means that sometimes polling the display fd will return writable
> when an attempt to send a msg is still going to result in ring buffer
> growth or client disconnect.

Right. So you need to try to flush first to ensure the libwayland ring
buffer is empty and can hold your message. You cannot send messages
larger than the ring buffer anyway.
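
So the order of operations is roughly the following sketch, assuming
you send from a single thread:

#include <errno.h>
#include <poll.h>
#include <wayland-client.h>

/* Sketch: drain the libwayland ring buffer before queueing more
 * requests.  Returns 0 once everything buffered so far has been
 * handed to the kernel, -1 on a real error. */
static int
flush_and_wait_writable(struct wl_display *display)
{
    while (wl_display_flush(display) < 0) {
        if (errno != EAGAIN)
            return -1;

        /* Socket buffer is full: wait for the compositor to drain
         * it instead of letting the ring buffer grow. */
        struct pollfd pfd = {
            .fd = wl_display_get_fd(display),
            .events = POLLOUT,
        };
        if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
            return -1;
    }
    return 0;
}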

> So... back to calling wl_display_flush ... sometimes.
> 
> I guess I could call wl_display_flush after what I think is about 4K
> worth of msg content.  wl_display_flush returns the amount sent, so
> that helps with the extra state I need to maintain.
> 
> Is there currently a way I could get the size of contents in the output
> ring buffer?

I don't recall any.

The MRs for solving this problem have considered adding API for the app
to see when the ring buffer is getting a little too full.

The real problem here, though, is: how do you architect your app or
toolkit so that it can stop and postpone what it is doing with Wayland
when you detect that the socket is not draining fast enough? You might
be calling into Wayland using libraries that do not support this.

Returning to the main event loop is the natural place to check and
postpone, but this whole issue stems from the reality that apps do not
return to their main loop often enough or do not know how to hold off
sending even in the main loop.


Thanks,
pq




Re: why not flow control in wl_connection_flush?

2024-02-29 Thread jleivent
On Tue, 27 Feb 2024 11:01:18 +0200
Pekka Paalanen  wrote:


> > But suppose I'm writing a client that has the possibility of
> > sending a rapid stream of msgs to the compositor that might be,
> > depending on what other clients are doing, too fast for the
> > compositor to handle, and I'd like to implement some flow control.
> > I don't want the connection to the compositor to sever or for the
> > condition to cause memory consumption without any ability for me to
> > find out about and control the situation.  Especially if I could
> > slow down that rapid stream of updates without too high a cost to
> > what my client is trying to accomplish.
> > 
> > Is there a way I could do that?  
> 
> Get the Wayland fd with wl_display_get_fd() and poll it for writable.
> If it's not writable, you're sending too fast.

That's what I was looking for!  I think... Maybe?

> 
> That's what any client should always do. Usually it would be prompted
> by wl_display_flush() erroring out with EAGAIN as your cue to start
> polling for writable. It's even documented.

But, calling wl_display_flush too often is bad for throughput, right?
Isn't it better to allow the ring buffer to determine itself when to
flush based on being full, and batch send many msgs?  Obviously
sometimes the client has nothing more to send (for a while), so
wl_display_flush then makes sense.  But in this case, it does have more
to send and wants to know if it should attempt to do so or hold back.

I could instead wait for the display fd to be writable before
attempting each msg send.  But the display fd may be writable merely
because the ring buffer hasn't tried flushing yet.  But the ring buffer
could have less than enough space for the msg I'm about to send.  And
the socket buffer could have very little space - just enough for it to
say it's writable.

Which means that sometimes polling the display fd will return writable
when an attempt to send a msg is still going to result in ring buffer
growth or client disconnect.

So... back to calling wl_display_flush ... sometimes.

I guess I could call wl_display_flush after what I think is about 4K
worth of msg content.  wl_display_flush returns the amount sent, so
that helps with the extra state I need to maintain.

Is there currently a way I could get the size of contents in the output
ring buffer?



Re: why not flow control in wl_connection_flush?

2024-02-27 Thread Pekka Paalanen
On Mon, 26 Feb 2024 13:23:01 -0500
jleivent  wrote:

> On Mon, 26 Feb 2024 15:12:23 +0200
> Pekka Paalanen  wrote:
> 
> ...
> > > What is the advantage to having the impacted clients grow their send
> > > buffers while waiting?  They wait either way.
> > 
> > They are not waiting if they are growing their send buffers.  
> 
> I meant that they must wait for the UI to update corresponding to the
> messages they are trying to send to the compositor.
> 
> This may also be about my assumption of a threading model, this time
> for the client.  I assume that a client that has some important work to
> do that is unrelated to updating the display will do that work in a
> distinct thread from the one dedicated to sending display related msgs
> to the compositor.

That's what they definitely should do, if their other work ends up
blocking the thread. But if their other work is fast enough, then
there is no assumption of using threads. Given lots of software is written
in C where threading creates more problems than it solves, allowing
apps to be single-threaded is a good idea to me.

> If that's not the case, then indeed causing the client's sending thread
> to wait could impact some other computation.  Which might be bad,
> depending on what that other computation is trying to do.

Many (most?) games and game framework libraries are prime examples of
difficult requirements. They can be "single"-threaded, which means we
should not block them, but at the same time their main loop may be a
busy loop of drawing new frames with very specific expectations on when
and where the window system should be blocking them to throttle their
framerate if the game so wants.

Sometimes users do not want throttling, but they will get it anyway,
because e.g. EGL and Vulkan have a limit on the number of framebuffers
per surface, so when you run out, the app gets blocked anyway waiting
for some buffer to be released from the compositor before drawing can
continue.

You can also write an app such that there is no limit on the number of
framebuffers, but then you risk OOM instead.

> But suppose I'm writing a client that has the possibility of sending a
> rapid stream of msgs to the compositor that might be, depending on what
> other clients are doing, too fast for the compositor to handle, and I'd
> like to implement some flow control.  I don't want the connection to
> the compositor to sever or for the condition to cause memory
> consumption without any ability for me to find out about and control
> the situation.  Especially if I could slow down that rapid stream of
> updates without too high a cost to what my client is trying to
> accomplish.
> 
> Is there a way I could do that?

Get the Wayland fd with wl_display_get_fd() and poll it for writable.
If it's not writable, you're sending too fast.

That's what any client should always do. Usually it would be prompted
by wl_display_flush() erroring out with EAGAIN as your cue to start
polling for writable. It's even documented.
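
Concretely, a minimal probe could look like this (a sketch; a zero
timeout makes poll() return immediately):

#include <poll.h>
#include <wayland-client.h>

/* Sketch: non-blocking probe of the Wayland socket.  Returns 1 if
 * the kernel can take more data right now, 0 if we are sending too
 * fast. */
static int
can_send_now(struct wl_display *display)
{
    struct pollfd pfd = {
        .fd = wl_display_get_fd(display),
        .events = POLLOUT,
    };

    return poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLOUT);
}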

I haven't followed the details of the libwayland message buffering MR,
but I would expect it to give you more grace time to poll, so you don't
have to go out of your way to check at inconvenient times.

And then there are the clients who don't care to poll for writable at
all.


Thanks,
pq




Re: why not flow control in wl_connection_flush?

2024-02-26 Thread jleivent
On Mon, 26 Feb 2024 15:12:23 +0200
Pekka Paalanen  wrote:

...
> > What is the advantage to having the impacted clients grow their send
> > buffers while waiting?  They wait either way.  
> 
> They are not waiting if they are growing their send buffers.

I meant that they must wait for the UI to update corresponding to the
messages they are trying to send to the compositor.

This may also be about my assumption of a threading model, this time
for the client.  I assume that a client that has some important work to
do that is unrelated to updating the display will do that work in a
distinct thread from the one dedicated to sending display related msgs
to the compositor.

If that's not the case, then indeed causing the client's sending thread
to wait could impact some other computation.  Which might be bad,
depending on what that other computation is trying to do.

But suppose I'm writing a client that has the possibility of sending a
rapid stream of msgs to the compositor that might be, depending on what
other clients are doing, too fast for the compositor to handle, and I'd
like to implement some flow control.  I don't want the connection to
the compositor to sever or for the condition to cause memory
consumption without any ability for me to find out about and control
the situation.  Especially if I could slow down that rapid stream of
updates without too high a cost to what my client is trying to
accomplish.

Is there a way I could do that?


Re: why not flow control in wl_connection_flush?

2024-02-26 Thread Pekka Paalanen
On Sat, 24 Feb 2024 15:35:27 -0500
jleivent  wrote:

> On Fri, 23 Feb 2024 12:12:38 +0200
> Pekka Paalanen  wrote:
> 
> 
> > I would think it to be quite difficult for a compositor to dedicate a
> > whole thread for each client.  
> 
> But that means it is possible that the server cannot catch up for long
> periods.  And that just having a large number of otherwise friendly
> clients can cause their socket buffers to fill up.  And things are
> worse on systems with more cores.

That can be true. Let me know when you hit that. It will be
interesting to hear what it takes to happen.

Even more interesting will be to hear whether threading the client
connection handling will actually help, or whether the compositor is
choking on its own operations, like repaint, that need a snapshot of
all clients' state because there are simply too many surfaces up.

> What is the advantage to having the impacted clients grow their send
> buffers while waiting?  They wait either way.

They are not waiting if they are growing their send buffers.


Thanks,
pq




Re: why not flow control in wl_connection_flush?

2024-02-24 Thread jleivent
On Fri, 23 Feb 2024 12:12:38 +0200
Pekka Paalanen  wrote:


> I would think it to be quite difficult for a compositor to dedicate a
> whole thread for each client.

But that means it is possible that the server cannot catch up for long
periods.  And that just having a large number of otherwise friendly
clients can cause their socket buffers to fill up.  And things are
worse on systems with more cores.

What is the advantage to having the impacted clients grow their send
buffers while waiting?  They wait either way.


Re: why not flow control in wl_connection_flush?

2024-02-23 Thread Pekka Paalanen
On Thu, 22 Feb 2024 10:26:00 -0500
jleivent  wrote:

> Thanks for this response.  I am considering adding unbounded buffering
> to my Wayland middleware project, and wanted to consider the flow
> control options first.  Walking through the reasoning here is very
> helpful.  I didn't know that there was a built-in expectation that
> clients would do some of their own flow control.  I was also operating
> under the assumption that blocking flushes from the compositor to
> one client would not have an impact on other clients (was assuming an
> appropriate threading model in compositors).

I would think it to be quite difficult for a compositor to dedicate a
whole thread for each client. There would be times like repainting an
output, where you would need to freeze or snapshot all relevant
surfaces' state, blocking the handling of client requests. I'm sure it
could be done, but due to the complexity a highly threaded design would
cause, most compositors, to my perception, have just opted for an
approximately single-threaded design, maybe not least because Wayland
explicitly aims to support that. Wayland requests are intended to be
very fast to serve, so threading them per client should not be
necessary.

> The client OOM issue, though: A malicious client can do all kinds of
> things to attempt a DoS, and moving towards OOM would accomplish that
> as well on systems where thrashing incurs enough of a slowdown.
> A buggy client that isn't trying to do anything malicious, but is
> trapped in a send loop, would be a case where causing it to wait
> might be better than allowing it to move towards OOM (and thrash).

Where do you draw the line between being "stuck in a loop", "doing something
stupid but still workable and legit", and "doing what it simply needs
to do"?

One example of an innocent client overflowing its send buffer is Xwayland
where X11 clients cause wl_surface damage in an extremely fragmented
way. It might result in thousands of tiny damage rectangles, and it
could happen in multiple X11 windows simultaneously. If all that damage
was relayed as-is, it is very possible the Wayland socket would
overflow. (To work around that, there is a limit in Xwayland on how
many rects it is willing to forward, and when that is exceeded, it falls
back to a single bounding-box of damage, IIRC.)
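
The workaround amounts to something like the sketch below
(illustrative only, not Xwayland's actual code; MAX_DAMAGE_RECTS is a
made-up threshold):

#include <stdint.h>

/* If damage is fragmented into too many rectangles, collapse it into
 * one bounding box instead of flooding the Wayland socket with tiny
 * requests. */
#define MAX_DAMAGE_RECTS 256

struct rect {
    int32_t x, y, width, height;
};

static struct rect
bounding_box(const struct rect *rects, int n)
{
    struct rect box = rects[0];

    for (int i = 1; i < n; i++) {
        int32_t x2 = box.x + box.width;
        int32_t y2 = box.y + box.height;

        if (rects[i].x + rects[i].width > x2)
            x2 = rects[i].x + rects[i].width;
        if (rects[i].y + rects[i].height > y2)
            y2 = rects[i].y + rects[i].height;
        if (rects[i].x < box.x)
            box.x = rects[i].x;
        if (rects[i].y < box.y)
            box.y = rects[i].y;
        box.width = x2 - box.x;
        box.height = y2 - box.y;
    }
    return box;
}

/* Caller: if (n > MAX_DAMAGE_RECTS), post bounding_box(rects, n) as a
 * single wl_surface_damage() request; otherwise post each rect. */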

Blocking might be ok for Xwayland, perhaps, so not the best example in
that sense.

A client could also be trapped in an unthrottled repaint loop, where it
allocates pixel buffers without a limit because a compositor is not
releasing them fast enough, and the general idea is that if you need to
draw and you don't have a free pixel buffer, you allocate a new one.
It's up to the client to limit itself to a reasonable number of pixel
buffers per surface, and that number is not 2. It's probably 4 usually.
A reasonable number could be even more, depending.
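
The limiting logic in a client is roughly this (a sketch; the
my_buffer_* names are hypothetical wrappers around wl_buffer):

#include <stdbool.h>
#include <stddef.h>

/* A buffer is "busy" from wl_surface_commit until the compositor
 * sends wl_buffer.release for it. */
#define MAX_BUFFERS_PER_SURFACE 4

struct my_buffer;
extern bool my_buffer_is_busy(struct my_buffer *buf);
extern struct my_buffer *my_buffer_create(void);

struct buffer_pool {
    struct my_buffer *buffers[MAX_BUFFERS_PER_SURFACE];
    int count;
};

/* Reuse a released buffer, allocate a new one under the cap, or
 * return NULL meaning "wait for a release event before drawing". */
static struct my_buffer *
pool_acquire(struct buffer_pool *pool)
{
    for (int i = 0; i < pool->count; i++)
        if (!my_buffer_is_busy(pool->buffers[i]))
            return pool->buffers[i];

    if (pool->count < MAX_BUFFERS_PER_SURFACE) {
        pool->buffers[pool->count] = my_buffer_create();
        return pool->buffers[pool->count++];
    }

    return NULL;
}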


Thanks,
pq


> On Thu, 22 Feb 2024 11:52:28 +0200
> Pekka Paalanen  wrote:
> 
> > On Wed, 21 Feb 2024 11:08:02 -0500
> > jleivent  wrote:
> >   
> > > Not completely blocking makes sense for the compositor, but why not
> > > block the client?
> > 
> > Blocking in clients is indeed less of a problem, but:
> > 
> > - Clients do not usually have requests they *have to* send to the
> >   compositor even if the compositor is not responding timely, unlike
> >   input events that compositors have; a client can spam surfaces all
> > it wants, but it is just throwing work away if it does it faster than
> >   the screen can update. So there is some built-in expectation that
> >   clients control their sending.
> > 
> > - I think the Wayland design wants to give full asynchronicity for
> >   clients as well, never blocking them unless they explicitly choose
> > to wait for an event. A client might have semi-real-time
> >   responsibilities as well.
> > 
> > - A client's send buffer could be infinite. If a client chooses to
> > send requests so fast it hits OOM, it is just DoS'ing itself.
> >   
> > > For the compositor, wouldn't a timeout in the sendmsg make sense?
> > 
> > That would create both problems: slight blocking multiplied by the
> > number of (stalled) clients, and overflows. That could lead to a
> > jittery user experience while not eliminating the overflow problem.
> > 
> > 
> > Thanks,
> > pq
> >   





Re: why not flow control in wl_connection_flush?

2024-02-22 Thread jleivent
Thanks for this response.  I am considering adding unbounded buffering
to my Wayland middleware project, and wanted to consider the flow
control options first.  Walking through the reasoning here is very
helpful.  I didn't know that there was a built-in expectation that
clients would do some of their own flow control.  I was also operating
under the assumption that blocking flushes from the compositor to
one client would not have an impact on other clients (was assuming an
appropriate threading model in compositors).

The client OOM issue, though: A malicious client can do all kinds of
things to attempt a DoS, and moving towards OOM would accomplish that
as well on systems where thrashing incurs enough of a slowdown.
A buggy client that isn't trying to do anything malicious, but is
trapped in a send loop, would be a case where causing it to wait
might be better than allowing it to move towards OOM (and thrash).

On Thu, 22 Feb 2024 11:52:28 +0200
Pekka Paalanen  wrote:

> On Wed, 21 Feb 2024 11:08:02 -0500
> jleivent  wrote:
> 
> > Not completely blocking makes sense for the compositor, but why not
> > block the client?  
> 
> Blocking in clients is indeed less of a problem, but:
> 
> - Clients do not usually have requests they *have to* send to the
>   compositor even if the compositor is not responding timely, unlike
>   input events that compositors have; a client can spam surfaces all
> it wants, but it is just throwing work away if it does it faster than
>   the screen can update. So there is some built-in expectation that
>   clients control their sending.
> 
> - I think the Wayland design wants to give full asynchronicity for
>   clients as well, never blocking them unless they explicitly choose
> to wait for an event. A client might have semi-real-time
>   responsibilities as well.
> 
> - A client's send buffer could be infinite. If a client chooses to
> send requests so fast it hits OOM, it is just DoS'ing itself.
> 
> > For the compositor, wouldn't a timeout in the sendmsg make sense?  
> 
> That would create both problems: slight blocking multiplied by the
> number of (stalled) clients, and overflows. That could lead to a
> jittery user experience while not eliminating the overflow problem.
> 
> 
> Thanks,
> pq
> 


Re: why not flow control in wl_connection_flush?

2024-02-22 Thread Pekka Paalanen
On Wed, 21 Feb 2024 11:08:02 -0500
jleivent  wrote:

> Not completely blocking makes sense for the compositor, but why not
> block the client?

Blocking in clients is indeed less of a problem, but:

- Clients do not usually have requests they *have to* send to the
  compositor even if the compositor is not responding timely, unlike
  input events that compositors have; a client can spam surfaces all it
  wants, but it is just throwing work away if it does it faster than
  the screen can update. So there is some built-in expectation that
  clients control their sending.

- I think the Wayland design wants to give full asynchronicity for
  clients as well, never blocking them unless they explicitly choose to
  wait for an event. A client might have semi-real-time
  responsibilities as well.

- A client's send buffer could be infinite. If a client chooses to send
  requests so fast it hits OOM, it is just DoS'ing itself.

> For the compositor, wouldn't a timeout in the sendmsg make sense?

That would create both problems: slight blocking multiplied by the
number of (stalled) clients, and overflows. That could lead to a
jittery user experience while not eliminating the overflow problem.
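
For concreteness, the suggestion would amount to something like the
sketch below; note that SO_SNDTIMEO only affects blocking sends, so
it would also require dropping MSG_DONTWAIT, and the 5 ms value is
arbitrary:

#include <sys/socket.h>
#include <sys/time.h>

/* Sketch: with SO_SNDTIMEO set, a blocking sendmsg() fails with
 * EAGAIN once the timeout expires instead of waiting indefinitely. */
static int
set_send_timeout(int fd)
{
    struct timeval tv = { .tv_sec = 0, .tv_usec = 5000 };

    return setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}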


Thanks,
pq

> On Wed, 21 Feb 2024 16:39:08 +0100
> Olivier Fourdan  wrote:
> 
> > Hi,
> > 
> > On Wed, Feb 21, 2024 at 4:21 PM jleivent  wrote:
> >   
> > > I've been looking into some of the issues about allowing the
> > > socket's kernel buffer to run out of space, and was wondering why
> > > not simply remove MSG_DONTWAIT from the sendmsg call in
> > > wl_connection_flush?  That should implement flow control by having
> > > the sender thread wait until the receiver has emptied the socket's
> > > buffer sufficiently.
> > >
> > > It seems to me that using an unbounded buffer could cause memory
> > > resource problems on whichever end was using that buffer.
> > >
> > > Was removing MSG_DONTWAIT from the sendmsg call considered and
> > > abandoned for some reason?
> > >
> > 
> > See this thread [1] from 2012; it might give some hint on why
> > MSG_DONTWAIT was added with commit b26774da [2].
> > 
> > HTH
> > Olivier
> > 
> > [1]
> > https://lists.freedesktop.org/archives/wayland-devel/2012-February/002394.html
> > [2] https://gitlab.freedesktop.org/wayland/wayland/-/commit/b26774da  
> 





Re: why not flow control in wl_connection_flush?

2024-02-21 Thread jleivent
Not completely blocking makes sense for the compositor, but why not
block the client?

For the compositor, wouldn't a timeout in the sendmsg make sense?

On Wed, 21 Feb 2024 16:39:08 +0100
Olivier Fourdan  wrote:

> Hi,
> 
> On Wed, Feb 21, 2024 at 4:21 PM jleivent  wrote:
> 
> > I've been looking into some of the issues about allowing the
> > socket's kernel buffer to run out of space, and was wondering why
> > not simply remove MSG_DONTWAIT from the sendmsg call in
> > wl_connection_flush?  That should implement flow control by having
> > the sender thread wait until the receiver has emptied the socket's
> > buffer sufficiently.
> >
> > It seems to me that using an unbounded buffer could cause memory
> > resource problems on whichever end was using that buffer.
> >
> > Was removing MSG_DONTWAIT from the sendmsg call considered and
> > abandoned for some reason?
> >  
> 
> See this thread [1] from 2012; it might give some hint on why
> MSG_DONTWAIT was added with commit b26774da [2].
> 
> HTH
> Olivier
> 
> [1]
> https://lists.freedesktop.org/archives/wayland-devel/2012-February/002394.html
> [2] https://gitlab.freedesktop.org/wayland/wayland/-/commit/b26774da



Re: why not flow control in wl_connection_flush?

2024-02-21 Thread Olivier Fourdan
Hi,

On Wed, Feb 21, 2024 at 4:21 PM jleivent  wrote:

> I've been looking into some of the issues about allowing the socket's
> kernel buffer to run out of space, and was wondering why not simply
> remove MSG_DONTWAIT from the sendmsg call in wl_connection_flush?  That
> should implement flow control by having the sender thread wait until
> the receiver has emptied the socket's buffer sufficiently.
>
> It seems to me that using an unbounded buffer could cause memory
> resource problems on whichever end was using that buffer.
>
> Was removing MSG_DONTWAIT from the sendmsg call considered and abandoned
> for some reason?
>

See this thread [1] from 2012; it might give some hint on why MSG_DONTWAIT
was added with commit b26774da [2].

HTH
Olivier

[1]
https://lists.freedesktop.org/archives/wayland-devel/2012-February/002394.html
[2] https://gitlab.freedesktop.org/wayland/wayland/-/commit/b26774da


why not flow control in wl_connection_flush?

2024-02-21 Thread jleivent
I've been looking into some of the issues about allowing the socket's
kernel buffer to run out of space, and was wondering why not simply
remove MSG_DONTWAIT from the sendmsg call in wl_connection_flush?  That
should implement flow control by having the sender thread wait until
the receiver has emptied the socket's buffer sufficiently.
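
In other words, something like this sketch (not the actual libwayland
code):

#include <sys/socket.h>
#include <sys/types.h>

/* Sketch of the proposal: without MSG_DONTWAIT, sendmsg() blocks
 * until the socket buffer has room instead of failing with EAGAIN. */
static ssize_t
flush_blocking(int fd, struct msghdr *msg)
{
    return sendmsg(fd, msg, MSG_NOSIGNAL /* was: | MSG_DONTWAIT */);
}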

It seems to me that using an unbounded buffer could cause memory
resource problems on whichever end was using that buffer.

Was removing MSG_DONTWAIT from the sendmsg call considered and abandoned
for some reason?