Re: race between (say) wl_display_sync and wl_callback_add_listener?

2023-09-26 Thread John Cox
Hi

>On Tue, Sep 26, 2023 at 10:08:55AM +0100, John Cox wrote:
>> Hi
>> 
>> Many thanks for your comprehensive answer
>> 
>> >On Mon, Sep 25, 2023 at 05:10:30PM +0100, John Cox wrote:
>> >> Hi
>> >
>> >Hi,
>> >
>> >> Assuming I have a separate poll/read_events/dispatch_queue thread to my
>> >> main thread. If I go wl_display_sync/wl_callback_add_listener on the
>> >> main thread is there a race condition between those two calls where the
>> >> other thread can read & dispatch the callback before the listener is
>> >> added or is there some magic s.t. adding the listener will always work
>> >> as expected?
>> >
>> >This is indeed a racy interaction, in more ways than one:
>> >
>> >1. There is an inherent data race setting and using the
>> >   listener/user_data concurrently from different threads.
>> >   Neither adding a listener/user_data (see wl_proxy_add_listener()),
>> >   nor the actual dispatching operation in libwayland (which uses the
>> >   listener, see wayland-client.c:dispatch_event()) is protected by
>> >   the internal "display lock".
>> 
>> So are all interactions from another thread than the dispatch thread
>> unsafe e.g. buffer creation, setup & assignment to a surface or is it
>> just the obviously racy subset?
>
>From a low-level libwayland API perspective, creating objects or making
>requests from another thread is generally fine. It's this particular
>case of requests that create object and cause events to be emitted
>immediately for them that's problematic.
>
>Many requests that fall in this category are in fact global binds (e.g.,
>wl_shm, wl_seat, wl_output), which are typically handled in the context of the
>wl_registry.global handler, thus in the dispatch thread, in which case no
>special consideration is required.
>
>There are also some requests that at first glance seem problematic,
>but in fact they are not (if things are done properly, and assuming
>correct compositor behavior). For example:
>
>  struct wl_callback *wl_surface_frame(struct wl_surface *surface);
>
>seems suspiciously similar to wl_display_sync(). However, this request
>takes effect (i.e., the callback will receive events) only after the next
>wl_surface_commit(), so the following is safe:
>
>cb = wl_surface_frame(surface);
>/* As long as the listener is set before the commit we are good. */
>wl_callback_add_listener(cb);
>wl_surface_commit(surface);
>
>Of course, on top of all these there are also the typical higher-level
>multithreading synchronization considerations.
>
>

Good - I think my current expectations align with what you and pq have
said.

So thank you once again for taking the time to help me

John Cox





Re: race between (say) wl_display_sync and wl_callback_add_listener?

2023-09-26 Thread John Cox
Hi

Many thanks for your comprehensive answer

>On Mon, Sep 25, 2023 at 05:10:30PM +0100, John Cox wrote:
>> Hi
>
>Hi,
>
>> Assuming I have a separate poll/read_events/dispatch_queue thread to my
>> main thread. If I go wl_display_sync/wl_callback_add_listener on the
>> main thread is there a race condition between those two calls where the
>> other thread can read & dispatch the callback before the listener is
>> added or is there some magic s.t. adding the listener will always work
>> as expected?
>
>This is indeed a racy interaction, in more ways than one:
>
>1. There is an inherent data race setting and using the
>   listener/user_data concurrently from different threads.
>   Neither adding a listener/user_data (see wl_proxy_add_listener()),
>   nor the actual dispatching operation in libwayland (which uses the
>   listener, see wayland-client.c:dispatch_event()) is protected by
>   the internal "display lock".

So are all interactions from another thread than the dispatch thread
unsafe e.g. buffer creation, setup & assignment to a surface or is it
just the obviously racy subset?

>2. Even ignoring the above, if the callback done event is received and
>   dispatched before the listener is actually set, the callback will
>   not get called.

That was the race I could see
   
>> I assume it must be safe if dispatch & create/add_listener are in the
>> same thread.
>
>Yes, a single thread guarantees that the listener is properly set before
>any relevant event is dispatched (assuming you don't dispatch events
>before setting the listener!).
>
>> The same question applies to adding listeners after binding (say) the
>> shm extension where the bind provokes a set of events giving formats.
>
>Yes, any object creation that may cause the compositor to send events as
>an immediate response has this problem.
> 
>> Is there a safe way of doing this? (mutex around the dispatch and
>> create/add_listener pairs would seem plausible even if really not what I
>> want to do)
>
>Here are some ideas:
>
>Use a separate queue in the main thread for this event. Special care is
>needed to ensure the event is actually dispatched on that queue. See
>wl_proxy_create_wrapper(), and wl_display_roundtrip_queue() for a
>function that uses a wrapper. Note that using a separate queue may
>cause the event(s) to be handled out of order relative to the events
>in the main queue. This may or may not be a problem depending on the
>specific scenario.

Yup - understood

>Using a mutex as you describe is also a possible solution. The typical
>caveats apply: holding a mutex when calling out to potentially arbitrary
>code (i.e., dispatch handlers) may increase deadlock risk etc.

Thats why I don't want to do that

>Another alternative would be to introduce a mechanism to pause and
>resume the dispatch thread (e.g., wake up the poll with a pipe/eventfd
>and then wait on some condition variable to resume), which would be used
>like:
>
>pause_dispatch_thread();
>... sync and add listener ...
>resume_dispatch_thread();
>
>You could also introduce a mechanism to schedule your own function
>calls in the dispatch thread:
>
>schedule_in_dispatch_thread(sync_and_add_listener, data);

Yup - thats what I'm now doing for the obvious cases

>Finally, you might also want to consider a different design more in line
>with the libwayland API expectations, if possible.

Not really possible - there are some things I need done (buffer reclaim
mostly) async as soon as possible and I don't have control over the main
loop.

There may be a document that sets out everything you've said above and
gives the exact expectations but I failed to find it. In general the
individual call documentation is great but how interactions between
calls are managed is harder to find. I started from an (incorrect)
assumption that everything was fully async and could be called from any
thread (my natural progamming style) and so there must be magic to
enable that and have only slowly been corrected by reality.

Again many thanks

John Cox

>HTH,
>Alexandros


race between (say) wl_display_sync and wl_callback_add_listener?

2023-09-25 Thread John Cox
Hi

Assuming I have a separate poll/read_events/dispatch_queue thread to my
main thread. If I go wl_display_sync/wl_callback_add_listener on the
main thread is there a race condition between those two calls where the
other thread can read & dispatch the callback before the listener is
added or is there some magic s.t. adding the listener will always work
as expected?

I assume it must be safe if dispatch & create/add_listener are in the
same thread.

The same question applies to adding listeners after binding (say) the
shm extension where the bind provokes a set of events giving formats.

Is there a safe way of doing this? (mutex around the dispatch and
create/add_listener pairs would seem plausible even if really not what I
want to do)

Many thanks

John Cox


Re: wl_surface::attach(NULL) release previous buffer?

2023-09-22 Thread John Cox
Hi

>On 9/14/23 14:24, John Cox wrote:
>
>> Hi
>>
>> A, hopefully, simple question - should I expect a wl_buffer::release
>> event from the buffer previously committed to a surface after I've
>> attached (and invalidated & committed) a NULL buffer to the surface? it
>> doesn't seem to happen for me (I have WAYLAND_DEBUG=1 logs showing it
>> not happening).
>>
>> If I shouldn't expect a release - when is it safe to reuse/free the
>> buffer storage?
>
>The compositor may continue using the buffer even if you attach a null 
>buffer to the wl_surface. For example, the compositor may do it to play 
>an animation if the window is unmapped.

You've completely lost me with this example.  Are you suggesting that
the compositor is reusing my buffer for its own purposes?

>It is safe to reuse the buffer when you receive wl_buffer.release event 
>from the compositor.

Yup I got that (and you also have to wait for dmabuf fences if it is
that sort of buffer).

In the case that I was wondering about it was a compositor bug that kept
the buffer locked when it shouldn't have been.

Thanks

HJC


Re: wl_surface::attach(NULL) release previous buffer?

2023-09-17 Thread John Cox
On Fri, 15 Sep 2023 21:53:34 +, you wrote:

>> Ideally the release event would have been a wl_callback just like 
>> wl_surface.frame is.
>
>I sent [1] to fix this.
>
>[1]: https://gitlab.freedesktop.org/wayland/wayland/-/merge_requests/137

That would have been nice but it doesn't look like it made it?

Thanks

JC


Re: wl_surface::attach(NULL) release previous buffer?

2023-09-15 Thread John Cox
Hi

>On Thu, 14 Sep 2023 12:24:42 +0100
>John Cox  wrote:
>
>> Hi
>> 
>> A, hopefully, simple question - should I expect a wl_buffer::release
>> event from the buffer previously committed to a surface after I've
>> attached (and invalidated & committed) a NULL buffer to the surface? it
>> doesn't seem to happen for me (I have WAYLAND_DEBUG=1 logs showing it
>> not happening).
>
>Theoretically yes, because committing a NULL buffer should delete any
>content of the surface, so there is no reason for the compositor to
>keep the buffer.
>
>It may not be immediate, though, so a simple roundtrip is not
>guaranteed to be enough. An unfinished output frame may cause the
>compositor to still need the previous buffer for a moment longer, and
>the compositor might need to paint one more output refresh to ensure
>the buffer is no longer in use.
>
>Then there are also client related considerations like if the
>wl_surface is actually a synchronized sub-surface, then committing a
>NULL buffer won't take effect until the parent has been committed and
>taken effect.
>
>Another thing is that some wl_surface roles might forbid committing a
>NULL buffer, but you'd notice that from a protocol error. I don't
>remember what roles that might be.
>
>> If I shouldn't expect a release - when is it safe to reuse/free the
>> buffer storage?
>
>To re-use the storage, you do need to receive wl_buffer.release, or an
>equivalent if you are using an extension like explicit sync.

Many thanks for your comprehensive answer. I believe that I've accounted
for all of the special cases you mention: I've waited long enough via
artificial debug delay, its on an async subplane, the window is visibly
black and it works as expected with an egl backend rether the the pixman
one. I know that our pixman backend has been patched but I wanted to
check what I was expecting before trying to raise an issue with them.

As an aside one of the more disapointing lines I read in the API docn
was "If a pending wl_buffer has been committed to more than one
wl_surface, the delivery of wl_buffer.release events becomes undefined."
Yes there may be ways of working round this but it feels wrong.

Thanks again

JC


wl_surface::attach(NULL) release previous buffer?

2023-09-14 Thread John Cox
Hi

A, hopefully, simple question - should I expect a wl_buffer::release
event from the buffer previously committed to a surface after I've
attached (and invalidated & committed) a NULL buffer to the surface? it
doesn't seem to happen for me (I have WAYLAND_DEBUG=1 logs showing it
not happening).

If I shouldn't expect a release - when is it safe to reuse/free the
buffer storage?

Thanks

John Cox


Re: Help with proxy queues please

2023-09-11 Thread John Cox
On Mon, 11 Sept 2023 at 15:52, Pekka Paalanen  wrote:
>
> On Mon, 11 Sep 2023 12:53:15 +0100
> John Cox  wrote:
>
> > Hi
> > I am clearly deeply confused by event queues and proxies - I have this
> > snippet of code (from some VLC work I'm doing):
> >
> > sys->eventq = wl_display_create_queue(video_display(sys));
> > if (sys->eventq == NULL) {
> > msg_Err(vd, "Failed to create event Q");
> > goto error;
> > }
> > if ((sys->wrapped_display =
> > wl_proxy_create_wrapper(video_display(sys))) == NULL)
> > {
> > msg_Err(vd, "Failed to create wrapper");
> > goto error;
> > }
> > wl_proxy_set_queue((struct wl_proxy *)sys->wrapped_display, sys->eventq);
> >
> > struct wl_callback * cb;
> > struct wl_registry *const registry =
> > wl_display_get_registry(sys->wrapped_display);
> > if (registry == NULL) {
> > msg_Err(vd, "Cannot get registry for display");
> > goto error;
> > }
> >
> > wl_registry_add_listener(registry, ®istry_cbs, vd);
> > cb = wl_display_sync(sys->wrapped_display);
> > wl_callback_add_listener(cb, ®_done_sync_listener, vd);
> >
> > msg_Info(vd, "Roundtrip start");
> > wl_display_roundtrip_queue(sys->wrapped_display, sys->eventq);
>
> The problem is probably here.
>
> wrapped_display is just a shell of a wl_proxy, but
> wl_display_roundtrip_queue() needs the original wl_display, just like
> wl_display_dispatch_queue() it calls.
>
> Thanks to elinor in IRC for pointing this out, when I first went on a
> complete wild goose chase wondering if a wl_display can even be
> wrapped, but of course it must be wrappable.
>
> > msg_Info(vd, "Roundtrip done");
> > wl_registry_destroy(registry);
> >
> > video_display() returns the display we are rendering to. My
> > expectation was that the registry & sync listeners would be called
> > somewhere between the roundtrip start/end debug lines, but what seems
> > to happen is that the roundtrip just returns immediately.
> >
> > I have clearly misunderstood something at a fundamental level. I
> > thought that the wrapped display would set the Q that the listeners
> > would send events on and th eroundtrip would wait for that. The
> > wl_proxy_create_wrapper docn on
> > https://wayland.freedesktop.org/docs/html/apb.html#Client-classwl__proxy
> > also seems to imply this.
> >
> > Any hints would be gratefully received - thanks
>
> Yeah, it's a bit awkward to have wl_display both as a proxy and a
> Wayland connection. Some APIs need only a wl_proxy, others need the
> connection.
>
> That's something to add in the docs somehow.
>
> As a rule of thumb, I'd say that anything that explicitly takes a
> 'struct wl_display *' as an argument in wayland-client-core.h needs the
> original wl_display and does not work with a proxy-wrapper.
>
> Another exception is that you cannot change the queue of the original
> wl_display (or at least it's buggy), while a proxy-wrapped wl_display
> you can.
>
>
> Thanks,
> pq
Many thanks for that. That makes it all (a little) clearer.

John Cox


Help with proxy queues please

2023-09-11 Thread John Cox
Hi
I am clearly deeply confused by event queues and proxies - I have this
snippet of code (from some VLC work I'm doing):

sys->eventq = wl_display_create_queue(video_display(sys));
if (sys->eventq == NULL) {
msg_Err(vd, "Failed to create event Q");
goto error;
}
if ((sys->wrapped_display =
wl_proxy_create_wrapper(video_display(sys))) == NULL)
{
msg_Err(vd, "Failed to create wrapper");
goto error;
}
wl_proxy_set_queue((struct wl_proxy *)sys->wrapped_display, sys->eventq);

struct wl_callback * cb;
struct wl_registry *const registry =
wl_display_get_registry(sys->wrapped_display);
if (registry == NULL) {
msg_Err(vd, "Cannot get registry for display");
goto error;
}

wl_registry_add_listener(registry, ®istry_cbs, vd);
cb = wl_display_sync(sys->wrapped_display);
wl_callback_add_listener(cb, ®_done_sync_listener, vd);

msg_Info(vd, "Roundtrip start");
wl_display_roundtrip_queue(sys->wrapped_display, sys->eventq);
msg_Info(vd, "Roundtrip done");
wl_registry_destroy(registry);

video_display() returns the display we are rendering to. My
expectation was that the registry & sync listeners would be called
somewhere between the roundtrip start/end debug lines, but what seems
to happen is that the roundtrip just returns immediately.

I have clearly misunderstood something at a fundamental level. I
thought that the wrapped display would set the Q that the listeners
would send events on and th eroundtrip would wait for that. The
wl_proxy_create_wrapper docn on
https://wayland.freedesktop.org/docs/html/apb.html#Client-classwl__proxy
also seems to imply this.

Any hints would be gratefully received - thanks

John Cox