Re: Remote display with 3D acceleration using Wayland/Weston
On 2/28/17 8:12 AM, Hardening wrote:
> This has been done quite a long time ago, here =>
> https://gitorious.org/weston/jonseverinsson-weston/?p=weston:jonseverinsson-weston.git;a=commit;h=9e26d9356255f4af1723700272805f6d356c7d7a
>
> It's clearly outdated, and IIRC people here didn't like the way it was implemented, but you get the idea. It's using DRI render nodes to do the rendering.

Thanks for the hint. I was able to integrate your patch with the latest Weston code by borrowing some of the code from the DRM backend, but it doesn't seem to display anything. (My patch is attached. Maybe someone can point out what I'm doing wrong, so I can at least get this working for experimental purposes.)

The following GitHub comment summarizes my general feelings at the moment:
https://github.com/TurboVNC/turbovnc/issues/18#issuecomment-282827192

To the best of my knowledge (someone please correct me if I got any part of this wrong), I think it would be much easier and more flexible to pursue the interposer approach for now, mostly because of the nVidia proprietary driver issue (that's a show-stopper for VirtualGL and TurboVNC), but also because implementing a headless hardware-accelerated remote display backend "the right way" would likely require some changes to gl_renderer and the compositor code, including implementing PBO readback in order to prevent the readback in the compositor from blocking other OpenGL applications. I anticipate that further tuning would be required as well, and it might still not perform as well as an interposer, because VirtualGL reads back the pixels in the application's rendering thread but uses a separate thread for displaying them. Thus, even if the compositor were blocking on encoding a pixel region for remote display, the Wayland client could still render the next frame in the background.
I am not in a good position to develop or maintain changes to Weston, and since interest has been expressed in "officially" supporting a headless hardware-accelerated remote display backend in the long term, it makes sense for me to develop an interposer as a stopgap measure. The interposer could also provide a springboard for other developers who are interested in making their own Weston remote display backends, since they would not have to deal with the problem of OpenGL hardware acceleration.

In the long term, I anticipate that this interposer will be rendered obsolete, but that's fine, because a lot of the effort necessary to build an EGL interposer for Wayland would benefit the existing GLX interposer in VirtualGL as well, so the only thing that might be "thrown away" in the long term would be the Wayland-specific parts.

Barring any further information, that seems to be the best path forward at the moment, but if there is any movement on the nVidia front or on the headless hardware-accelerated remote display backend front, please keep me in the loop.
DRC

From f331db279a138295701806de0c8bd71f385d2796 Mon Sep 17 00:00:00 2001
From: DRC
Date: Tue, 28 Feb 2017 22:49:22 -0600
Subject: [PATCH] rdp-backend.so: OpenGL hardware acceleration

---
 compositor/main.c          |  23 +++-
 configure.ac               |   4 +-
 libweston/compositor-rdp.c | 314 +++--
 libweston/compositor-rdp.h |  24
 4 files changed, 352 insertions(+), 13 deletions(-)

diff --git a/compositor/main.c b/compositor/main.c
index 72c3cd1..7f4b8db 100644
--- a/compositor/main.c
+++ b/compositor/main.c
@@ -601,6 +601,7 @@ usage(int error_code)
 	"  --rdp4-key=FILE\tThe file containing the key for RDP4 encryption\n"
 	"  --rdp-tls-cert=FILE\tThe file containing the certificate for TLS encryption\n"
 	"  --rdp-tls-key=FILE\tThe file containing the private key for TLS encryption\n"
+	"  --use-pixman\t\tUse the pixman (CPU) renderer\n"
 	"\n");
 #endif

@@ -1329,11 +1330,14 @@ static void
 rdp_backend_output_configure(struct wl_listener *listener, void *data)
 {
 	struct weston_output *output = data;
+	struct weston_config *wc = wet_get_config(output->compositor);
 	struct wet_compositor *compositor = to_wet_compositor(output->compositor);
 	struct wet_output_config *parsed_options = compositor->parsed_options;
+	struct weston_config_section *section;
 	const struct weston_rdp_output_api *api = weston_rdp_output_get_api(output->compositor);
 	int width = 640;
 	int height = 480;
+	char *gbm_format = NULL;

 	assert(parsed_options);

@@ -1342,6 +1346,8 @@ rdp_backend_output_configure(struct wl_listener *listener, void *data)
 		return;
 	}

+	section = weston_config_get_section(wc, "output", "name", output->name);
+
 	if (parsed_options->width)
 		width = parsed_options->width;

@@ -1351,6 +1357,12 @@ rdp_backend_output_configure(struct wl_listener *listener,
Re: Remote display with 3D acceleration using Wayland/Weston
On 24/02/2017 at 00:51, DRC wrote:
> On 12/15/16 3:01 AM, Pekka Paalanen wrote:
>> The current RDP-backend is written to set up and use only the Pixman renderer. The Pixman renderer is a software renderer, and will not initialize EGL in the compositor. Therefore no support for hardware-accelerated OpenGL gets advertised to clients, and clients fall back to software GL.
>>
>> You can fix this purely by modifying the libweston/compositor-rdp.c file, writing the support for initializing the GL-renderer. Then you get hardware-accelerated GL support for all Wayland clients without any other modifications anywhere.
>>
>> Why that has not been done already is because it was thought that having clients use hardware OpenGL while the compositor does not could not be performant enough to justify the effort. Also, it pulls in the dependency on the EGL and GL libs, which are huge. Obviously your use case is different and this rationale does not apply.
>>
>> The hardest part in adding the support to the RDP-backend is implementing the buffer content access efficiently. RDP requires pixel data in system memory so the CPU can read it, but the GL-renderer has all pixel data in graphics memory, which often cannot be directly read by the CPU. Accessing that pixel data requires a copy (glReadPixels), and there is nowadays a helper, weston_surface_copy_content(); however, the function is not efficient and is so far meant only for debugging and testing.
>
> I am attempting to modify the RDP backend to prove the concept that hardware-accelerated OpenGL is possible with a remote display backend, but my lack of familiarity with the code is making this very challenging. It seems that the RDP backend uses Pixman both for GL rendering and also to maintain its framebuffer in main memory (shadow_surface). Is that correct?
>
> If so, then it seems that I would need to continue using the shadow surface but use gl_renderer instead of the Pixman renderer, then implement my own method of transferring pixels from the GL renderer to the shadow surface at the end of every frame (?)
>
> I've been trying to work from compositor-wayland.c as a template, but it's unclear how everything connects, which parts of that code I need in order to implement hardware acceleration, and which parts are unnecessary. I would appreciate it if someone who has familiarity with the RDP backend could give me some targeted advice.

This has been done quite a long time ago, here =>
https://gitorious.org/weston/jonseverinsson-weston/?p=weston:jonseverinsson-weston.git;a=commit;h=9e26d9356255f4af1723700272805f6d356c7d7a

It's clearly outdated, and IIRC people here didn't like the way it was implemented, but you get the idea. It's using DRI render nodes to do the rendering.

Best regards.

--
David FORT
website: http://www.hardening-consulting.com/

_______________________________________________
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Re: Remote display with 3D acceleration using Wayland/Weston
On 24 February 2017 at 09:36, Pekka Paalanen wrote:
> On Thu, 23 Feb 2017 17:51:24 -0600, DRC wrote:
>> On 12/15/16 3:01 AM, Pekka Paalanen wrote:
>>> The current RDP-backend is written to set up and use only the Pixman renderer. The Pixman renderer is a software renderer, and will not initialize EGL in the compositor. Therefore no support for hardware-accelerated OpenGL gets advertised to clients, and clients fall back to software GL.
>>>
>>> You can fix this purely by modifying the libweston/compositor-rdp.c file, writing the support for initializing the GL-renderer. Then you get hardware-accelerated GL support for all Wayland clients without any other modifications anywhere.
>>>
>>> Why that has not been done already is because it was thought that having clients use hardware OpenGL while the compositor does not could not be performant enough to justify the effort. Also, it pulls in the dependency on the EGL and GL libs, which are huge. Obviously your use case is different and this rationale does not apply.
>>>
>>> The hardest part in adding the support to the RDP-backend is implementing the buffer content access efficiently. RDP requires pixel data in system memory so the CPU can read it, but the GL-renderer has all pixel data in graphics memory, which often cannot be directly read by the CPU. Accessing that pixel data requires a copy (glReadPixels), and there is nowadays a helper, weston_surface_copy_content(); however, the function is not efficient and is so far meant only for debugging and testing.
>>
>> I am attempting to modify the RDP backend to prove the concept that hardware-accelerated OpenGL is possible with a remote display backend, but my lack of familiarity with the code is making this very challenging. It seems that the RDP backend uses Pixman both for GL rendering and also to maintain its framebuffer in main memory (shadow_surface). Is that correct?
>>
>> If so, then it seems that I would need to continue using the shadow surface but use gl_renderer instead of the Pixman renderer, then implement my own method of transferring pixels from the GL renderer to the shadow surface at the end of every frame (?)
>
> That is pretty much the case, yes. I suppose you could also just let the GL-renderer maintain the framebuffer and only read it out for transmission rather than maintaining a shadow copy, but the difference is mostly just conceptual.
>
>> I've been trying to work from compositor-wayland.c as a template, but it's unclear how everything connects, which parts of that code I need in order to implement hardware acceleration, and which parts are unnecessary. I would appreciate it if someone who has familiarity with the RDP backend could give me some targeted advice.
>
> I cannot help with the RDP-specifics.
>
> Since this compositor is essentially headless on the local machine, you would want to use DRM render nodes instead of KMS nodes for accessing the GPU. The KMS node would be reserved by any display server running for the local monitors.
>
> You would initialize EGL somehow to use a render node. I can't really provide a good suggestion for an architecture off-hand, but maybe these could help:
> https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_platform_device.txt
> https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_platform_gbm.txt

FYI: One can use EGL_EXT_device_drm to get the master fd, but we need another extension for the render. I've got some work on the topic (both EGL Device in mesa and the new extension); need to see if I can finish it these days.

-Emil
Re: Remote display with 3D acceleration using Wayland/Weston
On Thu, 23 Feb 2017 17:51:24 -0600, DRC wrote:
> On 12/15/16 3:01 AM, Pekka Paalanen wrote:
>> The current RDP-backend is written to set up and use only the Pixman renderer. The Pixman renderer is a software renderer, and will not initialize EGL in the compositor. Therefore no support for hardware-accelerated OpenGL gets advertised to clients, and clients fall back to software GL.
>>
>> You can fix this purely by modifying the libweston/compositor-rdp.c file, writing the support for initializing the GL-renderer. Then you get hardware-accelerated GL support for all Wayland clients without any other modifications anywhere.
>>
>> Why that has not been done already is because it was thought that having clients use hardware OpenGL while the compositor does not could not be performant enough to justify the effort. Also, it pulls in the dependency on the EGL and GL libs, which are huge. Obviously your use case is different and this rationale does not apply.
>>
>> The hardest part in adding the support to the RDP-backend is implementing the buffer content access efficiently. RDP requires pixel data in system memory so the CPU can read it, but the GL-renderer has all pixel data in graphics memory, which often cannot be directly read by the CPU. Accessing that pixel data requires a copy (glReadPixels), and there is nowadays a helper, weston_surface_copy_content(); however, the function is not efficient and is so far meant only for debugging and testing.
>
> I am attempting to modify the RDP backend to prove the concept that hardware-accelerated OpenGL is possible with a remote display backend, but my lack of familiarity with the code is making this very challenging. It seems that the RDP backend uses Pixman both for GL rendering and also to maintain its framebuffer in main memory (shadow_surface). Is that correct?
>
> If so, then it seems that I would need to continue using the shadow surface but use gl_renderer instead of the Pixman renderer, then implement my own method of transferring pixels from the GL renderer to the shadow surface at the end of every frame (?)

That is pretty much the case, yes. I suppose you could also just let the GL-renderer maintain the framebuffer and only read it out for transmission rather than maintaining a shadow copy, but the difference is mostly just conceptual.

> I've been trying to work from compositor-wayland.c as a template, but it's unclear how everything connects, which parts of that code I need in order to implement hardware acceleration, and which parts are unnecessary. I would appreciate it if someone who has familiarity with the RDP backend could give me some targeted advice.

I cannot help with the RDP-specifics.

Since this compositor is essentially headless on the local machine, you would want to use DRM render nodes instead of KMS nodes for accessing the GPU. The KMS node would be reserved by any display server running for the local monitors.

You would initialize EGL somehow to use a render node. I can't really provide a good suggestion for an architecture off-hand, but maybe these could help:
https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_platform_device.txt
https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_platform_gbm.txt

Or any other way to get EGL initialized with a real GPU, but without an actual display or a window system. Depending on how that works, you might be rendering into an FBO and using glReadPixels(), or creating an EGLSurface and using glReadPixels(), or something else perhaps. Weston will need to be able to run EGL with a render node for testing the GL-renderer on real hardware, but so far we don't have that code, so I don't have an example.
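[For illustration, the headless EGL initialization Pekka describes can be sketched roughly as follows: open a DRM render node, create a GBM device on it, and bring up EGL through EGL_KHR_platform_gbm. This is a minimal sketch under stated assumptions, not Weston code; the render node path is hard-coded for brevity, and real code would enumerate /dev/dri or use EGL_EXT_platform_device instead.]

```c
/* Sketch: headless EGL on a DRM render node via GBM.
 * Assumes libgbm and an EGL that supports EGL_KHR/MESA_platform_gbm;
 * the /dev/dri/renderD128 path is an illustrative assumption. */
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <gbm.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
	int fd = open("/dev/dri/renderD128", O_RDWR | O_CLOEXEC);
	if (fd < 0) { perror("open render node"); return 1; }

	struct gbm_device *gbm = gbm_create_device(fd);
	if (!gbm) { close(fd); return 1; }

	PFNEGLGETPLATFORMDISPLAYEXTPROC get_platform_display =
		(PFNEGLGETPLATFORMDISPLAYEXTPROC)
			eglGetProcAddress("eglGetPlatformDisplayEXT");

	EGLDisplay dpy = get_platform_display(EGL_PLATFORM_GBM_KHR, gbm, NULL);
	EGLint major, minor;
	if (dpy == EGL_NO_DISPLAY || !eglInitialize(dpy, &major, &minor)) {
		fprintf(stderr, "no EGL on the render node\n");
		return 1;
	}
	printf("EGL %d.%d initialized headlessly\n", major, minor);

	/* With EGL_KHR_surfaceless_context (or a pbuffer surface), one could
	 * now make a context current, render into an FBO, and glReadPixels()
	 * the result for transmission. */
	eglTerminate(dpy);
	gbm_device_destroy(gbm);
	close(fd);
	return 0;
}
```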
A completely different path would be to duplicate the parts of the GL-renderer you need (access to client-provided buffers) and throw away the rest (all the rendering, damage tracking, and whatnot), initialize EGL with a render node, and just scrape the client buffer contents directly without composition. In this case you would be transmitting client window contents as-is, not the final composition. That might have (bandwidth) drawbacks of its own.

Thanks,
pq
Re: Remote display with 3D acceleration using Wayland/Weston
On Fri, 24 Feb 2017 02:20:08 +0100, Christian Stroetmann wrote:
> Maybe a look at another compositor or the IVI shell might help you.

Please do not look at the IVI shell. It has nothing to do with this, while being an example of an... "inferior" window management architecture dictated by GENIVI.

Thanks,
pq
Fwd: Remote display with 3D acceleration using Wayland/Weston
-- Forwarded message --
From: Erik De Rijcke <derijcke.e...@gmail.com>
Date: 2017-02-24 8:46 GMT+01:00
Subject: Re: Remote display with 3D acceleration using Wayland/Weston
To: Christian Stroetmann <stroetm...@ontolab.com>

I made a PoC for a remote (GL) display some time ago for my own compositor, but instead of using RDP, I used an HTML5 websocket & canvas. On an 800x600 display I would get a steady 30 fps.

There were basically two performance challenges to be solved: the glReadPixels and the compression of raw pixels to a compressed image format (PNG in my case). In the end I had to resort to multi-threading to get acceptable performance. This was without other optimizations like only reading out damaged regions, or resorting to something more efficient than glReadPixels.

Nowadays I'm working on an HTML5 solution that is more (very) closely related to what Waltham aims to be. That is, only the contents of individual surfaces are sent over RTP (OpenWebRTC in my case), while communication with the back-end happens over a Wayland-like protocol [ https://github.com/udevbe/westfield ]. Each client then basically runs its own compositor. One major issue that is still unresolved is how to efficiently encode a GL texture to a video frame (VP8/9, H.264) without leaving GPU space early. Ideally the encoding to a video frame (or a PNG or JPEG) happens on the GPU, so the amount of pixels to be read out is minimized.

What you should ask yourself is whether you really want to send over the entire screen or just the individual surfaces. Using individual surfaces should give you less work to get some decent performance; another side effect is that moving a surface does not require re-sending the entire screen.
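[The multi-threading Erik resorted to can be sketched as a one-slot frame mailbox: the GL thread posts each readback result and keeps rendering, while a worker thread blocks waiting for frames and does the expensive encode. A newer frame may overwrite an un-encoded one, which is usually the right policy for interactive remoting. This is a hypothetical illustration of the pattern, not Erik's code; all names are invented.]

```c
/* Sketch: one-slot frame mailbox between a render thread and an
 * encoder thread. The render thread calls mailbox_post() after each
 * glReadPixels(); the encoder thread loops on mailbox_take(). */
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct frame_mailbox {
	pthread_mutex_t lock;
	pthread_cond_t  ready;
	unsigned char  *pixels;   /* latest frame, or NULL */
	size_t          size;
	bool            shutdown;
};

/* Called from the render thread; never blocks on the encoder. */
void mailbox_post(struct frame_mailbox *mb, const unsigned char *px,
		  size_t size)
{
	pthread_mutex_lock(&mb->lock);
	free(mb->pixels);             /* drop a stale, never-encoded frame */
	mb->pixels = malloc(size);
	memcpy(mb->pixels, px, size);
	mb->size = size;
	pthread_cond_signal(&mb->ready);
	pthread_mutex_unlock(&mb->lock);
}

/* Called from the encoder thread; returns the latest frame (caller
 * frees it), or NULL once shutdown is set. */
unsigned char *mailbox_take(struct frame_mailbox *mb, size_t *size)
{
	pthread_mutex_lock(&mb->lock);
	while (!mb->pixels && !mb->shutdown)
		pthread_cond_wait(&mb->ready, &mb->lock);
	unsigned char *px = mb->pixels;
	mb->pixels = NULL;
	if (size)
		*size = mb->size;
	pthread_mutex_unlock(&mb->lock);
	return px;
}
```

The point of the one-slot design is that the render thread's cost per frame is a memcpy, independent of how slow the encoder is.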
Re: Remote display with 3D acceleration using Wayland/Weston
On the 24th of February 2017 00:51, DRC wrote:
> On 12/15/16 3:01 AM, Pekka Paalanen wrote:
>> The current RDP-backend is written to set up and use only the Pixman renderer. The Pixman renderer is a software renderer, and will not initialize EGL in the compositor. Therefore no support for hardware-accelerated OpenGL gets advertised to clients, and clients fall back to software GL.
>>
>> You can fix this purely by modifying the libweston/compositor-rdp.c file, writing the support for initializing the GL-renderer. Then you get hardware-accelerated GL support for all Wayland clients without any other modifications anywhere.
>>
>> Why that has not been done already is because it was thought that having clients use hardware OpenGL while the compositor does not could not be performant enough to justify the effort. Also, it pulls in the dependency on the EGL and GL libs, which are huge. Obviously your use case is different and this rationale does not apply.
>>
>> The hardest part in adding the support to the RDP-backend is implementing the buffer content access efficiently. RDP requires pixel data in system memory so the CPU can read it, but the GL-renderer has all pixel data in graphics memory, which often cannot be directly read by the CPU. Accessing that pixel data requires a copy (glReadPixels), and there is nowadays a helper, weston_surface_copy_content(); however, the function is not efficient and is so far meant only for debugging and testing.
>
> I am attempting to modify the RDP backend to prove the concept that hardware-accelerated OpenGL is possible with a remote display backend, but my lack of familiarity with the code is making this very challenging. It seems that the RDP backend uses Pixman both for GL rendering and also to maintain its framebuffer in main memory (shadow_surface). Is that correct?
>
> If so, then it seems that I would need to continue using the shadow surface but use gl_renderer instead of the Pixman renderer, then implement my own method of transferring pixels from the GL renderer to the shadow surface at the end of every frame (?)
>
> I've been trying to work from compositor-wayland.c as a template, but it's unclear how everything connects, which parts of that code I need in order to implement hardware acceleration, and which parts are unnecessary. I would appreciate it if someone who has familiarity with the RDP backend could give me some targeted advice.

Aloha

I did not get that far some years ago, but your approach sounds good. Basically, it is a translation of what you do with your GLX interposer (VirtualGL) and high-speed X proxy (TurboVNC) to Weston, which reduces to the handling of the buffers. That said, I would try to get it running and then compare the speed with the old version running with the X Window System, and also upload the code for review by the gurus. Maybe a look at another compositor or the IVI shell might help you.

Regards
Christian Stroetmann
Re: Remote display with 3D acceleration using Wayland/Weston
On 12/15/16 3:01 AM, Pekka Paalanen wrote:
> The current RDP-backend is written to set up and use only the Pixman renderer. The Pixman renderer is a software renderer, and will not initialize EGL in the compositor. Therefore no support for hardware-accelerated OpenGL gets advertised to clients, and clients fall back to software GL.
>
> You can fix this purely by modifying the libweston/compositor-rdp.c file, writing the support for initializing the GL-renderer. Then you get hardware-accelerated GL support for all Wayland clients without any other modifications anywhere.
>
> Why that has not been done already is because it was thought that having clients use hardware OpenGL while the compositor does not could not be performant enough to justify the effort. Also, it pulls in the dependency on the EGL and GL libs, which are huge. Obviously your use case is different and this rationale does not apply.
>
> The hardest part in adding the support to the RDP-backend is implementing the buffer content access efficiently. RDP requires pixel data in system memory so the CPU can read it, but the GL-renderer has all pixel data in graphics memory, which often cannot be directly read by the CPU. Accessing that pixel data requires a copy (glReadPixels), and there is nowadays a helper, weston_surface_copy_content(); however, the function is not efficient and is so far meant only for debugging and testing.

I am attempting to modify the RDP backend to prove the concept that hardware-accelerated OpenGL is possible with a remote display backend, but my lack of familiarity with the code is making this very challenging. It seems that the RDP backend uses Pixman both for GL rendering and also to maintain its framebuffer in main memory (shadow_surface). Is that correct?

If so, then it seems that I would need to continue using the shadow surface but use gl_renderer instead of the Pixman renderer, then implement my own method of transferring pixels from the GL renderer to the shadow surface at the end of every frame (?)

I've been trying to work from compositor-wayland.c as a template, but it's unclear how everything connects, which parts of that code I need in order to implement hardware acceleration, and which parts are unnecessary. I would appreciate it if someone who has familiarity with the RDP backend could give me some targeted advice.
Re: Remote display with 3D acceleration using Wayland/Weston
On Mon, 19 Dec 2016 13:23:26 -0600, DRC wrote:
> On 12/19/16 2:48 AM, Pekka Paalanen wrote:
>> Hmm, indeed, maybe it would be possible if you are imposing your own EGL middle-man library between the application and the real EGL library.
>>
>> That's definitely an idea to look into. I cannot say off-hand why it would not work, so maybe it can work. :-)
>>
>> To summarize, with that approach, you would have the client send only wl_shm buffers to the compositor, and the compositor never needs to touch EGL at all. It also has the benefit that the read-back cost (glReadPixels) is completely in the client process, so the compositor will not stall on it, and you don't need the stuff I explained about in the compositor. And you get support for the proprietary drivers!
>>
>> Sorry for not realizing the "wrap libEGL.so" approach earlier.
>
> Yes, exactly. That is essentially how VirtualGL already works with GLX/OpenGL, so it is a solution space I know well. As I see it, the advantages of implementing this at the compositor level are:
>
> -- Automatic hardware acceleration for window managers that might need to use OpenGL (which includes most of them these days)
> -- No need to launch OpenGL applications using a wrapper script
> -- Potentially the compositor could tap into GPU-based encoding methods (NVENC, for instance) quite easily to compress the pixel updates sent to the client. This becomes more difficult when the pixel readback is occurring in the OpenGL application process but the compression is occurring in another process.
>
> The potential advantages of an interposer are:
>
> -- Much easier for me to develop, since this would represent basically a subset of VirtualGL's existing functionality (the GLX interposer could also benefit from a back end that accesses the GPU directly through EGL rather than forwarding the GLX requests through a local X server.)
> -- The readback occurs in-process, so only applications that actually need it (OpenGL applications) are subject to that overhead, and the design of VirtualGL makes it such that the readback of the current frame occurs in parallel with the display of the last frame.
> -- Theoretically it should work with any Wayland implementation or back end. It goes without saying that I'm not the only one in this game. In the current market, there are lots of different vendors producing their X11 proxy of choice, but all of them can use VirtualGL to add GPU acceleration. I don't know how the market will look with Wayland, but I would anticipate that those same vendors will produce their own Wayland proxies of choice as well, so there might be an advantage to retaining VirtualGL as an independent bolt-on product.
> -- That is a good point about the compositor not stalling on glReadPixels() -- although I think I could probably mitigate that by using PBOs rather than synchronous glReadPixels().
>
> I know for sure that I can make the interposer approach work, and perhaps that would be a good short-term approach to get something up and running while the other approach is explored in more depth.

Hi,

I fully agree with everything you said here. :-)

Thanks,
pq
Re: Remote display with 3D acceleration using Wayland/Weston
On 12/19/16 2:48 AM, Pekka Paalanen wrote:
> Hmm, indeed, maybe it would be possible if you are imposing your own EGL middle-man library between the application and the real EGL library.
>
> That's definitely an idea to look into. I cannot say off-hand why it would not work, so maybe it can work. :-)
>
> To summarize, with that approach, you would have the client send only wl_shm buffers to the compositor, and the compositor never needs to touch EGL at all. It also has the benefit that the read-back cost (glReadPixels) is completely in the client process, so the compositor will not stall on it, and you don't need the stuff I explained about in the compositor. And you get support for the proprietary drivers!
>
> Sorry for not realizing the "wrap libEGL.so" approach earlier.

Yes, exactly. That is essentially how VirtualGL already works with GLX/OpenGL, so it is a solution space I know well. As I see it, the advantages of implementing this at the compositor level are:

-- Automatic hardware acceleration for window managers that might need to use OpenGL (which includes most of them these days)

-- No need to launch OpenGL applications using a wrapper script

-- Potentially, the compositor could tap into GPU-based encoding methods (NVENC, for instance) quite easily to compress the pixel updates sent to the client. This becomes more difficult when the pixel readback occurs in the OpenGL application process but the compression occurs in another process.

The potential advantages of an interposer are:

-- Much easier for me to develop, since this would represent basically a subset of VirtualGL's existing functionality (the GLX interposer could also benefit from a back end that accesses the GPU directly through EGL rather than forwarding the GLX requests through a local X server.)

-- The readback occurs in-process, so only applications that actually need it (OpenGL applications) are subject to that overhead, and the design of VirtualGL makes it such that the readback of the current frame occurs in parallel with the display of the last frame.

-- Theoretically, it should work with any Wayland implementation or back end. It goes without saying that I'm not the only one in this game. In the current market, there are lots of different vendors producing their X11 proxy of choice, but all of them can use VirtualGL to add GPU acceleration. I don't know how the market will look with Wayland, but I would anticipate that those same vendors will produce their own Wayland proxies of choice as well, so there might be an advantage to retaining VirtualGL as an independent bolt-on product.

-- That is a good point about the compositor not stalling on glReadPixels() -- although I think I could probably mitigate that by using PBOs rather than synchronous glReadPixels().

I know for sure that I can make the interposer approach work, and perhaps that would be a good short-term approach to get something up and running while the other approach is explored in more depth.
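[Editor's note: a minimal sketch of the kind of PBO-based asynchronous readback discussed above, assuming a current GL context; the sizes, names, and callback are illustrative, not VirtualGL's actual code. Two PBOs are ping-ponged so that the glReadPixels into one overlaps with mapping the other, keeping the render thread from stalling.]

```c
/* Sketch: double-buffered PBO readback, assuming a bound GL context.
 * WIDTH/HEIGHT and deliver() are hypothetical. */
#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>

#define WIDTH  1920
#define HEIGHT 1200

static GLuint pbo[2];
static int frame;

void readback_init(void)
{
    glGenBuffers(2, pbo);
    for (int i = 0; i < 2; i++) {
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[i]);
        glBufferData(GL_PIXEL_PACK_BUFFER, WIDTH * HEIGHT * 4, NULL,
                     GL_STREAM_READ);
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
}

/* Called once per rendered frame; deliver() would hand the pixels to a
 * transport/display thread. */
void readback_frame(void (*deliver)(const void *pixels))
{
    int cur = frame % 2, prev = (frame + 1) % 2;

    /* Kick off an asynchronous read of the current frame... */
    glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[cur]);
    glReadPixels(0, 0, WIDTH, HEIGHT, GL_BGRA, GL_UNSIGNED_BYTE, NULL);

    /* ...while mapping the previous frame's PBO, which has usually
     * completed by now, so the map does not stall the render thread. */
    if (frame > 0) {
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[prev]);
        const void *pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
        if (pixels) {
            deliver(pixels);
            glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
        }
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    frame++;
}
```

The trade-off is one frame of extra latency on the readback path, which is what the "readback of the current frame occurs in parallel with the display of the last frame" remark above refers to.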
Re: Remote display with 3D acceleration using Wayland/Weston
On Mon, 19 Dec 2016 10:50:22 +0100 Christian Stroetmann wrote:
> On 19.Dec.2016 09:48, Pekka Paalanen wrote:
> > Sorry for not realizing the "wrap libEGL.so" approach earlier.
>
> Yeah, and how does this look like when put in context with Waltham?

It has nothing to do with Waltham at all. It is completely orthogonal. You still have the Wayland compositor beside the application, and the compositor is still serving Wayland clients and using something else on the network. It could be Waltham or RDP or VNC or HTTP... neither the application nor the wrapper libEGL would care.

The libEGL.so wrapper could help by converting the application output to wl_shm in the application process, which is the alternative to implementing support in the compositor for handling all the different hardware-related buffer types. Both ways presumably work, with slightly different characteristics.

Thanks,
pq
Re: Remote display with 3D acceleration using Wayland/Weston
On 19.Dec.2016 09:48, Pekka Paalanen wrote:
> On Fri, 16 Dec 2016 11:35:48 -0600 DRC wrote:
> > On 12/16/16 3:06 AM, Pekka Paalanen wrote:
> > > I should probably tell a little more, because what I explained above is a simplification due to using a single path for all buffer types.
> > > ...
> >
> > Thanks again. This is all very new to me, and I guess I don't fully understand where these buffer types would come into play. Bearing in mind that I really don't care whether non-OpenGL applications are hardware-accelerated, does that simplify anything? It would be sufficient if only OpenGL applications could render using the GPU. The compositor itself doesn't necessarily need to.
>
> I do not know of any OpenGL (EGL, actually) implementation that would allow using the GPU while the compositor is not accepting hardware (EGL-based) buffers. The reason is that it simply cannot be performant in a fully local usage scenario.
>
> When I say this, I mean in a way that would be transparent to the application. Applications themselves could initialize GPU support on an EGL platform other than Wayland, render with the GPU, call glReadPixels, and then use wl_shm buffers for sending the content to the compositor. However, this obviously needs explicit coding in the application, and I would not expect anyone to do it, because in the usual case of a fully local graphics stack without any remoting, it would cause a huge performance hit.
>
> Well, I suppose some proprietary implementations have used wl_shm for hardware-rendered content, but we have always considered that a bug, especially when no other transport has been implemented.
>
> A bit of background:
> http://ppaalanen.blogspot.fi/2012/11/on-supporting-wayland-gl-clients-and.html
>
> Oh, this might be a nice reading before that one:
> http://ppaalanen.blogspot.fi/2012/03/what-does-egl-do-in-wayland-stack.html
>
> Lastly, and I believe this is the most sad part for you, is that NVIDIA proprietary drivers do not work (the way we would like).
>
> NVIDIA has been proposing for years a solution that is completely different to anything explained above: EGLStreams, and for the same amount of years, the community has been unimpressed with the design. Anyway, NVIDIA did implement their design and even wrote patches for Weston which we have not merged. Other compositors (e.g. Mutter) may choose to support EGLStreams as a temporary solution.
>
> > I guess I was hoping to take advantage of the EGL_PLATFORM_DEVICE_EXT extension that allows for off-screen OpenGL rendering. It currently works with nVidia's drivers:
> > https://gist.github.com/dcommander/ee1247362201552b2532
>
> Right. You can do that, but then you need to write the application to use it...
>
> Hmm, indeed, maybe it would be possible if you are imposing your own EGL middle-man library between the application and the real EGL library.
>
> That's definitely an idea to look into. I cannot say off-hand why it would not work, so maybe it can work. :-)
>
> To summarize, with that approach, you would have the client send only wl_shm buffers to the compositor, and the compositor never needs to touch EGL at all. It also has the benefit that the read-back cost (glReadPixels) is completely in the client process, so the compositor will not stall on it, and you don't need the stuff I explained about in the compositor. And you get support for the proprietary drivers!
>
> Sorry for not realizing the "wrap libEGL.so" approach earlier.
>
> Thanks,
> pq

Yeah, and how does this look like when put in context with Waltham?

Regards
Christian Stroetmann
Re: Remote display with 3D acceleration using Wayland/Weston
On Fri, 16 Dec 2016 11:35:48 -0600 DRC wrote:
> On 12/16/16 3:06 AM, Pekka Paalanen wrote:
> > I should probably tell a little more, because what I explained above is a simplification due to using a single path for all buffer types.
> > ...
>
> Thanks again. This is all very new to me, and I guess I don't fully understand where these buffer types would come into play. Bearing in mind that I really don't care whether non-OpenGL applications are hardware-accelerated, does that simplify anything? It would be sufficient if only OpenGL applications could render using the GPU. The compositor itself doesn't necessarily need to.

I do not know of any OpenGL (EGL, actually) implementation that would allow using the GPU while the compositor is not accepting hardware (EGL-based) buffers. The reason is that it simply cannot be performant in a fully local usage scenario.

When I say this, I mean in a way that would be transparent to the application. Applications themselves could initialize GPU support on an EGL platform other than Wayland, render with the GPU, call glReadPixels, and then use wl_shm buffers for sending the content to the compositor. However, this obviously needs explicit coding in the application, and I would not expect anyone to do it, because in the usual case of a fully local graphics stack without any remoting, it would cause a huge performance hit.

Well, I suppose some proprietary implementations have used wl_shm for hardware-rendered content, but we have always considered that a bug, especially when no other transport has been implemented.

A bit of background:
http://ppaalanen.blogspot.fi/2012/11/on-supporting-wayland-gl-clients-and.html

Oh, this might be a nice reading before that one:
http://ppaalanen.blogspot.fi/2012/03/what-does-egl-do-in-wayland-stack.html

> > Lastly, and I believe this is the most sad part for you, is that NVIDIA proprietary drivers do not work (the way we would like).
> >
> > NVIDIA has been proposing for years a solution that is completely different to anything explained above: EGLStreams, and for the same amount of years, the community has been unimpressed with the design. Anyway, NVIDIA did implement their design and even wrote patches for Weston which we have not merged. Other compositors (e.g. Mutter) may choose to support EGLStreams as a temporary solution.
>
> I guess I was hoping to take advantage of the EGL_PLATFORM_DEVICE_EXT extension that allows for off-screen OpenGL rendering. It currently works with nVidia's drivers:
> https://gist.github.com/dcommander/ee1247362201552b2532

Right. You can do that, but then you need to write the application to use it...

Hmm, indeed, maybe it would be possible if you are imposing your own EGL middle-man library between the application and the real EGL library.

That's definitely an idea to look into. I cannot say off-hand why it would not work, so maybe it can work. :-)

To summarize, with that approach, you would have the client send only wl_shm buffers to the compositor, and the compositor never needs to touch EGL at all. It also has the benefit that the read-back cost (glReadPixels) is completely in the client process, so the compositor will not stall on it, and you don't need the stuff I explained about in the compositor. And you get support for the proprietary drivers!

Sorry for not realizing the "wrap libEGL.so" approach earlier.

Thanks,
pq
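[Editor's note: the wl_shm path summarized above -- render on the device, glReadPixels into shared memory, hand the result to the compositor -- could look roughly like this on the client side. This is a hedged sketch: the wl_display/wl_shm setup and error handling are elided, the helper name is hypothetical, and memfd_create() is just one of several ways to obtain the shared fd.]

```c
/* Sketch: hand CPU-side pixels to the compositor as a wl_shm buffer.
 * Assumes an existing wl_shm global obtained from the registry. */
#define _GNU_SOURCE
#include <sys/mman.h>
#include <unistd.h>
#include <wayland-client.h>

struct wl_buffer *make_shm_buffer(struct wl_shm *shm, int width, int height,
                                  void **pixels_out)
{
    int stride = width * 4;
    int size = stride * height;

    int fd = memfd_create("frame", 0);     /* anonymous shared memory */
    ftruncate(fd, size);
    void *data = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    struct wl_shm_pool *pool = wl_shm_create_pool(shm, fd, size);
    struct wl_buffer *buffer =
        wl_shm_pool_create_buffer(pool, 0, width, height, stride,
                                  WL_SHM_FORMAT_XRGB8888);
    wl_shm_pool_destroy(pool);
    close(fd);

    *pixels_out = data;   /* glReadPixels would write here */
    return buffer;
}
```

An interposed eglSwapBuffers() would then glReadPixels() into *pixels_out, call wl_surface_attach()/wl_surface_damage()/wl_surface_commit(), and recycle the buffer when the compositor sends the wl_buffer release event.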
Re: Remote display with 3D acceleration using Wayland/Weston
On 12/16/16 3:06 AM, Pekka Paalanen wrote:
> I should probably tell a little more, because what I explained above is a simplification due to using a single path for all buffer types.
> ...

Thanks again. This is all very new to me, and I guess I don't fully understand where these buffer types would come into play. Bearing in mind that I really don't care whether non-OpenGL applications are hardware-accelerated, does that simplify anything? It would be sufficient if only OpenGL applications could render using the GPU. The compositor itself doesn't necessarily need to.

> Lastly, and I believe this is the most sad part for you, is that NVIDIA proprietary drivers do not work (the way we would like).
>
> NVIDIA has been proposing for years a solution that is completely different to anything explained above: EGLStreams, and for the same amount of years, the community has been unimpressed with the design. Anyway, NVIDIA did implement their design and even wrote patches for Weston which we have not merged. Other compositors (e.g. Mutter) may choose to support EGLStreams as a temporary solution.

I guess I was hoping to take advantage of the EGL_PLATFORM_DEVICE_EXT extension that allows for off-screen OpenGL rendering. It currently works with nVidia's drivers:
https://gist.github.com/dcommander/ee1247362201552b2532
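[Editor's note: for reference, headless EGL initialization via EGL_PLATFORM_DEVICE_EXT generally follows the pattern below. This is a sketch under the assumption that the driver exposes the EGL_EXT_device_enumeration and EGL_EXT_platform_device extensions; the gist linked above is the authoritative version, and error checking is minimal here.]

```c
/* Sketch: open an off-screen EGL display on a GPU device, with no
 * display server involved. */
#include <EGL/egl.h>
#include <EGL/eglext.h>

EGLDisplay open_headless_display(void)
{
    PFNEGLQUERYDEVICESEXTPROC query_devices =
        (PFNEGLQUERYDEVICESEXTPROC)eglGetProcAddress("eglQueryDevicesEXT");
    PFNEGLGETPLATFORMDISPLAYEXTPROC get_platform_display =
        (PFNEGLGETPLATFORMDISPLAYEXTPROC)
            eglGetProcAddress("eglGetPlatformDisplayEXT");
    if (!query_devices || !get_platform_display)
        return EGL_NO_DISPLAY;

    EGLDeviceEXT devices[8];
    EGLint n = 0;
    query_devices(8, devices, &n);
    if (n < 1)
        return EGL_NO_DISPLAY;

    /* Use the first device; a real implementation would pick by policy. */
    EGLDisplay dpy = get_platform_display(EGL_PLATFORM_DEVICE_EXT,
                                          devices[0], NULL);
    EGLint major, minor;
    if (dpy == EGL_NO_DISPLAY || !eglInitialize(dpy, &major, &minor))
        return EGL_NO_DISPLAY;

    return dpy;   /* then eglChooseConfig + pbuffer surface + context */
}
```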
Re: Remote display with 3D acceleration using Wayland/Weston
On Thu, 15 Dec 2016 09:55:44 -0600 DRC wrote:
> On 12/15/16 3:01 AM, Pekka Paalanen wrote:
> > I assure you, this is a limitation of the RDP-backend itself. Nothing outside of Weston creates this restriction.
> >
> > The current RDP-backend is written to set up and use only the Pixman renderer. The Pixman renderer is a software renderer, and will not initialize EGL in the compositor. Therefore no support for hardware-accelerated OpenGL gets advertised to clients, and clients fall back to software GL.
> >
> > You can fix this purely by modifying the libweston/compositor-rdp.c file, writing the support for initializing the GL-renderer. Then you get hardware-accelerated GL support for all Wayland clients without any other modifications anywhere.
> >
> > Why that has not been done already is because it was thought that having clients use hardware OpenGL while the compositor does not cannot be performant enough to justify the effort. Also, it pulls in the dependency on the EGL and GL libs, which are huge. Obviously your use case is different and this rationale does not apply.
>
> Like many things, it depends on the application. GLXgears may not perform better in a hardware-accelerated remote 3D environment vs. using software OpenGL, but real-world applications with larger geometries certainly will. In a VirtualGL environment, the overhead is per-frame rather than per-primitive, so geometric throughput is essentially as fast as it would be in the local case (the OpenGL applications are still using direct rendering.) The main performance limiters are pixel readback and transmission. Modern GPUs have pretty fast readback -- 800-1000 Mpixels/sec in the case of a mid-range Quadro, for instance, if you use synchronous readback.
>
> VirtualGL uses PBO readback, which is a bit slower than synchronous readback but which uses practically zero CPU cycles and does not block at the driver level (this is what enables many users to share the same GPU without conflict.) VGL also uses a frame queueing/spoiling system to send the 3D frames from the rendering thread into another thread for transmission and/or display, so it can be displaying or transmitting the last frame while the application renders the next frame. TurboVNC (and most other X proxies that people use with VGL) is based on libjpeg-turbo, which can compress JPEG images at hundreds of Mpixels/sec on modern CPUs. In total, you can pretty easily push 60+ Megapixels/sec with perceptually lossless image quality to clients on even a 100 Megabit network, and 20 Megapixels/sec across a 10 Megabit network (with reduced quality.) Our biggest success stories are large companies who have replaced their 3D workstation infrastructure with 8 or 10 beefy servers running VirtualGL+TurboVNC with laptop clients running the TurboVNC Viewer. In most cases, they claim that the perceived performance is as good as or better than their old workstations.
>
> To put some numbers on this, our GLXspheres benchmark uses a geometry size that is relatively small (~60,000 polygons) but still a lot more realistic than GLXgears (which has a polygon count only in the hundreds, if I recall correctly.) When running on a 1920x1200 remote display session (TurboVNC), this benchmark will perform at about 14 Hz with llvmpipe but 43 Hz with VirtualGL. So software OpenGL definitely does slow things down, even with a relatively modest geometry size and in an environment where there is a lot of per-frame overhead.

Hi,

indeed, those are use cases I (we?) have not thought about. Our thinking has largely revolved around the idea that reading back a buffer from gfx memory into system memory is prohibitively slow. And in many cases it is, but not if you want to remote. Another thought was that if the clients can use hardware GL, then why would the compositor not use hardware paths all the way to the scanout? So the case has been largely ignored. It is very interesting to hear about the numbers!

> > The hardest part in adding the support to the RDP-backend is implementing the buffer content access efficiently. RDP requires pixel data in system memory so the CPU can read it, but the GL-renderer has all pixel data in graphics memory, which often cannot be directly read by the CPU. Accessing that pixel data requires a copy (glReadPixels), and there is nowadays a helper, weston_surface_copy_content(); however, the function is not efficient and is so far meant only for debugging and testing.
>
> I could probably reuse some of the VirtualGL code for this, since it already does a good job of buffer management.
>
> Thanks so much for all of the helpful info. I guess I have my work cut out for me. :|

I should probably tell a little more, because what I explained above is a simplification due to using a single path for all buffer types.
Re: Remote display with 3D acceleration using Wayland/Weston
On 12/15/16 3:01 AM, Pekka Paalanen wrote:
> I assure you, this is a limitation of the RDP-backend itself. Nothing outside of Weston creates this restriction.
>
> The current RDP-backend is written to set up and use only the Pixman renderer. The Pixman renderer is a software renderer, and will not initialize EGL in the compositor. Therefore no support for hardware-accelerated OpenGL gets advertised to clients, and clients fall back to software GL.
>
> You can fix this purely by modifying the libweston/compositor-rdp.c file, writing the support for initializing the GL-renderer. Then you get hardware-accelerated GL support for all Wayland clients without any other modifications anywhere.
>
> Why that has not been done already is because it was thought that having clients use hardware OpenGL while the compositor does not cannot be performant enough to justify the effort. Also, it pulls in the dependency on the EGL and GL libs, which are huge. Obviously your use case is different and this rationale does not apply.

Like many things, it depends on the application. GLXgears may not perform better in a hardware-accelerated remote 3D environment vs. using software OpenGL, but real-world applications with larger geometries certainly will. In a VirtualGL environment, the overhead is per-frame rather than per-primitive, so geometric throughput is essentially as fast as it would be in the local case (the OpenGL applications are still using direct rendering.) The main performance limiters are pixel readback and transmission. Modern GPUs have pretty fast readback -- 800-1000 Mpixels/sec in the case of a mid-range Quadro, for instance, if you use synchronous readback.

VirtualGL uses PBO readback, which is a bit slower than synchronous readback but which uses practically zero CPU cycles and does not block at the driver level (this is what enables many users to share the same GPU without conflict.) VGL also uses a frame queueing/spoiling system to send the 3D frames from the rendering thread into another thread for transmission and/or display, so it can be displaying or transmitting the last frame while the application renders the next frame. TurboVNC (and most other X proxies that people use with VGL) is based on libjpeg-turbo, which can compress JPEG images at hundreds of Mpixels/sec on modern CPUs. In total, you can pretty easily push 60+ Megapixels/sec with perceptually lossless image quality to clients on even a 100 Megabit network, and 20 Megapixels/sec across a 10 Megabit network (with reduced quality.) Our biggest success stories are large companies who have replaced their 3D workstation infrastructure with 8 or 10 beefy servers running VirtualGL+TurboVNC with laptop clients running the TurboVNC Viewer. In most cases, they claim that the perceived performance is as good as or better than their old workstations.

To put some numbers on this, our GLXspheres benchmark uses a geometry size that is relatively small (~60,000 polygons) but still a lot more realistic than GLXgears (which has a polygon count only in the hundreds, if I recall correctly.) When running on a 1920x1200 remote display session (TurboVNC), this benchmark will perform at about 14 Hz with llvmpipe but 43 Hz with VirtualGL. So software OpenGL definitely does slow things down, even with a relatively modest geometry size and in an environment where there is a lot of per-frame overhead.

> The hardest part in adding the support to the RDP-backend is implementing the buffer content access efficiently. RDP requires pixel data in system memory so the CPU can read it, but the GL-renderer has all pixel data in graphics memory, which often cannot be directly read by the CPU. Accessing that pixel data requires a copy (glReadPixels), and there is nowadays a helper, weston_surface_copy_content(); however, the function is not efficient and is so far meant only for debugging and testing.

I could probably reuse some of the VirtualGL code for this, since it already does a good job of buffer management.

Thanks so much for all of the helpful info. I guess I have my work cut out for me. :|
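[Editor's note: the frame queueing/spoiling idea described above can be shown with a toy single-slot "mailbox" -- a sketch, not VirtualGL's actual queue. The render thread deposits frames without ever blocking; if the transport thread has not yet consumed the previous frame, that frame is spoiled (overwritten) rather than queued.]

```c
/* Sketch: single-slot frame mailbox with spoiling (illustrative only;
 * a real implementation would add locking between the two threads). */
#include <stdbool.h>

struct mailbox {
    int frame;      /* latest deposited frame id */
    bool full;      /* unconsumed frame pending? */
    int spoiled;    /* frames overwritten before being sent */
};

/* Render thread: never blocks; overwrites any pending frame. */
void mailbox_post(struct mailbox *mb, int frame)
{
    if (mb->full)
        mb->spoiled++;   /* previous frame is spoiled */
    mb->frame = frame;
    mb->full = true;
}

/* Transport thread: takes the newest frame, or -1 if none pending. */
int mailbox_take(struct mailbox *mb)
{
    if (!mb->full)
        return -1;
    mb->full = false;
    return mb->frame;
}
```

This is why a slow transport never stalls rendering: the application keeps producing frames, and the remote side simply sees the newest one.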
Re: Remote display with 3D acceleration using Wayland/Weston
On Wed, 14 Dec 2016 11:42:54 -0600 DRC wrote:
> But if you run OpenGL applications in Weston, as it is currently implemented, then the OpenGL applications are either GPU-accelerated or not, depending on the back end used. If you run Weston nested in a Wayland compositor that is already GPU-accelerated, then OpenGL applications run in the Weston session will be GPU-accelerated as well. If you run Weston with the RDP back end, then OpenGL applications run in the Weston session will use Mesa llvmpipe instead. I'm trying to understand, quite simply, whether it's possible for unmodified Wayland OpenGL applications -- such as the example OpenGL applications in the Weston source -- to take advantage of OpenGL GPU acceleration when they are running with the RDP back end. (I'm assuming that whatever restrictions there are on the RDP back end would exist for the TurboVNC back end I intend to develop.) My testing thus far indicates that this is not currently possible, but I need to understand the source of the limitation so I can understand how to work around it. Instead, you seem to be telling me that the limitation doesn't exist, but I can assure you that it does. Please test Weston with the RDP back end and confirm that OpenGL applications run in that environment are not GPU-accelerated.

Hi,

I assure you, this is a limitation of the RDP-backend itself. Nothing outside of Weston creates this restriction.

The current RDP-backend is written to set up and use only the Pixman renderer. The Pixman renderer is a software renderer, and will not initialize EGL in the compositor. Therefore no support for hardware-accelerated OpenGL gets advertised to clients, and clients fall back to software GL.

You can fix this purely by modifying the libweston/compositor-rdp.c file, writing the support for initializing the GL-renderer. Then you get hardware-accelerated GL support for all Wayland clients without any other modifications anywhere.

Why that has not been done already is because it was thought that having clients use hardware OpenGL while the compositor does not cannot be performant enough to justify the effort. Also, it pulls in the dependency on the EGL and GL libs, which are huge. Obviously your use case is different and this rationale does not apply.

The hardest part in adding the support to the RDP-backend is implementing the buffer content access efficiently. RDP requires pixel data in system memory so the CPU can read it, but the GL-renderer has all pixel data in graphics memory, which often cannot be directly read by the CPU. Accessing that pixel data requires a copy (glReadPixels), and there is nowadays a helper, weston_surface_copy_content(); however, the function is not efficient and is so far meant only for debugging and testing.

In fact, we have been thinking about adding (hardware or software) OpenGL support to the headless backend so that we could actually run tests on those code paths in the Weston test suite:
https://bugs.freedesktop.org/show_bug.cgi?id=83984
https://bugs.freedesktop.org/show_bug.cgi?id=83985

Since filing those bugs, I have been thinking that testing Weston's GL-renderer should happen with the Wayland-backend. The host compositor would be Weston with the headless backend modified to initialize compositor-side EGL ad hoc. That way we might be able to limit all test-only code to the headless backend.

Thanks,
pq
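[Editor's note: as a rough illustration of what the copy involves: a compositor-side readback amounts to a glReadPixels into system memory followed by a vertical flip, since GL's origin is bottom-left while RDP/VNC-style encoders expect top-down rows. This is a hedged sketch of the general shape, not Weston's weston_surface_copy_content(); the flip part is plain CPU code.]

```c
/* Sketch: flip a bottom-up pixel image (as glReadPixels delivers it)
 * into top-down row order for a remoting encoder. Assumes 32-bit
 * pixels and that dst/src do not overlap. */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

void flip_rows(uint8_t *dst, const uint8_t *src, int width, int height)
{
    size_t stride = (size_t)width * 4;
    for (int y = 0; y < height; y++)
        memcpy(dst + (size_t)y * stride,
               src + (size_t)(height - 1 - y) * stride, stride);
}

/* In the compositor this would follow something like:
 *   glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, src);
 *   flip_rows(dst, src, width, height);
 * with src/dst each sized width * height * 4 bytes. */
```

The extra copy is part of why this path was considered too slow for fully local use; for a remoting backend, it is the unavoidable cost of getting pixels to the CPU at all.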
Re: Remote display with 3D acceleration using Wayland/Weston
On 12/14/16 8:52 PM, Carsten Haitzler (The Rasterman) wrote:
> weston is not the only wayland compositor. it is the sample/test compositor. wayland does not mean sticking to just what weston does.
>
> i suspect weston's rdp back-end forces a sw gl stack because it's easier to be driver agnostic and run everywhere and as you have to read-back pixel data for transmitting over rdp... why bother with the complexity of actual driver setup and hw device permissions etc...
>
> what pekka is saying is that it's kind of YOUR job then to make a headless compositor (base it on weston code or write your own entirely from scratch etc.), and this headless compositor does return a hw egl context to clients. it can transport data to the other server via vnc, rdp or any other method you like. your headless compositor will get new drm buffers from clients when they display (having rendered using the local gpu) and then transfer to the other end. the other end can be a vnc or rdp viewer or a custom app you wrote for your protocol etc. ... but what you want is perfectly doable with wayland... but it's kind of your job to do it. that is what virtual-gl would be: a local headless wayland compositor (for wayland mode) with some kind of display front end on the other end.

Exactly what I needed to know. Thanks.
Re: Remote display with 3D acceleration using Wayland/Weston
On Wed, 14 Dec 2016 11:42:54 -0600 DRC said:

...snip...

> Again, not how it currently works when using Weston with the RDP back end.

weston is not the only wayland compositor. it is the sample/test compositor. wayland does not mean sticking to just what weston does.

i suspect weston's rdp back-end forces a sw gl stack because it's easier to be driver agnostic and run everywhere and as you have to read-back pixel data for transmitting over rdp... why bother with the complexity of actual driver setup and hw device permissions etc...

what pekka is saying is that it's kind of YOUR job then to make a headless compositor (base it on weston code or write your own entirely from scratch etc.), and this headless compositor does return a hw egl context to clients. it can transport data to the other server via vnc, rdp or any other method you like. your headless compositor will get new drm buffers from clients when they display (having rendered using the local gpu) and then transfer to the other end. the other end can be a vnc or rdp viewer or a custom app you wrote for your protocol etc. ... but what you want is perfectly doable with wayland... but it's kind of your job to do it. that is what virtual-gl would be: a local headless wayland compositor (for wayland mode) with some kind of display front end on the other end.

--
- Codito, ergo sum - "I code, therefore I am" --
The Rasterman (Carsten Haitzler) ras...@rasterman.com
Re: Remote display with 3D acceleration using Wayland/Weston
On 12/14/16 3:27 AM, Pekka Paalanen wrote: > could you be more specific on what you mean by "server-side", please? > Are you referring to the machine where the X server runs, or the > machine that is remote from a user perspective where the app runs? Few people use remote X anymore in my industry, so the reality of most VirtualGL deployments (and all of the commercial VGL deployments of which I'm aware) is that the X servers and the GPU are all on the application host, the machine where the applications are actually executed. Typically people allocate beefy server hardware with multiple GPUs, hundreds of gigabytes of memory, and as many as 32-64 CPU cores to act as VirtualGL servers for 50 or 100 users. We use the terms "3D X server" and "2D X server" to indicate where the 3D and 2D rendering is actually occurring. The 3D X server is located on the application host and is usually headless, since it only needs to be used by VirtualGL for obtaining Pbuffer contexts from the GPU-accelerated OpenGL implementation (usually nVidia or AMD/ATI.) There is typically one 3D X server shared by all users of the machine (VirtualGL allows this sharing, since it rewrites all of the GLX calls from applications and automatically converts all of them for off-screen rendering), and the 3D X server has a separate screen for each GPU. The 2D X server is usually an X proxy such as TurboVNC, and there are multiple instances of it (one or more per user.) These 2D X server instances are usually located on the application host but don't necessarily have to be. The client machine simply runs a VNC viewer. X proxies such as Xvnc do not support hardware-accelerated OpenGL, because they are implemented on top of a virtual framebuffer stored in main memory. The only way to implement hardware-accelerated OpenGL in that environment is to use "split rendering", which is what VirtualGL does. It splits off the 3D rendering to another X server that has a GPU attached. 
> Wayland apps handle all rendering themselves, there is nothing for
> sending rendering commands to another process like the Wayland
> compositor.
>
> What a Wayland compositor needs to do is to advertise support for the
> EGL Wayland platform for clients. That it does by using the
> EGL_WL_bind_wayland_display extension.
>
> If you want all GL rendering to happen in the machine where the app
> runs, then you don't have to do much of anything; it already works like
> that. You only need to make sure the compositor initializes EGL, which
> in Weston's case means using the gl-renderer. The renderer does not
> have to actually composite anything if you want to remote windows
> separately, but it is needed to gain access to the window contents. In
> Weston, only the renderer knows how to access the contents of all
> windows (wl_surfaces).
>
> If OTOH you want to send GL rendering commands to the other machine
> than where the app is running, that will require a great deal of work,
> since you have to implement serialization and de-serialization of
> OpenGL (and EGL) yourself. (It has been done before, do ask me if you
> want details.)

But if you run OpenGL applications in Weston, as it is currently implemented, then those applications are either GPU-accelerated or not, depending on the back end used. If you run Weston nested in a Wayland compositor that is already GPU-accelerated, then OpenGL applications run in the Weston session will be GPU-accelerated as well. If you run Weston with the RDP back end, then OpenGL applications run in the Weston session will use Mesa llvmpipe instead.

I'm trying to understand, quite simply, whether it's possible for unmodified Wayland OpenGL applications-- such as the example OpenGL applications in the Weston source-- to take advantage of OpenGL GPU acceleration when they are running with the RDP back end. (I'm assuming that whatever restrictions exist for the RDP back end would also exist for the TurboVNC back end I intend to develop.)
My testing thus far indicates that this is not currently possible, but I need to understand the source of the limitation so I can figure out how to work around it. Instead, you seem to be telling me that the limitation doesn't exist, but I can assure you that it does. Please test Weston with the RDP back end and confirm that OpenGL applications run in that environment are not GPU-accelerated.

> I think you have an underlying assumption that EGL and GL would somehow
> automatically be carried over the network, and you need to undo it.
> That does not happen, as the display server always runs in the same
> machine as the application. The Wayland display is always local, it can
> never be remote simply because Wayland can never go over a network.

No, I don't have that assumption at all, because that is not how VirtualGL works. VirtualGL is designed precisely to avoid that situation. The problem is quite simply this: in Weston, as it is currently implemented, OpenGL applications are not GPU-accelerated when using a remote display back end such as RDP.
Re: Remote display with 3D acceleration using Wayland/Weston
On Tue, 13 Dec 2016 14:39:31 -0600, DRC wrote:
> Greetings. I am the founder and principal developer for The VirtualGL
> Project, which has (since 2004) produced a GLX interposer (VirtualGL)
> and a high-speed X proxy (TurboVNC) that are widely used for running
> Linux/Unix OpenGL applications remotely with hardware-accelerated
> server-side 3D rendering. For those who aren't familiar with VirtualGL,
> it basically works by:

Hi,

could you be more specific on what you mean by "server-side", please? Are you referring to the machine where the X server runs, or the machine that is remote from a user perspective where the app runs?

My confusion is caused by the difference in the X11 vs. Wayland models. The display server the app connects to is not on the same side in one model as in the other model.

With X11 (traditional indirect rendering with X11 over network):

  Machine A              |  Machine B
  App -> libs (X11, GLX) | -> X server -> display
                         |             -> GPU B

With Wayland apps remoted:

  Machine A                               |  Machine B
  App -> EGL and GL libs -> GPU A         |
      --(wayland)--> Weston --(VNC/RDP)---|-> VNC/RDP viewer -> window system -> display

Wayland apps handle all rendering themselves; there is nothing for sending rendering commands to another process like the Wayland compositor.

What a Wayland compositor needs to do is advertise support for the EGL Wayland platform for clients. That it does by using the EGL_WL_bind_wayland_display extension.

If you want all GL rendering to happen in the machine where the app runs, then you don't have to do much of anything; it already works like that. You only need to make sure the compositor initializes EGL, which in Weston's case means using the gl-renderer. The renderer does not have to actually composite anything if you want to remote windows separately, but it is needed to gain access to the window contents. In Weston, only the renderer knows how to access the contents of all windows (wl_surfaces).
If OTOH you want to send GL rendering commands to the other machine than where the app is running, that will require a great deal of work, since you have to implement serialization and de-serialization of OpenGL (and EGL) yourself. (It has been done before; do ask me if you want details.)

> -- Interposing (via LD_PRELOAD) GLX calls from the OpenGL application
> -- Rewriting the GLX calls such that OpenGL contexts are created in
> Pbuffers instead of windows
> -- Redirecting the GLX calls to the server's local display (usually :0,
> which presumably has a GPU attached) rather than the remote display or
> the X proxy
> -- Reading back the rendered 3D images from the server's local display
> and transferring them to the remote display or X proxy when the
> application swaps buffers or performs other "triggers" (such as calling
> glFinish() when rendering to the front buffer)
>
> There is more complexity to it than that, but that's at least the
> general idea.

Ok, so that sounds like you want the GL execution to happen in the app-side machine. That's the easy case. :-)

> At the moment, I'm investigating how best to accomplish a similar feat
> in a Wayland/Weston environment. I'm given to understand that building
> a VNC server on top of Weston is straightforward and has already been
> done as a proof of concept, so really my main question is how to do the
> OpenGL stuff. At the moment, my (very limited) understanding of the
> architecture seems to suggest that I have two options:

Weston has the RDP backend already, indeed.

> (1) Implement an interposer similar in concept to VirtualGL, except that
> this interposer would rewrite EGL calls to redirect them from the
> Wayland display to a low-level EGL device that supports off-screen
> rendering (such as the devices provided through the
> EGL_PLATFORM_DEVICE_EXT extension, which is currently supported by
> nVidia's drivers.)
> How to get the images from that low-level device
> into the Weston compositor when it is using a remote display back-end is
> an open question, but I assume I'd have to ask the compositor for a
> surface (which presumably would be allocated from main memory) and
> handle the transfer of the pixels from the GPU to that surface. That is
> similar in concept to how VirtualGL currently works, vis-a-vis using
> glReadPixels to transfer the rendered OpenGL pixels into an MIT-SHM image.

I think you have an underlying assumption that EGL and GL would somehow automatically be carried over the network, and you need to undo it. That does not happen, as the display server always runs in the same machine as the application. The Wayland display is always local; it can never be remote, simply because Wayland can never go over a network. Furthermore, all GL rendering is always done locally, on the machine where the application runs.
Re: Remote display with 3D acceleration using Wayland/Weston
On 13.Dec.2016 21:39, DRC wrote:

I thought about this on the 14th of March 2014 (see also [1]). Have you looked at https://github.com/waltham/waltham ?

Regards
Christian Stroetmann

[1] OntoGraphics (www.ontolinux.com/technology/ontographics/ontographics.htm)

> Greetings. I am the founder and principal developer for The VirtualGL
> Project, which has (since 2004) produced a GLX interposer (VirtualGL)
> and a high-speed X proxy (TurboVNC) that are widely used for running
> Linux/Unix OpenGL applications remotely with hardware-accelerated
> server-side 3D rendering. For those who aren't familiar with VirtualGL,
> it basically works by:
>
> -- Interposing (via LD_PRELOAD) GLX calls from the OpenGL application
> -- Rewriting the GLX calls such that OpenGL contexts are created in
> Pbuffers instead of windows
> -- Redirecting the GLX calls to the server's local display (usually :0,
> which presumably has a GPU attached) rather than the remote display or
> the X proxy
> -- Reading back the rendered 3D images from the server's local display
> and transferring them to the remote display or X proxy when the
> application swaps buffers or performs other "triggers" (such as calling
> glFinish() when rendering to the front buffer)
>
> There is more complexity to it than that, but that's at least the
> general idea. At the moment, I'm investigating how best to accomplish a
> similar feat in a Wayland/Weston environment. I'm given to understand
> that building a VNC server on top of Weston is straightforward and has
> already been done as a proof of concept, so really my main question is
> how to do the OpenGL stuff.
> At the moment, my (very limited) understanding of the architecture
> seems to suggest that I have two options:
>
> (1) Implement an interposer similar in concept to VirtualGL, except that
> this interposer would rewrite EGL calls to redirect them from the
> Wayland display to a low-level EGL device that supports off-screen
> rendering (such as the devices provided through the
> EGL_PLATFORM_DEVICE_EXT extension, which is currently supported by
> nVidia's drivers.) How to get the images from that low-level device
> into the Weston compositor when it is using a remote display back-end
> is an open question, but I assume I'd have to ask the compositor for a
> surface (which presumably would be allocated from main memory) and
> handle the transfer of the pixels from the GPU to that surface. That is
> similar in concept to how VirtualGL currently works, vis-a-vis using
> glReadPixels to transfer the rendered OpenGL pixels into an MIT-SHM
> image.
>
> (2) Figure out some way of redirecting the OpenGL rendering within
> Weston itself, rather than using an interposer. This is where I'm fuzzy
> on the details. Is this even possible with a remote display back-end?
> Maybe it's as straightforward as writing a back-end that allows Weston
> to use the aforementioned low-level EGL device to obtain all of the
> rendering surfaces that it passes to applications, but I don't have a
> good enough understanding of the architecture to know whether or not
> that idea is nonsense.
>
> I know that X proxies, such as Xvnc, allocate a "virtual framebuffer"
> that is used by the X.org code for performing X11 rendering. Because
> this virtual framebuffer is located in main memory, you can't do
> hardware-accelerated OpenGL with it unless you use a solution like
> VirtualGL. It would be impractical to allocate the X proxy's virtual
> framebuffer in GPU memory because of the fine-grained nature of X11,
> but since Wayland is all image-based, perhaps that is no longer a
> limitation.
>
> Any advice is greatly appreciated. Thanks for your time.
> DRC

_______________________________________________
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel