On Sun, 22 Aug 2021 17:13:26 +0100 "Andrew Bainbridge" <a...@deadfrog.co.uk> said:

There are so many "it depends" in the answer. I'll try and break it down.
I will gloss over a few things and be a bit rough - so what follows can be
nitpicked, but it's the broad strokes of what is going on. Some of this is
confusing because X11 is from the 80s and has been added to and extended
ever since, so the old mechanisms are still there, still work and are
still used alongside the more modern extensions.

xeyes is an old-school X app, so it just renders directly to its window.
Without any compositing this involves xeyes sending draw commands to the
Xserver. The Xserver then draws into a single front framebuffer that
everything shares, clipping to just the region xeyes lives in. If the
window is hidden behind other things, the Xserver throws away the
rendering commands entirely. It is possible for X clients to know if
their window is unobscured, partially obscured or fully obscured, and so
avoid sending any rendering commands at all in some cases.

Now ... compositing. Here the compositor tells the Xserver: "All drawing
that normally goes to this window - please allocate a pixmap and redirect
the drawing to go to that pixmap". The compositor can get access to this
pixmap to use as a source to render from. So there will be one pixmap
(buffer of pixels) per redirected window.

Where this pixmap lives depends entirely on the driver. It may live in
system memory, or maybe on the GPU. It may even move around. For most
modern drivers with GPUs that have their own video memory, it will live
in actual video memory (until you run out of it or something forces it to
migrate). Again - this may vary from GPU to GPU and driver to driver.
Integrated GPUs have no dedicated "video memory" and share memory with
the system - though here "video memory" would be the memory MAPPED to the
GPU (e.g. it may not be cacheable on the CPU side).
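
The difference between the two drawing paths above can be shown with a toy model in plain C. This is only an illustration, not Xserver code: every name in it (Buffer, ToyWindow, toy_fill_rect) is made up for the example, and real clipping also has to account for overlapping windows, which is left out here. An unredirected window draws into the shared front buffer at its screen offset, clipped to its own rectangle. A redirected window draws into its own pixmap at offset (0,0).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model only. All types and functions here are hypothetical. */
typedef struct {
    int w, h;
    uint32_t *px;            /* w * h pixels, row-major */
} Buffer;

typedef struct {
    int x, y, w, h;          /* window geometry on the screen */
    Buffer *redirect;        /* NULL: draw to the shared front buffer */
} ToyWindow;

static Buffer *buffer_new(int w, int h)
{
    Buffer *b = malloc(sizeof *b);
    if (!b) return NULL;
    b->w = w;
    b->h = h;
    b->px = calloc((size_t)w * (size_t)h, sizeof *b->px);
    if (!b->px) { free(b); return NULL; }
    return b;
}

static void buffer_free(Buffer *b)
{
    if (b) { free(b->px); free(b); }
}

/* Fill a rectangle given in window-relative coordinates. */
static void toy_fill_rect(Buffer *front, const ToyWindow *win,
                          int rx, int ry, int rw, int rh, uint32_t color)
{
    assert(win && (win->redirect || front));

    /* Clip the request to the window's own rectangle. */
    int x0 = rx < 0 ? 0 : rx;
    int y0 = ry < 0 ? 0 : ry;
    int x1 = rx + rw > win->w ? win->w : rx + rw;
    int y1 = ry + rh > win->h ? win->h : ry + rh;

    /* Redirected: target is the window's pixmap, origin (0,0).
       Not redirected: target is the front buffer at the window offset. */
    Buffer *dst = win->redirect ? win->redirect : front;
    int ox = win->redirect ? 0 : win->x;
    int oy = win->redirect ? 0 : win->y;

    for (int y = y0; y < y1; y++) {
        for (int x = x0; x < x1; x++) {
            int dx = x + ox, dy = y + oy;
            if (dx < 0 || dy < 0 || dx >= dst->w || dy >= dst->h)
                continue;    /* outside the target buffer */
            dst->px[(size_t)dy * (size_t)dst->w + (size_t)dx] = color;
        }
    }
}
```

With a 4x4 window at (2,2) on an 8x8 front buffer, an oversized fill only touches screen pixels (2,2) to (5,5). Once win.redirect points at a 4x4 buffer, the same call leaves the front buffer untouched, and it is then up to the compositor to decide when and how those pixels reach the screen.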

Almost every single traditional 2D X app will just render to the window
directly, with one of:

  - basic XFillRectangle and friends;
  - XPutImage/XShmPutImage, which is the one relevant to you - you will
    want to use this to basically blast/upload a blob of pixels to your
    window. XShm only works on a local Xserver and not over a network, so
    you need code to detect this and use it only if it works, but it is
    much faster than XPutImage when you can use it;
  - Xrender, for more advanced rendering with alpha channels, ARGB etc. -
    but if you custom render pixels on the CPU you still need to get them
    to Xrender via X(Shm)PutImage;
  - OpenGL for the more adventurous apps (or, really bleeding-edge,
    Vulkan).

OGL/VK will use the X11 DRI2/DRI3 extension protocol behind the scenes to
swap buffers (i.e. tell the Xserver to show/present some buffer that was
once a backbuffer the client allocated - these buffers are not pixmaps
but are pretty much the same thing: a blob of pixels accessible to the
GPU).

There is an old extension called the double buffer extension (XDBE) for
rendering to a backbuffer and then swapping. I don't remember ever seeing
this used in the wild, but I did try it out many years ago and it
provided no benefit vs. X(Shm)PutImage - though this was pre-compositing.
I suspect it's pretty much ignored and no optimizations have been made
due to zero usage, so know that it exists, but ... ignore it.

Xpresent is newer. It is kind of like XDBE, but allows for timestamps at
which to show a pixmap and is more likely to actually allow for buffer
SWAPS.

Now here comes the catch. The way most WMs work, swapping actually ends
up degrading to a copy anyway. So let me get stuck into that.

Client applications tend not to use CSD (client side decorations) for
their window. The Window Manager draws the decorations instead. The way
it is almost always done is by taking the client window (your app) and
placing it as a child inside a parent WM frame window that is larger.
Your window is at an offset and the WM draws titlebars and borders in the
extra space around your window in its parent window. To move your window
the WM just moves its window and yours follows with it. Resizing is more
involved, but the WM resizes its window, resizes your window and redraws
the frame too.

A compositor will request to redirect the WM's parent frame window, not
your client window. This means the whole frame including titlebars is
redirected to a pixmap, with the normal X clipping from the basic xeyes
case I first mentioned being applied within this sub-tree. Thus your
swaps of any buffers end up having to COPY the pixels from your window to
the redirected pixmap (at an offset etc.).

Yes - in theory you could do the inverse: swap buffers, then just copy
the frame regions from the previous buffer to the new one, thus reducing
the amount copied. I do not think this is actually done (I could be
wrong).

If you have a separate compositor vs WM client then this case absolutely
applies. I actually believe it applies to most cases.

Some WMs have both compositor and WM rolled into one. Here I can only
talk about mine, and it does NOT do the above. There is a parent frame
window but it is identical in size to the client. It's just used for
control. The frame/border is drawn inside the compositor itself, not with
ye-olde 2D rendering in the parent frame (and some magic with the shape
extension is done to calculate input regions, to pretend there is a frame
window there and direct input events for that area to the WM). The
compositor runs a full scene graph and borders are just more scene graph
objects drawn with everything else (so with software rendering just like
you, or with OpenGL to accelerate it all, with everything being textures,
triangles etc.). This means your client window == redirected pixmap in
size. It's an exact match.
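
The geometry of the two layouts above can be sketched in a few lines of C. The names (FrameExtents, FrameLayout, frame_layout, client_matches_pixmap) and the decoration sizes are invented for illustration. This is not code from any WM or from the Xserver, just the arithmetic of where the client ends up inside the redirected frame pixmap.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types: decoration thickness around the client, in pixels. */
typedef struct {
    int left, right, top, bottom;
} FrameExtents;

/* Where the client sits inside the redirected frame pixmap. */
typedef struct {
    int client_x, client_y;   /* client offset inside the frame */
    int frame_w, frame_h;     /* size of the redirected pixmap */
} FrameLayout;

/* A reparenting WM places the client inside a frame that is larger by
   the decoration extents on each side. */
static FrameLayout frame_layout(int client_w, int client_h, FrameExtents e)
{
    assert(client_w > 0 && client_h > 0);
    FrameLayout l;
    l.client_x = e.left;
    l.client_y = e.top;
    l.frame_w = e.left + client_w + e.right;
    l.frame_h = e.top + client_h + e.bottom;
    return l;
}

/* True only when the client covers the redirected pixmap exactly:
   no offset and identical size. */
static bool client_matches_pixmap(int client_w, int client_h, FrameLayout l)
{
    return l.client_x == 0 && l.client_y == 0 &&
           l.frame_w == client_w && l.frame_h == client_h;
}
```

For a 640x480 client with a 24 pixel titlebar and 4 pixel borders, the frame pixmap is 648x508 and the client sits at (4,24), so its pixels have to be copied in at that offset. With all extents at zero, the client and the pixmap coincide.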

This means the check is very simple:

    if ((x == 0) && (y == 0) &&
        (buffer_width == pixmap_width) &&
        (buffer_height == pixmap_height)) {
        /* buffer swap: the pixmap ID now points to the newly swapped
           buffer, avoiding a copy */
    } else {
        /* ye olde copy */
    }

This has to be done on the Xserver side. I know the Xserver drivers
already optimize this case for GL apps using DRI protocol when the window
is fullscreen, dropping from a copy to a buffer exchange to cut costs, so
it is a very minimal extension of that logic to do it for a composited
pixmap.

So given this - even with Xpresent, you will be doing a copy from your
X(Shm)PutImage to the pixmap, THEN presenting that pixmap (which may be
another copy - details above). So at best you have just as many copies as
going directly to the window; at worst it may be 2x the copies.
Admittedly the copies here will probably be on-GPU, as opposed to the
PutImage which will be a CPU -> GPU copy. So there still may be a copy
and this still may have tearing happen.

In theory you could allocate your own DMABUFs and use the DRI2/DRI3
protocol - software render into the mmaped dmabuf then show it like
OpenGL does.

As for waiting for the compositor to be ready - you can't do that. You
don't know when the compositor will consume your pixmap and updates, or
even if it will consume them at all. It may choose not to update/render
your window (it's hidden, or the compositor may be dropping down to only
rendering every 4th frame or something). The best thing for you to do is
one of:

  - render with a fixed timer (e.g. at 60Hz) on your side;
  - open /dev/dri/card0 and try to get vblank events (use libdrm to do
    this);
  - or, probably a bit better, use Xpresent (XPresentNotifyMSC() to
    request events for screen refreshes).

> Hi
>
> I'm a reformed Windows programmer trying to understand the big-picture of how
> X11 manages frame buffers. With a typical compositing manager and, say, the
> xeyes app, how many frame buffers are there? How many are in system memory
> and how many in GPU memory? Is flipping employed?
>
> Is this kind of thing documented anywhere?
>
> I have a software rendering 2D library and various apps that depend on it
> (*). I'm porting it from Windows to Linux. The apps do smooth animation by
> drawing to a window-sized pixmap (bitmap in Windows speak) in system memory,
> sending it to the compositor every frame and then waiting until the
> compositor is ready for another.
>
> Is the Present extension the best way to do that on X11 today?
>
> * If curious, see https://www.youtube.com/watch?v=-xVune0NEsA for an example
> app.
>
> Thanks,
> Andy

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - ras...@rasterman.com