Re: Ways to test Weston during development (Re: Full-motion zero-copy screen capture in Weston)
Hi Matt,

On Fri, 7 Jun 2024 at 16:30, Hoosier, Matt wrote:
> Okay, makes sense that you don’t want to have to repeat the dependencies’
> builds for every CI test. I’m not arguing that you should – it was just more
> a thought experiment to see whether riding Meson subprojects is a reasonable
> idea for establishing a development environment.
>
> I get your point that that can become a deep rabbit hole. But it seems that
> you didn’t have any need to build LLVM and similar just to support the
> hand-built copy of Mesa that’s in the CI. Is there some reason why a deeper
> set of transitive dependencies would be needed using Meson subprojects than
> when building by hand? Seems like I could probably just mimic what you’ve
> done. Maybe your point is that the CI is a very constrained environment
> that’s known not to need ATI or llvmpipe, but a general developer situation
> with physical machines would?

Oh no, the CI environment absolutely needs llvmpipe! We install quite a few development packages (cf. .gitlab-ci/debian-install.sh) into the CI environment, so although we don't build LLVM, we do absolutely depend on distro LLVM development packages which aren't present in a clean distro install.

You're completely right though that it makes no difference to the dependency chain whether the dependencies come from Meson subprojects or previous installs.

Cheers,
Daniel
Re: Ways to test Weston during development (Re: Full-motion zero-copy screen capture in Weston)
Hi Matt,

On Fri, 7 Jun 2024 at 15:30, Hoosier, Matt wrote:
> Would Meson’s dependency wrapping capabilities be a viable solution here? I
> think that most of Weston’s dependencies that have aggressive version
> requirements are themselves also Meson projects.
>
> The Weston CI configuration builds a bunch of its dependencies (Mesa, libdrm,
> libwayland …) manually. I wonder why Meson wrapping was not used for this?

We don't want to rebuild Mesa every time. We could've built it as a subproject and cached it, but it didn't seem to offer much advantage over just installing it into the system.

We could probably add some subprojects, but you'd probably end up pulling in more components as well - e.g. if you want to run Mesa with its software renderer or the AMD drivers, you'll also need to use LLVM - and at what point does your easy subproject build turn into, well, a full distribution?

I guess one thing we could do is to jazz the CI build up a little so it's easier to pull the OCI image and run it inside a toolbox, as well as reuse those scripts locally.

Cheers,
Daniel
Re: Ways to test Weston during development (Re: Full-motion zero-copy screen capture in Weston)
Hi,

On Wed, 5 Jun 2024 at 09:09, Pekka Paalanen wrote:
> On Tue, 4 Jun 2024 20:33:48 +
> "Hoosier, Matt" wrote:
> > Tactical question: I somehow missed until this point that the remote
> > and pipewire plugins will only run if the DRM backend is being used.
> >
> > But the DRM backend *really* doesn't want to start nowadays unless
> > you're running on a system with seatd and/or logind available.
> > Toolbox [1] is the de facto way to develop on bleeding edge copies of
> > components these days. But it logind and seatd aren't exposed into it.
> >
> > How do Weston people interactively develop on the Weston DRM backend
> > nowadays?
> >
> > [1] https://docs.fedoraproject.org/en-US/fedora-silverblue/toolbox/
>
> I'm doing it old-school on my workstation, without any containers. What
> dependencies my distribution does not provide, I build and install
> manually into a prefix under $HOME:
>
> https://www.collabora.com/news-and-blog/blog/2020/04/10/clean-reliable-setup-for-dependency-installation/
>
> The "clean and reliable" is probably outdated in this era of
> containers...

Yes, doing it in containers is a little bit tricky since it's not exactly the design case. Honestly, on my Silverblue systems, I just install a bunch of relevant dependencies into the system image with rpm-ostree, and have a pile of self-built dependencies in a local prefix.

This might give you some insight however:
https://github.com/containers/toolbox/issues/992

It probably needs some minor changes in Weston but does at least seem doable ...

Cheers,
Daniel
Re: [RFC PATCH v4 00/42] Color Pipeline API w/ VKMS
rs/gpu/drm/drm_mode_config.c | 7 + > drivers/gpu/drm/drm_plane.c | 52 ++ > drivers/gpu/drm/tests/Makefile| 3 +- > drivers/gpu/drm/tests/drm_fixp_test.c | 69 ++ > drivers/gpu/drm/vkms/Kconfig | 20 + > drivers/gpu/drm/vkms/Makefile | 4 +- > drivers/gpu/drm/vkms/tests/.kunitconfig | 4 + > drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 449 ++ > drivers/gpu/drm/vkms/vkms_colorop.c | 100 +++ > drivers/gpu/drm/vkms/vkms_composer.c | 135 ++- > drivers/gpu/drm/vkms/vkms_drv.h | 8 + > drivers/gpu/drm/vkms/vkms_luts.c | 802 ++ > drivers/gpu/drm/vkms/vkms_luts.h | 12 + > drivers/gpu/drm/vkms/vkms_plane.c | 2 + > include/drm/drm_atomic.h | 122 +++ > include/drm/drm_atomic_uapi.h | 3 + > include/drm/drm_colorop.h | 301 +++ > include/drm/drm_file.h| 7 + > include/drm/drm_fixed.h | 35 +- > include/drm/drm_mode_config.h | 18 + > include/drm/drm_plane.h | 13 + > include/uapi/drm/drm.h| 16 + > include/uapi/drm/drm_mode.h | 14 + > 38 files changed, 3882 insertions(+), 30 deletions(-) > create mode 100644 Documentation/gpu/rfc/color_pipeline.rst > create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_colorop.c > create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_colorop.h > create mode 100644 drivers/gpu/drm/drm_colorop.c > create mode 100644 drivers/gpu/drm/tests/drm_fixp_test.c > create mode 100644 drivers/gpu/drm/vkms/Kconfig > create mode 100644 drivers/gpu/drm/vkms/tests/.kunitconfig > create mode 100644 drivers/gpu/drm/vkms/tests/vkms_color_tests.c > create mode 100644 drivers/gpu/drm/vkms/vkms_colorop.c > create mode 100644 drivers/gpu/drm/vkms/vkms_luts.c > create mode 100644 drivers/gpu/drm/vkms/vkms_luts.h > create mode 100644 include/drm/drm_colorop.h > > -- > 2.44.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 1/2] drm/tidss: Fix initial plane zpos values
Hi, On Fri, 16 Feb 2024 at 09:00, Tomi Valkeinen wrote: > On 13/02/2024 13:39, Daniel Stone wrote: > > Specifically, you probably want commits 4cde507be6a1 and 58dde0e0c000. > > I think the window of breakage was small enough that - assuming either > > those commits or an upgrade to Weston 12/13 fixes it - we can just ask > > people to upgrade to a fixed Weston. > > > >>> Presuming this is not related to any TI specific code, I guess it's a > >>> regression in the sense that at some point Weston added the support to use > >>> planes for composition, so previously with only a single plane per display > >>> there was no issue. > > > > That point was 12 years ago, so not that novel. ;) > > Hmm, so do I understand it right, the plane code from 12 years back > supposedly works ok, but somewhere around Weston 10 something broke, but > was fixed with the commits you mention above? We always had plane support but pre-zpos; we added support for zpos a couple/few releases ago, but then massively refactored it ... so it could've always been broken, or could've been broken for as long as we have zpos, or it could've just been a small window in between the refactor.
Re: [PATCH 1/2] drm/tidss: Fix initial plane zpos values
Hi, On Tue, 13 Feb 2024 at 10:18, Marius Vlad wrote: > On Tue, Feb 13, 2024 at 11:57:59AM +0200, Tomi Valkeinen wrote: > > I haven't. I'm quite unfamiliar with Weston, and Randolph from TI (cc'd) has > > been working on the Weston side of things. I also don't know if there's > > something TI specific here, as the use case is with non-mainline GPU drivers > > and non-mainline Mesa. I should have been a bit clearer in the patch > > description, as I didn't mean that upstream Weston has a bug (maybe it has, > > maybe it has not). Don't worry about it. We've had bugs in the past and I'm sure we'll have more. :) Either way, it's definitely better to have the kernel expose sensible behaviour rather than weird workarounds, unless they've been around for so long that they're basically baked into ABI. > > The issue seen is that when Weston decides to use DRM planes for > > composition, the plane zpositions are not configured correctly (or at all?). > > Afaics, this leads to e.g. weston showing a window with a DRM "overlay" > > plane that is behind the "primary" root plane, so the window is not visible. > > And as Weston thinks that the area supposedly covered by the overlay plane > > does not need to be rendered on the root plane, there are also artifacts on > > that area. > > > > Also, the Weston I used is a bit older one (10.0.1), as I needed to go back > > in my buildroot versions to get all that non-mainline GPU stuff compiled and > > working. A more recent Weston may behave differently. > > Right after Weston 10, we had a few minor changes related to the > zpos-sorting list of planes and how we parse the plan list without having > a temporary zpos ordered list to pick planes from. > > And there's another fix for missing out to set out the zpos for scanout > to the minimum available - which seems like a good candidate to explain > what happens in the issue described above. 
So if trying Weston again, > please try with at least Weston 12, which should have those changes > in. Specifically, you probably want commits 4cde507be6a1 and 58dde0e0c000. I think the window of breakage was small enough that - assuming either those commits or an upgrade to Weston 12/13 fixes it - we can just ask people to upgrade to a fixed Weston. > > Presuming this is not related to any TI specific code, I guess it's a > > regression in the sense that at some point Weston added the support to use > > planes for composition, so previously with only a single plane per display > > there was no issue. That point was 12 years ago, so not that novel. ;) Cheers, Daniel
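The stacking failure discussed in this thread (an overlay plane ending up behind the primary plane) can be illustrated with a small, self-contained C sketch. This is not Weston's code and not the commits mentioned above: `toy_plane`, `toy_assign_zpos` and the greedy strategy are invented for illustration, assuming each plane advertises an inclusive zpos range and that the planes are passed in bottom-to-top order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a KMS plane: only the zpos range and the chosen value. */
struct toy_plane {
    unsigned zpos_min; /* lowest zpos the plane accepts (inclusive) */
    unsigned zpos_max; /* highest zpos the plane accepts (inclusive) */
    unsigned zpos;     /* value that would be programmed */
};

/*
 * planes[] is ordered bottom to top. On success, returns 0 and writes
 * strictly increasing zpos values, with the bottom plane getting the
 * lowest value it supports. If the stacking cannot be expressed,
 * returns -1 and leaves every zpos untouched.
 */
int toy_assign_zpos(struct toy_plane *planes, size_t count)
{
    unsigned next = 0; /* smallest zpos still above everything placed so far */
    size_t i;

    assert(planes != NULL || count == 0);

    /* Pass 1: check feasibility without modifying anything. */
    for (i = 0; i < count; i++) {
        unsigned z = planes[i].zpos_min > next ? planes[i].zpos_min : next;

        if (z > planes[i].zpos_max)
            return -1;
        next = z + 1;
    }

    /* Pass 2: same walk, now storing the result. */
    next = 0;
    for (i = 0; i < count; i++) {
        unsigned z = planes[i].zpos_min > next ? planes[i].zpos_min : next;

        planes[i].zpos = z;
        next = z + 1;
    }
    return 0;
}
```

With a primary and an overlay plane that both default to zpos 0, the sketch places the primary at 0 and the overlay at 1, so the overlay can no longer be hidden behind the primary. A real compositor would additionally have to program the values through the plane's zpos property and handle immutable zpos.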
[ANNOUNCE] wayland-protocols 1.33
Hi,

wayland-protocols 1.33 has been released. This marks the linux-dmabuf protocol - now at v5 - stable, introduces the ext-transient-seat protocol, and has a number of minor fixes and clarifications for other protocols.

Thanks to all who have contributed.

Andri Yngvason (1):
      Add the transient seat protocol

Daniel Stone (1):
      build: Bump version to 1.33

Jonas Ådahl (1):
      xdg-shell: Clarify what a toplevel by default includes

Lleyton Gray (1):
      staging/drm-lease: fix typo in description

MaxVerevkin (1):
      linux-dmabuf: sync changes from unstable to stable

Sebastian Wick (3):
      security-context-v1: Document out of band metadata for flatpak
      security-context-v1: Document what can be done with the open sockets
      security-context-v1: Make sandbox engine names use reverse-DNS

Simon Ser (12):
      linux-dmabuf: add note about implicit sync
      members: remove EFL/Enlightenment
      build: simplify dict loops
      build: add version for stable protocols
      linux-dmabuf: mark as stable
      xdg-decoration: fix configure event summary
      xdg-decoration: remove ambiguous wording in configure event
      presentation-time: stop referring to Linux/glibc
      readme: version should be included in stable protocol filenames
      linux-dmabuf: require all planes to use the same modifier
      readme: make it clear that we are a standards body
      ci: upgrade ci-templates and Debian

Vaxry (2):
      README: fix typos
      governance: fix typos

git tag: 1.33

https://gitlab.freedesktop.org/wayland/wayland-protocols/-/releases/1.33/downloads/wayland-protocols-1.33.tar.xz
SHA256: 94f0c50b090d6e61a03f62048467b19abbe851be4e11ae7b36f65f8b98c3963a  wayland-protocols-1.33.tar.xz
SHA512: 4584f6ac86367655f9db5d0c0ed0681efa31e73f984e4b620fbe5317df21790927f4f5317ecbbc194ac31eaf88caebc431bcc52c23d9dc0098c71de3cb4a9fef  wayland-protocols-1.33.tar.xz
PGP: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/releases/1.33/downloads/wayland-protocols-1.33.tar.xz.sig
Re: Right mailing list for mutter/gnome-remote-desktop question?
Hi Matt,

On Wed, 17 Jan 2024 at 17:08, Matt Hoosier wrote:
> Does anybody know whether there’s a dedicated mailing list suitable for
> asking questions about the hardware acceleration in the remote desktop
> use-case for those two?
>
> I did a quick look through both repos’ README and CONTRIBUTING files, but
> didn’t find anything.

https://discourse.gnome.org is probably your best bet there.

Cheers,
Daniel
Re: Sub 16ms render but missing swap
Hi Joe,

On Wed, 18 Oct 2023 at 02:00, Joe M wrote:
> A few questions:
> 1. What other avenues of investigation should I pursue for the swap delay?
> As in, why when I take 12 ms to render do I not see about 4ms for the swap
> call to return? My display is running in at 60hz.

Further to Emmanuel's point about GPU rendering being async (you can validate by calling glFinish before eglSwapBuffers, which will wait for everything to complete) - which hardware platform are you using here, and which software stack as well? As in, do your Weston + drivers + etc come from upstream projects or are they provided by a vendor?

> 2. Has EGL been optimized to use the available wayland callbacks and
> maximize available client drawing time?

Yes, very much.

> 3. Does EGL leverage "weston_direct_display_v1" when available? What's
> required to take advantage of it in the app code? (ie. run fullscreen?)

No need. We bypass composition as much as we possibly can. You can try using weston-simple-egl with the flag to use direct-display if you want to satisfy yourself, but it's in no way required to bypass GPU composition and use the display controller to scan out.

Cheers,
Daniel
Re: [PATCH v1] dynamic_debug: add support for logs destination
On Thu, Oct 12, 2023 at 01:39:44PM +0300, Pekka Paalanen wrote: > On Thu, 12 Oct 2023 11:53:52 +0200 > Daniel Vetter wrote: > > > On Thu, Oct 12, 2023 at 11:55:48AM +0300, Pekka Paalanen wrote: > > > On Wed, 11 Oct 2023 11:42:24 +0200 > > > Daniel Vetter wrote: > > > > > > > On Wed, Oct 11, 2023 at 11:48:16AM +0300, Pekka Paalanen wrote: > > ... > > > > > > - all selections tailored separately for each userspace subscriber > > > > > (- per open device file description selection of messages) > > > > > > > > Again this feels like a userspace problem. Sessions could register what > > > > kind of info they need for their session, and something like journald > > > > can > > > > figure out how to record it all. > > > > > > Only if the kernel actually attaches all the required information to > > > the debug messages *in machine readable form* so that userspace > > > actually can do the filtering. And that makes *that* information UABI. > > > Maybe that's fine? I wouldn't know. > > > > Well if you configure the filters to go into separate ringbuffers for each > > session (or whatever you want to split) it also becomes uapi. > > It's a different UAPI: filter configuration vs. message structure. I > don't mind which it is, I just suspect one is easier to maintain and > extend than the other. > > > Also I'd say that for the first cut just getting the logs out on demand > > should be good enough, multi-gpu (or multi-compositor) systems are a step > > further. We can figure those out when we get there. > > This reminds me of what you recently said in IRC about a very different > topic: > >swick[m], tell this past me roughly 10 years ago, would > have been easy to add into the design back when there was no > driver code yet > > I just want to mention today everything I can see as useful. It's up to > the people doing the actual work to decide what they include and how. 
I actually pondered this a bit more today, and I think even with hindsight the atomic design we ended up with was probably rather close to optimal. Sure there's a bunch of things that would have been nice to include, but another very hard requirement of atomic was that it's feasible to convert current drivers over to it. And I think going full free-standing state structures with unlimited (at least at the design level) queue depth would have been a bridge too far. The hacks and conversion helpers are all gone by now, but "you can just peek at the object struct to get your state" was a huge help in reducing the conversion churn. But it definitely resulted in a big price we're still paying. tldr I don't think getting somewhere useful, even if somewhat deficient, is bad. -Sima -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v1] dynamic_debug: add support for logs destination
On Thu, Oct 12, 2023 at 11:55:48AM +0300, Pekka Paalanen wrote: > On Wed, 11 Oct 2023 11:42:24 +0200 > Daniel Vetter wrote: > > > On Wed, Oct 11, 2023 at 11:48:16AM +0300, Pekka Paalanen wrote: > > > On Tue, 10 Oct 2023 10:06:02 -0600 > > > jim.cro...@gmail.com wrote: > > > > > > > since I name-dropped you all, > > > > > > Hi everyone, > > > > > > I'm really happy to see this topic being developed! I've practically > > > forgot about it myself, but the need for it has not diminished at all. > > > > > > I didn't understand much of the conversation, so I'll just reiterate > > > what I would use it for, as a Wayland compositor developer. > > > > > > I added a few more cc's to get better coverage of DRM and Wayland > > > compositor developers. > > > > > > > On Tue, Oct 10, 2023 at 10:01 AM wrote: > > > > > > > > > > On Mon, Oct 9, 2023 at 4:47 PM Łukasz Bartosik > > > > > wrote: > > > > > > ... > > > > > > > > > I don't have a real life use case to configure different trace > > > > > > instance for each callsite. > > > > > > I just tried to be as much flexible as possible. > > > > > > > > > > > > > > > > Ive come around to agree - I looked back at some old threads > > > > > (that I was a part of, and barely remembered :-} > > > > > > > > > > At least Sean Paul, Lyude, Simon Ser, Pekka Paalanen > > > > > have expressed a desire for a "flight-recorder" > > > > > it'd be hard to say now that 2-3 such buffers would always be enough, > > > > > esp as theres a performance reason for having your own. > > > > > > A Wayland compositor has roughly three important things where the kernel > > > debugs might come in handy: > > > - input > > > - DRM KMS > > > - DRM GPU rendering > > > > > > DRM KMS is the one I've been thinking of in the flight recorder context > > > the most, because KMS hardware varies a lot, and there is plenty of > > > room for both KMS drivers and KMS userspace to go wrong. 
The usual > > > result is your display doesn't work, so the system is practically > > > unusable to the end user. In the wild, the simplest or maybe the only > > > way out of that may be a reboot, maybe an automated one (e.g. digital > > > signage). In order to debug such problems, we would need both > > > compositor logs and the relevant kernel debug messages. > > > > > > For example, Weston already has a flight recorder framework of its own, > > > so we have the compositor debug logs. It would be useful to get the > > > selected kernel debug logs in the same place. It could be used for > > > automated or semi-manual bug reporting, for example, making the > > > administrator or end user life much easier reporting issues. > > > > > > Since this is usually a production environment, and the Wayland > > > compositor runs without root privileges, we need something that works > > > with that. We would likely want the kernel debug messages in the > > > compositor to combine and order them properly with the compositor debug > > > messages. > > > > > > It's quite likely that developers would like to pick and choose which > > > kernel debug messages might be interesting enough to record, to avoid > > > excessive log flooding. The flight recorder in Weston is fixed size to > > > avoid running out of memory or disk space. I can also see that Weston > > > could have debugging options that affect which kernel debug messages it > > > subscribes to. We can have a reasonable default setup that allows us to > > > pinpoint the problem area and figure out most problems, and if needed, > > > we could ask the administrator pass another debug option to Weston. It > > > helps if there is just one place to configure everything about the > > > compositor. > > > > > > This implies that it would be really nice to have userspace subscriber > > > specific debug message streams from the kernel, or a good way to filter > > > the messages we want. 
A Wayland compositor would not be interested in > > > file system or wireless debugs for example, but another system > > > component might be. There is also a security aspect of which component is > > > allowed to see which messages in c
Re: [PATCH v1] dynamic_debug: add support for logs destination
ss to get a
> little too much DRM KMS debug (that is, from the whole device instead
> of just the leased parts), it may not be worth to consider splitting
> debug message streams this far.
>
> If userspace is offered some standardised fields in kernel debug
> structures, then userspace could do some filtering on its own too, but I
> guess it would be better to filter at the source and not need that.
>
> There is also an anti-goal. The kernel debug message contents are
> specifically not machine-parsable. I very much do not want to impose
> debug strings as ABI, they are for human (and AI?) readers only.
>
>
> As a summary, here are the most important requirements first:
> - usable in production as a normal thing to enable always by default
> - final delivery to unprivileged userspace process

I think this is the one that's trickiest, and I don't fully understand why you need it. The issues I'm seeing:

- logs tend to leak a lot of kernel internal state that's useful for attacks. There are measures for the worst of it (like obfuscating kernel pointers by hashing them), so there's always going to be a difference here between what a full sysadmin can get and what unprivileged userspace can get. And there's always a risk we miss something that we should obfuscate but didn't.

- handing this to userspace increases the risk that it becomes uapi. Who's going to stop compositors from sussing out the reason an atomic commit failed from the logs if they can get them easily, and these logs contain very interesting information about why something failed?

  Much better if journald or a crash handler assembles all the different flight recorder logs and packages them into a bug report so that the compositor cannot ever get at these directly. Yeah, this needs some OS support with a dbus request or similar so that the compositor can ask for a crash dump with everything relevant to its session.

- the idea of an in-kernel flight recorder is that it's really fast. The entire tracing infra is built such that recording an event is really quick, but printing it is not - the entire string formatting is delayed to when userspace reads the buffers. If you constantly push the log messages to userspace we toss the advantage of the low-overhead in-kernel flight recorder. If you push logs to dmesg there's a substantial measurable overhead which you don't really want in production, and your requirement would impose the same.

- I'm not sure how this is supposed to mesh with userspace log aggregators like journald when every compositor has its own flight recorder on top. Feels a bit like a solution that ignores the entire OS stack and assumes that the only pieces we can touch are the kernel and the compositor to get to such a flight recorder.

  You might object that events will get out-of-order if you merge multiple logs after the fact, but that's the case anyway if we use the tracing framework since that's always a ringbuffer within the kernel and not synchronous. The only thing we could do is allow userspace to push markers into that ringbuffer, which is easily done by adding more debug output lines (heck, we could even add a logging cookie to certain ioctls when userspace really cares about knowing the exact ordering of its requests with the stuff the kernel does).

- If you really want direct delivery to userspace I guess we could do something where sessiond opens the flight recorder fd for you, sets it all up and passes it to the compositor. But I'm really not a big fan of sending the full kms dbg spam to compositors to freely digest in real time.

> - per debug-print selection of messages (finer or coarser, categories
> within a kernel sub-system could be enough)
> - per originating device (driver instance) selection of messages

The dyndbg stuff can do all that already, which is why I'm so much in favour of relying on that framework.

> - all selections tailored separately for each userspace subscriber
> (- per open device file description selection of messages)

Again this feels like a userspace problem. Sessions could register what kind of info they need for their session, and something like journald can figure out how to record it all.

If you want the kernel to keep separate flight recorders I guess we could add that, but I don't think it currently exists for the dyndbg stuff at least. Maybe a flight recorder v2 feature, once the basics are in.

> That's my idea of it. It is interesting to see how far the requirements
> can be reasonably realised.

I think aside from the "make it available directly to unprivileged userspace" everything sounds reasonable and doable.

More on the process side of things, I think Jim is very much looking for acks and tested-by by people who are interested in better drm logging infra. That should help make sure things are moving in a direction that's actually useful, even when it's not yet entirely complete.

Cheers, Sima
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
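The point made in this message about a flight recorder being cheap to write and expensive only to read can be sketched in a few lines of userspace C. This is a toy model, not the kernel tracing or dyndbg infrastructure: `flight_recorder`, `fr_record` and `fr_dump` are hypothetical names, and the event layout (one name pointer plus one integer) is an assumption chosen to keep the example short. Recording stores raw values into a fixed-size ring and overwrites the oldest entry; string formatting happens only when the buffer is dumped.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define FR_SLOTS 4 /* fixed capacity: memory use never grows */

/* One recorded event: raw values only, no formatted text. */
struct fr_event {
    uint64_t seq;     /* position in the overall event stream */
    const char *name; /* pointer to a constant string, not a copy */
    long value;       /* raw argument, formatted only when dumped */
};

struct flight_recorder {
    struct fr_event slots[FR_SLOTS];
    uint64_t next_seq; /* number of events recorded so far */
};

void fr_init(struct flight_recorder *fr)
{
    memset(fr, 0, sizeof(*fr));
}

/* Hot path: a few stores, overwriting the oldest slot when full. */
void fr_record(struct flight_recorder *fr, const char *name, long value)
{
    struct fr_event *ev;

    assert(fr != NULL && name != NULL);
    ev = &fr->slots[fr->next_seq % FR_SLOTS];
    ev->seq = fr->next_seq++;
    ev->name = name;
    ev->value = value;
}

/*
 * Slow path: format the retained events, oldest first, into out.
 * Returns the number of complete lines written.
 */
size_t fr_dump(const struct flight_recorder *fr, char *out, size_t out_len)
{
    uint64_t kept = fr->next_seq < FR_SLOTS ? fr->next_seq : FR_SLOTS;
    uint64_t seq;
    size_t used = 0;
    size_t lines = 0;

    if (out_len == 0)
        return 0;
    out[0] = '\0';

    for (seq = fr->next_seq - kept; seq < fr->next_seq; seq++) {
        const struct fr_event *ev = &fr->slots[seq % FR_SLOTS];
        int len = snprintf(out + used, out_len - used, "[%llu] %s=%ld\n",
                           (unsigned long long)ev->seq, ev->name, ev->value);

        if (len < 0 || (size_t)len >= out_len - used) {
            out[used] = '\0'; /* drop the partial line and stop */
            break;
        }
        used += (size_t)len;
        lines++;
    }
    return lines;
}
```

Recording six events into the four-slot ring and then dumping yields only the last four, in order. Pushing every message to a consumer as it happens would move the `snprintf` cost into the hot path, which is the overhead the message argues against.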
Re: [PATCH v5 6/6] drm/doc: Define KMS atomic state set
On Mon, 31 Jul 2023 at 04:01, André Almeida wrote: > > Em 13/07/2023 04:51, Pekka Paalanen escreveu: > > On Tue, 11 Jul 2023 10:57:57 +0200 > > Daniel Vetter wrote: > > > >> On Fri, Jul 07, 2023 at 07:40:59PM -0300, André Almeida wrote: > >>> From: Pekka Paalanen > >>> > >>> Specify how the atomic state is maintained between userspace and > >>> kernel, plus the special case for async flips. > >>> > >>> Signed-off-by: Pekka Paalanen > >>> Signed-off-by: André Almeida > >>> --- > >>> v4: total rework by Pekka > >>> --- > >>> Documentation/gpu/drm-uapi.rst | 41 ++ > >>> 1 file changed, 41 insertions(+) > >>> > >>> diff --git a/Documentation/gpu/drm-uapi.rst > >>> b/Documentation/gpu/drm-uapi.rst > >>> index 65fb3036a580..6a1662c08901 100644 > >>> --- a/Documentation/gpu/drm-uapi.rst > >>> +++ b/Documentation/gpu/drm-uapi.rst > >>> @@ -486,3 +486,44 @@ and the CRTC index is its position in this array. > >>> > >>> .. kernel-doc:: include/uapi/drm/drm_mode.h > >>> :internal: > >>> + > >>> +KMS atomic state > >>> + > >>> + > >>> +An atomic commit can change multiple KMS properties in an atomic fashion, > >>> +without ever applying intermediate or partial state changes. Either the > >>> whole > >>> +commit succeeds or fails, and it will never be applied partially. This > >>> is the > >>> +fundamental improvement of the atomic API over the older non-atomic API > >>> which is > >>> +referred to as the "legacy API". Applying intermediate state could > >>> unexpectedly > >>> +fail, cause visible glitches, or delay reaching the final state. > >>> + > >>> +An atomic commit can be flagged with DRM_MODE_ATOMIC_TEST_ONLY, which > >>> means the > >>> +complete state change is validated but not applied. Userspace should > >>> use this > >>> +flag to validate any state change before asking to apply it. If > >>> validation fails > >>> +for any reason, userspace should attempt to fall back to another, perhaps > >>> +simpler, final state. 
This allows userspace to probe for various > >>> configurations > >>> +without causing visible glitches on screen and without the need to undo a > >>> +probing change. > >>> + > >>> +The changes recorded in an atomic commit apply on top the current KMS > >>> state in > >>> +the kernel. Hence, the complete new KMS state is the complete old KMS > >>> state with > >>> +the committed property settings done on top. The kernel will > >>> automatically avoid > >>> +no-operation changes, so it is safe and even expected for userspace to > >>> send > >>> +redundant property settings. No-operation changes do not count towards > >>> actually > >>> +needed changes, e.g. setting MODE_ID to a different blob with identical > >>> +contents as the current KMS state shall not be a modeset on its own. > >> > >> Small clarification: The kernel indeed tries very hard to make redundant > >> changes a no-op, and I think we should consider any issues here bugs. But > >> it still has to check, which means it needs to acquire the right locks and > >> put in the right (cross-crtc) synchronization points, and due to > >> implmentation challenges it's very hard to try to avoid that in all cases. > >> So adding redundant changes especially across crtc (and their connected > >> planes/connectors) might result in some oversynchronization issues, and > >> userspace should therefore avoid them if feasible. > >> > >> With some sentences added to clarify this: > >> > >> Reviewed-by: Daniel Vetter > > > > After talking on IRC yesterday, we realized that the no-op rule is > > nowhere near as generic as I have believed. Roughly: > > https://oftc.irclog.whitequark.org/dri-devel/2023-07-12#1689152446-1689157291; > > > > > > How about: > > The changes recorded in an atomic commit apply on top the current KMS > state in the kernel. Hence, the complete new KMS state is the complete > old KMS state with the committed property settings done on top. 
The > kernel will try to avoid no-operation changes, so it is safe for >
Re: Need support to display application at (0, 0) position on Weston desktop
Hi Huy,

On Wed, 12 Jul 2023 at 16:15, huy nguyen wrote:
> I have a Linux system based on weston wayland. I run MPV player and expect
> it displays a video window at (0,0) position on the screen (top left corner
> of the display). I already use x11egl backend option to MPV to support a
> fixed position to application but the video window of MPV is displayed at
> an offset (X offset, Y offset) from (0,0) position as shown by the picture
> below:
>

You probably want to make mpv fullscreen, and then it will take up the whole area of the screen. kiosk-shell does this well, by telling all applications to be fullscreen.

Cheers,
Daniel
Re: [PATCH v5 6/6] drm/doc: Define KMS atomic state set
On Fri, Jul 07, 2023 at 07:40:59PM -0300, André Almeida wrote:
> From: Pekka Paalanen
>
> Specify how the atomic state is maintained between userspace and
> kernel, plus the special case for async flips.
>
> Signed-off-by: Pekka Paalanen
> Signed-off-by: André Almeida
> ---
> v4: total rework by Pekka
> ---
>  Documentation/gpu/drm-uapi.rst | 41 ++
>  1 file changed, 41 insertions(+)
>
> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> index 65fb3036a580..6a1662c08901 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -486,3 +486,44 @@ and the CRTC index is its position in this array.
>
>  .. kernel-doc:: include/uapi/drm/drm_mode.h
>     :internal:
> +
> +KMS atomic state
> +================
> +
> +An atomic commit can change multiple KMS properties in an atomic fashion,
> +without ever applying intermediate or partial state changes. Either the whole
> +commit succeeds or fails, and it will never be applied partially. This is the
> +fundamental improvement of the atomic API over the older non-atomic API which is
> +referred to as the "legacy API". Applying intermediate state could unexpectedly
> +fail, cause visible glitches, or delay reaching the final state.
> +
> +An atomic commit can be flagged with DRM_MODE_ATOMIC_TEST_ONLY, which means the
> +complete state change is validated but not applied. Userspace should use this
> +flag to validate any state change before asking to apply it. If validation fails
> +for any reason, userspace should attempt to fall back to another, perhaps
> +simpler, final state. This allows userspace to probe for various configurations
> +without causing visible glitches on screen and without the need to undo a
> +probing change.
> +
> +The changes recorded in an atomic commit apply on top the current KMS state in
> +the kernel. Hence, the complete new KMS state is the complete old KMS state with
> +the committed property settings done on top. The kernel will automatically avoid
> +no-operation changes, so it is safe and even expected for userspace to send
> +redundant property settings. No-operation changes do not count towards actually
> +needed changes, e.g. setting MODE_ID to a different blob with identical
> +contents as the current KMS state shall not be a modeset on its own.

Small clarification: The kernel indeed tries very hard to make redundant
changes a no-op, and I think we should consider any issues here bugs. But it
still has to check, which means it needs to acquire the right locks and put in
the right (cross-crtc) synchronization points, and due to implementation
challenges it's very hard to try to avoid that in all cases. So adding
redundant changes especially across crtc (and their connected
planes/connectors) might result in some oversynchronization issues, and
userspace should therefore avoid them if feasible.

With some sentences added to clarify this:

Reviewed-by: Daniel Vetter

> +
> +A "modeset" is a change in KMS state that might enable, disable, or temporarily
> +disrupt the emitted video signal, possibly causing visible glitches on screen. A
> +modeset may also take considerably more time to complete than other kinds of
> +changes, and the video sink might also need time to adapt to the new signal
> +properties. Therefore a modeset must be explicitly allowed with the flag
> +DRM_MODE_ATOMIC_ALLOW_MODESET. This in combination with
> +DRM_MODE_ATOMIC_TEST_ONLY allows userspace to determine if a state change is
> +likely to cause visible disruption on screen and avoid such changes when end
> +users do not expect them.
> +
> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is allowed to
> +effectively change only the FB_ID property on any planes. No-operation changes
> +are ignored as always. Changing any other property will cause the commit to be
> +rejected.
> --
> 2.41.0
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
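As a rough illustration of the semantics the quoted documentation describes (new state = old state with the committed settings applied on top, all-or-nothing, and redundant settings not counting as changes), here is a small userspace-only C model. It is only a sketch and not kernel code: `toy_state`, `toy_prop` and `toy_commit` are hypothetical names, and the property slots are plain array indices rather than real KMS property IDs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_PROPS 4 /* hypothetical fixed-size property table */

struct toy_state {
    uint64_t values[TOY_NUM_PROPS];
};

struct toy_prop {
    size_t index;   /* which property slot to set */
    uint64_t value; /* requested value */
};

/*
 * Apply a commit on top of the old state.
 *
 * Returns the number of settings that actually changed a value relative to
 * the state built so far; redundant settings are accepted but not counted.
 * With test_only set, *state is left untouched (validation only).
 * Returns -1 and applies nothing if any setting is invalid.
 */
static int toy_commit(struct toy_state *state, const struct toy_prop *props,
                      size_t count, bool test_only)
{
    struct toy_state next = *state; /* work on a copy: all or nothing */
    int changed = 0;

    for (size_t i = 0; i < count; i++) {
        if (props[i].index >= TOY_NUM_PROPS)
            return -1; /* reject the whole commit, nothing applied */
        if (next.values[props[i].index] != props[i].value) {
            next.values[props[i].index] = props[i].value;
            changed++;
        }
    }

    if (!test_only)
        *state = next;
    return changed;
}
```

Note that the model hides exactly the cost raised in the review comment: a real kernel still has to take locks and synchronize in order to find out that a setting is redundant, so redundant settings are harmless to the resulting state but not necessarily free.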
Re: Need support to have weston randr release
Hi Huy,

On Sat, 8 Jul 2023 at 08:39, huy nguyen wrote:
> I have a Linux system based on weston wayland and I need to get the
> current setting of the display resolution.
> Unfortunately, xrandr command does not work on Wayland.
> After much searching, I came to this information which is about adding
> weston-randr support for Weston compositor:
>
> https://lists.freedesktop.org/archives/wayland-devel/2014-February/013480.html
>
> However, I could not find a link to download the patch to apply in order
> to have weston-randr command
> Please advise if the patch is available for the community to use.

The current resolution is provided as an event on the wl_output interfaces.
From the command line, you can get this from wayland-info.

Cheers,
Daniel
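For anyone wanting to do this from a client rather than from wayland-info, the following is a minimal sketch of the handler side only. It has the same argument shape as the `mode` callback in `struct wl_output_listener`, but it is written so that it compiles without libwayland: `struct wl_output` is just forward-declared, `MODE_FLAG_CURRENT` is a local stand-in mirroring the value of `WL_OUTPUT_MODE_CURRENT`, and `struct current_mode` is a made-up container. Binding the wl_output global and registering the listener are left out.

```c
#include <stddef.h>
#include <stdint.h>

struct wl_output; /* opaque here; normally provided by wayland-client.h */

/* Local stand-in mirroring the value of WL_OUTPUT_MODE_CURRENT. */
#define MODE_FLAG_CURRENT 0x1u

/* Hypothetical container for the mode the output is currently using. */
struct current_mode {
    int32_t width;       /* pixels */
    int32_t height;      /* pixels */
    int32_t refresh_mhz; /* vertical refresh rate in mHz */
    int valid;           /* non-zero once a current mode has been seen */
};

/* Same argument shape as the `mode` callback of struct wl_output_listener. */
static void handle_mode(void *data, struct wl_output *output, uint32_t flags,
                        int32_t width, int32_t height, int32_t refresh)
{
    struct current_mode *cur = data;

    (void)output;
    if (!(flags & MODE_FLAG_CURRENT))
        return; /* a mode that is advertised but not in use */

    cur->width = width;
    cur->height = height;
    cur->refresh_mhz = refresh;
    cur->valid = 1;
}
```

In a real client this function would be the `.mode` member of a `wl_output_listener` passed to `wl_output_add_listener()`, with a `struct current_mode` as the user data pointer.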
Re: Weston mirror/clone to 2 different displays
Hi Dawn, On Thu, 22 Jun 2023 at 18:09, Dawn HOWE wrote: > I am developing an (embedded) medical device which is required to have a > touchscreen display and also mirror the output to a monitor connected via > HDMI. The device is using Wayland/Weston on TorizonCore (based on a yocto > kirkstone). I am able to get the display extended from HDMI to LVDS, but > not have the output mirrored to both displays. I posted a query on the > Toradex community, and received a response that Weston may not be capable > of doing this. ( > https://community.toradex.com/t/apalis-imx8-hdmi-and-lvds-display-not-mirroring-cloning/19869 > ). > > > > I have searched and found some old posts from several years ago indicating > that it was not supported, but may be with a patch. I understand that > “same-as” configuration in weston.ini does not work for my scenario. > > > > What is the current state of cloning/mirroring to two different outputs, > but on the same card. E.g (card1-HDMI-A-1 and card1-LVDS-1): > > ls /sys/class/drm > > card0 card1 card1-HDMI-A-1 card1-LVDS-1 renderD128 > renderD129 version > Weston can currently mirror it if the display driver directly supports it. You can use the same-as configuration option (see man weston-drm) to enable this. If your display driver doesn't support this in the kernel, then Weston won't do it for now, but we are actively working on this and expect to have a branch capable of this within the next couple of weeks or so. Cheers, Daniel
Re: Weston 12 compatibility with Yocto Kirkstone
Hi Namit,

On Thu, 15 Jun 2023 at 16:37, Namit Solanki (QUIC) <quic_nsola...@quicinc.com> wrote:
> As we all know Weston 10 has bitbakes files available for Yocto kirkstone
> version. Can Weston 12 work with Kirkstone as well?
>
> Is Weston 12 compatible with Kirkstone?
>
> Do we need to write our own bitbake files for Weston 12 to compile with
> Kirkstone?
>
> Please help on these queries.

OpenEmbedded does have a 11.0.1 build definition for Weston. You should be
able to reuse this whilst bumping the version to 12.0.0, as long as you also
pull the other dependencies from OE such as libseat and probably a newer
version of Meson.

Cheers,
Daniel
Re: Refresh rates with multiple monitors
Hi Joe,

On Wed, 14 Jun 2023 at 21:33, Joe M wrote:
> Thanks Daniel. Do you know if wl_output instances are decoupled from each
> other, when it comes to display refresh?

Yep, absolutely.

> The wl_output geometry info hints that each output can be thought of as a
> region in a larger compositor canvas, given the logical x/y fields in the
> geometry. Is the compositor able to handle the repaint scheduling in a
> refresh-aware way?

Yes.

> I'm trying to get a better understanding of how these pieces interact to
> maximize draw time but still hit the glass every frame. The various blog
> posts and documentation out there are pretty clear when it comes to drawing
> to a single physical display, but less so when multiple displays are in use.

Per-output repaint cycles are taken as a given. You can assume that every
compositor does this, and any compositor which doesn't do this is so
hopelessly broken as to not be worth considering.

Cheers,
Daniel
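To picture what "per-output repaint cycles" means in practice, here is a toy calculation, assuming each output simply flips at a fixed period derived from its own refresh rate (in mHz, as wl_output reports it). This is not how Weston or any particular compositor schedules repaints; `toy_output` and both helpers are made-up names, and `refresh_mhz` is assumed to be greater than zero.

```c
#include <stdint.h>

/* Hypothetical per-output timing state. */
struct toy_output {
    int32_t refresh_mhz;   /* refresh rate in mHz, e.g. 60000 for 60 Hz; must be > 0 */
    uint64_t last_flip_ns; /* timestamp of this output's last flip */
};

/* Refresh period in nanoseconds: 10^12 / mHz, rounded down. */
static uint64_t refresh_period_ns(const struct toy_output *o)
{
    return UINT64_C(1000000000000) / (uint64_t)o->refresh_mhz;
}

/* Time of the first flip of this output strictly after now_ns. */
static uint64_t next_flip_ns(const struct toy_output *o, uint64_t now_ns)
{
    uint64_t period = refresh_period_ns(o);
    uint64_t periods;

    if (now_ns < o->last_flip_ns)
        return o->last_flip_ns; /* the recorded flip is still in the future */

    periods = (now_ns - o->last_flip_ns) / period + 1;
    return o->last_flip_ns + periods * period;
}
```

Each output only consults its own state, so a 60 Hz panel and a 144 Hz monitor get different deadlines for the same `now_ns`, and a client's frame callbacks follow whichever output its surface is shown on.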
Re: Refresh rates with multiple monitors
Hi,

On Tue, 13 Jun 2023 at 10:20, Pekka Paalanen wrote:
> On Tue, 13 Jun 2023 01:11:44 + (UTC)
> Joe M wrote:
> > As I understand, there is one global wl_display. Is there always one
> > wl_compositor too?
>
> That is inconsequential.

Yeah, I think the really consequential thing is that a wl_display really just
represents a connection to a Wayland server (aka compositor).

Display targets (e.g. 'the HDMI connector on the left', 'the DSI panel') are
represented by wl_output objects. There is one of those for each output.

Cheers,
Daniel
Re: Why does Java (XWayland / Weston) resize a Window to 1x1 pixel when HDMI is unplugged (and does not resize back when HDMI is plugged)
On Thu, 8 Jun 2023 at 16:54, Martin Petzold wrote: > Am 08.06.23 um 16:58 schrieb Daniel Stone: > > On Thu, 8 Jun 2023 at 14:28, Pekka Paalanen wrote: > >> On Thu, 8 Jun 2023 14:49:37 +0200 >> Martin Petzold wrote: >> > btw. we are using a Weston 9 package from NXP and there may be >> important >> > fixes for our i.MX8 platform in there. >> >> Oh. We cannot support modified Weston, sorry. Significant vendor >> modifications tend to break things, and we have no idea what they do or >> why. Maybe this problem is not because of that, maybe it is, hard to >> guess. > > > The good news is that mainline Linux runs very well on all i.MX6, and most > i.MX8 platforms. You can ditch the NXP BSP and just use a vanilla Yocto > build for your machine. This will have upstream Weston which should solve > your problem. > > Do you mean Linux mainline or Yocto mainline? > > Because we are building from Debian and not from Yocto, for several > reasons. We have a more complex system setup. > Ah, I wasn't aware they also had Debian distributions. Nice. Yes, I mean mainline of upstream Linux + Mesa + Weston (plus GStreamer etc if you want to use that). That's worked very well out of the box for a few years now with no vendor trees required. Cheers, Daniel
Re: Why does Java (XWayland / Weston) resize a Window to 1x1 pixel when HDMI is unplugged (and does not resize back when HDMI is plugged)
Hi, On Thu, 8 Jun 2023 at 14:28, Pekka Paalanen wrote: > On Thu, 8 Jun 2023 14:49:37 +0200 > Martin Petzold wrote: > > btw. we are using a Weston 9 package from NXP and there may be important > > fixes for our i.MX8 platform in there. > > Oh. We cannot support modified Weston, sorry. Significant vendor > modifications tend to break things, and we have no idea what they do or > why. Maybe this problem is not because of that, maybe it is, hard to > guess. The good news is that mainline Linux runs very well on all i.MX6, and most i.MX8 platforms. You can ditch the NXP BSP and just use a vanilla Yocto build for your machine. This will have upstream Weston which should solve your problem. Cheers, Daniel
Re: [RFC] Plane color pipeline KMS uAPI
On Mon, 8 May 2023 at 10:58, Simon Ser wrote: > > On Friday, May 5th, 2023 at 21:53, Daniel Vetter wrote: > > > On Fri, May 05, 2023 at 04:06:26PM +, Simon Ser wrote: > > > On Friday, May 5th, 2023 at 17:28, Daniel Vetter wrote: > > > > > > > Ok no comments from me on the actual color operations and semantics of > > > > all > > > > that, because I have simply nothing to bring to that except confusion > > > > :-) > > > > > > > > Some higher level thoughts instead: > > > > > > > > - I really like that we just go with graph nodes here. I think that was > > > > bound to happen sooner or later with kms (we almost got there with > > > > writeback, and with hindsight maybe should have). > > > > > > I'd really rather not do graphs here. We only need linked lists as > > > Sebastian > > > said. Graphs would significantly add more complexity to this proposal, and > > > I don't think that's a good idea unless there is a strong use-case. > > > > You have a graph, because a graph is just nodes + links. I did _not_ > > propose a full generic graph structure, the link pointer would be in the > > class/type specific structure only. Like how we have the plane->crtc or > > connector->crtc links already like that (which already _is_ is full blown > > graph). > > I really don't get why a pointer in a struct makes plane->crtc a full-blown > graph. There is only a single parent-child link. A plane has a reference to a > CRTC, and nothing more. > > You could say that anything is a graph. Yes, even an isolated struct somewhere > is a graph: one with a single node and no link. But I don't follow what's the > point of explaining everything with a graph when we only need a much simpler > subset of the concept of graphs? > > Putting the graph thing aside, what are you suggesting exactly from a concrete > uAPI point-of-view? Introducing a new struct type? Would it be a colorop > specific struct, or a more generic one? What would be the fields? 
> Why do you think that's necessary and better than the current proposal?
>
> My understanding so far is that you're suggesting introducing something like
> this at the uAPI level:
>
>     struct drm_mode_node {
>         uint32_t id;
>
>         uint32_t children_count;
>         uint32_t *children; // list of child object IDs
>     };

Already too much I think

    struct drm_mode_node {
        struct drm_mode_object base;
        struct drm_private_obj atomic_base;
        enum drm_mode_node_enum type;
    };

The actual graph links would be in the specific type's state structure, like
they are for everything else. And the limits would be on the property type, we
probably need a new DRM_MODE_PROP_OBJECT_ENUM to make the new limitations work
correctly, since the current DRM_MODE_PROP_OBJECT only limits to a specific
type of object, not an explicit list of drm_mode_object.id.

You might not even need a node subclass for the state stuff, that would
directly be a drm_color_op_state that only embeds drm_private_state.

Another uapi difference is that the new kms objects would be of type
DRM_MODE_OBJECT_NODE, and would always have a "class" property.

> I don't think this is a good idea for multiple reasons. First, this is
> overkill: we don't need this complexity, and this complexity will make it more
> difficult to reason about the color pipeline. This is a premature abstraction,
> one we don't need right now, and one I haven't heard a potential future
> use-case for. Sure, one can kill an ant with a sledgehammer if they'd like, but
> that's not the right tool for the job.
>
> Second, this will make user-space miserable. User-space already has a tricky
> task to achieve to translate its abstract descriptive color pipeline to our
> proposed simple list of color operations. If we expose a full-blown graph, then
> the user-space logic will need to handle arbitrary graphs. This will have a
> significant cost (on implementation and testing), which we will be paying in
> terms of time spent and in terms of bugs.

The color op pipeline would still be linear. I did not ask for a non-linear
one.

> Last, this kind of generic "node" struct is at odds with existing KMS object
> types. So far, KMS objects are concrete like CRTC, connector, plane, etc.
> "Node" is abstract. This is inconsistent.

Yeah, I think we should change that. That's essentially the full extent of my
proposal. The classes + possible_foo mask approach just always felt rather
brittle to me (and there's plenty of userspace out there to prove that's the
case), going more explicit with the links with enumerated combos feels better.

P
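To make the "still linear" point a bit more concrete, here is a hedged sketch of what walking such a pipeline could look like once each node's next link has been read out, whichever object type ends up carrying it. Everything here is hypothetical: `toy_colorop`, the `TOY_*` type names and both helpers are invented for illustration, and treating a next ID of 0 as the end of the pipeline is an assumption, not something settled in this thread.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_colorop_type { TOY_BYPASS, TOY_CURVE_1D, TOY_MATRIX, TOY_LUT_3D };

/* Hypothetical userspace copy of one color operation object. */
struct toy_colorop {
    uint32_t id;
    enum toy_colorop_type type;
    uint32_t next; /* object ID of the next op; 0 ends the pipeline (assumption) */
};

/* Look up an op by object ID in a flat array; NULL if not found. */
static const struct toy_colorop *toy_find(const struct toy_colorop *ops,
                                          size_t n, uint32_t id)
{
    for (size_t i = 0; i < n; i++)
        if (ops[i].id == id)
            return &ops[i];
    return NULL;
}

/*
 * Follow the "next" links starting at first_id and write the visited IDs to
 * out[]. Returns the pipeline length, or -1 on a dangling link, a cycle, or
 * when out[] is too small.
 */
static int toy_walk(const struct toy_colorop *ops, size_t n, uint32_t first_id,
                    uint32_t *out, size_t out_len)
{
    size_t len = 0;
    uint32_t id = first_id;

    while (id != 0) {
        const struct toy_colorop *op = toy_find(ops, n, id);

        /* A list over n objects can hold at most n nodes, else it loops. */
        if (!op || len == n || len == out_len)
            return -1;
        out[len++] = op->id;
        id = op->next;
    }
    return (int)len;
}
```

With a single next link per node, userspace never needs more than this loop; handling arbitrary graphs would only become necessary if nodes could have several outgoing links, which nobody in the thread is asking for.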
Re: [RFC] Plane color pipeline KMS uAPI
On Mon, 8 May 2023 at 10:24, Pekka Paalanen wrote: > > On Fri, 5 May 2023 21:51:41 +0200 > Daniel Vetter wrote: > > > On Fri, May 05, 2023 at 05:57:37PM +0200, Sebastian Wick wrote: > > > On Fri, May 5, 2023 at 5:28 PM Daniel Vetter wrote: > > > > > > > > On Thu, May 04, 2023 at 03:22:59PM +, Simon Ser wrote: > > > > > Hi all, > > > > > > > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the > > > > > color > > > > > pipeline before blending, ie. after a pixel is tapped from a plane's > > > > > framebuffer and before it's blended with other planes. With this new > > > > > uAPI we > > > > > aim to reduce the battery life impact of color management and HDR on > > > > > mobile > > > > > devices, to improve performance and to decrease latency by skipping > > > > > composition on the 3D engine. This proposal is the result of > > > > > discussions at > > > > > the Red Hat HDR hackfest [1] which took place a few days ago. > > > > > Engineers > > > > > familiar with the AMD, Intel and NVIDIA hardware have participated in > > > > > the > > > > > discussion. > > > > > > > > > > This proposal takes a prescriptive approach instead of a descriptive > > > > > approach. > > > > > Drivers describe the available hardware blocks in terms of low-level > > > > > mathematical operations, then user-space configures each block. We > > > > > decided > > > > > against a descriptive approach where user-space would provide a > > > > > high-level > > > > > description of the colorspace and other parameters: we want to give > > > > > more > > > > > control and flexibility to user-space, e.g. to be able to replicate > > > > > exactly the > > > > > color pipeline with shaders and switch between shaders and KMS > > > > > pipelines > > > > > seamlessly, and to avoid forcing user-space into a particular color > > > > > management > > > > > policy. > > > > > > > > Ack on the prescriptive approach, but generic imo. 
Descriptive pretty > > > > much > > > > means you need the shaders at the same api level for fallback purposes, > > > > and we're not going to have that ever in kms. That would need something > > > > like hwc in userspace to work. > > > > > > Which would be nice to have but that would be forcing a specific color > > > pipeline on everyone and we explicitly want to avoid that. There are > > > just too many trade-offs to consider. > > > > > > > And not generic in it's ultimate consquence would mean we just do a blob > > > > for a crtc with all the vendor register stuff like adf (android display > > > > framework) does, because I really don't see a point in trying a > > > > generic-looking-but-not vendor uapi with each color op/stage split out. > > > > > > > > So from very far and pure gut feeling, this seems like a good middle > > > > ground in the uapi design space we have here. > > > > > > Good to hear! > > > > > > > > We've decided against mirroring the existing CRTC properties > > > > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color > > > > > management > > > > > pipeline can significantly differ between vendors and this approach > > > > > cannot > > > > > accurately abstract all hardware. In particular, the availability, > > > > > ordering and > > > > > capabilities of hardware blocks is different on each display engine. > > > > > So, we've > > > > > decided to go for a highly detailed hardware capability discovery. > > > > > > > > > > This new uAPI should not be in conflict with existing standard KMS > > > > > properties, > > > > > since there are none which control the pre-blending color pipeline at > > > > > the > > > > > moment. It does conflict with any vendor-specific properties like > > > > > NV_INPUT_COLORSPACE or the patches on the mailing list adding > > > > > AMD-specific > > > > > properties. Drivers will need to either reject atomic commits > > > > > co
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, May 05, 2023 at 04:06:26PM +, Simon Ser wrote:
> On Friday, May 5th, 2023 at 17:28, Daniel Vetter wrote:
>
> > Ok no comments from me on the actual color operations and semantics of all
> > that, because I have simply nothing to bring to that except confusion :-)
> >
> > Some higher level thoughts instead:
> >
> > - I really like that we just go with graph nodes here. I think that was
> >   bound to happen sooner or later with kms (we almost got there with
> >   writeback, and with hindsight maybe should have).
>
> I'd really rather not do graphs here. We only need linked lists as Sebastian
> said. Graphs would significantly add more complexity to this proposal, and
> I don't think that's a good idea unless there is a strong use-case.

You have a graph, because a graph is just nodes + links. I did _not_ propose a
full generic graph structure, the link pointer would be in the class/type
specific structure only. Like how we have the plane->crtc or connector->crtc
links already like that (which already _is_ a full-blown graph).

Maybe explain what exactly you're thinking under "do graphs here" so I
understand what you mean differently than me?
-Daniel

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Re: [RFC] Plane color pipeline KMS uAPI
On Fri, May 05, 2023 at 05:57:37PM +0200, Sebastian Wick wrote: > On Fri, May 5, 2023 at 5:28 PM Daniel Vetter wrote: > > > > On Thu, May 04, 2023 at 03:22:59PM +, Simon Ser wrote: > > > Hi all, > > > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the > > > color > > > pipeline before blending, ie. after a pixel is tapped from a plane's > > > framebuffer and before it's blended with other planes. With this new uAPI > > > we > > > aim to reduce the battery life impact of color management and HDR on > > > mobile > > > devices, to improve performance and to decrease latency by skipping > > > composition on the 3D engine. This proposal is the result of discussions > > > at > > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers > > > familiar with the AMD, Intel and NVIDIA hardware have participated in the > > > discussion. > > > > > > This proposal takes a prescriptive approach instead of a descriptive > > > approach. > > > Drivers describe the available hardware blocks in terms of low-level > > > mathematical operations, then user-space configures each block. We decided > > > against a descriptive approach where user-space would provide a high-level > > > description of the colorspace and other parameters: we want to give more > > > control and flexibility to user-space, e.g. to be able to replicate > > > exactly the > > > color pipeline with shaders and switch between shaders and KMS pipelines > > > seamlessly, and to avoid forcing user-space into a particular color > > > management > > > policy. > > > > Ack on the prescriptive approach, but generic imo. Descriptive pretty much > > means you need the shaders at the same api level for fallback purposes, > > and we're not going to have that ever in kms. That would need something > > like hwc in userspace to work. > > Which would be nice to have but that would be forcing a specific color > pipeline on everyone and we explicitly want to avoid that. 
There are > just too many trade-offs to consider. > > > And not generic in it's ultimate consquence would mean we just do a blob > > for a crtc with all the vendor register stuff like adf (android display > > framework) does, because I really don't see a point in trying a > > generic-looking-but-not vendor uapi with each color op/stage split out. > > > > So from very far and pure gut feeling, this seems like a good middle > > ground in the uapi design space we have here. > > Good to hear! > > > > We've decided against mirroring the existing CRTC properties > > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management > > > pipeline can significantly differ between vendors and this approach cannot > > > accurately abstract all hardware. In particular, the availability, > > > ordering and > > > capabilities of hardware blocks is different on each display engine. So, > > > we've > > > decided to go for a highly detailed hardware capability discovery. > > > > > > This new uAPI should not be in conflict with existing standard KMS > > > properties, > > > since there are none which control the pre-blending color pipeline at the > > > moment. It does conflict with any vendor-specific properties like > > > NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific > > > properties. Drivers will need to either reject atomic commits configuring > > > both > > > uAPIs, or alternatively we could add a DRM client cap which hides the > > > vendor > > > properties and shows the new generic properties when enabled. > > > > > > To use this uAPI, first user-space needs to discover hardware > > > capabilities via > > > KMS objects and properties, then user-space can configure the hardware > > > via an > > > atomic commit. This works similarly to the existing KMS uAPI, e.g. planes. > > > > > > Our proposal introduces a new "color_pipeline" plane property, and a new > > > KMS > > > object type, "COLOROP" (short for color operation). 
The "color_pipeline" > > > plane > > > property is an enum, each enum entry represents a color pipeline > > > supported by > > > the hardware. The special zero entry indicates that the pipeline is in > > > "bypass"/"no-op" mode. For instance, the following plane properties > > > describe
Re: [RFC] Plane color pipeline KMS uAPI
just go with graph nodes here. I think that was bound to happen sooner or
  later with kms (we almost got there with writeback, and with hindsight maybe
  should have).

- Since there's other use-cases for graph nodes (maybe scaler modes, or
  histogram samplers for adaptive backlight, or blending that goes beyond the
  stacked alpha blending we have now) I think we should make this all fairly
  generic:
  * Add a new graph node kms object type.
  * Add a class type so that userspace knows which graph nodes it must
    understand for a feature (like "ColorOp" on planes here), and which it can
    ignore (like perhaps a scaler node to control the interpolation)
  * Probably need to adjust the object property type. Currently that accepts
    any object of a given type (crtc, fb, blob are the major ones). I think for
    these graph nodes we want an explicit enumeration of the possible next
    objects. In kms thus far we've done that with the separate possible_* mask
    properties, but they're cumbersome.
  * It sounds like for now we only have immutable next pointers, so that would
    simplify the first iteration, but should probably anticipate all this.

- I think the graph node should be built on top of the driver private atomic
  obj/state stuff, and could then be further subclassed for specific types.
  It's a bit much stacking, but avoids too much wheel reinventing, and the
  worst boilerplate can be avoided with some macros that combine the pointer
  chasing with the container_of upcast. With that you can easily build some
  helpers to walk the graph for a crtc or plane or whatever really.

- I guess core atomic code should at least do the graph link validation and
  basic things like that, probably not really more to do. And validating the
  standard properties on some graph nodes ofc.

- I have no idea how we should support the standardization of the state
  structures. Doing a separate subclass for each type sounds extremely
  painful, but unions otoh are ugly. Ideally type-indexed and type safe union
  but C isn't good enough for that. I do think that we should keep up the goal
  that standard properties are decoded into state structures in core atomic
  code, and not in each implementation individually.

- I think the only other precedent for something like this is the media
  control api in the media subsystem. I think it'd be really good to get
  someone like Laurent to ack the graph node infrastructure to make sure we're
  not missing any lesson they've learned already. If there's anything else we
  should pull these folks in too ofc.

For merge plan I dropped some ideas already on Harry's rfc for vendor-private
properties, the only thing to add is that we might want to type up the
consensus plan into a merged doc like Documentation/gpu/rfc/hdr-plane.rst or
whatever you feel like for a name.

Cheers, Daniel

>
>     Color operation 42
>     ├─ "type": enum {Bypass, 1D curve} = 1D curve
>     ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT
>     ├─ "lut_size": immutable range = 4096
>     ├─ "lut_data": blob
>     └─ "next": immutable color operation ID = 43
>
> To configure this hardware block, user-space can fill a KMS blob with 4096 u32
> entries, then set "lut_data" to the blob ID. Other color operation types might
> have different properties.
>
> Here is another example with a 3D LUT:
>
>     Color operation 42
>     ├─ "type": enum {Bypass, 3D LUT} = 3D LUT
>     ├─ "lut_size": immutable range = 33
>     ├─ "lut_data": blob
>     └─ "next": immutable color operation ID = 43
>
> And one last example with a matrix:
>
>     Color operation 42
>     ├─ "type": enum {Bypass, Matrix} = Matrix
>     ├─ "matrix_data": blob
>     └─ "next": immutable color operation ID = 43
>
> [Simon note: having "Bypass" in the "type" enum, and making "type" mutable is
> a bit weird. Maybe we can just add an "active"/"bypass" boolean property on
> blocks which can be bypassed instead.]
>
> [Jonas note: perhaps a single "data" property for both LUTs and matrices
> would make more sense. And a "size" prop for both 1D and 3D LUTs.]
>
> If some hardware supports re-ordering operations in the color pipeline, the
> driver can expose multiple pipelines with different operation ordering, and
> user-space can pick the ordering it prefers by selecting the right pipeline.
> The same scheme can be used to expose hardware blocks supporting multiple
> precision levels.
>
> That's pretty much all there is to it, but as always the devil is in the
> details.
>
> First, we realized that we need a way to indicate where the scaling operation
> is happeni
Re: Weston 10+ and GLES2 compatibility
Hi Daniel,

On Fri, 10 Mar 2023 at 14:28, Levin, Daniel wrote:
> We are currently attempting to update from Weston 9.0.0 to Weston 10+ and
> facing issues with GLES2 compatibility at both build time and run time.
>
> For instance, gl_renderer_setup() exits with error if GL_EXT_unpack_subimage
> is not present. Other code explicitly includes GLES3/gl3.h and uses pixel
> formats from GL_EXT_texture_storage.
>
> We are using Mali 400 with proprietary Arm userspace GL drivers, which
> supports only GLES2 without extensions above.
>
> Could you please clarify whether for Weston 10+ GLES3 is now mandatory
> dependency? Was this highlighted in any release notes?
>
> If so, then we have to freeze Weston on version 9.0.0.

That did indeed change. We require GLES3 headers at build time (no requirement
for ES3 runtime contexts), and we do require GL_EXT_unpack_subimage.

It's safe to use newer header sets than your driver supports; you could just
take the headers directly from Khronos, or from Mesa, and build against those
whilst using your other driver at runtime.

GL_EXT_unpack_subimage also has no hardware dependency. It's literally about
two lines to implement in software. If your proprietary driver can't support
this, I strongly recommend switching to the Lima driver shipped as part of the
Linux kernel and Mesa.

> Could you please explain is it safe to keep updating all other Wayland
> components (client, protocols, xwayland), and keep only Weston compositor
> downgraded to 9.0.0? I tested and see that such combination works properly.
> Though I am not sure if that is the correct approach, or it might cause
> issues. And instead we have to downgrade all the Wayland components to the
> same older version (in our case: client 1.19, protocols 1.21, weston 9.0.0).

It's completely safe to use newer wayland + wayland-protocols + etc with an
older Weston.

Cheers,
Daniel
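For a sense of why GL_EXT_unpack_subimage needs nothing from the hardware: its GL_UNPACK_ROW_LENGTH, GL_UNPACK_SKIP_ROWS and GL_UNPACK_SKIP_PIXELS state only describes a sub-rectangle inside a larger image in client memory. A GL implementation (or, in principle, the caller) can get the same effect by repacking the rows into a tight buffer before the upload. The following is a rough sketch of that repacking and not code from Weston or from any driver; `repack_subimage` and its parameters are made up for illustration, and it assumes 4 bytes per pixel and no extra row padding from the unpack alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BYTES_PER_PIXEL 4 /* assumption: RGBA8888-style formats only */

/*
 * Copy a width x height sub-rectangle out of a larger source image into a
 * tightly packed destination buffer.
 *
 * row_length, skip_pixels and skip_rows play the roles of
 * GL_UNPACK_ROW_LENGTH, GL_UNPACK_SKIP_PIXELS and GL_UNPACK_SKIP_ROWS.
 * row_length must be non-zero here (GL itself treats 0 as "use width").
 */
static void repack_subimage(uint8_t *dst, const uint8_t *src,
                            size_t row_length, size_t skip_pixels,
                            size_t skip_rows, size_t width, size_t height)
{
    const size_t src_stride = row_length * BYTES_PER_PIXEL;
    const size_t dst_stride = width * BYTES_PER_PIXEL;
    const uint8_t *first = src + skip_rows * src_stride
                               + skip_pixels * BYTES_PER_PIXEL;

    /* One memcpy per row: step by the source stride, write tightly. */
    for (size_t y = 0; y < height; y++)
        memcpy(dst + y * dst_stride, first + y * src_stride, dst_stride);
}
```

The tightly packed result could then go to a plain GLES2 `glTexSubImage2D()` call, which is roughly the fallback a renderer would need if it wanted to run without the extension.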
Weston 10+ and GLES2 compatibility
Hello, We are currently attempting to update from Weston 9.0.0 to Weston 10+ and facing issues with GLES2 compatibility at both build time and run time. For instance, gl_renderer_setup() exits with error if GL_EXT_unpack_subimage is not present. Other code explicitly includes GLES3/gl3.h and uses pixel formats from GL_EXT_texture_storage. We are using Mali 400 with proprietary Arm userspace GL drivers, which supports only GLES2 without extensions above. Could you please clarify whether for Weston 10+ GLES3 is now mandatory dependency? Was this highlighted in any release notes? If so, then we have to freeze Weston on version 9.0.0. Could you please explain is it safe to keep updating all other Wayland components (client, protocols, xwayland), and keep only Weston compositor downgraded to 9.0.0? I tested and see that such combination works properly. Though I am not sure if that is the correct approach, or it might cause issues. And instead we have to downgrade all the Wayland components to the same older version (in our case: client 1.19, protocols 1.21, weston 9.0.0). Thanks, Daniel
Re: Weston does not start with "Failed to open device: No such file or directory, Try again..."
Hi Martin, On Fri, 17 Feb 2023 at 11:27, Martin Petzold wrote: > Feb 17 12:16:24 tavla DISPLAY Wayland[957]: [12:16:24.624] Loading module > '/usr/lib/aarch64-linux-gnu/libweston-9/g2d-renderer.so' > Feb 17 12:16:25 tavla DISPLAY Wayland[957]: [ 1] Failed to open device: > No such file or directory, Try again... > Feb 17 12:16:26 tavla DISPLAY Wayland[957]: [ 2] Failed to open device: > No such file or directory, Try again... > Feb 17 12:16:27 tavla DISPLAY Wayland[957]: [ 3] Failed to open device: > No such file or directory, Try again... > Feb 17 12:16:28 tavla DISPLAY Wayland[957]: [ 4] Failed to open device: > No such file or directory, Try again... > Feb 17 12:16:28 tavla DISPLAY Wayland[957]: [ 5] _OpenDevice(1249): > FATAL: Failed to open device, errno=No such file or directory. g2d-renderer comes from the NXP fork of Weston, customised to work on their downstream kernels with their libraries. It's presumably looking for some kind of G2D device node which it can't see for some reason. If you're using an upstream kernel then vanilla Weston 9.0.0 (with no NXP patches) works great there on i.MX devices. If you're using a downstream kernel/GLES/Weston/etc from NXP, then I'm afraid you need to contact them for support. Cheers, Daniel
Re: [RFC PATCH v3 0/3] Support for Solid Fill Planes
On Fri, Jan 06, 2023 at 04:33:04PM -0800, Abhinav Kumar wrote: > Hi Daniel > > Thanks for looking into this series. > > On 1/6/2023 1:49 PM, Dmitry Baryshkov wrote: > > On Fri, 6 Jan 2023 at 20:41, Daniel Vetter wrote: > > > > > > On Fri, Jan 06, 2023 at 05:43:23AM +0200, Dmitry Baryshkov wrote: > > > > On Fri, 6 Jan 2023 at 02:38, Jessica Zhang > > > > wrote: > > > > > > > > > > > > > > > > > > > > On 1/5/2023 3:33 AM, Daniel Vetter wrote: > > > > > > On Wed, Jan 04, 2023 at 03:40:33PM -0800, Jessica Zhang wrote: > > > > > > > Introduce and add support for a solid_fill property. When the > > > > > > > solid_fill > > > > > > > property is set, and the framebuffer is set to NULL, memory fetch > > > > > > > will be > > > > > > > disabled. > > > > > > > > > > > > > > In addition, loosen the NULL FB checks within the atomic commit > > > > > > > callstack > > > > > > > to allow a NULL FB when the solid_fill property is set and add FB > > > > > > > checks > > > > > > > in methods where the FB was previously assumed to be non-NULL. > > > > > > > > > > > > > > Finally, have the DPU driver use drm_plane_state.solid_fill and > > > > > > > instead of > > > > > > > dpu_plane_state.color_fill, and add extra checks in the DPU > > > > > > > atomic commit > > > > > > > callstack to account for a NULL FB in cases where solid_fill is > > > > > > > set. > > > > > > > > > > > > > > Some drivers support hardware that have optimizations for solid > > > > > > > fill > > > > > > > planes. This series aims to expose these capabilities to > > > > > > > userspace as > > > > > > > some compositors have a solid fill flag (ex. SOLID_COLOR in the > > > > > > > Android > > > > > > > hardware composer HAL) that can be set by apps like the Android > > > > > > > Gears > > > > > > > app. 
> > > > > > > > > > > > > > Userspace can set the solid_fill property to a blob containing the > > > > > > > appropriate version number and solid fill color (in RGB323232 > > > > > > > format) and > > > > > > > setting the framebuffer to NULL. > > > > > > > > > > > > > > Note: Currently, there's only one version of the solid_fill blob > > > > > > > property. > > > > > > > However if other drivers want to support a similar feature, but > > > > > > > require > > > > > > > more than just the solid fill color, they can extend this feature > > > > > > > by > > > > > > > creating additional versions of the drm_solid_fill struct. > > > > > > > > > > > > > > Changes in V2: > > > > > > > - Dropped SOLID_FILL_FORMAT property (Simon) > > > > > > > - Switched to implementing solid_fill property as a blob (Simon, > > > > > > > Dmitry) > > > > > > > - Changed to checks for if solid_fill_blob is set (Dmitry) > > > > > > > - Abstracted (plane_state && !solid_fill_blob) checks to helper > > > > > > > method > > > > > > > (Dmitry) > > > > > > > - Removed DPU_PLANE_COLOR_FILL_FLAG > > > > > > > - Fixed whitespace and indentation issues (Dmitry) > > > > > > > > > > > > Now that this is a blob, I do wonder again whether it's not cleaner > > > > > > to set > > > > > > the blob as the FB pointer. Or create some kind other kind of > > > > > > special data > > > > > > source objects (because solid fill is by far not the only such > > > > > > thing). > > > > > > > > > > > > We'd still end up in special cases like when userspace that doesn't > > > > > > understand solid fill tries to read out such a framebuffer, but > > > > > > these > > > > > > cases already exist anyway for lack of priviledges. > > >
Re: [RFC PATCH v3 0/3] Support for Solid Fill Planes
On Fri, Jan 06, 2023 at 05:43:23AM +0200, Dmitry Baryshkov wrote: > On Fri, 6 Jan 2023 at 02:38, Jessica Zhang wrote: > > > > > > > > On 1/5/2023 3:33 AM, Daniel Vetter wrote: > > > On Wed, Jan 04, 2023 at 03:40:33PM -0800, Jessica Zhang wrote: > > >> Introduce and add support for a solid_fill property. When the solid_fill > > >> property is set, and the framebuffer is set to NULL, memory fetch will be > > >> disabled. > > >> > > >> In addition, loosen the NULL FB checks within the atomic commit callstack > > >> to allow a NULL FB when the solid_fill property is set and add FB checks > > >> in methods where the FB was previously assumed to be non-NULL. > > >> > > >> Finally, have the DPU driver use drm_plane_state.solid_fill and instead > > >> of > > >> dpu_plane_state.color_fill, and add extra checks in the DPU atomic commit > > >> callstack to account for a NULL FB in cases where solid_fill is set. > > >> > > >> Some drivers support hardware that have optimizations for solid fill > > >> planes. This series aims to expose these capabilities to userspace as > > >> some compositors have a solid fill flag (ex. SOLID_COLOR in the Android > > >> hardware composer HAL) that can be set by apps like the Android Gears > > >> app. > > >> > > >> Userspace can set the solid_fill property to a blob containing the > > >> appropriate version number and solid fill color (in RGB323232 format) and > > >> setting the framebuffer to NULL. > > >> > > >> Note: Currently, there's only one version of the solid_fill blob > > >> property. > > >> However if other drivers want to support a similar feature, but require > > >> more than just the solid fill color, they can extend this feature by > > >> creating additional versions of the drm_solid_fill struct. 
> > >> > > >> Changes in V2: > > >> - Dropped SOLID_FILL_FORMAT property (Simon) > > >> - Switched to implementing solid_fill property as a blob (Simon, Dmitry) > > >> - Changed to checks for if solid_fill_blob is set (Dmitry) > > >> - Abstracted (plane_state && !solid_fill_blob) checks to helper method > > >>(Dmitry) > > >> - Removed DPU_PLANE_COLOR_FILL_FLAG > > >> - Fixed whitespace and indentation issues (Dmitry) > > > > > > Now that this is a blob, I do wonder again whether it's not cleaner to set > > > the blob as the FB pointer. Or create some kind other kind of special data > > > source objects (because solid fill is by far not the only such thing). > > > > > > We'd still end up in special cases like when userspace that doesn't > > > understand solid fill tries to read out such a framebuffer, but these > > > cases already exist anyway for lack of priviledges. > > > > > > So I still think that feels like the more consistent way to integrate this > > > feature. Which doesn't mean it has to happen like that, but the > > > patches/cover letter should at least explain why we don't do it like this. > > > > Hi Daniel, > > > > IIRC we were facing some issues with this check [1] when trying to set > > FB to a PROP_BLOB instead. Which is why we went with making it a > > separate property instead. Will mention this in the cover letter. > > What kind of issues? Could you please describe them? We switched from bitmask to enum style for prop types, which means it's not possible to express with the current uapi a property which accepts both an object or a blob. Which yeah sucks a bit ... But! blob properties are kms objects (like framebuffers), so it should be possible to stuff a blob into an object property as-is. Of course you need to update the validation code to make sure we accept either an fb or a blob for the internal representation. But that kind of split internally is required no matter what I think. 
-Daniel > > > > > [1] > > https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/drm_property.c#L71 > > > > Thanks, > > > > Jessica Zhang > > > > > -Daniel > > > > > >> > > >> Changes in V3: > > >> - Fixed some logic errors in atomic checks (Dmitry) > > >> - Introduced drm_plane_has_visible_data() and drm_atomic_check_fb() > > >> helper > > >>methods (Dmitry) > > >> > > >> Jessica Zhang (3): > > >>drm: Introduce solid fill propert
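A rough sketch of the "accept either an fb or a blob" validation described in the reply above. This is a toy model in Python, not kernel code: ObjType, KmsObject, PlaneSource and set_plane_source are hypothetical names made up for illustration, and it only assumes what the mail states, namely that blobs and framebuffers are both KMS objects and that the internal representation has to keep them apart.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class ObjType(Enum):
    """Hypothetical stand-ins for KMS mode object types."""
    FB = auto()
    BLOB = auto()
    CRTC = auto()


@dataclass(frozen=True)
class KmsObject:
    obj_id: int
    obj_type: ObjType


@dataclass(frozen=True)
class PlaneSource:
    """Internal representation: at most one of the two fields is set."""
    fb: Optional[KmsObject] = None
    solid_fill: Optional[KmsObject] = None


def set_plane_source(obj: Optional[KmsObject]) -> PlaneSource:
    """Validate the object handed to the plane's (object-typed) source property."""
    if obj is None:
        return PlaneSource()              # nothing attached to the plane
    if obj.obj_type is ObjType.FB:
        return PlaneSource(fb=obj)        # regular framebuffer path
    if obj.obj_type is ObjType.BLOB:
        return PlaneSource(solid_fill=obj)  # blob carrying the fill colour
    raise ValueError(f"object {obj.obj_id} is neither a framebuffer nor a blob")
```

The point of the split return type is that everything downstream still has to ask "fb or solid fill?", whichever way the uapi ends up exposing it.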
Re: [RFC PATCH v3 0/3] Support for Solid Fill Planes
On Wed, Jan 04, 2023 at 03:40:33PM -0800, Jessica Zhang wrote: > Introduce and add support for a solid_fill property. When the solid_fill > property is set, and the framebuffer is set to NULL, memory fetch will be > disabled. > > In addition, loosen the NULL FB checks within the atomic commit callstack > to allow a NULL FB when the solid_fill property is set and add FB checks > in methods where the FB was previously assumed to be non-NULL. > > Finally, have the DPU driver use drm_plane_state.solid_fill and instead of > dpu_plane_state.color_fill, and add extra checks in the DPU atomic commit > callstack to account for a NULL FB in cases where solid_fill is set. > > Some drivers support hardware that have optimizations for solid fill > planes. This series aims to expose these capabilities to userspace as > some compositors have a solid fill flag (ex. SOLID_COLOR in the Android > hardware composer HAL) that can be set by apps like the Android Gears > app. > > Userspace can set the solid_fill property to a blob containing the > appropriate version number and solid fill color (in RGB323232 format) and > setting the framebuffer to NULL. > > Note: Currently, there's only one version of the solid_fill blob property. > However if other drivers want to support a similar feature, but require > more than just the solid fill color, they can extend this feature by > creating additional versions of the drm_solid_fill struct. > > Changes in V2: > - Dropped SOLID_FILL_FORMAT property (Simon) > - Switched to implementing solid_fill property as a blob (Simon, Dmitry) > - Changed to checks for if solid_fill_blob is set (Dmitry) > - Abstracted (plane_state && !solid_fill_blob) checks to helper method > (Dmitry) > - Removed DPU_PLANE_COLOR_FILL_FLAG > - Fixed whitespace and indentation issues (Dmitry) Now that this is a blob, I do wonder again whether it's not cleaner to set the blob as the FB pointer. 
Or create some other kind of special data source object (because solid fill is by far not the only such thing). We'd still end up in special cases like when userspace that doesn't understand solid fill tries to read out such a framebuffer, but these cases already exist anyway for lack of privileges. So I still think that feels like the more consistent way to integrate this feature. Which doesn't mean it has to happen like that, but the patches/cover letter should at least explain why we don't do it like this. -Daniel > > Changes in V3: > - Fixed some logic errors in atomic checks (Dmitry) > - Introduced drm_plane_has_visible_data() and drm_atomic_check_fb() helper > methods (Dmitry) > > Jessica Zhang (3): > drm: Introduce solid fill property for drm plane > drm: Adjust atomic checks for solid fill color > drm/msm/dpu: Use color_fill property for DPU planes > > drivers/gpu/drm/drm_atomic.c | 136 +- > drivers/gpu/drm/drm_atomic_helper.c | 34 +++--- > drivers/gpu/drm/drm_atomic_state_helper.c | 9 ++ > drivers/gpu/drm/drm_atomic_uapi.c | 59 ++ > drivers/gpu/drm/drm_blend.c | 17 +++ > drivers/gpu/drm/drm_plane.c | 8 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 9 +- > drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 65 +++ > include/drm/drm_atomic_helper.h | 5 +- > include/drm/drm_blend.h | 1 + > include/drm/drm_plane.h | 62 ++ > 11 files changed, 302 insertions(+), 103 deletions(-) > > -- > 2.38.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm/atomic: add quirks for blind save/restore
On Thu, Nov 17, 2022 at 07:54:40AM +, Simon Ser wrote: > Two quirks to make blind atomic save/restore [1] work correctly: > > - Mark the DPMS property as immutable for atomic clients, since > atomic clients cannot change it. > - Allow user-space to set content protection to "enabled", interpret > it as "desired". > > [1]: https://gitlab.freedesktop.org/wlroots/wlroots/-/merge_requests/3794 > > Signed-off-by: Simon Ser Reviewed-by: Daniel Vetter I think a doc patch which documents the guarantees we're trying to make here and that they're uapi would be really nice. Maybe somewhere in the KMS properties section in the docs. -Daniel > --- > > I don't have the motivation to write IGT tests for this. > > drivers/gpu/drm/drm_atomic_uapi.c | 5 +++-- > drivers/gpu/drm/drm_property.c| 7 +++ > 2 files changed, 10 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/drm_atomic_uapi.c > b/drivers/gpu/drm/drm_atomic_uapi.c > index c06d0639d552..95363aac7f69 100644 > --- a/drivers/gpu/drm/drm_atomic_uapi.c > +++ b/drivers/gpu/drm/drm_atomic_uapi.c > @@ -741,8 +741,9 @@ static int drm_atomic_connector_set_property(struct > drm_connector *connector, > state->scaling_mode = val; > } else if (property == config->content_protection_property) { > if (val == DRM_MODE_CONTENT_PROTECTION_ENABLED) { > - drm_dbg_kms(dev, "only drivers can set CP Enabled\n"); > - return -EINVAL; > + /* Degrade ENABLED to DESIRED so that blind atomic > + * save/restore works as intended. 
*/ > + val = DRM_MODE_CONTENT_PROTECTION_DESIRED; > } > state->content_protection = val; > } else if (property == config->hdcp_content_type_property) { > diff --git a/drivers/gpu/drm/drm_property.c b/drivers/gpu/drm/drm_property.c > index dfec479830e4..dde42986f8cb 100644 > --- a/drivers/gpu/drm/drm_property.c > +++ b/drivers/gpu/drm/drm_property.c > @@ -474,7 +474,14 @@ int drm_mode_getproperty_ioctl(struct drm_device *dev, > return -ENOENT; > > strscpy_pad(out_resp->name, property->name, DRM_PROP_NAME_LEN); > + > out_resp->flags = property->flags; > + if (file_priv->atomic && property == dev->mode_config.dpms_property) { > + /* Quirk: indicate that the legacy DPMS property is not > + * writable from atomic user-space, so that blind atomic > + * save/restore works as intended. */ > + out_resp->flags |= DRM_MODE_PROP_IMMUTABLE; > + } > > value_count = property->num_values; > values_ptr = u64_to_user_ptr(out_resp->values_ptr); > -- > 2.38.1 > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
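To illustrate why the two quirks in the patch above matter for blind save/restore, here is a small simulation. It is only a sketch: FakeConnector, blind_save and blind_restore are invented for this example and do not correspond to any libdrm or wlroots API. The `quirks` switch models the kernel with and without the patch, and the flag and content-protection values are assumed to match drm_mode.h.

```python
PROP_IMMUTABLE = 1 << 2  # assumed to match DRM_MODE_PROP_IMMUTABLE
CP_UNDESIRED, CP_DESIRED, CP_ENABLED = 0, 1, 2


class FakeConnector:
    """Toy connector; `quirks` toggles the two behaviours from the patch."""

    def __init__(self, quirks: bool):
        self.quirks = quirks
        self.values = {"DPMS": 0, "Content Protection": CP_ENABLED, "scaling mode": 1}

    def get_properties(self) -> dict:
        """What an atomic client reads back: name -> (flags, value)."""
        props = {}
        for name, value in self.values.items():
            flags = 0
            if name == "DPMS" and self.quirks:
                flags |= PROP_IMMUTABLE  # quirk 1: hide DPMS from blind restore
            props[name] = (flags, value)
        return props

    def atomic_set(self, name: str, value: int) -> None:
        if name == "DPMS":
            raise ValueError("atomic clients cannot set DPMS")
        if name == "Content Protection" and value == CP_ENABLED:
            if not self.quirks:
                raise ValueError("only drivers can set CP Enabled")
            value = CP_DESIRED  # quirk 2: degrade ENABLED to DESIRED
        self.values[name] = value


def blind_save(conn: FakeConnector) -> dict:
    """Snapshot every property without knowing what any of them means."""
    return conn.get_properties()


def blind_restore(conn: FakeConnector, saved: dict) -> None:
    """Write back everything that is not flagged immutable."""
    for name, (flags, value) in saved.items():
        if flags & PROP_IMMUTABLE:
            continue
        conn.atomic_set(name, value)
```

With `quirks=False` the restore fails on the first property it cannot write; with `quirks=True` the same generic loop completes, which is the guarantee the review asks to have documented as uapi.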
Re: Meeting (BOF) at Plumbers Dublin to discuss backlight brightness as connector object property RFC?
On Fri, 9 Sept 2022 at 12:50, Simon Ser wrote: > On Friday, September 9th, 2022 at 12:23, Hans de Goede < > hdego...@redhat.com> wrote: > > "people using > > non fully integrated desktop environments like e.g. sway often use custom > > scripts binded to hotkeys to get functionality like the brightness > > up/down keyboard hotkeys changing the brightness. This typically involves > > e.g. the xbacklight utility. > > > > Even if the xbacklight utility is ported to use kms with the new > connector > > object brightness properties then this still will not work because > > changing the properties will require drm-master rights and e.g. sway will > > already hold those." > > I don't think this is a good argument. Sway (which I'm a maintainer of) > can add a command to change the backlight, and then users can bind their > keybinding to that command. This is not very different from e.g. a > keybind to switch on/off a monitor. > > We can also standardize a protocol to change the backlight across all > non-fully-integrated desktop environments (would be a simple addition > to output-power-management [1]), so that a single tool can work for > multiple compositors. Yeah, I mean, as one of the main people arguing that non-fully-integrated desktops are not the design we want, I agree with Simon. Cheers, Daniel
Re: Position set/get prototype template
On Mon, 8 Aug 2022 at 17:24, samuel ammonius wrote: > I've just looked at the xdg-shell protocol as you said. I was really > surprised at the > amount of features it had, but one part in particular caught my eye: Window geometry is relative to surface co-ordinates. As the first paragraph describes, it is used to describe the region of the surface which excludes external decor like drop shadows. One thing you might note is that Wayland does not supply a global co-ordinate space to clients. Everything is supplied in a surface co-ordinate space. Again, this will become clear if you take the time to work through an introductory guide to Wayland. Think of it this way: the developers are giving you their time and effort by explaining to you how Wayland works, and trying to guide you through why your proposed designs are unworkable. It would be very polite if you could repay the favour by investing some of your time and effort to understand Wayland before proposing drastic changes to it and demanding the developers justify why they won't be accepted. Cheers, Daniel
Re: Position set/get prototype template
Hi Samuel, On Mon, 8 Aug 2022 at 17:04, samuel ammonius wrote: > On Mon, Aug 8, 2022 at 12:06 PM Simon Ser wrote: >> If you are not interested in explaining the use-cases, and are just >> interested in blindly adding new features: sorry, but this isn't how we >> want to approach things. > > Sorry for not giving a more specific answer to this earlier. The main reason > is regarding an app's location when it starts up. For example, a large > workspace application like VS-Studio or Blender may want to start maximized, > but not full screen. AFAIK, Wayland has a protocol for fullscreen but not for > maximizing/minimizing an app Wayland does have a protocol for maximising apps. Without it, we wouldn't have been able to ship as the default desktop for every mainline distribution for several years now. I recommend you look into the xdg-shell family of protocols. Introductory guides to Wayland will also cover this. > ... so this can only be done by setting the position to (0, 0) and setting > the size to the viewport size of the compositor. Other apps may want to start > where they left off. I know that this is something that the compositor should > handle by default, but many of them don't. The largest obstacle is that apps > like screenshot utilities may want to start in the top left, autoclickers > start in the top right, console applications start in the center when opened > with ALT+T, and many other small situations like. There are probably many > other scenarios like this, and it seems like an unnecessary amount of pain to > add a new protocol every time a new thing pops up. > > Anyways, I don't see the harm in adding this feature. Who says that a > compositor is any smarter than an app? Many compositors don't support > remembering a window's location, while many apps do. Apps aren't users, so > taking power away from them isn't "making things simpler". It's not taking any power away from apps. It's giving the entire ecosystem more power. 
It takes more work to achieve than just blindly taking the shortest possible path, but conversely it also saves us from being painted into the same corners that made X11 no longer viable. Cheers, Daniel
Re: Position set/get prototype template
Igor, On Mon, 8 Aug 2022 at 15:54, Igor Korot wrote: > Or even better - when one of you will need to quickly create a > "proof-of-concept" application that will jump all over the places on every > startup... We've written applications for years. We know how to make it work. We've explained to you how to make it work. It may not work in the exact way you have in your head, but the real usecases can and do work, even if you won't listen to patient explanations. This discussion has more than run its course by now. Unfortunately as you seem hellbent on continuing to repeat the same thing over and over without listening to anything else being said, your posts will now be moderated. Cheers, Daniel
Re: Window positions under wayland
On Fri, 5 Aug 2022 at 17:21, Igor Korot wrote: > On Thu, Aug 4, 2022 at 9:20 PM Thiago Macieira wrote: > > No, they are about position too. 100,100 on a 1920x1080 resolution is about > > 5% > > to the right of the left edge and 10% from the top. 100,100 on 3840x2160 is > > 2.5% from the left and 5% from the top, on the same monitor. The user has a > > right to expect the same finger-width position on the screen, relative to > > where > > their eyes are looking at. > > No, they are not. > It should be up to the application to decide the coordinate system > PPI/DPI/etc. This is why there is no value in this discussion. You are making assertions like this as if they are axiomatic. It would be possible to redesign Wayland following the principles you have described. No-one is doing this, however. We have carefully considered the points you have raised over the past 14 years and reached different conclusions. I'm sorry that Wayland is not the perfect window system you envisage in your own head, but just repeating your same beliefs that the design is fundamentally flawed over and over, will not force us to change the design.
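For reference, the arithmetic behind the quoted "100,100" example can be checked with a couple of lines. This is just a worked calculation; relative_position is a made-up helper name.

```python
def relative_position(x: int, y: int, width: int, height: int):
    """Return the pixel position (x, y) as fractions of the output size."""
    return x / width, y / height


# The same pixel coordinate on two resolutions of the same physical monitor.
full_hd = relative_position(100, 100, 1920, 1080)
uhd_4k = relative_position(100, 100, 3840, 2160)
```

This gives roughly 5.2% / 9.3% on 1920x1080 and 2.6% / 4.6% on 3840x2160, in line with the approximate figures quoted: the same pixel coordinate lands at a different place on the glass.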
Re: Window positions under wayland
On Fri, 5 Aug 2022 at 14:11, samuel ammonius wrote: > Please don't close this discussion on account of something someone > else said. Wouldn't it be better for users, compositors, and apps if there > was a way to manage window positions, but the compositor could choose > not to let them? The best apps typically have an option in the preferences > for windows to remember their positions, and window managers are given > the option to move/resize windows as they please. Window managers can > then let users decide whether they want to let windows choose their own > size/position or stay put (such as in tiling window managers). Why wouldn't > Wayland benefit from a similar system? (I'm assuming that the reason you > said "for better or for worse" was just so everyone could stop talking rather > than because you actually wouldn't mind having a worse system in place) It's not inherently better or worse, just different. For things like remembering window positions, there has already been a specific protocol written to handle that usecase linked in this thread, which is more flexible and capable than having every client save their last-seen position and forcibly restore it no matter what. Cheers, Daniel
Re: Window positions under wayland
Hi all, On Fri, 5 Aug 2022 at 10:42, Jean-Michaël Celerier wrote: > [lots of musing about win32 snipped] > > It would make me pretty sad to tell these people (both end-users and devs) > that Windows is a better operating system for their use case than desktop > Linux. OK, this discussion is closed now. For better or worse, Wayland will not have a standard protocol to get and set window positions. As Carsten and others have explained, specific targeted usecases - and 'I want to control the position of my toplevel window' is not a usecase, it is a dictated solution to a number of very separate problems - are addressed with specific new protocol extensions. As said, Wayland is descriptive rather than prescriptive: the client gives the compositor as much information as is required to make good decisions, rather than telling the compositor exactly what to do with no context. To the other points raised: X11 still exists and will continue to for a very long time. A lot of effort has gone into Xwayland to make X11 apps work completely seamlessly. If you do not want to use Wayland, then no-one is forcing you to. Please feel free to continue using X11 if you feel that it works better for you or your apps. As to the notion that users can force window managers to do bad things, the simple answer is that if you don't want bad window management, then don't use a bad window manager. If you have specific targeted usecases that you want addressed, please contribute to the development of those protocols in an issue on https://gitlab.freedesktop.org/wayland/wayland-protocols/. If you want to discuss whether Wayland's fundamental design concepts are or aren't a good idea, or whether Windows is bad, please take that discussion to Hacker News or Reddit or something. Cheers, Daniel
Re: I'm adding features to VKMS! What would you like to see?
Hi Jim, On Fri, 29 Jul 2022 at 08:30, Jim Shargo wrote: > TL;DR: I'm working on extending VKMS and wanted feedback from other > compositor/wayland devs. Awesome! :) Glad to see it, and yeah, I second everything Pekka said. Having hotplug in particular would be really great. > // Questions > > - What VKMS features could help your testing the most? > - How much do you care about writeback buffer support vs CRC checks > in tests atm? > - What kinds of bugs do you get around DRM/KMS? > - Any thoughts in general? One thing I've really wanted is corner-case handling which just can't be done in generic code. Weston is really aggressive at trying to get things into planes, but we can only test those on actual systems with particular semantics. I'd love to be able to programmatically fake those to be able to check our fallbacks in an automatic way. About the best idea I've come up with for that is being able to attach an eBPF hook to atomic_check. The absolute MVP would be no arguments and an errno return; if you completely control the environment, then you can store a counter in a map and return a particular error for the n'th attempt. But a better one would allow you to inspect the properties on each object in the atomic state, and also things like framebuffer attributes (dimensions, format, modifier, etc) so you could take action accordingly. Thanks for taking this on! Really looking forward to it. Cheers, Daniel
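The "absolute MVP" hook described above (no arguments, errno return, a counter in a map, fail the n'th attempt) could behave roughly like the simulation below. To be clear about assumptions: no such VKMS or eBPF interface exists in the mail beyond the idea itself, so this is plain Python standing in for it, and make_fail_nth_hook and pick_strategy are hypothetical names.

```python
import errno


def make_fail_nth_hook(n: int, err: int = errno.EINVAL):
    """Build a fake atomic_check hook that rejects only the n'th call."""
    state = {"calls": 0}  # plays the role of the counter stored in a map

    def atomic_check_hook() -> int:
        state["calls"] += 1
        return -err if state["calls"] == n else 0

    return atomic_check_hook, state


def pick_strategy(atomic_check, strategies):
    """Try each composition strategy in order; return the first one accepted."""
    for name in strategies:
        if atomic_check() == 0:
            return name
    return None
```

A test would then configure the hook to reject the first attempt and assert that the compositor falls back from putting content on planes to the renderer path, without needing hardware that has those particular restrictions.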
Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS
Hi, On Tue, 14 Jun 2022 at 15:40, Zack Rusin wrote: > On Tue, 2022-06-14 at 10:36 +0300, Pekka Paalanen wrote: > > The reason I am saying that you need to fix other issues with > > virtualized drivers at the same time is because those other issues are > > the sole and only reason why the driver would ever need hotspot info. > > > > Hotspot info must not be necessary for correct KMS operation, yet you > > seem to insist it is. You assume that cursor plane commandeering is ok > > to do. But it is not ok without adding the KMS UAPI that would allow > > it: a way for guest userspace to explicitly say that commandeering will > > be ok. > > > > If and only if guest userspace says commandeering is ok, then you can > > require hotspot info. On the other hand, you cannot retrofit hotspot > > info by saying that if a driver exposes hotspot properties then all KMS > > clients must set them. That would be a kernel regression by definition: > > KMS clients that used to abide the KMS UAPI contract are suddenly > > breaking the contract because the kernel changed the contract. > > > > Therefore I very much disagree that virtualized drivers need hotspot > > info. They do not strictly need hotspot info for correct operation, > > they need it only for making the user experience more smooth, which is > > an optional optimization. That optimization may be very important in > > practise, but it's still an optimization and is generally not expected > > by KMS clients. > > I strongly disagree with that (both the sentiment towards hotspots and the > client > handling of it). I don't think we have to agree on reasoning here at all to > make > progress though so I'm going to let it go (we can always continue on irc or > email if > you'd like to try to conclude this bit but we could all use a few days of > break from > this discussion probably). 
Well, it's just coming from two different directions: * many current KMS clients want the cursor plane to be displayed as the client-programmed plane properties indicate, and the output can be nonsensical if they aren't * VMware optimises the cursor by displaying the cursor plane not as the client-programmed plane properties indicate, and the output is sometimes good (faster response!) but sometimes bad (nonsensical display!) The client cap sounds good. As a further suggestion, given that universal planes are supposed to make planes, er, universal, rather than imbued with magical special behaviour, how about _also_ adding an 'is cursor' plane prop which userspace has to set to 1 to indicate that the output is a cursor and has the hotspot correctly set and the 'display hardware' is free to make the cursor fly around the screen in accordance with the input events it sends? That way it's really really clear what's happening and no-one's getting surprised when 'the right thing' doesn't happen, not least because it's really clear what 'the right thing' is. Cheers, Daniel
Re: 504 to gitlab.freedesktop.org
Hi, On Mon, 13 Jun 2022 at 08:39, Daniel Stone wrote: > Yes, that's what's happening. Our (multi-host-replicated etc) Ceph > storage setup has entered a degraded mode due to the loss of a couple > of disks - no data has been lost but the cluster is currently unhappy. > We're walking through fixing this, but have bumped into some other > issues since, including a newly-flaky network setup, and changes since > we last provisioned a new storage host. > > We're working through them one by one and will have the service back > up with all our data intact - hopefully in a matter of hours but we > have no firm ETA right now. Thanks mainly to Ben, everything is back up and running now. Cheers, Daniel
Re: 504 to gitlab.freedesktop.org
On Mon, 13 Jun 2022 at 05:17, Peter Hutterer wrote: > On Sun, Jun 12, 2022 at 05:57:05PM -0700, Jeremy Sequoia wrote: > > I was going to spend a little bit of time putting out an update to XQuartz > > to address a few bugs that I've been meaning to squash, but I'm having a bit > > of an issue pulling down sources. > > > > Fetching via ssh://g...@gitlab.freedesktop.org is giving me Permission > > denied > > (publickey,keyboard-interactive). I'm not sure if the latter is an infra > > issue or if the ssh key I have stored in my gitlab account are out of date > > (it's been about a year since I touched this). Unfortunately, I can't seem > > to access https://gitlab.freedesktop.org to check as it's constantly > > presenting me a 504 Gateway timeout. > > > > Is anyone else able to pull sources via ssh://g...@gitlab.freedesktop.org > > right now? Is someone looking into the 504 issue? > > not an fdo admin but judging by the chatter on #freedesktop: no and yes, in > that order. seems like the infrastructure is in various stages of depositing > fecal matter on itself and the fixes are involved enough that the admins have > to be mentally awake, not merely physically. Yes, that's what's happening. Our (multi-host-replicated etc) Ceph storage setup has entered a degraded mode due to the loss of a couple of disks - no data has been lost but the cluster is currently unhappy. We're walking through fixing this, but have bumped into some other issues since, including a newly-flaky network setup, and changes since we last provisioned a new storage host. We're working through them one by one and will have the service back up with all our data intact - hopefully in a matter of hours but we have no firm ETA right now. Cheers, Daniel
Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS
On Fri, Jun 10, 2022 at 09:15:35AM +, Simon Ser wrote: > On Friday, June 10th, 2022 at 10:41, Daniel Vetter wrote: > > > Anything I've missed? Or got completely wrong? > > This plan looks good to me. > > As Pekka mentioned, I'd also like to have a conversation of how far we want > to > push virtualized driver features. I think KMS support is a good feature to > have > to spin up a VM and have all of the basics working. However I don't think it's > a good idea to try to plumb an ever-growing list of fancy features > (seamless integration of guest windows into the host, HiDPI, multi-monitor, > etc) into KMS. You'd just end up re-inventing Wayland or RDP on top of KMS. > Instead of re-inventing these, just use RDP or waypipe or X11 forwarding > directly. > > So I think we need to draw a line somewhere, and decide e.g. that virtualized > cursors are fine to add in KMS, but HiDPI is not. It's getting a bit far off-topic, but the Google CrOS team has an out-of-tree (at least I think it's not merged yet) wayland-virtio driver for exactly this use-case. Trying to move towards something like that for fancy virtualized setups sounds like the better approach indeed, with KMS just as the bare-bones fallback option. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS
On Fri, Jun 10, 2022 at 08:54:03AM +, Simon Ser wrote: > I agree with what others have replied, just adding a few more details. > > On Thursday, June 9th, 2022 at 21:39, Zack Rusin wrote: > > > virtualized drivers send drm_kms_helper_hotplug_event which sends a > > HOTPLUG=1 > > event with a changed preferred width/height > > (Note: and the "hotplug_mode_update" property is set to 1.) > > > suggested_x and suggested_y properties > > These come with their own set of issues. They are poorly defined, but it seems > like they describe a position in physical pixel coordinates. Compositors don't > use physical pixel coordinates to organize their outputs, instead they use > logical coordinates. For instance, a HiDPI 4k screen with a scale of 2 will > take up 1920x1080 logical pixels. There is no way to convert physical pixel > coordinates to logical pixel coordinates in the general case, because there's > no "global scale factor". So suggested_x/y are incompatible with the way > compositors work. I dropped a request for a proper doc section that explains all the virtualized kms driver stuff. I think we should also put in a "limitations" part there and just spec that any kind of scaling is a no-go on these (and that drivers better validate this is the case). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
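A small sketch of the physical-versus-logical problem described in the quoted text. It assumes integer per-output scale factors and a simple left-to-right layout; Output and layout_left_to_right are illustrative names, not taken from any compositor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Output:
    name: str
    width_px: int   # physical pixels
    height_px: int
    scale: int      # per-output scale factor chosen by the compositor

    def logical_size(self):
        return self.width_px // self.scale, self.height_px // self.scale


def layout_left_to_right(outputs):
    """Return {name: (logical_x, physical_x)} for outputs placed side by side."""
    positions = {}
    logical_x = physical_x = 0
    for out in outputs:
        positions[out.name] = (logical_x, physical_x)
        logical_x += out.logical_size()[0]
        physical_x += out.width_px
    return positions


hidpi = Output("hidpi", 3840, 2160, scale=2)
plain = Output("plain", 1920, 1080, scale=1)
```

With the HiDPI output on the left, the second output sits at logical x=1920 but physical x=3840; swap the order and the second output sits at 1920 in both spaces. The ratio between the two depends on the scale of every output to the left, which is why a single suggested_x in physical pixels has no general conversion.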
Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS
On Fri, Jun 10, 2022 at 10:41:05AM +0200, Daniel Vetter wrote: > Hi all, > > Kinda top post because the thread is sprawling and I think we need a > summary/restart. I think there's at least 3 issues here: > > - lack of hotspot property support, which means compositors can't really > support hotspot with atomic. Which isn't entirely true, because you > totally can use atomic for the primary planes/crtcs and the legacy > cursor ioctls, but I understand that people might find that a bit silly :-) > > Anyway this problem is solved by the patch set here, and I think results > in some nice cleanups to boot. > > - the fact that cursors for virtual drivers are not planes, but really > special things. Which just breaks the universal plane kms uapi. That > part isn't solved, and I do agree with Simon and Pekka that we really > should solve this before we unleash even more compositors onto the > atomic paths of virtual drivers. > > I think the simplest solution for this is: > 1. add a new DRM_PLANE_TYPE_VIRTUAL_CURSOR, and set that for these > special cursor planes on all virtual drivers > 2. add the new "I understand virtual cursors planes" setparam, filter > virtual cursor planes for userspace which doesn't set this (like we do > right now if userspace doesn't set the universal plane mode) > 3. backport the above patches to all stable kernels > 4. make sure the hotspot property is only set on VIRTUAL_CURSOR planes > and nothing else in the rebased patch series Simon also mentioned on irc that these special planes must not expose the CRTC_X/Y property, since that doesn't really do much at all. Or is our understanding of how this all works for commandeered cursors wrong? > - third issue: These special virtual display properties aren't documented. > Aside from hotspot there's also suggested X/Y and maybe other stuff. I > have no idea what suggested X/Y does and what userspace should do with > it. 
I think we need a new section for virtualized drivers which: > - documents all the properties involved > - documents the new cap for enabling virtual cursor planes > - documents some of the key flows that compositors should implement for > best experience > - documents how exactly the user experience will degrade if compositors > pretend it's just a normal kms driver (maybe put that into each of the > special flows that a compositor ideally supports) > - whatever other comments and gaps I've missed, I'm sure > Simon/Pekka/others will chime in once the patch exists. Great bonus would be an igt which demonstrates these flows. With the interactive debug breakpoints to wait for resizing or whatever this should be all neatly possible. -Daniel > > There's a bit of fixing oopsies (virtualized drivers really shouldn't have > enabled universal planes for their cursors) and debt (all these properties > predate the push to document stuff so we need to fix that), but I don't > think it's too much. And I think, from reading the threads, that this > should cover everything? > > Anything I've missed? Or got completely wrong? > > Cheers, Daniel > > On Fri, Jun 03, 2022 at 02:14:59PM +, Simon Ser wrote: > > Hi, > > > > Please, read this thread: > > https://lists.freedesktop.org/archives/dri-devel/2020-March/thread.html#259615 > > > > It has a lot of information about the pitfalls of cursor hotspot and > > other things done by VM software. > > > > In particular: since the driver will ignore the KMS cursor plane > > position set by user-space, I don't think it's okay to just expose > > without opt-in from user-space (e.g. with a DRM_CLIENT_CAP). > > > > cc wayland-devel and Pekka for user-space feedback. > > > > On Thursday, June 2nd, 2022 at 17:42, Zack Rusin wrote: > > > > > - all userspace code needs to hardcore a list of drivers which require > > > hotspots because there's no way to query from drm "does this driver > > > require hotspot" > > > > Can you elaborate? 
I'm not sure I understand what you mean here. > > > > Thanks, > > > > Simon > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS
Hi all, Kinda top post because the thread is sprawling and I think we need a summary/restart. I think there's at least 3 issues here: - lack of hotspot property support, which means compositors can't really support hotspot with atomic. Which isn't entirely true, because you totally can use atomic for the primary planes/crtcs and the legacy cursor ioctls, but I understand that people might find that a bit silly :-) Anyway this problem is solved by the patch set here, and I think results in some nice cleanups to boot. - the fact that cursors for virtual drivers are not planes, but really special things. Which just breaks the universal plane kms uapi. That part isn't solved, and I do agree with Simon and Pekka that we really should solve this before we unleash even more compositors onto the atomic paths of virtual drivers. I think the simplest solution for this is: 1. add a new DRM_PLANE_TYPE_VIRTUAL_CURSOR, and set that for these special cursor planes on all virtual drivers 2. add the new "I understand virtual cursor planes" setparam, filter virtual cursor planes for userspace which doesn't set this (like we do right now if userspace doesn't set the universal plane mode) 3. backport the above patches to all stable kernels 4. make sure the hotspot property is only set on VIRTUAL_CURSOR planes and nothing else in the rebased patch series - third issue: These special virtual display properties aren't documented. Aside from hotspot there's also suggested X/Y and maybe other stuff. I have no idea what suggested X/Y does and what userspace should do with it.
I think we need a new section for virtualized drivers which: - documents all the properties involved - documents the new cap for enabling virtual cursor planes - documents some of the key flows that compositors should implement for best experience - documents how exactly the user experience will degrade if compositors pretend it's just a normal kms driver (maybe put that into each of the special flows that a compositor ideally supports) - whatever other comments and gaps I've missed, I'm sure Simon/Pekka/others will chime in once the patch exists. There's a bit of fixing oopsies (virtualized drivers really shouldn't have enabled universal planes for their cursors) and debt (all these properties predate the push to document stuff so we need to fix that), but I don't think it's too much. And I think, from reading the threads, that this should cover everything? Anything I've missed? Or got completely wrong? Cheers, Daniel On Fri, Jun 03, 2022 at 02:14:59PM +, Simon Ser wrote: > Hi, > > Please, read this thread: > https://lists.freedesktop.org/archives/dri-devel/2020-March/thread.html#259615 > > It has a lot of information about the pitfalls of cursor hotspot and > other things done by VM software. > > In particular: since the driver will ignore the KMS cursor plane > position set by user-space, I don't think it's okay to just expose > without opt-in from user-space (e.g. with a DRM_CLIENT_CAP). > > cc wayland-devel and Pekka for user-space feedback. > > On Thursday, June 2nd, 2022 at 17:42, Zack Rusin wrote: > > > - all userspace code needs to hardcode a list of drivers which require > > hotspots because there's no way to query from drm "does this driver > > require hotspot" > > Can you elaborate? I'm not sure I understand what you mean here. > > Thanks, > > Simon -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
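The filtering in steps 1, 2 and 4 of the proposal could look roughly like the following C sketch. This is a toy model, not kernel code: the enum, the struct and both functions are hypothetical, the VIRTUAL_CURSOR type is only the one proposed in this mail, and requiring both opt-ins for it is an assumption.

```c
#include <stdbool.h>

/*
 * The first three values mirror the existing KMS plane types.
 * VIRTUAL_CURSOR is the type proposed in the mail above and is
 * hypothetical here.
 */
enum plane_type {
    PLANE_TYPE_OVERLAY,
    PLANE_TYPE_PRIMARY,
    PLANE_TYPE_CURSOR,
    PLANE_TYPE_VIRTUAL_CURSOR,
};

/* Hypothetical per-client opt-in flags, modelled on DRM client caps. */
struct client_caps {
    bool universal_planes;      /* client asked to see all plane types */
    bool virtual_cursor_planes; /* proposed "I understand virtual cursor planes" */
};

/* Decide whether a plane of the given type is listed for this client. */
bool plane_visible(enum plane_type type, const struct client_caps *caps)
{
    switch (type) {
    case PLANE_TYPE_OVERLAY:
        return true;
    case PLANE_TYPE_PRIMARY:
    case PLANE_TYPE_CURSOR:
        /* Existing behaviour: hidden unless universal planes is set. */
        return caps->universal_planes;
    case PLANE_TYPE_VIRTUAL_CURSOR:
        /* Proposed behaviour (assumed): needs both opt-ins. */
        return caps->universal_planes && caps->virtual_cursor_planes;
    }
    return false;
}

/* Step 4: the hotspot property only exists on virtual cursor planes. */
bool plane_has_hotspot(enum plane_type type)
{
    return type == PLANE_TYPE_VIRTUAL_CURSOR;
}
```

In this model a compositor that sets universal planes but not the new cap simply never sees the special cursor plane, so it cannot end up programming a plane whose position the driver ignores.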
Re: running scanner in wayland-main
Hi Egy, On Mon, 16 May 2022 at 15:09, Egy Ketto wrote: > I'm digging the source code of wayland and weston. I'm stuck at getting the > code generated by wayland-scanner. I was following the build instructions for > wayland: > > $ git clone https://gitlab.freedesktop.org/wayland/wayland.git > $ cd wayland > $ meson build/ --prefix=PREFIX //(which I set to wayland/build) > $ ninja -C build/ install > > but I get the error: > > meson.build:83:1: ERROR: lexer > > Can someone please help me with this? Which version of Meson are you using? I don't understand how the root meson.build could cause Meson to not be able to parse. Cheers, Daniel
Re: [RFC] drm/kms: control display brightness through drm_connector properties
On Wed, Apr 27, 2022 at 05:23:22PM +0300, Jani Nikula wrote: > On Wed, 27 Apr 2022, Daniel Vetter wrote: > > On Thu, Apr 14, 2022 at 01:24:30PM +0300, Jani Nikula wrote: > >> On Mon, 11 Apr 2022, Alex Deucher wrote: > >> > On Mon, Apr 11, 2022 at 6:18 AM Hans de Goede > >> > wrote: > >> >> > >> >> Hi, > >> >> > >> >> On 4/8/22 17:11, Alex Deucher wrote: > >> >> > On Fri, Apr 8, 2022 at 10:56 AM Hans de Goede > >> >> > wrote: > >> >> >> > >> >> >> Hi, > >> >> >> > >> >> >> On 4/8/22 16:08, Alex Deucher wrote: > >> >> >>> On Fri, Apr 8, 2022 at 4:07 AM Daniel Vetter > >> >> >>> wrote: > >> >> >>>> > >> >> >>>> On Thu, Apr 07, 2022 at 05:05:52PM -0400, Alex Deucher wrote: > >> >> >>>>> On Thu, Apr 7, 2022 at 1:43 PM Hans de Goede > >> >> >>>>> wrote: > >> >> >>>>>> > >> >> >>>>>> Hi Simon, > >> >> >>>>>> > >> >> >>>>>> On 4/7/22 18:51, Simon Ser wrote: > >> >> >>>>>>> Very nice plan! Big +1 for the overall approach. > >> >> >>>>>> > >> >> >>>>>> Thanks. > >> >> >>>>>> > >> >> >>>>>>> On Thursday, April 7th, 2022 at 17:38, Hans de Goede > >> >> >>>>>>> wrote: > >> >> >>>>>>> > >> >> >>>>>>>> The drm_connector brightness properties > >> >> >>>>>>>> === > >> >> >>>>>>>> > >> >> >>>>>>>> bl_brightness: rw 0-int32_max property controlling the > >> >> >>>>>>>> brightness setting > >> >> >>>>>>>> of the connected display. The actual maximum of this will be > >> >> >>>>>>>> less then > >> >> >>>>>>>> int32_max and is given in bl_brightness_max. > >> >> >>>>>>> > >> >> >>>>>>> Do we need to split this up into two props for sw/hw state? The > >> >> >>>>>>> privacy screen > >> >> >>>>>>> stuff needed this, but you're pretty familiar with that. :) > >> >> >>>>>> > >> >> >>>>>> Luckily that won't be necessary, since the privacy-screen is a > >> >> >>>>>> security > >> >> >>>>>> feature the firmware/embedded-controller may refuse our requests > >> >> >>>>>> (may temporarily lock-out changes) and/or may make changes > >> >> >>>>>> without > >> >> >>>>>> us requesting them itself. 
Neither is really the case with the > >> >> >>>>>> brightness setting of displays. > >> >> >>>>>> > >> >> >>>>>>>> bl_brightness_max: ro 0-int32_max property giving the actual > >> >> >>>>>>>> maximum > >> >> >>>>>>>> of the display's brightness setting. This will report 0 when > >> >> >>>>>>>> brightness > >> >> >>>>>>>> control is not available (yet). > >> >> >>>>>>> > >> >> >>>>>>> I don't think we actually need that one. Integer KMS props all > >> >> >>>>>>> have a > >> >> >>>>>>> range which can be fetched via drmModeGetProperty. The max can > >> >> >>>>>>> be > >> >> >>>>>>> exposed via this range. Example with the existing alpha prop: > >> >> >>>>>>> > >> >> >>>>>>> "alpha": range [0, UINT16_MAX] = 65535 > >> >> >>>>>> > >> >> >>>>>> Right, I already knew that, which is why I explicitly added a > >> >> >>>>>> range &
Re: [RFC] drm/kms: control display brightness through drm_connector properties
On Thu, Apr 14, 2022 at 01:24:30PM +0300, Jani Nikula wrote: > On Mon, 11 Apr 2022, Alex Deucher wrote: > > On Mon, Apr 11, 2022 at 6:18 AM Hans de Goede wrote: > >> > >> Hi, > >> > >> On 4/8/22 17:11, Alex Deucher wrote: > >> > On Fri, Apr 8, 2022 at 10:56 AM Hans de Goede > >> > wrote: > >> >> > >> >> Hi, > >> >> > >> >> On 4/8/22 16:08, Alex Deucher wrote: > >> >>> On Fri, Apr 8, 2022 at 4:07 AM Daniel Vetter wrote: > >> >>>> > >> >>>> On Thu, Apr 07, 2022 at 05:05:52PM -0400, Alex Deucher wrote: > >> >>>>> On Thu, Apr 7, 2022 at 1:43 PM Hans de Goede > >> >>>>> wrote: > >> >>>>>> > >> >>>>>> Hi Simon, > >> >>>>>> > >> >>>>>> On 4/7/22 18:51, Simon Ser wrote: > >> >>>>>>> Very nice plan! Big +1 for the overall approach. > >> >>>>>> > >> >>>>>> Thanks. > >> >>>>>> > >> >>>>>>> On Thursday, April 7th, 2022 at 17:38, Hans de Goede > >> >>>>>>> wrote: > >> >>>>>>> > >> >>>>>>>> The drm_connector brightness properties > >> >>>>>>>> === > >> >>>>>>>> > >> >>>>>>>> bl_brightness: rw 0-int32_max property controlling the brightness > >> >>>>>>>> setting > >> >>>>>>>> of the connected display. The actual maximum of this will be less > >> >>>>>>>> then > >> >>>>>>>> int32_max and is given in bl_brightness_max. > >> >>>>>>> > >> >>>>>>> Do we need to split this up into two props for sw/hw state? The > >> >>>>>>> privacy screen > >> >>>>>>> stuff needed this, but you're pretty familiar with that. :) > >> >>>>>> > >> >>>>>> Luckily that won't be necessary, since the privacy-screen is a > >> >>>>>> security > >> >>>>>> feature the firmware/embedded-controller may refuse our requests > >> >>>>>> (may temporarily lock-out changes) and/or may make changes without > >> >>>>>> us requesting them itself. Neither is really the case with the > >> >>>>>> brightness setting of displays. > >> >>>>>> > >> >>>>>>>> bl_brightness_max: ro 0-int32_max property giving the actual > >> >>>>>>>> maximum > >> >>>>>>>> of the display's brightness setting. 
This will report 0 when > >> >>>>>>>> brightness > >> >>>>>>>> control is not available (yet). > >> >>>>>>> > >> >>>>>>> I don't think we actually need that one. Integer KMS props all > >> >>>>>>> have a > >> >>>>>>> range which can be fetched via drmModeGetProperty. The max can be > >> >>>>>>> exposed via this range. Example with the existing alpha prop: > >> >>>>>>> > >> >>>>>>> "alpha": range [0, UINT16_MAX] = 65535 > >> >>>>>> > >> >>>>>> Right, I already knew that, which is why I explicitly added a range > >> >>>>>> to the props already. The problem is that the range must be set > >> >>>>>> before registering the connector and when the backlight driver > >> >>>>>> only shows up (much) later during boot then we don't know the > >> >>>>>> range when registering the connector. I guess we could "patch-up" > >> >>>>>> the range later. But AFAIK that would be a bit of abuse of the > >> >>>>>> property API as the range is intended to never change, not > >> >>>>>> even after hotplug uevents. At least atm there is no infra > >> >>>>>> in the kernel to change the range later. > >> >>>>>> &
Re: [RFC] drm/kms: control display brightness through drm_connector properties
On Fri, Apr 08, 2022 at 12:26:24PM +0200, Hans de Goede wrote: > Hi, > > On 4/8/22 12:16, Simon Ser wrote: > > Would it be an option to only support the KMS prop for Good devices, > > and continue using the suboptimal existing sysfs API for Bad devices? > > > > (I'm just throwing ideas around to see what sticks, feel free to ignore.) > > Currently suid-root or pkexec helpers are used to deal with the > /sys/class/backlight requires root rights issue. I really want to > be able to disable these helpers at build time in e.g. GNOME once > the new properties are supported in GNOME. So that distros with > a new enough kernel can reduce their attack surface this way. Yeah but otoh perpetuating a bad interface forever isn't a great idea either. I think the pragmatic plan here is - Implement this properly on good devices, i.e. anything new. - Do some runtime disabling in the pkexec helpers if they detect a modern system (we should be able to put a proper symlink into the drm sysfs connector directories, to make this easy to detect). It's not as great as doing this at compile time, but it's a step. - Figure out a solution for the old crap. We can't really change anything with the load ordering for existing systems, so if the hacked-up compat libbacklight-backlight isn't an option then I guess we need some quirk list/extracted code which makes i915/nouveau/radeon driver load fail until the right backlight shows up. And that needs to be behind a Kconfig to avoid breaking existing systems. Inflicting hotplug complications on all userspace (including uevent handling for this hotpluggable backlight and everything) just because fixing the old crap systems is work is imo really not a good idea. Much better if we get to the correct future step-by-step. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [RFC] drm/kms: control display brightness through drm_connector properties
oblem is that we really don't know if 0 is off > > or min-brightness. In the given example where we actually never go > > down to a duty-cycle of 0% because the video BIOS tables tell us > > not to, we can be certain that setting the brightness prop to 0 > > will not turn off the backlight, since we then set the duty-cycle > > to the VBT provided minimum. Note the intent here is to only set > > the boolean to true if the VBT provided minimum is _not_ 0, 0 > > just means the vendor did not bother to provide a minimum. > > > > Currently e.g. GNOME never goes lower than something like 5% > > of brightness_max to avoid accidentally turning the screen off. > > > > Turning the screen off is quite bad to do on e.g. tablets where > > the GUI is the only way to undo the brightness change and now > > the user can no longer see the GUI. > > > > The idea behind this boolean is to give e.g. GNOME a way to > > know that it is safe to go down to 0% and for it to use > > the entire range. > > Why not just make it policy that 0 is defined as minimum brightness, > not off, and have all drivers conform to that? Because the backlight subsystem isn't as consistent on this, and it's been an epic source of confusion since forever. What's worse, there's both userspace out there which assumes brightness = 0 is a really fast dpms off _and_ userspace that assumes that brightness = 0 is the lowest setting. Of course on different sets of machines. So yeah we're screwed. I have no idea how to get out of this. -Daniel > > Alex > > > > > > For instance if we can guarantee that the min level won't turn the screen > > > completely off we could make the range start from 1 instead of 0. > > > Or allow -1 to mean "minimum value, maybe completely off". > > > > Right, the problem is we really don't know and when the range is > > e.g. 0-65535 then something like 1 will almost always still just > > turn the screen completely off.
There will be a value of say like > > 150 or some such which is then the actual minimum value to still > > get the backlight to light up at all. The problem is we have > > no clue what the actual minimum is. And if the PWM output does > > not directly drive the LEDs but is used as an input for some > > LED backlight driver chip, that chip itself may have a lookup > > table (which may also take care of doing perceived brightness > > mapping) and may guarantee a minimum backlight even when given > > a 0% duty cycle PWM signal... > > > > This prop is sort of orthogonal to the generic change to > > drm_connector props, so we could also do this later as a follow up > > change. At a minimum when I code this up this should be in its > > own commit(s) I believe. > > > > But I do think having this will be useful for the above > > GNOME example. > > > > >> bl_brightness_control_method: ro, enum, possible values: > > >> none: The GPU driver expects brightness control to be provided by another > > >> driver and that driver has not loaded yet. > > >> unknown: The underlying control mechanism is unknown. > > >> pwm: The brightness property directly controls the duty-cycle of a PWM > > >> output. > > >> firmware: The brightness is controlled through firmware calls. > > >> DDC/CI: The brightness is controlled through the DDC/CI protocol. > > >> gmux: The brightness is controlled by the GMUX. > > >> Note this enum may be extended in the future, so other values may > > >> be read, these should be treated as "unknown". > > >> > > >> When brightness control becomes available after being reported > > >> as not available before (bl_brightness_control_method=="none") > > >> a uevent with CONNECTOR= and > > >> > > >> PROPERTY= will be generated > > >> > > >> at this point all the properties must be re-read. > > >> > > >> When/once brightness control is available then all the read-only > > >> properties are fixed and will never change. 
> > >> > > >> Besides the "none" value for no driver having loaded yet, > > >> the different bl_brightness_control_method values are intended for > > >> (userspace) heuristics for such things as the brightness setting > > >> linearly controlling electrical power or setting perceived brightness. > > > > > > Can you elaborate? I don't know enough about brightness control to > > > understand all of the implications here. > > > > So after send
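The userspace side of the "is 0 safe?" discussion above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function name and the zero_is_safe parameter are hypothetical (the latter stands in for the proposed boolean), and the 5% floor is the GNOME behaviour as described in the quoted mail, not something read from GNOME's source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Map a user-facing percentage (0-100) to a hardware brightness value
 * in the range [0, max]. When it is not known whether 0 turns the
 * backlight off, clamp to a floor of 5% of max.
 */
uint32_t brightness_from_percent(uint32_t percent, uint32_t max,
                                 bool zero_is_safe)
{
    if (percent > 100)
        percent = 100;

    /* 64-bit intermediate so max * percent cannot overflow. */
    uint32_t value = (uint32_t)(((uint64_t)max * percent) / 100);

    if (!zero_is_safe) {
        uint32_t min_value = (uint32_t)(((uint64_t)max * 5) / 100);
        if (value < min_value)
            value = min_value;
    }
    return value;
}
```

With a 0-65535 range and no guarantee about 0, the slider bottoms out at 3276 rather than 0; with the guarantee, the whole range becomes usable.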
Re: weston crashing when no HDMI connection
Hi Rusty, On Fri, 10 Dec 2021 at 20:38, Rusty Howell wrote: > We are working on an embedded linux device built using Yocto (dunfell). We > have weston running and we are seeing our QT5 application running as well. > One problem we are having is that our application is crashing weston if the > application is started when there is no active HDMI connection. We see the > error message "The Wayland connection broke. Did the Wayland compositor > die?". We also see weston restart. > > I have a smaller QT demo app with just a few controls, and that seems to work > just fine. Launching it while there is no HDMI connection does not seem to > affect weston at all. Is this a known issue with weston? Has anyone seen > similar issues with Qt5 and weston? This definitely isn't a known issue. However, if you are using a vendor BSP (e.g. NXP), they may have substantially forked and changed Weston. In most cases, these bugs are introduced by the vendor changes. If this is the case, please seek support from your vendor, as unfortunately we can't support those changes. Cheers, Daniel
Re: Where is eglGetConfigs getting its configs from?
Hi Chris, On Fri, 5 Nov 2021 at 12:01, chris.lapla...@agilent.com wrote: > Can anyone please explain how eglGetConfigs actually works? i.e. what > information is it consulting in order to determine the configs to return? > Unfortunately we are using a processor (Xilinx Zynq UltraScale) with a GPU > (Mali-400 MP2) whose EGL implementation is closed-source. So I cannot look at > the source code for eglGetConfigs. 'It depends' is the short and disappointing answer ... > My expectation was that eglGetConfigs would simply enumerate DRM framebuffers > in the system. However, the only framebuffer in our system (a Xilinx > Framebuffer Read IP core in the FPGA) is using XRGB colors and the list > that Weston reports makes no sense. See below for /var/log/weston.log > contents. Some component of the system seems to be finding RGB565 and > ARGB framebuffers somewhere. EGLConfigs are only semi-related to this. Ultimately they do have to render to a framebuffer for display, however they can be used for intermediate non-display rendering as well (e.g. FBOs, depth/stencil buffers). So, the list of EGLConfigs you get is essentially a list of configurations the GPU is able to render to. A subset of those will be suitable for display. The mechanism used for selecting a display-compatible config is to look at the EGL_NATIVE_VISUAL_ID and match this to a DRM format ... > [01:36:24.725] Bad/unknown DRM format code 0x. This is already quite suspicious, since drm_output_init_egl() shouldn't be passing anything with 0 in the list? > [01:36:24.725] No EGLConfig matches { win|pbf; XRGB }. I would also expect to see ARGB in this list ... >id: 9 rgba: 8 8 8 8 buf: 32 dep: 24 stcl: 8 int: 0-10 type: > win|pix|pbf|swap_preserved vis_id: ARGB (0x34325241) This is an ARGB config (note the rgba: 8 8 8 8) and the vis_id ... 
>id: 38 rgba: 8 8 8 0 buf: 24 dep: 24 stcl: 8 int: 0-10 type: > win|pix|pbf|swap_preserved vis_id: ARGB (0x34325241) And this is an XRGB config, but incorrectly declared to have a native visual ID of ARGB. >id: 41 rgba: 8 8 8 0 buf: 24 dep: 24 stcl: 8 int: 0-10 type: > win|pix|pbf|ms_resolve_box|swap_preserved vis_id: ARGB (0x34325241) These and the following are not relevant as they are multi-sampled and/or depth/stencil configs. So, my thoughts are: - how do you get to 'Bad/unknown DRM format code 0x' given that drm_output_init_egl() explicitly prunes the list for this? - why is DRM_FORMAT_ARGB not found as a fallback format for DRM_FORMAT_XRGB, given that we should be getting that through fallback_format_for()? - your EGL stack is buggy, because it declares a 8880 format (i.e. XRGB) to be ARGB, so fixing that would likely solve the immediate problem, but the fallbacks above should work Best of luck. Cheers, Daniel
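As an aside, the vis_id values in the log are DRM fourcc codes, and unpacking them makes the mismatch easy to see. The helper below is a small sketch with a made-up name; it only relies on the fourcc layout, where the first character sits in the lowest byte.

```c
#include <stdint.h>

/*
 * Unpack a DRM fourcc code into its four ASCII characters.
 * out must have room for 5 bytes (4 characters plus the terminator).
 */
void fourcc_to_string(uint32_t fourcc, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((fourcc >> (8 * i)) & 0xff);
    out[4] = '\0';
}
```

Feeding it 0x34325241 yields "AR24", the fourcc that drm_fourcc.h assigns to DRM_FORMAT_ARGB8888. A config with no alpha bits would be expected to report "XR24" (0x34325258) instead, which is why config 38 above looks wrong.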
Re: FW: xrandr and xwayland
Hi Guillermo, On Fri, 6 Aug 2021 at 10:44, Guillermo Rodriguez Garcia wrote: > El vie, 6 ago 2021 a las 10:14, Daniel Stone () > escribió: >> kiosk-shell is something we have in newer versions of Weston which >> sounds like it would work well for your usecases - it's designed to >> just run a single application fullscreen. You might want to check out >> what we have in git, which will be released as 10.0 in a few weeks' >> time. > > I have a use case for this which is conceptually one single application, > fullscreen, no desktop stuff (navigation bar, window management etc) but > needs to support additional processes with separate top-level windows. This > would be used e.g. to overlay a video stream (using gstreamer) on top of the > "main" application. Will this be supported by kiosk-shell ? For clients to be able to position themselves relative to other clients, wl_subcompositor gives you the subsurface mechanism for embedding. This was designed for this exact usecase: an application embedding media content in its own top-level window. Using this is very strongly recommended. If you are unable to do this for whatever reason, then you will need to customise the window manager - in this case, kiosk-shell. We are planning to extend this with Lua scripting to make this easier, but have no firmly-defined ETA for this right now. Cheers, Daniel
Re: FW: xrandr and xwayland
Hi David, On Thu, 5 Aug 2021 at 22:17, David Deyo wrote: > > Sounds like you're missing wl_display_flush() in your client code, so the > > requests don't make it to the socket buffer until they're forced to because > > it's filled up. > > That did it. You guys are awesome. I don’t suppose there’s a Weston doc > somewhere that would have told me that, had I looked. It's a little bit buried, but this is the best explanation of how to integrate Wayland into an event loop, as you would with a toolkit: https://wayland.freedesktop.org/docs/html/apb.html#Client-classwl__display_1a40039c1169b153269a3dc0796a54ddb0 If you scroll up to the main wl_display section, it explains how event queues are used as well. Broadly speaking, the advice is to: - when active, process your events and send any requests - immediately before you go into a passive state (waiting for events, sleeping, etc), flush the display so your requests get delivered - run the prepare_read_queue() / dispatch_queue_pending() loop immediately before sleeping, in order to make sure you get all events queued for you, then flush again in case you've queued any requests from your event handlers - poll on the Wayland display FD as well as any other activity sources (other event queues, timers, etc) - when you wake up, dispatch your Wayland event queue as well as other relevant event sources > > Also, my taskbar is the wrong length and my background is black. Other > > than that, pretty cool. > > Yep, desktop-shell isn't designed to handle runtime rotation. It could be > made to pretty easily by working on the client code. For your case though I'd > assume something like kiosk-shell would be a much better bet. kiosk-shell is something we have in newer versions of Weston which sounds like it would work well for your usecases - it's designed to just run a single application fullscreen. You might want to check out what we have in git, which will be released as 10.0 in a few weeks' time. 
The rotation patches never got merged because we had some issues with the IIO integration in particular, but having runtime rotation tests sure would be nice, and kiosk-shell should at least be a lot easier to fix than desktop-shell, if it does even need any fixes. Cheers, Daniel
Re: FW: xrandr and xwayland
On Thu, 5 Aug 2021 at 21:15, David Deyo wrote: > I was able to re-create the files of your patch and added them into my > build tree. > > > > Not having an accelerometer, I’ve had to make a few changes. > > When you said, ‘It had issues’, I am also seeing some issues. > > > > I can rotate my screen, but something about the callback > (weston_rotate_rotate) is acting strange. > > I have added a loop in autorotate that calls weston_rotate_rotate every 10 > secs. I am logging out to Weston_log. > > It seems I only get those logs once every 15-30 minutes and when I do, > it’s hundreds of logs. What’s up with that? > Sounds like you're missing wl_display_flush() in your client code, so the requests don't make it to the socket buffer until they're forced to because it's filled up. > Also, my taskbar is the wrong length and my background is black. Other > than that, pretty cool. > Yep, desktop-shell isn't designed to handle runtime rotation. It could be made to pretty easily by working on the client code. For your case though I'd assume something like kiosk-shell would be a much better bet. Cheers, Daniel
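The "nothing for ages, then hundreds of logs at once" symptom is what a client-side request buffer without an explicit flush looks like. The toy model below, in C, shows the mechanism only; none of these names are libwayland API, and the capacity is deliberately tiny.

```c
#include <stddef.h>

#define BUF_CAPACITY 4 /* tiny on purpose; the real buffer is far larger */

/* Toy stand-in for a client connection; not a libwayland type. */
struct toy_conn {
    size_t pending;   /* requests queued locally, not yet sent */
    size_t delivered; /* requests the compositor has actually received */
};

/* Send everything queued so far. Plays the role of wl_display_flush(). */
void toy_flush(struct toy_conn *c)
{
    c->delivered += c->pending;
    c->pending = 0;
}

/* Queue one request; it only leaves the client once the buffer is full. */
void toy_queue_request(struct toy_conn *c)
{
    if (c->pending == BUF_CAPACITY)
        toy_flush(c);
    c->pending++;
}
```

Queueing three requests and then sleeping leaves delivered at 0 until something else fills the buffer; calling the flush before going idle is what gets them to the compositor promptly. In a real client the equivalent call is wl_display_flush().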
Re: Proxying Wayland for security
Hi Alyssa, On Tue, 27 Jul 2021 at 20:30, Alyssa Ross wrote: > Hi! I'm Alyssa and I'm working on Spectrum[1], which is a project > aiming to create a compartmentalized desktop Linux system, with high > levels of isolation between applications. I've seen, it's neat! > One big issue for us is protecting the system against potentially > malicious Wayland clients. It's important that a compartmentalized > application can't read from the clipboard or take a screenshot of the > whole desktop without user consent. (The latter is possible in > wlroots compositors with wlr-screencopy.) > > So an idea I had was to write a proxy program that would sit > in front of the compositor, and receive connections from clients. If > a client sent a wl_data_offer::receive, for example, the proxy could > ask for user confirmation before forwarding that to the compositor. As you've noted, the core protocol doesn't offer any way to scrape these contents without additional extension protocols, which are not implemented by all compositors. Generally speaking, GNOME's Mutter and Weston tend not to implement these protocols, and wlroots-based compositors tend to implement them. > I could just implement this stuff in a compositor, but doing it with a > proxy would mean that a known subset of the protocol could be used > with any compositor, with appropriate access controls. It would also > be a reusable component that could be customised to have different > access control policy depending on the needs of a distributor or user. I think you'd be better off dealing with the problem at source. libwayland-server already has a 'global filter' mechanism which allows an arbitrary hook to decide which clients can and cannot see which interfaces. This is used to advertise private protocols to trusted clients: for example, Weston's UI and screenshot support are handled by external clients, but we use the filter to ensure that _only_ those clients can access those protocols, not arbitrary clients.
Implementing a proxy just shifts this problem rather than solving it once; every time someone adds a new protocol, you have to plumb it through the proxy and add some kind of policy mechanism. But the compositors themselves probably have their own policy mechanism, so why not just reuse that? Cheers, Daniel
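For readers unfamiliar with the global filter idea, here is a toy model of it in C. It is a sketch of the concept only: the types and functions below are stand-ins invented for illustration, and the "trusted" flag is an assumed policy. In libwayland-server the hook is installed with wl_display_set_global_filter() and receives the real client and global objects.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a connected client. */
struct toy_client {
    bool trusted; /* e.g. a helper launched by the compositor itself */
};

/* Toy stand-in for an advertised global. */
struct toy_global {
    const char *interface; /* advertised interface name */
    bool privileged;       /* should only trusted clients see it? */
};

/* Filter hook: return true if this client may see this global. */
bool toy_global_filter(const struct toy_client *client,
                       const struct toy_global *global)
{
    if (!global->privileged)
        return true;
    return client->trusted;
}

/* Count how many globals from a list would be advertised to a client. */
size_t toy_count_visible(const struct toy_client *client,
                         const struct toy_global *globals, size_t n)
{
    size_t visible = 0;
    for (size_t i = 0; i < n; i++)
        if (toy_global_filter(client, &globals[i]))
            visible++;
    return visible;
}
```

With a list containing an ordinary global such as wl_compositor and a privileged screenshot interface, a sandboxed client sees one global and the compositor's own helper sees both. The policy lives in one place, next to the compositor's existing notion of which clients it trusts.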
Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)
On Mon, Jun 21, 2021 at 12:16:55PM +0200, Christian König wrote: > Am 18.06.21 um 20:45 schrieb Daniel Vetter: > > On Fri, Jun 18, 2021 at 8:02 PM Christian König > > wrote: > > > Am 18.06.21 um 19:20 schrieb Daniel Vetter: > > > [SNIP] > > > The whole thing was introduced with this commit here: > > > > > > commit f2c24b83ae90292d315aa7ac029c6ce7929e01aa > > > Author: Maarten Lankhorst > > > Date: Wed Apr 2 17:14:48 2014 +0200 > > > > > > drm/ttm: flip the switch, and convert to dma_fence > > > > > > Signed-off-by: Maarten Lankhorst > > > > > >int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo, > > > > > > - bo->sync_obj = driver->sync_obj_ref(sync_obj); > > > + reservation_object_add_excl_fence(bo->resv, fence); > > > if (evict) { > > > > > > Maarten replaced the bo->sync_obj reference with the dma_resv exclusive > > > fence. > > > > > > This means that we need to apply the sync_obj semantic to all drivers > > > using a DMA-buf with its dma_resv object, otherwise you break imports > > > from TTM drivers. > > > > > > Since then and up till now the exclusive fence must be waited on and > > > never replaced with anything which signals before the old fence. > > > > > > Maarten and I think Thomas did that and I was always assuming that you > > > know about this design decision. > > Surprisingly I do actually know this. > > > > Still the commit you cite did _not_ change any of the rules around > > dma_buf: Importers have _no_ obligation to obey the exclusive fence, > > because the buffer is pinned. None of the work that Maarten has done > > has fundamentally changed this contract in any way. > > Well I now agree that the rules around dma_resv are different than I > thought, but this change should have raised more eyebrows. > > The problem is this completely broke interop with all drivers using TTM and > I think might even explain some bug reports. 
> > I re-introduced the moving fence by adding bo->moving a few years after the > initial introduction of dma_resv, but that was just to work around > performance problems introduced by using the exclusive fence for both use > cases. Ok that part is indeed not something I've known. > > If amdgpu (or any other ttm based driver) hands back an sgt without > > waiting for ttm_bo->moving or the exclusive fence first, then that's a > > bug we need to fix in these drivers. But if ttm based drivers did get > > this wrong, then they got this wrong both before and after the switch > > over to using dma_resv - this bug would go back all the way to Dave's > > introduction of drm_prime.c and support for that. > > I'm not 100% sure, but I think before the switch to the dma_resv object > drivers just waited for the BOs to become idle and that should have > prevented this. > > Anyway let's stop discussing history and move forward. Sending patches for > all affected TTM drivers with CC: stable tags in a minute. > > > > The only thing which importers have to do is not wreck the DAG nature > > of the dma_resv fences and drop dependencies. Currently there's a > > handful of drivers which break this (introduced over the last few > > years), and I have it somewhere on my todo list to audit them all. > > Please give that some priority. > > Ignoring the moving fence is an information leak, but messing up the DAG > gives you access to freed up memory. Yeah will try to. I've also been hung up a bit on how to fix that, but I think just closing the DAG-breakage is simplest. Any userspace which then complains about the additional sync that causes would then be motivated to look into the import ioctl Jason has. And I think the impact in practice should be minimal, aside from some corner cases. > > The goal with extracting dma_resv from ttm was to make implicit sync > > working and get rid of some terrible stalls on the userspace side.
> > Eventually it was also the goal to make truly dynamic buffer > > reservation possible, but that took another 6 or so years to realize > > with your work. And we had to make dynamic dma-buf very much opt-in, > > because auditing all the users is very hard work and no one > > volunteered. And for dynamic dma-buf the rule is that the exclusive > > fence must _never_ be ignored, and the two drivers supporting it (mlx5 > > and amdgpu) obey that. > > > > So yeah for ttm drivers dma_resv is primarily for memory management, > > with a side effect of also
Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)
On Fri, Jun 18, 2021 at 8:02 PM Christian König wrote: > > Am 18.06.21 um 19:20 schrieb Daniel Vetter: > > On Fri, Jun 18, 2021 at 6:43 PM Christian König > > wrote: > >> Am 18.06.21 um 17:17 schrieb Daniel Vetter: > >>> [SNIP] > >>> Ignoring _all_ fences is officially ok for pinned dma-buf. This is > >>> what v4l does. Aside from it's definitely not just i915 that does this > >>> even on the drm side, we have a few more drivers nowadays. > >> No it seriously isn't. If drivers are doing this they are more than broken. > >> > >> See the comment in dma-resv.h > >> > >>* Based on bo.c which bears the following copyright notice, > >>* but is dual licensed: > >> > >> > >> > >> The handling in ttm_bo.c is and always was that the exclusive fence is > >> used for buffer moves. > >> > >> As I said multiple times now the *MAIN* purpose of the dma_resv object > >> is memory management and *NOT* synchronization. > >> > >> Those restrictions come from the original design of TTM where the > >> dma_resv object originated from. > >> > >> The resulting consequences are that: > >> > >> a) If you access the buffer without waiting for the exclusive fence you > >> run into a potential information leak. > >> We kind of let that slip for V4L since they only access the buffers > >> for writes, so you can't do any harm there. > >> > >> b) If you overwrite the exclusive fence with a new one without waiting > >> for the old one to signal you open up the possibility for userspace to > >> access freed up memory. > >> This is a complete show stopper since it means that taking over the > >> system is just a typing exercise. > >> > >> > >> What you have done by allowing this in is ripping open a major security > >> hole for any DMA-buf import in i915 from all TTM based driver. > >> > >> This needs to be fixed ASAP, either by waiting in i915 and all other > >> drivers doing this for the exclusive fence while importing a DMA-buf or > >> by marking i915 and all other drivers as broken. 
> >> > >> Sorry, but if you allowed that in you seriously have no idea what you > >> are talking about here and where all of this originated from. > > Dude, get a grip, seriously. dma-buf landed in 2011 > > > > commit d15bd7ee445d0702ad801fdaece348fdb79e6581 > > Author: Sumit Semwal > > Date: Mon Dec 26 14:53:15 2011 +0530 > > > > dma-buf: Introduce dma buffer sharing mechanism > > > > and drm prime landed in the same year > > > > commit 3248877ea1796915419fba7c89315fdbf00cb56a > > (airlied/drm-prime-dmabuf-initial) > > Author: Dave Airlie > > Date: Fri Nov 25 15:21:02 2011 + > > > > drm: base prime/dma-buf support (v5) > > > > dma-resv was extracted much later > > > > commit 786d7257e537da0674c02e16e3b30a44665d1cee > > Author: Maarten Lankhorst > > Date: Thu Jun 27 13:48:16 2013 +0200 > > > > reservation: cross-device reservation support, v4 > > > > Maarten's patch only extracted the dma_resv stuff so it's there, > > optionally. There was never any effort to roll this out to all the > > existing drivers, of which there were plenty. > > > > It is, and has been since 10 years, totally fine to access dma-buf > > without looking at any fences at all. From your pov of a ttm driver > > dma-resv is mainly used for memory management and not sync, but I > > think that's also due to some reinterpretation of the actual sync > > rules on your side. For everyone else the dma_resv attached to a > > dma-buf has been about implicit sync only, nothing else. > > No, that was way before my time. 
> > The whole thing was introduced with this commit here: > > commit f2c24b83ae90292d315aa7ac029c6ce7929e01aa > Author: Maarten Lankhorst > Date: Wed Apr 2 17:14:48 2014 +0200 > > drm/ttm: flip the switch, and convert to dma_fence > > Signed-off-by: Maarten Lankhorst > > int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo, > > - bo->sync_obj = driver->sync_obj_ref(sync_obj); > + reservation_object_add_excl_fence(bo->resv, fence); > if (evict) { > > Maarten replaced the bo->sync_obj reference with the dma_resv exclusive > fence. > > This means that we need to apply the sync_obj semantic to all drivers > using a DMA-buf with its dma_resv object, otherwise you br
Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)
Sorry for the mobile reply, but V4L2 is absolutely not write-only; there has never been an intersection of V4L2 supporting dmabuf and not supporting reads.

I see your point about the heritage of dma_resv but it’s a red herring. It doesn’t matter who’s right, or who was first, or where the code was extracted from.

It’s well defined that amdgpu defines resv to be one thing, that every other non-TTM user defines it to be something very different, and that the other TTM users define it to be something in the middle.

We’ll never get to anything workable if we keep arguing who’s right. Everyone is wrong, because dma_resv doesn’t globally mean anything.

It seems clear that there are three classes of synchronisation barrier (not using the ‘f’ word here), in descending exclusion order:
- memory management barriers (amdgpu exclusive fence / ttm_bo->moving)
- implicit synchronisation write barriers (everyone else’s exclusive fences, amdgpu’s shared fences)
- implicit synchronisation read barriers (everyone else’s shared fences, also amdgpu’s shared fences sometimes)

I don’t see a world in which these three uses can be reduced to two slots. What also isn’t clear to me though, is how the memory-management barriers can exclude all other access in the original proposal with purely userspace CS. Retaining the three separate modes also seems like a hard requirement to not completely break userspace, but then I don’t see how three separate slots would work if they need to be temporally ordered. amdgpu fixed this by redefining the meaning of the two slots, others fixed this by not doing one of the three modes.

So how do we square the circle without encoding a DAG into the kernel? Do the two slots need to become a single list which is ordered by time + ‘weight’ and flattened whenever modified? Something else?

Have a great weekend.
-d > On 18 Jun 2021, at 5:43 pm, Christian König wrote: > > Am 18.06.21 um 17:17 schrieb Daniel Vetter: >> [SNIP] >> Ignoring _all_ fences is officially ok for pinned dma-buf. This is >> what v4l does. Aside from it's definitely not just i915 that does this >> even on the drm side, we have a few more drivers nowadays. > > No it seriously isn't. If drivers are doing this they are more than broken. > > See the comment in dma-resv.h > > * Based on bo.c which bears the following copyright notice, > * but is dual licensed: > > > > The handling in ttm_bo.c is and always was that the exclusive fence is used > for buffer moves. > > As I said multiple times now the *MAIN* purpose of the dma_resv object is > memory management and *NOT* synchronization. > > Those restrictions come from the original design of TTM where the dma_resv > object originated from. > > The resulting consequences are that: > > a) If you access the buffer without waiting for the exclusive fence you run > into a potential information leak. > We kind of let that slip for V4L since they only access the buffers for > writes, so you can't do any harm there. > > b) If you overwrite the exclusive fence with a new one without waiting for > the old one to signal you open up the possibility for userspace to access > freed up memory. > This is a complete show stopper since it means that taking over the > system is just a typing exercise. > > > What you have done by allowing this in is ripping open a major security hole > for any DMA-buf import in i915 from all TTM based driver. > > This needs to be fixed ASAP, either by waiting in i915 and all other drivers > doing this for the exclusive fence while importing a DMA-buf or by marking > i915 and all other drivers as broken. > > Sorry, but if you allowed that in you seriously have no idea what you are > talking about here and where all of this originated from. > > Regards, > Christian. 
___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
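As a rough illustration of the three barrier classes listed in the mail above, the Python sketch below models which pending barriers a given access would have to wait on. It is a toy model under one reading of "descending exclusion order" (memory-management barriers block everything, write barriers block implicit-sync readers and writers, read barriers block only writers), not kernel code. The names `Barrier`, `barriers_to_wait_on` and the access labels are made up for this example.

```python
from enum import IntEnum


class Barrier(IntEnum):
    """Barrier classes, with a higher value meaning stronger exclusion."""
    READ = 1    # implicit-sync read barrier
    WRITE = 2   # implicit-sync write barrier
    MEMORY = 3  # memory-management barrier (e.g. a buffer move)


# Weakest barrier class each kind of access still has to wait for.
# "no-implicit-sync" stands for a submission that opts out of implicit sync;
# in this model it may skip read/write barriers but never memory management.
_THRESHOLD = {
    "write": Barrier.READ,
    "read": Barrier.WRITE,
    "no-implicit-sync": Barrier.MEMORY,
}


def barriers_to_wait_on(access, pending):
    """Return the (name, Barrier) entries that `access` must wait for.

    `pending` is a list of (name, Barrier) tuples in submission order.
    """
    threshold = _THRESHOLD[access]
    return [entry for entry in pending if entry[1] >= threshold]
```

With three pending entries (a move, a render and a scanout read), a writer waits for all three, a reader skips only the scanout read, and an access that opts out of implicit sync still waits for the move. That is why two slots are awkward: the third case needs a class of its own.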
Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)
On Fri, Jun 18, 2021 at 6:43 PM Christian König wrote: > > Am 18.06.21 um 17:17 schrieb Daniel Vetter: > > [SNIP] > > Ignoring _all_ fences is officially ok for pinned dma-buf. This is > > what v4l does. Aside from it's definitely not just i915 that does this > > even on the drm side, we have a few more drivers nowadays. > > No it seriously isn't. If drivers are doing this they are more than broken. > > See the comment in dma-resv.h > > * Based on bo.c which bears the following copyright notice, > * but is dual licensed: > > > > The handling in ttm_bo.c is and always was that the exclusive fence is > used for buffer moves. > > As I said multiple times now the *MAIN* purpose of the dma_resv object > is memory management and *NOT* synchronization. > > Those restrictions come from the original design of TTM where the > dma_resv object originated from. > > The resulting consequences are that: > > a) If you access the buffer without waiting for the exclusive fence you > run into a potential information leak. > We kind of let that slip for V4L since they only access the buffers > for writes, so you can't do any harm there. > > b) If you overwrite the exclusive fence with a new one without waiting > for the old one to signal you open up the possibility for userspace to > access freed up memory. > This is a complete show stopper since it means that taking over the > system is just a typing exercise. > > > What you have done by allowing this in is ripping open a major security > hole for any DMA-buf import in i915 from all TTM based driver. > > This needs to be fixed ASAP, either by waiting in i915 and all other > drivers doing this for the exclusive fence while importing a DMA-buf or > by marking i915 and all other drivers as broken. > > Sorry, but if you allowed that in you seriously have no idea what you > are talking about here and where all of this originated from. Dude, get a grip, seriously. 
dma-buf landed in 2011

commit d15bd7ee445d0702ad801fdaece348fdb79e6581
Author: Sumit Semwal
Date: Mon Dec 26 14:53:15 2011 +0530

    dma-buf: Introduce dma buffer sharing mechanism

and drm prime landed in the same year

commit 3248877ea1796915419fba7c89315fdbf00cb56a (airlied/drm-prime-dmabuf-initial)
Author: Dave Airlie
Date: Fri Nov 25 15:21:02 2011 +

    drm: base prime/dma-buf support (v5)

dma-resv was extracted much later

commit 786d7257e537da0674c02e16e3b30a44665d1cee
Author: Maarten Lankhorst
Date: Thu Jun 27 13:48:16 2013 +0200

    reservation: cross-device reservation support, v4

Maarten's patch only extracted the dma_resv stuff so it's there, optionally. There was never any effort to roll this out to all the existing drivers, of which there were plenty.

It is, and has been since 10 years, totally fine to access dma-buf without looking at any fences at all. From your pov of a ttm driver dma-resv is mainly used for memory management and not sync, but I think that's also due to some reinterpretation of the actual sync rules on your side. For everyone else the dma_resv attached to a dma-buf has been about implicit sync only, nothing else.

_only_ when you have a dynamic importer/exporter can you assume that the dma_resv fences must actually be obeyed. That's one of the reasons why we had to make this a completely new mode (the other one was locking, but they really tie together).

Wrt your problems:

a) needs to be fixed in drivers exporting buffers and failing to make sure the memory is there by the time dma_buf_map_attachment returns.

b) needs to be fixed in the importers, and there's quite a few of those. There's more than i915 here, which is why I think we should have the dma_resv_add_shared_exclusive helper extracted from amdgpu. Avoids hand-rolling this about 5 times (6 if we include the import ioctl from Jason).

Also I've like been trying to explain this ever since the entire dynamic dma-buf thing started.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
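Problem b) in this thread boils down to one rule: the exclusive fence must never be replaced with something that can signal before the old fence. The Python sketch below is a toy model of that rule only. It is not the amdgpu code or the helper proposed above, and `Fence`, `CompositeFence` and `Resv` are hypothetical names invented for illustration.

```python
class Fence:
    """A trivially simple fence: unsignaled until signal() is called."""

    def __init__(self, name):
        self.name = name
        self._signaled = False

    def signal(self):
        self._signaled = True

    def is_signaled(self):
        return self._signaled


class CompositeFence:
    """Signals only once every fence it wraps has signaled."""

    def __init__(self, *fences):
        self.fences = fences

    def is_signaled(self):
        return all(f.is_signaled() for f in self.fences)


class Resv:
    """Toy reservation object with a single exclusive slot."""

    def __init__(self):
        self.exclusive = None

    def replace_exclusive_unsafe(self, fence):
        # Drops the dependency on the old fence: the new exclusive fence
        # may signal while the old one (say, a buffer move) is still pending.
        self.exclusive = fence

    def replace_exclusive_keeping_deps(self, fence):
        # Keeps the old fence as a dependency, so the slot cannot appear
        # signaled before the previous occupant has signaled.
        old = self.exclusive
        if old is not None and not old.is_signaled():
            fence = CompositeFence(old, fence)
        self.exclusive = fence
```

With the unsafe variant, a waiter that only looks at the exclusive slot would see it signaled as soon as the new submission finishes, even if an earlier move is still in flight. The dependency-preserving variant keeps that ordering intact without needing to block at submission time.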
Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)
On Fri, Jun 18, 2021 at 4:42 PM Christian König wrote: > > Am 18.06.21 um 16:31 schrieb Daniel Vetter: > > [SNIP] > >> And that drivers choose to ignore the exclusive fence is an absolutely > >> no-go from a memory management and security point of view. Exclusive > >> access means exclusive access. Ignoring that won't work. > > Yeah, this is why I've been going all over the place about lifting > > ttm_bo->moving to dma_resv. And also that I flat out don't trust your > > audit, if you havent found these drivers then very clearly you didn't > > audit much at all :-) > > I just didn't though that anybody could be so stupid to allow such a > thing in. > > >> The only thing which saved us so far is the fact that drivers doing this > >> are not that complex. > >> > >> BTW: How does it even work? I mean then you would run into the same > >> problem as amdgpu with its page table update fences, e.g. that your > >> shared fences might signal before the exclusive one. > > So we don't ignore any fences when we rip out the backing storage. > > > > And yes there's currently a bug in all these drivers that if you set > > both the "ignore implicit fences" and the "set the exclusive fence" > > flag, then we just break this. Which is why I think we want to have a > > dma_fence_add_shared_exclusive() helper extracted from your amdgpu > > code, which we can then use everywhere to plug this. > > Daniel are you realizing what you are talking about here? Does that also > apply for imported DMA-bufs? > > If yes than that is a security hole you can push an elephant through. > > Can you point me to the code using that? > > >>> For dma-buf this isn't actually a problem, because dma-buf are pinned. You > >>> can't move them while other drivers are using them, hence there's not > >>> actually a ttm_bo->moving fence we can ignore. > >>> > >>> p2p dma-buf aka dynamic dma-buf is a different beast, and i915 (and fwiw > >>> these other drivers) need to change before they can do dynamic dma-buf. 
> >>> > >>>> Otherwise we have an information leak worth a CVE and that is certainly > >>>> not > >>>> something we want. > >>> Because yes otherwise we get a CVE. But right now I don't think we have > >>> one. > >> Yeah, agree. But this is just because of coincident and not because of > >> good engineering :) > > Well the good news is that I think we're now talking slightly less > > past each another than the past few weeks :-) > > > >>> We do have a quite big confusion on what exactly the signaling ordering is > >>> supposed to be between exclusive and the collective set of shared fences, > >>> and there's some unifying that needs to happen here. But I think what > >>> Jason implements here in the import ioctl is the most defensive version > >>> possible, so really can't break any driver. It really works like you have > >>> an ad-hoc gpu engine that does nothing itself, but waits for the current > >>> exclusive fence and then sets the exclusive fence with its "CS" completion > >>> fence. > >>> > >>> That's imo perfectly legit use-case. > >> The use case is certainly legit, but I'm not sure if merging this at the > >> moment is a good idea. > >> > >> Your note that drivers are already ignoring the exclusive fence in the > >> dma_resv object was eye opening to me. And I now have the very strong > >> feeling that the synchronization and the design of the dma_resv object > >> is even more messy then I thought it is. > >> > >> To summarize we can be really lucky that it didn't blow up into our > >> faces already. > > I don't think there was that much luck involved (ok I did find a > > possible bug in i915 already around cpu cache flushing) - for SoC the > > exclusive slot in dma_resv really is only used for implicit sync and > > nothing else. The fun only starts when you throw in pipelined backing > > storage movement. 
> > > > I guess this also explains why you just seemed to ignore me when I was > > asking for a memory management exclusive fence for the p2p stuff, or > > some other way to specifically handling movements (like ttm_bo->moving > > or whatever it is). From my pov we clearly needed that to make p2p > > dma-buf work well enough, mixing up the memory management exclusive > > slot with the implicit sync exclusive slot never looked like a bright
Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)
On Fri, Jun 18, 2021 at 11:15 AM Christian König wrote: > > Am 17.06.21 um 21:58 schrieb Daniel Vetter: > > On Thu, Jun 17, 2021 at 09:37:36AM +0200, Christian König wrote: > >> [SNIP] > >>> But, to the broader point, maybe? I'm a little fuzzy on exactly where > >>> i915 inserts and/or depends on fences. > >>> > >>>> When you combine that with complex drivers which use TTM and buffer > >>>> moves underneath you can construct an information leak using this and > >>>> give userspace access to memory which is allocated to the driver, but > >>>> not yet initialized. > >>>> > >>>> This way you can leak things like page tables, passwords, kernel data > >>>> etc... in large amounts to userspace and is an absolutely no-go for > >>>> security. > >>> Ugh... Unfortunately, I'm really out of my depth on the implications > >>> going on here but I think I see your point. > >>> > >>>> That's why I'm said we need to get this fixed before we upstream this > >>>> patch set here and especially the driver change which is using that. > >>> Well, i915 has had uAPI for a while to ignore fences. > >> Yeah, exactly that's illegal. > > You're a few years too late with closing that barn door. The following > > drives have this concept > > - i915 > > - msm > > - etnaviv > > > > Because you can't write a competent vulkan driver without this. > > WHAT? ^^ > > > This was discussed at absolute epic length in various xdcs iirc. We did > > ignore a > > bit the vram/ttm/bo-moving problem because all the people present were > > hacking on integrated gpu (see list above), but that just means we need to > > treat the ttm_bo->moving fence properly. > > I should have visited more XDCs in the past, the problem is much larger > than this. > > But I now start to understand what you are doing with that design and > why it looks so messy to me, amdgpu is just currently the only driver > which does Vulkan and complex memory management at the same time. 
> > >> At least the kernel internal fences like moving or clearing a buffer object > >> needs to be taken into account before a driver is allowed to access a > >> buffer. > > Yes i915 needs to make sure it never ignores ttm_bo->moving. > > No, that is only the tip of the iceberg. See TTM for example also puts > fences which drivers needs to wait for into the shared slots. Same thing > for use cases like clear on release etc > > From my point of view the main purpose of the dma_resv object is to > serve memory management, synchronization for command submission is just > a secondary use case. > > And that drivers choose to ignore the exclusive fence is an absolutely > no-go from a memory management and security point of view. Exclusive > access means exclusive access. Ignoring that won't work. Yeah, this is why I've been going all over the place about lifting ttm_bo->moving to dma_resv. And also that I flat out don't trust your audit, if you havent found these drivers then very clearly you didn't audit much at all :-) > The only thing which saved us so far is the fact that drivers doing this > are not that complex. > > BTW: How does it even work? I mean then you would run into the same > problem as amdgpu with its page table update fences, e.g. that your > shared fences might signal before the exclusive one. So we don't ignore any fences when we rip out the backing storage. And yes there's currently a bug in all these drivers that if you set both the "ignore implicit fences" and the "set the exclusive fence" flag, then we just break this. Which is why I think we want to have a dma_fence_add_shared_exclusive() helper extracted from your amdgpu code, which we can then use everywhere to plug this. > > For dma-buf this isn't actually a problem, because dma-buf are pinned. You > > can't move them while other drivers are using them, hence there's not > > actually a ttm_bo->moving fence we can ignore. 
> > > > p2p dma-buf aka dynamic dma-buf is a different beast, and i915 (and fwiw > > these other drivers) need to change before they can do dynamic dma-buf. > > > >> Otherwise we have an information leak worth a CVE and that is certainly not > >> something we want. > > Because yes otherwise we get a CVE. But right now I don't think we have > > one. > > Yeah, agree. But this is just because of coincident and not because of > good engineering :) Well the good news is that I think we're now talking slightly less past each another than the p
Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)
On Thu, Jun 17, 2021 at 09:37:36AM +0200, Christian König wrote: > Am 16.06.21 um 20:30 schrieb Jason Ekstrand: > > On Tue, Jun 15, 2021 at 3:41 AM Christian König > > wrote: > > > Hi Jason & Daniel, > > > > > > maybe I should explain once more where the problem with this approach is > > > and why I think we need to get that fixed before we can do something > > > like this here. > > > > > > To summarize what this patch here does is that it copies the exclusive > > > fence and/or the shared fences into a sync_file. This alone is totally > > > unproblematic. > > > > > > The problem is what this implies. When you need to copy the exclusive > > > fence to a sync_file then this means that the driver is at some point > > > ignoring the exclusive fence on a buffer object. > > Not necessarily. Part of the point of this is to allow for CPU waits > > on a past point in buffers timeline. Today, we have poll() and > > GEM_WAIT both of which wait for the buffer to be idle from whatever > > GPU work is currently happening. We want to wait on something in the > > past and ignore anything happening now. > > Good point, yes that is indeed a valid use case. > > > But, to the broader point, maybe? I'm a little fuzzy on exactly where > > i915 inserts and/or depends on fences. > > > > > When you combine that with complex drivers which use TTM and buffer > > > moves underneath you can construct an information leak using this and > > > give userspace access to memory which is allocated to the driver, but > > > not yet initialized. > > > > > > This way you can leak things like page tables, passwords, kernel data > > > etc... in large amounts to userspace and is an absolutely no-go for > > > security. > > Ugh... Unfortunately, I'm really out of my depth on the implications > > going on here but I think I see your point. > > > > > That's why I'm said we need to get this fixed before we upstream this > > > patch set here and especially the driver change which is using that. 
> > Well, i915 has had uAPI for a while to ignore fences. > > Yeah, exactly that's illegal. You're a few years too late with closing that barn door. The following drives have this concept - i915 - msm - etnaviv Because you can't write a competent vulkan driver without this. This was discussed at absolute epic length in various xdcs iirc. We did ignore a bit the vram/ttm/bo-moving problem because all the people present were hacking on integrated gpu (see list above), but that just means we need to treat the ttm_bo->moving fence properly. > At least the kernel internal fences like moving or clearing a buffer object > needs to be taken into account before a driver is allowed to access a > buffer. Yes i915 needs to make sure it never ignores ttm_bo->moving. For dma-buf this isn't actually a problem, because dma-buf are pinned. You can't move them while other drivers are using them, hence there's not actually a ttm_bo->moving fence we can ignore. p2p dma-buf aka dynamic dma-buf is a different beast, and i915 (and fwiw these other drivers) need to change before they can do dynamic dma-buf. > Otherwise we have an information leak worth a CVE and that is certainly not > something we want. Because yes otherwise we get a CVE. But right now I don't think we have one. We do have a quite big confusion on what exactly the signaling ordering is supposed to be between exclusive and the collective set of shared fences, and there's some unifying that needs to happen here. But I think what Jason implements here in the import ioctl is the most defensive version possible, so really can't break any driver. It really works like you have an ad-hoc gpu engine that does nothing itself, but waits for the current exclusive fence and then sets the exclusive fence with its "CS" completion fence. That's imo perfectly legit use-case. Same for the export one. 
Waiting for a previous snapshot of implicit fences is imo perfectly ok use-case and useful for compositors - client might soon start more rendering, and on some drivers that always results in the exclusive slot being set, so if you dont take a snapshot you oversync real bad for your atomic flip. > > Those changes are years in the past. If we have a real problem here (not > > sure on > > that yet), then we'll have to figure out how to fix it without nuking > > uAPI. > > Well, that was the basic idea of attaching flags to the fences in the > dma_resv object. > > In other words you clearly denote when you have to wait for a fence before > accessing a buffer or you cause a security issue. Replied somewhere else, and I do kinda l
Re: weston drm option "--connector" in previous version
Hi RyunHyeon,

On Tue, 8 Jun 2021 at 11:55, 김륜현 (RH Kim) wrote:
> Hello, I want to output weston to specific display. So I searched about
> Weston option.
>
> I found "--connector" option.
> This option configures specific display for output
>
> However, I checked the mailing list that said would remove the
> "--connector" option from the previous version of Weston.
>
> So, is there no option to output to a specific display among many displays
> in the Weston 8.0.0 that I am currently using? (Is this the part that needs
> code modification?)

You should instead configure this through weston.ini, for example:

    [output]
    name=HDMI-A-1
    mode=1920x1080

    [output]
    name=eDP-1
    mode=off

You can find more details with 'man weston.ini'.

Cheers,
Daniel
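If the list of connectors varies between machines, the configuration above could be generated rather than written by hand. The Python sketch below emits one `[output]` section per connector and sets `mode=off` on everything except the chosen one. The function name and its arguments are assumptions made for this example, and the connector names must match what the system actually reports.

```python
def single_output_ini(target, mode, connectors):
    """Build weston.ini [output] sections that enable only `target`.

    target:     connector name to keep enabled, e.g. "HDMI-A-1"
    mode:       mode string for the enabled connector, e.g. "1920x1080"
    connectors: every connector name that should get a section
    """
    sections = []
    for name in connectors:
        # Only the requested connector keeps a real mode; the rest are turned off.
        value = mode if name == target else "off"
        sections.append(f"[output]\nname={name}\nmode={value}\n")
    # Separate sections with a blank line, as in a hand-written weston.ini.
    return "\n".join(sections)
```

Calling `single_output_ini("HDMI-A-1", "1920x1080", ["HDMI-A-1", "eDP-1"])` yields the same two sections as the example in the mail above.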
Re: Wayland and window position/size
Hi, On Wed, 26 May 2021 at 10:30, Pekka Paalanen wrote: > On Tue, 25 May 2021 22:10:38 -0500 Igor Korot wrote: > > > Positioning - Don't position and Wayland discourages it by not having > > > such an > > > API. Sizing - do whatever you like. > > > > It just discourages it, so it is not completely impossible, correct? > > I'd say Wayland strongly discourages client dictated positioning. > > First, there is no Wayland protocol interface that would allow you to > position a window (unless you invent a protocol extension to do it and > then implement it also in any Wayland compositors you want to run on - > some extensions like that exist and their support in different > compositors varies, and they are mostly privileged or reserved for > desktop components instead of apps). It might be possible to play > tricks with existing generic interfaces to *maybe* eventually end up in > some position, but that is extremely fragile and an outright abuse which > might also cause strange UI behaviour. > > Second, Wayland does not define a coordinate system that would be > useful for window positioning. Every window has its own local > coordinate system, but they exist in a "vacuum" and independently of > each other and of monitors. So at most you can position a window > respective to another window, but not globally or per-monitor. > > Third, Wayland does not allow you to find out about other windows or > desktop elements that you might want to stay clear of, nor about > monitor edges (well... not for this purpose). So it's hard to choose > your position properly. > > So, is it possible depends on how badly you are willing to break things > to get there. > > As for sizing, I'd think the xdg-shell protocol extensions are mature. > They allow you to freely size your window when the window state > supports it, and they include the provision for the display server to > tell you what size you should or must be, depending on window state. 
It also has an interface for positioning your popups such that they don't go out of view but also match where they should be respective to your window.

Pekka's answer is very thorough. As a much shorter version: just use the XDG extensions which already exist for popup/dialog windows.

Your application is not the only one which needs to request credentials - so do browsers, mail clients, file managers, and just about everything else.

The X11 model is that every application has its own semantics for doing this and decides exactly how to place the window. The client tells the server: place this window at these co-ordinates, at this z position, and give me all the input until I tell you otherwise.

The Wayland model is that you tell the Wayland server that you would like to present a dialog or popup window, and which top-level window it should be attached to. It then handles those windows in a completely consistent and uniform way between all your applications, including positioning, stacking, and focus.

So, just use that. It works. :)

Cheers,
Daniel
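The division of knowledge described above can be sketched in a few lines of Python. The client supplies only window-local information (where the popup attaches on its parent and how big it is), while the compositor alone knows the parent's global position and the output size, and slides the popup to keep it in view. This is a simplified toy model, not the xdg-shell protocol or any compositor's implementation; `place_popup` and its parameters are hypothetical.

```python
def place_popup(parent_pos, anchor, popup_size, output_size):
    """Compute a popup's global top-left corner, compositor side.

    parent_pos:  (x, y) of the parent window in global space; known only
                 to the compositor, never to the client
    anchor:      (x, y) attachment point in the parent's local coordinates,
                 supplied by the client
    popup_size:  (width, height) of the popup, supplied by the client
    output_size: (width, height) of the output; known to the compositor
    """
    parent_x, parent_y = parent_pos
    anchor_x, anchor_y = anchor
    width, height = popup_size
    out_w, out_h = output_size

    # Start where the client asked, relative to its own window.
    x = parent_x + anchor_x
    y = parent_y + anchor_y

    # Slide the popup back inside the output if it would stick out.
    x = max(0, min(x, out_w - width))
    y = max(0, min(y, out_h - height))
    return x, y
```

The client never sees the returned global coordinates. It only learns where the popup ended up relative to its parent, which is all it needs to draw correctly.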
Re: Change default Wayland branches to 'main'
Hi all,

On Thu, 8 Apr 2021 at 12:20, Daniel Stone wrote:
> I propose that we do this for all the wayland/* repositories, either this
> weekend or next; I'm happy to make the changes (rename 'master' to 'main'
> and retarget all open MRs). Does anyone have any opinions or suggestions?

Astute observers will notice that multiple weekends passed, but it's now been done. All MRs against the various repos (wayland, wayland-protocols, weston, etc) have been retargeted.

Cheers,
Daniel
Re: Split render/display SoCs, Mesa's renderonly, and Wayland dmabuf hints
Just 2 comments on the kernel aspects here. On Tue, Apr 20, 2021 at 12:18 PM Daniel Stone wrote: > > Hi, > > On Mon, 19 Apr 2021 at 13:06, Simon Ser wrote: >> >> I'm working on a Wayland extension [1] that, among other things, allows >> compositors to advertise the preferred device to be used by Wayland >> clients. >> >> In general, compositors will send a render node. However, in the case >> of split render/display SoCs, things get a little bit complicated. >> >> [...] > > > Thanks for the write-up Simon! > >> >> There are a few solutions: >> >> 1. Require compositors to discover the render device by trying to import >>a buffer. For each available render device, the compositor would >>allocate a buffer, export it as a DMA-BUF, import it to the >>display-only device, then try to drmModeAddFB. > > > I don't think this is actually tractable? Assuming that 'allocate a buffer' > means 'obtain a gbm_device for the render node directly and allocate a gbm_bo > from it', even with compatible formats and modifiers this will fail for more > restrictive display hardware. imx-drm and pl111 (combined with vc4 on some > Raspberry Pis) will fail this, since they'll take different allocation paths > when they're bound through kmsro vs. directly, accounting for things like > contiguous allocation. So we'd get false negatives on at least some platforms. > >> >> 2. Allow compositors to query the render device magically opened by >>kmsro. This could be done either via EGL_EXT_device_drm, or via a >>new EGL extension. > > > This would be my strong preference, and I don't entirely understand anholt's > pushback here. The way I see it, GBM is about allocation for scanout, and EGL > is about rendering. If, on a split GPU/display system, we create a gbm_device > from a KMS display-only device node, then creating an EGLDisplay from that > magically binds us to a completely different DRM GPU node, and anything using > that EGLDisplay will use that GPU device to render. 
> > Being able to discover the GPU device node through the device query is really > useful, because it tells us exactly what implicit magic EGL did under the > hood, and about the device that EGL will use. Being able to discover the > display node is much less useful; it does tell us how GBM will allocate > buffers, but the user already knows which device is in use because they > supplied it to GBM. I see the display node as a property of GBM, and the GPU > node as a property of EGL, even if EGL does do (*waves hands*) stuff under > the hood to ensure the two are compatible. > > If we had EGL_EXT_explicit_device, things get even more weird, I think; would > the device query on an EGLDisplay created with a combination of a gbm_device > native display handle and an explicit EGLDevice handle return the scanout > device from GBM or the GPU device from EGL? On my reading, I'd expect it to > be the latter; if the queries returned very different things based on whether > GPU device selection was implicit (returning the KMS node) or explicit (GPU > node), that would definitely violate the principle of least surprise. > >> >> 3. Allow compositors to query the kernel drivers to know which devices >> are compatible with each other. Some uAPI to query a compatible >> display device from a render-only device, or vice-versa, has been >> suggested in the past. > > > What does 'compatible' mean? Would an Intel iGPU and an AMD dGPU be > compatible with each other? Would a Mali GPU bound to system memory through > AMBA be as compatible with the display controller as it would with an AMD GPU > on PCIE? I think a query which only exposed whether or not devices could > share dmabufs with each other is far too generic to be helpful for the actual > usecase we have, as well as not being useful enough for other usecases ('well > you _can_ use dmabufs from your AMD GPU on your Mali GPU, but only if they > were allocated in the right domain'). 
> >> >> (1) has a number of limitations and gotchas. It requires allocating >> real buffers, this has a rather big cost for something done at >> compositor initialization time. It requires to select a buffer format >> and modifier compatible with both devices, so it can't be isolated in >> a simple function (and e.g. shared between all compositors in libdrm). > > > We're already going to have to do throwaway allocations to make Intel's tiled > modes work; I'd rather not extend this out to doing throwaway allocations > across device combinations as well as modifier lists. > >> >> Some drivers will allow to drm
Re: Split render/display SoCs, Mesa's renderonly, and Wayland dmabuf hints
sn't have GBM in the same way, right? In the Vulkan case, we already know exactly what the GPU is, because it's the VkPhysicalDevice you had to explicitly select to create the VkDevice etc; if you're using GBM it's because you've _also_ created a gbm_device for the KMS node and are allocating gbm_bos to import to VkDeviceMemory/VkImage, so you already have both pieces of information. (If you're creating VkDeviceMemory/VkImage in Vulkan then exporting dmabuf from there, since there's no way to specify a target device, it's a blind guess as to whether it'll actually work for KMS. Maybe it will! But maybe not.) > I don't know how feasible (3) is. The kernel drivers must be able to > decide whether buffers coming from another driver can be scanned out, > but how early can they give an answer? Can they give an answer solely > based on a DRM node, and not a DMA-BUF? > Maybe! But maybe not. > Feedback is welcome. Do you agree with the premise that compositors > need access to the render node? Yes, strongly. Compositors may optimise for direct paths (e.g. direct scanout of client buffers through KMS, directly providing client buffers to media codecs for streaming) where possible. But they must always have a 'device of last resort': if these optimal paths are not possible (your codec doesn't like your client buffers, you can't do direct scanout because a notification occluded your client content and you've run out of overlay planes, you're on Intel and your display FIFO size is measured in bits), the compositor needs to know that it can access the client buffers somehow. This is done by always importing into a GPU device - for most current compositors as an EGLImage, for some others as a VkImage - and falling back to GL composition paths, or GL blits, or even ReadPixels if strictly necessary, so your client content continues to be accessible. There is no way to do this without telling the client what that GPU device node is, so it can allocate accordingly. 
Thanks to the implicit device selection performed when creating an EGLDisplay from a gbm_device, we cannot currently discover what that device node is. > Do you have any other potential solution in mind? I can't think of any right now, but am open to hearing them. > Which solution would you prefer? For all the reasons above, strongly #2, i.e. that querying the DRM device node from the EGLDevice returned by querying an EGLDisplay created from a gbm_device, returns the GPU device's render node and not the KMS device's primary node. Cheers, Daniel ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Re: Change default Wayland branches to 'main'
On Thu, 8 Apr 2021 at 14:02, Jan Engelhardt wrote: > On Thursday 2021-04-08 13:20, Daniel Stone wrote: > >I propose that we do this for all the wayland/* repositories, either this > weekend or next; I'm happy > >to make the changes (rename 'master' to 'main' and retarget all open > MRs). Does anyone have any > >opinions or suggestions? > > That could be offensive to some people. Some might even be offended by not > being offended. > I had hoped that 'serious suggestions' was implicit, but maybe not. Do you have any serious suggestions?
Change default Wayland branches to 'main'
Hi all, Going with a lot of other Git-based projects (and following the leads of e.g. GitHub and GitLab), freedesktop.org is planning to change the default branch name for its new projects to 'main' rather than 'master'. Mesa is already migrating, and they have helpfully prepared a small Python script which will retarget open MRs from 'master' to 'main'. I propose that we do this for all the wayland/* repositories, either this weekend or next; I'm happy to make the changes (rename 'master' to 'main' and retarget all open MRs). Does anyone have any opinions or suggestions? Cheers, Daniel
Re: Managing surfaces with 2 different wl_displays
Hi Nimi, On Mon, 21 Dec 2020 at 18:50, Nimi Wariboko Jr. wrote: > The crux of the issue is that when we create our OpenGL context we create it > with a display connection that we initiate. Further down the line when we > need to create a wl_egl_window, we are given a wl_surface that was created by > a different call to wl_display_connect. Now physically these displays are the > same, but it seems when trying to use the foreign surface with our internal > wl_display causes this whole thing to not work. I get the following error in > Weston 8: > > [17:54:27.891] libwayland: invalid object (6), type > (zwp_relative_pointer_manager_v1), message attach(?oii) > [17:54:27.893] libwayland: error in client communication (pid 1340) > > My hunch is that calling certain wl_surface functions with a display other > than the one that created it is unsupported in weston. I'm primarily mailing > this list to confirm this is true and that there are no other possible work > arounds. You're right. It's not just Weston, but Wayland itself. Objects are local to each wl_display, and may not be shared between different instances of a wl_display, even if it's the same server at the other end of the connection. Cheers, Daniel
Re: about wl_display_poll
Hi Leo, On Mon, 9 Nov 2020 at 16:17, enpeng xu wrote: > I have a question about the functions wl_display_dispatch/wl_display_poll. > In the function wl_display_poll, it ignores EINTR and keeps polling until it > gets an event from the remote. > I am not sure if this is expected, but a typical use case is: the user sets up > a signal handler and runs an event loop by calling wl_display_dispatch_queue. > It expects the signal handler to set a stop flag and the event loop to abort > if the stop flag is true. But in practice, wl_display_poll will keep > being blocked and there is no way to exit the event loop. > > Is there any reason behind doing it? Or should we provide an > interruptible version of wl_display_dispatch? You can easily write your own interruptible/non-dominant version, you just have to expand what that function does internally: https://wayland.freedesktop.org/docs/html/apb.html#Client-classwl__display_1a40039c1169b153269a3dc0796a54ddb0 Cheers, Daniel
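A minimal sketch of the "expand it yourself" idea from the reply above, under stated assumptions: this is not libwayland's code, and the helper name `wait_readable_or_stop` and the stop flag are hypothetical. It only shows the poll step, returning to the caller when a signal handler has set the flag instead of silently retrying on EINTR. In a real client the fd would presumably come from wl_display_get_fd(), and the wait would sit between wl_display_prepare_read() and wl_display_read_events() (with wl_display_cancel_read() on the stop path), followed by wl_display_dispatch_pending().

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>

/*
 * Hypothetical helper, not part of libwayland.
 * Blocks until fd is readable or *stop becomes non-zero.
 * Returns 1 if fd is readable, 0 if the stop flag was seen,
 * and -1 on a poll() error other than EINTR.
 */
int wait_readable_or_stop(int fd, const volatile sig_atomic_t *stop)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    for (;;) {
        if (*stop)
            return 0;

        int ret = poll(&pfd, 1, -1);
        if (ret > 0)
            return 1;
        if (ret < 0 && errno != EINTR)
            return -1;
        /* EINTR: a signal arrived; loop back and re-check the flag. */
    }
}
```

One caveat with this shape: a signal delivered between the flag check and the poll() call is only noticed after the next wakeup. Blocking the signal and using ppoll() with a signal mask, or a self-pipe written from the handler, would close that window.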
Re: Trying to reduce boot time for weston and logo from weston
Hi, On Sat, 1 Aug 2020 at 06:28, Vadivelu Babu, Surendar (S.) wrote: > As Arun stated, we tried to boot the Weston application from initramfs; > however there is a “pam” library dependency which is required for Weston. When > we include all the “pam” libraries the initramfs size increases from 9 MB to > 32 MB. > > Following are the few things which we tried to remove the pam library from Weston: > > 1. PACKAGECONFIG_remove= "pam" in Weston bbappend file > 2. EXTRA_OECONF_append = " --disable-pam " in Weston bbappend file > > In both cases, Weston still requires the libpam library dependencies. > > Kindly advise whether it is possible to remove the pam libraries from the Weston > application, and if not, how we can achieve integration of Weston into > initramfs without increasing its size. Yocto (I believe OE core) carries a patch which removes PAM usage in Weston; you might want to pick that up. Cheers, Daniel
Re: Trying to reduce boot time for weston and logo from weston
Hi Arunkumar, On Mon, 27 Jul 2020 at 06:59, arunkrish20 wrote: > We are working with the i.MX8 platform. We are working with weston and DRM > backend. Below are the version details. > > NXP BSP Version: 4.14.98_2.0.0_ga > SC Firmware Version : 1.3.1 > wayland version 1.16 am > weston- ivi - 5.0.0 We are currently working our way towards releasing Weston 9.0.0, so this version is quite old. > Our requirement is to display the first screen in 2 Seconds. > > In the current environment we are able to see the first screen in the 6th > seconds. Ouch, that's quite long. > We tried to boot the weston in initramfs. But due to size constraints we are > not able to. Size comes around 65MB. > > Is there any possibility for reducing the size of weston? Weston itself with the DRM + GL backends only takes around 750kB once installed. I assume the 65MB comes from extra dependencies, but that is something you would have to investigate and configure in your Yocto build. Weston itself does not have many dependencies, and those dependencies are not large. > Weston taking 500ms to complete the initialization. Can we reduce this > timing? e.g if we block unwanted device probing etc, any idea? > > In case if we use separate drm based rendering application for the first > screen and switching to weston are seeing blank. Instead of clearing the > weston screen buffer can we have a logo on first rendering. so that blank can > be avoided between the first drm application to weston rendering. NXP has forked Weston and written their own backend, which is responsible for the initialisation (both the time and the blank screen). The default DRM backend is quite quick to come up and be responsive, and doesn't blank the screen. So both these issues are something you'd need to raise with NXP support, as they are due to NXP's changes to our code. Cheers, Daniel
Re: weston-info as a standalone utility
Hi, On Thu, 9 Jul 2020 at 15:38, Pekka Paalanen wrote: > On Thu, 9 Jul 2020 10:32:56 +0200 > Olivier Fourdan wrote: > > In the meantime, Peter has already submitted patches to wayland-info > > (thanks Peter!) so the tip of wayland-info is different from > > weston-info (basically, we have diverged already). > > > > Eventually, if nobody has objections, we could move that repo to the > > wayland domain… > > thanks for doing this, looks good! > > +1 for having this under the wayland organization in Gitlab. > +1 for deleting weston-info from Weston repository. > > Shall we keep the new repository only for "info" tools, or should > it contain more, like Weston's simple-shm, simple-egl, and a > rewrite of weston-eventdemo that doesn't use toytoolkit? > > I would be fine with moving all "simple" clients from Weston > repository to there if that's appropriate. +1 to all of the above. I'd be happy to see it in a utils and examples repo, with at least the ones you mentioned here. I don't think toytoolkit should ever be pushed in there, because then we run the danger of people thinking it might be a good idea. Cheers, Daniel
Re: [ANNOUNCE] Weston 9.0 release schedule
Hi, On Wed, 1 Jul 2020 at 20:13, Simon Ser wrote: > On Wednesday, July 1, 2020 8:49 PM, Jan Engelhardt wrote: > > Usecases.. checking for releases, both new and, sometimes historic research, > > old ones. > > > > A fileindex has a "tabular" appearance where each "row" contains filename > > and > > date, and that table be sorted primarily by filename with no extra > > grouping, so > > that wayland-* and weston-* releases are not interspersed. > > > > The release page is the antithesis to that: it presents items like a > > blog rather than a filelisting, grouped by date with a static > > reverse-date ordering, and grouping different filenames by date as > > well. Imagine if `ls -l` did that all the time. > > IMHO the releases page is enough for this use-case. The difference is > not worth the time figuring out how to enable file listing (which > requires having access to the infrastructure, and our sysadmins already > have too much on their plate). GitLab Pages doesn't do file listings, so it's not just a matter of enabling the thing in nginx, but we'd have to generate a listing index.html as part of our static site generation. To be honest, I think you're looking in the wrong place though. Every release of both Wayland and Weston always has a tag in git, and git tags are only ever used for releases. So if you want to find out more, just run `git tag -l` on each of the two repositories (or `git ls-remote --tags https://gitlab.freedesktop.org/wayland/(wayland|weston).git`) and have all the information you need? Cheers, Daniel
Re: Current state of Window Decorations
Hi, On Thu, 25 Jun 2020 at 10:01, Brad Robinson wrote: > As a toolkit developer coming from Windows/OSX this is fairly shocking. I'm > aware of the decoration protocol, but given it's not supported (and by the > sound of it never will be) on some of the major distros makes it almost > worthless. > > Seems odd to offload this responsibility to every toolkit without providing a > mechanism to achieve any consistency. Or... is this just my background in > Windows/OSX where consistency in these areas is really encouraged and this > just isn't expected on Linux? Others have said this well, but the big difference is that Windows and OS X both have complete native toolkits as part of their base platform. Those toolkits implement widgets, things like titlebars, IPC, intra-process message handling and signaling, internationalisation, application lifecycle management, etc etc. That's not something we have: toolkits are instead an optional and interchangeable component. If you use one of the major extant toolkits, then you can reuse all that functionality. This is why even some of the monoliths like Chromium reuse toolkits. If you want to create your own toolkit from scratch and not base on one of the existing ones, then for better or worse, you get to recreate all the functionality they provide. Shifting responsibility for window decorations to the compositor doesn't solve the problem. Yes the compositor would render them for you, but then you have additional protocols (with all the synchronisation issues that implies) for the client and compositor to co-ordinate rendering of the decorations and the content. Neither is objectively better or worse, it's just a different design which inherently brings different tradeoffs. Cheers, Daniel
Re: Language bindings for wl_registry_bind request
Hi, On Thu, 18 Jun 2020 at 07:25, Brad Robinson wrote: > I'm putting together a set of C# bindings for Wayland and it's coming along > nicely but I've hit an issue with wl_registry_bind where its implementation > doesn't seem to match the xml. > > The wayland.xml file declares it as: (essentially one input parameter - name) > > <request name="bind"> > <description summary="bind an object to the display"> > Binds a new, client-created object to the server using the > specified name as the identifier. > </description> > <arg name="name" type="uint" summary="unique numeric name of the object"/> > <arg name="id" type="new_id" summary="bounded object"/> > </request> > > But the C implementation has additional version and interface parameters and > uses the wl_proxy_marshal_constructor_versioned - with apparently no hints in > the xml as to why. > > [...] > > Similarly the xml file would suggest the message signature should be "un", > but the C bindings have it as "usun". > > What's going on here? Is this a special case for this one method? Yeah, new_id with no interface gets expanded to new_id + interface name + version: https://gitlab.freedesktop.org/wayland/wayland/-/blob/master/src/scanner.c#L1233 Theoretically it applies to anything with that property, but in reality the only user is wl_registry.bind. It is generally not recommended to write your own bindings from scratch, however. When you need to integrate with EGL, you need to pass a pointer to the struct wl_display * for an EGLDisplay, and to wl_surface * for an EGLSurface (via wl_egl_window). Mesa internally uses the C version of libwayland to send requests and receive events - which will obviously not work with the C# implementation. The same is true of Vulkan, GStreamer, and other libraries which integrate with Wayland. The recommended approach, avoiding these issues, is to wrap the Wayland library using the 'dispatcher' methods provided; there is more background here: https://smithay.github.io/wayland-rs-v-0-21.html Cheers, Daniel
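The expansion rule described in the reply above can be sketched roughly as follows. This is an illustration under stated assumptions, not the scanner's actual code: `struct arg_desc` and `build_signature` are hypothetical names, and only the three argument types needed for the example are handled. A new_id argument whose XML carries no interface attribute is emitted as interface name (s), version (u), then the id itself (n).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced model of a protocol argument. */
enum arg_type { ARG_UINT, ARG_STRING, ARG_NEW_ID };

struct arg_desc {
    enum arg_type type;
    const char *interface; /* NULL when the XML has no interface="..." */
};

/*
 * Builds the wire signature for n arguments into out (out_len bytes,
 * including the terminating NUL). Returns 0 on success, -1 if the
 * buffer is too small or an argument type is unknown.
 */
int build_signature(const struct arg_desc *args, size_t n,
                    char *out, size_t out_len)
{
    size_t pos = 0;

    if (out_len == 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        const char *piece;

        switch (args[i].type) {
        case ARG_UINT:
            piece = "u";
            break;
        case ARG_STRING:
            piece = "s";
            break;
        case ARG_NEW_ID:
            /* Untyped new_id: interface name and version precede the id. */
            piece = args[i].interface ? "n" : "sun";
            break;
        default:
            return -1;
        }

        size_t len = strlen(piece);
        if (pos + len + 1 > out_len)
            return -1;
        memcpy(out + pos, piece, len);
        pos += len;
    }

    out[pos] = '\0';
    return 0;
}
```

Feeding it the two arguments of wl_registry.bind (a uint followed by an untyped new_id) gives "usun", the signature seen in the C bindings, while a typed new_id such as the one in wl_compositor.create_surface stays a plain "n".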
Re: can subsurface and shell surface be used together to manage surfaces
Hi, On Mon, 27 Apr 2020 at 10:02, Pekka Paalanen wrote: > On Mon, 27 Apr 2020 15:07:20 +0800 zou lan wrote: > > I read some documents about chrome OS run Android Apks such as > > https://qiangbo-workspace.oss-cn-shanghai.aliyuncs.com/2019-09-10-chromeos-with-android-app/Arcpp_Graphics.pdf > > As far as I known, chrominum could run upon wayland, I just wondering how > > it handle Android windows on wayland. > > I think the surface of Android apks could be wayland surface in linux, the > > window could be the shell surface. > > Since all the android apks are still running on android container, android > > window manager will manage these windows, in wayland, the relationship of > > these surfaces should be parent- subsurface that map to android > > windows. That's a little of problem, as you are confirmed, one wl surface > > can't be both subsurface and shell surface. > > If each android apks are not subsurfaces, I am confused how Android to > > handle the input events from wayland. > > you'll have to ask or wait for someone who knows ARC++ to answer. I > don't dare extrapolate details based on that one simple PDF alone. > > Android window management is very different from desktop window > management, and I don't even know if CrOS window management is close > to either. Using custom Wayland extensions is always a possibility, it > happens even on the desktops, e.g. GNOME/GTK. > > Look at the slide titled "Chromium Wayland Interfaces", for instance. ARC++ is proprietary, and I haven't seen its source code either. But looking at https://github.com/chromium/chromium/blob/master/components/exo/wayland/protocol/aura-shell.xml I would very much expect that the ARC++ client implementation uses this as an extra to support Android applications running under Chromium - for example, the titlebar-colour request is certainly fulfilling an Android need. Integrating Android into a desktop system is non-trivial. 
You will have to make quite a lot of changes along the way: Android assumes that you have one, or maybe two, applications open, and a status bar, and maybe a button bar, and that's it. If you want to make Android behave more like a desktop, then you're going to have to change Android to fit your desktop, and you're going to have to change your desktop to accommodate Android. I believe the ARC++ solution of using multiple top-level windows, and having the window management be primarily done by the host compositor, is a better option than trying to use subsurfaces to invert responsibility and effectively control the window management from Android. However, there is no out-of-the-box solution. Whatever you do is going to require custom development and experimentation. 'SPURV' is a periodically-refreshed effort from Collabora to see what this integration would look like, however we never addressed the idea of having multiple active applications, as it requires too many changes in Android, such as deep changes to the Android activity manager to deal with more than one application being current at one time. Cheers, Daniel
Re: Getting Weston to use DRM/KMS planes
Hi Oliver, On Wed, 25 Mar 2020 at 10:31, Wohlmuth, Oliver wrote: > I just started to work with Wayland/Weston, so please forgive me if I ask > silly questions. > > I’m running Weston (8.0.0) on a custom ARM SoC using the DRM backend. As the > OpenGL > driver is not (yet) adapted for Wayland, I'm running Weston using the > '--use-pixman' option. > This works fine so far. > > As our DRM/KMS driver supports several HW planes, I was debugging into the > Weston > code trying to understand how Weston makes use of these HW planes (or what > needs > to be done to get Weston use the HW planes). If I understand the code > correctly: > > drm_fb_get_from_view() > { > ... > if (wl_shm_buffer_get(buffer->resource)) > return NULL; > ... > } > > Weston will never use HW planes for wl_shm_buffer, only for GBM or dmabuf type > buffer it will be used. > > > > - Is this correct? What is the reason for this? That's correct. The reason is that we need to be able to get a KMS framebuffer object with the pixel content from the client buffer in it. Effectively, the only way to import client content into a KMS framebuffer is via dmabuf; KMS has no method of creating framebuffers from an arbitrary pointer to user memory. And we need a framebuffer object in order to display anything on a plane. > - Any suggestions how to get HW planes used without OpenGL rendering? > > Currently I think of patching Weston or implement a Weston client that > uses dmabuf buffer. Any hint is appreciated that puts me on the most > promising path. There is no easy out-of-the-box path. The first place to start would be to patch a client to allocate a dmabuf through the udmabuf kernel API (the one in the mainline kernel tree, not the external module living on GitHub), or the vgem driver. Once you've done that, you will very quickly notice that Weston doesn't actually support the dmabuf extension when using the Pixman renderer. 
This would be possible to implement by implementing the import_dmabuf, query_dmabuf_formats, and query_dmabuf_modifiers hooks. You probably only want to declare support for the ARGB/XRGB formats and the LINEAR modifier. For import_dmabuf, you would want to call the DMA_BUF_IOCTL_SYNC call on the provided FD, mmap the dmabuf, then call pixman_image_create_bits_no_clear() to obtain a pixman_image which the renderer can use to source from the dmabuf's content. The reason we require the renderer to support dmabuf as well as the backend is for fallback: in case we can't display the client content on a KMS plane (which is not guaranteed, as the driver can reject planes for any reason or limitation), we need to be able to use the renderer as a fallback to show the content. Alternately, if your SoC has an Arm Mali GPU, you can use the Panfrost driver available in the upstream Linux kernel and Mesa GL implementation, which fully supports dmabuf/Wayland/GBM/etc. Hope that helps. Cheers, Daniel
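A rough sketch of the CPU-access part of the import_dmabuf idea above, with assumptions labelled: the helper names `dmabuf_map_for_read` and `dmabuf_unmap` are hypothetical, this is not Weston code, and the ordering chosen here (mmap first, then open the sync window) is one plausible arrangement rather than something the thread prescribes. The returned pointer is what would then be handed to pixman_image_create_bits_no_clear() together with the format, size, and stride of the buffer.

```c
#include <linux/dma-buf.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: maps a dmabuf read-only and opens a CPU read
 * window with DMA_BUF_IOCTL_SYNC. Returns the mapping, or NULL if
 * either the mmap or the sync ioctl fails.
 */
void *dmabuf_map_for_read(int fd, size_t size)
{
    void *ptr = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED)
        return NULL;

    struct dma_buf_sync sync = {
        .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ,
    };
    if (ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync) < 0) {
        munmap(ptr, size);
        return NULL;
    }

    return ptr;
}

/* Closes the CPU read window opened above and drops the mapping. */
void dmabuf_unmap(int fd, void *ptr, size_t size)
{
    struct dma_buf_sync sync = {
        .flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ,
    };
    ioctl(fd, DMA_BUF_IOCTL_SYNC, &sync);
    munmap(ptr, size);
}
```

A renderer built this way would likely bracket each read of the client content with the START/END pair rather than holding the window open for the lifetime of the buffer, but that is a design choice the thread leaves open.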
Re: 2020 X.Org Board of Directors Elections Nomination period is NOW
Another reminder that we're in the election process, and the next deadline is approaching: - Send board nominations to elections AT x DOT org - Go to https://members.x.org/ to renew your membership (or become one to begin with!) On Tue, Mar 17, 2020 at 7:21 AM Daniel Vetter wrote: > > Just a quick reminder that both board nomination and membership > renewal periods are still open: > > - Send board nominations to elections AT x DOT org > > - Go to https://members.x.org/ to renew your membership (or become > one to begin with!) > > Cheers, Daniel > > On Sun, Mar 8, 2020 at 8:51 PM Daniel Vetter wrote: > > > > We are seeking nominations for candidates for election to the X.Org > > Foundation Board of Directors. All X.Org Foundation members are > > eligible for election to the board. > > > > Nominations for the 2020 election are now open and will remain open > > until 23:59 UTC on 29th March 2020. > > > > The Board consists of directors elected from the membership. Each > > year, an election is held to bring the total number of directors to > > eight. The four members receiving the highest vote totals will serve > > as directors for two year terms. > > > > The directors who received two year terms starting in 2019 were Samuel > > Iglesias Gonsálvez, Manasi D Navare, Lyude Paul and Daniel Vetter. > > They will continue to serve until their term ends in 2021. Current > > directors whose term expires in 2020 are Eric Anholt, Bryce > > Harrington, Keith Packard and Harry Wentland. > > > > A director is expected to participate in the fortnightly IRC meeting > > to discuss current business and to attend the annual meeting of the > > X.Org Foundation, which will be held at a location determined in > > advance by the Board of Directors. > > > > A member may nominate themselves or any other member they feel is > > qualified. Nominations should be sent to the Election Committee at > > elections at x.org. 
> > > > Nominees shall be required to be current members of the X.Org > > Foundation, and submit a personal statement of up to 200 words that > > will be provided to prospective voters. The collected statements, > > along with the statement of contribution to the X.Org Foundation in > > the member's account page on http://members.x.org, will be made > > available to all voters to help them make their voting decisions. > > > > Nominations, membership applications or renewals and completed > > personal statements must be received no later than 23:59 UTC on 02 > > April 2020. > > > > The slate of candidates will be published 6 April 2020 and candidate > > Q&A will begin then. The deadline for Xorg membership applications and > > renewals is 02 April 2020. > > > > Cheers, Daniel, on behalf of the X.Org BoD > > > > PS: I cc'ed the usual dev lists since not many members put in the renewal > > yet. > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > +41 (0) 79 365 57 48 - http://blog.ffwll.ch > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: Plumbing explicit synchronization through the Linux ecosystem
On Tue, Mar 17, 2020 at 11:01:57AM +0100, Michel Dänzer wrote: > On 2020-03-16 7:33 p.m., Marek Olšák wrote: > > On Mon, Mar 16, 2020 at 5:57 AM Michel Dänzer wrote: > >> On 2020-03-16 4:50 a.m., Marek Olšák wrote: > >>> The synchronization works because the Mesa driver waits for idle (drains > >>> the GFX pipeline) at the end of command buffers and there is only 1 > >>> graphics queue, so everything is ordered. > >>> > >>> The GFX pipeline runs asynchronously to the command buffer, meaning the > >>> command buffer only starts draws and doesn't wait for completion. If the > >>> Mesa driver didn't wait at the end of the command buffer, the command > >>> buffer would finish and a different process could start execution of its > >>> own command buffer while shaders of the previous process are still > >> running. > >>> > >>> If the Mesa driver submits a command buffer internally (because it's > >> full), > >>> it doesn't wait, so the GFX pipeline doesn't notice that a command buffer > >>> ended and a new one started. > >>> > >>> The waiting at the end of command buffers happens only when the flush is > >>> external (Swap buffers, glFlush). > >>> > >>> It's a performance problem, because the GFX queue is blocked until the > >> GFX > >>> pipeline is drained at the end of every frame at least. > >>> > >>> So explicit fences for SwapBuffers would help. > >> > >> Not sure what difference it would make, since the same thing needs to be > >> done for explicit fences as well, doesn't it? > > > > No. Explicit fences don't require userspace to wait for idle in the command > > buffer. Fences are signalled when the last draw is complete and caches are > > flushed. Before that happens, any command buffer that is not dependent on > > the fence can start execution. There is never a need for the GPU to be idle > > if there is enough independent work to do. 
> > I don't think explicit fences in the context of this discussion imply > using that different fence signalling mechanism though. My understanding > is that the API proposed by Jason allows implicit fences to be used as > explicit ones and vice versa, so presumably they have to use the same > signalling mechanism. > > > Anyway, maybe the different fence signalling mechanism you describe > could be used by the amdgpu kernel driver in general, then Mesa could > drop the waits for idle and get the benefits with implicit sync as well? Yeah, this is entirely about the programming model visible to userspace. There shouldn't be any impact on the driver's choice of a top vs. bottom of the gpu pipeline used for synchronization, that's entirely up to what your hw/driver/scheduler can pull off. Doing a full gfx pipeline flush for shared buffers, when your hw can do better, sounds like an issue to me that's not related to this here at all. It might be intertwined with amdgpu's special interpretation of dma_resv fences though, no idea. We might need to revamp all that. But for a userspace client that does nothing fancy (no multiple render buffer targets in one bo, or vk style "I write to everything all the time, perhaps" stuff) there should be 0 perf difference between implicit sync through dma_resv and explicit sync through sync_file/syncobj/dma_fence directly. If there is I'd consider that a bit of a driver bug. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: Plumbing explicit synchronization through the Linux ecosystem
On Tue, Mar 17, 2020 at 11:27:28AM -0500, Jason Ekstrand wrote: > On Tue, Mar 17, 2020 at 10:33 AM Nicolas Dufresne > wrote: > > > > Le lundi 16 mars 2020 à 23:15 +0200, Laurent Pinchart a écrit : > > > Hi Jason, > > > > > > On Mon, Mar 16, 2020 at 10:06:07AM -0500, Jason Ekstrand wrote: > > > > On Mon, Mar 16, 2020 at 5:20 AM Laurent Pinchart wrote: > > > > > Another issue is that V4L2 doesn't offer any guarantee on job > > > > > ordering. > > > > > When you queue multiple buffers for camera capture for instance, you > > > > > don't know until capture complete in which buffer the frame has been > > > > > captured. > > > > > > > > Is this a Kernel UAPI issue? Surely the kernel driver knows at the > > > > start of frame capture which buffer it's getting written into. I > > > > would think that the kernel APIs could be adjusted (if we find good > > > > reason to do so!) such that they return earlier and return a (buffer, > > > > fence) pair. Am I missing something fundamental about video here? > > > > > > For cameras I believe we could do that, yes. I was pointing out the > > > issues caused by the current API. For video decoders I'll let Nicolas > > > answer the question, he's way more knowledgeable that I am on that > > > topic. > > > > Right now, there is simply no uAPI for supporting asynchronous errors > > reporting when fences are invovled. That is true for both camera's and > > CODEC. It's likely what all the attempt was missing, I don't know > > enough myself to suggest something. > > > > Now, why Stateless video decoders are special is another subject. In > > CODECs, the decoding and the presentation order may differ. For > > Stateless kind of CODEC, a bitstream is passed to the HW. We don't know > > if this bitstream is fully valid, since the it is being parsed and > > validated by the firmware. It's also firmware job to decide which > > buffer should be presented first. 
> > > > In most firmware interface, that information is communicated back all > > at once when the frame is ready to be presented (which may be quite > > some time after it was decoded). So indeed, a fence model is not really > > easy to add, unless the firmware was designed with that model in mind. > > Just to be clear, I think we should do whatever makes sense here and > not try to slam sync_file in when it doesn't make sense just because > we have it. The more I read on this thread, the less out-fences from > video decode sound like they make sense unless we have a really solid > plan for async error reporting. It's possible, depending on how many > processes are involved in the pipeline, that async error reporting > could help reduce latency a bit if it let the kernel report the error > directly to the last process in the chain. However, I'm not convinced > the potential for userspace programmer error is worth it. That said, > I'm happy to leave that up to the actual video experts. (I just do 3D) dma_fence has an error state which you can set when things went south. The fence still completes (to guarantee forward progress). Currently that error code isn't really propagated anywhere (well i915 iirc does something like that since it tracks the dependencies internally in the scheduler). Definitely not at the dma_fence level, since we don't track the dependency graph there at all. We might want to add that, would at least be possible. If we track the cascading dma_fence error state in the kernel I do think this could work. I'm not sure whether it's actually a good/useful idea still. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
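As a rough illustration of the cascading error idea in the reply above, here is a toy userspace model in Python. It is not kernel code: the `Fence` class, its `deps` list, and the "first failed dependency wins" rule are all assumptions made for this sketch. The only parts taken from the discussion are that a failed fence still signals, and that its error code could be carried along the dependency chain.

```python
import errno


class Fence:
    """Toy stand-in for a fence with an error state (hypothetical, not dma_fence)."""

    def __init__(self, name, deps=()):
        self.name = name
        self.deps = list(deps)   # fences this job waits on
        self.signaled = False
        self.error = 0           # 0 means success, negative errno means failure

    def signal(self, error=0):
        # A fence always completes, even on failure, so waiters make progress.
        # If the job itself did not fail, inherit the first error found among
        # its dependencies (assumed propagation rule for this sketch).
        inherited = next((d.error for d in self.deps if d.error), 0)
        self.error = error or inherited
        self.signaled = True


decode = Fence("decode")
compose = Fence("compose", deps=[decode])
scanout = Fence("scanout", deps=[compose])

decode.signal(error=-errno.EIO)   # e.g. the decoder rejected the bitstream
compose.signal()                  # still completes, but is now marked as failed
scanout.signal()                  # the last consumer in the chain sees the error
```

With this kind of tracking, the last process in the chain can inspect `scanout.error` instead of waiting for the failure to be relayed through every intermediate process.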
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
On Wed, Mar 18, 2020 at 11:05:48AM +0100, Michel Dänzer wrote: > On 2020-03-17 6:21 p.m., Lucas Stach wrote: > > That's one of the issues with implicit sync that explicit may solve: > > a single client taking way too much time to render something can > > block the whole pipeline up until the display flip. With explicit > > sync the compositor can just decide to use the last client buffer if > > the latest buffer isn't ready by some deadline. > > FWIW, the compositor can do this with implicit sync as well, by polling > a dma-buf fd for the buffer. (Currently, it has to poll for writable, > because waiting for the exclusive fence only isn't enough with amdgpu) Would be great if we don't have to make this recommended uapi, just because amdgpu leaks its trickery into the wider world. Polling for read really should be enough (and I guess Christian gets to fix up amdgpu more, at least for anything that has a dma-buf attached even if it's not shared with anything !amdgpu.ko). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
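The polling approach mentioned above can be sketched in Python with the standard `select.poll` interface. This is a minimal sketch, not compositor code: `buffer_ready` and `choose_buffer` are hypothetical helper names, and the buffer objects are placeholders. It relies on the dma-buf convention that polling for readability waits for pending writes (the exclusive fence), while polling for writability waits for all pending accesses.

```python
import select


def buffer_ready(dmabuf_fd, timeout_ms, wait_for_all=False):
    """Return True if the fd becomes ready within timeout_ms milliseconds.

    POLLIN  - wait only for pending writes, enough before reading/texturing.
    POLLOUT - wait for all pending accesses (the amdgpu workaround above).
    """
    mask = select.POLLOUT if wait_for_all else select.POLLIN
    poller = select.poll()
    poller.register(dmabuf_fd, mask)
    events = poller.poll(timeout_ms)
    return any(revents & mask for _fd, revents in events)


def choose_buffer(latest, previous, latest_fd, deadline_ms, wait_for_all=False):
    """Use the latest client buffer if it is ready by the deadline,
    otherwise fall back to the previously presented one."""
    if buffer_ready(latest_fd, deadline_ms, wait_for_all):
        return latest
    return previous
```

Any pollable file descriptor can stand in for a dma-buf when trying this out, for example the read end of a pipe, which only becomes readable once data has been written to it.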
Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem
On Tue, Mar 17, 2020 at 12:18:47PM -0500, Jason Ekstrand wrote: > On Tue, Mar 17, 2020 at 12:13 PM Jacob Lifshay > wrote: > > > > One related issue with explicit sync using sync_file is that combined > > CPUs/GPUs (the CPU cores *are* the GPU cores) that do all the > > rendering in userspace (like llvmpipe but for Vulkan and with extra > > instructions for GPU tasks) but need to synchronize with other > > drivers/processes is that there should be some way to create an > > explicit fence/semaphore from userspace and later signal it. This > > seems to conflict with the requirement for a sync_file to complete in > > finite time, since the user process could be stopped or killed. > > Yeah... That's going to be a problem. The only way I could see that > working is if you created a sync_file that had a timeout associated > with it. However, then you run into the issue where you may have > corruption if stuff doesn't complete on time. Then again, you're not > really dealing with an external unit and so the latency cost of going > across the window system protocol probably isn't massively different > from the latency cost of triggering the sync_file. Maybe the answer > there is to just do everything in-order and not worry about > synchronization? vgem does that already (fences with timeout). The corruption issue is also not new, if your shaders take forever real gpus will nick your rendering with a quick reset. Iirc someone (from cros google team maybe) was even looking into making llvmpipe run on top of vgem as a real dri/drm mesa driver. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
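To make the "fence with a timeout" idea concrete, here is a small userspace model in Python. It is only a sketch of the concept: `TimeoutFence` is a hypothetical class built on `threading`, not the vgem interface, and the timeout values are arbitrary. The point it shows is that the fence is guaranteed to complete in finite time even if the producer never signals it, at the cost of possibly presenting incomplete rendering.

```python
import threading


class TimeoutFence:
    """Fence signalled by userspace, force-completed after timeout_s seconds."""

    def __init__(self, timeout_s):
        self._event = threading.Event()
        self._lock = threading.Lock()
        self.timed_out = False
        self._timer = threading.Timer(timeout_s, self._expire)
        self._timer.daemon = True
        self._timer.start()

    def _expire(self):
        # Producer was too slow (or died): complete anyway so waiters
        # are never blocked forever. The buffer contents may be incomplete.
        with self._lock:
            if not self._event.is_set():
                self.timed_out = True
                self._event.set()

    def signal(self):
        # Normal completion path, called by the producer.
        with self._lock:
            if not self._event.is_set():
                self._timer.cancel()
                self._event.set()

    def wait(self, timeout_s=None):
        # Returns True once the fence has completed, by either path.
        return self._event.wait(timeout_s)
```

A consumer can check `timed_out` after `wait()` returns to tell a real completion from a forced one, which is roughly the corruption trade-off discussed above.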
Re: 2020 X.Org Board of Directors Elections Nomination period is NOW
Just a quick reminder that both board nomination and membership renewal periods are still open: - Send board nominations to elections AT x DOT org - Go to https://members.x.org/ to renew your membership (or become a member to begin with!) Cheers, Daniel On Sun, Mar 8, 2020 at 8:51 PM Daniel Vetter wrote: > > We are seeking nominations for candidates for election to the X.Org > Foundation Board of Directors. All X.Org Foundation members are > eligible for election to the board. > > Nominations for the 2020 election are now open and will remain open > until 23:59 UTC on 29th March 2020. > > The Board consists of directors elected from the membership. Each > year, an election is held to bring the total number of directors to > eight. The four members receiving the highest vote totals will serve > as directors for two year terms. > > The directors who received two year terms starting in 2019 were Samuel > Iglesias Gonsálvez, Manasi D Navare, Lyude Paul and Daniel Vetter. > They will continue to serve until their term ends in 2021. Current > directors whose term expires in 2020 are Eric Anholt, Bryce > Harrington, Keith Packard and Harry Wentland. > > A director is expected to participate in the fortnightly IRC meeting > to discuss current business and to attend the annual meeting of the > X.Org Foundation, which will be held at a location determined in > advance by the Board of Directors. > > A member may nominate themselves or any other member they feel is > qualified. Nominations should be sent to the Election Committee at > elections at x.org. > > Nominees shall be required to be current members of the X.Org > Foundation, and submit a personal statement of up to 200 words that > will be provided to prospective voters. The collected statements, > along with the statement of contribution to the X.Org Foundation in > the member's account page on http://members.x.org, will be made > available to all voters to help them make their voting decisions. 
> > Nominations, membership applications or renewals and completed > personal statements must be received no later than 23:59 UTC on 02 > April 2020. > > The slate of candidates will be published 6 April 2020 and candidate > Q&A will begin then. The deadline for Xorg membership applications and > renewals is 02 April 2020. > > Cheers, Daniel, on behalf of the X.Org BoD > > PS: I cc'ed the usual dev lists since not many members put in the renewal yet. > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: Plumbing explicit synchronization through the Linux ecosystem
Hi, On Mon, 16 Mar 2020 at 15:33, Tomek Bury wrote: > > GL and GLES are not relevant. What is relevant is EGL, which defines > > interfaces to make things work on the native platform. > Yes and no. This is what EGL spec says about sharing a texture between > contexts: Contexts are different though ... > There are similar statements with regards to the lack of > synchronisation guarantees for EGL images or between GL and native > rendering, etc. This also isn't about native rendering. > But the main thing here is that EGL and Vulkan differ > significantly. Sure, I totally agree. > The eglSwapBuffers() is expected to post an unspecified > "back buffer" to the display system using some internal driver magic. > EGL driver is then expected to obtain another back buffer at some > unspecified point in the future. Yes, this is rather the point: EGL doesn't specify platform-related 'black magic' to make things just work, because that's part of the platform implementation details. And, as things stand, on Linux one of those things is implicit synchronisation, unless the desired end state of your driver is no synchronisation. This thread is a discussion about changing that. > > If you are using EGL_WL_bind_wayland_display, then one of the things > > it is explicitly allowed/expected to do is to create a Wayland > > protocol interface between client and compositor, which can be used to > > pass buffer handles and metadata in a platform-specific way. Adding > > synchronisation is also possible. > Only one-way synchronisation is possible with this mechanism. There's > a standard protocol for recycling buffers - wl_buffer_release() so > buffer hand-over from the compositor to client remains unsynchronised > - see below. That's not true; you can post back a sync token every time the client buffer is used by the compositor. 
> > > The most troublesome part was Wayland buffer release mechanism, as it > > > only involves a CPU signalling over Wayland IPC, without any 3D driver > > > involvement. The choices were: explicit synchronisation extension or a > > > buffer copy in the compositor (i.e. compositor textures from the copy, so > > > the client can re-write the original), or some implicit synchronisation > > > in kernel space (but that wasn't an option in Broadcom driver). > > > > You can add your own explicit synchronisation extension. > I could but that requires implementing it in the driver and in a > number of compositors, therefore a standard extension > zwp_linux_explicit_synchronization_v1 is much better choice here than > a custom one. EGL_WL_bind_wayland_display is explicitly designed to allow each driver to implement its own private extensions without modifying compositors. For instance, Mesa adds the `wl_drm` extension, which is used for bidirectional communication between the EGL implementations in the client and compositor address spaces, without modifying either. > > In every cross-process and cross-subsystem usecase, synchronisation is > > obviously required. The two options for this are to implement kernel > > support for implicit synchronisation (as everyone else has done), > That would require major changes in driver architecture or a 2nd > mechanism doing the same thing but in kernel space - both are > non-starters. OK. As it stands, everyone else has the kernel mechanism (e.g. via dmabuf resv), so in this case if you are reinventing the underlying platform in a proprietary stack, you get to solve the same problems yourselves. Cheers, Daniel
Re: Plumbing explicit synchronization through the Linux ecosystem
Hi Tomek, On Mon, 16 Mar 2020 at 12:55, Tomek Bury wrote: > I've been wrestling with the sync problems in Wayland some time ago, but only > with regards to 3D drivers. > > The guarantee given by the GL/GLES spec is limited to a single graphics > context. If the same buffer is accessed by 2 contexts the outcome is > unspecified. The cross-context and cross-process synchronisation is not > guaranteed. It happens to work on Mesa, because the read/write locking is > implemented in the kernel space, but it didn't work on Broadcom driver, which > has read-write interlocks in user space. GL and GLES are not relevant. What is relevant is EGL, which defines interfaces to make things work on the native platform. EGL doesn't define any kind of synchronisation model for the Wayland, X11, or GBM/KMS platforms - but it's one of the things which has to work. It doesn't say that the implementation must make sure that the requested format is displayable, but you sort of take it for granted that if you ask EGL to display something it will do so. Synchronisation is one of those mechanisms which is left to the platform to implement under the hood. In the absence of platform support for explicit synchronisation, the synchronisation must be implicit. > A Vulkan client makes it even worse because of conflicting requirements: > Vulkan's vkQueuePresentKHR() passes in a number of semaphores but disallows > waiting. Wayland WSI requires wl_surface_commit() to be called from > vkQueuePresentKHR() which does require a wait, unless a synchronisation > primitive representing Vulkan samaphores is passed between Vulkan client and > the compositor. If you are using EGL_WL_bind_wayland_display, then one of the things it is explicitly allowed/expected to do is to create a Wayland protocol interface between client and compositor, which can be used to pass buffer handles and metadata in a platform-specific way. Adding synchronisation is also possible. 
> The most troublesome part was Wayland buffer release mechanism, as it only > involves a CPU signalling over Wayland IPC, without any 3D driver > involvement. The choices were: explicit synchronisation extension or a buffer > copy in the compositor (i.e. compositor textures from the copy, so the client > can re-write the original), or some implicit synchronisation in kernel space > (but that wasn't an option in Broadcom driver). You can add your own explicit synchronisation extension. In every cross-process and cross-subsystem usecase, synchronisation is obviously required. The two options for this are to implement kernel support for implicit synchronisation (as everyone else has done), or implement generic support for explicit synchronisation (as we have been working on with implementations inside Weston and Exosphere at least), or implement private support for explicit synchronisation, or do nothing and then be surprised at the lack of synchronisation. Cheers, Daniel ___ wayland-devel mailing list wayland-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/wayland-devel
2020 X.Org Board of Directors Elections Nomination period is NOW
We are seeking nominations for candidates for election to the X.Org Foundation Board of Directors. All X.Org Foundation members are eligible for election to the board. Nominations for the 2020 election are now open and will remain open until 23:59 UTC on 29th March 2020. The Board consists of directors elected from the membership. Each year, an election is held to bring the total number of directors to eight. The four members receiving the highest vote totals will serve as directors for two year terms. The directors who received two year terms starting in 2019 were Samuel Iglesias Gonsálvez, Manasi D Navare, Lyude Paul and Daniel Vetter. They will continue to serve until their term ends in 2021. Current directors whose term expires in 2020 are Eric Anholt, Bryce Harrington, Keith Packard and Harry Wentland. A director is expected to participate in the fortnightly IRC meeting to discuss current business and to attend the annual meeting of the X.Org Foundation, which will be held at a location determined in advance by the Board of Directors. A member may nominate themselves or any other member they feel is qualified. Nominations should be sent to the Election Committee at elections at x.org. Nominees shall be required to be current members of the X.Org Foundation, and submit a personal statement of up to 200 words that will be provided to prospective voters. The collected statements, along with the statement of contribution to the X.Org Foundation in the member's account page on http://members.x.org, will be made available to all voters to help them make their voting decisions. Nominations, membership applications or renewals and completed personal statements must be received no later than 23:59 UTC on 02 April 2020. The slate of candidates will be published 6 April 2020 and candidate Q&A will begin then. The deadline for Xorg membership applications and renewals is 02 April 2020. 
Cheers, Daniel, on behalf of the X.Org BoD PS: I cc'ed the usual dev lists since not many members put in the renewal yet. -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Fri, Feb 28, 2020 at 9:31 PM Dave Airlie wrote: > > On Sat, 29 Feb 2020 at 05:34, Eric Anholt wrote: > > > > On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie wrote: > > > > > > On Fri, 28 Feb 2020 at 18:18, Daniel Stone wrote: > > > > > > > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie wrote: > > > > > b) we probably need to take a large step back here. > > > > > > > > > > Look at this from a sponsor POV, why would I give X.org/fd.o > > > > > sponsorship money that they are just giving straight to google to pay > > > > > for hosting credits? Google are profiting in some minor way from these > > > > > hosting credits being bought by us, and I assume we aren't getting any > > > > > sort of discounts here. Having google sponsor the credits costs google > > > > > substantially less than having any other company give us money to do > > > > > it. > > > > > > > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty > > > > comparable in terms of what you get and what you pay for them. > > > > Obviously providers like Packet and Digital Ocean who offer bare-metal > > > > services are cheaper, but then you need to find someone who is going > > > > to properly administer the various machines, install decent > > > > monitoring, make sure that more storage is provisioned when we need > > > > more storage (which is basically all the time), make sure that the > > > > hardware is maintained in decent shape (pretty sure one of the fd.o > > > > machines has had a drive in imminent-failure state for the last few > > > > months), etc. > > > > > > > > Given the size of our service, that's a much better plan (IMO) than > > > > relying on someone who a) isn't an admin by trade, b) has a million > > > > other things to do, and c) hasn't wanted to do it for the past several > > > > years. But as long as that's the resources we have, then we're paying > > > > the cloud tradeoff, where we pay more money in exchange for fewer > > > > problems. 
> > > > > > Admin for gitlab and CI is a full time role anyways. The system is > > > definitely not self sustaining without time being put in by you and > > > anholt still. If we have $75k to burn on credits, and it was diverted > > > to just pay an admin to admin the real hw + gitlab/CI would that not > > > be a better use of the money? I didn't know if we can afford $75k for > > > an admin, but suddenly we can afford it for gitlab credits? > > > > As I think about the time that I've spent at google in less than a > > year on trying to keep the lights on for CI and optimize our > > infrastructure in the current cloud environment, that's more than the > > entire yearly budget you're talking about here. Saying "let's just > > pay for people to do more work instead of paying for full-service > > cloud" is not a cost optimization. > > > > > > > > Yes, we could federate everything back out so everyone runs their own > > > > builds and executes those. Tinderbox did something really similar to > > > > that IIRC; not sure if Buildbot does as well. Probably rules out > > > > pre-merge testing, mind. > > > > > > Why? does gitlab not support the model? having builds done in parallel > > > on runners closer to the test runners seems like it should be a thing. > > > I guess artifact transfer would cost less then as a result. > > > > Let's do some napkin math. The biggest artifacts cost we have in Mesa > > is probably meson-arm64/meson-arm (60MB zipped from meson-arm64, > > downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day, > > makes ~1.8TB/month ($180 or so). We could build a local storage next > > to the lava dispatcher so that the artifacts didn't have to contain > > the rootfs that came from the container (~2/3 of the insides of the > > zip file), but that's another service to build and maintain. 
Building > > the drivers once locally and storing it would save downloading the > > other ~1/3 of the inside of the zip file, but that requires a big > > enough system to do builds in time. > > > > I'm planning on doing a local filestore for google's lava lab, since I > > need to be able to move our xml files off of the lava DUTs to get the > > xml results we've become accustomed to, but this would not bubble up > > to being a priority for my time if I wasn't
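The napkin math quoted above can be reproduced in a few lines of Python. This is only a restatement of the figures in the email: the 30-day month, decimal units (1 TB = 1000 GB) and the price of $0.10 per GB are assumptions, the last one inferred from "$180 or so" for roughly 1.8TB.

```python
ARTIFACT_MB = 60            # zipped meson-arm64 artifact
DOWNLOADERS = 4 + 6         # 4 freedreno + "6ish" lava jobs per pipeline
PIPELINES_PER_DAY = 100
DAYS_PER_MONTH = 30         # assumption
USD_PER_GB = 0.10           # assumption inferred from the quoted total


def monthly_egress_gb(artifact_mb, downloaders, pipelines_per_day, days):
    """Total artifact download volume per month, in GB (1 GB = 1000 MB)."""
    return artifact_mb * downloaders * pipelines_per_day * days / 1000


egress_gb = monthly_egress_gb(ARTIFACT_MB, DOWNLOADERS,
                              PIPELINES_PER_DAY, DAYS_PER_MONTH)
cost_usd = egress_gb * USD_PER_GB

# If the rootfs (about 2/3 of the zip) came from local storage instead,
# only the remaining third would still be downloaded.
egress_without_rootfs_gb = egress_gb / 3
cost_without_rootfs_usd = egress_without_rootfs_gb * USD_PER_GB
```

Under these assumptions `egress_gb` comes out at 1800 GB (1.8 TB) and `cost_usd` at about $180 per month, matching the estimate in the email.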
Re: gitlab.fd.o financial situation and impact on services
Hi Jan, On Fri, 28 Feb 2020 at 10:09, Jan Engelhardt wrote: > On Friday 2020-02-28 08:59, Daniel Stone wrote: > >I believe that in January, we had $2082 of network cost (almost > >entirely egress; ingress is basically free) and $1750 of > >cloud-storage cost (almost all of which was download). That's based > >on 16TB of cloud-storage (CI artifacts, container images, file > >uploads, Git LFS) egress and 17.9TB of other egress (the web service > >itself, repo activity). Projecting that out [×12 for a year] gives > >us roughly $45k of network activity alone, > > I had come to a similar conclusion a few years back: It is not very > economic to run ephemeral buildroots (and anything like it) between > two (or more) "significant locations" of which one end is located in > a Large Cloud datacenter like EC2/AWS/etc. > > As for such usecases, me and my surrounding peers have used (other) > offerings where there is 50 TB free network/month, and yes that may > have entailed doing more adminning than elsewhere - but an admin > appreciates $2000 a lot more than a corporation, too. Yes, absolutely. For context, our storage & network costs have increased >10x in the past 12 months (~$320 Jan 2019), >3x in the past 6 months (~$1350 July 2019), and ~2x in the past 3 months (~$2000 Oct 2019). I do now (personally) think that it's crossed the point at which it would be worthwhile paying an admin to solve the problems that cloud services currently solve for us - which wasn't true before. Such an admin could also deal with things like our SMTP delivery failure rate, which in the past year has spiked over 50% (see previous email), demand for new services such as Discourse which will enable user support without either a) users having to subscribe to a mailing list, or b) bug trackers being cluttered up with user requests and other non-bugs, etc. Cheers, Daniel
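For reference, the yearly projection quoted above is straightforward to recompute. The sketch below uses only the January figures from the quoted email; pairing the $2082 network cost with the 17.9TB of other egress and the $1750 storage cost with the 16TB of cloud-storage egress is an assumption about how the bill breaks down.

```python
NETWORK_USD = 2082          # January network cost, almost entirely egress
STORAGE_USD = 1750          # January cloud-storage cost, mostly download
STORAGE_EGRESS_TB = 16      # CI artifacts, container images, uploads, Git LFS
OTHER_EGRESS_TB = 17.9      # web service itself, repo activity

monthly_usd = NETWORK_USD + STORAGE_USD
annual_usd = monthly_usd * 12            # "projecting that out [x12 for a year]"

# Rough unit prices, assuming each cost maps onto one traffic category.
storage_usd_per_tb = STORAGE_USD / STORAGE_EGRESS_TB
network_usd_per_tb = NETWORK_USD / OTHER_EGRESS_TB
```

This gives $3832 per month and $45984 per year, in line with the "roughly $45k" in the email, and a unit price a little above $100 per TB in both categories.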
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund wrote: > On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote: > > Yeah, changes on vulkan drivers or backend compilers should be > > fairly > > sandboxed. > > > > We also have tools that only work for intel stuff, that should never > > trigger anything on other people's HW. > > > > Could something be worked out using the tags? > > I think so! We have the pre-defined environment variable > CI_MERGE_REQUEST_LABELS, and we can do variable conditions: > > https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables > > That sounds like a pretty neat middle-ground to me. I just hope that > new pipelines are triggered if new labels are added, because not > everyone is allowed to set labels, and sometimes people forget... There's also this which is somewhat more robust: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569 Cheers, Daniel
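The label-based idea quoted above could look roughly like the following Python helper, run at the start of a pipeline to decide which hardware jobs are worth triggering. It is a sketch under stated assumptions: `CI_MERGE_REQUEST_LABELS` is the GitLab predefined variable named in the email (a comma-separated list of label names), while the job names, the label names in `JOB_LABELS` and the fallback policy are hypothetical and not taken from Mesa's CI configuration.

```python
import os

# Hypothetical mapping from CI job group to the labels that should trigger it.
JOB_LABELS = {
    "intel-hw": {"intel", "anv", "i965"},
    "amd-hw": {"radv", "radeonsi"},
    "arm-hw": {"panfrost", "freedreno"},
}


def parse_labels(raw):
    """Split the comma-separated label string into a normalised set."""
    return {label.strip().lower() for label in raw.split(",") if label.strip()}


def jobs_to_run(raw_labels, job_labels=JOB_LABELS):
    """Return the sorted job groups selected by the merge request labels.

    If no label is set, or none of them matches a job group, run everything:
    testing too often is cheaper than silently not testing at all.
    """
    labels = parse_labels(raw_labels)
    selected = sorted(job for job, wanted in job_labels.items()
                      if labels & wanted)
    return selected or sorted(job_labels)


if __name__ == "__main__":
    print(jobs_to_run(os.environ.get("CI_MERGE_REQUEST_LABELS", "")))
```

Falling back to the full set keeps the forgotten-label case raised in the email safe by default, at the price of some wasted runs.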
Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Fri, Feb 28, 2020 at 10:29 AM Erik Faye-Lund wrote: > > On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote: > > On Fri, 28 Feb 2020 at 07:27, Daniel Vetter > > wrote: > > > Hi all, > > > > > > You might have read the short take in the X.org board meeting > > > minutes > > > already, here's the long version. > > > > > > The good news: gitlab.fd.o has become very popular with our > > > communities, and is used extensively. This especially includes all > > > the > > > CI integration. Modern development process and tooling, yay! > > > > > > The bad news: The cost in growth has also been tremendous, and it's > > > breaking our bank account. With reasonable estimates for continued > > > growth we're expecting hosting expenses totalling 75k USD this > > > year, > > > and 90k USD next year. With the current sponsors we've set up we > > > can't > > > sustain that. We estimate that hosting expenses for gitlab.fd.o > > > without any of the CI features enabled would total 30k USD, which > > > is > > > within X.org's ability to support through various sponsorships, > > > mostly > > > through XDC. > > > > > > Note that X.org does no longer sponsor any CI runners themselves, > > > we've stopped that. The huge additional expenses are all just in > > > storing and serving build artifacts and images to outside CI > > > runners > > > sponsored by various companies. A related topic is that with the > > > growth in fd.o it's becoming infeasible to maintain it all on > > > volunteer admin time. X.org is therefore also looking for admin > > > sponsorship, at least medium term. > > > > > > Assuming that we want cash flow reserves for one year of > > > gitlab.fd.o > > > (without CI support) and a trimmed XDC and assuming no sponsor > > > payment > > > meanwhile, we'd have to cut CI services somewhere between May and > > > June > > > this year. 
The board is of course working on acquiring sponsors, > > > but > > > filling a shortfall of this magnitude is neither easy nor quick > > > work, > > > and we therefore decided to give an early warning as soon as > > > possible. > > > Any help in finding sponsors for fd.o is very much appreciated. > > > > a) Ouch. > > > > b) we probably need to take a large step back here. > > > > I kinda agree, but maybe the step doesn't have to be *too* large? > > I wonder if we could solve this by restructuring the project a bit. I'm > talking purely from a Mesa point of view here, so it might not solve > the full problem, but: > > 1. It feels silly that we need to test changes to e.g the i965 driver > on dragonboards. We only have a big "do not run CI at all" escape- > hatch. > > 2. A lot of us are working for a company that can probably pay for > their own needs in terms of CI. Perhaps moving some costs "up front" to > the company that needs it can make the future of CI for those who can't > do this > > 3. I think we need a much more detailed break-down of the cost to make > educated changes. For instance, how expensive is Docker image > uploads/downloads (e.g intermediary artifacts) compared to build logs > and final test-results? What kind of artifacts? We have logs somewhere, but no one yet got around to analyzing that. Which will be quite a bit of work to do since the cloud storage is totally disconnected from the gitlab front-end, making the connection to which project or CI job caused something is going to require scripting. Volunteers definitely very much welcome I think. > One suggestion would be to do something more similar to what the kernel > does, and separate into different repos for different subsystems. This > could allow us to have separate testing-pipelines for these repos, > which would mean that for instance a change to RADV didn't trigger a > full Panfrost test-run. Uh as someone who lives the kernel multi-tree model daily, there's a _lot_ of pain involved. 
I think much better to look at filtering out CI targets for when nothing relevant happened. But that gets somewhat tricky, since "nothing relevant" is always only relative to some baseline, so bit of scripting and all involved to make sure you don't run stuff too often or (probably worse) not often enough. -Daniel > This would probably require us to accept using a more branch-heavy > work-flow. I don't personally think that would be a bad thing. > > But this is all kinda based on an assumption that running hardware- > testing is the expensive
Re: [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Fri, 28 Feb 2020 at 08:48, Dave Airlie wrote: > On Fri, 28 Feb 2020 at 18:18, Daniel Stone wrote: > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty > > comparable in terms of what you get and what you pay for them. > > Obviously providers like Packet and Digital Ocean who offer bare-metal > > services are cheaper, but then you need to find someone who is going > > to properly administer the various machines, install decent > > monitoring, make sure that more storage is provisioned when we need > > more storage (which is basically all the time), make sure that the > > hardware is maintained in decent shape (pretty sure one of the fd.o > > machines has had a drive in imminent-failure state for the last few > > months), etc. > > > > Given the size of our service, that's a much better plan (IMO) than > > relying on someone who a) isn't an admin by trade, b) has a million > > other things to do, and c) hasn't wanted to do it for the past several > > years. But as long as that's the resources we have, then we're paying > > the cloud tradeoff, where we pay more money in exchange for fewer > > problems. > > Admin for gitlab and CI is a full time role anyways. The system is > definitely not self sustaining without time being put in by you and > anholt still. If we have $75k to burn on credits, and it was diverted > to just pay an admin to admin the real hw + gitlab/CI would that not > be a better use of the money? I didn't know if we can afford $75k for > an admin, but suddenly we can afford it for gitlab credits? s/gitlab credits/GCP credits/ I took a quick look at HPE, which we previously used for bare metal, and it looks like we'd be spending $25-50k (depending on how much storage you want to provision, how much room you want to leave to provision more storage later, how much you care about backups) to run a similar level of service so that'd put a bit of a dint in your year-one budget. 
The bare-metal hosting providers also add up to being more expensive than you might think, again especially if you want either redundancy or just backups.

> > Yes, we could federate everything back out so everyone runs their own
> > builds and executes those. Tinderbox did something really similar to
> > that IIRC; not sure if Buildbot does as well. Probably rules out
> > pre-merge testing, mind.
>
> Why? Does gitlab not support the model? having builds done in parallel
> on runners closer to the test runners seems like it should be a thing.
> I guess artifact transfer would cost less than as a result.

It does support the model, but if every single build executor is also compiling Mesa from scratch locally, how long do you think that's going to take?

> > Again, if you want everything to be centrally
> > designed/approved/monitored/controlled, that's a fine enough idea, and
> > I'd be happy to support whoever it was who was doing that for all of
> > fd.o.
>
> I don't think we have any choice but to have someone centrally
> controlling it. You can't have a system in place that lets CI users
> burn large sums of money without authorisation, and that is what we
> have now.

OK, not sure who it is who's going to be approving every update to every .gitlab-ci.yml in the repository, or maybe we just have zero shared runners and anyone who wants to do builds can BYO.

___
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel
Re: [Intel-gfx] gitlab.fd.o financial situation and impact on services
On Fri, 28 Feb 2020 at 03:38, Dave Airlie wrote:
> b) we probably need to take a large step back here.
>
> Look at this from a sponsor POV, why would I give X.org/fd.o
> sponsorship money that they are just giving straight to google to pay
> for hosting credits? Google are profiting in some minor way from these
> hosting credits being bought by us, and I assume we aren't getting any
> sort of discounts here. Having google sponsor the credits costs google
> substantially less than having any other company give us money to do
> it.

The last I looked, Google GCP / Amazon AWS / Azure were all pretty comparable in terms of what you get and what you pay for them. Obviously providers like Packet and Digital Ocean who offer bare-metal services are cheaper, but then you need to find someone who is going to properly administer the various machines, install decent monitoring, make sure that more storage is provisioned when we need more storage (which is basically all the time), make sure that the hardware is maintained in decent shape (pretty sure one of the fd.o machines has had a drive in imminent-failure state for the last few months), etc.

Given the size of our service, that's a much better plan (IMO) than relying on someone who a) isn't an admin by trade, b) has a million other things to do, and c) hasn't wanted to do it for the past several years. But as long as that's the resources we have, then we're paying the cloud tradeoff, where we pay more money in exchange for fewer problems.

> If our current CI architecture is going to burn this amount of money a
> year and we hadn't worked this out in advance of deploying it then I
> suggest the system should be taken offline until we work out what a
> sustainable system would look like within the budget we have, whether
> that be never transferring containers and build artifacts from the
> google network, just having local runner/build combos etc.

Yes, we could federate everything back out so everyone runs their own builds and executes those. Tinderbox did something really similar to that IIRC; not sure if Buildbot does as well. Probably rules out pre-merge testing, mind.

The reason we hadn't worked everything out in advance of deploying is that Mesa has had 3993 MRs in the little over a year since moving, and a similar number in GStreamer, just taking the two biggest users. At the start it was 'maybe let's use MRs if you want to but make sure everything still goes through the list', and now it's something different. Similarly the CI architecture hasn't been 'designed', so much as that people want to run dEQP and Piglit on their hardware pre-merge in an open fashion that's actually accessible to people, and have just done it.

Again, if you want everything to be centrally designed/approved/monitored/controlled, that's a fine enough idea, and I'd be happy to support whoever it was who was doing that for all of fd.o.

Cheers,
Daniel
Re: gitlab.fd.o financial situation and impact on services
Hi Matt,

On Thu, 27 Feb 2020 at 23:45, Matt Turner wrote:
> We're paying 75K USD for the bandwidth to transfer data from the
> GitLab cloud instance. i.e., for viewing the https site, for
> cloning/updating git repos, and for downloading CI artifacts/images to
> the testing machines (AFAIU).

I believe that in January, we had $2082 of network cost (almost entirely egress; ingress is basically free) and $1750 of cloud-storage cost (almost all of which was download). That's based on 16TB of cloud-storage (CI artifacts, container images, file uploads, Git LFS) egress and 17.9TB of other egress (the web service itself, repo activity). Projecting that out gives us roughly $45k of network activity alone, so it looks like this figure is based on a projected increase of ~50%. The actual compute capacity is closer to $1150/month.

> I was not aware that we were being charged for anything wrt GitLab
> hosting yet (and neither was anyone on my team at Intel that I've
> asked). This... kind of needs to be communicated.
>
> A consistent concern put forth when we were discussing switching to
> GitLab and building CI was... how do we pay for it. It felt like that
> concern was always handwaved away. I heard many times that if we
> needed more runners that we could just ask Google to spin up a few
> more. If we needed testing machines they'd be donated. No one
> mentioned that all the while we were paying for bandwidth... Perhaps
> people building the CI would make different decisions about its
> structure if they knew it was going to wipe out the bank account.

The original answer is that GitLab themselves offered to sponsor enough credit on Google Cloud to get us started. They used GCP themselves so they could assist us (me) in getting bootstrapped, which was invaluable. After that, Google's open-source program office offered to sponsor us for $30k/year, which was I believe last April.
Since then the service usage has increased roughly by a factor of 10, so our 12-month sponsorship is no longer enough to cover 12 months.

> What percentage of the bandwidth is consumed by transferring CI
> images, etc? Wouldn't 75K USD be enough to buy all the testing
> machines we need and host them within Google or wherever so we don't
> need to pay for huge amounts of bandwidth?

Unless the Google Cloud Platform starts offering DragonBoards, it wouldn't reduce our bandwidth usage as the corporate network is treated separately for egress.

Cheers,
Daniel
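
As a rough check on the January figures quoted in this message, the arithmetic can be sketched in Python. This is only an illustration and not anything used by fd.o: the names are made up for this sketch, the projection assumes usage stays flat at the January level for 12 months, and the thread does not say whether the 75K USD figure includes compute, so both readings are computed. Pairing the $2082 network cost with the 17.9TB of "other" egress is also an assumption.

```python
# Monthly figures (USD) as quoted for January 2020 in the message above.
NETWORK_USD = 2082          # network cost, almost entirely egress
CLOUD_STORAGE_USD = 1750    # cloud-storage cost, almost all download
COMPUTE_USD = 1150          # "closer to $1150/month"

# Egress volumes (TB) quoted alongside those costs. Mapping each volume
# to exactly one cost line is an assumption of this sketch.
CLOUD_STORAGE_EGRESS_TB = 16.0
OTHER_EGRESS_TB = 17.9

# Figure from the quoted question; treating it as a yearly total is an
# assumption of this sketch.
QUOTED_ANNUAL_USD = 75_000


def annualise(monthly_usd: float, months: int = 12) -> float:
    """Flat projection: assumes every month looks like January."""
    return monthly_usd * months


def implied_growth(target_usd: float, baseline_usd: float) -> float:
    """Fractional increase needed for baseline_usd to reach target_usd."""
    return target_usd / baseline_usd - 1.0


monthly_transfer = NETWORK_USD + CLOUD_STORAGE_USD
annual_transfer = annualise(monthly_transfer)
annual_with_compute = annualise(monthly_transfer + COMPUTE_USD)

# Two readings of the 75K USD figure: transfer only, or transfer + compute.
growth_transfer_only = implied_growth(QUOTED_ANNUAL_USD, annual_transfer)
growth_with_compute = implied_growth(QUOTED_ANNUAL_USD, annual_with_compute)

# Approximate unit costs per TB of egress.
cost_per_tb_storage = CLOUD_STORAGE_USD / CLOUD_STORAGE_EGRESS_TB
cost_per_tb_other = NETWORK_USD / OTHER_EGRESS_TB

if __name__ == "__main__":
    print(f"monthly transfer cost:        ${monthly_transfer}")
    print(f"annual transfer (flat):       ${annual_transfer}")
    print(f"annual transfer + compute:    ${annual_with_compute}")
    print(f"implied growth, transfer only: {growth_transfer_only:.0%}")
    print(f"implied growth, with compute:  {growth_with_compute:.0%}")
    print(f"storage egress per TB:        ${cost_per_tb_storage:.2f}")
    print(f"other egress per TB:          ${cost_per_tb_other:.2f}")
```

On these assumptions the flat projection comes to $45,984 a year for transfer, in line with the "roughly $45k" above. The increase needed to reach 75K USD is about 63% on the transfer-only reading and about 25% if compute is included, which brackets the ~50% estimate in the message.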