Re: Ways to test Weston during development (Re: Full-motion zero-copy screen capture in Weston)

2024-06-10 Thread Daniel Stone
Hi Matt,

On Fri, 7 Jun 2024 at 16:30, Hoosier, Matt  wrote:
> Okay, makes sense that you don’t want to have to repeat the dependencies’ 
> builds for every CI test. I’m not arguing that you should – it was just more 
> a thought experiment to see whether riding Meson subprojects is a reasonable 
> idea for establishing a development environment.
>
> I get your point that that can become a deep rabbit hole. But it seems that 
> you didn’t have any need to build LLVM and similar just to support the 
> hand-built copy of Mesa that’s in the CI. Is there some reason why a deeper 
> set of transitive dependencies would be needed using Meson subprojects than 
> when building by hand? Seems like I could probably just mimic what you’ve 
> done. Maybe your point is that the CI is a very constrained environment 
> that’s known not to need ATI or llvmpipe, but a general developer situation 
> with physical machines would?

Oh no, the CI environment absolutely needs llvmpipe! We install quite
a few development packages (cf .gitlab-ci/debian-install.sh) into the
CI environment though, so although we don't build LLVM, we do
absolutely depend on distro LLVM development packages which aren't
present in a clean distro install.

You're completely right, though, that it makes no difference to the
dependency chain whether the dependencies come from Meson subprojects
or from previous installs.

Cheers,
Daniel


Re: Ways to test Weston during development (Re: Full-motion zero-copy screen capture in Weston)

2024-06-07 Thread Daniel Stone
Hi Matt,

On Fri, 7 Jun 2024 at 15:30, Hoosier, Matt  wrote:
> Would Meson’s dependency wrapping capabilities be a viable solution here? I 
> think that most of Weston’s dependencies that have aggressive version 
> requirements are themselves also Meson projects.
>
> The Weston CI configuration builds a bunch of its dependencies (Mesa, libdrm, 
> libwayland …) manually. I wonder why Meson wrapping was not used for this?

We don't want to rebuild Mesa every time. We could've built it as a
subproject and cached it, but it didn't seem to offer much of an
advantage over just installing it into the system.

We could probably add some subprojects, but you'd probably end up
pulling in more components as well - e.g. if you want to run Mesa with
its software renderer or the AMD drivers, you'll also need to use LLVM
- and at what point does your easy subproject build turn into, well, a
full distribution?

I guess one thing we could do is to jazz the CI build up a little so
it's easier to pull the OCI and run it inside a toolbox, as well as
reuse those scripts locally.

Cheers,
Daniel


Re: Ways to test Weston during development (Re: Full-motion zero-copy screen capture in Weston)

2024-06-05 Thread Daniel Stone
Hi,

On Wed, 5 Jun 2024 at 09:09, Pekka Paalanen
 wrote:
> On Tue, 4 Jun 2024 20:33:48 +
> "Hoosier, Matt"  wrote:
> > Tactical question: I somehow missed until this point that the remote
> > and pipewire plugins will only run if the DRM backend is being used.
> >
> > But the DRM backend *really* doesn't want to start nowadays unless
> > you're running on a system with seatd and/or logind available.
> > Toolbox [1] is the de facto way to develop on bleeding edge copies of
> > components these days. But logind and seatd aren't exposed into it.
> >
> > How do Weston people interactively develop on the Weston DRM backend
> > nowadays?
> >
> > [1] https://docs.fedoraproject.org/en-US/fedora-silverblue/toolbox/
>
> I'm doing it old-school on my workstation, without any containers. What
> dependencies my distribution does not provide, I build and install
> manually into a prefix under $HOME:
>
> https://www.collabora.com/news-and-blog/blog/2020/04/10/clean-reliable-setup-for-dependency-installation/
>
> The "clean and reliable" is probably outdated in this era of
> containers...

Yes, doing it in containers is a little bit tricky since it's not
exactly the design case. Honestly, on my Silverblue systems, I just
install a bunch of relevant dependencies into the system image with
rpm-ostree, and have a pile of self-built dependencies in a local
prefix.

This might give you some insight however:
https://github.com/containers/toolbox/issues/992

It probably needs some minor changes in Weston but does at least seem doable ...

Cheers,
Daniel


Re: [RFC PATCH v4 00/42] Color Pipeline API w/ VKMS

2024-02-29 Thread Daniel Vetter
rs/gpu/drm/drm_mode_config.c |   7 +
>  drivers/gpu/drm/drm_plane.c   |  52 ++
>  drivers/gpu/drm/tests/Makefile|   3 +-
>  drivers/gpu/drm/tests/drm_fixp_test.c |  69 ++
>  drivers/gpu/drm/vkms/Kconfig  |  20 +
>  drivers/gpu/drm/vkms/Makefile |   4 +-
>  drivers/gpu/drm/vkms/tests/.kunitconfig   |   4 +
>  drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 449 ++
>  drivers/gpu/drm/vkms/vkms_colorop.c   | 100 +++
>  drivers/gpu/drm/vkms/vkms_composer.c  | 135 ++-
>  drivers/gpu/drm/vkms/vkms_drv.h   |   8 +
>  drivers/gpu/drm/vkms/vkms_luts.c  | 802 ++
>  drivers/gpu/drm/vkms/vkms_luts.h  |  12 +
>  drivers/gpu/drm/vkms/vkms_plane.c |   2 +
>  include/drm/drm_atomic.h  | 122 +++
>  include/drm/drm_atomic_uapi.h |   3 +
>  include/drm/drm_colorop.h | 301 +++
>  include/drm/drm_file.h|   7 +
>  include/drm/drm_fixed.h   |  35 +-
>  include/drm/drm_mode_config.h |  18 +
>  include/drm/drm_plane.h   |  13 +
>  include/uapi/drm/drm.h|  16 +
>  include/uapi/drm/drm_mode.h   |  14 +
>  38 files changed, 3882 insertions(+), 30 deletions(-)
>  create mode 100644 Documentation/gpu/rfc/color_pipeline.rst
>  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_colorop.c
>  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_colorop.h
>  create mode 100644 drivers/gpu/drm/drm_colorop.c
>  create mode 100644 drivers/gpu/drm/tests/drm_fixp_test.c
>  create mode 100644 drivers/gpu/drm/vkms/Kconfig
>  create mode 100644 drivers/gpu/drm/vkms/tests/.kunitconfig
>  create mode 100644 drivers/gpu/drm/vkms/tests/vkms_color_tests.c
>  create mode 100644 drivers/gpu/drm/vkms/vkms_colorop.c
>  create mode 100644 drivers/gpu/drm/vkms/vkms_luts.c
>  create mode 100644 drivers/gpu/drm/vkms/vkms_luts.h
>  create mode 100644 include/drm/drm_colorop.h
> 
> --
> 2.44.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/2] drm/tidss: Fix initial plane zpos values

2024-02-16 Thread Daniel Stone
Hi,

On Fri, 16 Feb 2024 at 09:00, Tomi Valkeinen
 wrote:
> On 13/02/2024 13:39, Daniel Stone wrote:
> > Specifically, you probably want commits 4cde507be6a1 and 58dde0e0c000.
> > I think the window of breakage was small enough that - assuming either
> > those commits or an upgrade to Weston 12/13 fixes it - we can just ask
> > people to upgrade to a fixed Weston.
> >
> >>> Presuming this is not related to any TI specific code, I guess it's a
> >>> regression in the sense that at some point Weston added the support to use
> >>> planes for composition, so previously with only a single plane per display
> >>> there was no issue.
> >
> > That point was 12 years ago, so not that novel. ;)
>
> Hmm, so do I understand it right, the plane code from 12 years back
> supposedly works ok, but somewhere around Weston 10 something broke, but
> was fixed with the commits you mention above?

We always had plane support but pre-zpos; we added support for zpos a
couple/few releases ago, but then massively refactored it ... so it
could've always been broken, or could've been broken for as long as we
have zpos, or it could've just been a small window in between the
refactor.


Re: [PATCH 1/2] drm/tidss: Fix initial plane zpos values

2024-02-13 Thread Daniel Stone
Hi,

On Tue, 13 Feb 2024 at 10:18, Marius Vlad  wrote:
> On Tue, Feb 13, 2024 at 11:57:59AM +0200, Tomi Valkeinen wrote:
> > I haven't. I'm quite unfamiliar with Weston, and Randolph from TI (cc'd) has
> > been working on the Weston side of things. I also don't know if there's
> > something TI specific here, as the use case is with non-mainline GPU drivers
> > and non-mainline Mesa. I should have been a bit clearer in the patch
> > description, as I didn't mean that upstream Weston has a bug (maybe it has,
> > maybe it has not).

Don't worry about it. We've had bugs in the past and I'm sure we'll
have more. :) Either way, it's definitely better to have the kernel
expose sensible behaviour rather than weird workarounds, unless
they've been around for so long that they're basically baked into ABI.

> > The issue seen is that when Weston decides to use DRM planes for
> > composition, the plane zpositions are not configured correctly (or at all?).
> > Afaics, this leads to e.g. weston showing a window with a DRM "overlay"
> > plane that is behind the "primary" root plane, so the window is not visible.
> > And as Weston thinks that the area supposedly covered by the overlay plane
> > does not need to be rendered on the root plane, there are also artifacts on
> > that area.
> >
> > Also, the Weston I used is a bit older one (10.0.1), as I needed to go back
> > in my buildroot versions to get all that non-mainline GPU stuff compiled and
> > working. A more recent Weston may behave differently.
>
> Right after Weston 10, we had a few minor changes related to the
> zpos-sorting list of planes and how we parse the plane list without having
> a temporary zpos ordered list to pick planes from.
>
> And there's another fix for a case where we failed to set the zpos for scanout
> to the minimum available - which seems like a good candidate to explain
> what happens in the issue described above. So if trying Weston again,
> please try with at least Weston 12, which should have those changes
> in.

Specifically, you probably want commits 4cde507be6a1 and 58dde0e0c000.
I think the window of breakage was small enough that - assuming either
those commits or an upgrade to Weston 12/13 fixes it - we can just ask
people to upgrade to a fixed Weston.
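
To make the failure mode described in the quoted report concrete, here
is a small sketch of the stacking rule. It is a toy model in Python, not
Weston or kernel code; the Plane type and both function names are
invented for illustration, and treating an equal zpos as "not above" is
an assumption of the sketch. Planes with a higher zpos are stacked above
planes with a lower one, so an overlay plane carrying a window needs a
zpos above the primary plane's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    name: str
    kind: str  # "primary" or "overlay" (simplified; hypothetical model)
    zpos: int


def stacking_order(planes):
    """Return planes bottom-to-top: lowest zpos first."""
    return sorted(planes, key=lambda p: p.zpos)


def overlays_not_above_primary(planes):
    """Return overlay planes whose zpos is not strictly above every primary plane."""
    top_primary = max((p.zpos for p in planes if p.kind == "primary"), default=None)
    if top_primary is None:
        return []
    return [p for p in planes if p.kind == "overlay" and p.zpos <= top_primary]
```

With a primary plane at zpos 1 and an overlay left at zpos 0, the
overlay is reported; swapping the two values clears the report.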

> > Presuming this is not related to any TI specific code, I guess it's a
> > regression in the sense that at some point Weston added the support to use
> > planes for composition, so previously with only a single plane per display
> > there was no issue.

That point was 12 years ago, so not that novel. ;)

Cheers,
Daniel


[ANNOUNCE] wayland-protocols 1.33

2024-01-19 Thread Daniel Stone
Hi,
wayland-protocols 1.33 has been released. This marks the linux-dmabuf
protocol - now at v5 - stable, introduces the ext-transient-seat
protocol, and has a number of minor fixes and clarifications for other
protocols. Thanks to all who have contributed.

Andri Yngvason (1):
  Add the transient seat protocol

Daniel Stone (1):
  build: Bump version to 1.33

Jonas Ådahl (1):
  xdg-shell: Clarify what a toplevel by default includes

Lleyton Gray (1):
  staging/drm-lease: fix typo in description

MaxVerevkin (1):
  linux-dmabuf: sync changes from unstable to stable

Sebastian Wick (3):
  security-context-v1: Document out of band metadata for flatpak
  security-context-v1: Document what can be done with the open sockets
  security-context-v1: Make sandbox engine names use reverse-DNS

Simon Ser (12):
  linux-dmabuf: add note about implicit sync
  members: remove EFL/Enlightenment
  build: simplify dict loops
  build: add version for stable protocols
  linux-dmabuf: mark as stable
  xdg-decoration: fix configure event summary
  xdg-decoration: remove ambiguous wording in configure event
  presentation-time: stop referring to Linux/glibc
  readme: version should be included in stable protocol filenames
  linux-dmabuf: require all planes to use the same modifier
  readme: make it clear that we are a standards body
  ci: upgrade ci-templates and Debian

Vaxry (2):
  README: fix typos
  governance: fix typos

git tag: 1.33

https://gitlab.freedesktop.org/wayland/wayland-protocols/-/releases/1.33/downloads/wayland-protocols-1.33.tar.xz
SHA256: 94f0c50b090d6e61a03f62048467b19abbe851be4e11ae7b36f65f8b98c3963a  wayland-protocols-1.33.tar.xz
SHA512: 4584f6ac86367655f9db5d0c0ed0681efa31e73f984e4b620fbe5317df21790927f4f5317ecbbc194ac31eaf88caebc431bcc52c23d9dc0098c71de3cb4a9fef  wayland-protocols-1.33.tar.xz
PGP:
https://gitlab.freedesktop.org/wayland/wayland-protocols/-/releases/1.33/downloads/wayland-protocols-1.33.tar.xz.sig
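
A minimal sketch of checking a downloaded tarball against the SHA256
digest published above, using only Python's standard hashlib module. The
helper names and the local file path passed in are assumptions for
illustration; the PGP signature still needs to be verified separately.

```python
import hashlib

# Digest copied from the announcement above.
EXPECTED_SHA256 = "94f0c50b090d6e61a03f62048467b19abbe851be4e11ae7b36f65f8b98c3963a"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals `expected` (case-insensitive)."""
    return sha256_of(path) == expected.lower()
```

Calling matches_published_digest("wayland-protocols-1.33.tar.xz") on the
downloaded file should return True for an intact download.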


Re: Right mailing list for mutter/gnome-remote-desktop question?

2024-01-17 Thread Daniel Stone
Hi Matt,

On Wed, 17 Jan 2024 at 17:08, Matt Hoosier  wrote:
> Does anybody know whether there’s a dedicated mailing list suitable for 
> asking questions about the hardware acceleration in the remote desktop 
> use-case for those two?
>
> I did a quick look through both repos’ README and CONTRIBUTING files, but 
> didn’t find anything.

https://discourse.gnome.org is probably your best bet there.

Cheers,
Daniel


Re: Sub 16ms render but missing swap

2023-10-18 Thread Daniel Stone
Hi Joe,

On Wed, 18 Oct 2023 at 02:00, Joe M  wrote:
> A few questions:
>   1. What other avenues of investigation should I pursue for the swap delay? 
> As in, why when I take 12 ms to render do I not see about 4ms for the swap 
> call to return? My display is running at 60 Hz.

Further to Emmanuel's point about GPU rendering being async (you can
validate by calling glFinish before eglSwapBuffers, which will wait
for everything to complete) - which hardware platform are you using
here, and which software stack as well? As in, do your Weston +
drivers + etc come from upstream projects or are they provided by a
vendor?
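
As a rough illustration of the arithmetic behind the question (not a
model of how Weston or EGL actually schedule frames): at 60 Hz one
refresh period is about 16.7 ms, so 12 ms of rendering would leave
roughly 4.7 ms if the rendering were synchronous. Because GPU work is
asynchronous, CPU-side timestamps around the draw calls may not measure
the real render time at all. The function names are hypothetical.

```python
def refresh_period_ms(refresh_hz):
    """Length of one display refresh period in milliseconds."""
    return 1000.0 / refresh_hz


def slack_ms(render_ms, refresh_hz):
    """Time left in one refresh period after `render_ms` of work.

    A negative result means the work did not fit into a single period.
    """
    return refresh_period_ms(refresh_hz) - render_ms
```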

>   2. Has EGL been optimized to use the available wayland callbacks and 
> maximize available client drawing time?

Yes, very much.

>   3. Does EGL leverage "weston_direct_display_v1" when available? What's 
> required to take advantage of it in the app code? (ie. run fullscreen?)

No need. We bypass composition as much as we possibly can. You can try
using weston-simple-egl with the flag to use direct-display if you
want to satisfy yourself, but it's in no way required to bypass GPU
composition and use the display controller to scan out.

Cheers,
Daniel


Re: [PATCH v1] dynamic_debug: add support for logs destination

2023-10-12 Thread Daniel Vetter
On Thu, Oct 12, 2023 at 01:39:44PM +0300, Pekka Paalanen wrote:
> On Thu, 12 Oct 2023 11:53:52 +0200
> Daniel Vetter  wrote:
> 
> > On Thu, Oct 12, 2023 at 11:55:48AM +0300, Pekka Paalanen wrote:
> > > On Wed, 11 Oct 2023 11:42:24 +0200
> > > Daniel Vetter  wrote:
> > >   
> > > > On Wed, Oct 11, 2023 at 11:48:16AM +0300, Pekka Paalanen wrote:  
> 
> ...
> 
> > > > > - all selections tailored separately for each userspace subscriber
> > > > > (- per open device file description selection of messages)
> > > > 
> > > > Again this feels like a userspace problem. Sessions could register what
> > > > kind of info they need for their session, and something like journald can
> > > > figure out how to record it all.
> > > 
> > > Only if the kernel actually attaches all the required information to
> > > the debug messages *in machine readable form* so that userspace
> > > actually can do the filtering. And that makes *that* information UABI.
> > > Maybe that's fine? I wouldn't know.  
> > 
> > Well if you configure the filters to go into separate ringbuffers for each
> > session (or whatever you want to split) it also becomes uapi.
> 
> It's a different UAPI: filter configuration vs. message structure. I
> don't mind which it is, I just suspect one is easier to maintain and
> extend than the other.
> 
> > Also I'd say that for the first cut just getting the logs out on demand
> > should be good enough, multi-gpu (or multi-compositor) systems are a step
> > further. We can figure those out when we get there.
> 
> This reminds me of what you recently said in IRC about a very different
> topic:
> 
>swick[m], tell this past me roughly 10 years ago, would
>   have been easy to add into the design back when there was no
>   driver code yet 
> 
> I just want to mention today everything I can see as useful. It's up to
> the people doing the actual work to decide what they include and how.

I actually pondered this a bit more today, and I think even with hindsight
the atomic design we ended up with was probably rather close to optimal.

Sure there's a bunch of things that would have been nice to include, but
another very hard requirement of atomic was that it's feasible to convert
current drivers over to it. And I think going full free-standing state
structures with unlimited (at least at the design level) queue depth would
have been a bridge too far.

The hacks and conversion helpers are all gone by now, but "you can just
peek at the object struct to get your state" was a huge help in reducing
the conversion churn.

But it definitely resulted in a big price we're still paying.

tldr I don't think getting somewhere useful, even if somewhat deficient,
is bad.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v1] dynamic_debug: add support for logs destination

2023-10-12 Thread Daniel Vetter
On Thu, Oct 12, 2023 at 11:55:48AM +0300, Pekka Paalanen wrote:
> On Wed, 11 Oct 2023 11:42:24 +0200
> Daniel Vetter  wrote:
> 
> > On Wed, Oct 11, 2023 at 11:48:16AM +0300, Pekka Paalanen wrote:
> > > On Tue, 10 Oct 2023 10:06:02 -0600
> > > jim.cro...@gmail.com wrote:
> > >   
> > > > since I name-dropped you all,  
> > > 
> > > Hi everyone,
> > > 
> > > I'm really happy to see this topic being developed! I've practically
> > > > forgotten about it myself, but the need for it has not diminished at all.
> > > 
> > > I didn't understand much of the conversation, so I'll just reiterate
> > > what I would use it for, as a Wayland compositor developer.
> > > 
> > > I added a few more cc's to get better coverage of DRM and Wayland
> > > compositor developers.
> > >   
> > > > On Tue, Oct 10, 2023 at 10:01 AM  wrote:  
> > > > >
> > > > > > On Mon, Oct 9, 2023 at 4:47 PM Łukasz Bartosik wrote:
> > > 
> > > ...
> > >   
> > > > > > I don't have a real life use case to configure different trace
> > > > > > instance for each callsite.
> > > > > > > I just tried to be as flexible as possible.
> > > > > >
> > > > >
> > > > > > I've come around to agree - I looked back at some old threads
> > > > > (that I was a part of, and barely remembered :-}
> > > > >
> > > > > At least Sean Paul, Lyude, Simon Ser, Pekka Paalanen
> > > > > have expressed a desire for a "flight-recorder"
> > > > > it'd be hard to say now that 2-3 such buffers would always be enough,
> > > > > > especially as there's a performance reason for having your own.
> > > 
> > > A Wayland compositor has roughly three important things where the kernel
> > > debugs might come in handy:
> > > - input
> > > - DRM KMS
> > > - DRM GPU rendering
> > > 
> > > DRM KMS is the one I've been thinking of in the flight recorder context
> > > the most, because KMS hardware varies a lot, and there is plenty of
> > > room for both KMS drivers and KMS userspace to go wrong. The usual
> > > result is your display doesn't work, so the system is practically
> > > unusable to the end user. In the wild, the simplest or maybe the only
> > > way out of that may be a reboot, maybe an automated one (e.g. digital
> > > signage). In order to debug such problems, we would need both
> > > compositor logs and the relevant kernel debug messages.
> > > 
> > > For example, Weston already has a flight recorder framework of its own,
> > > so we have the compositor debug logs. It would be useful to get the
> > > selected kernel debug logs in the same place. It could be used for
> > > > automated or semi-manual bug reporting, for example, making it much
> > > > easier for the administrator or end user to report issues.
> > > 
> > > Since this is usually a production environment, and the Wayland
> > > compositor runs without root privileges, we need something that works
> > > with that. We would likely want the kernel debug messages in the
> > > compositor to combine and order them properly with the compositor debug
> > > messages.
> > > 
> > > It's quite likely that developers would like to pick and choose which
> > > kernel debug messages might be interesting enough to record, to avoid
> > > excessive log flooding. The flight recorder in Weston is fixed size to
> > > avoid running out of memory or disk space. I can also see that Weston
> > > could have debugging options that affect which kernel debug messages it
> > > subscribes to. We can have a reasonable default setup that allows us to
> > > pinpoint the problem area and figure out most problems, and if needed,
> > > > we could ask the administrator to pass another debug option to Weston. It
> > > helps if there is just one place to configure everything about the
> > > compositor.
> > > 
> > > This implies that it would be really nice to have userspace subscriber
> > > specific debug message streams from the kernel, or a good way to filter
> > > the messages we want. A Wayland compositor would not be interested in
> > > file system or wireless debugs for example, but another system
> > > component might be. There is also a security aspect of which component is
> > > allowed to see which messages in c

Re: [PATCH v1] dynamic_debug: add support for logs destination

2023-10-11 Thread Daniel Vetter
ss to get a
>   little too much DRM KMS debug (that is, from the whole device instead
>   of just the leased parts), it may not be worth to consider splitting
>   debug message streams this far.
> 
> If userspace is offered some standardised fields in kernel debug
> structures, then userspace could do some filtering on its own too, but I
> guess it would be better to filter at the source and not need that.
> 
> There is also an anti-goal. The kernel debug message contents are
> specifically not machine-parsable. I very much do not want to impose
> debug strings as ABI, they are for human (and AI?) readers only.
> 
> 
> As a summary, here are the most important requirements first:
> - usable in production as a normal thing to enable always by default
> - final delivery to unprivileged userspace process

I think this is the one that's trickiest, and I don't fully understand why
you need it. The issues I'm seeing:

- logs tend to leak a lot of kernel internal state that's useful for
  attacks. There are measures for the worst (like obfuscating kernel
  pointers by hashing them), so there's always going to be a difference
  here between what full sysadmin can get and what unprivileged userspace
  can get. And there's always a risk we miss something that we should
  obfuscate but didn't.

- handing this to userspace increases the risk that it becomes uapi. Who's
  going to stop compositors from sussing out the reason an atomic commit
  failed from the logs if they can get them easily, and these logs contain
  very interesting information about why something failed?

  Much better if journald or a crash handler assembles all the different
  flight recorder logs and packages it into a bug report so that the
  compositor cannot ever get at these directly. Yeah this needs some OS
  support with a dbus request or similar so that the compositor can ask
  for a crash dump with everything relevant to its session.

- the idea of an in-kernel flight recorder is that it's really fast. The
  entire tracing infra is built such that recording an event is really
  quick, but printing it is not - the entire string formatting is delayed
  to when userspace reads the buffers. If you constantly push the log
  messages to userspace we toss the advantage of the low-overhead
  in-kernel flight recorder. If you push logs to dmesg there's a
  substantial measurable overhead which you don't really want in
  production, and your requirement would impose the same.

- I'm not sure how this is supposed to mesh with userspace log aggregators
  like journald when every compositor has its own flight recorder on top.
  Feels a bit like a solution that ignores the entire os stack and assumes
  that the only pieces we can touch are the kernel and the compositor to
  get to such a flight recorder.

  You might object that events will get out-of-order if you merge multiple
  logs after the fact, but that's the case anyway if we use the tracing
  framework since that's always a ringbuffer within the kernel and not
  synchronous. The only thing we could do is allow userspace to push
  markers into that ringbuffer, which is easily done by adding more debug
  output lines (heck we could even add a logging cookie to certain ioctl
  when userspace really cares about knowing exact ordering of its
  requests with the stuff the kernel does).

- If you really want direct delivery to userspace I guess we could do
  something where sessiond opens the flight recorder fd for you, sets it
  all up and passes it to the compositor. But I'm really not a big fan of
  sending the full kms dbg spam to compositors to freely digest in real
  time.
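
The cost argument in the third point above can be illustrated with a
toy flight recorder: recording stores only the format string and its
arguments in a fixed-size ring, and the string formatting happens when a
reader drains the buffer. This is a Python sketch of the general idea,
not the kernel tracing implementation; the class and method names are
made up for the example.

```python
from collections import deque


class FlightRecorder:
    """Fixed-size ring of unformatted events (toy model)."""

    def __init__(self, capacity):
        # deque with maxlen silently drops the oldest entry when full.
        self._events = deque(maxlen=capacity)
        self._seq = 0

    def record(self, fmt, *args):
        """Cheap path: keep the format string and arguments, format nothing."""
        self._events.append((self._seq, fmt, args))
        self._seq += 1

    def read(self):
        """Expensive path: format whatever is still in the ring, oldest first."""
        return ["[%d] %s" % (seq, fmt % args) for seq, fmt, args in self._events]
```

The sequence number plays the role of the ordering marker mentioned
above: a reader can see that older entries were overwritten and can
merge the output with other logs after the fact.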

> - per debug-print selection of messages (finer or coarser, categories
>   within a kernel sub-system could be enough)
> - per originating device (driver instance) selection of messages

The dyndbg stuff can do all that already, which is why I'm so much in
favour of relying on that framework.
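
For readers unfamiliar with dyndbg, a sketch of what such a selection
looks like: callsites are selected by writing a small query to the
dynamic debug control file. The Python helpers below only build and
write that query string; the helper names are invented, the default
control path assumes debugfs is mounted at /sys/kernel/debug, and
writing to it normally requires root.

```python
def dyndbg_command(flags, module=None, file=None, func=None):
    """Build a dynamic debug query such as 'module drm +p'.

    `flags` must start with '+', '-' or '=' followed by flag letters.
    """
    if not flags or flags[0] not in "+-=":
        raise ValueError("flags must start with '+', '-' or '='")
    parts = []
    if module:
        parts += ["module", module]
    if file:
        parts += ["file", file]
    if func:
        parts += ["func", func]
    parts.append(flags)
    return " ".join(parts)


def apply_command(command, control_path="/sys/kernel/debug/dynamic_debug/control"):
    """Write one query to the control file (path is an assumption; needs privileges)."""
    with open(control_path, "w") as f:
        f.write(command)
```

For example, dyndbg_command("+p", module="drm") yields "module drm +p",
which enables printing for all debug callsites in that module.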

> - all selections tailored separately for each userspace subscriber
> (- per open device file description selection of messages)

Again this feels like a userspace problem. Sessions could register what
kind of info they need for their session, and something like journald can
figure out how to record it all.

If you want the kernel to keep separate flight recorders I guess we could
add that, but I don't think it currently exists for the dyndbg stuff at
least. Maybe a flight recorder v2 feature, once the basics are in.

> That's my idea of it. It is interesting to see how far the requirements
> can be reasonably realised.

I think aside from the "make it available directly to unprivileged
userspace" everything sounds reasonable and doable.

More on the process side of things, I think Jim is very much looking for
acks and tested-by by people who are interested in better drm logging
infra. That should help ensure that things are moving in a direction that's
actually useful, even when it's not yet entirely complete.

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v5 6/6] drm/doc: Define KMS atomic state set

2023-08-02 Thread Daniel Vetter
On Mon, 31 Jul 2023 at 04:01, André Almeida  wrote:
>
> Em 13/07/2023 04:51, Pekka Paalanen escreveu:
> > On Tue, 11 Jul 2023 10:57:57 +0200
> > Daniel Vetter  wrote:
> >
> >> On Fri, Jul 07, 2023 at 07:40:59PM -0300, André Almeida wrote:
> >>> From: Pekka Paalanen 
> >>>
> >>> Specify how the atomic state is maintained between userspace and
> >>> kernel, plus the special case for async flips.
> >>>
> >>> Signed-off-by: Pekka Paalanen 
> >>> Signed-off-by: André Almeida 
> >>> ---
> >>> v4: total rework by Pekka
> >>> ---
> >>>   Documentation/gpu/drm-uapi.rst | 41 ++
> >>>   1 file changed, 41 insertions(+)
> >>>
> >>> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> >>> index 65fb3036a580..6a1662c08901 100644
> >>> --- a/Documentation/gpu/drm-uapi.rst
> >>> +++ b/Documentation/gpu/drm-uapi.rst
> >>> @@ -486,3 +486,44 @@ and the CRTC index is its position in this array.
> >>>
> >>>   .. kernel-doc:: include/uapi/drm/drm_mode.h
> >>>  :internal:
> >>> +
> >>> +KMS atomic state
> >>> +================
> >>> +
> >>> +An atomic commit can change multiple KMS properties in an atomic fashion,
> >>> +without ever applying intermediate or partial state changes.  Either the whole
> >>> +commit succeeds or fails, and it will never be applied partially. This is the
> >>> +fundamental improvement of the atomic API over the older non-atomic API which is
> >>> +referred to as the "legacy API".  Applying intermediate state could unexpectedly
> >>> +fail, cause visible glitches, or delay reaching the final state.
> >>> +
> >>> +An atomic commit can be flagged with DRM_MODE_ATOMIC_TEST_ONLY, which means the
> >>> +complete state change is validated but not applied.  Userspace should use this
> >>> +flag to validate any state change before asking to apply it. If validation fails
> >>> +for any reason, userspace should attempt to fall back to another, perhaps
> >>> +simpler, final state.  This allows userspace to probe for various configurations
> >>> +without causing visible glitches on screen and without the need to undo a
> >>> +probing change.
> >>> +
> >>> +The changes recorded in an atomic commit apply on top the current KMS state in
> >>> +the kernel. Hence, the complete new KMS state is the complete old KMS state with
> >>> +the committed property settings done on top. The kernel will automatically avoid
> >>> +no-operation changes, so it is safe and even expected for userspace to send
> >>> +redundant property settings.  No-operation changes do not count towards actually
> >>> +needed changes, e.g.  setting MODE_ID to a different blob with identical
> >>> +contents as the current KMS state shall not be a modeset on its own.
> >>
> >> Small clarification: The kernel indeed tries very hard to make redundant
> >> changes a no-op, and I think we should consider any issues here bugs. But
> >> it still has to check, which means it needs to acquire the right locks and
> >> put in the right (cross-crtc) synchronization points, and due to
> >> implementation challenges it's very hard to try to avoid that in all cases.
> >> So adding redundant changes especially across crtc (and their connected
> >> planes/connectors) might result in some oversynchronization issues, and
> >> userspace should therefore avoid them if feasible.
> >>
> >> With some sentences added to clarify this:
> >>
> >> Reviewed-by: Daniel Vetter 
> >
> > After talking on IRC yesterday, we realized that the no-op rule is
> > nowhere near as generic as I have believed. Roughly:
> > https://oftc.irclog.whitequark.org/dri-devel/2023-07-12#1689152446-1689157291;
> >
> >
>
> How about:
>
> The changes recorded in an atomic commit apply on top the current KMS
> state in the kernel. Hence, the complete new KMS state is the complete
> old KMS state with the committed property settings done on top. The
> kernel will try to avoid no-operation changes, so it is safe for
>

Re: Need support to display application at (0, 0) position on Weston desktop

2023-07-12 Thread Daniel Stone
Hi Huy,

On Wed, 12 Jul 2023 at 16:15, huy nguyen wrote:

> I have a Linux system based on weston wayland. I run MPV player and expect
> it displays a video window at (0,0) position on the screen (top left corner
> of the display). I already use x11egl backend option to MPV to support a
> fixed position to application but the video window of MPV is displayed at
> an offset (X offset, Y offset) from (0,0) position as shown by the picture
> below:
>

You probably want to make mpv be fullscreen, and then it will take up the
whole area of the screen. kiosk-shell does this well, by telling all
applications to be fullscreen.

Cheers,
Daniel


Re: [PATCH v5 6/6] drm/doc: Define KMS atomic state set

2023-07-11 Thread Daniel Vetter
On Fri, Jul 07, 2023 at 07:40:59PM -0300, André Almeida wrote:
> From: Pekka Paalanen 
> 
> Specify how the atomic state is maintained between userspace and
> kernel, plus the special case for async flips.
> 
> Signed-off-by: Pekka Paalanen 
> Signed-off-by: André Almeida 
> ---
> v4: total rework by Pekka
> ---
>  Documentation/gpu/drm-uapi.rst | 41 ++
>  1 file changed, 41 insertions(+)
> 
> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> index 65fb3036a580..6a1662c08901 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -486,3 +486,44 @@ and the CRTC index is its position in this array.
>  
>  .. kernel-doc:: include/uapi/drm/drm_mode.h
>     :internal:
> +
> +KMS atomic state
> +================
> +
> +An atomic commit can change multiple KMS properties in an atomic fashion,
> +without ever applying intermediate or partial state changes.  Either the whole
> +commit succeeds or fails, and it will never be applied partially. This is the
> +fundamental improvement of the atomic API over the older non-atomic API which is
> +referred to as the "legacy API".  Applying intermediate state could unexpectedly
> +fail, cause visible glitches, or delay reaching the final state.
> +
> +An atomic commit can be flagged with DRM_MODE_ATOMIC_TEST_ONLY, which means the
> +complete state change is validated but not applied.  Userspace should use this
> +flag to validate any state change before asking to apply it. If validation fails
> +for any reason, userspace should attempt to fall back to another, perhaps
> +simpler, final state.  This allows userspace to probe for various configurations
> +without causing visible glitches on screen and without the need to undo a
> +probing change.
> +
> +The changes recorded in an atomic commit apply on top the current KMS state in
> +the kernel. Hence, the complete new KMS state is the complete old KMS state with
> +the committed property settings done on top. The kernel will automatically avoid
> +no-operation changes, so it is safe and even expected for userspace to send
> +redundant property settings.  No-operation changes do not count towards actually
> +needed changes, e.g.  setting MODE_ID to a different blob with identical
> +contents as the current KMS state shall not be a modeset on its own.

Small clarification: The kernel indeed tries very hard to make redundant
changes a no-op, and I think we should consider any issues here bugs. But
it still has to check, which means it needs to acquire the right locks and
put in the right (cross-crtc) synchronization points, and due to
implementation challenges it's very hard to try to avoid that in all cases.
So adding redundant changes especially across crtc (and their connected
planes/connectors) might result in some oversynchronization issues, and
userspace should therefore avoid them if feasible.

With some sentences added to clarify this:

Reviewed-by: Daniel Vetter 
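
A minimal userspace sketch of the rule discussed above, in plain C and
without libdrm: the new state is the old state with the committed
properties applied on top, redundant settings are detected as no-ops, and
a commit is applied whole or not at all. The names prop_value, find_prop
and apply_commit are hypothetical, invented for this illustration; they are
not kernel or libdrm API. The model compares raw values only, so it does
not capture the MODE_ID case where two different blobs have identical
contents.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one KMS property value; not a kernel or libdrm type. */
struct prop_value {
    uint32_t object_id; /* plane, CRTC or connector owning the property */
    uint32_t prop_id;   /* which property on that object */
    uint64_t value;
};

/* Find the slot in `state` holding (object_id, prop_id), or NULL if absent. */
static struct prop_value *find_prop(struct prop_value *state, size_t n,
                                    uint32_t object_id, uint32_t prop_id)
{
    for (size_t i = 0; i < n; i++)
        if (state[i].object_id == object_id && state[i].prop_id == prop_id)
            return &state[i];
    return NULL;
}

/*
 * Apply `commit` on top of `state` in place: new state = old state + commit.
 * Returns the number of entries that changed a stored value; entries that
 * repeat the current value are skipped as no-ops. Returns -1 and leaves
 * `state` untouched if any entry names an unknown property, so a commit is
 * never applied partially.
 */
static int apply_commit(struct prop_value *state, size_t n_state,
                        const struct prop_value *commit, size_t n_commit)
{
    int changed = 0;

    /* Validate everything first, in the spirit of a TEST_ONLY pass. */
    for (size_t i = 0; i < n_commit; i++)
        if (!find_prop(state, n_state, commit[i].object_id, commit[i].prop_id))
            return -1;

    for (size_t i = 0; i < n_commit; i++) {
        struct prop_value *slot = find_prop(state, n_state,
                                            commit[i].object_id,
                                            commit[i].prop_id);
        if (slot->value == commit[i].value)
            continue; /* redundant setting: a no-op */
        slot->value = commit[i].value;
        changed++;
    }
    return changed;
}
```

Given the oversynchronization concern raised above, a compositor could run
the same comparison against its own cached copy of the state and drop
redundant entries before it builds the real commit.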

> +
> +A "modeset" is a change in KMS state that might enable, disable, or temporarily
> +disrupt the emitted video signal, possibly causing visible glitches on screen. A
> +modeset may also take considerably more time to complete than other kinds of
> +changes, and the video sink might also need time to adapt to the new signal
> +properties. Therefore a modeset must be explicitly allowed with the flag
> +DRM_MODE_ATOMIC_ALLOW_MODESET.  This in combination with
> +DRM_MODE_ATOMIC_TEST_ONLY allows userspace to determine if a state change is
> +likely to cause visible disruption on screen and avoid such changes when end
> +users do not expect them.
> +
> +An atomic commit with the flag DRM_MODE_PAGE_FLIP_ASYNC is allowed to
> +effectively change only the FB_ID property on any planes. No-operation changes
> +are ignored as always. Changing any other property will cause the commit to be
> +rejected.
> -- 
> 2.41.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Need support to have weston randr release

2023-07-08 Thread Daniel Stone
Hi Huy,

On Sat, 8 Jul 2023 at 08:39, huy nguyen  wrote:

> I have a Linux system based on weston wayland and I need to get the
> current setting of the display resolution.
> Unfortunately, xrandr command does not work on Wayland.
> After much searching, I came to this information which is about adding
> weston-randr support for Weston compositor:
>
> https://lists.freedesktop.org/archives/wayland-devel/2014-February/013480.html
>
> However, I could not find a link to download the patch to apply in order
> to have weston-randr command
> Please advise if the patch is available for the community to use.
>

The current resolution is provided as an event on the wl_output interfaces.
From the command line, you can get this from wayland-info.

Cheers,
Daniel


Re: Weston mirror/clone to 2 different displays

2023-06-23 Thread Daniel Stone
Hi Dawn,

On Thu, 22 Jun 2023 at 18:09, Dawn HOWE  wrote:

> I am developing an (embedded) medical device which is required to have a
> touchscreen display and also mirror the output to a monitor connected via
> HDMI. The device is using Wayland/Weston on TorizonCore (based on a yocto
> kirkstone). I am able to get the display extended from HDMI to LVDS, but
> not have the output mirrored to both displays. I posted a query on the
> Toradex community, and received a response that Weston may not be capable
> of doing this. (
> https://community.toradex.com/t/apalis-imx8-hdmi-and-lvds-display-not-mirroring-cloning/19869
> ).
>
>
>
> I have searched and found some old posts from several years ago indicating
> that it was not supported, but may be with a patch. I understand that
> “same-as” configuration in weston.ini does not work for my scenario.
>
>
>
> What is the current state of cloning/mirroring to two different outputs,
> but on the same card. E.g (card1-HDMI-A-1 and card1-LVDS-1):
>
> ls /sys/class/drm
>
> card0  card1  card1-HDMI-A-1  card1-LVDS-1  renderD128
> renderD129  version
>

Weston can currently mirror it if the display driver directly supports it.
You can use the same-as configuration option (see man weston-drm) to enable
this. If your display driver doesn't support this in the kernel, then
Weston won't do it for now, but we are actively working on this and expect
to have a branch capable of this within the next couple of weeks or so.

Cheers,
Daniel


Re: Weston 12 compatibility with Yocto Kirkstone

2023-06-15 Thread Daniel Stone
Hi Namit,

On Thu, 15 Jun 2023 at 16:37, Namit Solanki (QUIC) <
quic_nsola...@quicinc.com> wrote:

>
> As we all know Weston 10 has bitbakes files available for Yocto kirkstone
> version. Can Weston 12 work with Kirkstone as well?
>
>
>
> Is Weston 12 compatible with Kirkstone?
>
>
>
> Do we need to write our own bitbake files for Weston 12 to compile with
> Kirkstone?
>
>
>
> Please help on these queries.
>

OpenEmbedded does have an 11.0.1 build definition for Weston. You should be
able to reuse this whilst bumping the version to 12.0.0, as long as you
also pull the other dependencies from OE such as libseat and probably a
newer version of Meson.

Cheers,
Daniel


Re: Refresh rates with multiple monitors

2023-06-14 Thread Daniel Stone
Hi Joe,

On Wed, 14 Jun 2023 at 21:33, Joe M  wrote:

> Thanks Daniel. Do you know if wl_output instances are decoupled from each
> other, when it comes to display refresh?
>

Yep, absolutely.


> The wl_output geometry info hints that each output can be thought of as a
> region in a larger compositor canvas, given the logical x/y fields in the
> geometry. Is the compositor able to handle the repaint scheduling in a
> refresh-aware way?
>

Yes.


> I'm trying to get a better understanding of how these pieces interact to
> maximize draw time but still hit the glass every frame. The various blog
> posts and documentation out there are pretty clear when it comes to drawing
> to a single physical display, but less so when multiple displays are in use.
>

Per-output repaint cycles are taken as a given. You can assume that every
compositor does this, and any compositor which doesn't do this is so
hopelessly broken as to not be worth considering.
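
As a toy illustration of why per-output cycles are independent, the sketch
below computes the next vblank deadline for each output from its own
refresh rate and its own last vblank timestamp. It is a simplified model
under stated assumptions, not Weston's actual repaint scheduler: the struct
output_timing and both helper functions are hypothetical names, and the
only detail taken from the protocol is that wl_output reports the refresh
rate in mHz.

```c
#include <stdint.h>

/* Hypothetical per-output timing record; not a Weston or Wayland type. */
struct output_timing {
    int32_t refresh_mhz;     /* refresh rate in mHz, as in wl_output.mode */
    uint64_t last_vblank_ns; /* timestamp of the last completed vblank */
};

/* Frame period in nanoseconds: 1 s = 1e9 ns and 1 Hz = 1000 mHz. */
static uint64_t frame_period_ns(const struct output_timing *o)
{
    return 1000000000000ULL / (uint64_t)o->refresh_mhz;
}

/*
 * First vblank strictly after `now_ns` for this output. Each output is
 * evaluated on its own, so a 60 Hz panel and a 144 Hz monitor get
 * unrelated deadlines.
 */
static uint64_t next_vblank_ns(const struct output_timing *o, uint64_t now_ns)
{
    uint64_t period = frame_period_ns(o);

    if (now_ns < o->last_vblank_ns)
        return o->last_vblank_ns;

    uint64_t elapsed = now_ns - o->last_vblank_ns;
    return o->last_vblank_ns + (elapsed / period + 1) * period;
}
```

A compositor would then schedule the repaint for each output some margin
before that output's own deadline, without waiting on any other output.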

Cheers,
Daniel


Re: Refresh rates with multiple monitors

2023-06-13 Thread Daniel Stone
Hi,

On Tue, 13 Jun 2023 at 10:20, Pekka Paalanen  wrote:

> On Tue, 13 Jun 2023 01:11:44 +0000 (UTC)
> Joe M  wrote:
> > As I understand, there is one global wl_display. Is there always one
> > wl_compositor too?
>
> That is inconsequential.
>

Yeah, I think the really consequential thing is that a wl_display really
just represents a connection to a Wayland server (aka compositor).

Display targets (e.g. 'the HDMI connector on the left', 'the DSI panel')
are represented by wl_output objects. There is one of those for each output.

Cheers,
Daniel


Re: Why does Java (XWayland / Weston) resize a Window to 1x1 pixel when HDMI is unplugged (and does not resize back when HDMI is plugged)

2023-06-08 Thread Daniel Stone
On Thu, 8 Jun 2023 at 16:54, Martin Petzold  wrote:

> On 08.06.23 at 16:58, Daniel Stone wrote:
>
> On Thu, 8 Jun 2023 at 14:28, Pekka Paalanen  wrote:
>
>> On Thu, 8 Jun 2023 14:49:37 +0200
>> Martin Petzold  wrote:
>> > btw. we are using a Weston 9 package from NXP and there may be important
>> > fixes for our i.MX8 platform in there.
>> Oh. We cannot support modified Weston, sorry. Significant vendor
>> modifications tend to break things, and we have no idea what they do or
>> why. Maybe this problem is not because of that, maybe it is, hard to
>> guess.
>
>
> The good news is that mainline Linux runs very well on all i.MX6, and most
> i.MX8 platforms. You can ditch the NXP BSP and just use a vanilla Yocto
> build for your machine. This will have upstream Weston which should solve
> your problem.
>
> Do you mean Linux mainline or Yocto mainline?
>
> Because we are building from Debian and not from Yocto, for several
> reasons. We have a more complex system setup.
>
Ah, I wasn't aware they also had Debian distributions. Nice. Yes, I mean
mainline of upstream Linux + Mesa + Weston (plus GStreamer etc if you want
to use that). That's worked very well out of the box for a few years now
with no vendor trees required.

Cheers,
Daniel


Re: Why does Java (XWayland / Weston) resize a Window to 1x1 pixel when HDMI is unplugged (and does not resize back when HDMI is plugged)

2023-06-08 Thread Daniel Stone
Hi,

On Thu, 8 Jun 2023 at 14:28, Pekka Paalanen  wrote:

> On Thu, 8 Jun 2023 14:49:37 +0200
> Martin Petzold  wrote:
> > btw. we are using a Weston 9 package from NXP and there may be important
> > fixes for our i.MX8 platform in there.
>
> Oh. We cannot support modified Weston, sorry. Significant vendor
> modifications tend to break things, and we have no idea what they do or
> why. Maybe this problem is not because of that, maybe it is, hard to
> guess.


The good news is that mainline Linux runs very well on all i.MX6, and most
i.MX8 platforms. You can ditch the NXP BSP and just use a vanilla Yocto
build for your machine. This will have upstream Weston which should solve
your problem.

Cheers,
Daniel


Re: [RFC] Plane color pipeline KMS uAPI

2023-05-08 Thread Daniel Vetter
On Mon, 8 May 2023 at 10:58, Simon Ser  wrote:
>
> On Friday, May 5th, 2023 at 21:53, Daniel Vetter  wrote:
>
> > > On Fri, May 05, 2023 at 04:06:26PM +0000, Simon Ser wrote:
> > > On Friday, May 5th, 2023 at 17:28, Daniel Vetter  wrote:
> > >
> > > > Ok no comments from me on the actual color operations and semantics of 
> > > > all
> > > > that, because I have simply nothing to bring to that except confusion 
> > > > :-)
> > > >
> > > > Some higher level thoughts instead:
> > > >
> > > > - I really like that we just go with graph nodes here. I think that was
> > > >   bound to happen sooner or later with kms (we almost got there with
> > > >   writeback, and with hindsight maybe should have).
> > >
> > > I'd really rather not do graphs here. We only need linked lists as 
> > > Sebastian
> > > said. Graphs would significantly add more complexity to this proposal, and
> > > I don't think that's a good idea unless there is a strong use-case.
> >
> > You have a graph, because a graph is just nodes + links. I did _not_
> > propose a full generic graph structure, the link pointer would be in the
> > class/type specific structure only. Like how we have the plane->crtc or
> > connector->crtc links already like that (which already _is_ a full blown
> > graph).
>
> I really don't get why a pointer in a struct makes plane->crtc a full-blown
> graph. There is only a single parent-child link. A plane has a reference to a
> CRTC, and nothing more.
>
> You could say that anything is a graph. Yes, even an isolated struct somewhere
> is a graph: one with a single node and no link. But I don't follow what's the
> point of explaining everything with a graph when we only need a much simpler
> subset of the concept of graphs?
>
> Putting the graph thing aside, what are you suggesting exactly from a concrete
> uAPI point-of-view? Introducing a new struct type? Would it be a colorop
> specific struct, or a more generic one? What would be the fields? Why do you
> think that's necessary and better than the current proposal?
>
> My understanding so far is that you're suggesting introducing something like
> this at the uAPI level:
>
> struct drm_mode_node {
>     uint32_t id;
>
>     uint32_t children_count;
>     uint32_t *children; // list of child object IDs
> };

Already too much I think

struct drm_mode_node {
    struct drm_mode_object base;
    struct drm_private_obj atomic_base;
    enum drm_mode_node_enum type;
};

The actual graph links would be in the specific type's state
structure, like they are for everything else. And the limits would be
on the property type, we probably need a new DRM_MODE_PROP_OBJECT_ENUM
to make the new limitations work correctly, since the current
DRM_MODE_PROP_OBJECT only limits to a specific type of object, not an
explicit list of drm_mode_object.id.

You might not even need a node subclass for the state stuff, that
would directly be a drm_color_op_state that only embeds
drm_private_state.

Another uapi difference is that the new kms objects would be of type
DRM_MODE_OBJECT_NODE, and would always have a "class" property.

> I don't think this is a good idea for multiple reasons. First, this is
> overkill: we don't need this complexity, and this complexity will make it more
> difficult to reason about the color pipeline. This is a premature abstraction,
> one we don't need right now, and one I haven't heard a potential future
> use-case for. Sure, one can kill an ant with a sledgehammer if they'd like, but
> that's not the right tool for the job.
>
> Second, this will make user-space miserable. User-space already has a tricky
> task to achieve to translate its abstract descriptive color pipeline to our
> proposed simple list of color operations. If we expose a full-blown graph, then
> the user-space logic will need to handle arbitrary graphs. This will have a
> significant cost (on implementation and testing), which we will be paying in
> terms of time spent and in terms of bugs.

The color op pipeline would still be linear. I did not ask for a non-linear one.

> Last, this kind of generic "node" struct is at odds with existing KMS object
> types. So far, KMS objects are concrete like CRTC, connector, plane, etc.
> "Node" is abstract. This is inconsistent.

Yeah, I think we should change that. That's essentially the
full extent of my proposal. The classes + possible_foo mask approach
just always felt rather brittle to me (and there's plenty of userspace
out there to prove that's the case), going more explicit with the
links with enumerated combos feels better. P

Re: [RFC] Plane color pipeline KMS uAPI

2023-05-08 Thread Daniel Vetter
On Mon, 8 May 2023 at 10:24, Pekka Paalanen  wrote:
>
> On Fri, 5 May 2023 21:51:41 +0200
> Daniel Vetter  wrote:
>
> > On Fri, May 05, 2023 at 05:57:37PM +0200, Sebastian Wick wrote:
> > > On Fri, May 5, 2023 at 5:28 PM Daniel Vetter  wrote:
> > > >
> > > > > On Thu, May 04, 2023 at 03:22:59PM +0000, Simon Ser wrote:
> > > > > Hi all,
> > > > >
> > > > > The goal of this RFC is to expose a generic KMS uAPI to configure the 
> > > > > color
> > > > > pipeline before blending, ie. after a pixel is tapped from a plane's
> > > > > framebuffer and before it's blended with other planes. With this new 
> > > > > uAPI we
> > > > > aim to reduce the battery life impact of color management and HDR on 
> > > > > mobile
> > > > > devices, to improve performance and to decrease latency by skipping
> > > > > composition on the 3D engine. This proposal is the result of 
> > > > > discussions at
> > > > > the Red Hat HDR hackfest [1] which took place a few days ago. 
> > > > > Engineers
> > > > > familiar with the AMD, Intel and NVIDIA hardware have participated in 
> > > > > the
> > > > > discussion.
> > > > >
> > > > > This proposal takes a prescriptive approach instead of a descriptive 
> > > > > approach.
> > > > > Drivers describe the available hardware blocks in terms of low-level
> > > > > mathematical operations, then user-space configures each block. We 
> > > > > decided
> > > > > against a descriptive approach where user-space would provide a 
> > > > > high-level
> > > > > description of the colorspace and other parameters: we want to give 
> > > > > more
> > > > > control and flexibility to user-space, e.g. to be able to replicate 
> > > > > exactly the
> > > > > color pipeline with shaders and switch between shaders and KMS 
> > > > > pipelines
> > > > > seamlessly, and to avoid forcing user-space into a particular color 
> > > > > management
> > > > > policy.
> > > >
> > > > Ack on the prescriptive approach, but generic imo. Descriptive pretty 
> > > > much
> > > > means you need the shaders at the same api level for fallback purposes,
> > > > and we're not going to have that ever in kms. That would need something
> > > > like hwc in userspace to work.
> > >
> > > Which would be nice to have but that would be forcing a specific color
> > > pipeline on everyone and we explicitly want to avoid that. There are
> > > just too many trade-offs to consider.
> > >
> > > > And not generic in its ultimate consequence would mean we just do a blob
> > > > for a crtc with all the vendor register stuff like adf (android display
> > > > framework) does, because I really don't see a point in trying a
> > > > generic-looking-but-not vendor uapi with each color op/stage split out.
> > > >
> > > > So from very far and pure gut feeling, this seems like a good middle
> > > > ground in the uapi design space we have here.
> > >
> > > Good to hear!
> > >
> > > > > We've decided against mirroring the existing CRTC properties
> > > > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color 
> > > > > management
> > > > > pipeline can significantly differ between vendors and this approach 
> > > > > cannot
> > > > > accurately abstract all hardware. In particular, the availability, 
> > > > > ordering and
> > > > > capabilities of hardware blocks is different on each display engine. 
> > > > > So, we've
> > > > > decided to go for a highly detailed hardware capability discovery.
> > > > >
> > > > > This new uAPI should not be in conflict with existing standard KMS 
> > > > > properties,
> > > > > since there are none which control the pre-blending color pipeline at 
> > > > > the
> > > > > moment. It does conflict with any vendor-specific properties like
> > > > > NV_INPUT_COLORSPACE or the patches on the mailing list adding 
> > > > > AMD-specific
> > > > > properties. Drivers will need to either reject atomic commits 
> > > > > co

Re: [RFC] Plane color pipeline KMS uAPI

2023-05-05 Thread Daniel Vetter
On Fri, May 05, 2023 at 04:06:26PM +0000, Simon Ser wrote:
> On Friday, May 5th, 2023 at 17:28, Daniel Vetter  wrote:
> 
> > Ok no comments from me on the actual color operations and semantics of all
> > that, because I have simply nothing to bring to that except confusion :-)
> > 
> > Some higher level thoughts instead:
> > 
> > - I really like that we just go with graph nodes here. I think that was
> >   bound to happen sooner or later with kms (we almost got there with
> >   writeback, and with hindsight maybe should have).
> 
> I'd really rather not do graphs here. We only need linked lists as Sebastian
> said. Graphs would significantly add more complexity to this proposal, and
> I don't think that's a good idea unless there is a strong use-case.

You have a graph, because a graph is just nodes + links. I did _not_
propose a full generic graph structure, the link pointer would be in the
class/type specific structure only. Like how we have the plane->crtc or
connector->crtc links already like that (which already _is_ a full blown
graph).

Maybe explain what exactly you're thinking under "do graphs here" so I
understand what you mean differently than me?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC] Plane color pipeline KMS uAPI

2023-05-05 Thread Daniel Vetter
On Fri, May 05, 2023 at 05:57:37PM +0200, Sebastian Wick wrote:
> On Fri, May 5, 2023 at 5:28 PM Daniel Vetter  wrote:
> >
> > On Thu, May 04, 2023 at 03:22:59PM +0000, Simon Ser wrote:
> > > Hi all,
> > >
> > > The goal of this RFC is to expose a generic KMS uAPI to configure the 
> > > color
> > > pipeline before blending, ie. after a pixel is tapped from a plane's
> > > framebuffer and before it's blended with other planes. With this new uAPI 
> > > we
> > > aim to reduce the battery life impact of color management and HDR on 
> > > mobile
> > > devices, to improve performance and to decrease latency by skipping
> > > composition on the 3D engine. This proposal is the result of discussions 
> > > at
> > > the Red Hat HDR hackfest [1] which took place a few days ago. Engineers
> > > familiar with the AMD, Intel and NVIDIA hardware have participated in the
> > > discussion.
> > >
> > > This proposal takes a prescriptive approach instead of a descriptive 
> > > approach.
> > > Drivers describe the available hardware blocks in terms of low-level
> > > mathematical operations, then user-space configures each block. We decided
> > > against a descriptive approach where user-space would provide a high-level
> > > description of the colorspace and other parameters: we want to give more
> > > control and flexibility to user-space, e.g. to be able to replicate 
> > > exactly the
> > > color pipeline with shaders and switch between shaders and KMS pipelines
> > > seamlessly, and to avoid forcing user-space into a particular color 
> > > management
> > > policy.
> >
> > Ack on the prescriptive approach, but generic imo. Descriptive pretty much
> > means you need the shaders at the same api level for fallback purposes,
> > and we're not going to have that ever in kms. That would need something
> > like hwc in userspace to work.
> 
> Which would be nice to have but that would be forcing a specific color
> pipeline on everyone and we explicitly want to avoid that. There are
> just too many trade-offs to consider.
> 
> > And not generic in its ultimate consequence would mean we just do a blob
> > for a crtc with all the vendor register stuff like adf (android display
> > framework) does, because I really don't see a point in trying a
> > generic-looking-but-not vendor uapi with each color op/stage split out.
> >
> > So from very far and pure gut feeling, this seems like a good middle
> > ground in the uapi design space we have here.
> 
> Good to hear!
> 
> > > We've decided against mirroring the existing CRTC properties
> > > DEGAMMA_LUT/CTM/GAMMA_LUT onto KMS planes. Indeed, the color management
> > > pipeline can significantly differ between vendors and this approach cannot
> > > accurately abstract all hardware. In particular, the availability, 
> > > ordering and
> > > capabilities of hardware blocks is different on each display engine. So, 
> > > we've
> > > decided to go for a highly detailed hardware capability discovery.
> > >
> > > This new uAPI should not be in conflict with existing standard KMS 
> > > properties,
> > > since there are none which control the pre-blending color pipeline at the
> > > moment. It does conflict with any vendor-specific properties like
> > > NV_INPUT_COLORSPACE or the patches on the mailing list adding AMD-specific
> > > properties. Drivers will need to either reject atomic commits configuring 
> > > both
> > > uAPIs, or alternatively we could add a DRM client cap which hides the 
> > > vendor
> > > properties and shows the new generic properties when enabled.
> > >
> > > To use this uAPI, first user-space needs to discover hardware 
> > > capabilities via
> > > KMS objects and properties, then user-space can configure the hardware 
> > > via an
> > > atomic commit. This works similarly to the existing KMS uAPI, e.g. planes.
> > >
> > > Our proposal introduces a new "color_pipeline" plane property, and a new 
> > > KMS
> > > object type, "COLOROP" (short for color operation). The "color_pipeline" 
> > > plane
> > > property is an enum, each enum entry represents a color pipeline 
> > > supported by
> > > the hardware. The special zero entry indicates that the pipeline is in
> > > "bypass"/"no-op" mode. For instance, the following plane properties 
> > > describe

Re: [RFC] Plane color pipeline KMS uAPI

2023-05-05 Thread Daniel Vetter
just go with graph nodes here. I think that was
  bound to happen sooner or later with kms (we almost got there with
  writeback, and with hindsight maybe should have).

- Since there's other use-cases for graph nodes (maybe scaler modes, or
  histogram samplers for adaptive backlight, or blending that goes beyond
  the stacked alpha blending we have now) I think we should make this all
  fairly generic:
  * Add a new graph node kms object type.
  * Add a class type so that userspace knows which graph nodes it must
    understand for a feature (like "ColorOp" on planes here), and which it
    can ignore (like perhaps a scaler node to control the interpolation)
  * Probably need to adjust the object property type. Currently that
    accepts any object of a given type (crtc, fb, blob are the major ones).
    I think for these graph nodes we want an explicit enumeration of the
    possible next objects. In kms thus far we've done that with the
    separate possible_* mask properties, but they're cumbersome.
  * It sounds like for now we only have immutable next pointers, so that
    would simplify the first iteration, but should probably anticipate all
    this.

- I think the graph node should be built on top of the driver private
  atomic obj/state stuff, and could then be further subclassed for
  specific types. It's a bit much stacking, but avoids too much wheel
  reinventing, and the worst boilerplate can be avoided with some macros
  that combine the pointer chasing with the container_of upcast. With
  that you can easily build some helpers to walk the graph for a crtc or
  plane or whatever really.

- I guess core atomic code should at least do the graph link validation
  and basic things like that, probably not really more to do. And
  validating the standard properties on some graph nodes ofc.

- I have no idea how we should support the standardization of the state
  structures. Doing a separate subclass for each type sounds extremely
  painful, but unions otoh are ugly. Ideally type-indexed and type safe
  union but C isn't good enough for that. I do think that we should keep
  up the goal that standard properties are decoded into state structures
  in core atomic code, and not in each implementation individually.

- I think the only other precedent for something like this is the media
  control api in the media subsystem. I think it'd be really good to get
  someone like Laurent to ack the graph node infrastructure to make sure
  we're not missing any lesson they've learned already. If there's anything
  else we should pull these folks in too ofc.

For merge plan I dropped some ideas already on Harry's rfc for
vendor-private properties, the only thing to add is that we might want to
type up the consensus plan into a merged doc like
Documentation/gpu/rfc/hdr-plane.rst or whatever you feel like for a name.

Cheers, Daniel
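
To make the "generic base node, links in the subclass, container_of
upcast" idea above concrete, here is a small self-contained userspace
sketch. It is only an illustration under stated assumptions: graph_node,
colorop, to_colorop and colorop_pipeline_length are hypothetical names
invented here, not proposed kernel structures, and the container_of macro
is a userspace stand-in for the kernel helper of the same name.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's container_of(): recover the
 * enclosing struct from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

enum node_class { NODE_CLASS_COLOROP, NODE_CLASS_SCALER };

/* Hypothetical generic base: identity and class only, no links. */
struct graph_node {
    uint32_t id;
    enum node_class cls;
};

enum colorop_type { COLOROP_BYPASS, COLOROP_1D_CURVE, COLOROP_MATRIX };

/* Hypothetical subclass: the "next" link lives here, not in the base. */
struct colorop {
    struct graph_node base;
    enum colorop_type type;
    struct graph_node *next; /* NULL ends the pipeline */
};

/* Checked upcast from the base node to the colorop subclass. */
static struct colorop *to_colorop(struct graph_node *node)
{
    if (!node || node->cls != NODE_CLASS_COLOROP)
        return NULL;
    return container_of(node, struct colorop, base);
}

/*
 * Walk a linear pipeline starting at `first`. Returns the number of color
 * operations, or -1 if a link points at a node of another class or the
 * chain is longer than `max_len` (which also catches cycles).
 */
static int colorop_pipeline_length(struct graph_node *first, int max_len)
{
    int len = 0;

    for (struct graph_node *n = first; n != NULL; ) {
        struct colorop *op = to_colorop(n);

        if (!op || len == max_len)
            return -1;
        len++;
        n = op->next;
    }
    return len;
}
```

The walk stays linear because each subclass carries at most one "next"
link; the base node itself says nothing about topology, which is roughly
the kind of link validation that core code could do once for all drivers.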


> 
> Color operation 42
> ├─ "type": enum {Bypass, 1D curve} = 1D curve
> ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT
> ├─ "lut_size": immutable range = 4096
> ├─ "lut_data": blob
> └─ "next": immutable color operation ID = 43
> 
> To configure this hardware block, user-space can fill a KMS blob with 4096 u32
> entries, then set "lut_data" to the blob ID. Other color operation types might
> have different properties.
> 
> Here is another example with a 3D LUT:
> 
> Color operation 42
> ├─ "type": enum {Bypass, 3D LUT} = 3D LUT
> ├─ "lut_size": immutable range = 33
> ├─ "lut_data": blob
> └─ "next": immutable color operation ID = 43
> 
> And one last example with a matrix:
> 
> Color operation 42
> ├─ "type": enum {Bypass, Matrix} = Matrix
> ├─ "matrix_data": blob
> └─ "next": immutable color operation ID = 43
> 
> [Simon note: having "Bypass" in the "type" enum, and making "type" mutable is
> a bit weird. Maybe we can just add an "active"/"bypass" boolean property on
> blocks which can be bypassed instead.]
> 
> [Jonas note: perhaps a single "data" property for both LUTs and matrices
> would make more sense. And a "size" prop for both 1D and 3D LUTs.]
> 
> If some hardware supports re-ordering operations in the color pipeline, the
> driver can expose multiple pipelines with different operation ordering, and
> user-space can pick the ordering it prefers by selecting the right pipeline.
> The same scheme can be used to expose hardware blocks supporting multiple
> precision levels.
> 
> That's pretty much all there is to it, but as always the devil is in the
> details.
> 
> First, we realized that we need a way to indicate where the scaling operation
> is happeni

Re: Weston 10+ and GLES2 compatibility

2023-03-10 Thread Daniel Stone
Hi Daniel,

On Fri, 10 Mar 2023 at 14:28, Levin, Daniel  wrote:
> We are currently attempting to update from Weston 9.0.0 to Weston 10+ and 
> facing issues with GLES2 compatibility at both build time and run time.
>
> For instance, gl_renderer_setup() exits with error if GL_EXT_unpack_subimage 
> is not present. Other code explicitly includes GLES3/gl3.h and uses pixel 
> formats from GL_EXT_texture_storage.
>
> We are using Mali 400 with proprietary Arm userspace GL drivers, which 
> supports only GLES2 without extensions above.
>
> Could you please clarify whether for Weston 10+ GLES3 is now mandatory 
> dependency? Was this highlighted in any release notes?
>
> If so, then we have to freeze Weston on version 9.0.0.

That did indeed change. We require GLES3 headers at build time (no
requirement for ES3 runtime contexts), and we do require
GL_EXT_unpack_subimage.

It's safe to use newer header sets than your driver supports; you
could just take the headers directly from Khronos, or from Mesa, and
build against those whilst using your other driver at runtime.

GL_EXT_unpack_subimage also has no hardware dependency. It's literally
about two lines to implement in software. If your proprietary driver
can't support this, I strongly recommend switching to the Lima driver
shipped as part of the Linux kernel and Mesa.
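
For a sense of how little is involved, here is a rough sketch of what
honouring the row length, skip-pixels and skip-rows unpack parameters
amounts to in software: a strided copy of a sub-rectangle out of a larger
source image. This is an illustration only, not code from Mesa, Weston or
any vendor driver; the function name unpack_subimage and its parameters
are made up for the example, and it assumes an unpack alignment of 1.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy a width x height sub-rectangle out of a larger source image into a
 * tightly packed destination buffer.
 *
 * row_length:  pixels per source row (0 means "same as width", as in GL)
 * skip_pixels: pixels to skip at the start of each source row
 * skip_rows:   source rows to skip before the first copied row
 */
static void unpack_subimage(uint8_t *dst, const uint8_t *src,
                            size_t row_length, size_t skip_pixels,
                            size_t skip_rows, size_t width, size_t height,
                            size_t bytes_per_pixel)
{
    size_t src_stride = (row_length ? row_length : width) * bytes_per_pixel;
    size_t dst_stride = width * bytes_per_pixel;
    const uint8_t *first = src + skip_rows * src_stride
                               + skip_pixels * bytes_per_pixel;

    /* One memcpy per row: source rows are strided, destination is packed. */
    for (size_t y = 0; y < height; y++)
        memcpy(dst + y * dst_stride, first + y * src_stride, dst_stride);
}
```

A driver would typically apply the same offset and stride arithmetic at
upload time rather than copying into a staging buffer first.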

> Could you please explain is it safe to keep updating all other Wayland 
> components (client, protocols, xwayland), and keep only Weston compositor 
> downgraded to 9.0.0? I tested and see that such combination works properly. 
> Though I am not sure if that is the correct approach, or it might cause 
> issues. And instead we have to downgrade all the Wayland components to the 
> same older version (in our case: client 1.19, protocols 1.21, weston 9.0.0).

It's completely safe to use newer wayland + wayland-protocols + etc
with an older Weston.

Cheers,
Daniel


Weston 10+ and GLES2 compatibility

2023-03-10 Thread Levin, Daniel
Hello,

We are currently attempting to update from Weston 9.0.0 to Weston 10+ and 
facing issues with GLES2 compatibility at both build time and run time.

For instance, gl_renderer_setup() exits with error if GL_EXT_unpack_subimage is 
not present. Other code explicitly includes GLES3/gl3.h and uses pixel formats 
from GL_EXT_texture_storage.

We are using Mali 400 with proprietary Arm userspace GL drivers, which supports 
only GLES2 without extensions above.

Could you please clarify whether for Weston 10+ GLES3 is now mandatory 
dependency? Was this highlighted in any release notes?

If so, then we have to freeze Weston on version 9.0.0.

Could you please explain whether it is safe to keep updating all other
Wayland components (client, protocols, xwayland), and keep only the Weston
compositor downgraded to 9.0.0? I tested and see that such a combination
works properly. However, I am not sure whether that is the correct approach
or whether it might cause issues, in which case we would have to downgrade
all the Wayland components to the same older version (in our case: client
1.19, protocols 1.21, weston 9.0.0).

Thanks,
Daniel


Re: Weston does not start with "Failed to open device: No such file or directory, Try again..."

2023-02-17 Thread Daniel Stone
Hi Martin,

On Fri, 17 Feb 2023 at 11:27, Martin Petzold  wrote:
> Feb 17 12:16:24 tavla DISPLAY Wayland[957]: [12:16:24.624] Loading module '/usr/lib/aarch64-linux-gnu/libweston-9/g2d-renderer.so'
> Feb 17 12:16:25 tavla DISPLAY Wayland[957]: [ 1] Failed to open device: No such file or directory, Try again...
> Feb 17 12:16:26 tavla DISPLAY Wayland[957]: [ 2] Failed to open device: No such file or directory, Try again...
> Feb 17 12:16:27 tavla DISPLAY Wayland[957]: [ 3] Failed to open device: No such file or directory, Try again...
> Feb 17 12:16:28 tavla DISPLAY Wayland[957]: [ 4] Failed to open device: No such file or directory, Try again...
> Feb 17 12:16:28 tavla DISPLAY Wayland[957]: [ 5] _OpenDevice(1249): FATAL: Failed to open device, errno=No such file or directory.

g2d-renderer comes from the NXP fork of Weston, customised to work on
their downstream kernels with their libraries. It's presumably looking
for some kind of G2D device node which it can't see for some reason.

If you're using an upstream kernel then vanilla Weston 9.0.0 (with no
NXP patches) works great there on i.MX devices. If you're using a
downstream kernel/GLES/Weston/etc from NXP, then I'm afraid you need
to contact them for support.

Cheers,
Daniel


Re: [RFC PATCH v3 0/3] Support for Solid Fill Planes

2023-01-11 Thread Daniel Vetter
On Fri, Jan 06, 2023 at 04:33:04PM -0800, Abhinav Kumar wrote:
> Hi Daniel
> 
> Thanks for looking into this series.
> 
> On 1/6/2023 1:49 PM, Dmitry Baryshkov wrote:
> > On Fri, 6 Jan 2023 at 20:41, Daniel Vetter  wrote:
> > > 
> > > On Fri, Jan 06, 2023 at 05:43:23AM +0200, Dmitry Baryshkov wrote:
> > > > > On Fri, 6 Jan 2023 at 02:38, Jessica Zhang  wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > On 1/5/2023 3:33 AM, Daniel Vetter wrote:
> > > > > > On Wed, Jan 04, 2023 at 03:40:33PM -0800, Jessica Zhang wrote:
> > > > > > > > Introduce and add support for a solid_fill property. When the solid_fill
> > > > > > > > property is set, and the framebuffer is set to NULL, memory fetch will be
> > > > > > > > disabled.
> > > > > > > >
> > > > > > > > In addition, loosen the NULL FB checks within the atomic commit callstack
> > > > > > > > to allow a NULL FB when the solid_fill property is set and add FB checks
> > > > > > > > in methods where the FB was previously assumed to be non-NULL.
> > > > > > > >
> > > > > > > > Finally, have the DPU driver use drm_plane_state.solid_fill instead of
> > > > > > > > dpu_plane_state.color_fill, and add extra checks in the DPU atomic commit
> > > > > > > > callstack to account for a NULL FB in cases where solid_fill is set.
> > > > > > > >
> > > > > > > > Some drivers support hardware that have optimizations for solid fill
> > > > > > > > planes. This series aims to expose these capabilities to userspace as
> > > > > > > > some compositors have a solid fill flag (ex. SOLID_COLOR in the Android
> > > > > > > > hardware composer HAL) that can be set by apps like the Android Gears
> > > > > > > > app.
> > > > > > > >
> > > > > > > > Userspace can set the solid_fill property to a blob containing the
> > > > > > > > appropriate version number and solid fill color (in RGB323232 format) and
> > > > > > > > setting the framebuffer to NULL.
> > > > > > > >
> > > > > > > > Note: Currently, there's only one version of the solid_fill blob property.
> > > > > > > > However if other drivers want to support a similar feature, but require
> > > > > > > > more than just the solid fill color, they can extend this feature by
> > > > > > > > creating additional versions of the drm_solid_fill struct.
> > > > > > > >
> > > > > > > > Changes in V2:
> > > > > > > > - Dropped SOLID_FILL_FORMAT property (Simon)
> > > > > > > > - Switched to implementing solid_fill property as a blob (Simon, Dmitry)
> > > > > > > > - Changed to checks for if solid_fill_blob is set (Dmitry)
> > > > > > > > - Abstracted (plane_state && !solid_fill_blob) checks to helper method
> > > > > > > >   (Dmitry)
> > > > > > > > - Removed DPU_PLANE_COLOR_FILL_FLAG
> > > > > > > > - Fixed whitespace and indentation issues (Dmitry)
> > > > > > 
> > > > > > Now that this is a blob, I do wonder again whether it's not cleaner 
> > > > > > to set
> > > > > > the blob as the FB pointer. Or create some kind other kind of 
> > > > > > special data
> > > > > > source objects (because solid fill is by far not the only such 
> > > > > > thing).
> > > > > > 
> > > > > > We'd still end up in special cases like when userspace that doesn't
> > > > > > understand solid fill tries to read out such a framebuffer, but 
> > > > > > these
> > > > > > cases already exist anyway for lack of priviledges.
> > >

Re: [RFC PATCH v3 0/3] Support for Solid Fill Planes

2023-01-06 Thread Daniel Vetter
On Fri, Jan 06, 2023 at 05:43:23AM +0200, Dmitry Baryshkov wrote:
> On Fri, 6 Jan 2023 at 02:38, Jessica Zhang  wrote:
> >
> >
> >
> > On 1/5/2023 3:33 AM, Daniel Vetter wrote:
> > > On Wed, Jan 04, 2023 at 03:40:33PM -0800, Jessica Zhang wrote:
> > >> Introduce and add support for a solid_fill property. When the solid_fill
> > >> property is set, and the framebuffer is set to NULL, memory fetch will be
> > >> disabled.
> > >>
> > >> In addition, loosen the NULL FB checks within the atomic commit callstack
> > >> to allow a NULL FB when the solid_fill property is set and add FB checks
> > >> in methods where the FB was previously assumed to be non-NULL.
> > >>
> > >> Finally, have the DPU driver use drm_plane_state.solid_fill instead of
> > >> dpu_plane_state.color_fill, and add extra checks in the DPU atomic commit
> > >> callstack to account for a NULL FB in cases where solid_fill is set.
> > >>
> > >> Some drivers support hardware that have optimizations for solid fill
> > >> planes. This series aims to expose these capabilities to userspace as
> > >> some compositors have a solid fill flag (ex. SOLID_COLOR in the Android
> > >> hardware composer HAL) that can be set by apps like the Android Gears
> > >> app.
> > >>
> > >> Userspace can set the solid_fill property to a blob containing the
> > >> appropriate version number and solid fill color (in RGB323232 format) and
> > >> setting the framebuffer to NULL.
> > >>
> > >> Note: Currently, there's only one version of the solid_fill blob property.
> > >> However if other drivers want to support a similar feature, but require
> > >> more than just the solid fill color, they can extend this feature by
> > >> creating additional versions of the drm_solid_fill struct.
> > >>
> > >> Changes in V2:
> > >> - Dropped SOLID_FILL_FORMAT property (Simon)
> > >> - Switched to implementing solid_fill property as a blob (Simon, Dmitry)
> > >> - Changed to checks for if solid_fill_blob is set (Dmitry)
> > >> - Abstracted (plane_state && !solid_fill_blob) checks to helper method
> > >>   (Dmitry)
> > >> - Removed DPU_PLANE_COLOR_FILL_FLAG
> > >> - Fixed whitespace and indentation issues (Dmitry)
> > >
> > > Now that this is a blob, I do wonder again whether it's not cleaner to set
> > > the blob as the FB pointer. Or create some other kind of special data
> > > source object (because solid fill is by far not the only such thing).
> > >
> > > We'd still end up in special cases like when userspace that doesn't
> > > understand solid fill tries to read out such a framebuffer, but these
> > > cases already exist anyway for lack of privileges.
> > >
> > > So I still think that feels like the more consistent way to integrate this
> > > feature. Which doesn't mean it has to happen like that, but the
> > > patches/cover letter should at least explain why we don't do it like this.
> >
> > Hi Daniel,
> >
> > IIRC we were facing some issues with this check [1] when trying to set
> > FB to a PROP_BLOB instead. Which is why we went with making it a
> > separate property instead. Will mention this in the cover letter.
> 
> What kind of issues? Could you please describe them?

We switched from bitmask to enum style for prop types, which means it's
not possible to express with the current uapi a property which accepts
either an object or a blob.

Which yeah sucks a bit ...

But!

Blob properties are kms objects (like framebuffers), so it should be
possible to stuff a blob into an object property as-is. Of course you need
to update the validation code to make sure we accept either an fb or a
blob for the internal representation. But that kind of split internally is
required no matter what, I think.
-Daniel
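
To make the "either an fb or a blob" validation concrete, here is a toy
model in plain C. It is only a sketch of the idea: struct mode_object,
object_find() and classify_plane_source() are hypothetical names invented
for this example and are not the kernel's drm_mode_object interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object kinds; a real device has many more. */
enum obj_type { OBJ_TYPE_FB, OBJ_TYPE_BLOB, OBJ_TYPE_OTHER };

struct mode_object {
	uint32_t id;
	enum obj_type type;
};

/* Internal representation of what a plane is sourced from. */
enum plane_source { SOURCE_NONE, SOURCE_FB, SOURCE_BLOB, SOURCE_INVALID };

/* Linear lookup of an object id in a table of known objects. */
const struct mode_object *
object_find(const struct mode_object *objs, size_t count, uint32_t id)
{
	size_t i;

	assert(objs != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (objs[i].id == id)
			return &objs[i];
	}
	return NULL;
}

/*
 * Accept an id that names either a framebuffer or a blob, and record
 * which one it was. Id 0 means "no source"; anything else is rejected.
 */
enum plane_source
classify_plane_source(const struct mode_object *objs, size_t count,
		      uint32_t id)
{
	const struct mode_object *obj;

	if (id == 0)
		return SOURCE_NONE;

	obj = object_find(objs, count, id);
	if (obj == NULL)
		return SOURCE_INVALID;

	switch (obj->type) {
	case OBJ_TYPE_FB:
		return SOURCE_FB;
	case OBJ_TYPE_BLOB:
		return SOURCE_BLOB;
	default:
		return SOURCE_INVALID;
	}
}
```

The point of the split is that userspace passes a single object id, while
the internal state still tracks separately whether it got a framebuffer or a
blob.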

> 
> >
> > [1]
> > https://gitlab.freedesktop.org/drm/msm/-/blob/msm-next/drivers/gpu/drm/drm_property.c#L71
> >
> > Thanks,
> >
> > Jessica Zhang
> >
> > > -Daniel
> > >
> > >>
> > >> Changes in V3:
> > >> - Fixed some logic errors in atomic checks (Dmitry)
> > >> - Introduced drm_plane_has_visible_data() and drm_atomic_check_fb() helper
> > >>   methods (Dmitry)
> > >>
> > >> Jessica Zhang (3):
> > >>drm: Introduce solid fill propert

Re: [RFC PATCH v3 0/3] Support for Solid Fill Planes

2023-01-05 Thread Daniel Vetter
On Wed, Jan 04, 2023 at 03:40:33PM -0800, Jessica Zhang wrote:
> Introduce and add support for a solid_fill property. When the solid_fill
> property is set, and the framebuffer is set to NULL, memory fetch will be
> disabled.
> 
> In addition, loosen the NULL FB checks within the atomic commit callstack
> to allow a NULL FB when the solid_fill property is set and add FB checks
> in methods where the FB was previously assumed to be non-NULL.
> 
> Finally, have the DPU driver use drm_plane_state.solid_fill instead of
> dpu_plane_state.color_fill, and add extra checks in the DPU atomic commit
> callstack to account for a NULL FB in cases where solid_fill is set.
> 
> Some drivers support hardware that have optimizations for solid fill
> planes. This series aims to expose these capabilities to userspace as
> some compositors have a solid fill flag (ex. SOLID_COLOR in the Android
> hardware composer HAL) that can be set by apps like the Android Gears
> app.
> 
> Userspace can set the solid_fill property to a blob containing the
> appropriate version number and solid fill color (in RGB323232 format) and
> setting the framebuffer to NULL.
> 
> Note: Currently, there's only one version of the solid_fill blob property.
> However if other drivers want to support a similar feature, but require
> more than just the solid fill color, they can extend this feature by
> creating additional versions of the drm_solid_fill struct.
> 
> Changes in V2:
> - Dropped SOLID_FILL_FORMAT property (Simon)
> - Switched to implementing solid_fill property as a blob (Simon, Dmitry)
> - Changed to checks for if solid_fill_blob is set (Dmitry)
> - Abstracted (plane_state && !solid_fill_blob) checks to helper method
>   (Dmitry)
> - Removed DPU_PLANE_COLOR_FILL_FLAG
> - Fixed whitespace and indentation issues (Dmitry)

Now that this is a blob, I do wonder again whether it's not cleaner to set
the blob as the FB pointer. Or create some other kind of special data
source object (because solid fill is by far not the only such thing).

We'd still end up in special cases like when userspace that doesn't
understand solid fill tries to read out such a framebuffer, but these
cases already exist anyway for lack of privileges.

So I still think that feels like the more consistent way to integrate this
feature. Which doesn't mean it has to happen like that, but the
patches/cover letter should at least explain why we don't do it like this.
-Daniel

> 
> Changes in V3:
> - Fixed some logic errors in atomic checks (Dmitry)
> - Introduced drm_plane_has_visible_data() and drm_atomic_check_fb() helper
>   methods (Dmitry)
> 
> Jessica Zhang (3):
>   drm: Introduce solid fill property for drm plane
>   drm: Adjust atomic checks for solid fill color
>   drm/msm/dpu: Use color_fill property for DPU planes
> 
>  drivers/gpu/drm/drm_atomic.c              | 136 +-
>  drivers/gpu/drm/drm_atomic_helper.c       |  34 +++---
>  drivers/gpu/drm/drm_atomic_state_helper.c |   9 ++
>  drivers/gpu/drm/drm_atomic_uapi.c         |  59 ++
>  drivers/gpu/drm/drm_blend.c               |  17 +++
>  drivers/gpu/drm/drm_plane.c               |   8 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c  |   9 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c |  65 +++
>  include/drm/drm_atomic_helper.h           |   5 +-
>  include/drm/drm_blend.h                   |   1 +
>  include/drm/drm_plane.h                   |  62 ++
>  11 files changed, 302 insertions(+), 103 deletions(-)
> 
> -- 
> 2.38.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/atomic: add quirks for blind save/restore

2022-11-17 Thread Daniel Vetter
On Thu, Nov 17, 2022 at 07:54:40AM +0000, Simon Ser wrote:
> Two quirks to make blind atomic save/restore [1] work correctly:
> 
> - Mark the DPMS property as immutable for atomic clients, since
>   atomic clients cannot change it.
> - Allow user-space to set content protection to "enabled", interpret
>   it as "desired".
> 
> [1]: https://gitlab.freedesktop.org/wlroots/wlroots/-/merge_requests/3794
> 
> Signed-off-by: Simon Ser 

Reviewed-by: Daniel Vetter 

I think a doc patch which documents the guarantees we're trying to make
here and that they're uapi would be really nice. Maybe somewhere in the
KMS properties section in the docs.
-Daniel
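
For readers unfamiliar with blind save/restore, the sketch below models what
the two quirks buy a client, in plain C without libdrm. It is an assumption
of how such a client might behave, not code from wlroots or the kernel:
struct saved_prop, build_restore_list() and sanitize_content_protection()
are hypothetical names, and the constants are local copies meant to mirror
DRM_MODE_PROP_IMMUTABLE and the DRM_MODE_CONTENT_PROTECTION_* values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the uapi values (assumed, see lead-in). */
#define PROP_IMMUTABLE	(1u << 2)
#define CP_UNDESIRED	0u
#define CP_DESIRED	1u
#define CP_ENABLED	2u

/* One property as captured by a "save everything" pass. */
struct saved_prop {
	const char *name;
	uint32_t flags;
	uint64_t value;
};

/*
 * Client side: copy every saved property that is writable into out[]
 * and return how many were copied. Immutable properties (such as DPMS
 * for atomic clients, after the quirk) are skipped.
 */
size_t
build_restore_list(const struct saved_prop *saved, size_t count,
		   struct saved_prop *out)
{
	size_t i, n = 0;

	assert(saved != NULL && out != NULL);
	for (i = 0; i < count; i++) {
		if (saved[i].flags & PROP_IMMUTABLE)
			continue;
		out[n++] = saved[i];
	}
	return n;
}

/*
 * Kernel-side behaviour after the patch, modelled as a pure function:
 * a restored "enabled" is accepted and treated as "desired".
 */
uint64_t
sanitize_content_protection(uint64_t val)
{
	return val == CP_ENABLED ? CP_DESIRED : val;
}
```

With both quirks in place the client needs no per-property special cases: it
writes back whatever it read, minus the immutable properties.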

> ---
> 
> I don't have the motivation to write IGT tests for this.
> 
>  drivers/gpu/drm/drm_atomic_uapi.c | 5 +++--
>  drivers/gpu/drm/drm_property.c    | 7 +++
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c
> index c06d0639d552..95363aac7f69 100644
> --- a/drivers/gpu/drm/drm_atomic_uapi.c
> +++ b/drivers/gpu/drm/drm_atomic_uapi.c
> @@ -741,8 +741,9 @@ static int drm_atomic_connector_set_property(struct drm_connector *connector,
>  		state->scaling_mode = val;
>  	} else if (property == config->content_protection_property) {
>  		if (val == DRM_MODE_CONTENT_PROTECTION_ENABLED) {
> -			drm_dbg_kms(dev, "only drivers can set CP Enabled\n");
> -			return -EINVAL;
> +			/* Degrade ENABLED to DESIRED so that blind atomic
> +			 * save/restore works as intended. */
> +			val = DRM_MODE_CONTENT_PROTECTION_DESIRED;
>  		}
>  		state->content_protection = val;
>  	} else if (property == config->hdcp_content_type_property) {
> diff --git a/drivers/gpu/drm/drm_property.c b/drivers/gpu/drm/drm_property.c
> index dfec479830e4..dde42986f8cb 100644
> --- a/drivers/gpu/drm/drm_property.c
> +++ b/drivers/gpu/drm/drm_property.c
> @@ -474,7 +474,14 @@ int drm_mode_getproperty_ioctl(struct drm_device *dev,
>  		return -ENOENT;
>  
>  	strscpy_pad(out_resp->name, property->name, DRM_PROP_NAME_LEN);
> +
>  	out_resp->flags = property->flags;
> +	if (file_priv->atomic && property == dev->mode_config.dpms_property) {
> +		/* Quirk: indicate that the legacy DPMS property is not
> +		 * writable from atomic user-space, so that blind atomic
> +		 * save/restore works as intended. */
> +		out_resp->flags |= DRM_MODE_PROP_IMMUTABLE;
> +	}
>  
>  	value_count = property->num_values;
>  	values_ptr = u64_to_user_ptr(out_resp->values_ptr);
> -- 
> 2.38.1
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Meeting (BOF) at Plumbers Dublin to discuss backlight brightness as connector object property RFC?

2022-09-09 Thread Daniel Stone
On Fri, 9 Sept 2022 at 12:50, Simon Ser  wrote:

> On Friday, September 9th, 2022 at 12:23, Hans de Goede <
> hdego...@redhat.com> wrote:
> > "people using
> > non fully integrated desktop environments like e.g. sway often use custom
> > scripts bound to hotkeys to get functionality like the brightness
> > up/down keyboard hotkeys changing the brightness. This typically involves
> > e.g. the xbacklight utility.
> >
> > Even if the xbacklight utility is ported to use kms with the new connector
> > object brightness properties then this still will not work because
> > changing the properties will require drm-master rights and e.g. sway will
> > already hold those."
>
> I don't think this is a good argument. Sway (which I'm a maintainer of)
> can add a command to change the backlight, and then users can bind their
> keybinding to that command. This is not very different from e.g. a
> keybind to switch on/off a monitor.
>
> We can also standardize a protocol to change the backlight across all
> non-fully-integrated desktop environments (would be a simple addition
> to output-power-management [1]), so that a single tool can work for
> multiple compositors.


Yeah, I mean, as one of the main people arguing that non-fully-integrated
desktops are not the design we want, I agree with Simon.

Cheers,

Daniel


Re: Position set/get prototype template

2022-08-08 Thread Daniel Stone
On Mon, 8 Aug 2022 at 17:24, samuel ammonius  wrote:
> I've just looked at the xdg-shell protocol as you said. I was really
> surprised at the amount of features it had, but one part in particular
> caught my eye:

Window geometry is relative to surface co-ordinates. As the first
paragraph describes, it is used to describe the region of the surface
which excludes external decor like drop shadows.
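
As a small worked example of "relative to surface co-ordinates": a client
that draws a drop shadow of a fixed margin around its content could compute
the rectangle it passes to xdg_surface.set_window_geometry like this. The
struct rect type and window_geometry_for() are hypothetical helpers for
illustration, and a uniform shadow margin on all four sides is an
assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Rectangle in surface-local co-ordinates. */
struct rect {
	int32_t x, y, width, height;
};

/*
 * Given the full surface size (content plus shadow) and the shadow
 * margin on each side, return the region that counts as "the window".
 * The origin is the top-left corner of the surface, not of the screen.
 */
struct rect
window_geometry_for(int32_t surface_width, int32_t surface_height,
		    int32_t shadow_margin)
{
	struct rect geometry;

	assert(shadow_margin >= 0);
	assert(surface_width > 2 * shadow_margin);
	assert(surface_height > 2 * shadow_margin);

	geometry.x = shadow_margin;
	geometry.y = shadow_margin;
	geometry.width = surface_width - 2 * shadow_margin;
	geometry.height = surface_height - 2 * shadow_margin;
	return geometry;
}
```

For an 840x640 surface with a 20-pixel shadow, this gives an 800x600 window
at (20, 20) within the surface. Nothing in it says where the surface sits on
the output.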

One thing you might note is that Wayland does not supply a global
co-ordinate space to clients. Everything is supplied in a surface
co-ordinate space. Again, this would become clear if you took the time
to work through an introductory guide to Wayland.

Think of it this way: the developers are giving you their time and
effort by explaining to you how Wayland works, and trying to guide you
through why your proposed designs are unworkable. It would be very
polite if you could repay the favour by investing some of your time
and effort to understand Wayland before proposing drastic changes to
it and demanding the developers justify why they won't be accepted.

Cheers,
Daniel


Re: Position set/get prototype template

2022-08-08 Thread Daniel Stone
Hi Samuel,

On Mon, 8 Aug 2022 at 17:04, samuel ammonius  wrote:
> On Mon, Aug 8, 2022 at 12:06 PM Simon Ser  wrote:
>> If you are not interested in explaining the use-cases, and are just
>> interested in blindly adding new features: sorry, but this isn't how we
>> want to approach things.
>
> Sorry for not giving a more specific answer to this earlier. The main reason 
> is regarding an app's location when it starts up. For example, a large 
> workspace application like VS-Studio or Blender may want to start maximized, 
> but not full screen. AFAIK, Wayland has a protocol for fullscreen but not for 
> maximizing/minimizing an app

Wayland does have a protocol for maximising apps. Without it, we
wouldn't have been able to ship as the default desktop for every
mainline distribution for several years now. I recommend you look into
the xdg-shell family of protocols. Introductory guides to Wayland will
also cover this.

> ... so this can only be done by setting the position to (0, 0) and setting 
> the size to the viewport size of the compositor. Other apps may want to start 
> where they left off. I know that this is something that the compositor should 
> handle by default, but many of them don't. The largest obstacle is that apps 
> like screenshot utilities may want to start in the top left, autoclickers 
> start in the top right, console applications start in the center when opened 
> with ALT+T, and many other small situations like. There are probably many 
> other scenarios like this, and it seems like an unnecessary amount of pain to 
> add a new protocol every time a new thing pops up.
>
> Anyways, I don't see the harm in adding this feature. Who says that a 
> compositor is any smarter than an app? Many compositors don't support 
> remembering a window's location, while many apps do. Apps aren't users, so 
> taking power away from them isn't "making things simpler".

It's not taking any power away from apps. It's giving the entire
ecosystem more power. It takes more work to achieve than just blindly
taking the shortest possible path, but conversely it also saves us
from being painted into the same corners that made X11 no longer
viable.

Cheers,
Daniel


Re: Position set/get prototype template

2022-08-08 Thread Daniel Stone
Igor,

On Mon, 8 Aug 2022 at 15:54, Igor Korot  wrote:
> Or even better - when one of you will need to quickly create a 
> "proof-of-concept" application that will jump all over the places on every 
> startup...

We've written applications for years. We know how to make it work.
We've explained to you how to make it work. It may not work in the
exact way you have in your head, but the real usecases can and do
work, even if you won't listen to patient explanations.

This discussion has more than run its course by now. Unfortunately as
you seem hellbent on continuing to repeat the same thing over and over
without listening to anything else being said, your posts will now be
moderated.

Cheers,
Daniel


Re: Window positions under wayland

2022-08-05 Thread Daniel Stone
On Fri, 5 Aug 2022 at 17:21, Igor Korot  wrote:
> On Thu, Aug 4, 2022 at 9:20 PM Thiago Macieira  wrote:
> > No, they are about position too. 100,100 on a 1920x1080 resolution is about
> > 5% to the right of the left edge and 10% from the top. 100,100 on 3840x2160
> > is 2.5% from the left and 5% from the top, on the same monitor. The user has
> > a right to expect the same finger-width position on the screen, relative to
> > where their eyes are looking at.
>
> No, they are not.
> It should be up to the application to decide the coordinate system
> PPI/DPI/etc.

This is why there is no value in this discussion. You are making
assertions like this as if they are axiomatic.

It would be possible to redesign Wayland following the principles you
have described. No-one is doing this, however. We have carefully
considered the points you have raised over the past 14 years and
reached different conclusions. I'm sorry that Wayland is not the
perfect window system you envisage in your own head, but just
repeating your same beliefs that the design is fundamentally flawed
over and over will not force us to change the design.


Re: Window positions under wayland

2022-08-05 Thread Daniel Stone
On Fri, 5 Aug 2022 at 14:11, samuel ammonius  wrote:
> Please don't close this discussion on account of something someone
> else said. Wouldn't it be better for users, compositors, and apps if there
> was a way to manage window positions, but the compositor could choose
> not to let them? The best apps typically have an option in the preferences
> for windows to remember their positions, and window managers are given
> the option to move/resize windows as they please. Window managers can
> then let users decide whether they want to let windows choose their own
> size/position or stay put (such as in tiling window managers). Why wouldn't
> Wayland benefit from a similar system? (I'm assuming that the reason you
> said "for better or for worse" was just so everyone could stop talking rather
> than because you actually wouldn't mind having a worse system in place)

It's not inherently better or worse, just different.

For things like remembering window positions, there has already been a
specific protocol written to handle that usecase linked in this
thread, which is more flexible and capable than having every client
save their last-seen position and forcibly restore it no matter what.

Cheers,
Daniel


Re: Window positions under wayland

2022-08-05 Thread Daniel Stone
Hi all,

On Fri, 5 Aug 2022 at 10:42, Jean-Michaël Celerier
 wrote:
> [lots of musing about win32 snipped]
>
> It would make me pretty sad to tell these people (both end-users and devs) 
> that Windows is a better operating system for their use case than desktop 
> Linux.

OK, this discussion is closed now.

For better or worse, Wayland will not have a standard protocol to get
and set window positions.

As Carsten and others have explained, specific targeted usecases - and
'I want to control the position of my toplevel window' is not a
usecase, it is a dictated solution to a number of very separate
problems - are addressed with specific new protocol extensions. As
said, Wayland is descriptive rather than prescriptive: the client
gives the compositor as much information as is required to make good
decisions, rather than telling the compositor exactly what to do with
no context.

To the other points raised: X11 still exists and will continue to for
a very long time. A lot of effort has gone into Xwayland to make X11
apps work completely seamlessly. If you do not want to use Wayland,
then no-one is forcing you to. Please feel free to continue using X11
if you feel that it works better for you or your apps. As to the
notion that users can force window managers to do bad things, the
simple answer is that if you don't want bad window management, then
don't use a bad window manager.

If you have specific targeted usecases that you want addressed, please
contribute to the development of those protocols in an issue on
https://gitlab.freedesktop.org/wayland/wayland-protocols/. If you want
to discuss whether Wayland's fundamental design concepts are or aren't
a good idea, or whether Windows is bad, please take that discussion to
Hacker News or Reddit or something.

Cheers,
Daniel


Re: I'm adding features to VKMS! What would you like to see?

2022-07-29 Thread Daniel Stone
Hi Jim,

On Fri, 29 Jul 2022 at 08:30, Jim Shargo  wrote:
> TL;DR: I'm working on extending VKMS and wanted feedback from other
> compositor/wayland devs.

Awesome! :) Glad to see it, and yeah, I second everything Pekka said.
Having hotplug in particular would be really great.

> // Questions
>
>   - What VKMS features could help your testing the most?
>   - How much do you care about writeback buffer support vs CRC checks
> in tests atm?
>   - What kinds of bugs do you get around DRM/KMS?
>   - Any thoughts in general?

One thing I've really wanted is corner-case handling which just can't
be done in generic code. Weston is really aggressive at trying to get
things into planes, but we can only test those on actual systems with
particular semantics.

I'd love to be able to programmatically fake those to be able to check
our fallbacks in an automatic way. About the best idea I've come up
with for that is being able to attach an eBPF hook to atomic_check.
The absolute MVP would be no arguments and an errno return; if you
completely control the environment, then you can store a counter in a
map and return a particular error for the n'th attempt. But a better
one would allow you to inspect the properties on each object in the
atomic state, and also things like framebuffer attributes (dimensions,
format, modifier, etc) so you could take action accordingly.
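
To illustrate just the "absolute MVP" above (no arguments, an errno return,
and a counter deciding which attempt fails), here is a plain userspace C
model. It is neither eBPF nor an existing VKMS interface: struct fault_plan
and fake_atomic_check() are hypothetical names, and the struct stands in for
the counter that would live in a map.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* State that would be kept in a map shared with the test harness. */
struct fault_plan {
	unsigned int attempts;	/* how many checks have run so far */
	unsigned int fail_on;	/* 1-based attempt to fail, 0 = never */
	int error;		/* positive errno to report, e.g. EINVAL */
};

/*
 * Stand-in for a hook called on every atomic check: succeed every time
 * except on the configured n'th attempt, which returns -error.
 */
int
fake_atomic_check(struct fault_plan *plan)
{
	assert(plan != NULL);

	plan->attempts++;
	if (plan->fail_on != 0 && plan->attempts == plan->fail_on)
		return -plan->error;
	return 0;
}
```

A compositor test could then set fail_on to the commit that is expected to
put a view on a plane, and check that the fallback to renderer composition
is taken when that commit is rejected.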

Thanks for taking this on! Really looking forward to it.

Cheers,
Daniel


Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS

2022-06-14 Thread Daniel Stone
Hi,

On Tue, 14 Jun 2022 at 15:40, Zack Rusin  wrote:
> On Tue, 2022-06-14 at 10:36 +0300, Pekka Paalanen wrote:
> > The reason I am saying that you need to fix other issues with
> > virtualized drivers at the same time is because those other issues are
> > the sole and only reason why the driver would ever need hotspot info.
> >
> > Hotspot info must not be necessary for correct KMS operation, yet you
> > seem to insist it is. You assume that cursor plane commandeering is ok
> > to do. But it is not ok without adding the KMS UAPI that would allow
> > it: a way for guest userspace to explicitly say that commandeering will
> > be ok.
> >
> > If and only if guest userspace says commandeering is ok, then you can
> > require hotspot info. On the other hand, you cannot retrofit hotspot
> > info by saying that if a driver exposes hotspot properties then all KMS
> > clients must set them. That would be a kernel regression by definition:
> > KMS clients that used to abide by the KMS UAPI contract are suddenly
> > breaking the contract because the kernel changed the contract.
> >
> > Therefore I very much disagree that virtualized drivers need hotspot
> > info. They do not strictly need hotspot info for correct operation,
> > they need it only for making the user experience more smooth, which is
> > an optional optimization. That optimization may be very important in
> > practice, but it's still an optimization and is generally not expected
> > by KMS clients.
>
> I strongly disagree with that (both the sentiment towards hotspots and the client
> handling of it). I don't think we have to agree on reasoning here at all to make
> progress though so I'm going to let it go (we can always continue on irc or email if
> you'd like to try to conclude this bit but we could all use a few days of break from
> this discussion probably).

Well, it's just coming from two different directions:
* many current KMS clients want the cursor plane to be displayed as
the client-programmed plane properties indicate, and the output can be
nonsensical if they aren't
* VMware optimises the cursor by displaying the cursor plane not as
the client-programmed plane properties indicate, and the output is
sometimes good (faster response!) but sometimes bad (nonsensical
display!)

The client cap sounds good. As a further suggestion, given that
universal planes are supposed to make planes, er, universal, rather
than imbued with magical special behaviour, how about _also_ adding an
'is cursor' plane prop which userspace has to set to 1 to indicate
that the output is a cursor and has the hotspot correctly set and the
'display hardware' is free to make the cursor fly around the screen in
accordance with the input events it sends? That way it's really really
clear what's happening and no-one's getting surprised when 'the right
thing' doesn't happen, not least because it's really clear what 'the
right thing' is.

Cheers,
Daniel


Re: 504 to gitlab.freedesktop.org

2022-06-13 Thread Daniel Stone
Hi,

On Mon, 13 Jun 2022 at 08:39, Daniel Stone  wrote:
> Yes, that's what's happening. Our (multi-host-replicated etc) Ceph
> storage setup has entered a degraded mode due to the loss of a couple
> of disks - no data has been lost but the cluster is currently unhappy.
> We're walking through fixing this, but have bumped into some other
> issues since, including a newly-flaky network setup, and changes since
> we last provisioned a new storage host.
>
> We're working through them one by one and will have the service back
> up with all our data intact - hopefully in a matter of hours but we
> have no firm ETA right now.

Thanks mainly to Ben, everything is back up and running now.

Cheers,
Daniel


Re: 504 to gitlab.freedesktop.org

2022-06-13 Thread Daniel Stone
On Mon, 13 Jun 2022 at 05:17, Peter Hutterer  wrote:
> On Sun, Jun 12, 2022 at 05:57:05PM -0700, Jeremy Sequoia wrote:
> > I was going to spend a little bit of time putting out an update to XQuartz
> > to address a few bugs that I've been meaning to squash, but I'm having a bit
> > of an issue pulling down sources.
> >
> > Fetching via ssh://git@gitlab.freedesktop.org is giving me Permission denied
> > (publickey,keyboard-interactive).  I'm not sure if the latter is an infra
> > issue or if the ssh key I have stored in my gitlab account is out of date
> > (it's been about a year since I touched this).  Unfortunately, I can't seem
> > to access https://gitlab.freedesktop.org to check as it's constantly
> > presenting me a 504 Gateway timeout.
> >
> > Is anyone else able to pull sources via ssh://git@gitlab.freedesktop.org
> > right now?  Is someone looking into the 504 issue?
>
> not an fdo admin but judging by the chatter on #freedesktop: no and yes, in
> that order. seems like the infrastructure is in various stages of depositing
> fecal matter on itself and the fixes are involved enough that the admins have
> to be mentally awake, not merely physically.

Yes, that's what's happening. Our (multi-host-replicated etc) Ceph
storage setup has entered a degraded mode due to the loss of a couple
of disks - no data has been lost but the cluster is currently unhappy.
We're walking through fixing this, but have bumped into some other
issues since, including a newly-flaky network setup, and changes since
we last provisioned a new storage host.

We're working through them one by one and will have the service back
up with all our data intact - hopefully in a matter of hours but we
have no firm ETA right now.

Cheers,
Daniel


Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS

2022-06-10 Thread Daniel Vetter
On Fri, Jun 10, 2022 at 09:15:35AM +, Simon Ser wrote:
> On Friday, June 10th, 2022 at 10:41, Daniel Vetter  wrote:
> 
> > Anything I've missed? Or got completely wrong?
> 
> This plan looks good to me.
> 
> As Pekka mentioned, I'd also like to have a conversation of how far we want
> to push virtualized driver features. I think KMS support is a good feature
> to have to spin up a VM and have all of the basics working. However I don't
> think it's a good idea to try to plumb an ever-growing list of fancy features
> (seamless integration of guest windows into the host, HiDPI, multi-monitor,
> etc) into KMS. You'd just end up re-inventing Wayland or RDP on top of KMS.
> Instead of re-inventing these, just use RDP or waypipe or X11 forwarding
> directly.
> 
> So I think we need to draw a line somewhere, and decide e.g. that virtualized
> cursors are fine to add in KMS, but HiDPI is not.

It's getting a bit far off-topic, but google cros team has an out-of-tree
(at least I think it's not merged yet) wayland-virtio driver for exactly
this use-case. Trying to move towards something like that for fancy
virtualized setups sounds like the better approach indeed, with kms just
as the bare-bones fallback option.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS

2022-06-10 Thread Daniel Vetter
On Fri, Jun 10, 2022 at 08:54:03AM +, Simon Ser wrote:
> I agree with what others have replied, just adding a few more details.
> 
> On Thursday, June 9th, 2022 at 21:39, Zack Rusin  wrote:
> 
> > virtualized drivers send drm_kms_helper_hotplug_event which sends a
> > HOTPLUG=1 event with a changed preferred width/height
> 
> (Note: and the "hotplug_mode_update" property is set to 1.)
> 
> > suggested_x and suggested_y properties
> 
> These come with their own set of issues. They are poorly defined, but it seems
> like they describe a position in physical pixel coordinates. Compositors don't
> use physical pixel coordinates to organize their outputs, instead they use
> logical coordinates. For instance, a HiDPI 4k screen with a scale of 2 will
> take up 1920x1080 logical pixels. There is no way to convert physical pixel
> coordinates to logical pixel coordinates in the general case, because there's
> no "global scale factor". So suggested_x/y are incompatible with the way
> compositors work.

I dropped a request for a proper doc section that explains all the
virtualized kms driver stuff. I think we should also put in a
"limitations" part there and just spec that any kind of scaling is a no-go
on these (and that drivers better validate this is the case).
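
To make the scaling problem concrete, here's a small sketch of why a
physical-pixel position has no unique logical equivalent once outputs carry
different scales. It's plain Python, not any compositor's or the kernel's API;
the Output class, its fields and the two-output layout are made up for
illustration.

```python
from dataclasses import dataclass


@dataclass
class Output:
    name: str
    width_px: int   # physical mode width
    height_px: int  # physical mode height
    scale: int      # per-output scale picked by the compositor

    def logical_size(self):
        # Size of the output in the compositor's logical coordinate space.
        return (self.width_px // self.scale, self.height_px // self.scale)


hidpi = Output("hidpi-4k", 3840, 2160, scale=2)
plain = Output("plain-1080p", 1920, 1080, scale=1)

# A "suggested" physical x for an output placed right of the HiDPI one.
suggested_x_physical = hidpi.width_px

# Converting it to logical pixels needs a scale, and the outputs disagree.
candidates = {o.name: suggested_x_physical // o.scale for o in (hidpi, plain)}
print(hidpi.logical_size())
print(candidates)
```

The two candidate answers differ by a factor of two, which is the "no global
scale factor" issue in a nutshell.
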
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS

2022-06-10 Thread Daniel Vetter
On Fri, Jun 10, 2022 at 10:41:05AM +0200, Daniel Vetter wrote:
> Hi all,
> 
> Kinda top post because the thread is sprawling and I think we need a
> summary/restart. I think there's at least 3 issues here:
> 
> - lack of hotspot property support, which means compositors can't really
>   support hotspot with atomic. Which isn't entirely true, because you
>   totally can use atomic for the primary planes/crtcs and the legacy
>   cursor ioctls, but I understand that people might find that a bit silly :-)
> 
>   Anyway this problem is solved by the patch set here, and I think results
>   in some nice cleanups to boot.
> 
> - the fact that cursors for virtual drivers are not planes, but really
>   special things. Which just breaks the universal plane kms uapi. That
>   part isn't solved, and I do agree with Simon and Pekka that we really
>   should solve this before we unleash even more compositors onto the
>   atomic paths of virtual drivers.
> 
>   I think the simplest solution for this is:
>   1. add a new DRM_PLANE_TYPE_VIRTUAL_CURSOR, and set that for these
>   special cursor planes on all virtual drivers
>   2. add the new "I understand virtual cursor planes" setparam, filter
>   virtual cursor planes for userspace which doesn't set this (like we do
>   right now if userspace doesn't set the universal plane mode)
>   3. backport the above patches to all stable kernels
>   4. make sure the hotspot property is only set on VIRTUAL_CURSOR planes
>   and nothing else in the rebased patch series

Simon also mentioned on irc that these special planes must not expose the
CRTC_X/Y property, since that doesn't really do much at all. Or is our
understanding of how this all works for commandeered cursors wrong?

> - third issue: These special virtual display properties aren't documented.
>   Aside from hotspot there's also suggested X/Y and maybe other stuff. I
>   have no idea what suggested X/Y does and what userspace should do with
>   it. I think we need a new section for virtualized drivers which:
>   - documents all the properties involved
>   - documents the new cap for enabling virtual cursor planes
>   - documents some of the key flows that compositors should implement for
>     best experience
>   - documents how exactly the user experience will degrade if compositors
>     pretend it's just a normal kms driver (maybe put that into each of the
>     special flows that a compositor ideally supports)
>   - whatever other comments and gaps I've missed, I'm sure
>     Simon/Pekka/others will chime in once the patch exists.

Great bonus would be an igt which demonstrates these flows. With the
interactive debug breakpoints to wait for resizing or whatever this should
be all neatly possible.
-Daniel

> 
> There's a bit of fixing oopsies (virtualized drivers really shouldn't have
> enabled universal planes for their cursors) and debt (all these properties
> predate the push to document stuff so we need to fix that), but I don't
> think it's too much. And I think, from reading the threads, that this
> should cover everything?
> 
> Anything I've missed? Or got completely wrong?
> 
> Cheers, Daniel
> 
> On Fri, Jun 03, 2022 at 02:14:59PM +, Simon Ser wrote:
> > Hi,
> > 
> > Please, read this thread:
> > https://lists.freedesktop.org/archives/dri-devel/2020-March/thread.html#259615
> > 
> > It has a lot of information about the pitfalls of cursor hotspot and
> > other things done by VM software.
> > 
> > In particular: since the driver will ignore the KMS cursor plane
> > position set by user-space, I don't think it's okay to just expose
> > without opt-in from user-space (e.g. with a DRM_CLIENT_CAP).
> > 
> > cc wayland-devel and Pekka for user-space feedback.
> > 
> > On Thursday, June 2nd, 2022 at 17:42, Zack Rusin  wrote:
> > 
> > > - all userspace code needs to hardcode a list of drivers which require
> > > hotspots because there's no way to query from drm "does this driver
> > > require hotspot"
> > 
> > Can you elaborate? I'm not sure I understand what you mean here.
> > 
> > Thanks,
> > 
> > Simon
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 0/6] drm: Add mouse cursor hotspot support to atomic KMS

2022-06-10 Thread Daniel Vetter
Hi all,

Kinda top post because the thread is sprawling and I think we need a
summary/restart. I think there's at least 3 issues here:

- lack of hotspot property support, which means compositors can't really
  support hotspot with atomic. Which isn't entirely true, because you
  totally can use atomic for the primary planes/crtcs and the legacy
  cursor ioctls, but I understand that people might find that a bit silly :-)

  Anyway this problem is solved by the patch set here, and I think results
  in some nice cleanups to boot.

- the fact that cursors for virtual drivers are not planes, but really
  special things. Which just breaks the universal plane kms uapi. That
  part isn't solved, and I do agree with Simon and Pekka that we really
  should solve this before we unleash even more compositors onto the
  atomic paths of virtual drivers.

  I think the simplest solution for this is:
  1. add a new DRM_PLANE_TYPE_VIRTUAL_CURSOR, and set that for these
  special cursor planes on all virtual drivers
  2. add the new "I understand virtual cursor planes" setparam, filter
  virtual cursor planes for userspace which doesn't set this (like we do
  right now if userspace doesn't set the universal plane mode)
  3. backport the above patches to all stable kernels
  4. make sure the hotspot property is only set on VIRTUAL_CURSOR planes
  and nothing else in the rebased patch series

- third issue: These special virtual display properties aren't documented.
  Aside from hotspot there's also suggested X/Y and maybe other stuff. I
  have no idea what suggested X/Y does and what userspace should do with
  it. I think we need a new section for virtualized drivers which:
  - documents all the properties involved
  - documents the new cap for enabling virtual cursor planes
  - documents some of the key flows that compositors should implement for
    best experience
  - documents how exactly the user experience will degrade if compositors
    pretend it's just a normal kms driver (maybe put that into each of the
    special flows that a compositor ideally supports)
  - whatever other comments and gaps I've missed, I'm sure
    Simon/Pekka/others will chime in once the patch exists.

There's a bit of fixing oopsies (virtualized drivers really shouldn't have
enabled universal planes for their cursors) and debt (all these properties
predate the push to document stuff so we need to fix that), but I don't
think it's too much. And I think, from reading the threads, that this
should cover everything?
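
For illustration, the plane filtering from steps 1 and 2 above could behave
roughly like this. It's a toy model in Python, not kernel code: the Plane and
Client classes and the virtual_cursor_cap flag are stand-ins I made up, only
DRM_PLANE_TYPE_VIRTUAL_CURSOR is the name proposed above, and I'm assuming the
new opt-in only matters for clients that also ask for universal planes.

```python
from dataclasses import dataclass
from enum import Enum, auto


class PlaneType(Enum):
    PRIMARY = auto()
    OVERLAY = auto()
    CURSOR = auto()
    VIRTUAL_CURSOR = auto()  # stands in for the proposed DRM_PLANE_TYPE_VIRTUAL_CURSOR


@dataclass(frozen=True)
class Plane:
    plane_id: int
    type: PlaneType


@dataclass
class Client:
    universal_planes: bool = False    # models the existing universal planes opt-in
    virtual_cursor_cap: bool = False  # hypothetical new opt-in


def visible_planes(planes, client):
    """Return the planes a client would be allowed to enumerate."""
    visible = []
    for plane in planes:
        # New rule: virtual cursor planes are hidden unless explicitly requested.
        if plane.type is PlaneType.VIRTUAL_CURSOR and not client.virtual_cursor_cap:
            continue
        # Existing rule: without universal planes only overlays are listed.
        if plane.type is not PlaneType.OVERLAY and not client.universal_planes:
            continue
        visible.append(plane)
    return visible


planes = [
    Plane(1, PlaneType.PRIMARY),
    Plane(2, PlaneType.OVERLAY),
    Plane(3, PlaneType.VIRTUAL_CURSOR),
]
legacy = Client()
atomic_unaware = Client(universal_planes=True)
atomic_aware = Client(universal_planes=True, virtual_cursor_cap=True)
print([p.plane_id for p in visible_planes(planes, atomic_unaware)])
```

Compositors that never set the new opt-in simply don't see the special plane,
so they can't be surprised by its position being ignored.
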

Anything I've missed? Or got completely wrong?

Cheers, Daniel

On Fri, Jun 03, 2022 at 02:14:59PM +, Simon Ser wrote:
> Hi,
> 
> Please, read this thread:
> https://lists.freedesktop.org/archives/dri-devel/2020-March/thread.html#259615
> 
> It has a lot of information about the pitfalls of cursor hotspot and
> other things done by VM software.
> 
> In particular: since the driver will ignore the KMS cursor plane
> position set by user-space, I don't think it's okay to just expose
> without opt-in from user-space (e.g. with a DRM_CLIENT_CAP).
> 
> cc wayland-devel and Pekka for user-space feedback.
> 
> On Thursday, June 2nd, 2022 at 17:42, Zack Rusin  wrote:
> 
> > - all userspace code needs to hardcode a list of drivers which require
> > hotspots because there's no way to query from drm "does this driver
> > require hotspot"
> 
> Can you elaborate? I'm not sure I understand what you mean here.
> 
> Thanks,
> 
> Simon

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: running scanner in wayland-main

2022-05-16 Thread Daniel Stone
Hi Egy,

On Mon, 16 May 2022 at 15:09, Egy Ketto  wrote:
> I'm digging the source code of wayland and weston. I'm stuck at getting the 
> code generated by wayland-scanner. I was following the build instructions for 
> wayland:
>
> $ git clone https://gitlab.freedesktop.org/wayland/wayland.git
> $ cd wayland
> $ meson build/ --prefix=PREFIX //(which I set to wayland/build)
> $ ninja -C build/ install
>
> but I get the error:
>
> meson.build:83:1: ERROR: lexer
>
> Can someone please help me with this?

Which version of Meson are you using? I don't understand how the root
meson.build could cause Meson to not be able to parse.

Cheers,
Daniel


Re: [RFC] drm/kms: control display brightness through drm_connector properties

2022-04-27 Thread Daniel Vetter
On Wed, Apr 27, 2022 at 05:23:22PM +0300, Jani Nikula wrote:
> On Wed, 27 Apr 2022, Daniel Vetter  wrote:
> > On Thu, Apr 14, 2022 at 01:24:30PM +0300, Jani Nikula wrote:
> >> On Mon, 11 Apr 2022, Alex Deucher  wrote:
> >> > On Mon, Apr 11, 2022 at 6:18 AM Hans de Goede  
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> On 4/8/22 17:11, Alex Deucher wrote:
> >> >> > On Fri, Apr 8, 2022 at 10:56 AM Hans de Goede  
> >> >> > wrote:
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> On 4/8/22 16:08, Alex Deucher wrote:
> >> >> >>> On Fri, Apr 8, 2022 at 4:07 AM Daniel Vetter  
> >> >> >>> wrote:
> >> >> >>>>
> >> >> >>>> On Thu, Apr 07, 2022 at 05:05:52PM -0400, Alex Deucher wrote:
> >> >> >>>>> On Thu, Apr 7, 2022 at 1:43 PM Hans de Goede 
> >> >> >>>>>  wrote:
> >> >> >>>>>>
> >> >> >>>>>> Hi Simon,
> >> >> >>>>>>
> >> >> >>>>>> On 4/7/22 18:51, Simon Ser wrote:
> >> >> >>>>>>> Very nice plan! Big +1 for the overall approach.
> >> >> >>>>>>
> >> >> >>>>>> Thanks.
> >> >> >>>>>>
> >> >> >>>>>>> On Thursday, April 7th, 2022 at 17:38, Hans de Goede 
> >> >> >>>>>>>  wrote:
> >> >> >>>>>>>
> >> >> >>>>>>>> The drm_connector brightness properties
> >> >> >>>>>>>> ===
> >> >> >>>>>>>>
> >> >> >>>>>>>> bl_brightness: rw 0-int32_max property controlling the 
> >> >> >>>>>>>> brightness setting
> >> >> >>>>>>>> of the connected display. The actual maximum of this will be 
> >> >> >>>>>>>> less then
> >> >> >>>>>>>> int32_max and is given in bl_brightness_max.
> >> >> >>>>>>>
> >> >> >>>>>>> Do we need to split this up into two props for sw/hw state? The 
> >> >> >>>>>>> privacy screen
> >> >> >>>>>>> stuff needed this, but you're pretty familiar with that. :)
> >> >> >>>>>>
> >> >> >>>>>> Luckily that won't be necessary, since the privacy-screen is a 
> >> >> >>>>>> security
> >> >> >>>>>> feature the firmware/embedded-controller may refuse our requests
> >> >> >>>>>> (may temporarily lock-out changes) and/or may make changes 
> >> >> >>>>>> without
> >> >> >>>>>> us requesting them itself. Neither is really the case with the
> >> >> >>>>>> brightness setting of displays.
> >> >> >>>>>>
> >> >> >>>>>>>> bl_brightness_max: ro 0-int32_max property giving the actual 
> >> >> >>>>>>>> maximum
> >> >> >>>>>>>> of the display's brightness setting. This will report 0 when 
> >> >> >>>>>>>> brightness
> >> >> >>>>>>>> control is not available (yet).
> >> >> >>>>>>>
> >> >> >>>>>>> I don't think we actually need that one. Integer KMS props all 
> >> >> >>>>>>> have a
> >> >> >>>>>>> range which can be fetched via drmModeGetProperty. The max can 
> >> >> >>>>>>> be
> >> >> >>>>>>> exposed via this range. Example with the existing alpha prop:
> >> >> >>>>>>>
> >> >> >>>>>>> "alpha": range [0, UINT16_MAX] = 65535
> >> >> >>>>>>
> >> >> >>>>>> Right, I already knew that, which is why I explicitly added a 
> >> >> >>>>>> range

Re: [RFC] drm/kms: control display brightness through drm_connector properties

2022-04-27 Thread Daniel Vetter
On Thu, Apr 14, 2022 at 01:24:30PM +0300, Jani Nikula wrote:
> On Mon, 11 Apr 2022, Alex Deucher  wrote:
> > On Mon, Apr 11, 2022 at 6:18 AM Hans de Goede  wrote:
> >>
> >> Hi,
> >>
> >> On 4/8/22 17:11, Alex Deucher wrote:
> >> > On Fri, Apr 8, 2022 at 10:56 AM Hans de Goede  
> >> > wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> On 4/8/22 16:08, Alex Deucher wrote:
> >> >>> On Fri, Apr 8, 2022 at 4:07 AM Daniel Vetter  wrote:
> >> >>>>
> >> >>>> On Thu, Apr 07, 2022 at 05:05:52PM -0400, Alex Deucher wrote:
> >> >>>>> On Thu, Apr 7, 2022 at 1:43 PM Hans de Goede  
> >> >>>>> wrote:
> >> >>>>>>
> >> >>>>>> Hi Simon,
> >> >>>>>>
> >> >>>>>> On 4/7/22 18:51, Simon Ser wrote:
> >> >>>>>>> Very nice plan! Big +1 for the overall approach.
> >> >>>>>>
> >> >>>>>> Thanks.
> >> >>>>>>
> >> >>>>>>> On Thursday, April 7th, 2022 at 17:38, Hans de Goede 
> >> >>>>>>>  wrote:
> >> >>>>>>>
> >> >>>>>>>> The drm_connector brightness properties
> >> >>>>>>>> ===
> >> >>>>>>>>
> >> >>>>>>>> bl_brightness: rw 0-int32_max property controlling the brightness 
> >> >>>>>>>> setting
> >> >>>>>>>> of the connected display. The actual maximum of this will be less 
> >> >>>>>>>> then
> >> >>>>>>>> int32_max and is given in bl_brightness_max.
> >> >>>>>>>
> >> >>>>>>> Do we need to split this up into two props for sw/hw state? The 
> >> >>>>>>> privacy screen
> >> >>>>>>> stuff needed this, but you're pretty familiar with that. :)
> >> >>>>>>
> >> >>>>>> Luckily that won't be necessary, since the privacy-screen is a 
> >> >>>>>> security
> >> >>>>>> feature the firmware/embedded-controller may refuse our requests
> >> >>>>>> (may temporarily lock-out changes) and/or may make changes without
> >> >>>>>> us requesting them itself. Neither is really the case with the
> >> >>>>>> brightness setting of displays.
> >> >>>>>>
> >> >>>>>>>> bl_brightness_max: ro 0-int32_max property giving the actual 
> >> >>>>>>>> maximum
> >> >>>>>>>> of the display's brightness setting. This will report 0 when 
> >> >>>>>>>> brightness
> >> >>>>>>>> control is not available (yet).
> >> >>>>>>>
> >> >>>>>>> I don't think we actually need that one. Integer KMS props all 
> >> >>>>>>> have a
> >> >>>>>>> range which can be fetched via drmModeGetProperty. The max can be
> >> >>>>>>> exposed via this range. Example with the existing alpha prop:
> >> >>>>>>>
> >> >>>>>>> "alpha": range [0, UINT16_MAX] = 65535
> >> >>>>>>
> >> >>>>>> Right, I already knew that, which is why I explicitly added a range
> >> >>>>>> to the props already. The problem is that the range must be set
> >> >>>>>> before registering the connector and when the backlight driver
> >> >>>>>> only shows up (much) later during boot then we don't know the
> >> >>>>>> range when registering the connector. I guess we could "patch-up"
> >> >>>>>> the range later. But AFAIK that would be a bit of abuse of the
> >> >>>>>> property API as the range is intended to never change, not
> >> >>>>>> even after hotplug uevents. At least atm there is no infra
> >> >>>>>> in the kernel to change the range later.
> >> >>>>>>

Re: [RFC] drm/kms: control display brightness through drm_connector properties

2022-04-13 Thread Daniel Vetter
On Fri, Apr 08, 2022 at 12:26:24PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 4/8/22 12:16, Simon Ser wrote:
> > Would it be an option to only support the KMS prop for Good devices,
> > and continue using the suboptimal existing sysfs API for Bad devices?
> > 
> > (I'm just throwing ideas around to see what sticks, feel free to ignore.)
> 
> Currently suid-root or pkexec helpers are used to deal with the
> /sys/class/backlight requires root rights issue. I really want to
> be able to disable these helpers at build time in e.g. GNOME once
> the new properties are supported in GNOME.  So that distros with
> a new enough kernel can reduce their attack surface this way.

Yeah but otoh perpetuating a bad interface forever isn't a great idea
either. I think the pragmatic plan here is
- Implement this properly on good devices, i.e. anything new.
- Do some runtime disabling in the pkexec helpers if they detect a modern
  system (we should be able to put a proper symlink into the drm sysfs
  connector directories, to make this easy to detect). It's not as great
  as doing this at compile time, but it's a step.
- Figure out a solution for the old crap. We can't really change anything
  with the load ordering for existing systems, so if the hacked-up compat
  libbacklight-backlight isn't an option then I guess we need some quirk
  list/extracted code which makes i915/nouveau/radeon driver load fail
  until the right backlight shows up. And that needs to be behind a
  Kconfig to avoid breaking existing systems.

Inflicting hotplug complications on all userspace (including uevent
handling for this hotpluggable backlight and everything) just because
fixing the old crap systems is work is imo really not a good idea. Much
better if we get to the correct future step-by-step.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC] drm/kms: control display brightness through drm_connector properties

2022-04-08 Thread Daniel Vetter
oblem is that we really don't know if 0 is off
> > or min-brightness. In the given example where we actually never go
> > down to a duty-cycle of 0% because the video BIOS tables tell us
> > not to, we can be certain that setting the brightness prop to 0
> > will not turn off the backlight, since we then set the duty-cycle
> > to the VBT provided minimum. Note the intent here is to only set
> > the boolean to true if the VBT provided minimum is _not_ 0, 0
> > just means the vendor did not bother to provide a minimum.
> >
> > Currently e.g. GNOME never goes lower than something like 5%
> > of brightness_max to avoid accidentally turning the screen off.
> >
> > Turning the screen off is quite bad to do on e.g. tablets where
> > the GUI is the only way to undo the brightness change and now
> > the user can no longer see the GUI.
> >
> > The idea behind this boolean is to give e.g. GNOME a way to
> > know that it is safe to go down to 0% and for it to use
> > the entire range.
> 
> Why not just make it policy that 0 is defined as minimum brightness,
> not off, and have all drivers conform to that?

Because the backlight subsystem isn't as consistent on this, and it's been
an epic source of confusion since forever.

What's worse, there's both userspace out there which assumes brightness =
0 is a really fast dpms off _and_ userspace that assumes that brightness =
0 is the lowest setting. Of course on different sets of machines.
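
For reference, the userspace workaround described above (never going below
roughly 5% of the maximum unless 0 is known to be safe) boils down to something
like the following sketch. The function name, the 5% default and the
zero_is_safe flag are illustrative assumptions, not GNOME's actual code or an
existing property.

```python
def slider_to_brightness(fraction, brightness_max, zero_is_safe=False, floor=0.05):
    """Map a 0.0-1.0 slider position to a raw brightness value.

    Unless the caller knows that 0 still leaves the panel lit, the bottom
    of the slider is pinned to a small fraction of the maximum.
    """
    fraction = min(max(fraction, 0.0), 1.0)
    lowest = 0 if zero_is_safe else round(brightness_max * floor)
    return lowest + round(fraction * (brightness_max - lowest))


print(slider_to_brightness(0.0, 1000))                     # 50
print(slider_to_brightness(0.0, 1000, zero_is_safe=True))  # 0
```

Which of the two behaviours is right depends entirely on what 0 means on the
machine in question, and that is the information userspace doesn't have today.
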

So yeah we're screwed. I have no idea how to get out of this.
-Daniel

> 
> Alex
> 
> >
> > > For instance if we can guarantee that the min level won't turn the screen
> > > completely off we could make the range start from 1 instead of 0.
> > > Or allow -1 to mean "minimum value, maybe completely off".
> >
> > Right, the problem is we really don't know and when the range is
> > e.g. 0-65535 then something like 1 will almost always still just
> > turn the screen completely off. There will be a value of say like
> > 150 or some such which is then the actual minimum value to still
> > get the backlight to light up at all. The problem is we have
> > no clue what the actual minimum is. And if the PWM output does
> > not directly drive the LEDs but is used as an input for some
> > LED backlight driver chip, that chip itself may have a lookup
> > table (which may also take care of doing perceived brightness
> > mapping) and may guarantee a minimum backlight even when given
> > a 0% duty cycle PWM signal...
> >
> > This prop is sort of orthogonal to the generic change to
> > drm_connector props, so we could also do this later as a follow up
> > change. At a minimum when I code this up this should be in its
> > own commit(s) I believe.
> >
> > But I do think having this will be useful for the above
> > GNOME example.
> >
> > >> bl_brightness_control_method: ro, enum, possible values:
> > >> none: The GPU driver expects brightness control to be provided by another
> > >> driver and that driver has not loaded yet.
> > >> unknown: The underlying control mechanism is unknown.
> > >> pwm: The brightness property directly controls the duty-cycle of a PWM
> > >> output.
> > >> firmware: The brightness is controlled through firmware calls.
> > >> DDC/CI: The brightness is controlled through the DDC/CI protocol.
> > >> gmux: The brightness is controlled by the GMUX.
> > >> Note this enum may be extended in the future, so other values may
> > >> be read, these should be treated as "unknown".
> > >>
> > >> When brightness control becomes available after being reported
> > >> as not available before (bl_brightness_control_method=="none")
> > >> a uevent with CONNECTOR= and
> > >>
> > >> PROPERTY= will be generated
> > >>
> > >> at this point all the properties must be re-read.
> > >>
> > >> When/once brightness control is available then all the read-only
> > >> properties are fixed and will never change.
> > >>
> > >> Besides the "none" value for no driver having loaded yet,
> > >> the different bl_brightness_control_method values are intended for
> > >> (userspace) heuristics for such things as the brightness setting
> > >> linearly controlling electrical power or setting perceived brightness.
> > >
> > > Can you elaborate? I don't know enough about brightness control to
> > > understand all of the implications here.
> >
> > So after send

Re: weston crashing when no HDMI connection

2021-12-10 Thread Daniel Stone
Hi Rusty,

On Fri, 10 Dec 2021 at 20:38, Rusty Howell  wrote:
> We are working on an embedded linux device built using Yocto (dunfell).   We 
> have weston running and we are seeing our QT5 application running as well.  
> One problem we are having is that our application is crashing weston if the 
> application is started when there is no active HDMI connection.  We see the 
> error message "The Wayland connection broke. Did the Wayland compositor 
> die?". We also see weston restart.
>
> I have a smaller QT demo app with just a few controls, and that seems to work 
> just fine. Launching it while there is no HDMI connection does not seem to 
> affect weston at all. Is this a known issue with weston?  Has anyone seen 
> similar issues with Qt5 and weston?

This definitely isn't a known issue.

However, if you are using a vendor BSP (e.g. NXP), they may have
substantially forked and changed Weston. In most cases, these bugs are
introduced by the vendor changes. If this is the case, please seek
support from your vendor, as unfortunately we can't support those
changes.

Cheers,
Daniel


Re: Where is eglGetConfigs getting its configs from?

2021-11-05 Thread Daniel Stone
Hi Chris,

On Fri, 5 Nov 2021 at 12:01, chris.lapla...@agilent.com
 wrote:
> Can anyone please explain how eglGetConfigs actually works? i.e. what 
> information is it consulting in order to determine the configs to return? 
> Unfortunately we are using a processor (Xilinx Zynq UltraScale) with a GPU 
> (Mali-400 MP2) whose EGL implementation is closed-source. So I cannot look at 
> the source code for eglGetConfigs.

'It depends' is the short and disappointing answer ...

> My expectation was that eglGetConfigs would simply enumerate DRM framebuffers 
> in the system. However, the only framebuffer in our system (a Xilinx 
> Framebuffer Read IP core in the FPGA) is using XRGB colors and the list 
> that Weston reports makes no sense. See below for /var/log/weston.log 
> contents. Some component of the system seems to be finding RGB565 and 
> ARGB framebuffers somewhere.

EGLConfigs are only semi-related to this. Ultimately they do have to
render to a framebuffer for display, however they can be used for
intermediate non-display rendering as well (e.g. FBOs, depth/stencil
buffers). So, the list of EGLConfigs you get is essentially a list of
configurations the GPU is able to render to. A subset of those will be
suitable for display. The mechanism used for selecting a
display-compatible config is to look at the EGL_NATIVE_VISUAL_ID and
match this to a DRM format ...

> [01:36:24.725] Bad/unknown DRM format code 0x.

This is already quite suspicious, since drm_output_init_egl()
shouldn't be passing anything with 0 in the list?

> [01:36:24.725] No EGLConfig matches { win|pbf; XRGB }.

I would also expect to see ARGB in this list ...

>id:   9 rgba: 8 8 8 8 buf: 32 dep: 24 stcl: 8 int: 0-10 type: 
> win|pix|pbf|swap_preserved vis_id: ARGB (0x34325241)

This is an ARGB config (note the rgba: 8 8 8 8) and the vis_id ...

>id:  38 rgba: 8 8 8 0 buf: 24 dep: 24 stcl: 8 int: 0-10 type: 
> win|pix|pbf|swap_preserved vis_id: ARGB (0x34325241)

And this is an XRGB config, but incorrectly declared to have a
native visual ID of ARGB.
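
In case it helps with reading the log: the vis_id values are DRM FourCC codes,
which pack four ASCII characters into a little-endian 32-bit integer. A quick
way to decode them by hand (plain Python; the helper names are mine):

```python
import struct


def fourcc_to_str(code):
    # The lowest byte of the integer is the first character of the code.
    return struct.pack("<I", code).decode("ascii")


def str_to_fourcc(name):
    return struct.unpack("<I", name.encode("ascii"))[0]


print(fourcc_to_str(0x34325241))   # AR24
print(hex(str_to_fourcc("XR24")))  # 0x34325258
```

AR24 is what drm_fourcc.h calls DRM_FORMAT_ARGB8888 and XR24 is
DRM_FORMAT_XRGB8888, so a config with no alpha bits would be expected to
report the latter.
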

>id:  41 rgba: 8 8 8 0 buf: 24 dep: 24 stcl: 8 int: 0-10 type: 
> win|pix|pbf|ms_resolve_box|swap_preserved vis_id: ARGB (0x34325241)

These and the following are not relevant as they are multi-sampled
and/or depth/stencil configs.

So, my thoughts are:
  - how do you get to 'Bad/unknown DRM format code 0x' given
that drm_output_init_egl() explicitly prunes the list for this?
  - why is DRM_FORMAT_ARGB not found as a fallback format for
DRM_FORMAT_XRGB, given that we should be getting that through
fallback_format_for()?
  - your EGL stack is buggy, because it declares an 8880 format (i.e.
XRGB) to be ARGB, so fixing that would likely solve the
immediate problem, but the fallbacks above should work

Best of luck.

Cheers,
Daniel


Re: FW: xrandr and xwayland

2021-08-06 Thread Daniel Stone
Hi Guillermo,

On Fri, 6 Aug 2021 at 10:44, Guillermo Rodriguez Garcia
 wrote:
> On Fri, 6 Aug 2021 at 10:14, Daniel Stone wrote:
>> kiosk-shell is something we have in newer versions of Weston which
>> sounds like it would work well for your usecases - it's designed to
>> just run a single application fullscreen. You might want to check out
>> what we have in git, which will be released as 10.0 in a few weeks'
>> time.
>
> I have a use case for this which is conceptually one single application, 
> fullscreen, no desktop stuff (navigation bar, window management etc) but 
> needs to support additional processes with separate top-level windows. This 
> would be used e.g. to overlay a video stream (using gstreamer) on top of the 
> "main" application. Will this be supported by kiosk-shell ?

For clients to be able to position themselves relative to other
clients, wl_subcompositor gives you the subsurface mechanism for
embedding. This was designed for this exact usecase: an application
embedding media content in its own top-level window. Using this is
very strongly recommended.

If you are unable to do this for whatever reason, then you will need
to customise the window manager - in this case, kiosk-shell. We are
planning to extend this with Lua scripting to make this easier, but
have no firmly-defined ETA for this right now.

Cheers,
Daniel


Re: FW: xrandr and xwayland

2021-08-06 Thread Daniel Stone
Hi David,

On Thu, 5 Aug 2021 at 22:17, David Deyo  wrote:
> > Sounds like you're missing wl_display_flush() in your client code, so the 
> > requests don't make it to the socket buffer until they're forced to because 
> > it's filled up.
>
> That did it.  You guys are awesome.  I don’t suppose there’s a Weston doc 
> somewhere that would have told me that, had I looked.

It's a little bit buried, but this is the best explanation of how to
integrate Wayland into an event loop, as you would with a toolkit:
  
https://wayland.freedesktop.org/docs/html/apb.html#Client-classwl__display_1a40039c1169b153269a3dc0796a54ddb0

If you scroll up to the main wl_display section, it explains how event
queues are used as well.

Broadly speaking, the advice is to:
- when active, process your events and send any requests
- immediately before you go into a passive state (waiting for events,
sleeping, etc), flush the display so your requests get delivered
- run the prepare_read_queue() / dispatch_queue_pending() loop
immediately before sleeping, in order to make sure you get all events
queued for you, then flush again in case you've queued any requests
from your event handlers
- poll on the Wayland display FD as well as any other activity sources
(other event queues, timers, etc)
- when you wake up, dispatch your Wayland event queue as well as other
relevant event sources
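
The ordering above can be sketched in C. This is only an illustration of
the call order, not code from libwayland or any toolkit: struct
display_ops, loop_iteration() and the trace helpers are hypothetical
names, written so the sketch compiles without libwayland. In a real
client each hook would be the libwayland-client call named in its
comment (or its *_queue variant).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook table: each member stands in for the real
 * libwayland-client call named beside it. */
struct display_ops {
    int  (*prepare_read)(void *d);     /* wl_display_prepare_read() */
    int  (*dispatch_pending)(void *d); /* wl_display_dispatch_pending() */
    int  (*flush)(void *d);            /* wl_display_flush() */
    int  (*wait_readable)(void *d);    /* poll() on wl_display_get_fd() */
    int  (*read_events)(void *d);      /* wl_display_read_events() */
    void (*cancel_read)(void *d);      /* wl_display_cancel_read() */
};

/* One pass: drain queued events, flush requests, sleep, read, dispatch. */
static int loop_iteration(const struct display_ops *ops, void *d)
{
    assert(ops != NULL);

    /* Dispatch whatever is already queued until the read can be prepared. */
    while (ops->prepare_read(d) != 0) {
        if (ops->dispatch_pending(d) < 0)
            return -1;
    }

    /* Handlers may have queued requests: push them out before sleeping.
     * Simplification: a real client would treat EAGAIN from the flush as
     * "also poll for writability" rather than as a fatal error. */
    if (ops->flush(d) < 0) {
        ops->cancel_read(d);
        return -1;
    }

    /* Passive state: block until the display FD becomes readable. */
    if (ops->wait_readable(d) <= 0) {
        ops->cancel_read(d); /* timeout, error, or another source woke us */
        return 0;
    }

    if (ops->read_events(d) < 0)
        return -1;
    return ops->dispatch_pending(d);
}

/* Recording stand-in, used only to make the call order visible. */
struct trace {
    char   log[32]; /* one letter per hook call */
    size_t len;
    int    queued;  /* non-zero: events are queued before we sleep */
};

static void note(struct trace *t, char c)
{
    if (t->len + 1 < sizeof(t->log)) {
        t->log[t->len++] = c;
        t->log[t->len] = '\0';
    }
}

static int  t_prepare(void *d)  { struct trace *t = d; note(t, 'P'); return t->queued ? -1 : 0; }
static int  t_dispatch(void *d) { struct trace *t = d; note(t, 'D'); t->queued = 0; return 0; }
static int  t_flush(void *d)    { note(d, 'F'); return 0; }
static int  t_wait(void *d)     { note(d, 'W'); return 1; }
static int  t_read(void *d)     { note(d, 'R'); return 0; }
static void t_cancel(void *d)   { note(d, 'C'); }

static const struct display_ops trace_ops = {
    t_prepare, t_dispatch, t_flush, t_wait, t_read, t_cancel
};
```

Running loop_iteration(&trace_ops, &t) on a trace with queued set
records "PDPFWRD": a failed prepare, a dispatch, a successful prepare,
the flush, the wait, the read and the final dispatch.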

> > Also, my taskbar is the wrong length and my background is black.  Other 
> > than that, pretty cool.
>
> Yep, desktop-shell isn't designed to handle runtime rotation. It could be 
> made to pretty easily by working on the client code. For your case though I'd 
> assume something like kiosk-shell would be a much better bet.

kiosk-shell is something we have in newer versions of Weston which
sounds like it would work well for your use cases - it's designed to
just run a single application fullscreen. You might want to check out
what we have in git, which will be released as 10.0 in a few weeks'
time.

The rotation patches never got merged because we had some issues with
the IIO integration in particular, but having runtime rotation tests
sure would be nice, and kiosk-shell should at least be a lot easier to
fix than desktop-shell, if it does even need any fixes.

Cheers,
Daniel


Re: FW: xrandr and xwayland

2021-08-05 Thread Daniel Stone
On Thu, 5 Aug 2021 at 21:15, David Deyo  wrote:

> I was able to re-create the files of your patch and added them into my
> build tree.
>
> Not having an accelerometer, I’ve had to make a few changes.
>
> When you said, ‘It had issues’, I am also seeing some issues.
>
> I can rotate my screen, but something about the callback
> (weston_rotate_rotate) is acting strange.
>
> I have added a loop in autorotate that calls weston_rotate_rotate every 10
> secs.   I am logging out to Weston_log.
>
> It seems I only get those logs once every 15-30 minutes and when I do,
> it’s hundreds of logs.  What’s up with that?
>

Sounds like you're missing wl_display_flush() in your client code, so the
requests don't make it to the socket buffer until they're forced to because
it's filled up.
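
A toy model of that behaviour, in C. This is not libwayland code:
struct toy_conn, toy_queue(), toy_flush() and the 4096-byte capacity are
all made up for illustration. It only shows why small periodic requests
arrive in large, late bursts when nothing flushes explicitly.

```c
#include <assert.h>
#include <stddef.h>

#define OUT_CAPACITY 4096 /* illustrative size, not a libwayland constant */

struct toy_conn {
    size_t buffered;  /* bytes queued but not yet written to the socket */
    size_t delivered; /* bytes the other side has been handed */
};

/* Queue one request; the buffer is written out only when it would overflow. */
static void toy_queue(struct toy_conn *c, size_t len)
{
    assert(len <= OUT_CAPACITY);
    if (c->buffered + len > OUT_CAPACITY) {
        c->delivered += c->buffered; /* forced flush: a burst of old requests */
        c->buffered = 0;
    }
    c->buffered += len;
}

/* Explicit flush: the role wl_display_flush() plays in a real client. */
static void toy_flush(struct toy_conn *c)
{
    c->delivered += c->buffered;
    c->buffered = 0;
}
```

With 16-byte requests, nothing is delivered until the 257th request
forces the first 4096 bytes out in one go; calling toy_flush() after
each batch of requests delivers them straight away.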


> Also, my taskbar is the wrong length and my background is black.  Other
> than that, pretty cool.
>

Yep, desktop-shell isn't designed to handle runtime rotation. It could be
made to pretty easily by working on the client code. For your case though
I'd assume something like kiosk-shell would be a much better bet.

Cheers,
Daniel


Re: Proxying Wayland for security

2021-07-28 Thread Daniel Stone
Hi Alyssa,

On Tue, 27 Jul 2021 at 20:30, Alyssa Ross  wrote:
> Hi!  I'm Alyssa and I'm working on Spectrum[1], which is a project
> aiming to create a compartmentalized desktop Linux system, with high
> levels of isolation between applications.

I've seen, it's neat!

> One big issue for us is protecting the system against potentially
> malicious Wayland clients.  It's important that a compartmentalized
> application can't read from the clipboard or take a screenshot of the
> whole desktop without user consent.  (The latter is possible in
> wlroots compositors with wlr-screencopy.)
>
> So an idea I had was to write a proxy program that would sit
> in front of the compositor, and receive connections from clients.  If
> a client sent a wl_data_offer::receive, for example, the proxy could
> ask for user confirmation before forwarding that to the compositor.

As you've noted, the core protocol doesn't offer any way to scrape
these contents without additional extension protocols, which are not
implemented by all compositors. Generally speaking, GNOME's Mutter and
Weston tend not to implement these protocols, and wlroots-based
compositors tend to implement them.

> I could just implement this stuff in a compositor, but doing it with a
> proxy would mean that a known subset of the protocol could be used
> with any compositor, with appropriate access controls.  It would also
> be a reusable component that could be customised to have different
> access control policy depending on the needs of a distributor or user.

I think you'd be better off dealing with the problem at source.
libwayland-server already has a 'global filter' mechanism which allows
an arbitrary hook to decide which clients can and cannot see which
interfaces. This is used to advertise private protocols to trusted
clients: for example, Weston's UI and screenshot support are handled
by external clients, but we use the filter to ensure that _only_ those
clients can access those protocols, not arbitrary clients.
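
As a rough sketch of the policy such a hook would apply (not Weston's
actual code): in libwayland-server the hook is installed with
wl_display_set_global_filter() and is called with the client and the
global. The decision itself can be as small as the function below.
global_visible(), the trusted flag and the interface names are
hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Interfaces only trusted helper clients may see (illustrative names). */
static const char *const private_interfaces[] = {
    "example_screenshooter",
    "example_shell_ui",
};

/* Decide whether a client should be shown a given global at all. */
static bool global_visible(bool client_trusted, const char *interface)
{
    size_t n = sizeof(private_interfaces) / sizeof(private_interfaces[0]);

    assert(interface != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(interface, private_interfaces[i]) == 0)
            return client_trusted; /* private: trusted clients only */
    }
    return true; /* everything else is advertised to everyone */
}
```

A real filter callback would look up whether the client is one the
compositor launched itself, then call something like this with the
global's interface name.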

Implementing a proxy just shifts this problem rather than solving it
once; every time someone adds a new protocol, you have to plumb it
through the proxy and add some kind of policy mechanism. But the
compositors themselves probably have their own policy mechanism, so
why not just reuse that?

Cheers,
Daniel
___
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel


Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)

2021-06-21 Thread Daniel Vetter
On Mon, Jun 21, 2021 at 12:16:55PM +0200, Christian König wrote:
> Am 18.06.21 um 20:45 schrieb Daniel Vetter:
> > On Fri, Jun 18, 2021 at 8:02 PM Christian König
> >  wrote:
> > > Am 18.06.21 um 19:20 schrieb Daniel Vetter:
> > > [SNIP]
> > > The whole thing was introduced with this commit here:
> > > 
> > > commit f2c24b83ae90292d315aa7ac029c6ce7929e01aa
> > > Author: Maarten Lankhorst 
> > > Date:   Wed Apr 2 17:14:48 2014 +0200
> > > 
> > >   drm/ttm: flip the switch, and convert to dma_fence
> > > 
> > >   Signed-off-by: Maarten Lankhorst 
> > > 
> > >int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> > > 
> > > -   bo->sync_obj = driver->sync_obj_ref(sync_obj);
> > > +   reservation_object_add_excl_fence(bo->resv, fence);
> > >   if (evict) {
> > > 
> > > Maarten replaced the bo->sync_obj reference with the dma_resv exclusive
> > > fence.
> > > 
> > > This means that we need to apply the sync_obj semantic to all drivers
> > > using a DMA-buf with its dma_resv object, otherwise you break imports
> > > from TTM drivers.
> > > 
> > > Since then and up till now the exclusive fence must be waited on and
> > > never replaced with anything which signals before the old fence.
> > > 
> > > Maarten and I think Thomas did that and I was always assuming that you
> > > know about this design decision.
> > Surprisingly I do actually know this.
> > 
> > Still the commit you cite did _not_ change any of the rules around
> > dma_buf: Importers have _no_ obligation to obey the exclusive fence,
> > because the buffer is pinned. None of the work that Maarten has done
> > has fundamentally changed this contract in any way.
> 
> Well I now agree that the rules around dma_resv are different than I
> thought, but this change should have raised more eyebrows.
> 
> The problem is this completely broke interop with all drivers using TTM and
> I think might even explain some bug reports.
> 
> I re-introduced the moving fence by adding bo->moving a few years after the
> initial introduction of dma_resv, but that was just to work around
> performance problems introduced by using the exclusive fence for both use
> cases.

Ok that part is indeed not something I've known.

> > If amdgpu (or any other ttm based driver) hands back an sgt without
> > waiting for ttm_bo->moving or the exclusive fence first, then that's a
> > bug we need to fix in these drivers. But if ttm based drivers did get
> > this wrong, then they got this wrong both before and after the switch
> > over to using dma_resv - this bug would go back all the way to Dave's
> > introduction of drm_prime.c and support for that.
> 
> I'm not 100% sure, but I think before the switch to the dma_resv object
> drivers just waited for the BOs to become idle and that should have
> prevented this.
> 
> Anyway let's stop discussing history and move forward. Sending patches for
> all affected TTM drivers with CC: stable tags in a minute.
> 
> 
> > The only thing which importers have to do is not wreck the DAG nature
> > of the dma_resv fences and drop dependencies. Currently there's a
> > handful of drivers which break this (introduced over the last few
> > years), and I have it somewhere on my todo list to audit them all.
> 
> Please give that some priority.
> 
> Ignoring the moving fence is an information leak, but messing up the DAG
> gives you access to freed up memory.

Yeah will try to. I've also been hung up a bit on how to fix that, but I
think just closing the DAG-breakage is simplest. Any userspace which then
complains about the additional sync that causes would then be motivated to
look into the import ioctl Jason has. And I think the impact in practice
should be minimal, aside from some corner cases.
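
A toy illustration of what closing the DAG-breakage means, in C. This is
a model under stated assumptions, not kernel code: struct toy_fence,
struct toy_resv and both setters are hypothetical, and a fence is
reduced to the time at which it signals. The kernel would express the
same rule with waits or composite fences rather than timestamps.

```c
#include <assert.h>
#include <stddef.h>

/* Toy fence: only records when it signals (arbitrary time units). */
struct toy_fence {
    unsigned long signal_time;
};

/* Toy reservation object with just the exclusive slot. */
struct toy_resv {
    struct toy_fence excl;
};

/* Breaks the DAG: the new fence may signal before the one it replaces. */
static void set_exclusive_dropping_deps(struct toy_resv *r, struct toy_fence f)
{
    assert(r != NULL);
    r->excl = f;
}

/* Keeps the DAG: the stored fence signals only once both the old and the
 * new fence have signalled, at the cost of some additional sync. */
static void set_exclusive_keeping_deps(struct toy_resv *r, struct toy_fence f)
{
    assert(r != NULL);
    if (f.signal_time < r->excl.signal_time)
        f.signal_time = r->excl.signal_time;
    r->excl = f;
}
```

With an old exclusive fence signalling at time 100 and a new one at
time 40, the first setter leaves a fence that signals at 40 and the
second one a fence that signals at 100.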

> > The goal with extracting dma_resv from ttm was to make implicit sync
> > working and get rid of some terrible stalls on the userspace side.
> > Eventually it was also the goal to make truly dynamic buffer
> > reservation possible, but that took another 6 or so years to realize
> > with your work. And we had to make dynamic dma-buf very much opt-in,
> > because auditing all the users is very hard work and no one
> > volunteered. And for dynamic dma-buf the rule is that the exclusive
> > fence must _never_ be ignored, and the two drivers supporting it (mlx5
> > and amdgpu) obey that.
> > 
> > So yeah for ttm drivers dma_resv is primarily for memory management,
> > with a side effect of also

Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)

2021-06-18 Thread Daniel Vetter
On Fri, Jun 18, 2021 at 8:02 PM Christian König
 wrote:
>
> Am 18.06.21 um 19:20 schrieb Daniel Vetter:
> > On Fri, Jun 18, 2021 at 6:43 PM Christian König
> >  wrote:
> >> Am 18.06.21 um 17:17 schrieb Daniel Vetter:
> >>> [SNIP]
> >>> Ignoring _all_ fences is officially ok for pinned dma-buf. This is
> >>> what v4l does. Aside from it's definitely not just i915 that does this
> >>> even on the drm side, we have a few more drivers nowadays.
> >> No it seriously isn't. If drivers are doing this they are more than broken.
> >>
> >> See the comment in dma-resv.h
> >>
> >>* Based on bo.c which bears the following copyright notice,
> >>* but is dual licensed:
> >> 
> >>
> >>
> >> The handling in ttm_bo.c is and always was that the exclusive fence is
> >> used for buffer moves.
> >>
> >> As I said multiple times now the *MAIN* purpose of the dma_resv object
> >> is memory management and *NOT* synchronization.
> >>
> >> Those restrictions come from the original design of TTM where the
> >> dma_resv object originated from.
> >>
> >> The resulting consequences are that:
> >>
> >> a) If you access the buffer without waiting for the exclusive fence you
> >> run into a potential information leak.
> >>   We kind of let that slip for V4L since they only access the buffers
> >> for writes, so you can't do any harm there.
> >>
> >> b) If you overwrite the exclusive fence with a new one without waiting
> >> for the old one to signal you open up the possibility for userspace to
> >> access freed up memory.
> >>   This is a complete show stopper since it means that taking over the
> >> system is just a typing exercise.
> >>
> >>
> >> What you have done by allowing this in is ripping open a major security
> >> hole for any DMA-buf import in i915 from all TTM based drivers.
> >>
> >> This needs to be fixed ASAP, either by waiting in i915 and all other
> >> drivers doing this for the exclusive fence while importing a DMA-buf or
> >> by marking i915 and all other drivers as broken.
> >>
> >> Sorry, but if you allowed that in you seriously have no idea what you
> >> are talking about here and where all of this originated from.
> > Dude, get a grip, seriously. dma-buf landed in 2011
> >
> > commit d15bd7ee445d0702ad801fdaece348fdb79e6581
> > Author: Sumit Semwal 
> > Date:   Mon Dec 26 14:53:15 2011 +0530
> >
> > dma-buf: Introduce dma buffer sharing mechanism
> >
> > and drm prime landed in the same year
> >
> > commit 3248877ea1796915419fba7c89315fdbf00cb56a
> > (airlied/drm-prime-dmabuf-initial)
> > Author: Dave Airlie 
> > Date:   Fri Nov 25 15:21:02 2011 +
> >
> > drm: base prime/dma-buf support (v5)
> >
> > dma-resv was extracted much later
> >
> > commit 786d7257e537da0674c02e16e3b30a44665d1cee
> > Author: Maarten Lankhorst 
> > Date:   Thu Jun 27 13:48:16 2013 +0200
> >
> > reservation: cross-device reservation support, v4
> >
> > Maarten's patch only extracted the dma_resv stuff so it's there,
> > optionally. There was never any effort to roll this out to all the
> > existing drivers, of which there were plenty.
> >
> > It is, and has been for 10 years, totally fine to access dma-buf
> > without looking at any fences at all. From your pov of a ttm driver
> > dma-resv is mainly used for memory management and not sync, but I
> > think that's also due to some reinterpretation of the actual sync
> > rules on your side. For everyone else the dma_resv attached to a
> > dma-buf has been about implicit sync only, nothing else.
>
> No, that was way before my time.
>
> The whole thing was introduced with this commit here:
>
> commit f2c24b83ae90292d315aa7ac029c6ce7929e01aa
> Author: Maarten Lankhorst 
> Date:   Wed Apr 2 17:14:48 2014 +0200
>
>  drm/ttm: flip the switch, and convert to dma_fence
>
>  Signed-off-by: Maarten Lankhorst 
>
>   int ttm_bo_move_accel_cleanup(struct ttm_buffer_object *bo,
> 
> -   bo->sync_obj = driver->sync_obj_ref(sync_obj);
> +   reservation_object_add_excl_fence(bo->resv, fence);
>  if (evict) {
>
> Maarten replaced the bo->sync_obj reference with the dma_resv exclusive
> fence.
>
> This means that we need to apply the sync_obj semantic to all drivers
> using a DMA-buf with its dma_resv object, otherwise you br

Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)

2021-06-18 Thread Daniel Stone
Sorry for the mobile reply, but V4L2 is absolutely not write-only; there has 
never been an intersection of V4L2 supporting dmabuf and not supporting reads.

I see your point about the heritage of dma_resv but it’s a red herring. It 
doesn’t matter who’s right, or who was first, or where the code was extracted 
from.

It’s well defined that amdgpu defines resv to be one thing, that every other 
non-TTM user defines it to be something very different, and that the other TTM 
users define it to be something in the middle.

We’ll never get to anything workable if we keep arguing who’s right. Everyone 
is wrong, because dma_resv doesn’t globally mean anything.

It seems clear that there are three classes of synchronisation barrier (not 
using the ‘f’ word here), in descending exclusion order:
  - memory management barriers (amdgpu exclusive fence / ttm_bo->moving)
  - implicit synchronisation write barriers (everyone else’s exclusive fences, 
amdgpu’s shared fences)
  - implicit synchronisation read barriers (everyone else’s shared fences, also 
amdgpu’s shared fences sometimes)
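
One way to read that list, sketched in C. The enum names, the access
kinds and must_wait_for() are hypothetical, and the wait rules are an
interpretation of "descending exclusion order", not an existing kernel
interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Barrier classes from the list above, strongest exclusion first. */
enum barrier_class {
    BARRIER_MEMORY_MGMT,    /* buffer moves and similar kernel-internal work */
    BARRIER_IMPLICIT_WRITE, /* implicit-sync writers */
    BARRIER_IMPLICIT_READ,  /* implicit-sync readers */
};

/* Kinds of access a driver may want to perform (hypothetical names). */
enum access_kind {
    ACCESS_EXPLICIT_SYNC,   /* userspace orders its own work */
    ACCESS_IMPLICIT_READ,
    ACCESS_IMPLICIT_WRITE,
};

/* Does this kind of access have to wait for a barrier of this class? */
static bool must_wait_for(enum access_kind access, enum barrier_class barrier)
{
    switch (barrier) {
    case BARRIER_MEMORY_MGMT:
        return true; /* nobody may skip these */
    case BARRIER_IMPLICIT_WRITE:
        return access != ACCESS_EXPLICIT_SYNC; /* readers and writers wait */
    case BARRIER_IMPLICIT_READ:
        return access == ACCESS_IMPLICIT_WRITE; /* only writers wait */
    }
    assert(!"unknown barrier class");
    return true; /* be conservative if asserts are compiled out */
}
```

Under this reading an explicit-sync submission skips both implicit
classes but still waits for memory management, which is the case the
two existing slots cannot express on their own.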

I don’t see a world in which these three uses can be reduced to two slots. What 
also isn’t clear to me though, is how the memory-management barriers can 
exclude all other access in the original proposal with purely userspace CS. 
Retaining the three separate modes also seems like a hard requirement to not 
completely break userspace, but then I don’t see how three separate slots would 
work if they need to be temporally ordered. amdgpu fixed this by redefining the 
meaning of the two slots, others fixed this by not doing one of the three modes.

So how do we square the circle without encoding a DAG into the kernel? Do the 
two slots need to become a single list which is ordered by time + ‘weight’ and 
flattened whenever modified? Something else?

Have a great weekend.

-d

> On 18 Jun 2021, at 5:43 pm, Christian König  wrote:
> 
> Am 18.06.21 um 17:17 schrieb Daniel Vetter:
>> [SNIP]
>> Ignoring _all_ fences is officially ok for pinned dma-buf. This is
>> what v4l does. Aside from it's definitely not just i915 that does this
>> even on the drm side, we have a few more drivers nowadays.
> 
> No it seriously isn't. If drivers are doing this they are more than broken.
> 
> See the comment in dma-resv.h
> 
>  * Based on bo.c which bears the following copyright notice,
>  * but is dual licensed:
> 
> 
> 
> The handling in ttm_bo.c is and always was that the exclusive fence is used 
> for buffer moves.
> 
> As I said multiple times now the *MAIN* purpose of the dma_resv object is 
> memory management and *NOT* synchronization.
> 
> Those restrictions come from the original design of TTM where the dma_resv 
> object originated from.
> 
> The resulting consequences are that:
> 
> a) If you access the buffer without waiting for the exclusive fence you run 
> into a potential information leak.
> We kind of let that slip for V4L since they only access the buffers for 
> writes, so you can't do any harm there.
> 
> b) If you overwrite the exclusive fence with a new one without waiting for 
> the old one to signal you open up the possibility for userspace to access 
> freed up memory.
> This is a complete show stopper since it means that taking over the 
> system is just a typing exercise.
> 
> 
> What you have done by allowing this in is ripping open a major security hole 
> for any DMA-buf import in i915 from all TTM based drivers.
> 
> This needs to be fixed ASAP, either by waiting in i915 and all other drivers 
> doing this for the exclusive fence while importing a DMA-buf or by marking 
> i915 and all other drivers as broken.
> 
> Sorry, but if you allowed that in you seriously have no idea what you are 
> talking about here and where all of this originated from.
> 
> Regards,
> Christian.



Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)

2021-06-18 Thread Daniel Vetter
On Fri, Jun 18, 2021 at 6:43 PM Christian König
 wrote:
>
> Am 18.06.21 um 17:17 schrieb Daniel Vetter:
> > [SNIP]
> > Ignoring _all_ fences is officially ok for pinned dma-buf. This is
> > what v4l does. Aside from it's definitely not just i915 that does this
> > even on the drm side, we have a few more drivers nowadays.
>
> No it seriously isn't. If drivers are doing this they are more than broken.
>
> See the comment in dma-resv.h
>
>   * Based on bo.c which bears the following copyright notice,
>   * but is dual licensed:
> 
>
>
> The handling in ttm_bo.c is and always was that the exclusive fence is
> used for buffer moves.
>
> As I said multiple times now the *MAIN* purpose of the dma_resv object
> is memory management and *NOT* synchronization.
>
> Those restrictions come from the original design of TTM where the
> dma_resv object originated from.
>
> The resulting consequences are that:
>
> a) If you access the buffer without waiting for the exclusive fence you
> run into a potential information leak.
>  We kind of let that slip for V4L since they only access the buffers
> for writes, so you can't do any harm there.
>
> b) If you overwrite the exclusive fence with a new one without waiting
> for the old one to signal you open up the possibility for userspace to
> access freed up memory.
>  This is a complete show stopper since it means that taking over the
> system is just a typing exercise.
>
>
> What you have done by allowing this in is ripping open a major security
> hole for any DMA-buf import in i915 from all TTM based drivers.
>
> This needs to be fixed ASAP, either by waiting in i915 and all other
> drivers doing this for the exclusive fence while importing a DMA-buf or
> by marking i915 and all other drivers as broken.
>
> Sorry, but if you allowed that in you seriously have no idea what you
> are talking about here and where all of this originated from.

Dude, get a grip, seriously. dma-buf landed in 2011

commit d15bd7ee445d0702ad801fdaece348fdb79e6581
Author: Sumit Semwal 
Date:   Mon Dec 26 14:53:15 2011 +0530

   dma-buf: Introduce dma buffer sharing mechanism

and drm prime landed in the same year

commit 3248877ea1796915419fba7c89315fdbf00cb56a
(airlied/drm-prime-dmabuf-initial)
Author: Dave Airlie 
Date:   Fri Nov 25 15:21:02 2011 +

   drm: base prime/dma-buf support (v5)

dma-resv was extracted much later

commit 786d7257e537da0674c02e16e3b30a44665d1cee
Author: Maarten Lankhorst 
Date:   Thu Jun 27 13:48:16 2013 +0200

   reservation: cross-device reservation support, v4

Maarten's patch only extracted the dma_resv stuff so it's there,
optionally. There was never any effort to roll this out to all the
existing drivers, of which there were plenty.

It is, and has been for 10 years, totally fine to access dma-buf
without looking at any fences at all. From your pov of a ttm driver
dma-resv is mainly used for memory management and not sync, but I
think that's also due to some reinterpretation of the actual sync
rules on your side. For everyone else the dma_resv attached to a
dma-buf has been about implicit sync only, nothing else.

_only_ when you have a dynamic importer/exporter can you assume that
the dma_resv fences must actually be obeyed. That's one of the reasons
why we had to make this a completely new mode (the other one was
locking, but they really tie together).

Wrt your problems:
a) needs to be fixed in drivers exporting buffers and failing to make
sure the memory is there by the time dma_buf_map_attachment returns.
b) needs to be fixed in the importers, and there's quite a few of
those. There's more than i915 here, which is why I think we should
have the dma_resv_add_shared_exclusive helper extracted from amdgpu.
Avoids hand-rolling this about 5 times (6 if we include the import
ioctl from Jason).

Also I've like been trying to explain this ever since the entire
dynamic dma-buf thing started.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)

2021-06-18 Thread Daniel Vetter
On Fri, Jun 18, 2021 at 4:42 PM Christian König
 wrote:
>
> Am 18.06.21 um 16:31 schrieb Daniel Vetter:
> > [SNIP]
> >> And that drivers choose to ignore the exclusive fence is an absolutely
> >> no-go from a memory management and security point of view. Exclusive
> >> access means exclusive access. Ignoring that won't work.
> > Yeah, this is why I've been going all over the place about lifting
> > ttm_bo->moving to dma_resv. And also that I flat out don't trust your
> > audit, if you haven't found these drivers then very clearly you didn't
> > audit much at all :-)
>
> I just didn't think that anybody could be so stupid as to allow such a
> thing in.
>
> >> The only thing which saved us so far is the fact that drivers doing this
> >> are not that complex.
> >>
> >> BTW: How does it even work? I mean then you would run into the same
> >> problem as amdgpu with its page table update fences, e.g. that your
> >> shared fences might signal before the exclusive one.
> > So we don't ignore any fences when we rip out the backing storage.
> >
> > And yes there's currently a bug in all these drivers that if you set
> > both the "ignore implicit fences" and the "set the exclusive fence"
> > flag, then we just break this. Which is why I think we want to have a
> > dma_fence_add_shared_exclusive() helper extracted from your amdgpu
> > code, which we can then use everywhere to plug this.
>
> Daniel are you realizing what you are talking about here? Does that also
> apply for imported DMA-bufs?
>
> If yes then that is a security hole you can push an elephant through.
>
> Can you point me to the code using that?
>
> >>> For dma-buf this isn't actually a problem, because dma-buf are pinned. You
> >>> can't move them while other drivers are using them, hence there's not
> >>> actually a ttm_bo->moving fence we can ignore.
> >>>
> >>> p2p dma-buf aka dynamic dma-buf is a different beast, and i915 (and fwiw
> >>> these other drivers) need to change before they can do dynamic dma-buf.
> >>>
> >>>> Otherwise we have an information leak worth a CVE and that is certainly
> >>>> not something we want.
> >>> Because yes otherwise we get a CVE. But right now I don't think we have
> >>> one.
> >> Yeah, agree. But this is just because of coincidence and not because of
> >> good engineering :)
> > Well the good news is that I think we're now talking slightly less
> > past each other than the past few weeks :-)
> >
> >>> We do have a quite big confusion on what exactly the signaling ordering is
> >>> supposed to be between exclusive and the collective set of shared fences,
> >>> and there's some unifying that needs to happen here. But I think what
> >>> Jason implements here in the import ioctl is the most defensive version
> >>> possible, so really can't break any driver. It really works like you have
> >>> an ad-hoc gpu engine that does nothing itself, but waits for the current
> >>> exclusive fence and then sets the exclusive fence with its "CS" completion
> >>> fence.
> >>>
> >>> That's imo perfectly legit use-case.
> >> The use case is certainly legit, but I'm not sure if merging this at the
> >> moment is a good idea.
> >>
> >> Your note that drivers are already ignoring the exclusive fence in the
> >> dma_resv object was eye opening to me. And I now have the very strong
> >> feeling that the synchronization and the design of the dma_resv object
> >> is even more messy than I thought it is.
> >>
> >> To summarize we can be really lucky that it didn't blow up into our
> >> faces already.
> > I don't think there was that much luck involved (ok I did find a
> > possible bug in i915 already around cpu cache flushing) - for SoC the
> > exclusive slot in dma_resv really is only used for implicit sync and
> > nothing else. The fun only starts when you throw in pipelined backing
> > storage movement.
> >
> > I guess this also explains why you just seemed to ignore me when I was
> > asking for a memory management exclusive fence for the p2p stuff, or
> > some other way to specifically handling movements (like ttm_bo->moving
> > or whatever it is). From my pov we clearly needed that to make p2p
> > dma-buf work well enough, mixing up the memory management exclusive
> > slot with the implicit sync exclusive slot never looked like a bright

Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)

2021-06-18 Thread Daniel Vetter
On Fri, Jun 18, 2021 at 11:15 AM Christian König
 wrote:
>
> Am 17.06.21 um 21:58 schrieb Daniel Vetter:
> > On Thu, Jun 17, 2021 at 09:37:36AM +0200, Christian König wrote:
> >> [SNIP]
> >>> But, to the broader point, maybe?  I'm a little fuzzy on exactly where
> >>> i915 inserts and/or depends on fences.
> >>>
> >>>> When you combine that with complex drivers which use TTM and buffer
> >>>> moves underneath you can construct an information leak using this and
> >>>> give userspace access to memory which is allocated to the driver, but
> >>>> not yet initialized.
> >>>>
> >>>> This way you can leak things like page tables, passwords, kernel data
> >>>> etc... in large amounts to userspace and is an absolutely no-go for
> >>>> security.
> >>> Ugh...  Unfortunately, I'm really out of my depth on the implications
> >>> going on here but I think I see your point.
> >>>
> >>>> That's why I said we need to get this fixed before we upstream this
> >>>> patch set here and especially the driver change which is using that.
> >>> Well, i915 has had uAPI for a while to ignore fences.
> >> Yeah, exactly that's illegal.
> > You're a few years too late with closing that barn door. The following
> > drivers have this concept
> > - i915
> > - msm
> > - etnaviv
> >
> > Because you can't write a competent vulkan driver without this.
>
> WHAT? ^^
>
> > This was discussed at absolute epic length in various xdcs iirc. We did
> > ignore a bit the vram/ttm/bo-moving problem because all the people present
> > were hacking on integrated gpu (see list above), but that just means we
> > need to treat the ttm_bo->moving fence properly.
>
> I should have visited more XDCs in the past, the problem is much larger
> than this.
>
> But I now start to understand what you are doing with that design and
> why it looks so messy to me, amdgpu is just currently the only driver
> which does Vulkan and complex memory management at the same time.
>
> >> At least the kernel internal fences like moving or clearing a buffer object
> >> needs to be taken into account before a driver is allowed to access a
> >> buffer.
> > Yes i915 needs to make sure it never ignores ttm_bo->moving.
>
> No, that is only the tip of the iceberg. See TTM for example also puts
> fences which drivers need to wait for into the shared slots. Same thing
> for use cases like clear on release etc
>
>  From my point of view the main purpose of the dma_resv object is to
> serve memory management, synchronization for command submission is just
> a secondary use case.
>
> And that drivers choose to ignore the exclusive fence is an absolutely
> no-go from a memory management and security point of view. Exclusive
> access means exclusive access. Ignoring that won't work.

Yeah, this is why I've been going all over the place about lifting
ttm_bo->moving to dma_resv. And also that I flat out don't trust your
audit, if you haven't found these drivers then very clearly you didn't
audit much at all :-)

> The only thing which saved us so far is the fact that drivers doing this
> are not that complex.
>
> BTW: How does it even work? I mean then you would run into the same
> problem as amdgpu with its page table update fences, e.g. that your
> shared fences might signal before the exclusive one.

So we don't ignore any fences when we rip out the backing storage.

And yes there's currently a bug in all these drivers that if you set
both the "ignore implicit fences" and the "set the exclusive fence"
flag, then we just break this. Which is why I think we want to have a
dma_fence_add_shared_exclusive() helper extracted from your amdgpu
code, which we can then use everywhere to plug this.

> > For dma-buf this isn't actually a problem, because dma-buf are pinned. You
> > can't move them while other drivers are using them, hence there's not
> > actually a ttm_bo->moving fence we can ignore.
> >
> > p2p dma-buf aka dynamic dma-buf is a different beast, and i915 (and fwiw
> > these other drivers) need to change before they can do dynamic dma-buf.
> >
> >> Otherwise we have an information leak worth a CVE and that is certainly not
> >> something we want.
> > Because yes otherwise we get a CVE. But right now I don't think we have
> > one.
>
> Yeah, agree. But this is just because of coincidence and not because of
> good engineering :)

Well the good news is that I think we're now talking slightly less
past each other than the p

Re: [Mesa-dev] [PATCH 0/6] dma-buf: Add an API for exporting sync files (v12)

2021-06-17 Thread Daniel Vetter
On Thu, Jun 17, 2021 at 09:37:36AM +0200, Christian König wrote:
> Am 16.06.21 um 20:30 schrieb Jason Ekstrand:
> > On Tue, Jun 15, 2021 at 3:41 AM Christian König
> >  wrote:
> > > Hi Jason & Daniel,
> > > 
> > > maybe I should explain once more where the problem with this approach is
> > > and why I think we need to get that fixed before we can do something
> > > like this here.
> > > 
> > > To summarize what this patch here does is that it copies the exclusive
> > > fence and/or the shared fences into a sync_file. This alone is totally
> > > unproblematic.
> > > 
> > > The problem is what this implies. When you need to copy the exclusive
> > > fence to a sync_file then this means that the driver is at some point
> > > ignoring the exclusive fence on a buffer object.
> > Not necessarily.  Part of the point of this is to allow for CPU waits
> > on a past point in buffers timeline.  Today, we have poll() and
> > GEM_WAIT both of which wait for the buffer to be idle from whatever
> > GPU work is currently happening.  We want to wait on something in the
> > past and ignore anything happening now.
> 
> Good point, yes that is indeed a valid use case.
> 
> > But, to the broader point, maybe?  I'm a little fuzzy on exactly where
> > i915 inserts and/or depends on fences.
> > 
> > > When you combine that with complex drivers which use TTM and buffer
> > > moves underneath you can construct an information leak using this and
> > > give userspace access to memory which is allocated to the driver, but
> > > not yet initialized.
> > > 
> > > This way you can leak things like page tables, passwords, kernel data
> > > > etc... in large amounts to userspace, which is an absolute no-go for
> > > security.
> > Ugh...  Unfortunately, I'm really out of my depth on the implications
> > going on here but I think I see your point.
> > 
> > > > That's why I said we need to get this fixed before we upstream this
> > > patch set here and especially the driver change which is using that.
> > Well, i915 has had uAPI for a while to ignore fences.
> 
> Yeah, exactly that's illegal.

You're a few years too late with closing that barn door. The following
drivers have this concept:
- i915
- msm
- etnaviv

Because you can't write a competent Vulkan driver without this. This was
discussed at absolutely epic length in various XDCs iirc. We did ignore a
bit the vram/ttm/bo-moving problem because all the people present were
hacking on integrated gpu (see list above), but that just means we need to
treat the ttm_bo->moving fence properly.

> At least the kernel internal fences like moving or clearing a buffer object
> needs to be taken into account before a driver is allowed to access a
> buffer.

Yes i915 needs to make sure it never ignores ttm_bo->moving.

For dma-buf this isn't actually a problem, because dma-buf are pinned. You
can't move them while other drivers are using them, hence there's not
actually a ttm_bo->moving fence we can ignore.

p2p dma-buf aka dynamic dma-buf is a different beast, and i915 (and fwiw
these other drivers) need to change before they can do dynamic dma-buf.

> Otherwise we have an information leak worth a CVE and that is certainly not
> something we want.

Because yes otherwise we get a CVE. But right now I don't think we have
one.

We do have a quite big confusion on what exactly the signaling ordering is
supposed to be between exclusive and the collective set of shared fences,
and there's some unifying that needs to happen here. But I think what
Jason implements here in the import ioctl is the most defensive version
possible, so really can't break any driver. It really works like you have
an ad-hoc gpu engine that does nothing itself, but waits for the current
exclusive fence and then sets the exclusive fence with its "CS" completion
fence.

That's imo perfectly legit use-case.

Same for the export one. Waiting for a previous snapshot of implicit
fences is imo perfectly ok use-case and useful for compositors - client
might soon start more rendering, and on some drivers that always results
in the exclusive slot being set, so if you don't take a snapshot you
oversync real bad for your atomic flip.
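To make the snapshot argument concrete, here is a toy model in Python. It is only a sketch: `Fence`, `Buffer`, `export_snapshot` and `snapshot_done` are hypothetical names invented for illustration and are not kernel or libdrm API. It assumes that waiting on an exported snapshot only covers the fences attached at export time, while a poll()/GEM_WAIT-style idle wait covers whatever is attached at the moment of the wait.

```python
class Fence:
    """Hypothetical stand-in for one GPU completion fence."""

    def __init__(self, name):
        self.name = name
        self.signaled = False


class Buffer:
    """Toy stand-in for a buffer's set of implicit-sync fences."""

    def __init__(self):
        self.fences = []

    def add_fence(self, fence):
        self.fences.append(fence)

    def is_idle(self):
        # What an idle wait observes: every fence currently attached.
        return all(f.signaled for f in self.fences)

    def export_snapshot(self):
        # Copy of the fences attached right now; later additions are not seen.
        return list(self.fences)


def snapshot_done(snapshot):
    """True once every fence captured in the snapshot has signaled."""
    return all(f.signaled for f in snapshot)
```

In this model a compositor that snapshots after frame 1 can flip as soon as frame 1 completes, even if the client has already queued frame 2 on the same buffer; an idle wait would additionally block on frame 2, which is the oversync described above.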

> > Those changes are years in the past.  If we have a real problem here (not
> > sure on that yet), then we'll have to figure out how to fix it without
> > nuking uAPI.
> 
> Well, that was the basic idea of attaching flags to the fences in the
> dma_resv object.
> 
> In other words you clearly denote when you have to wait for a fence before
> accessing a buffer or you cause a security issue.

Replied somewhere else, and I do kinda l

Re: weston drm option "--connector" in previous version

2021-06-08 Thread Daniel Stone
Hi RyunHyeon,

On Tue, 8 Jun 2021 at 11:55, 김륜현 (RH Kim)  wrote:

> Hello, I want to output weston to specific display. So I searched about
> Weston option.
>
>
> I found "--connector" option.
>
> This option configures specific display for output
>
>
> However, I checked the mailing list that said would remove the
> "--connector" option from the previous version of Weston.
>
>
> So, is there no option to output to a specific display among many displays
> in the Weston 8.0.0 that I am currently using? (Is this the part that needs
> code modification?)
>

You should instead configure this through weston.ini, for example:
[output]
name=HDMI-A-1
mode=1920x1080

[output]
name=eDP-1
mode=off

You can find more details with 'man weston.ini'.

Cheers,
Daniel
___
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel


Re: Wayland and window position/size

2021-05-26 Thread Daniel Stone
Hi,

On Wed, 26 May 2021 at 10:30, Pekka Paalanen  wrote:
> On Tue, 25 May 2021 22:10:38 -0500 Igor Korot  wrote:
> > > Positioning - Don't position and Wayland discourages it by not having
> > > such an API. Sizing - do whatever you like.
> >
> > It just discourages it, so it is not completely impossible, correct?
>
> I'd say Wayland strongly discourages client dictated positioning.
>
> First, there is no Wayland protocol interface that would allow you to
> position a window (unless you invent a protocol extension to do it and
> then implement it also in any Wayland compositors you want to run on -
> some extensions like that exist and their support in different
> compositors varies, and they are mostly privileged or reserved for
> desktop components instead of apps). It might be possible to play
> tricks with existing generic interfaces to *maybe* eventually end up in
> some position, but that is extremely fragile and an outright abuse which
> might also cause strange UI behaviour.
>
> Second, Wayland does not define a coordinate system that would be
> useful for window positioning. Every window has its own local
> coordinate system, but they exist in a "vacuum" and independently of
> each other and of monitors. So at most you can position a window
> respective to another window, but not globally or per-monitor.
>
> Third, Wayland does not allow you to find out about other windows or
> desktop elements that you might want to stay clear of, nor about
> monitor edges (well... not for this purpose). So it's hard to choose
> your position properly.
>
> So, whether it is possible depends on how badly you are willing to break
> things to get there.
>
> As for sizing, I'd think the xdg-shell protocol extensions are mature.
> They allow you to freely size your window when the window state
> supports it, and they include the provision for the display server to
> tell you what size you should or must be, depending on window state. It
> also has an interface for positioning your popups such that they don't
> go out of view but also match where they should be respective to your
> window.

Pekka's answer is very thorough. As a much shorter version: just use
the XDG extensions which already exist for popup/dialog windows.

Your application is not the only one which needs to request
credentials - so do browsers, mail clients, file managers, and just
about everything else.

The X11 model is that every application has its own semantics for
doing this and decides exactly how to place the window. The client
tells the server: place this window at these co-ordinates, at this z
position, and give me all the input until I tell you otherwise.

The Wayland model is that you tell the Wayland server that you would
like to present a dialog or popup window, and which top-level window
it should be attached to. It then handles those windows in a
completely consistent and uniform way between all your applications,
including positioning, stacking, and focus.

So, just use that. It works. :)

Cheers,
Daniel


Re: Change default Wayland branches to 'main'

2021-04-27 Thread Daniel Stone
Hi all,

On Thu, 8 Apr 2021 at 12:20, Daniel Stone  wrote:

> I propose that we do this for all the wayland/* repositories, either this
> weekend or next; I'm happy to make the changes (rename 'master' to 'main'
> and retarget all open MRs). Does anyone have any opinions or suggestions?
>

Astute observers will notice that multiple weekends passed, but it's now
been done. All MRs against the various repos (wayland, wayland-protocols,
weston, etc) have been retargeted.

Cheers,
Daniel


Re: Split render/display SoCs, Mesa's renderonly, and Wayland dmabuf hints

2021-04-20 Thread Daniel Vetter
Just 2 comments on the kernel aspects here.

On Tue, Apr 20, 2021 at 12:18 PM Daniel Stone  wrote:
>
> Hi,
>
> On Mon, 19 Apr 2021 at 13:06, Simon Ser  wrote:
>>
>> I'm working on a Wayland extension [1] that, among other things, allows
>> compositors to advertise the preferred device to be used by Wayland
>> clients.
>>
>> In general, compositors will send a render node. However, in the case
>> of split render/display SoCs, things get a little bit complicated.
>>
>> [...]
>
>
> Thanks for the write-up Simon!
>
>>
>> There are a few solutions:
>>
>> 1. Require compositors to discover the render device by trying to import
>>    a buffer. For each available render device, the compositor would
>>    allocate a buffer, export it as a DMA-BUF, import it to the
>>    display-only device, then try to drmModeAddFB.
>
>
> I don't think this is actually tractable? Assuming that 'allocate a buffer' 
> means 'obtain a gbm_device for the render node directly and allocate a gbm_bo 
> from it', even with compatible formats and modifiers this will fail for more 
> restrictive display hardware. imx-drm and pl111 (combined with vc4 on some 
> Raspberry Pis) will fail this, since they'll take different allocation paths 
> when they're bound through kmsro vs. directly, accounting for things like 
> contiguous allocation. So we'd get false negatives on at least some platforms.
>
>>
>> 2. Allow compositors to query the render device magically opened by
>>    kmsro. This could be done either via EGL_EXT_device_drm, or via a
>>    new EGL extension.
>
>
> This would be my strong preference, and I don't entirely understand anholt's 
> pushback here. The way I see it, GBM is about allocation for scanout, and EGL 
> is about rendering. If, on a split GPU/display system, we create a gbm_device 
> from a KMS display-only device node, then creating an EGLDisplay from that 
> magically binds us to a completely different DRM GPU node, and anything using 
> that EGLDisplay will use that GPU device to render.
>
> Being able to discover the GPU device node through the device query is really 
> useful, because it tells us exactly what implicit magic EGL did under the 
> hood, and about the device that EGL will use. Being able to discover the 
> display node is much less useful; it does tell us how GBM will allocate 
> buffers, but the user already knows which device is in use because they 
> supplied it to GBM. I see the display node as a property of GBM, and the GPU 
> node as a property of EGL, even if EGL does do (*waves hands*) stuff under 
> the hood to ensure the two are compatible.
>
> If we had EGL_EXT_explicit_device, things get even more weird, I think; would 
> the device query on an EGLDisplay created with a combination of a gbm_device 
> native display handle and an explicit EGLDevice handle return the scanout 
> device from GBM or the GPU device from EGL? On my reading, I'd expect it to 
> be the latter; if the queries returned very different things based on whether 
> GPU device selection was implicit (returning the KMS node) or explicit (GPU 
> node), that would definitely violate the principle of least surprise.
>
>>
>> 3. Allow compositors to query the kernel drivers to know which devices
>>    are compatible with each other. Some uAPI to query a compatible
>>    display device from a render-only device, or vice-versa, has been
>>    suggested in the past.
>
>
> What does 'compatible' mean? Would an Intel iGPU and an AMD dGPU be 
> compatible with each other? Would a Mali GPU bound to system memory through 
> AMBA be as compatible with the display controller as it would with an AMD GPU 
> on PCIe? I think a query which only exposed whether or not devices could 
> share dmabufs with each other is far too generic to be helpful for the actual 
> usecase we have, as well as not being useful enough for other usecases ('well 
> you _can_ use dmabufs from your AMD GPU on your Mali GPU, but only if they 
> were allocated in the right domain').
>
>>
>> (1) has a number of limitations and gotchas. It requires allocating
>> real buffers, this has a rather big cost for something done at
>> compositor initialization time. It requires to select a buffer format
>> and modifier compatible with both devices, so it can't be isolated in
>> a simple function (and e.g. shared between all compositors in libdrm).
>
>
> We're already going to have to do throwaway allocations to make Intel's tiled 
> modes work; I'd rather not extend this out to doing throwaway allocations 
> across device combinations as well as modifier lists.
>
>>
>> Some drivers will allow to drm

Re: Split render/display SoCs, Mesa's renderonly, and Wayland dmabuf hints

2021-04-20 Thread Daniel Stone
sn't have GBM in the same way, right? In the Vulkan case,
we already know exactly what the GPU is, because it's the VkPhysicalDevice
you had to explicitly select to create the VkDevice etc; if you're using
GBM it's because you've _also_ created a gbm_device for the KMS node and
are allocating gbm_bos to import to VkDeviceMemory/VkImage, so you already
have both pieces of information. (If you're creating VkDeviceMemory/VkImage
in Vulkan then exporting dmabuf from there, since there's no way to specify
a target device, it's a blind guess as to whether it'll actually work for
KMS. Maybe it will! But maybe not.)


> I don't know how feasible (3) is. The kernel drivers must be able to
> decide whether buffers coming from another driver can be scanned out,
> but how early can they give an answer? Can they give an answer solely
> based on a DRM node, and not a DMA-BUF?
>

Maybe! But maybe not.


> Feedback is welcome. Do you agree with the premise that compositors
> need access to the render node?


Yes, strongly. Compositors may optimise for direct paths (e.g. direct
scanout of client buffers through KMS, directly providing client buffers to
media codecs for streaming) where possible. But they must always have a
'device of last resort': if these optimal paths are not possible (your
codec doesn't like your client buffers, you can't do direct scanout because
a notification occluded your client content and you've run out of overlay
planes, you're on Intel and your display FIFO size is measured in bits),
the compositor needs to know that it can access the client buffers somehow.

This is done by always importing into a GPU device - for most current
compositors as an EGLImage, for some others as a VkImage - and falling back
to GL composition paths, or GL blits, or even ReadPixels if strictly
necessary, so your client content continues to be accessible.

There is no way to do this without telling the client what that GPU device
node is, so it can allocate accordingly. Thanks to the implicit device
selection performed when creating an EGLDisplay from a gbm_device, we
cannot currently discover what that device node is.


> Do you have any other potential solution in mind?


I can't think of any right now, but am open to hearing them.


> Which solution would you prefer?


For all the reasons above, strongly #2, i.e. that querying the DRM device
node from the EGLDevice returned by querying an EGLDisplay created from a
gbm_device, returns the GPU device's render node and not the KMS device's
primary node.

Cheers,
Daniel


Re: Change default Wayland branches to 'main'

2021-04-08 Thread Daniel Stone
On Thu, 8 Apr 2021 at 14:02, Jan Engelhardt  wrote:

> On Thursday 2021-04-08 13:20, Daniel Stone wrote:
> >I propose that we do this for all the wayland/* repositories, either this
> >weekend or next; I'm happy to make the changes (rename 'master' to 'main'
> >and retarget all open MRs). Does anyone have any opinions or suggestions?
>
> That could be offensive to some people. Some might even be offended by not
> being offended.
>

I had hoped that 'serious suggestions' was implicit, but maybe not. Do you
have any serious suggestions?


Change default Wayland branches to 'main'

2021-04-08 Thread Daniel Stone
Hi all,
Going with a lot of other Git-based projects (and following the leads of
e.g. GitHub and GitLab), freedesktop.org is planning to change the default
branch name for its new projects to 'main' rather than 'master'.

Mesa is already migrating, and they have helpfully prepared a small Python
script which will retarget open MRs from 'master' to 'main'.

I propose that we do this for all the wayland/* repositories, either this
weekend or next; I'm happy to make the changes (rename 'master' to 'main'
and retarget all open MRs). Does anyone have any opinions or suggestions?

Cheers,
Daniel


Re: Managing surfaces with 2 different wl_displays

2020-12-21 Thread Daniel Stone
Hi Nimi,

On Mon, 21 Dec 2020 at 18:50, Nimi Wariboko Jr.  wrote:
> The crux of the issue is that when we create our OpenGL context we create it 
> with a display connection that we initiate. Further down the line when we 
> need to create a wl_egl_window, we are given a wl_surface that was created by 
> a different call to wl_display_connect. Now physically these displays are the 
> same, but it seems when trying to use the foreign surface with our internal 
> wl_display causes this whole thing to not work. I get the following error in 
> Weston 8:
>
> [17:54:27.891] libwayland: invalid object (6), type 
> (zwp_relative_pointer_manager_v1), message attach(?oii)
> [17:54:27.893] libwayland: error in client communication (pid 1340)
>
> My hunch is that calling certain wl_surface functions with a display other 
> than the one that created it is unsupported in weston. I'm primarily mailing 
> this list to confirm this is true and that there are no other possible work 
> arounds.

You're right. It's not just Weston, but Wayland itself. Objects are
local to each wl_display, and may not be shared between different
instances of a wl_display, even if it's the same server at the other
end of the connection.
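A small Python model can illustrate why this fails. It is only a sketch: the `Connection` class and the order in which objects are created are made up for illustration, and it assumes the log line means that an object ID passed as an argument was looked up in the other connection's ID space. Each connection numbers its own objects, so the same ID refers to different things on different connections.

```python
class Connection:
    """Hypothetical model of the object map behind one wl_display connection."""

    def __init__(self):
        self.objects = {}
        self.next_id = 2  # id 1 is the wl_display itself

    def create(self, interface):
        # Every new object takes the next free ID on *this* connection only.
        obj_id = self.next_id
        self.next_id += 1
        self.objects[obj_id] = interface
        return obj_id

    def interface_of(self, obj_id):
        # The receiver resolves an ID against the connection it arrived on.
        return self.objects.get(obj_id)


# Illustrative creation order; the real order depends on the client.
toolkit = Connection()
for iface in ("wl_registry", "wl_compositor", "wl_seat", "wl_surface",
              "zwp_relative_pointer_manager_v1"):
    toolkit.create(iface)

ours = Connection()
for iface in ("wl_registry", "wl_compositor", "wl_shm", "wl_shm_pool"):
    ours.create(iface)
buffer_id = ours.create("wl_buffer")
```

Here `buffer_id` is 6 on our connection, but an attach request sent on the toolkit's connection with argument 6 would be resolved to that connection's object 6, which is not a wl_buffer. That matches the shape of the "invalid object (6)" error quoted above.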

Cheers,
Daniel


Re: about wl_display_poll

2020-11-09 Thread Daniel Stone
Hi Leo,

On Mon, 9 Nov 2020 at 16:17, enpeng xu  wrote:
> I have a question about the functions wl_display_dispatch/wl_display_poll.
> The function wl_display_poll ignores EINTR and keeps polling until it
> gets an event from the remote end.
> I am not sure if this is expected, but a typical use case is that the user
> sets up a signal handler and runs an event loop by calling
> wl_display_dispatch_queue. The signal handler is expected to set a stop
> flag, and the event loop should exit once the flag is true. In practice,
> however, wl_display_poll stays blocked and there is no way to exit the
> event loop.
>
> Is there any reason for doing it this way? Or should we provide an
> interruptible version of wl_display_dispatch?

You can easily write your own interruptible/non-dominant version, you
just have to expand what that function does internally:

https://wayland.freedesktop.org/docs/html/apb.html#Client-classwl__display_1a40039c1169b153269a3dc0796a54ddb0
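As a rough sketch of what "expanding the function" could look like, here is the call sequence modelled in Python. The method names on `FakeDisplay` mirror the libwayland C functions (wl_display_prepare_read, wl_display_flush, wl_display_read_events, wl_display_cancel_read, wl_display_dispatch_pending), but `FakeDisplay` itself, the wake-up pipe and `dispatch_interruptible` are assumptions made for illustration, not part of libwayland. The idea is to poll on the display fd together with a second fd that a signal handler can write to.

```python
import os
import select
import socket


class FakeDisplay:
    """Hypothetical stand-in for a wl_display, backed by a socket."""

    def __init__(self, sock):
        self.sock = sock
        self.pending = []     # events read but not yet dispatched
        self.dispatched = []  # events handed to the application

    def get_fd(self):
        return self.sock.fileno()

    def prepare_read(self):
        # Like wl_display_prepare_read(): fails while events are still queued.
        return not self.pending

    def flush(self):
        pass  # nothing is buffered in this model

    def cancel_read(self):
        pass  # libwayland would drop the read intention here

    def read_events(self):
        self.pending.append(self.sock.recv(4096))

    def dispatch_pending(self):
        count = len(self.pending)
        self.dispatched.extend(self.pending)
        self.pending.clear()
        return count


def dispatch_interruptible(display, wake_fd, timeout_ms=-1):
    """Run one loop iteration; return False if wake_fd asked us to stop."""
    while not display.prepare_read():
        display.dispatch_pending()
    display.flush()

    poller = select.poll()
    poller.register(display.get_fd(), select.POLLIN)
    poller.register(wake_fd, select.POLLIN)
    ready = {fd for fd, _ in poller.poll(timeout_ms)}

    if display.get_fd() in ready:
        display.read_events()
    else:
        display.cancel_read()  # woken for another reason: give up the read
    display.dispatch_pending()
    return wake_fd not in ready
```

An application would call `dispatch_interruptible` in a loop until it returns False, with the signal handler writing a single byte to the write end of the pipe. In C the same shape applies, with the stop condition checked after poll() returns.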

Cheers,
Daniel


Re: Trying to reduce boot time for weston and logo from weston

2020-08-03 Thread Daniel Stone
Hi,

On Sat, 1 Aug 2020 at 06:28, Vadivelu Babu, Surendar (S.)
 wrote:
> As Arun stated, we tried to boot the Weston application from initramfs;
> however, there is a “pam” library dependency which is required for Weston.
> When we include all the “pam” libraries, the initramfs size increases from
> 9 MB to 32 MB.
>
> Following are the things we tried in order to remove the pam library from
> Weston:
>
> 1. PACKAGECONFIG_remove= "pam" in Weston bbappend file
> 2. EXTRA_OECONF_append = "   --disable-pam " in Weston bbappend file
>
> In both cases, Weston still requires the libpam library dependencies.
>
> Kindly advise whether it is possible to remove the pam libraries from the
> Weston application, and if not, how we can integrate Weston into the
> initramfs without increasing its size.

Yocto (I believe OE core) carries a patch which removes PAM usage in
Weston; you might want to pick that up.

Cheers,
Daniel


Re: Trying to reduce boot time for weston and logo from weston

2020-07-27 Thread Daniel Stone
Hi Arunkumar,

On Mon, 27 Jul 2020 at 06:59, arunkrish20  wrote:
> We are working with the i.MX8 platform. We are working with weston and DRM 
> backend. Below are the version details.
>
> NXP BSP Version: 4.14.98_2.0.0_ga
> SC Firmware Version : 1.3.1
> wayland version 1.16 am
> weston- ivi - 5.0.0

We are currently working our way towards releasing Weston 9.0.0, so
this version is quite old.

> Our requirement is to display the first screen in 2 Seconds.
>
> In the current environment we only see the first screen in the 6th
> second.

Ouch, that's quite long.

> We tried to boot the weston in initramfs. But due to size constraints we are 
> not able to. Size comes around 65MB.
>
> Is there any possibility for reducing the size of weston?

Weston itself with the DRM + GL backends only takes around 750kB once
installed. I assume the 65MB comes from extra dependencies, but that
is something you would have to investigate and configure in your Yocto
build. Weston itself does not have many dependencies, and those
dependencies are not large.

> Weston is taking 500 ms to complete the initialization. Can we reduce this
> time, e.g. by blocking unwanted device probing? Any ideas?
>
> If we use a separate DRM-based rendering application for the first screen
> and then switch to Weston, we see a blank screen. Instead of clearing the
> Weston screen buffer, can we have a logo in the first rendering, so that the
> blank can be avoided between the first DRM application and Weston rendering?

NXP has forked Weston and written their own backend, which is
responsible for the initialisation (both the time and the blank
screen). The default DRM backend is quite quick to come up and be
responsive, and doesn't blank the screen. So both these issues are
something you'd need to raise with NXP support, as they are due to
NXP's changes to our code.

Cheers,
Daniel


Re: weston-info as a standalone utility

2020-07-09 Thread Daniel Stone
Hi,

On Thu, 9 Jul 2020 at 15:38, Pekka Paalanen  wrote:
> On Thu, 9 Jul 2020 10:32:56 +0200
> Olivier Fourdan  wrote:
> > In the meantime, Peter has already submitted patches to wayland-info
> > (thanks Peter!) so the tip of wayland-info is different from
> > weston-info (basically, we have diverged already).
> >
> > Eventually, if nobody has objections, we could move that repo to the
> > wayland domain…
>
> thanks for doing this, looks good!
>
> +1 for having this under the wayland organization in Gitlab.
> +1 for deleting weston-info from Weston repository.
>
> Shall we keep the new repository only for "info" tools, or should
> it contain more, like Weston's simple-shm, simple-egl, and a
> rewrite of weston-eventdemo that doesn't use toytoolkit?
>
> I would be fine with moving all "simple" clients from Weston
> repository to there if that's appropriate.

+1 to all of the above. I'd be happy to see it in a utils and examples
repo, with at least the ones you mentioned here. I don't think
toytoolkit should ever be pushed in there, because then we run the
danger of people thinking it might be a good idea.

Cheers,
Daniel


Re: [ANNOUNCE] Weston 9.0 release schedule

2020-07-02 Thread Daniel Stone
Hi,

On Wed, 1 Jul 2020 at 20:13, Simon Ser  wrote:
> On Wednesday, July 1, 2020 8:49 PM, Jan Engelhardt  wrote:
> > Usecases.. checking for releases, both new and, sometimes historic research,
> > old ones.
> >
> > A fileindex has a "tabular" appearance where each "row" contains filename
> > and date, and that table be sorted primarily by filename with no extra
> > grouping, so that wayland-* and weston-* releases are not interspersed.
> >
> > The release page is the antithesis to that: it presents items like a
> > blog rather than a filelisting, grouped by date with a static
> > reverse-date ordering, and grouping different filenames by date as
> > well. Imagine if `ls -l` did that all the time.
>
> IMHO the releases page is enough for this use-case. The difference is
> not worth the time figuring out how to enable file listing (which
> requires having access to the infrastructure, and our sysadmins already
> have too much on their plate).

GitLab Pages doesn't do file listings, so it's not just a matter of
enabling the thing in nginx, but we'd have to generate a listing
index.html as part of our static site generation.

To be honest, I think you're looking in the wrong place though. Every
release of both Wayland and Weston always has a tag in git, and git
tags are only ever used for releases. So if you want to find out more,
just run `git tag -l` on each of the two repositories (or
`git ls-remote --tags https://gitlab.freedesktop.org/wayland/(wayland|weston).git`)
and have all the information you need?
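If a sorted, file-listing-like view is wanted, the tag output can be post-processed. The following is a minimal sketch, assuming the usual `git ls-remote --tags` output of one "<hash><TAB>refs/tags/<name>" line per tag, with annotated tags repeated as a peeled "^{}" entry; `release_tags` is a made-up helper name.

```python
def release_tags(ls_remote_output):
    """Extract tag names from `git ls-remote --tags` output, sorted by version."""
    tags = set()
    for line in ls_remote_output.splitlines():
        if not line.strip():
            continue
        _sha, ref = line.split("\t", 1)
        if not ref.startswith("refs/tags/"):
            continue
        name = ref[len("refs/tags/"):]
        # Annotated tags are listed twice; the peeled entry ends in "^{}".
        if name.endswith("^{}"):
            name = name[:-3]
        tags.add(name)

    def version_key(tag):
        # Compare numeric components as numbers so 10.0.0 sorts after 9.0.0;
        # non-numeric components sort first.
        return [int(part) if part.isdigit() else -1 for part in tag.split(".")]

    return sorted(tags, key=version_key)
```

Feeding it the text captured from the command above gives one de-duplicated, version-ordered list per repository.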

Cheers,
Daniel


Re: Current state of Window Decorations

2020-06-25 Thread Daniel Stone
Hi,

On Thu, 25 Jun 2020 at 10:01, Brad Robinson
 wrote:
> As a toolkit developer coming from Windows/OSX this is fairly shocking.  I'm 
> aware of the decoration protocol, but given it's not supported (and by the 
> sound of it never will be) on some of the major distros makes it almost 
> worthless.
>
> Seems odd to offload this responsibility to every toolkit without providing a 
> mechanism to achieve any consistency.  Or... is this just my background in 
> Windows/OSX where consistency in these areas is really encouraged and this 
> just isn't expected on Linux?

Others have said this well, but the big difference is that Windows and
OS X both have complete native toolkits as part of their base
platform. Those toolkits implement widgets, things like titlebars,
IPC, intra-process message handling and signaling,
internationalisation, application lifecycle management, etc etc.
That's not something we have: toolkits are instead an optional and
interchangeable component.

If you use one of the major extant toolkits, then you can reuse all
that functionality. This is why even some of the monoliths like
Chromium reuse toolkits. If you want to create your own toolkit from
scratch and not base on one of the existing ones, then for better or
worse, you get to recreate all the functionality they provide.

Shifting responsibility for window decorations to the compositor
doesn't solve the problem. Yes the compositor would render them for
you, but then you have additional protocols (with all the
synchronisation issues that implies) for the client and compositor to
co-ordinate rendering of the decorations and the content.

Neither is objectively better or worse, it's just a different design
which inherently brings different tradeoffs.

Cheers,
Daniel


Re: Language bindings for wl_registry_bind request

2020-06-18 Thread Daniel Stone
Hi,

On Thu, 18 Jun 2020 at 07:25, Brad Robinson
 wrote:
> I'm putting together a set of C# bindings for Wayland and it's coming along 
> nicely but I've hit an issue with wl_registry_bind where its implementation 
> doesn't seem to match the xml.
>
> The wayland.xml file declares it as: (essentially one input parameter - name)
>
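> <request name="bind">
>   <description summary="bind an object to the display">
>     Binds a new, client-created object to the server using the
>     specified name as the identifier.
>   </description>
>   <arg name="name" type="uint" summary="unique numeric name of the object"/>
>   <arg name="id" type="new_id" summary="bounded object"/>
> </request>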
>
>
> But the C implementation has additional version and interface parameters and 
> uses the wl_proxy_marshal_constructor_versioned - with apparently no hints in 
> the xml as to why.
>
> [...]
>
> Similarly the xml file would suggest the message signature should be "un", 
> but the C bindings have it as "usun".
>
> What's going on here?  Is this a special case for this one method?

Yeah, new_id with no interface gets expanded to new_id + interface
name + version:

https://gitlab.freedesktop.org/wayland/wayland/-/blob/master/src/scanner.c#L1233

Theoretically it applies to anything with that property, but in
reality the only user is wl_registry.bind.
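The expansion rule can be sketched in a few lines of Python. This is an illustration of the rule as described above, not the scanner's actual code: `wire_signature` is a hypothetical helper, and version prefixes and the "?" nullable marker that real signatures can carry are left out.

```python
def wire_signature(args):
    """Build a message signature from (type, interface) pairs as in the XML."""
    codes = {"int": "i", "uint": "u", "fixed": "f", "string": "s",
             "object": "o", "array": "a", "fd": "h"}
    signature = ""
    for arg_type, interface in args:
        if arg_type == "new_id":
            # With no interface in the XML the receiver cannot know the type,
            # so the interface name (s) and version (u) are sent before the id.
            signature += "n" if interface else "sun"
        else:
            signature += codes[arg_type]
    return signature
```

For wl_registry.bind, `wire_signature([("uint", None), ("new_id", None)])` yields "usun", while a request whose new_id names its interface, such as wl_display.sync with wl_callback, stays at "n".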

It is generally not recommended to write your own bindings from
scratch, however. When you need to integrate with EGL, you need to
pass a pointer to the struct wl_display * for an EGLDisplay, and to
wl_surface * for an EGLSurface (via wl_egl_window). Mesa internally
uses the C version of libwayland to send requests and receive events -
which will obviously not work with the C# implementation. The same is
true of Vulkan, GStreamer, and other libraries which integrate with
Wayland.

The recommended approach, avoiding these issues, is to wrap the
Wayland library using the 'dispatcher' methods provided; there is more
background here: https://smithay.github.io/wayland-rs-v-0-21.html

Cheers,
Daniel


Re: can subsurface and shell surface be used together to manage surfaces

2020-04-29 Thread Daniel Stone
Hi,

On Mon, 27 Apr 2020 at 10:02, Pekka Paalanen  wrote:
> On Mon, 27 Apr 2020 15:07:20 +0800 zou lan  wrote:
> > I read some documents about Chrome OS running Android APKs, such as
> > https://qiangbo-workspace.oss-cn-shanghai.aliyuncs.com/2019-09-10-chromeos-with-android-app/Arcpp_Graphics.pdf
> > As far as I know, Chromium can run on Wayland; I am just wondering how
> > it handles Android windows on Wayland.
> > I think the surface of an Android APK could be a Wayland surface in Linux,
> > and the window could be the shell surface.
> > Since all the Android APKs are still running in the Android container, the
> > Android window manager will manage these windows; in Wayland, the
> > relationship of these surfaces should be parent-subsurface, mapping to
> > Android windows. That's a bit of a problem since, as you confirmed, one
> > wl_surface can't be both a subsurface and a shell surface.
> > If the Android APKs are not subsurfaces, I am confused about how Android
> > handles the input events from Wayland.
>
> you'll have to ask or wait for someone who knows ARC++ to answer. I
> don't dare extrapolate details based on that one simple PDF alone.
>
> Android window management is very different from desktop window
> management, and I don't even know if CrOS window management is close
> to either. Using custom Wayland extensions is always a possibility, it
> happens even on the desktops, e.g. GNOME/GTK.
>
> Look at the slide titled "Chromium Wayland Interfaces", for instance.

ARC++ is proprietary, and I haven't seen its source code either.

But looking at 
https://github.com/chromium/chromium/blob/master/components/exo/wayland/protocol/aura-shell.xml
I would very much expect that the ARC++ client implementation uses
this as an extra to support Android applications running under
Chromium - for example, the titlebar-colour request is certainly
fulfilling an Android need.

Integrating Android into a desktop system is non-trivial. You will
have to make quite a lot of changes along the way: Android assumes
that you have one, or maybe two, applications open, and a status bar,
and maybe a button bar, and that's it. If you want to make Android
behave more like a desktop, then you're going to have to change
Android to fit your desktop, and you're going to have to change your
desktop to accommodate Android.

I believe the ARC++ solution of using multiple top-level windows, and
having the window management be primarily done by the host compositor,
is a better option than trying to use subsurfaces to invert
responsibility and effectively control the window management from
Android.

However, there is no out-of-the-box solution. Whatever you do is going
to require custom development and experimentation. 'SPURV' is a
periodically-refreshed effort from Collabora to see what this
integration would look like, however we never addressed the idea of
having multiple active applications, as it requires too many changes
in Android, such as deep changes to the Android activity manager to
deal with more than one application being current at one time.

Cheers,
Daniel


Re: Getting Weston to use DRM/KMS planes

2020-03-25 Thread Daniel Stone
Hi Oliver,

On Wed, 25 Mar 2020 at 10:31, Wohlmuth, Oliver
 wrote:
> I just started to work with Wayland/Weston, so please forgive me if I ask 
> silly questions.
>
> I’m running Weston (8.0.0) on a custom ARM SoC using the DRM backend. As the 
> OpenGL
> driver is not (yet) adapted for Wayland, I'm running Weston using the 
> '--use-pixman' option.
> This works fine so far.
>
> As our DRM/KMS driver supports several HW planes, I was debugging into the 
> Weston
> code trying to understand how Weston makes use of these HW planes (or what 
> needs
> to be done to get Weston use the HW planes). If I understand the code 
> correctly:
>
> drm_fb_get_from_view()
> {
> ...
> if (wl_shm_buffer_get(buffer->resource))
> return NULL;
> ...
> }
>
> Weston will never use HW planes for wl_shm_buffer, only for GBM or dmabuf type
> buffer it will be used.
>
>
>
> - Is this correct? What is the reason for this?

That's correct. The reason is that we need to be able to get a KMS
framebuffer object with the pixel content from the client buffer in
it. Effectively, the only way to import client content into a KMS
framebuffer is via dmabuf; KMS has no method of creating framebuffers
from an arbitrary pointer to user memory. And we need a framebuffer
object in order to display anything on a plane.

> - Any suggestions how to get HW planes used without OpenGL rendering?
>
>   Currently I think of patching Weston or implement a Weston client that
>   uses dmabuf buffer. Any hint is appreciated that puts me on the most
>   promising path.

There is no easy out-of-the-box path. The first place to start would
be to patch a client to allocate a dmabuf through the udmabuf kernel
API (the one in the mainline kernel tree, not the external module
living on GitHub), or the vgem driver.
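
As a rough illustration of that first step, here is a minimal sketch of
allocating a dmabuf through the mainline udmabuf device. The helper name
alloc_udmabuf() and the "client-buffer" label are made up for this example,
it assumes /dev/udmabuf is present and accessible, and the size must be a
multiple of the page size; treat it as a starting point rather than
something lifted from an existing client.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/udmabuf.h>

/*
 * Hypothetical helper (not part of Weston or any client toolkit):
 * allocate `size` bytes backed by a memfd and wrap them in a dmabuf.
 * Returns a dmabuf fd, or -1 if any step fails (for example when
 * /dev/udmabuf does not exist on this system).
 */
int alloc_udmabuf(size_t size)
{
    struct udmabuf_create create = { 0 };
    int memfd, devfd, dmabuf_fd = -1;

    memfd = memfd_create("client-buffer", MFD_ALLOW_SEALING);
    if (memfd < 0)
        return -1;

    /* udmabuf only accepts memfds that are sealed against shrinking. */
    if (ftruncate(memfd, (off_t)size) < 0 ||
        fcntl(memfd, F_ADD_SEALS, F_SEAL_SHRINK) < 0)
        goto out_memfd;

    devfd = open("/dev/udmabuf", O_RDWR | O_CLOEXEC);
    if (devfd < 0)
        goto out_memfd;

    create.memfd = memfd;
    create.flags = UDMABUF_FLAGS_CLOEXEC;
    create.offset = 0;
    create.size = size;

    /* On success the ioctl returns a new dmabuf file descriptor. */
    dmabuf_fd = ioctl(devfd, UDMABUF_CREATE, &create);
    close(devfd);

out_memfd:
    /* A real client would normally keep the memfd around to mmap it. */
    close(memfd);
    return dmabuf_fd;
}
```

The returned fd is what the client would then hand to the compositor through
the dmabuf protocol.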

Once you've done that, you will very quickly notice that Weston
doesn't actually support the dmabuf extension when using the Pixman
renderer. This would be possible to implement by implementing the
import_dmabuf, query_dmabuf_formats, and query_dmabuf_modifiers hooks.
You probably only want to declare support for the ARGB/XRGB
formats and the LINEAR modifier. For import_dmabuf, you would want to
issue the DMA_BUF_IOCTL_SYNC ioctl on the provided FD, mmap the dmabuf,
then call pixman_image_create_bits_no_clear() to obtain a pixman_image
which the renderer can use to source from the dmabuf's content.
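
The CPU-access part of that could look roughly like the sketch below. The
two helper names are hypothetical, and the ordering (map first, then bracket
the reads with DMA_BUF_SYNC_START and DMA_BUF_SYNC_END) is my reading of the
dma-buf uAPI rather than anything Weston does today. The pixman call is left
as a comment so that the example only depends on kernel headers.

```c
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>

/*
 * Hypothetical helper: map a dmabuf for reading and open a CPU read
 * window on it. Returns the mapping, or NULL on failure.
 */
void *dmabuf_begin_cpu_read(int dmabuf_fd, size_t size)
{
    struct dma_buf_sync sync = {
        .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_READ,
    };
    void *pixels;

    pixels = mmap(NULL, size, PROT_READ, MAP_SHARED, dmabuf_fd, 0);
    if (pixels == MAP_FAILED)
        return NULL;

    if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0) {
        munmap(pixels, size);
        return NULL;
    }

    /*
     * At this point the mapping could be wrapped with
     * pixman_image_create_bits_no_clear() for the renderer to sample.
     */
    return pixels;
}

/* Hypothetical helper: close the read window and drop the mapping. */
int dmabuf_end_cpu_read(int dmabuf_fd, void *pixels, size_t size)
{
    struct dma_buf_sync sync = {
        .flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_READ,
    };
    int ret = ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);

    munmap(pixels, size);
    return ret;
}
```

A real import_dmabuf hook would also have to validate the format, stride and
modifier before trusting the mapping.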

The reason we require the renderer to support dmabuf as well as the
backend is for fallback: in case we can't display the client content
on a KMS plane (which is not guaranteed, as the driver can reject
planes for any reason or limitation), we need to be able to use the
renderer as a fallback to show the content.
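
To make the fallback concrete, here is a toy sketch of the decision. All of
the names are invented for illustration and the callback merely stands in
for the backend asking KMS whether a plane configuration is acceptable; it
is not how Weston's DRM backend is actually structured.

```c
#include <stdbool.h>
#include <stddef.h>

enum present_path {
    PRESENT_ON_PLANE,     /* scan the client buffer out directly */
    PRESENT_VIA_RENDERER, /* composite it with the renderer instead */
};

/* Hypothetical stand-in for testing a buffer on a KMS plane. */
typedef bool (*plane_test_fn)(int dmabuf_fd, void *data);

/*
 * The driver may reject the plane for any reason, so the renderer path
 * must always be available for the same buffer.
 */
enum present_path choose_present_path(int dmabuf_fd, plane_test_fn try_plane,
                                      void *data)
{
    if (try_plane != NULL && try_plane(dmabuf_fd, data))
        return PRESENT_ON_PLANE;
    return PRESENT_VIA_RENDERER;
}

/* Example stand-ins: a driver that always rejects, and one that accepts. */
bool plane_always_rejects(int dmabuf_fd, void *data)
{
    (void)dmabuf_fd;
    (void)data;
    return false;
}

bool plane_always_accepts(int dmabuf_fd, void *data)
{
    (void)dmabuf_fd;
    (void)data;
    return true;
}
```

The point of the sketch is only that both paths consume the same dmabuf,
which is why the renderer needs import support too.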

Alternately, if your SoC has an Arm Mali GPU, you can use the Panfrost
driver available in the upstream Linux kernel and Mesa GL
implementation, which fully supports dmabuf/Wayland/GBM/etc.

Hope that helps.

Cheers,
Daniel


Re: 2020 X.Org Board of Directors Elections Nomination period is NOW

2020-03-24 Thread Daniel Vetter
Another reminder that we're in the election process, and the next
deadline is approaching:

- Send board nominations to elections AT x DOT org

- Go to https://members.x.org/ to renew your membership (or become
one to begin with!)

On Tue, Mar 17, 2020 at 7:21 AM Daniel Vetter  wrote:
>
> Just a quick reminder that both board nomination and membership
> renewal periods are still open:
>
> - Send board nominations to elections AT x DOT org
>
> - Go to https://members.x.org/ to renew your membership (or become
> one to begin with!)
>
> Cheers, Daniel
>
> On Sun, Mar 8, 2020 at 8:51 PM Daniel Vetter  wrote:
> >
> > We are seeking nominations for candidates for election to the X.Org
> > Foundation Board of Directors. All X.Org Foundation members are
> > eligible for election to the board.
> >
> > Nominations for the 2020 election are now open and will remain open
> > until 23:59 UTC on 29th March 2020.
> >
> > The Board consists of directors elected from the membership. Each
> > year, an election is held to bring the total number of directors to
> > eight. The four members receiving the highest vote totals will serve
> > as directors for two year terms.
> >
> > The directors who received two year terms starting in 2019 were Samuel
> > Iglesias Gonsálvez, Manasi D Navare, Lyude Paul and Daniel Vetter.
> > They will continue to serve until their term ends in 2021. Current
> > directors whose term expires in 2020 are Eric Anholt, Bryce
> > Harrington, Keith Packard and Harry Wentland.
> >
> > A director is expected to participate in the fortnightly IRC meeting
> > to discuss current business and to attend the annual meeting of the
> > X.Org Foundation, which will be held at a location determined in
> > advance by the Board of Directors.
> >
> > A member may nominate themselves or any other member they feel is
> > qualified. Nominations should be sent to the Election Committee at
> > elections at x.org.
> >
> > Nominees shall be required to be current members of the X.Org
> > Foundation, and submit a personal statement of up to 200 words that
> > will be provided to prospective voters. The collected statements,
> > along with the statement of contribution to the X.Org Foundation in
> > the member's account page on http://members.x.org, will be made
> > available to all voters to help them make their voting decisions.
> >
> > Nominations, membership applications or renewals and completed
> > personal statements must be received no later than 23:59 UTC on 02
> > April 2020.
> >
> > The slate of candidates will be published 6 April 2020 and candidate
> > Q&A will begin then. The deadline for Xorg membership applications and
> > renewals is 02 April 2020.
> >
> > Cheers, Daniel, on behalf of the X.Org BoD
> >
> > PS: I cc'ed the usual dev lists since not many members put in the renewal 
> > yet.
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
>
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: Plumbing explicit synchronization through the Linux ecosystem

2020-03-19 Thread Daniel Vetter
On Tue, Mar 17, 2020 at 11:01:57AM +0100, Michel Dänzer wrote:
> On 2020-03-16 7:33 p.m., Marek Olšák wrote:
> > On Mon, Mar 16, 2020 at 5:57 AM Michel Dänzer  wrote:
> >> On 2020-03-16 4:50 a.m., Marek Olšák wrote:
> >>> The synchronization works because the Mesa driver waits for idle (drains
> >>> the GFX pipeline) at the end of command buffers and there is only 1
> >>> graphics queue, so everything is ordered.
> >>>
> >>> The GFX pipeline runs asynchronously to the command buffer, meaning the
> >>> command buffer only starts draws and doesn't wait for completion. If the
> >>> Mesa driver didn't wait at the end of the command buffer, the command
> >>> buffer would finish and a different process could start execution of its
> >>> own command buffer while shaders of the previous process are still
> >> running.
> >>>
> >>> If the Mesa driver submits a command buffer internally (because it's
> >> full),
> >>> it doesn't wait, so the GFX pipeline doesn't notice that a command buffer
> >>> ended and a new one started.
> >>>
> >>> The waiting at the end of command buffers happens only when the flush is
> >>> external (Swap buffers, glFlush).
> >>>
> >>> It's a performance problem, because the GFX queue is blocked until the
> >> GFX
> >>> pipeline is drained at the end of every frame at least.
> >>>
> >>> So explicit fences for SwapBuffers would help.
> >>
> >> Not sure what difference it would make, since the same thing needs to be
> >> done for explicit fences as well, doesn't it?
> > 
> > No. Explicit fences don't require userspace to wait for idle in the command
> > buffer. Fences are signalled when the last draw is complete and caches are
> > flushed. Before that happens, any command buffer that is not dependent on
> > the fence can start execution. There is never a need for the GPU to be idle
> > if there is enough independent work to do.
> 
> I don't think explicit fences in the context of this discussion imply
> using that different fence signalling mechanism though. My understanding
> is that the API proposed by Jason allows implicit fences to be used as
> explicit ones and vice versa, so presumably they have to use the same
> signalling mechanism.
> 
> 
> Anyway, maybe the different fence signalling mechanism you describe
> could be used by the amdgpu kernel driver in general, then Mesa could
> drop the waits for idle and get the benefits with implicit sync as well?

Yeah, this is entirely about the programming model visible to userspace.
There shouldn't be any impact on the driver's choice of a top vs. bottom
of the gpu pipeline used for synchronization, that's entirely up to what
your hw/driver/scheduler can pull off.

Doing a full gfx pipeline flush for shared buffers, when your hw can do
better, sounds like an issue to me that's not related to this here at all. It
might be intertwined with amdgpu's special interpretation of dma_resv
fences though, no idea. We might need to revamp all that. But for a
userspace client that does nothing fancy (no multiple render buffer
targets in one bo, or vk style "I write to everything all the time,
perhaps" stuff) there should be 0 perf difference between implicit sync
through dma_resv and explicit sync through sync_file/syncobj/dma_fence
directly.

If there is, I'd consider that a bit of a driver bug.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Plumbing explicit synchronization through the Linux ecosystem

2020-03-19 Thread Daniel Vetter
On Tue, Mar 17, 2020 at 11:27:28AM -0500, Jason Ekstrand wrote:
> On Tue, Mar 17, 2020 at 10:33 AM Nicolas Dufresne  
> wrote:
> >
> > Le lundi 16 mars 2020 à 23:15 +0200, Laurent Pinchart a écrit :
> > > Hi Jason,
> > >
> > > On Mon, Mar 16, 2020 at 10:06:07AM -0500, Jason Ekstrand wrote:
> > > > On Mon, Mar 16, 2020 at 5:20 AM Laurent Pinchart wrote:
> > > > > Another issue is that V4L2 doesn't offer any guarantee on job 
> > > > > ordering.
> > > > > When you queue multiple buffers for camera capture for instance, you
> > > > > don't know until capture completes in which buffer the frame has been
> > > > > captured.
> > > >
> > > > Is this a Kernel UAPI issue?  Surely the kernel driver knows at the
> > > > start of frame capture which buffer it's getting written into.  I
> > > > would think that the kernel APIs could be adjusted (if we find good
> > > > reason to do so!) such that they return earlier and return a (buffer,
> > > > fence) pair.  Am I missing something fundamental about video here?
> > >
> > > For cameras I believe we could do that, yes. I was pointing out the
> > > issues caused by the current API. For video decoders I'll let Nicolas
> > > answer the question, he's way more knowledgeable than I am on that
> > > topic.
> >
> > Right now, there is simply no uAPI for supporting asynchronous error
> > reporting when fences are involved. That is true for both cameras and
> > CODECs. It's likely what all the attempts were missing; I don't know
> > enough myself to suggest something.
> >
> > Now, why Stateless video decoders are special is another subject. In
> > CODECs, the decoding and the presentation order may differ. For
> > Stateless kind of CODEC, a bitstream is passed to the HW. We don't know
> > if this bitstream is fully valid, since it is being parsed and
> > validated by the firmware. It's also the firmware's job to decide which
> > buffer should be presented first.
> >
> > In most firmware interfaces, that information is communicated back all
> > at once when the frame is ready to be presented (which may be quite
> > some time after it was decoded). So indeed, a fence model is not really
> > easy to add, unless the firmware was designed with that model in mind.
> 
> Just to be clear, I think we should do whatever makes sense here and
> not try to slam sync_file in when it doesn't make sense just because
> we have it.  The more I read on this thread, the less out-fences from
> video decode sound like they make sense unless we have a really solid
> plan for async error reporting.  It's possible, depending on how many
> processes are involved in the pipeline, that async error reporting
> could help reduce latency a bit if it let the kernel report the error
> directly to the last process in the chain.  However, I'm not convinced
> the potential for userspace programmer error is worth it.  That said,
> I'm happy to leave that up to the actual video experts. (I just do 3D)

dma_fence has an error state which you can set when things went south. The
fence still completes (to guarantee forward progress).

Currently that error code isn't really propagated anywhere (well i915 iirc
does something like that since it tracks the dependencies internally in the
scheduler). Definitely not at the dma_fence level, since we don't track
the dependency graph there at all. We might want to add that, would at
least be possible.

If we track the cascading dma_fence error state in the kernel I do think
this could work. I'm not sure whether it's actually a good/useful idea
still.
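
For what it's worth, the cascading idea can be modelled in a few lines of
userspace C. This is purely a toy: struct toy_fence and its helpers are
invented for this sketch and have nothing to do with the kernel's struct
dma_fence, which (as said above) does not track a dependency graph. It only
shows the two properties being discussed: a fence always signals, and an
error set upstream is visible at the end of the chain.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_DEPS 4

/* Toy userspace model of a fence with an error state. */
struct toy_fence {
    int signalled;                        /* 1 once the fence completed */
    int error;                            /* 0, or a negative errno code */
    struct toy_fence *deps[TOY_MAX_DEPS]; /* fences this one waits on */
    size_t n_deps;
};

/* Record that `f` depends on `dep`. */
void toy_fence_add_dep(struct toy_fence *f, struct toy_fence *dep)
{
    assert(f->n_deps < TOY_MAX_DEPS);
    f->deps[f->n_deps++] = dep;
}

/*
 * Complete a fence. It always signals, to guarantee forward progress;
 * the error is either its own or the first one found in its
 * (already signalled) dependencies.
 */
void toy_fence_signal(struct toy_fence *f, int own_error)
{
    size_t i;

    f->error = own_error;
    for (i = 0; i < f->n_deps && f->error == 0; i++)
        f->error = f->deps[i]->error;
    f->signalled = 1;
}
```

With a decode -> render -> flip chain, a decode failure of -EIO ends up on
the flip fence even though every fence in the chain still completes.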
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-19 Thread Daniel Vetter
On Wed, Mar 18, 2020 at 11:05:48AM +0100, Michel Dänzer wrote:
> On 2020-03-17 6:21 p.m., Lucas Stach wrote:
> > That's one of the issues with implicit sync that explicit may solve: 
> > a single client taking way too much time to render something can 
> > block the whole pipeline up until the display flip. With explicit 
> > sync the compositor can just decide to use the last client buffer if 
> > the latest buffer isn't ready by some deadline.
> 
> FWIW, the compositor can do this with implicit sync as well, by polling
> a dma-buf fd for the buffer. (Currently, it has to poll for writable,
> because waiting for the exclusive fence only isn't enough with amdgpu)

Would be great if we don't have to make this recommended uapi, just
because amdgpu leaks its trickery into the wider world. Polling for read
really should be enough (and I guess Christian gets to fix up amdgpu more,
at least for anything that has a dma-buf attached even if it's not shared
with anything !amdgpu.ko).
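
A minimal sketch of the compositor side, assuming the usual dma-buf poll
semantics (POLLIN becomes ready once the pending write fence has signalled,
POLLOUT once all fences have). The function names and the buffer-id
convention are invented for this example; poll() itself behaves the same on
any file descriptor.

```c
#include <poll.h>

/*
 * Wait up to timeout_ms for `fd` to report `events`.
 * Returns 1 if ready, 0 if the deadline passed, -1 on error.
 */
int wait_buffer_ready(int fd, short events, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = events };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;
    if (ret == 0)
        return 0;
    return (pfd.revents & events) ? 1 : -1;
}

/*
 * Hypothetical repaint policy: use the latest client buffer if its
 * rendering has finished by the deadline, else keep the previous one.
 */
int pick_buffer(int latest_fd, int latest_id, int previous_id, int timeout_ms)
{
    if (wait_buffer_ready(latest_fd, POLLIN, timeout_ms) == 1)
        return latest_id;
    return previous_id;
}
```

Passing POLLOUT instead of POLLIN gives the "poll for writable" behaviour
described in the quoted text.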
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Mesa-dev] Plumbing explicit synchronization through the Linux ecosystem

2020-03-19 Thread Daniel Vetter
On Tue, Mar 17, 2020 at 12:18:47PM -0500, Jason Ekstrand wrote:
> On Tue, Mar 17, 2020 at 12:13 PM Jacob Lifshay  
> wrote:
> >
> > One related issue with explicit sync using sync_file is that combined
> > CPUs/GPUs (the CPU cores *are* the GPU cores) that do all the
> > rendering in userspace (like llvmpipe but for Vulkan and with extra
> > instructions for GPU tasks) but need to synchronize with other
> > drivers/processes is that there should be some way to create an
> > explicit fence/semaphore from userspace and later signal it. This
> > seems to conflict with the requirement for a sync_file to complete in
> > finite time, since the user process could be stopped or killed.
> 
> Yeah... That's going to be a problem.  The only way I could see that
> working is if you created a sync_file that had a timeout associated
> with it.  However, then you run into the issue where you may have
> corruption if stuff doesn't complete on time.  Then again, you're not
> really dealing with an external unit and so the latency cost of going
> across the window system protocol probably isn't massively different
> from the latency cost of triggering the sync_file.  Maybe the answer
> there is to just do everything in-order and not worry about
> synchronization?

vgem does that already (fences with timeout). The corruption issue is also
not new, if your shaders take forever real gpus will nick your rendering
with a quick reset. Iirc someone (from cros google team maybe) was even
looking into making llvmpipe run on top of vgem as a real dri/drm mesa
driver.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: 2020 X.Org Board of Directors Elections Nomination period is NOW

2020-03-17 Thread Daniel Vetter
Just a quick reminder that both board nomination and membership
renewal periods are still open:

- Send board nominations to elections AT x DOT org

- Go to https://members.x.org/ to renew your membership (or become
one to begin with!)

Cheers, Daniel

On Sun, Mar 8, 2020 at 8:51 PM Daniel Vetter  wrote:
>
> We are seeking nominations for candidates for election to the X.Org
> Foundation Board of Directors. All X.Org Foundation members are
> eligible for election to the board.
>
> Nominations for the 2020 election are now open and will remain open
> until 23:59 UTC on 29th March 2020.
>
> The Board consists of directors elected from the membership. Each
> year, an election is held to bring the total number of directors to
> eight. The four members receiving the highest vote totals will serve
> as directors for two year terms.
>
> The directors who received two year terms starting in 2019 were Samuel
> Iglesias Gonsálvez, Manasi D Navare, Lyude Paul and Daniel Vetter.
> They will continue to serve until their term ends in 2021. Current
> directors whose term expires in 2020 are Eric Anholt, Bryce
> Harrington, Keith Packard and Harry Wentland.
>
> A director is expected to participate in the fortnightly IRC meeting
> to discuss current business and to attend the annual meeting of the
> X.Org Foundation, which will be held at a location determined in
> advance by the Board of Directors.
>
> A member may nominate themselves or any other member they feel is
> qualified. Nominations should be sent to the Election Committee at
> elections at x.org.
>
> Nominees shall be required to be current members of the X.Org
> Foundation, and submit a personal statement of up to 200 words that
> will be provided to prospective voters. The collected statements,
> along with the statement of contribution to the X.Org Foundation in
> the member's account page on http://members.x.org, will be made
> available to all voters to help them make their voting decisions.
>
> Nominations, membership applications or renewals and completed
> personal statements must be received no later than 23:59 UTC on 02
> April 2020.
>
> The slate of candidates will be published 6 April 2020 and candidate
> Q&A will begin then. The deadline for Xorg membership applications and
> renewals is 02 April 2020.
>
> Cheers, Daniel, on behalf of the X.Org BoD
>
> PS: I cc'ed the usual dev lists since not many members put in the renewal yet.
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: Plumbing explicit synchronization through the Linux ecosystem

2020-03-16 Thread Daniel Stone
Hi,

On Mon, 16 Mar 2020 at 15:33, Tomek Bury  wrote:
> > GL and GLES are not relevant. What is relevant is EGL, which defines
> > interfaces to make things work on the native platform.
> Yes and no. This is what EGL spec says about sharing a texture between 
> contexts:

Contexts are different though ...

> There are similar statements with regards to the lack of
> synchronisation guarantees for EGL images or between GL and native
> rendering, etc.

This also isn't about native rendering.

> But the main thing here is that EGL and Vulkan differ
> significantly.

Sure, I totally agree.

> The eglSwapBuffers() is expected to post an unspecified
> "back buffer" to the display system using some internal driver magic.
> EGL driver is then expected to obtain another back buffer at some
> unspecified point in the future.

Yes, this is rather the point: EGL doesn't specify platform-related
'black magic' to make things just work, because that's part of the
platform implementation details. And, as things stand, on Linux one of
those things is implicit synchronisation, unless the desired end state
of your driver is no synchronisation.

This thread is a discussion about changing that.

> > If you are using EGL_WL_bind_wayland_display, then one of the things
> > it is explicitly allowed/expected to do is to create a Wayland
> > protocol interface between client and compositor, which can be used to
> > pass buffer handles and metadata in a platform-specific way. Adding
> > synchronisation is also possible.
> Only one-way synchronisation is possible with this mechanism. There's
> a standard protocol for recycling buffers - wl_buffer_release() so
> buffer hand-over from the compositor to client remains unsynchronised
> - see below.

That's not true; you can post back a sync token every time the client
buffer is used by the compositor.

> > > The most troublesome part was Wayland buffer release mechanism, as it 
> > > only involves a CPU signalling over Wayland IPC, without any 3D driver 
> > > involvement. The choices were: explicit synchronisation extension or a 
> > > buffer copy in the compositor (i.e. compositor textures from the copy, so 
> > > the client can re-write the original), or some implicit synchronisation 
> > > in kernel space (but that wasn't an option in Broadcom driver).
> >
> > You can add your own explicit synchronisation extension.
> I could but that requires implementing it in the driver and in a
> number of compositors, therefore a standard extension
> zwp_linux_explicit_synchronization_v1 is a much better choice here than
> a custom one.

EGL_WL_bind_wayland_display is explicitly designed to allow each
driver to implement its own private extensions without modifying
compositors. For instance, Mesa adds the `wl_drm` extension, which is
used for bidirectional communication between the EGL implementations
in the client and compositor address spaces, without modifying either.

> > In every cross-process and cross-subsystem usecase, synchronisation is
> > obviously required. The two options for this are to implement kernel
> > support for implicit synchronisation (as everyone else has done),
> That would require major changes in driver architecture or a 2nd
> mechanism doing the same thing but in kernel space - both are
> non-starters.

OK. As it stands, everyone else has the kernel mechanism (e.g. via
dmabuf resv), so in this case if you are reinventing the underlying
platform in a proprietary stack, you get to solve the same problems
yourselves.

Cheers,
Daniel


Re: Plumbing explicit synchronization through the Linux ecosystem

2020-03-16 Thread Daniel Stone
Hi Tomek,

On Mon, 16 Mar 2020 at 12:55, Tomek Bury  wrote:
> I've been wrestling with the sync problems in Wayland some time ago, but only 
> with regards to 3D drivers.
>
> The guarantee given by the GL/GLES spec is limited to a single graphics 
> context. If the same buffer is accessed by 2 contexts the outcome is 
> unspecified. The cross-context and cross-process synchronisation is not 
> guaranteed. It happens to work on Mesa, because the read/write locking is 
> implemented in the kernel space, but it didn't work on Broadcom driver, which 
> has read-write interlocks in user space.

GL and GLES are not relevant. What is relevant is EGL, which defines
interfaces to make things work on the native platform. EGL doesn't
define any kind of synchronisation model for the Wayland, X11, or
GBM/KMS platforms - but it's one of the things which has to work. It
doesn't say that the implementation must make sure that the requested
format is displayable, but you sort of take it for granted that if you
ask EGL to display something it will do so.

Synchronisation is one of those mechanisms which is left to the
platform to implement under the hood. In the absence of platform
support for explicit synchronisation, the synchronisation must be
implicit.

>  A Vulkan client makes it even worse because of conflicting requirements: 
> Vulkan's vkQueuePresentKHR() passes in a number of semaphores but disallows 
> waiting. Wayland WSI requires wl_surface_commit() to be called from 
> vkQueuePresentKHR() which does require a wait, unless a synchronisation 
> primitive representing Vulkan semaphores is passed between Vulkan client and 
> the compositor.

If you are using EGL_WL_bind_wayland_display, then one of the things
it is explicitly allowed/expected to do is to create a Wayland
protocol interface between client and compositor, which can be used to
pass buffer handles and metadata in a platform-specific way. Adding
synchronisation is also possible.

> The most troublesome part was Wayland buffer release mechanism, as it only 
> involves a CPU signalling over Wayland IPC, without any 3D driver 
> involvement. The choices were: explicit synchronisation extension or a buffer 
> copy in the compositor (i.e. compositor textures from the copy, so the client 
> can re-write the original), or some implicit synchronisation in kernel space 
> (but that wasn't an option in Broadcom driver).

You can add your own explicit synchronisation extension.

In every cross-process and cross-subsystem usecase, synchronisation is
obviously required. The two options for this are to implement kernel
support for implicit synchronisation (as everyone else has done), or
implement generic support for explicit synchronisation (as we have
been working on with implementations inside Weston and Exosphere at
least), or implement private support for explicit synchronisation, or
do nothing and then be surprised at the lack of synchronisation.

Cheers,
Daniel


2020 X.Org Board of Directors Elections Nomination period is NOW

2020-03-08 Thread Daniel Vetter
We are seeking nominations for candidates for election to the X.Org
Foundation Board of Directors. All X.Org Foundation members are
eligible for election to the board.

Nominations for the 2020 election are now open and will remain open
until 23:59 UTC on 29th March 2020.

The Board consists of directors elected from the membership. Each
year, an election is held to bring the total number of directors to
eight. The four members receiving the highest vote totals will serve
as directors for two year terms.

The directors who received two year terms starting in 2019 were Samuel
Iglesias Gonsálvez, Manasi D Navare, Lyude Paul and Daniel Vetter.
They will continue to serve until their term ends in 2021. Current
directors whose term expires in 2020 are Eric Anholt, Bryce
Harrington, Keith Packard and Harry Wentland.

A director is expected to participate in the fortnightly IRC meeting
to discuss current business and to attend the annual meeting of the
X.Org Foundation, which will be held at a location determined in
advance by the Board of Directors.

A member may nominate themselves or any other member they feel is
qualified. Nominations should be sent to the Election Committee at
elections at x.org.

Nominees shall be required to be current members of the X.Org
Foundation, and submit a personal statement of up to 200 words that
will be provided to prospective voters. The collected statements,
along with the statement of contribution to the X.Org Foundation in
the member's account page on http://members.x.org, will be made
available to all voters to help them make their voting decisions.

Nominations, membership applications or renewals and completed
personal statements must be received no later than 23:59 UTC on 02
April 2020.

The slate of candidates will be published 6 April 2020 and candidate
Q&A will begin then. The deadline for Xorg membership applications and
renewals is 02 April 2020.

Cheers, Daniel, on behalf of the X.Org BoD

PS: I cc'ed the usual dev lists since not many members put in the renewal yet.
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Vetter
On Fri, Feb 28, 2020 at 9:31 PM Dave Airlie  wrote:
>
> On Sat, 29 Feb 2020 at 05:34, Eric Anholt  wrote:
> >
> > On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie  wrote:
> > >
> > > On Fri, 28 Feb 2020 at 18:18, Daniel Stone  wrote:
> > > >
> > > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie  wrote:
> > > > > b) we probably need to take a large step back here.
> > > > >
> > > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > > sponsorship money that they are just giving straight to google to pay
> > > > > for hosting credits? Google are profiting in some minor way from these
> > > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > > sort of discounts here. Having google sponsor the credits costs google
> > > > > substantially less than having any other company give us money to do
> > > > > it.
> > > >
> > > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > > comparable in terms of what you get and what you pay for them.
> > > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > > services are cheaper, but then you need to find someone who is going
> > > > to properly administer the various machines, install decent
> > > > monitoring, make sure that more storage is provisioned when we need
> > > > more storage (which is basically all the time), make sure that the
> > > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > > machines has had a drive in imminent-failure state for the last few
> > > > months), etc.
> > > >
> > > > Given the size of our service, that's a much better plan (IMO) than
> > > > relying on someone who a) isn't an admin by trade, b) has a million
> > > > other things to do, and c) hasn't wanted to do it for the past several
> > > > years. But as long as that's the resources we have, then we're paying
> > > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > > problems.
> > >
> > > Admin for gitlab and CI is a full time role anyways. The system is
> > > definitely not self sustaining without time being put in by you and
> > > anholt still. If we have $75k to burn on credits, and it was diverted
> > > to just pay an admin to admin the real hw + gitlab/CI would that not
> > > be a better use of the money? I didn't know if we can afford $75k for
> > > an admin, but suddenly we can afford it for gitlab credits?
> >
> > As I think about the time that I've spent at google in less than a
> > year on trying to keep the lights on for CI and optimize our
> > infrastructure in the current cloud environment, that's more than the
> > entire yearly budget you're talking about here.  Saying "let's just
> > pay for people to do more work instead of paying for full-service
> > cloud" is not a cost optimization.
> >
> >
> > > > Yes, we could federate everything back out so everyone runs their own
> > > > builds and executes those. Tinderbox did something really similar to
> > > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > > pre-merge testing, mind.
> > >
> > > Why? does gitlab not support the model? having builds done in parallel
> > > on runners closer to the test runners seems like it should be a thing.
> > > I guess artifact transfer would cost less then as a result.
> >
> > Let's do some napkin math.  The biggest artifacts cost we have in Mesa
> > is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> > downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
> > makes ~1.8TB/month, $180 or so).  We could build a local storage next
> > to the lava dispatcher so that the artifacts didn't have to contain
> > the rootfs that came from the container (~2/3 of the insides of the
> > zip file), but that's another service to build and maintain.  Building
> > the drivers once locally and storing it would save downloading the
> > other ~1/3 of the inside of the zip file, but that requires a big
> > enough system to do builds in time.
> >
> > I'm planning on doing a local filestore for google's lava lab, since I
> > need to be able to move our xml files off of the lava DUTs to get the
> > xml results we've become accustomed to, but this would not bubble up
> > to being a priority for my time if I wasn't
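
The napkin math quoted above can be re-run as a short script. This is only a sketch: the per-GB price is an assumption inferred from the "~1.8TB/month ($180 or so)" figure, not a quoted rate, and a 30-day month is assumed.

```python
# Rough artifact-egress estimate using the figures from the quoted napkin math.
# USD_PER_GB and DAYS_PER_MONTH are assumptions, not numbers from the thread.

ARTIFACT_MB = 60                  # zipped meson-arm64 artifact size
DOWNLOADS_PER_PIPELINE = 4 + 6    # 4 freedreno jobs + roughly 6 lava jobs
PIPELINES_PER_DAY = 100
DAYS_PER_MONTH = 30               # assumption
USD_PER_GB = 0.10                 # assumption inferred from "$180 or so"


def monthly_egress_gb(artifact_mb, downloads, pipelines_per_day, days=DAYS_PER_MONTH):
    """Return monthly egress in GB (decimal units: 1 GB = 1000 MB)."""
    return artifact_mb * downloads * pipelines_per_day * days / 1000


egress_gb = monthly_egress_gb(ARTIFACT_MB, DOWNLOADS_PER_PIPELINE, PIPELINES_PER_DAY)
cost_usd = egress_gb * USD_PER_GB
print(f"{egress_gb / 1000:.1f} TB/month, about ${cost_usd:.0f}")
```

Changing ARTIFACT_MB to a third of its value gives a feel for what dropping the rootfs from the zip file might save, under the same assumptions.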

Re: gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Stone
Hi Jan,

On Fri, 28 Feb 2020 at 10:09, Jan Engelhardt  wrote:
> On Friday 2020-02-28 08:59, Daniel Stone wrote:
> >I believe that in January, we had $2082 of network cost (almost
> >entirely egress; ingress is basically free) and $1750 of
> >cloud-storage cost (almost all of which was download). That's based
> >on 16TB of cloud-storage (CI artifacts, container images, file
> >uploads, Git LFS) egress and 17.9TB of other egress (the web service
> >itself, repo activity). Projecting that out [×12 for a year] gives
> >us roughly $45k of network activity alone,
>
> I had come to a similar conclusion a few years back: It is not very
> economic to run ephemeral buildroots (and anything like it) between
> two (or more) "significant locations" of which one end is located in
> a Large Cloud datacenter like EC2/AWS/etc.
>
> As for such usecases, me and my surrounding peers have used (other)
> offerings where there is 50 TB free network/month, and yes that may
> have entailed doing more adminning than elsewhere - but an admin
> appreciates $2000 a lot more than a corporation, too.

Yes, absolutely. For context, our storage & network costs have
increased >10x in the past 12 months (~$320 Jan 2019), >3x in the past
6 months (~$1350 July 2019), and ~2x in the past 3 months (~$2000 Oct
2019).

I do now (personally) think that it's crossed the point at which it
would be worthwhile paying an admin to solve the problems that cloud
services currently solve for us - which wasn't true before. Such an
admin could also deal with things like our SMTP delivery failure rate,
which in the past year has spiked over 50% (see previous email),
demand for new services such as Discourse which will enable user
support without either a) users having to subscribe to a mailing list,
or b) bug trackers being cluttered up with user requests and other
non-bugs, etc.

Cheers,
Daniel


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Stone
On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
 wrote:
> On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> > Yeah, changes on vulkan drivers or backend compilers should be
> > fairly
> > sandboxed.
> >
> > We also have tools that only work for intel stuff, that should never
> > trigger anything on other people's HW.
> >
> > Could something be worked out using the tags?
>
> I think so! We have the pre-defined environment variable
> CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
>
> https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
>
> That sounds like a pretty neat middle-ground to me. I just hope that
> new pipelines are triggered if new labels are added, because not
> everyone is allowed to set labels, and sometimes people forget...

There's also this which is somewhat more robust:
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569

Cheers,
Daniel


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Vetter
On Fri, Feb 28, 2020 at 10:29 AM Erik Faye-Lund
 wrote:
>
> On Fri, 2020-02-28 at 13:37 +1000, Dave Airlie wrote:
> > On Fri, 28 Feb 2020 at 07:27, Daniel Vetter 
> > wrote:
> > > Hi all,
> > >
> > > You might have read the short take in the X.org board meeting
> > > minutes
> > > already, here's the long version.
> > >
> > > The good news: gitlab.fd.o has become very popular with our
> > > communities, and is used extensively. This especially includes all
> > > the
> > > CI integration. Modern development process and tooling, yay!
> > >
> > > The bad news: The cost in growth has also been tremendous, and it's
> > > breaking our bank account. With reasonable estimates for continued
> > > growth we're expecting hosting expenses totalling 75k USD this
> > > year,
> > > and 90k USD next year. With the current sponsors we've set up we
> > > can't
> > > sustain that. We estimate that hosting expenses for gitlab.fd.o
> > > without any of the CI features enabled would total 30k USD, which
> > > is
> > > within X.org's ability to support through various sponsorships,
> > > mostly
> > > through XDC.
> > >
> > > Note that X.org no longer sponsors any CI runners themselves,
> > > we've stopped that. The huge additional expenses are all just in
> > > storing and serving build artifacts and images to outside CI
> > > runners
> > > sponsored by various companies. A related topic is that with the
> > > growth in fd.o it's becoming infeasible to maintain it all on
> > > volunteer admin time. X.org is therefore also looking for admin
> > > sponsorship, at least medium term.
> > >
> > > Assuming that we want cash flow reserves for one year of
> > > gitlab.fd.o
> > > (without CI support) and a trimmed XDC and assuming no sponsor
> > > payment
> > > meanwhile, we'd have to cut CI services somewhere between May and
> > > June
> > > this year. The board is of course working on acquiring sponsors,
> > > but
> > > filling a shortfall of this magnitude is neither easy nor quick
> > > work,
> > > and we therefore decided to give an early warning as soon as
> > > possible.
> > > Any help in finding sponsors for fd.o is very much appreciated.
> >
> > a) Ouch.
> >
> > b) we probably need to take a large step back here.
> >
>
> I kinda agree, but maybe the step doesn't have to be *too* large?
>
> I wonder if we could solve this by restructuring the project a bit. I'm
> talking purely from a Mesa point of view here, so it might not solve
> the full problem, but:
>
> 1. It feels silly that we need to test changes to e.g the i965 driver
> on dragonboards. We only have a big "do not run CI at all" escape-
> hatch.
>
> 2. A lot of us are working for a company that can probably pay for
> their own needs in terms of CI. Perhaps moving some costs "up front" to
> the company that needs it can make the future of CI for those who can't
> do this
>
> 3. I think we need a much more detailed break-down of the cost to make
> educated changes. For instance, how expensive is Docker image
> uploads/downloads (e.g intermediary artifacts) compared to build logs
> and final test-results? What kind of artifacts?

We have logs somewhere, but no one has gotten around to analyzing them
yet. That will be quite a bit of work, since the cloud storage is
totally disconnected from the gitlab front-end, so working out which
project or CI job caused a given transfer is going to require
scripting. Volunteers are very much welcome, I think.

> One suggestion would be to do something more similar to what the kernel
> does, and separate into different repos for different subsystems. This
> could allow us to have separate testing-pipelines for these repos,
> which would mean that for instance a change to RADV didn't trigger a
> full Panfrost test-run.

Uh as someone who lives the kernel multi-tree model daily, there's a
_lot_ of pain involved. I think much better to look at filtering out
CI targets for when nothing relevant happened. But that gets somewhat
tricky, since "nothing relevant" is always only relative to some
baseline, so bit of scripting and all involved to make sure you don't
run stuff too often or (probably worse) not often enough.
-Daniel
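
As a rough illustration of the filtering idea above, here is a minimal sketch that maps changed paths to CI targets. The path prefixes and target names are hypothetical, not Mesa's actual CI configuration, and the list of changed files is assumed to come from something like `git diff --name-only` against whatever baseline is chosen. Choosing that baseline is the tricky part and is not addressed here.

```python
# Hypothetical sketch: pick CI targets from the files changed since a baseline.
# The prefixes and target names are illustrative only, not Mesa's real config.

TARGET_PATHS = {
    "radv": ("src/amd/",),
    "panfrost": ("src/panfrost/", "src/gallium/drivers/panfrost/"),
    "i965": ("src/mesa/drivers/dri/i965/", "src/intel/"),
}
# Paths assumed to affect every driver; any change here runs everything.
SHARED_PATHS = ("src/compiler/", "meson.build", ".gitlab-ci.yml")


def targets_to_run(changed_files):
    """Return the set of CI targets affected by the changed files."""
    targets = set()
    for path in changed_files:
        if path.startswith(SHARED_PATHS):
            return set(TARGET_PATHS)  # shared code changed: run all targets
        matched = False
        for target, prefixes in TARGET_PATHS.items():
            if path.startswith(prefixes):
                targets.add(target)
                matched = True
        if not matched:
            # Unknown path: running too often is assumed to be safer
            # than not running often enough.
            return set(TARGET_PATHS)
    return targets


print(sorted(targets_to_run(["src/amd/vulkan/radv_device.c"])))
```

The fallback to "run everything" for unrecognised paths is a deliberate design choice in this sketch, so that a missing mapping costs extra CI time instead of silently skipping tests.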

> This would probably require us to accept using a more branch-heavy
> work-flow. I don't personally think that would be a bad thing.
>
> But this is all kinda based on an assumption that running hardware-
> testing is the expensive 

Re: [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Stone
On Fri, 28 Feb 2020 at 08:48, Dave Airlie  wrote:
> On Fri, 28 Feb 2020 at 18:18, Daniel Stone  wrote:
> > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > comparable in terms of what you get and what you pay for them.
> > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > services are cheaper, but then you need to find someone who is going
> > to properly administer the various machines, install decent
> > monitoring, make sure that more storage is provisioned when we need
> > more storage (which is basically all the time), make sure that the
> > hardware is maintained in decent shape (pretty sure one of the fd.o
> > machines has had a drive in imminent-failure state for the last few
> > months), etc.
> >
> > Given the size of our service, that's a much better plan (IMO) than
> > relying on someone who a) isn't an admin by trade, b) has a million
> > other things to do, and c) hasn't wanted to do it for the past several
> > years. But as long as that's the resources we have, then we're paying
> > the cloud tradeoff, where we pay more money in exchange for fewer
> > problems.
>
> Admin for gitlab and CI is a full time role anyways. The system is
> definitely not self sustaining without time being put in by you and
> anholt still. If we have $75k to burn on credits, and it was diverted
> to just pay an admin to admin the real hw + gitlab/CI would that not
> be a better use of the money? I didn't know if we can afford $75k for
> an admin, but suddenly we can afford it for gitlab credits?

s/gitlab credits/GCP credits/

I took a quick look at HPE, which we previously used for bare metal,
and it looks like we'd be spending $25-50k (depending on how much
storage you want to provision, how much room you want to leave to
provision more storage later, how much you care about backups) to run
a similar level of service so that'd put a bit of a dint in your
year-one budget.

The bare-metal hosting providers also add up to more expensive than
you might think, again especially if you want either redundancy or
just backups.

> > Yes, we could federate everything back out so everyone runs their own
> > builds and executes those. Tinderbox did something really similar to
> > that IIRC; not sure if Buildbot does as well. Probably rules out
> > pre-merge testing, mind.
>
> Why? does gitlab not support the model? having builds done in parallel
> on runners closer to the test runners seems like it should be a thing.
> I guess artifact transfer would cost less then as a result.

It does support the model but if every single build executor is also
compiling Mesa from scratch locally, how long do you think that's
going to take?

> > Again, if you want everything to be centrally
> > designed/approved/monitored/controlled, that's a fine enough idea, and
> > I'd be happy to support whoever it was who was doing that for all of
> > fd.o.
>
> I don't think we have any choice but to have someone centrally
> controlling it. You can't have a system in place that lets CI users
> burn large sums of money without authorisation, and that is what we
> have now.

OK, not sure who it is who's going to be approving every update to
every .gitlab-ci.yml in the repository, or maybe we just have zero
shared runners and anyone who wants to do builds can BYO.


Re: [Intel-gfx] gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Stone
On Fri, 28 Feb 2020 at 03:38, Dave Airlie  wrote:
> b) we probably need to take a large step back here.
>
> Look at this from a sponsor POV, why would I give X.org/fd.o
> sponsorship money that they are just giving straight to google to pay
> for hosting credits? Google are profiting in some minor way from these
> hosting credits being bought by us, and I assume we aren't getting any
> sort of discounts here. Having google sponsor the credits costs google
> substantially less than having any other company give us money to do
> it.

The last I looked, Google GCP / Amazon AWS / Azure were all pretty
comparable in terms of what you get and what you pay for them.
Obviously providers like Packet and Digital Ocean who offer bare-metal
services are cheaper, but then you need to find someone who is going
to properly administer the various machines, install decent
monitoring, make sure that more storage is provisioned when we need
more storage (which is basically all the time), make sure that the
hardware is maintained in decent shape (pretty sure one of the fd.o
machines has had a drive in imminent-failure state for the last few
months), etc.

Given the size of our service, that's a much better plan (IMO) than
relying on someone who a) isn't an admin by trade, b) has a million
other things to do, and c) hasn't wanted to do it for the past several
years. But as long as that's the resources we have, then we're paying
the cloud tradeoff, where we pay more money in exchange for fewer
problems.

> If our current CI architecture is going to burn this amount of money a
> year and we hadn't worked this out in advance of deploying it then I
> suggest the system should be taken offline until we work out what a
> sustainable system would look like within the budget we have, whether
> that be never transferring containers and build artifacts from the
> google network, just having local runner/build combos etc.

Yes, we could federate everything back out so everyone runs their own
builds and executes those. Tinderbox did something really similar to
that IIRC; not sure if Buildbot does as well. Probably rules out
pre-merge testing, mind.

The reason we hadn't worked everything out in advance of deploying is
because Mesa has had 3993 MRs in the little over a year since
moving, and a similar number in GStreamer, just taking the two biggest
users. At the start it was 'maybe let's use MRs if you want to but
make sure everything still goes through the list', and now it's
something different. Similarly the CI architecture hasn't been
'designed', so much as that people want to run dEQP and Piglit on
their hardware pre-merge in an open fashion that's actually accessible
to people, and have just done it.

Again, if you want everything to be centrally
designed/approved/monitored/controlled, that's a fine enough idea, and
I'd be happy to support whoever it was who was doing that for all of
fd.o.

Cheers,
Daniel


Re: gitlab.fd.o financial situation and impact on services

2020-02-28 Thread Daniel Stone
Hi Matt,

On Thu, 27 Feb 2020 at 23:45, Matt Turner  wrote:
> We're paying 75K USD for the bandwidth to transfer data from the
> GitLab cloud instance. i.e., for viewing the https site, for
> cloning/updating git repos, and for downloading CI artifacts/images to
> the testing machines (AFAIU).

I believe that in January, we had $2082 of network cost (almost
entirely egress; ingress is basically free) and $1750 of cloud-storage
cost (almost all of which was download). That's based on 16TB of
cloud-storage (CI artifacts, container images, file uploads, Git LFS)
egress and 17.9TB of other egress (the web service itself, repo
activity). Projecting that out gives us roughly $45k of network
activity alone, so it looks like this figure is based on a projected
increase of ~50%.

The actual compute capacity is closer to $1150/month.
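
For reference, the projection can be reproduced with a few lines of arithmetic. This is only a sketch that annualizes the January figures quoted above; the flat-usage default and the `growth` parameter are assumptions for illustration, not the board's actual model.

```python
# Annualize the January figures quoted above.
NETWORK_USD = 2082      # monthly network egress cost
STORAGE_USD = 1750      # monthly cloud-storage cost
COMPUTE_USD = 1150      # approximate monthly compute cost
MONTHS = 12


def annualize(monthly_usd, growth=0.0):
    """Project a monthly cost over a year, scaled by an overall growth factor.

    growth=0.0 means usage stays flat (an assumption of this sketch).
    """
    return monthly_usd * MONTHS * (1 + growth)


transfer_per_year = annualize(NETWORK_USD + STORAGE_USD)
total_per_year = annualize(NETWORK_USD + STORAGE_USD + COMPUTE_USD)
print(f"network + storage: ${transfer_per_year:,.0f}/year")
print(f"including compute: ${total_per_year:,.0f}/year")
```

Passing a non-zero `growth` (for example 0.5 for a 50% increase) shows how sensitive the yearly total is to continued growth in usage.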

> I was not aware that we were being charged for anything wrt GitLab
> hosting yet (and neither was anyone on my team at Intel that I've
> asked). This... kind of needs to be communicated.
>
> A consistent concern put forth when we were discussing switching to
> GitLab and building CI was... how do we pay for it. It felt like that
> concern was always handwaved away. I heard many times that if we
> needed more runners that we could just ask Google to spin up a few
> more. If we needed testing machines they'd be donated. No one
> mentioned that all the while we were paying for bandwidth... Perhaps
> people building the CI would make different decisions about its
> structure if they knew it was going to wipe out the bank account.

The original answer is that GitLab themselves offered to sponsor
enough credit on Google Cloud to get us started. They used GCP
themselves so they could assist us (me) in getting bootstrapped, which
was invaluable. After that, Google's open-source program office
offered to sponsor us for $30k/year, which was I believe last April.
Since then the service usage has increased roughly by a factor of 10,
so our 12-month sponsorship is no longer enough to cover 12 months.

> What percentage of the bandwidth is consumed by transferring CI
> images, etc? Wouldn't 75K USD be enough to buy all the testing
> machines we need and host them within Google or wherever so we don't
> need to pay for huge amounts of bandwidth?

Unless the Google Cloud Platform starts offering DragonBoards, it
wouldn't reduce our bandwidth usage as the corporate network is
treated separately for egress.

Cheers,
Daniel

