Re: [Mesa-dev] bad performance issue in GPU & CPU data sharing

2021-06-02 Thread Tapani Pälli

Hi;

On 5/31/21 12:33 PM, Zong, Wei wrote:

Hello,

I'm using GLES shaders to run algorithms on image frames, and I see very
poor performance when sharing data between the GPU and CPU, especially
when retrieving data from the GPU to the CPU.


Basically, I use glGenBuffers, glBindBuffer and glBufferData(target, size,
data, usage) to create a GPU buffer object and initialize its data store
from a CPU data pointer. After the GLES shader has finished processing, I
use glMapBufferRange to retrieve the processed image data back to the CPU,
and for some reason I have to do an extra data copy from the GL map pointer
to another CPU buffer, which is super slow.


Here's the code snippet:
https://github.com/intel/libxcam/blob/master/modules/gles/gl_buffer.cpp#L94
https://github.com/intel/libxcam/blob/master/modules/gles/gl_buffer.cpp#L127

I wonder if there is a more efficient way to share data between the CPU and
a GPU GLES shader?


Thanks,

Zong Wei



Could you break down the use case a bit: why do you need CPU access to the
image? If I understand correctly, the camera pipeline renders to a dmabuf,
which is then imported into GLES for processing, and then you map it to the
CPU for ... something?


Thanks;

// Tapani
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Marek Olšák
On Wed, Jun 2, 2021 at 2:48 PM Daniel Vetter  wrote:

> On Wed, Jun 02, 2021 at 05:38:51AM -0400, Marek Olšák wrote:
> > On Wed, Jun 2, 2021 at 5:34 AM Marek Olšák  wrote:
> >
> > > Yes, we can't break anything because we don't want to complicate things
> > > for us. It's pretty much all NAK'd already. We are trying to gather more
> > > knowledge and then make better decisions.
> > >
> > > The idea we are considering is that we'll expose memory-based sync objects
> > > to userspace for read only, and the kernel or hw will strictly control the
> > > memory writes to those sync objects. The hole in that idea is that
> > > userspace can decide not to signal a job, so even if userspace can't
> > > overwrite memory-based sync object states arbitrarily, it can still decide
> > > not to signal them, and then a future fence is born.
> > >
> >
> > This would actually be treated as a GPU hang caused by that context, so it
> > should be fine.
>
> This is practically what I proposed already, except you're not doing it with
> dma_fence. And on the memory fence side this also doesn't actually give
> what you want for that compute model.
>
> This seems like a bit of a worst-of-both-worlds approach to me? Tons of work
> in the kernel to hide these not-dma_fence-but-almost, and still a pain to
> actually drive the hardware like it should be for compute or direct
> display.
>
> Also maybe I've missed it, but I didn't see any replies to my suggestion
> how to fake the entire dma_fence stuff on top of new hw. Would be
> interesting to know what doesn't work there instead of amd folks going off
> into internal again and then coming back with another rfc from out of
> nowhere :-)
>

Going internal again is probably a good idea to spare you the long
discussions and not waste your time, but we haven't talked about the
dma_fence stuff internally other than acknowledging that it can be solved.

The compute use case already uses the hw as-is with no inter-process
sharing, which mostly keeps the kernel out of the picture. It uses glFinish
to sync with GL.

The gfx use case needs new hardware logic to support implicit and explicit
sync. When we propose a solution, it's usually torn apart the next day by
ourselves.

Since we are talking about next hw or next next hw, preemption should be
better.

user queue = user-mapped ring buffer

For implicit sync, we will only let userspace lock access to a buffer via a
user queue, which waits for the per-buffer sequence counter in memory to be
greater than or equal to the number assigned by the kernel, and later unlock
the access with another command, which increments the per-buffer sequence
counter in memory with atomic_inc regardless of the number assigned by the
kernel. The kernel counter and the counter in memory can be out-of-sync, and
I'll explain why it's OK. If a process increments the kernel counter but not
the memory counter, that's its problem and it's the same as a GPU hang caused
by that process. If a process increments the memory counter but not the
kernel counter, the ">=" condition alongside atomic_inc guarantees that
signaling n will signal n+1, so it will never deadlock but also it will
effectively disable synchronization. This method of disabling
synchronization is similar to a process corrupting the buffer, which should
be fine. Can you find any flaw in it? I can't find any.
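
A toy model may make the two out-of-sync cases above easier to follow. This
is only a sketch in Python, not driver code: BufferSync, kernel_assign,
can_lock and unlock are hypothetical names, and it assumes the kernel hands
out consecutive numbers starting at 0. On real hardware the wait and the
atomic increment would be commands executed from the user queue.

```python
class BufferSync:
    """Toy per-buffer sync state: one counter in memory, one in the kernel."""

    def __init__(self):
        self.mem_counter = 0     # only changed by unlock commands (atomic_inc)
        self.kernel_counter = 0  # next sequence number the kernel hands out

    def kernel_assign(self):
        """Kernel side: return the value this job must wait for, then advance."""
        wait_value = self.kernel_counter
        self.kernel_counter += 1
        return wait_value

    def can_lock(self, wait_value):
        """Wait command: passes once mem_counter >= wait_value."""
        return self.mem_counter >= wait_value

    def unlock(self):
        """Unlock command: increment regardless of the kernel-assigned number."""
        self.mem_counter += 1


def run_in_order(n_jobs):
    """Well-behaved case: every job locks, works, then unlocks."""
    buf = BufferSync()
    waits = [buf.kernel_assign() for _ in range(n_jobs)]
    order = []
    for job, wait_value in enumerate(waits):
        assert buf.can_lock(wait_value)
        order.append(job)
        buf.unlock()
    return order, buf.mem_counter


def missing_unlock_blocks_next():
    """Kernel counter advanced, memory counter not: the next job never passes."""
    buf = BufferSync()
    first = buf.kernel_assign()
    second = buf.kernel_assign()
    assert buf.can_lock(first)
    # The first job never issues its unlock command.
    return buf.can_lock(second)


def extra_unlock_disables_sync():
    """Memory counter advanced without a kernel number: waits pass too early."""
    buf = BufferSync()
    buf.unlock()  # stray increment
    first = buf.kernel_assign()
    second = buf.kernel_assign()
    # The second job can lock although the first one has not unlocked yet.
    return buf.can_lock(first), buf.can_lock(second)
```

In this model the missing unlock leaves the second job stuck, which
corresponds to the "same as a GPU hang" case, while the stray increment lets
both jobs through at once: nothing deadlocks, but the buffer is no longer
synchronized.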

The explicit submit can be done by userspace (if there is no
synchronization), but we plan to use the kernel to do it for implicit sync.
Essentially, the kernel will receive a buffer list and addresses of wait
commands in the user queue. It will assign new sequence numbers to all
buffers and write those numbers into the wait commands, and ring the hw
doorbell to start execution of that queue.
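
The kernel-side submit step described above could be sketched roughly as
follows. Everything here is hypothetical illustration in Python: the
UserQueue and Kernel classes, the tuple encoding of commands, and the
doorbell counter are assumptions made for the example, not an existing
interface.

```python
from dataclasses import dataclass, field


@dataclass
class UserQueue:
    """Toy user-mapped ring buffer: a list of command tuples."""
    commands: list = field(default_factory=list)
    doorbell_rings: int = 0


class Kernel:
    def __init__(self):
        self.next_seq = {}  # buffer id -> next sequence number to assign

    def submit(self, queue, buffers, wait_slots):
        """Patch one wait command per buffer, then ring the doorbell.

        buffers[i] is guarded by the wait command at
        queue.commands[wait_slots[i]].
        """
        if len(buffers) != len(wait_slots):
            raise ValueError("need exactly one wait command per buffer")
        for buf, slot in zip(buffers, wait_slots):
            cmd = queue.commands[slot]
            if cmd[0] != "wait" or cmd[1] != buf:
                raise ValueError("slot %d is not a wait on %r" % (slot, buf))
            seq = self.next_seq.get(buf, 0)
            self.next_seq[buf] = seq + 1
            # Write the assigned number into the wait command in the queue.
            queue.commands[slot] = ("wait", buf, seq)
        queue.doorbell_rings += 1


def example():
    kernel = Kernel()
    first = UserQueue(commands=[
        ("wait", "A", None), ("wait", "B", None),
        ("draw",),
        ("unlock", "A"), ("unlock", "B"),
    ])
    second = UserQueue(commands=[("wait", "A", None), ("draw",), ("unlock", "A")])
    kernel.submit(first, ["A", "B"], [0, 1])
    kernel.submit(second, ["A"], [0])
    return first, second
```

The second submission receives the next number for buffer A, so its wait
only passes after the first queue's unlock command has incremented that
buffer's counter in memory.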

Marek


[Mesa-dev] [ANNOUNCE] mesa 21.1.2

2021-06-02 Thread Eric Engestrom
Hello everyone,

The second bugfix release for the 21.1 branch is now available, containing
mostly AMD and Intel changes as usual, but also a decent amount of ARM
fixes and more.

The next bugfix release is scheduled for two weeks from now, on June 16th.

Cheers,
Eric

---

Adam Jackson (1):
  zink/ntv: Don't call free() on ralloc'd memory

Alyssa Rosenzweig (3):
  panfrost: Fix the reads_dest prototype
  panfrost: Fix is_opaque prototype
  panfrost: Increase tiler_heap max allocation to 64MB

Anuj Phogat (1):
  intel/gfx12+: Add Wa_14013840143

Charmaine Lee (1):
  svga: fix texture rectangle sampling when no sampler view declaration is defined

Emma Anholt (2):
  i915g: Disable 3D-pipeline clears.
  i915g: Add support for the .Absolute flag on TGSI srcs.

Eric Anholt (1):
  i915g: Stop advertising support for indirect addressing in the FS.

Eric Engestrom (9):
  .pick_status.json: Update to 17861aff9614abfea3b8a8f111a114b26b351915
  pick-ui & .pick_status.json: rename `master_sha` to `main_sha`
  .pick_status.json: Update to b663c544177e9547793ee405887f0d41c50e6d1d
  .pick_status.json: Update to 507e8907af913ab7b89211240568b8002b3475f1
  .pick_status.json: Update to 3179daf61393ee8a0fac943b94335b114e34873b
  .pick_status.json: Update to 761383720617b46617bd278ec6015c9520f43f5c
  .pick_status.json: Update to 1199d86b238a101e63bdf9b60a7391f96092
  docs: add release notes for 21.1.2
  VERSION: bump for 21.1.2

Erik Faye-Lund (2):
  zink: use actual const for const offset
  util/prim_restart: revert part of bad fix

Erik Kurzinger (1):
  vulkan/device_select: avoid segfault on Wayland if wl_drm is unavailable

Georg Lehmann (1):
  radv: Fix compatible image handle type for dmabufs.

Ian Romanick (2):
  nir/algebraic: Remove some optimizations of comparisons with fsat
  nir/algebraic: Invert comparisons less often

Icecream95 (1):
  panfrost: Fix polygon list size computations

Italo Nicola (1):
  panfrost: fix GL_EXT_multisampled_render_to_texture regression

Jason Ekstrand (3):
  anv: Plumb the shader into push constant helpers
  anv: Support pushing shader constants
  intel/vec4: Don't spill fp64 registers more than once

José Fonseca (1):
  draw: Allocate extra padding for extra shader outputs.

Juan A. Suarez Romero (1):
  vc4: initialize array

Kenneth Graunke (2):
  i965: Don't advertise Y-tiled modifiers for scanout buffers on Gfx8-
  iris: Don't advertise Y-tiled modifiers for scanout buffers on Gfx8

Marek Olšák (3):
  ac/gpu_info: set has_zero_index_buffer_bug for Navi12 too
  radeonsi: add a gfx10 hw bug workaround with the barrier before gs_alloc_req
  radeonsi: disable DFSM on gfx9 by default because it decreases performance a lot

Mike Blumenkrantz (4):
  util/prim_restart: fix util_translate_prim_restart_ib
  aux/vbuf: prevent uint underflow and assert if no vbs are dirty
  aux/trace: fix set_inlinable_constants hook
  zink: remove weird lod hack for texturing

Nanley Chery (2):
  anv,iris: Port the D16 workaround stalls to BLORP
  intel/isl: Fix HiZ+CCS comment about ambiguates

Neha Bhende (2):
  svga: Add target and sampler_return_type info into shader key
  svga: Use shader_key info to declare resources if TGSI shader is missing it

Rhys Perry (3):
  aco: disallow SGPRs on DPP instructions
  radv: add radv_absolute_depth_bias
  radv: workaround incorrect depthBiasConstantFactor by Path of Exile

Robert Tarasov (1):
  iris: Check data alignment for copy_mem_mem

Samuel Pitoiset (4):
  aco: fix derivatives/intrinsics with SGPR sources
  radv: fix fast clearing DCC if one level can't be compressed on GFX10+
  aco: fix emitting discard when the program just ends
  radv: enable RADV_DEBUG=invariantgeom for Monster Hunter World

SureshGuttula (1):
  frontends/va/picture:Fix wrong reallocation even surface is protected

cheyang (1):
  virgl:Fix the leak of hw_res used as fence

git tag: mesa-21.1.2

https://mesa.freedesktop.org/archive/mesa-21.1.2.tar.xz
SHA256: 23b4b63760561f3a4f98b5be12c6de621e9a6bdf355e087a83d9184cd4e2825f  mesa-21.1.2.tar.xz
SHA512: a7907fa29fdb4e137015ee5405b9c8c0769ef9354bbe963c1af80318b398c05c79db6129b583106d620c42a5e9b625611b648fd5207334eb9b588d7963defc70  mesa-21.1.2.tar.xz
PGP:  https://mesa.freedesktop.org/archive/mesa-21.1.2.tar.xz.sig





Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Daniel Vetter
On Wed, Jun 02, 2021 at 10:09:01AM +0200, Michel Dänzer wrote:
> On 2021-06-01 12:49 p.m., Michel Dänzer wrote:
> > On 2021-06-01 12:21 p.m., Christian König wrote:
> > 
> >> Another question is if that is sufficient as security for the display 
> >> server or if we need further handling down the road? I mean essentially we 
> >> are moving the reliability problem into the display server.
> > 
> > Good question. This should generally protect the display server from 
> > freezing due to client fences never signalling, but there might still be 
> > corner cases.
> 
> E.g. a client might be able to sneak in a fence between when the
> compositor checks fences and when it submits its drawing to the kernel.

This is why implicit sync should be handled with explicit IPC. You pick
the fence up once, and then you need to tell your GL stack to _not_ do
implicit sync. Would need a new extension. vk afaiui does this
automatically already.
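
The race mentioned above, and why picking the fence up once avoids it, can
be illustrated with a small toy model. This is a hedged Python sketch only:
Fence, SharedBuffer and the two helper functions are made-up names, and real
implicit sync happens inside the kernel rather than in lists like these.

```python
class Fence:
    def __init__(self, signaled):
        self.signaled = signaled


class SharedBuffer:
    """Toy stand-in for a buffer with implicitly attached fences."""

    def __init__(self):
        self.fences = []


def blockers_with_implicit_sync(buf):
    # Implicit sync: at submit time the job depends on whatever fences are
    # attached to the buffer right now, including ones added after the check.
    return [f for f in buf.fences if not f.signaled]


def blockers_with_snapshot(snapshot):
    # Explicit handling: the job depends only on the fences picked up once.
    return [f for f in snapshot if not f.signaled]


def demo():
    buf = SharedBuffer()
    buf.fences.append(Fence(signaled=True))    # client's finished frame

    snapshot = list(buf.fences)                # fence picked up once
    assert all(f.signaled for f in snapshot)   # compositor's check passes

    buf.fences.append(Fence(signaled=False))   # client sneaks in new work

    return (len(blockers_with_implicit_sync(buf)),
            len(blockers_with_snapshot(snapshot)))
```

In this model the implicit path ends up depending on one unsignaled fence
that was added after the compositor's check, while the snapshot path depends
on none.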
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Daniel Vetter
On Wed, Jun 02, 2021 at 08:52:38PM +0200, Christian König wrote:
> 
> 
> On 02.06.21 at 20:48, Daniel Vetter wrote:
> > On Wed, Jun 02, 2021 at 05:38:51AM -0400, Marek Olšák wrote:
> > > On Wed, Jun 2, 2021 at 5:34 AM Marek Olšák wrote:
> > >
> > > > Yes, we can't break anything because we don't want to complicate things
> > > > for us. It's pretty much all NAK'd already. We are trying to gather more
> > > > knowledge and then make better decisions.
> > > >
> > > > The idea we are considering is that we'll expose memory-based sync objects
> > > > to userspace for read only, and the kernel or hw will strictly control the
> > > > memory writes to those sync objects. The hole in that idea is that
> > > > userspace can decide not to signal a job, so even if userspace can't
> > > > overwrite memory-based sync object states arbitrarily, it can still decide
> > > > not to signal them, and then a future fence is born.
> > > >
> > > This would actually be treated as a GPU hang caused by that context, so it
> > > should be fine.
> > This is practically what I proposed already, except you're not doing it with
> > dma_fence. And on the memory fence side this also doesn't actually give
> > what you want for that compute model.
> >
> > This seems like a bit of a worst-of-both-worlds approach to me? Tons of work
> > in the kernel to hide these not-dma_fence-but-almost, and still a pain to
> > actually drive the hardware like it should be for compute or direct
> > display.
> >
> > Also maybe I've missed it, but I didn't see any replies to my suggestion
> > how to fake the entire dma_fence stuff on top of new hw. Would be
> > interesting to know what doesn't work there instead of amd folks going off
> > into internal again and then coming back with another rfc from out of
> > nowhere :-)
> 
> Well to be honest I would just push back on our hardware/firmware guys that
> we need to keep kernel queues forever before going down that route.

I looked again, and you said the model won't work because preemption is way
too slow, even when the context is idle.

I guess at that point I got maybe too fed up and just figured "not my
problem", but if preempt is too slow as the unload fence, you can do it
with pte removal and tlb shootdown too (that is hopefully not too slow,
otherwise your hw is just garbage and won't even be fast for direct submit
compute workloads).

The only thing that you need to do when you use pte clearing + tlb
shootdown instead of preemption as the unload fence for buffers that get
moved is that if you get any gpu page fault, you don't serve that, but
instead treat it as a tdr and shoot the context permanently.

So summarizing the model I proposed:

- you allow userspace to directly write into the ringbuffer, and also
  write the fences directly

- actual submit is done by the kernel, using drm/scheduler. The kernel
  blindly trusts userspace to set up everything else, and even just wraps
  dma_fences around the userspace memory fences.

- the only check is tdr. If a fence doesn't complete, a tdr fires: a) the
  kernel shoots the entire context and b) userspace recovers by setting up a
  new ringbuffer

- memory management is done using ttm only, you still need to supply the
  buffer list (ofc that list includes the always present ones, so CS will
  only get the list of special buffers like today). If your hw can't trun
  gpu page faults and you ever get one we pull up the same old solution:
  Kernel shoots the entire context.

  The important thing is that from the gpu pov memory management works
  exactly like compute workload with direct submit, except that you just
  terminate the context on _any_ page fault, instead of only those that go
  somewhere where there's really no mapping and repair the others.

  Also I guess from reading the old thread this means you'd disable page
  fault retry because that is apparently also way too slow for anything.

- memory management uses an unload fence. That unload fence waits for all
  userspace memory fences (represented as dma_fence) to complete, with
  maybe some fudge to busy-spin until we've reached the actual end of the
  ringbuffer (maybe you have an IB tail there after the memory fence write,
  we have that on intel hw), and it waits for the memory to get
  "unloaded". This is either preemption, or pte clearing + tlb shootdown,
  or whatever else your hw provides which is a) used for dynamic memory
  management b) fast enough for actual memory management.

- any time a context dies we force-complete all its pending fences,
  in-order ofc

So from hw pov this looks 99% like direct userspace submit, with the exact
same mappings, command sequences and everything else. The only difference
is that the ringbuffer head/tail updates happen from drm/scheduler, instead
of directly from userspace.

None of this stuff needs funny tricks where the kernel controls the
writes to memory fences, or where you need kernel ringbuffers, or anything
like this.
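
The fence handling in the model summarized above (userspace writes the
memory fences, the kernel wraps them, and a tdr or page fault shoots the
context and force-completes what is left) can be sketched as a toy state
machine. This is a Python illustration under loud assumptions: Fence,
Context, retire, tdr and page_fault are hypothetical names, and timeouts,
ttm and drm/scheduler are not modelled at all.

```python
class Fence:
    """Kernel-side wrapper around a userspace memory fence value."""

    def __init__(self, seqno):
        self.seqno = seqno
        self.signaled = False
        self.error = False  # set when the fence was force-completed


class Context:
    def __init__(self):
        self.alive = True
        self.pending = []         # wrapped fences, in submission order
        self.next_seqno = 1
        self.completed_seqno = 0  # memory fence value written by userspace

    def submit(self):
        if not self.alive:
            raise RuntimeError("context was shot; set up a new ringbuffer")
        fence = Fence(self.next_seqno)
        self.next_seqno += 1
        self.pending.append(fence)
        return fence

    def userspace_signal(self, seqno):
        # Userspace (or its command stream) writes the memory fence directly.
        self.completed_seqno = max(self.completed_seqno, seqno)


def retire(ctx):
    """Signal wrapped fences whose memory fence value has been reached."""
    while ctx.pending and ctx.pending[0].seqno <= ctx.completed_seqno:
        ctx.pending.pop(0).signaled = True


def tdr(ctx):
    """Timeout: shoot the context and force-complete its fences in order."""
    ctx.alive = False
    forced = []
    while ctx.pending:
        fence = ctx.pending.pop(0)
        fence.signaled = True
        fence.error = True
        forced.append(fence.seqno)
    return forced


def page_fault(ctx):
    """In this model any GPU page fault is handled exactly like a tdr."""
    return tdr(ctx)
```

A context that signals its first fence and then stalls would have the
remaining fences force-completed in submission order, after which further
submits on that context fail and userspace has to create a new one.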

Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Christian König



On 02.06.21 at 20:48, Daniel Vetter wrote:

On Wed, Jun 02, 2021 at 05:38:51AM -0400, Marek Olšák wrote:

On Wed, Jun 2, 2021 at 5:34 AM Marek Olšák  wrote:


Yes, we can't break anything because we don't want to complicate things
for us. It's pretty much all NAK'd already. We are trying to gather more
knowledge and then make better decisions.

The idea we are considering is that we'll expose memory-based sync objects
to userspace for read only, and the kernel or hw will strictly control the
memory writes to those sync objects. The hole in that idea is that
userspace can decide not to signal a job, so even if userspace can't
overwrite memory-based sync object states arbitrarily, it can still decide
not to signal them, and then a future fence is born.


This would actually be treated as a GPU hang caused by that context, so it
should be fine.

This is practically what I proposed already, except you're not doing it with
dma_fence. And on the memory fence side this also doesn't actually give
what you want for that compute model.

This seems like a bit of a worst-of-both-worlds approach to me? Tons of work
in the kernel to hide these not-dma_fence-but-almost, and still a pain to
actually drive the hardware like it should be for compute or direct
display.

Also maybe I've missed it, but I didn't see any replies to my suggestion
how to fake the entire dma_fence stuff on top of new hw. Would be
interesting to know what doesn't work there instead of amd folks going off
into internal again and then coming back with another rfc from out of
nowhere :-)


Well to be honest I would just push back on our hardware/firmware guys 
that we need to keep kernel queues forever before going down that route.


That syncfile and all that Android stuff isn't working out of the box 
with the new shiny user queue submission model (which in turn is mostly 
because of Windows) already raised some eyebrows here.


Christian.


-Daniel




Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Daniel Vetter
On Wed, Jun 02, 2021 at 05:38:51AM -0400, Marek Olšák wrote:
> On Wed, Jun 2, 2021 at 5:34 AM Marek Olšák  wrote:
> 
> > Yes, we can't break anything because we don't want to complicate things
> > for us. It's pretty much all NAK'd already. We are trying to gather more
> > knowledge and then make better decisions.
> >
> > The idea we are considering is that we'll expose memory-based sync objects
> > to userspace for read only, and the kernel or hw will strictly control the
> > memory writes to those sync objects. The hole in that idea is that
> > userspace can decide not to signal a job, so even if userspace can't
> > overwrite memory-based sync object states arbitrarily, it can still decide
> > not to signal them, and then a future fence is born.
> >
> 
> This would actually be treated as a GPU hang caused by that context, so it
> should be fine.

This is practically what I proposed already, except you're not doing it with
dma_fence. And on the memory fence side this also doesn't actually give
what you want for that compute model.

This seems like a bit of a worst-of-both-worlds approach to me? Tons of work
in the kernel to hide these not-dma_fence-but-almost, and still a pain to
actually drive the hardware like it should be for compute or direct
display.

Also maybe I've missed it, but I didn't see any replies to my suggestion
how to fake the entire dma_fence stuff on top of new hw. Would be
interesting to know what doesn't work there instead of amd folks going off
into internal again and then coming back with another rfc from out of
nowhere :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Christian König

On 02.06.21 at 11:58, Marek Olšák wrote:
On Wed, Jun 2, 2021 at 5:44 AM Christian König wrote:


On 02.06.21 at 10:57, Daniel Stone wrote:
> Hi Christian,
>
> On Tue, 1 Jun 2021 at 13:51, Christian König
> <ckoenig.leichtzumer...@gmail.com> wrote:
>> On 01.06.21 at 14:30, Daniel Vetter wrote:
>>> If you want to enable this use-case with driver magic and without the
>>> compositor being aware of what's going on, the solution is EGLStreams.
>>> Not sure we want to go there, but it's definitely a lot more feasible
>>> than trying to stuff eglstreams semantics into dma-buf implicit
>>> fencing support in a desperate attempt to not change compositors.
>> Well not changing compositors is certainly not something I would try
>> with this use case.
>>
>> Not changing compositors is more like ok we have Ubuntu 20.04 and need
>> to support that with the newest hardware generation.
> Serious question, have you talked to Canonical?
>
> I mean there's a hell of a lot of effort being expended here, but it
> seems to all be predicated on the assumption that Ubuntu's LTS
> HWE/backport policy is totally immutable, and that we might need to
> make the kernel do backflips to work around that. But ... is it? Has
> anyone actually asked them how they feel about this?

This was merely an example. What I wanted to say is that we need to
support systems already deployed.

In other words our customers won't accept that they need to replace the
compositor just because they switch to a new hardware generation.

> I mean, my answer to the first email is 'no, absolutely not' from the
> technical perspective (the initial proposal totally breaks current and
> future userspace), from a design perspective (it breaks a lot of
> usecases which aren't single-vendor GPU+display+codec, or aren't just
> a simple desktop), and from a sustainability perspective (cutting
> Android adrift again isn't acceptable collateral damage to make it
> easier to backport things to last year's Ubuntu release).
>
> But then again, I don't even know what I'm NAKing here ... ? The
> original email just lists a proposal to break a ton of things, with
> proposed replacements which aren't technically viable, and it's not
> clear why? Can we please have some more details and some reasoning
> behind them?
>
> I don't mind that userspace (compositor, protocols, clients like Mesa
> as well as codec APIs) need to do a lot of work to support the new
> model. I do really care though that the hard-binary-switch model works
> fine enough for AMD but totally breaks heterogeneous systems and makes
> it impossible for userspace to do the right thing.

Well, what the handling for new Android, distributions etc. is going to
look like is a completely different story.

And I completely agree with both Daniel Vetter and you that we need to
keep this in mind when designing the compatibility with older software.

For Android I'm really not sure what to do. In general Android is
already trying to do the right thing by using explicit sync, the problem
is that this is built around the idea that this explicit sync is
syncfile kernel based.

Either we need to change Android and come up with something that works
with user fences as well or we somehow invent a compatibility layer for
syncfile as well.


What's the issue with syncfiles that syncobjs don't suffer from?


Syncobjs were designed with future fences in mind. In other words we
already have the ability to wait for a future submission to appear with
all the nasty locking implications.


Syncfiles, on the other hand, are just a container for up to N kernel
fences, and since we don't have kernel fences any more, that is rather
tricky to keep working.
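
To illustrate the difference, here is a deliberately simplified Python
model. It is an assumption-laden sketch: the class and method names are
invented for the example and do not correspond to the kernel uAPI. The only
point is that a syncfile-like container needs its fences to exist up front,
while a syncobj-like container can be waited on before any fence has been
attached.

```python
class Fence:
    def __init__(self):
        self.signaled = False


class ToySyncFile:
    """Container for fences that must already exist when it is created."""

    def __init__(self, fences):
        fences = list(fences)
        if not fences:
            raise ValueError("a sync file needs at least one existing fence")
        self.fences = fences

    def status(self):
        done = all(f.signaled for f in self.fences)
        return "signaled" if done else "waiting for fence"


class ToySyncObj:
    """Container whose fence may only show up with a future submission."""

    def __init__(self):
        self.fence = None

    def attach(self, fence):
        self.fence = fence

    def status(self, wait_for_submit):
        if self.fence is None:
            # Without a fence the wait either blocks until a submission
            # appears or fails straight away, depending on the caller.
            return "waiting for submit" if wait_for_submit else "no fence yet"
        return "signaled" if self.fence.signaled else "waiting for fence"
```

With user fences there may be no kernel fence to put into the first kind of
container at all, which is the part that is tricky to keep working.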


Going to look into the uAPI around syncfiles once more and see if we can 
somehow use that for user fences as well.


Christian.



Marek




Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Marek Olšák
On Wed, Jun 2, 2021 at 5:44 AM Christian König <
ckoenig.leichtzumer...@gmail.com> wrote:

> On 02.06.21 at 10:57, Daniel Stone wrote:
> > Hi Christian,
> >
> > On Tue, 1 Jun 2021 at 13:51, Christian König wrote:
> >> On 01.06.21 at 14:30, Daniel Vetter wrote:
> >>> If you want to enable this use-case with driver magic and without the
> >>> compositor being aware of what's going on, the solution is EGLStreams.
> >>> Not sure we want to go there, but it's definitely a lot more feasible
> >>> than trying to stuff eglstreams semantics into dma-buf implicit
> >>> fencing support in a desperate attempt to not change compositors.
> >> Well not changing compositors is certainly not something I would try
> >> with this use case.
> >>
> >> Not changing compositors is more like ok we have Ubuntu 20.04 and need
> >> to support that with the newest hardware generation.
> > Serious question, have you talked to Canonical?
> >
> > I mean there's a hell of a lot of effort being expended here, but it
> > seems to all be predicated on the assumption that Ubuntu's LTS
> > HWE/backport policy is totally immutable, and that we might need to
> > make the kernel do backflips to work around that. But ... is it? Has
> > anyone actually asked them how they feel about this?
>
> This was merely an example. What I wanted to say is that we need to
> support systems already deployed.
>
> In other words our customers won't accept that they need to replace the
> compositor just because they switch to a new hardware generation.
>
> > I mean, my answer to the first email is 'no, absolutely not' from the
> > technical perspective (the initial proposal totally breaks current and
> > future userspace), from a design perspective (it breaks a lot of
> > usecases which aren't single-vendor GPU+display+codec, or aren't just
> > a simple desktop), and from a sustainability perspective (cutting
> > Android adrift again isn't acceptable collateral damage to make it
> > easier to backport things to last year's Ubuntu release).
> >
> > But then again, I don't even know what I'm NAKing here ... ? The
> > original email just lists a proposal to break a ton of things, with
> > proposed replacements which aren't technically viable, and it's not
> > clear why? Can we please have some more details and some reasoning
> > behind them?
> >
> > I don't mind that userspace (compositor, protocols, clients like Mesa
> > as well as codec APIs) need to do a lot of work to support the new
> > model. I do really care though that the hard-binary-switch model works
> > fine enough for AMD but totally breaks heterogeneous systems and makes
> > it impossible for userspace to do the right thing.
>
> Well, what the handling for new Android, distributions etc. is going to
> look like is a completely different story.
>
> And I completely agree with both Daniel Vetter and you that we need to
> keep this in mind when designing the compatibility with older software.
>
> For Android I'm really not sure what to do. In general Android is
> already trying to do the right thing by using explicit sync, the problem
> is that this is built around the idea that this explicit sync is
> syncfile kernel based.
>
> Either we need to change Android and come up with something that works
> with user fences as well or we somehow invent a compatibility layer for
> syncfile as well.
>

What's the issue with syncfiles that syncobjs don't suffer from?

Marek


Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Christian König

On 02.06.21 at 10:57, Daniel Stone wrote:

Hi Christian,

On Tue, 1 Jun 2021 at 13:51, Christian König wrote:

On 01.06.21 at 14:30, Daniel Vetter wrote:

If you want to enable this use-case with driver magic and without the
compositor being aware of what's going on, the solution is EGLStreams.
Not sure we want to go there, but it's definitely a lot more feasible
than trying to stuff eglstreams semantics into dma-buf implicit
fencing support in a desperate attempt to not change compositors.

Well not changing compositors is certainly not something I would try
with this use case.

Not changing compositors is more like ok we have Ubuntu 20.04 and need
to support that with the newest hardware generation.

Serious question, have you talked to Canonical?

I mean there's a hell of a lot of effort being expended here, but it
seems to all be predicated on the assumption that Ubuntu's LTS
HWE/backport policy is totally immutable, and that we might need to
make the kernel do backflips to work around that. But ... is it? Has
anyone actually asked them how they feel about this?


This was merely an example. What I wanted to say is that we need to
support systems already deployed.


In other words our customers won't accept that they need to replace the 
compositor just because they switch to a new hardware generation.



I mean, my answer to the first email is 'no, absolutely not' from the
technical perspective (the initial proposal totally breaks current and
future userspace), from a design perspective (it breaks a lot of
usecases which aren't single-vendor GPU+display+codec, or aren't just
a simple desktop), and from a sustainability perspective (cutting
Android adrift again isn't acceptable collateral damage to make it
easier to backport things to last year's Ubuntu release).

But then again, I don't even know what I'm NAKing here ... ? The
original email just lists a proposal to break a ton of things, with
proposed replacements which aren't technically viable, and it's not
clear why? Can we please have some more details and some reasoning
behind them?

I don't mind that userspace (compositor, protocols, clients like Mesa
as well as codec APIs) need to do a lot of work to support the new
model. I do really care though that the hard-binary-switch model works
fine enough for AMD but totally breaks heterogeneous systems and makes
it impossible for userspace to do the right thing.


Well, what the handling for new Android, distributions etc. is going to
look like is a completely different story.


And I completely agree with both Daniel Vetter and you that we need to
keep this in mind when designing the compatibility with older software.


For Android I'm really not sure what to do. In general Android is
already trying to do the right thing by using explicit sync, the problem
is that this is built around the idea that this explicit sync is
syncfile kernel based.


Either we need to change Android and come up with something that works 
with user fences as well or we somehow invent a compatibility layer for 
syncfile as well.


Regards,
Christian.



Cheers,
Daniel




Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Marek Olšák
On Wed, Jun 2, 2021 at 5:34 AM Marek Olšák  wrote:

> Yes, we can't break anything because we don't want to complicate things
> for us. It's pretty much all NAK'd already. We are trying to gather more
> knowledge and then make better decisions.
>
> The idea we are considering is that we'll expose memory-based sync objects
> to userspace for read only, and the kernel or hw will strictly control the
> memory writes to those sync objects. The hole in that idea is that
> userspace can decide not to signal a job, so even if userspace can't
> overwrite memory-based sync object states arbitrarily, it can still decide
> not to signal them, and then a future fence is born.
>

This would actually be treated as a GPU hang caused by that context, so it
should be fine.
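
To make the idea concrete, here is a minimal sketch in Python, not kernel or driver code. It assumes the sync object is a monotonically increasing counter and that a timeout is what turns a never-signaled job into a hang; neither detail is spelled out in this thread. The names MemorySyncObject, kernel_signal and wait_or_declare_hang are hypothetical.

```python
import time


class MemorySyncObject:
    """Toy model of a memory-based sync object (assumed: a counter)."""

    def __init__(self):
        # Only the "kernel/hw" side is meant to write this value.
        # Python cannot enforce that; a real design would map the
        # memory read-only into userspace.
        self._value = 0

    def read(self):
        # The read-only view that userspace would get.
        return self._value

    def kernel_signal(self, value):
        # The kernel/hw side advances the counter; it never goes backwards.
        if value > self._value:
            self._value = value


def wait_or_declare_hang(sync, wait_value, timeout_s, poll_s=0.001):
    """Poll until the counter reaches wait_value or the timeout expires.

    Returns "signaled" on success and "hang" otherwise, modelling the
    case where a job that is never signaled is handled like a GPU hang
    of the submitting context.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if sync.read() >= wait_value:
            return "signaled"
        time.sleep(poll_s)
    # One last check so a signal that lands right at the deadline counts.
    return "signaled" if sync.read() >= wait_value else "hang"
```

In this model a waiter never blocks forever: either the counter advances or the wait ends with "hang", which is where the offending context would be reset.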

Marek


Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Marek Olšák
Yes, we can't break anything because we don't want to complicate things for
us. It's pretty much all NAK'd already. We are trying to gather more
knowledge and then make better decisions.

The idea we are considering is that we'll expose memory-based sync objects
to userspace for read only, and the kernel or hw will strictly control the
memory writes to those sync objects. The hole in that idea is that
userspace can decide not to signal a job, so even if userspace can't
overwrite memory-based sync object states arbitrarily, it can still decide
not to signal them, and then a future fence is born.

Marek

On Wed, Jun 2, 2021 at 4:57 AM Daniel Stone  wrote:

> Hi Christian,
>
> On Tue, 1 Jun 2021 at 13:51, Christian König
>  wrote:
> > Am 01.06.21 um 14:30 schrieb Daniel Vetter:
> > > If you want to enable this use-case with driver magic and without the
> > > compositor being aware of what's going on, the solution is EGLStreams.
> > > Not sure we want to go there, but it's definitely a lot more feasible
> > > than trying to stuff eglstreams semantics into dma-buf implicit
> > > fencing support in a desperate attempt to not change compositors.
> >
> > Well not changing compositors is certainly not something I would try
> > with this use case.
> >
> > Not changing compositors is more like ok we have Ubuntu 20.04 and need
> > to support that with the newest hardware generation.
>
> Serious question, have you talked to Canonical?
>
> I mean there's a hell of a lot of effort being expended here, but it
> seems to all be predicated on the assumption that Ubuntu's LTS
> HWE/backport policy is totally immutable, and that we might need to
> make the kernel do backflips to work around that. But ... is it? Has
> anyone actually asked them how they feel about this?
>
> I mean, my answer to the first email is 'no, absolutely not' from the
> technical perspective (the initial proposal totally breaks current and
> future userspace), from a design perspective (it breaks a lot of
> usecases which aren't single-vendor GPU+display+codec, or aren't just
> a simple desktop), and from a sustainability perspective (cutting
> Android adrift again isn't acceptable collateral damage to make it
> easier to backport things to last year's Ubuntu release).
>
> But then again, I don't even know what I'm NAKing here ... ? The
> original email just lists a proposal to break a ton of things, with
> proposed replacements which aren't technically viable, and it's not
> clear why? Can we please have some more details and some reasoning
> behind them?
>
> I don't mind that userspace (compositor, protocols, clients like Mesa
> as well as codec APIs) need to do a lot of work to support the new
> model. I do really care though that the hard-binary-switch model works
> fine enough for AMD but totally breaks heterogeneous systems and makes
> it impossible for userspace to do the right thing.
>
> Cheers,
> Daniel
>


Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Daniel Stone
Hi Christian,

On Tue, 1 Jun 2021 at 13:51, Christian König
 wrote:
> Am 01.06.21 um 14:30 schrieb Daniel Vetter:
> > If you want to enable this use-case with driver magic and without the
> > compositor being aware of what's going on, the solution is EGLStreams.
> > Not sure we want to go there, but it's definitely a lot more feasible
> > than trying to stuff eglstreams semantics into dma-buf implicit
> > fencing support in a desperate attempt to not change compositors.
>
> Well not changing compositors is certainly not something I would try
> with this use case.
>
> Not changing compositors is more like ok we have Ubuntu 20.04 and need
> to support that with the newest hardware generation.

Serious question, have you talked to Canonical?

I mean there's a hell of a lot of effort being expended here, but it
seems to all be predicated on the assumption that Ubuntu's LTS
HWE/backport policy is totally immutable, and that we might need to
make the kernel do backflips to work around that. But ... is it? Has
anyone actually asked them how they feel about this?

I mean, my answer to the first email is 'no, absolutely not' from the
technical perspective (the initial proposal totally breaks current and
future userspace), from a design perspective (it breaks a lot of
usecases which aren't single-vendor GPU+display+codec, or aren't just
a simple desktop), and from a sustainability perspective (cutting
Android adrift again isn't acceptable collateral damage to make it
easier to backport things to last year's Ubuntu release).

But then again, I don't even know what I'm NAKing here ... ? The
original email just lists a proposal to break a ton of things, with
proposed replacements which aren't technically viable, and it's not
clear why? Can we please have some more details and some reasoning
behind them?

I don't mind that userspace (compositor, protocols, clients like Mesa
as well as codec APIs) need to do a lot of work to support the new
model. I do really care though that the hard-binary-switch model works
fine enough for AMD but totally breaks heterogeneous systems and makes
it impossible for userspace to do the right thing.

Cheers,
Daniel


Re: [Mesa-dev] Linux Graphics Next: Userspace submission update

2021-06-02 Thread Michel Dänzer
On 2021-06-01 12:49 p.m., Michel Dänzer wrote:
> On 2021-06-01 12:21 p.m., Christian König wrote:
> 
>> Another question is if that is sufficient as security for the display server 
>> or if we need further handling down the road? I mean essentially we are 
>> moving the reliability problem into the display server.
> 
> Good question. This should generally protect the display server from freezing 
> due to client fences never signalling, but there might still be corner cases.

E.g. a client might be able to sneak in a fence between when the compositor 
checks fences and when it submits its drawing to the kernel.
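
A toy illustration of that window, in Python rather than real compositor code. It assumes a buffer carries a list of fences and that the kernel waits on whatever fences are attached at submission time; Buffer, all_signaled and compositor_frame are hypothetical names.

```python
class Buffer:
    """Toy client buffer carrying a list of fences."""

    def __init__(self):
        # Each fence is modelled as a dict: {"signaled": bool}.
        self.fences = []


def all_signaled(fences):
    return all(f["signaled"] for f in fences)


def compositor_frame(buf, between_check_and_submit=lambda: None):
    """Model one compositor frame that wants to use buf.

    Returns "skipped" if the fence check fails, "safe" if the
    submission depends only on signaled fences, and "would-block" if
    an unsignaled fence was attached after the check.
    """
    # Step 1: the compositor checks the client's fences.
    if not all_signaled(buf.fences):
        return "skipped"

    # Window between check and submit: the client can still act here.
    between_check_and_submit()

    # Step 2: submit. In this model the kernel waits on every fence
    # attached to the buffer at this point, not at check time.
    return "safe" if all_signaled(buf.fences) else "would-block"
```

Calling compositor_frame(buf, lambda: buf.fences.append({"signaled": False})) on a buffer whose fences have all signaled yields "would-block", which is the corner case described above.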


-- 
Earthling Michel Dänzer   |   https://redhat.com
Libre software enthusiast | Mesa and X developer