Re: [PATCH v4 0/3] Improve gpu_scheduler trace events

2024-06-10 Thread Daniel Vetter
On Mon, Jun 10, 2024 at 03:26:53PM +0200, Pierre-Eric Pelloux-Prayer wrote:
> v3: https://lists.freedesktop.org/archives/dri-devel/2024-June/456792.html
> 
> Changes since v3:
> * trace device name instead of drm_device primary index
> * no pointer deref in the TP_printk anymore. Instead the fence context/seqno
> are saved in TP_fast_assign

Some high-level comments:

- Quick summary of the what, why and how in the cover letter would be
  great.

- Link to the userspace, once you have that. At least last time we chatted
  that was still wip.

- Maybe most important to make this actually work, work well, and work
  long-term: I think we should clearly commit to these tracepoints being
  stable uapi, and document that by adding a stable tracepoint section in
  the drm uapi book.

  And then get acks from a pile of driver maintainers that they really
  think this is a good idea and has a future. Should also help with
  getting good review on the tracepoints themselves.

  Otherwise I fear we'll miss the mark again and still force userspace to
  hand-roll tracing for every driver, or maybe worse, even specific kernel
  versions.

Cheers, Sima

> 
> Pierre-Eric Pelloux-Prayer (3):
>   drm/sched: add device name to the drm_sched_process_job event
>   drm/sched: cleanup gpu_scheduler trace events
>   drm/sched: trace dependencies for gpu jobs
> 
>  .../gpu/drm/scheduler/gpu_scheduler_trace.h   | 97 +++
>  drivers/gpu/drm/scheduler/sched_entity.c      |  8 +-
>  drivers/gpu/drm/scheduler/sched_main.c        |  2 +-
>  3 files changed, 84 insertions(+), 23 deletions(-)
> 
> -- 
> 2.40.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH net-next v10 02/14] net: page_pool: create hooks for custom page providers

2024-06-10 Thread Daniel Vetter
On Mon, Jun 10, 2024 at 02:38:18PM +0200, Christian König wrote:
> Am 10.06.24 um 14:16 schrieb Jason Gunthorpe:
> > On Mon, Jun 10, 2024 at 02:07:01AM +0100, Pavel Begunkov wrote:
> > > On 6/10/24 01:37, David Wei wrote:
> > > > On 2024-06-07 17:52, Jason Gunthorpe wrote:
> > > > > IMHO it seems to compose poorly if you can only use the io_uring
> > > > > lifecycle model with io_uring registered memory, and not with DMABUF
> > > > > memory registered through Mina's mechanism.
> > > > By this, do you mean io_uring must be exclusively used to use this
> > > > feature?
> > > > 
> > > > And you'd rather see the two decoupled, so userspace can register w/ say
> > > > dmabuf then pass it to io_uring?
> > > Personally, I have no clue what Jason means. You can just as
> > > well say that it's poorly composable that write(2) to a disk
> > > cannot post a completion into a XDP ring, or a netlink socket,
> > > or io_uring's main completion queue, or name any other API.
> > There is no reason you shouldn't be able to use your fast io_uring
> > completion and lifecycle flow with DMABUF backed memory. Those are not
> > wildly different things and there is good reason they should work
> > together.
> 
> Well there is the fundamental problem that you can't use io_uring to
> implement the semantics necessary for a dma_fence.
> 
> That's why we had to reject the io_uring work on DMA-buf sharing from Google
> a few years ago.
> 
> But this only affects the dma_fence synchronization part of DMA-buf, but
> *not* the general buffer sharing.

More precisely, it only impacts the userspace/data access implicit
synchronization part of dma-buf. For tracking buffer movements like on
invalidations/refault with a dynamic dma-buf importer/exporter I think the
dma-fence rules are acceptable. At least they've been for rdma drivers.

But the escape hatch is to (temporarily) pin the dma-buf, which is exactly
what direct I/O also does when accessing pages. So aside from the still
unsolved question on how we should account/track pinned dma-buf, there
shouldn't be an issue. Or at least I'm failing to see one.

And for synchronization to data access the dma-fence stuff on dma-buf is
anyway rather deprecated on the gpu side too, exactly because of all these
limitations. On the gpu side we've been moving to free-standing
drm_syncobj instead, but those are fairly gpu specific and any other
subsystem should be able to just reuse what they have already to signal
transaction completions.

Cheers, Sima

> 
> Regards,
> Christian.
> 
> > 
> > Pretending they are totally different just because two different
> > people wrote them is a very siloed view.
> > 
> > > The devmem TCP callback can implement it in a way feasible to
> > > the project, but it cannot directly post events to an unrelated
> > > API like io_uring. And devmem attaches buffers to a socket,
> > > for which a ring for returning buffers might even be a nuisance.
> > If you can't compose your io_uring completion mechanism with a DMABUF
> > provided backing store then I think it needs more work.
> > 
> > Jason
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/3] kci-gitlab: Introducing GitLab-CI Pipeline for Kernel Testing

2024-05-23 Thread Daniel Vetter
On Mon, Mar 04, 2024 at 06:45:33PM -0300, Helen Koike wrote:
> Hi Linus,
> 
> Thank you for your reply and valuable inputs.
> 
> On 01/03/2024 17:10, Linus Torvalds wrote:
> > On Fri, 1 Mar 2024 at 02:27, Nikolai Kondrashov  wrote:
> > > 
> > > I agree, it's hard to imagine even a simple majority agreeing on how
> > > GitLab CI should be done. Still, we would like to help people, who are
> > > interested in this kind of thing, to set it up. How about we reframe
> > > this contribution as a sort of template, or a reference for people to
> > > start their setup with, assuming that most maintainers would want to
> > > tweak it? We would also be glad to stand by for questions and help, as
> > > people try to use it.
> > 
> > Ack. I think seeing it as a library for various gitlab CI models would
> > be a lot more palatable. Particularly if you can then show that yes,
> > it is also relevant to our currently existing drm case.
> 
> Having it as a library would certainly make my work as the DRM-CI maintainer
easier and also simplify the process whenever we consider integrating tests
> into other subsystems.

Kinda ignored this thread, just wanted to throw my +1 in here.

To spin it positively, the kernel CI space is wide open (more negatively,
it's a fractured mess). And I think there's just no way to force top-down
unification. Imo the only way is to land subsystem CI support in upstream,
figure out what exactly that should look like (I sketched a lot of open
questions in the DRM CI PR around what should and should not be in
upstream).

Then, once we have a few of those, extract common scripts and tools into
tools/ci/ or scripts/ci or whatever.

And only then, best case years down the road, dare to have some common
top-level CI, once it's clear what the actual common pieces and test
stages even are.

> > So I'm not objecting to having (for example) some kind of CI helper
> > templates - I think a logical place would be in tools/ci/ which is
> > kind of alongside our tools/testing subdirectory.
> 
> Works for me.
> 
> We  can skip having a default .gitlab-ci.yml in the root directory and
> instead include clear instructions in our documentation for using these
> templates.

I'd go a few steps more back and start with trying to get more subsystem
CI into upstream. And then once that dust has settled, figure out what the
common pieces actually are. Because I'm pretty sure that what we have for
drm ci or kernelci right now won't be it, but likely just a local optimum.

Cheers, Sima

> 
> Thanks,
> Helen Koike
> 
> > 
> > (And then perhaps have a 'gitlab' directory under that. I'm not sure
> > whether - and how much - commonality there might be between the
> > different CI models of different hosts).
> > 
> > Just to clarify: when I say "a logical place", I very much want to
> > emphasize the "a" - maybe there are better places, and I'm not saying
> > that is the only possible place. But it sounds more logical to me than
> > some.
> > 
> >  Linus

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 0/8] dma-buf: heaps: Support carved-out heaps and ECC related-flags

2024-05-23 Thread Daniel Vetter
On Wed, May 22, 2024 at 03:18:02PM +0200, Maxime Ripard wrote:
> On Tue, May 21, 2024 at 02:06:19PM GMT, Daniel Vetter wrote:
> > On Thu, May 16, 2024 at 09:51:35AM -0700, John Stultz wrote:
> > > On Thu, May 16, 2024 at 3:56 AM Daniel Vetter  wrote:
> > > > On Wed, May 15, 2024 at 11:42:58AM -0700, John Stultz wrote:
> > > > > But it makes me a little nervous to add a new generic allocation flag
> > > > > for a feature most hardware doesn't support (yet, at least). So it's
> > > > > hard to weigh how common the actual usage will be across all the
> > > > > heaps.
> > > > >
> > > > > I apologize as my worry is mostly born out of seeing vendors really
> > > > > push opaque feature flags in their old ion heaps, so in providing a
> > > > > flags argument, it was mostly intended as an escape hatch for
> > > > > obviously common attributes. So having the first be something that
> > > > > seems reasonable, but isn't actually that common makes me fret some.
> > > > >
> > > > > So again, not an objection, just something for folks to stew on to
> > > > > make sure this is really the right approach.
> > > >
> > > > Another good reason to go with full heap names instead of opaque flags
> > > > on existing heaps is that with the former we can use symlinks in sysfs
> > > > to specify heaps, with the latter we need a new idea. We haven't yet
> > > > gotten around to implement this anywhere, but it's been in the
> > > > dma-buf/heap todo since forever, and I like it as a design approach.
> > > > So would be a good idea to not toss it. With that display would have
> > > > symlinks to cma-ecc and cma, and rendering maybe cma-ecc, shmem, cma
> > > > heaps (in priority order) for a SoC where the display needs contig
> > > > memory for scanout.
> > > 
> > > So indeed that is a good point to keep in mind, but I also think it
> > > might re-inforce the choice of having ECC as a flag here.
> > > 
> > > Since my understanding of the sysfs symlinks to heaps idea is about
> > > being able to figure out a common heap from a collection of devices,
> > > it's really about the ability for the driver to access the type of
> > > memory. If ECC is just an attribute of the type of memory (as in this
> > > patch series), it being on or off won't necessarily affect
> > > compatibility of the buffer with the device.  Similarly "uncached"
> > > seems more of an attribute of memory type and not a type itself.
> > > Hardware that can access non-contiguous "system" buffers can access
> > > uncached system buffers.
> > 
> > Yeah, but in graphics there's a wide band where "shit performance" is
> > defacto "not useable (as intended at least)".
> 
> Right, but "not useable" is still kind of usage dependent, which
> reinforces the need for flags (and possibly some way to discover what
> the heap supports).
> 
> Like, if I just want to allocate a buffer for a single writeback frame,
> then I probably don't have the same requirements than a compositor that
> needs to output a frame at 120Hz.
> 
> The former probably doesn't care about the buffer attributes aside that
> it's accessible by the device. The latter probably can't make any kind
> of compromise over what kind of memory characteristics it uses.
> 
> If we look into the current discussions we have, a compositor would
> probably need a buffer without ECC, non-secure, and probably wouldn't
> care about caching and being physically contiguous.
> 
> Libcamera's SoftISP would probably require that the buffer is cacheable,
> non-secure, without ECC and might ask for physically contiguous buffers.
> 
> As we add more memory types / attributes, I think being able to discover
> and enforce a particular set of flags will be more and more important,
> even more so if we tie heaps to devices, because it just gives a hint
> about the memory being reachable from the device, but as you said, you
> can still get a buffer with shit performance that won't be what you
> want.
> 
> > So if we limit the symlink idea to just making sure zero-copy access is
> > possible, then we might not actually solve the real world problem we need
> > to solve. And so the symlinks become somewhat useless, and we need to
> > somewhere encode which flags you need to use with each symlink.
> > 
> > But I also see the argument that

Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-23 Thread Daniel Vetter
On Wed, May 22, 2024 at 03:34:52PM +0200, Maxime Ripard wrote:
> Hi,
> 
> On Mon, May 06, 2024 at 03:38:24PM GMT, Daniel Vetter wrote:
> > On Mon, May 06, 2024 at 02:05:12PM +0200, Maxime Ripard wrote:
> > > Hi,
> > > 
> > > On Mon, May 06, 2024 at 01:49:17PM GMT, Hans de Goede wrote:
> > > > Hi dma-buf maintainers, et.al.,
> > > > 
> > > > Various people have been working on making complex/MIPI cameras work
> > > > OOTB with mainline Linux kernels and an opensource userspace stack.
> > > > 
> > > > The generic solution adds a software ISP (for Debayering and 3A) to
> > > > libcamera. Libcamera's API guarantees that buffers handed to
> > > > applications using it are dma-bufs so that these can be passed to
> > > > e.g. a video encoder.
> > > > In order to meet this API guarantee the libcamera software ISP allocates
> > > > dma-bufs from userspace through one of the /dev/dma_heap/* heaps. For
> > > > the Fedora COPR repo for the PoC of this:
> > > > https://hansdegoede.dreamwidth.org/28153.html
> > > 
> > > For the record, we're also considering using them for ARM KMS devices,
> > > so it would be better if the solution wasn't only considering v4l2
> > > devices.
> > > 
> > > > I have added a simple udev rule to give physically present users access
> > > > to the dma_heap-s:
> > > > 
> > > > KERNEL=="system", SUBSYSTEM=="dma_heap", TAG+="uaccess"
> > > > 
> > > > (and on Raspberry Pi devices any users in the video group get access)
> > > > 
> > > > This was just a quick fix for the PoC. Now that we are ready to move out
> > > > of the PoC phase and start actually integrating this into distributions
> > > > the question becomes if this is an acceptable solution; or if we need
> > > > some other way to deal with this?
> > > > 
> > > > Specifically the question is if this will have any negative security
> > > > implications? I can certainly see this being used to do some sort of
> > > > denial of service attack on the system (1). This is especially true for
> > > > the cma heap which generally speaking is a limited resource.
> > > 
> > > There's plenty of other ways to exhaust CMA, like allocating too much
> > > KMS or v4l2 buffers. I'm not sure we should consider dma-heaps
> > > differently than those if it's part of our threat model.
> > 
> > So generally for an arm soc where your display needs cma, your render node
> > doesn't. And user applications only have access to the later, while only
> > the compositor gets a kms fd through logind. At least in drm aside from
> > vc4 there's really no render driver that just gives you access to cma and
> > allows you to exhaust that, you need to be a compositor with drm master
> > access to the display.
> > 
> > Which means we're mostly protected against bad applications, and that's
> > not a threat the "user physically sits in front of the machine accounts
> > for", and which giving cma access to everyone would open up. And with
> > flathub/snaps/... this is very much an issue.
> > 
> > So you need more, either:
> > 
> > - cgroups limits on dma-buf and dma-buf heaps. This has been bikeshedded
> >   for years and is just not really moving.
> 
> For reference, are you talking about:
> 
> https://lore.kernel.org/r/20220502231944.3891435-1-tjmerc...@google.com
> 
> Or has there been a new version of that recently?

I think the design feedback from Tejun has changed to that system memory
should be tracked with memcg instead (but that kinda leaves open the
question of what to do with cma), and only device memory be tracked with a
separate cgroups controller.

But I'm also not sure whether that would actually solve all the
tracking/isolation requirements people tossed around or just gives us
something that won't get the job done.

Either way, yes I think that was the most recent code.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC 4/7] drm/virtio: Import prime buffers from other devices as guest blobs

2024-05-22 Thread Daniel Vetter
On Thu, Mar 28, 2024 at 01:32:57AM -0700, Vivek Kasireddy wrote:
> By importing scanout buffers from other devices, we should be able
> to use the virtio-gpu driver in KMS only mode. Note that we attach
> dynamically and register a move_notify() callback so that we can
> let the VMM know of any location changes associated with the backing
> store of the imported object by sending detach_backing cmd.
> 
> Cc: Gerd Hoffmann 
> Signed-off-by: Vivek Kasireddy 
> ---
>  drivers/gpu/drm/virtio/virtgpu_prime.c | 54 +-
>  1 file changed, 53 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c
> index 1e87dbc9a897..c65dacc1b2b5 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_prime.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c
> @@ -255,10 +255,36 @@ static int virtgpu_dma_buf_init_obj(struct drm_device *dev,
>   return ret;
>  }
>  
> +static const struct drm_gem_object_funcs virtgpu_gem_dma_buf_funcs = {
> + .free = virtgpu_dma_buf_free_obj,
> +};
> +
> +static void virtgpu_dma_buf_move_notify(struct dma_buf_attachment *attach)
> +{
> + struct drm_gem_object *obj = attach->importer_priv;
> + struct virtio_gpu_device *vgdev = obj->dev->dev_private;
> + struct virtio_gpu_object *bo = gem_to_virtio_gpu_obj(obj);
> +
> + if (bo->created) {
> + virtio_gpu_cmd_resource_detach_backing(vgdev,
> +bo->hw_res_handle);
> + bo->has_backing = false;
> + }
> +}
> +
> +static const struct dma_buf_attach_ops virtgpu_dma_buf_attach_ops = {
> + .allow_peer2peer = true,
> + .move_notify = virtgpu_dma_buf_move_notify
> +};
> +
>  struct drm_gem_object *virtgpu_gem_prime_import(struct drm_device *dev,
>   struct dma_buf *buf)
>  {
> + struct virtio_gpu_device *vgdev = dev->dev_private;
> + struct dma_buf_attachment *attach;
> + struct virtio_gpu_object *bo;
>   struct drm_gem_object *obj;
> + int ret;
>  
>   if (buf->ops == &virtgpu_dmabuf_ops.ops) {
>   obj = buf->priv;
> @@ -272,7 +298,32 @@ struct drm_gem_object *virtgpu_gem_prime_import(struct drm_device *dev,
>   }
>   }
>  
> - return drm_gem_prime_import(dev, buf);

I think overall this (entire series) makes sense, but needs someone with
overall virtio understanding to make sure it all fits correctly. Just a
refactor thought here: instead of open-coding this, should we have a
drm_gem_prime_dynamic_import?

Similar in another patch for the dma_buf_pin, should that be also in the
gem helpers to automatically forward to dma_buf if it's imported?

Cheers, Sima

> + if (!vgdev->has_resource_blob || vgdev->has_virgl_3d)
> + return drm_gem_prime_import(dev, buf);
> +
> + bo = kzalloc(sizeof(*bo), GFP_KERNEL);
> + if (!bo)
> + return ERR_PTR(-ENOMEM);
> +
> + obj = &bo->base.base;
> + obj->funcs = &virtgpu_gem_dma_buf_funcs;
> + drm_gem_private_object_init(dev, obj, buf->size);
> +
> + attach = dma_buf_dynamic_attach(buf, dev->dev,
> + &virtgpu_dma_buf_attach_ops, obj);
> + if (IS_ERR(attach)) {
> + kfree(bo);
> + return ERR_CAST(attach);
> + }
> +
> + obj->import_attach = attach;
> + get_dma_buf(buf);
> +
> + ret = virtgpu_dma_buf_init_obj(dev, bo, attach);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + return obj;
>  }
>  
>  struct drm_gem_object *virtgpu_gem_prime_import_sg_table(
> @@ -281,3 +332,4 @@ struct drm_gem_object *virtgpu_gem_prime_import_sg_table(
>  {
>   return ERR_PTR(-ENODEV);
>  }
> +
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 1/2] MAINTAINERS: Change habanalabs maintainer and git repo path

2024-05-21 Thread Daniel Vetter
On Wed, May 15, 2024 at 07:22:21PM +0300, Oded Gabbay wrote:
> Because I left habana, Ofir Bitton is now the habanalabs driver
> maintainer.
> 
> The git repo also changed location to the Habana GitHub website.
> 
> Signed-off-by: Oded Gabbay 

Acked-by: Daniel Vetter 

I'm assuming Ofir will include this in the first pr for drm.git.
-Sima

> ---
>  MAINTAINERS | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index abd4dbe2c653..5bd45a919aff 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -9431,11 +9431,11 @@ S: Maintained
>  F:   block/partitions/efi.*
>  
>  HABANALABS PCI DRIVER
> -M:   Oded Gabbay 
> +M:   Ofir Bitton 
>  L:   dri-devel@lists.freedesktop.org
>  S:   Supported
>  C:   irc://irc.oftc.net/dri-devel
> -T:   git https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux.git
> +T:   git https://github.com/HabanaAI/drivers.accel.habanalabs.kernel.git
>  F:   Documentation/ABI/testing/debugfs-driver-habanalabs
>  F:   Documentation/ABI/testing/sysfs-driver-habanalabs
>  F:   drivers/accel/habanalabs/
> -- 
> 2.34.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)

2024-05-21 Thread Daniel Vetter
On Tue, 21 May 2024 at 14:38, Daniel Vetter  wrote:
>
> On Mon, May 20, 2024 at 12:05:14PM +0200, Jacek Lawrynowicz wrote:
> > From: "Wachowski, Karol" 
> >
> > Lack of check for copy-on-write (COW) mapping in drm_gem_shmem_mmap
> > allows users to call mmap with PROT_WRITE and MAP_PRIVATE flag
> > causing a kernel panic due to BUG_ON in vmf_insert_pfn_prot:
> > BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
> >
> > Return -EINVAL early if COW mapping is detected.
> >
> > This bug affects all drm drivers using default shmem helpers.
> > It can be reproduced by this simple example:
> > void *ptr = mmap(0, size, PROT_WRITE, MAP_PRIVATE, fd, mmap_offset);
> > ptr[0] = 0;
> >
> > Fixes: 2194a63a818d ("drm: Add library for shmem backed GEM objects")
> > Cc: Noralf Trønnes 
> > Cc: Eric Anholt 
> > Cc: Rob Herring 
> > Cc: Maarten Lankhorst 
> > Cc: Maxime Ripard 
> > Cc: Thomas Zimmermann 
> > Cc: David Airlie 
> > Cc: Daniel Vetter 
> > Cc: dri-devel@lists.freedesktop.org
> > Cc:  # v5.2+
> > Signed-off-by: Wachowski, Karol 
> > Signed-off-by: Jacek Lawrynowicz 
>
> Excellent catch!
>
> Reviewed-by: Daniel Vetter 
>
> I reviewed the other helpers, and ttm/vram helpers already block this with
> the check in ttm_bo_mmap_obj.
>
> But the dma helpers do not, because the remap_pfn_range that underlies
> the various dma_mmap* functions (at least on most platforms) allows some
> limited use of COW. But it makes no sense at all to allow that only for
> gpu buffer objects backed by specific allocators.
>
> Would you be up for the 2nd patch that also adds this check to
> drm_gem_dma_mmap, so that we have a consistent uapi?
>
> I'll go ahead and apply this one to drm-misc-fixes meanwhile.

Forgot to add: A testcase in igt would also be really lovely.

https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#validating-changes-with-igt
-Sima


>
> Thanks, Sima
>
> > ---
> >  drivers/gpu/drm/drm_gem_shmem_helper.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > index 13bcdbfd..885a62c2e1be 100644
> > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > @@ -611,6 +611,9 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct
> >   return ret;
> >   }
> >
> > + if (is_cow_mapping(vma->vm_flags))
> > + return -EINVAL;
> > +
> >   dma_resv_lock(shmem->base.resv, NULL);
> >   ret = drm_gem_shmem_get_pages(shmem);
> >   dma_resv_unlock(shmem->base.resv);
> > --
> > 2.45.1
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: simpledrm, running display servers, and drivers replacing simpledrm while the display server is running

2024-05-21 Thread Daniel Vetter
se the atomic ioctls (I think at least) and the real
driver has full atomic state takeover support (only i915 to my knowledge),
and your userspace doesn't unecessarily mess with the display state when
it takes over a new driver, then that should lead to flicker free boot
even across a simpledrm->real driver takeover.

If your userspace doesn't crash ofc :-)

But it's a real steep ask of all components to get this right.

> > > Arguably, the only place a more educated guess about whether to wait or
> > > not, and if so how long, is the kernel.
> > 
> > As I said before, driver modules come and go and hardware devices come and
> > go.
> > 
> > To detect if there might be a native driver waiting to be loaded, you can
> > test for
> > 
> > - 'nomodeset' on the command line -> no native driver
> 
> Makes sense to not wait here, and just assume simpledrm forever.
> 
> > - 'systemd-load-modules' not started -> maybe wait
> > - look for drivers under /lib/modules//kernel/drivers/gpu/drm/ ->
> > maybe wait
> 
> I suspect this is not useful for general purpose distributions. I have
> 43 kernel GPU modules there, on a F40 installation.
> 
> > - maybe udev can tell you more
> > - it might for detection help that recently simpledrm devices refer to their
> > parent PCI device
> > - maybe systemd tracks the probed devices
> 
> If the kernel already plumbs enough state so userspace components can
> make a decent decision, instead of just sleeping for an arbitrary amount
> of time, then great. This is to some degree what
> https://github.com/systemd/systemd/issues/32509 is about.

I think you can't avoid the timeout entirely for the use-case where the
user has disabled the real driver by not compiling it, and simpledrm would
be the only driver you'll ever get.

But that's just not going to happen on any default distro setup, so I
think it's ok if it sucks a bit.

Cheers, Sima

> 
> 
> Jonas
> 
> > 
> > Best regards
> > Thomas
> > 
> > > 
> > > 
> > > Jonas
> > > 
> > > > The next best solution is to keep the final DRM device open until a
> > > > new one shows up. All DRM graphics drivers with hotplugging support
> > > > are required to accept commands after their hardware has been
> > > > unplugged. They simply won't display anything.
> > > > 
> > > > Best regards
> > > > Thomas
> > > > 
> > > > 
> > > > > Thanks
> > > > > 
> > > > -- 
> > > > --
> > > > Thomas Zimmermann
> > > > Graphics Driver Developer
> > > > SUSE Software Solutions Germany GmbH
> > > > Frankenstrasse 146, 90461 Nuernberg, Germany
> > > > GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> > > > HRB 36809 (AG Nuernberg)
> > > > 
> > 
> > -- 
> > --
> > Thomas Zimmermann
> > Graphics Driver Developer
> > SUSE Software Solutions Germany GmbH
> > Frankenstrasse 146, 90461 Nuernberg, Germany
> > GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> > HRB 36809 (AG Nuernberg)
> > 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/shmem-helper: Fix BUG_ON() on mmap(PROT_WRITE, MAP_PRIVATE)

2024-05-21 Thread Daniel Vetter
On Mon, May 20, 2024 at 12:05:14PM +0200, Jacek Lawrynowicz wrote:
> From: "Wachowski, Karol" 
> 
> Lack of check for copy-on-write (COW) mapping in drm_gem_shmem_mmap
> allows users to call mmap with PROT_WRITE and MAP_PRIVATE flag
> causing a kernel panic due to BUG_ON in vmf_insert_pfn_prot:
> BUG_ON((vma->vm_flags & VM_PFNMAP) && is_cow_mapping(vma->vm_flags));
> 
> Return -EINVAL early if COW mapping is detected.
> 
> This bug affects all drm drivers using default shmem helpers.
> It can be reproduced by this simple example:
> void *ptr = mmap(0, size, PROT_WRITE, MAP_PRIVATE, fd, mmap_offset);
> ptr[0] = 0;
> 
> Fixes: 2194a63a818d ("drm: Add library for shmem backed GEM objects")
> Cc: Noralf Trønnes 
> Cc: Eric Anholt 
> Cc: Rob Herring 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: dri-devel@lists.freedesktop.org
> Cc:  # v5.2+
> Signed-off-by: Wachowski, Karol 
> Signed-off-by: Jacek Lawrynowicz 

Excellent catch!

Reviewed-by: Daniel Vetter 

I reviewed the other helpers, and ttm/vram helpers already block this with
the check in ttm_bo_mmap_obj.

But the dma helpers do not, because the remap_pfn_range that underlies
the various dma_mmap* functions (at least on most platforms) allows some
limited use of COW. But it makes no sense at all to allow that only for
gpu buffer objects backed by specific allocators.

Would you be up for the 2nd patch that also adds this check to
drm_gem_dma_mmap, so that we have a consistent uapi?

I'll go ahead and apply this one to drm-misc-fixes meanwhile.

Thanks, Sima

> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 13bcdbfd..885a62c2e1be 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -611,6 +611,9 @@ int drm_gem_shmem_mmap(struct drm_gem_shmem_object *shmem, struct vm_area_struct
>   return ret;
>   }
>  
> + if (is_cow_mapping(vma->vm_flags))
> + return -EINVAL;
> +
>   dma_resv_lock(shmem->base.resv, NULL);
>   ret = drm_gem_shmem_get_pages(shmem);
>   dma_resv_unlock(shmem->base.resv);
> -- 
> 2.45.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 0/7] Adds support for ConfigFS to VKMS!

2024-05-21 Thread Daniel Vetter
> > directory and will have a different
> > set of attributes (connection status, current EDID...)
> 
> Once the device is enabled (i.e., `echo 1 > /config/vkms/my-device/enabled`),
> would it make sense to use sysfs instead of another configfs directory?
> The advantage is that with sysfs the kernel controls the lifetime of the
> objects and I think it *might* simplify the code, but I'll need to write a
> proof of concept to see if this works.

sysfs is very opinionated about lifetime, so we might actually make this
more complicated. Plus for the only thing we can hotplug (connectors) we
already have sysfs directories, so there could be a lifetime/name fight
between the sysfs interfaces to prepare a hotplugged connector, and the
connector sysfs files which are part of the existing uapi.

Also the second issue I'm seeing is that we're mixing up
testing/configuration apis with the generic uapi that should hold for
every kms driver. This could make the code in igt testcase or for driving
compositor end-to-end testcases a lot more confusing. I think separation
would be better.

The third point I'm seeing is that connectors can be created both before
we create the device, and at runtime. If we have two totally separate
interfaces for this, we might end up with needless code duplication.

But it's a complex topic, I think it does make sense to give sysfs some
serious thought. But maybe as part of the vkms driver directory, and not
in the drm_device chardev directories. So we could have some separation
that way maybe?

> > For the platform driver part, it seems logical to me to use a "real"
> > platform driver and a platform device for each pipeline, but I don't
> > have the experience to tell if this is a good idea or not.
> 
> I'm afraid I don't know which approach could work better. Trusting Sima and
> Maíra on this one.

As I've said, I'm not opposed to a switch. I just think it's an orthogonal
issue to the configfs and should be separately justified.

We're trying hard to get away from kms userspace sneaking too much under
the hood of the driver, and have gone a long way from the o.g. drm days
where "everything is pci" was encoded into uapi. So from that pov I kinda
like the fact that vkms is special and fairly free-floating.

But maybe userspace does want to be able to test their device enumeration
more like a real device, so if vkms currently sticks out there that would
be a really good reason to change things and make it look more like a real
driver/device.

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: DRM Accel BoF at Linux Plumbers

2024-05-21 Thread Daniel Vetter
On Sat, May 18, 2024 at 10:46:01AM +0200, Tomeu Vizoso wrote:
> Hi,
> 
> I would like to use the chance at the next Plumbers to discuss the
> present challenges related to ML accelerators in mainline.
> 
> I'm myself more oriented towards edge-oriented deployments, and don't
> know enough about how these accelerators are being used in the cloud
> (and maybe desktop?) to tell if there is enough overlap to warrant a
> common BoF.
> 
> In any case, these are the topics I would like to discuss, some
> probably more relevant to the edge than to the cloud or desktop:
> 
> * What is stopping vendors from mainlining their drivers?
> 
> * How could we make it easier for them?
> 
> * Userspace API: how close are we from a common API that we can ask
> userspace drivers to implement? What can be done to further this goal?
> 
> * Automated testing: DRM CI can be used, but would be good to have a
> common test suite to run there. This is probably dependent on a common
> userspace API.
> 
> * Other shared userspace infrastructure (compiler, execution,
> synchronization, virtualization, ...)
> 
> * Firmware-mediated IP: what should we do about it, if anything?
> 
> * Any standing issues in DRM infra (GEM, gpu scheduler, DMABuf, etc)
> that are hurting accel drivers?
> 
> What do people think, should we have a drivers/accel-wide BoF at
> Plumbers? If so, what other topics should we have in the agenda?

Yeah sounds good, and I'll try to at least attend lpc this year since it's
rather close ... Might be good to explicitly ping teams of merged and
in-flight drivers we have in accel already.

I think the topic list is at least a good starting point.

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/probe-helper: Call drm_mode_validate_ycbcr420() before connector->mode_valid()

2024-05-21 Thread Daniel Vetter
On Thu, May 16, 2024 at 08:33:24PM +0300, Ville Syrjala wrote:
> From: Ville Syrjälä 
> 
> Make life easier for drivers by filtering out unwanted YCbCr 4:2:0
> only modes prior to calling the connector->mode_valid() hook.
> Currently drivers will still see YCbCr 4:2:0 only modes in said
> hook, which will likely come as a suprise when the driver has
> declared no support for such modes (via setting
> connector->ycbcr_420_allowed to false).
> 
> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10992
> Signed-off-by: Ville Syrjälä 

Sounds reasonable.

Reviewed-by: Daniel Vetter 

> ---
>  drivers/gpu/drm/drm_probe_helper.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_probe_helper.c 
> b/drivers/gpu/drm/drm_probe_helper.c
> index 4f75a1cfd820..249c8c2cb319 100644
> --- a/drivers/gpu/drm/drm_probe_helper.c
> +++ b/drivers/gpu/drm/drm_probe_helper.c
> @@ -474,6 +474,10 @@ static int __drm_helper_update_and_validate(struct 
> drm_connector *connector,
>   if (mode->status != MODE_OK)
>   continue;
>  
> + mode->status = drm_mode_validate_ycbcr420(mode, connector);
> + if (mode->status != MODE_OK)
> + continue;
> +
>   ret = drm_mode_validate_pipeline(mode, connector, ctx,
>    &mode->status);
>   if (ret) {
> @@ -486,10 +490,6 @@ static int __drm_helper_update_and_validate(struct 
> drm_connector *connector,
>   else
>   return -EDEADLK;
>   }
> -
> - if (mode->status != MODE_OK)
> - continue;
> - mode->status = drm_mode_validate_ycbcr420(mode, connector);
>   }
>  
>   return 0;
> -- 
> 2.44.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 0/8] dma-buf: heaps: Support carved-out heaps and ECC related-flags

2024-05-21 Thread Daniel Vetter
On Thu, May 16, 2024 at 09:51:35AM -0700, John Stultz wrote:
> On Thu, May 16, 2024 at 3:56 AM Daniel Vetter  wrote:
> > On Wed, May 15, 2024 at 11:42:58AM -0700, John Stultz wrote:
> > > But it makes me a little nervous to add a new generic allocation flag
> > > for a feature most hardware doesn't support (yet, at least). So it's
> > > hard to weigh how common the actual usage will be across all the
> > > heaps.
> > >
> > > I apologize as my worry is mostly born out of seeing vendors really
> > > push opaque feature flags in their old ion heaps, so in providing a
> > > flags argument, it was mostly intended as an escape hatch for
> > > obviously common attributes. So having the first be something that
> > > seems reasonable, but isn't actually that common makes me fret some.
> > >
> > > So again, not an objection, just something for folks to stew on to
> > > make sure this is really the right approach.
> >
> > Another good reason to go with full heap names instead of opaque flags on
> > existing heaps is that with the former we can use symlinks in sysfs to
> > specify heaps, with the latter we need a new idea. We haven't yet gotten
> > around to implement this anywhere, but it's been in the dma-buf/heap todo
> > since forever, and I like it as a design approach. So would be a good idea
> > to not toss it. With that display would have symlinks to cma-ecc and cma,
> > and rendering maybe cma-ecc, shmem, cma heaps (in priority order) for a
> > SoC where the display needs contig memory for scanout.
> 
> So indeed that is a good point to keep in mind, but I also think it
> might re-inforce the choice of having ECC as a flag here.
> 
> Since my understanding of the sysfs symlinks to heaps idea is about
> being able to figure out a common heap from a collection of devices,
> it's really about the ability for the driver to access the type of
> memory. If ECC is just an attribute of the type of memory (as in this
> patch series), it being on or off won't necessarily affect
> compatibility of the buffer with the device.  Similarly "uncached"
> seems more of an attribute of memory type and not a type itself.
> Hardware that can access non-contiguous "system" buffers can access
> uncached system buffers.

Yeah, but in graphics there's a wide band where "shit performance" is
de facto "not usable (as intended at least)".

So if we limit the symlink idea to just making sure zero-copy access is
possible, then we might not actually solve the real world problem we need
to solve. And so the symlinks become somewhat useless, and we need to
somewhere encode which flags you need to use with each symlink.

But I also see the argument that there's a bit a combinatorial explosion
possible. So I guess the question is where we want to handle it ...

Also wondering whether we should get the symlink/allocator idea off the
ground first, but given that that hasn't moved in a decade it might be too
much. But then the question is, what userspace are we going to use for all
these new heaps (or heaps with new flags)?

Cheers, Sima

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 0/5] Add support for GE SUNH hot-pluggable connector (was: "drm: add support for hot-pluggable bridges")

2024-05-21 Thread Daniel Vetter
On Mon, May 20, 2024 at 02:01:48PM +0200, Luca Ceresoli wrote:
> Hello Daniel,
> 
> On Thu, 16 May 2024 15:22:01 +0200
> Daniel Vetter  wrote:
> 
> > Apologies for missing v1 ...
> > 
> > On Fri, May 10, 2024 at 09:10:36AM +0200, Luca Ceresoli wrote:
> > > DRM hotplug bridge driver
> > > =
> > > 
> > > DRM natively supports pipelines whose display can be removed, but all the
> > > components preceding it (all the display controller and any bridges) are
> > > assumed to be fixed and cannot be plugged, removed or modified at runtime.
> > > 
> > > This series adds support for DRM pipelines having a removable part after
> > > the encoder, thus also allowing bridges to be removed and reconnected at
> > > runtime, possibly with different components.
> > > 
> > > This picture summarizes the  DRM structure implemented by this series:
> > > 
> > >  ..
> > >  |   DISPLAY CONTROLLER   |
> > >  | .-.   .--. |
> > >  | | ENCODER |<--| CRTC | |
> > >  | '-'   '--' |
> > >  '--|-'
> > > |
> > > |DSIHOTPLUG
> > > V  CONNECTOR
> > >.-..--..-..-. .---.
> > >| 0 to N  || _|   _| || 1 to N  | |   |
> > >| BRIDGES |--DSI-->||_   |_  |--DSI-->| BRIDGES |--LVDS-->| PANEL |
> > >| ||  || || | |   |
> > >'-''--''-''-' '---'
> > > 
> > >  [--- fixed components --]  [--- removable add-on ---]
> > > 
> > > Fixed components include:
> > > 
> > >  * all components up to the DRM encoder, usually part of the SoC
> > >  * optionally some bridges, in the SoC and/or as external chips
> > > 
> > > Components on the removable add-on include:
> > > 
> > >  * one or more bridges
> > >  * a fixed connector (not one natively supporting hotplug such as HDMI)
> > >  * the panel  
> > 
> > So I think at a high level this design approach makes sense,
> 
> Good starting point :)
> 
> > but the
> > implementation needs some serious thought. One big thing upfront though,
> > we need to have a clear plan for the overlay hotunload issues, otherwise
> > trying to make drm bridges hotpluggable makes no sense to me. Hotunload is
> > very, very tricky, full of lifetime issues, and those need to be sorted
> > out first or we're just trying to build a castle on quicksand.
> > 
> > For bridges itself I don't think the current locking works. You're trying
> > to really cleverly hide it all behind a normal-looking bridge driver, but
> > there's many things beyond that which will blow up if bridges just
> > disappear. Most importantly the bridge states part of an atomic update.
> 
> Surely possible as atomic updates are definitely not exercised in my
> use case. Can you recommend any testing tools to be able to trigger any
> issues?

Uh really hard ... You'd need to create an atomic commit that's blocked on
a sync_file in-fence (so that you can extend the race window). And then
hotunplug the bridge chain _before_ you signal that fence.

That's not going to cover all possible races, but at least a large chunk
of the really big ones.
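In rough pseudocode, the reproducer described above might look like the following (the sw_sync debugfs interface and the standard IN_FENCE_FD plane property are assumed; property-id lookup and error handling are elided):

```
timeline_fd = open("/sys/kernel/debug/sync/sw_sync")
fence_fd    = ioctl(timeline_fd, SW_SYNC_IOC_CREATE_FENCE, value=1)  # unsignalled

req = drmModeAtomicAlloc()
drmModeAtomicAddProperty(req, plane_id, prop("IN_FENCE_FD"), fence_fd)
drmModeAtomicCommit(drm_fd, req, DRM_MODE_ATOMIC_NONBLOCK, NULL)
# the commit is now queued, holding pointers into the bridge chain

hotunplug_bridge_addon()                 # e.g. unbind the add-on's driver

ioctl(timeline_fd, SW_SYNC_IOC_INC, 1)   # signal the fence: the queued
                                         # commit now runs against bridge
                                         # state that may have been freed
```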

> The main setups I used for my testing so far are 'modetest -s' for my
> daily work and a simple weston setup to periodically test a complete
> user space stack.
> 
> > Now in drm we have drm_connector as the only hotunpluggable thing, and it
> > took years to sort out all the issues. I think we should either model the
> > bridge hotunplug locking after that, or just outright reuse the connector
> > locking and lifetime rules. I much prefer the latter personally.
> > 
> > Anyway the big issues:
> > 
> > - We need to refcount the hotpluggable bridges, because software (like
> >   atomic state updates) might hang onto pointers for longer than the
> >   bridge physically exists. Assuming that you can all tear it down
> >   synchronously will not work.
> > 
> >   If we reuse connector locking/lifetime then we could put the
> >   hotpluggable part of the bridge chain into the drm_connector, since that
> >   already has refcounting as needed. It would mean that finding the next
> >   bridge in the chain becomes a lot more tricky though. With that model
> >   we'd create a new connector e

Re: [PATCH v2 1/1] drm: Add ioctl for querying a DRM device's list of open client PIDs

2024-05-21 Thread Daniel Vetter
On Thu, May 16, 2024 at 11:12:19PM +0100, Adrián Larumbe wrote:
> Hi Daniel,
> 
> On 02.05.2024 10:09, Daniel Vetter wrote:
> > On Wed, May 01, 2024 at 07:50:43PM +0100, Adrián Larumbe wrote:
> > > Up to this day, all fdinfo-based GPU profilers must traverse the entire
> > > /proc directory structure to find open DRM clients with fdinfo file
> > > descriptors. This is inefficient and time-consuming.
> > > 
> > > This patch adds a new DRM ioctl that allows users to obtain a list of PIDs
> > > for clients who have opened the DRM device. Output from the ioctl isn't
> > > human-readable, and it's meant to be retrieved only by GPU profilers like
> > > gputop and nvtop.
> > > 
> > > Cc: Rob Clark 
> > > Cc: Tvrtko Ursulin 
> > > Signed-off-by: Adrián Larumbe 
> > > ---
> > >  drivers/gpu/drm/drm_internal.h |  1 +
> > >  drivers/gpu/drm/drm_ioctl.c| 89 ++
> > >  include/uapi/drm/drm.h |  7 +++
> > >  3 files changed, 97 insertions(+)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_internal.h 
> > > b/drivers/gpu/drm/drm_internal.h
> > > index 690505a1f7a5..6f78954cae16 100644
> > > --- a/drivers/gpu/drm/drm_internal.h
> > > +++ b/drivers/gpu/drm/drm_internal.h
> > > @@ -243,6 +243,7 @@ static inline void drm_debugfs_encoder_remove(struct 
> > > drm_encoder *encoder)
> > >  drm_ioctl_t drm_version;
> > >  drm_ioctl_t drm_getunique;
> > >  drm_ioctl_t drm_getclient;
> > > +drm_ioctl_t drm_getclients;
> > >  
> > >  /* drm_syncobj.c */
> > >  void drm_syncobj_open(struct drm_file *file_private);
> > > diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
> > > index e368fc084c77..da7057376581 100644
> > > --- a/drivers/gpu/drm/drm_ioctl.c
> > > +++ b/drivers/gpu/drm/drm_ioctl.c
> > > @@ -207,6 +207,93 @@ int drm_getclient(struct drm_device *dev, void *data,
> > >   }
> > >  }
> > >  
> > > +/*
> > > + * Get list of client PIDs who have opened a DRM file
> > > + *
> > > + * \param dev DRM device we are querying
> > > + * \param data IOCTL command input.
> > > + * \param file_priv DRM file private.
> > > + *
> > > + * \return zero on success or a negative number on failure.
> > > + *
> > > + * Traverses list of open clients for the given DRM device, and
> > > + * copies them into userpace as an array of PIDs
> > > + */
> > > +int drm_getclients(struct drm_device *dev, void *data,
> > > +struct drm_file *file_priv)
> > > +
> > > +{
> > > + struct drm_get_clients *get_clients = data;
> > > + ssize_t size = get_clients->len;
> > > + char __user *pid_buf;
> > > + ssize_t offset = 0;
> > > + int ret = 0;
> > > +
> > > + /*
> > > +  * We do not want to show clients of display only devices so
> > > +  * as to avoid confusing UM GPU profilers
> > > +  */
> > > + if (!dev->render) {
> > > + get_clients->len = 0;
> > > + return 0;
> > > + }
> > > +
> > > + /*
> > > +  * An input size of zero means UM wants to know the size of the PID 
> > > buffer
> > > +  * We round it up to the nearest multiple of the page size so that we 
> > > can have
> > > +  * some spare headroom in case more clients came in between successive 
> > > calls
> > > +  * of this ioctl, and also to simplify parsing of the PIDs buffer, 
> > > because
> > > +  * sizeof(pid_t) will hopefully always divide PAGE_SIZE
> > > +  */
> > > + if (size == 0) {
> > > + get_clients->len =
> > > + roundup(atomic_read(>open_count) * sizeof(pid_t), 
> > > PAGE_SIZE);
> > > + return 0;
> > > + }
> > > +
> > > + pid_buf = (char *)(void *)get_clients->user_data;
> > > +
> > > + if (!pid_buf)
> > > + return -EINVAL;
> > > +
> > > > + mutex_lock(&dev->filelist_mutex);
> > > > + list_for_each_entry_reverse(file_priv, &dev->filelist, lhead) {
> > > + pid_t pid_num;
> > > +
> > > + if ((size - offset) < sizeof(pid_t))
> > > + break;
> > > +
> > > + rcu_read_lock();
> > > + pid_num = pid_vnr(rcu_dereference(file_priv->pid));
> > > + rcu_read_unloc

Re: [PATCH] drm/etnaviv: switch devcoredump allocations to GFP_NOWAIT

2024-05-21 Thread Daniel Vetter
On Fri, May 17, 2024 at 10:18:50AM +0200, Philipp Zabel wrote:
> On Do, 2024-05-16 at 19:20 +0200, Lucas Stach wrote:
> > Am Freitag, dem 26.01.2024 um 17:46 +0100 schrieb Lucas Stach:
> > > The etnaviv devcoredump is created in the GPU reset path, which
> > > must make forward progress to avoid stalling memory reclaim on
> > > unsignalled dma fences. The currently used __GFP_NORETRY does not
> > > prohibit sleeping on direct reclaim, breaking the forward progress
> > > guarantee. Switch to GFP_NOWAIT, which allows background reclaim
> > > to be triggered, but avoids any stalls waiting for direct reclaim.
> > > 
> > Any takers for reviewing this one?
> > 
> > Regards,
> > Lucas
> > 
> > > Signed-off-by: Lucas Stach 
> > > ---
> > >  drivers/gpu/drm/etnaviv/etnaviv_dump.c | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_dump.c 
> > > b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> > > index 898f84a0fc30c..42c5028872d54 100644
> > > --- a/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_dump.c
> > > @@ -159,8 +159,7 @@ void etnaviv_core_dump(struct etnaviv_gem_submit 
> > > *submit)
> > >   file_size += sizeof(*iter.hdr) * n_obj;
> > >  
> > >   /* Allocate the file in vmalloc memory, it's likely to be big */
> > > - iter.start = __vmalloc(file_size, GFP_KERNEL | __GFP_NOWARN |
> > > - __GFP_NORETRY);
> > > + iter.start = __vmalloc(file_size, GFP_NOWAIT | __GFP_NOWARN);
> > >   if (!iter.start) {
> > > > mutex_unlock(&gpu->mmu_context->lock);
> > >   dev_warn(gpu->dev, "failed to allocate devcoredump file\n");
> > > @@ -230,5 +229,6 @@ void etnaviv_core_dump(struct etnaviv_gem_submit 
> > > *submit)
> > >  
> > > > etnaviv_core_dump_header(&iter, ETDUMP_BUF_END, iter.data);
> > >  
> > > - dev_coredumpv(gpu->dev, iter.start, iter.data - iter.start, GFP_KERNEL);
> > > + dev_coredumpv(gpu->dev, iter.start, iter.data - iter.start,
> > > +   GFP_NOWAIT | __GFP_NOWARN);
> 
> Should this be __GFP_NOWARN? There is no fallback on failure, and if
> this fails and the __vmalloc() above didn't, there is no error message
> at all.

GFP_NOWAIT already has __GFP_NOWARN, so redundant. And there's really
nothing you can do as a fallback (aside from dmesg output, which this code
already does - ok maybe dev_coredump could also do warnings, but that's a
separate patch).

With the __GFP_NOWARN dropped:

Reviewed-by: Daniel Vetter 

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v2 0/5] Add support for GE SUNH hot-pluggable connector (was: "drm: add support for hot-pluggable bridges")

2024-05-16 Thread Daniel Vetter
t a lot clearer that we need to figure out who/when
  ->atomic_reset should be called for hotplugged bridges, maybe as part of
  connector registration when the entire bridge and its new connector is
  assembled?

- Finally this very much means we need to rethink who/how the connector
  for a bridge is created. The new design is that the main driver creates
  this connector, once the entire bridge exists. But with hotplugging this
  gets a lot more complicated, so we might want to extract a pile of that
  encoder related code from drivers (same way dp mst helpers take care of
  connector creation too, it's just too much of a mess otherwise).

  The current bridge chaining infrastructure requires a lot of hand-rolled
  code in each bridge driver and the encoder, so that might be a good
  thing anyway.

- Finally I think the entire bridge hotplug infrastructure should be
  irrespective of the underlying bus. Which means for the mipi dsi case we
  might also want to look into what's missing to make mipi dsi
  hotunpluggable, at least for the case where it's a proper driver. I
think we should ignore the old bridge model where drivers stitched it
all together using the component framework; in my opinion that approach
  should be deprecated.

- Finally I think we should have a lot of safety checks, like only bridges
  which declare themselves to be hotunplug safe should be allowed as part
  of the hotpluggable section of the bridge chain. All others must still be attached
  before the entire driver is registered with drm_dev_register.

  Or that we only allow bridges with the NO_CONNECTOR flag for
  drm_bridge_attach.

There's probably a pile more fundamental issues I've missed, but this
should get a good discussion started.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH 2/3] drm/tidss: Add support for display sharing

2024-05-16 Thread Daniel Vetter
"RTOS
> >>> controlling the display mode" as shared here [1]) and hence this is not
> >>> validated but the idea was to keep dt-bindings generic enough to support 
> >>> them
> >>> in future and that's why I referred to it here.
> >>>
> 
> 
> If I understand you correctly, for now the only real use case is when the
> the RTOS owns / manages the complete display pipeline and Linux can only
> own video planes.
> 
> The opposite is supported by the DSS hardware (thanks to its feature that
> allows partitioning the register space and having multiple per-host IRQs) 
> but it's not a real use case yet. The reason why this case is added to the
> DT binding is as you said for flexiblity and make the design future-proof.
> 
> >>> separate irq
> >>> Coming back to your questions, with the current scheme the Linux (tidss) 
> >>> would
> >>> be expected to make sure the CRTC being shared with RTOS is never 
> >>> shutdown and
> >>> the RTOS plane should never gets masked.
> >> 
> >> I'm probably missing something then here, but if the Linux side of
> >> things is expected to keep the current configuration and keep it active
> >> for it to work, what use-case would it be useful for?
> >> 
> >
> > It's just one of the partitioning possibilities that I mentioned here, that
> > Linux is in control of DSS as a whole and the user want the other host (be 
> > it
> > RTOS or any other core) to control a single plane. For e.g it could be Linux
> > (with GPU rendering) displaying the graphics and RTOS overlaying a real time
> > clock or any other signs which need to be displayed in real-time.
> > But more than the use-case this is inspired by the fact that we want to be
> > flexible and support in the linux driver whatever partitioning scheme
> > possibilities are there which are supported in hardware and we let user 
> > decide
> > on the partitioning scheme.
> >
> 
> A possible use case here could be if Linux is safer than the other host
> owning a single plane, right? Then in that case the RTOS could fail but
> the display pipeline won't be torn down.
> 
> That is, if your safety tell-tales would be driven by Linux and having
> other OS dislay the GPU-rendered QT based application on another plane.
> 
> But as said, for now that's a theorethical use case since the one you
> mentioned is the opposite.
> 
> []
> 
> >>>
> >>>> It's not just about interrupts, it's also about how your arbitrate
> >>>> between what Linux wants and what the RTOS wants. Like if the RTOS still
> >>>> wants to output something but Linux wants to disable it, how do you
> >>>> reconcile the two?
> >>>>
> >>>
> >>> The scheme involves static partitioning of display resource which are 
> >>> assigned
> >>> compile-time to RTOS and Linux. Here the RTOS firmware is compiled with
> >>> specific ownership/display resources as desired by user and this 
> >>> assignment
> >>> stays intact.
> >>>
> >>> If there is a more complex use-case which requires dynamic
> >>> assignment/arbitration of resources then I agree those require some sort 
> >>> of
> >>> IPC scheme but this is not what we target with these series. This series 
> >>> is
> >>> simply to support static partitioning feature (separate register space,
> >>> separate irq, firewalling support etc) of TI DSS hardware across the 
> >>> multiple
> >>> hosts and there are use-cases too for which this scheme suffices.
> >> 
> >> I think you're right and we have a misunderstanding. My initial
> >> assumption was that it was to prevent the Linux side of sides from
> >> screwing up the output if it was to crash.
> >> 
> >> But it looks like it's not the main point of this series, so could you
> >> share some use-cases you're trying to address?
> >> 
> >
> > The end use-case we have demonstrated right now with this series is a
> > proof-of-concept display cluster use-case where RTOS boots early on MCU core
> > (launched at bootloader stage) and initializes the display (using the global
> > common0 register space and irq) and starts displaying safety tell-tales on 
> > one
> > plane, and once Linux boots up on application processor,
> > Linux (using common1 register space and irq) controls the other plane with 
> > GPU
> > rendering using a QT based application. And yes, we also support the 
> > scenario
> > where Linux crashes but RTOS being the DSS master and in control of DSS 
> > power,
> > clock domain and global register space is not impacted by the crash.
> 
> You mention 2 scenarios but are actually the same? Or did I misunderstand?
> 
> In both cases the RTOS own the display pipeline and Linux can just display
> using a single plane.
> 
> That's why I think that agree with Maxime, that a fwkms could be a simpler
> solution to your use case instead of adding all this complexity to the DSS
> driver. Yes, I understand the HW supports all this flexibility but there's
> no real use case yet (you mentioned that don't even have firmware for this
> single plane owned by the RTOS in the R5F case).
> 
> The DT binding for a fwkms driver would be trivial, in fact maybe we might
> even leverage simpledrm for this case and not require a new driver at all.

I guess you can still do things like pageflipping and maybe use some of
the color/blending hardware? Maybe even have more than one plane
available? fwkms/simpledrm conceptually cannot really support pageflipping
even, so that's a much, much reduced feature set.

That all aside I do think we should limit the support to just the first
case, where linux gets a few pieces assigned to it and is not the DSS
master. From what I'm understanding you could assign entire crtc with
planes and everything to linux, so this shouldn't really constraint
real-world usage?

At least until there's support in firmware for this it's all way too
theoretical, and I agree with Maxime and Javier that there's some serious
design questions about how this kind of static leasing should work with
drm sitting on top.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 0/8] dma-buf: heaps: Support carved-out heaps and ECC related-flags

2024-05-16 Thread Daniel Vetter
On Wed, May 15, 2024 at 11:42:58AM -0700, John Stultz wrote:
> On Wed, May 15, 2024 at 6:57 AM Maxime Ripard  wrote:
> > This series is the follow-up of the discussion that John and I had a few
> > months ago here:
> >
> > https://lore.kernel.org/all/candhncqujn6bh3kxkf65bwitylvqsd9892-xtfdhhqqyrro...@mail.gmail.com/
> >
> > The initial problem we were discussing was that I'm currently working on
> > a platform which has a memory layout with ECC enabled. However, enabling
> > the ECC has a number of drawbacks on that platform: lower performance,
> > increased memory usage, etc. So for things like framebuffers, the
> > trade-off isn't great and thus there's a memory region with ECC disabled
> > to allocate from for such use cases.
> >
> > After a suggestion from John, I chose to start using heap allocations
> > flags to allow for userspace to ask for a particular ECC setup. This is
> > then backed by a new heap type that runs from reserved memory chunks
> > flagged as such, and the existing DT properties to specify the ECC
> > properties.
> >
> > We could also easily extend this mechanism to support more flags, or
> > through a new ioctl to discover which flags a given heap supports.
> 
> Hey! Thanks for sending this along! I'm eager to see more heap related
> work being done upstream.
> 
> The only thing that makes me a bit hesitant, is the introduction of
> allocation flags (as opposed to a uniquely specified/named "ecc"
> heap).
> 
> We did talk about this earlier, and my earlier press that only if the
> ECC flag was general enough to apply to the majority of heaps then it
> makes sense as a flag, and your patch here does apply it to all the
> heaps. So I don't have an objection.
> 
> But it makes me a little nervous to add a new generic allocation flag
> for a feature most hardware doesn't support (yet, at least). So it's
> hard to weigh how common the actual usage will be across all the
> heaps.
> 
> I apologize as my worry is mostly born out of seeing vendors really
> push opaque feature flags in their old ion heaps, so in providing a
> flags argument, it was mostly intended as an escape hatch for
> obviously common attributes. So having the first be something that
> seems reasonable, but isn't actually that common makes me fret some.
> 
> So again, not an objection, just something for folks to stew on to
> make sure this is really the right approach.

Another good reason to go with full heap names instead of opaque flags on
existing heaps is that with the former we can use symlinks in sysfs to
specify heaps, with the latter we need a new idea. We haven't yet gotten
around to implement this anywhere, but it's been in the dma-buf/heap todo
since forever, and I like it as a design approach. So would be a good idea
to not toss it. With that display would have symlinks to cma-ecc and cma,
and rendering maybe cma-ecc, shmem, cma heaps (in priority order) for a
SoC where the display needs contig memory for scanout.

> Another thing to discuss, that I didn't see in your mail: Do we have
> an open-source user of this new flag?

I think one option might be to just start using these internally, but not
sure the dma-api would understand a fallback cadence of allocators (afaik
you can specify specific cma regions already, but that doesn't really
cover the case where you can fall back to pages and iommu to remap to
contig dma space) ... And I don't think abandoning the dma-api for
allocating cma buffers is going to be a popular proposal.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: /sys/kernel/debug/vgaswitcheroo directory missing

2024-05-16 Thread Daniel Vetter
On Wed, May 08, 2024 at 08:38:00PM +0100, Chris Clayton wrote:
> 
> 
> On 08/05/2024 16:54, Daniel Vetter wrote:
> > On Wed, May 08, 2024 at 09:02:02AM +0100, Chris Clayton wrote:
> >> Hi,
> >>
> >> I'm running the latest development kernel - 6.9.0-rc7+ (HEAD is 
> >> dccb07f2914cdab2ac3a5b6c98406f765acab803.)
> >>
> >> As I say in $SUBJECT, the directory /sys/kernel/debug/vgaswitcheroo is 
> >> missing in this release. Perhaps more importantly
> >> unless it is configured to simply blank the screen, when xscreensaver 
> >> kicks in an error message flashes rapidly on and
> >> off complaining that no GL graphics are available. Moreover, if I start 
> >> scribus from qterminal, I see the message
> >> "Inconsistent value (1) for DRI_PRIME. Should be < 1 (GPU devices count). 
> >> Using: 0".
> >>
> >> This same userspace works fine with kernels 6.6.30 and 6.8.9
> >>
> >> lsmod shows that the nouveau module is loaded and lsof shows that 
> >> libdrm_nouveau is loaded for Xorg and a few desktop
> >> applications. However, inspecting the nouveau-related output from dmesg 
> >> reveals:
> >>
> >> [Wed May  8 08:20:07 2024] nouveau: detected PR support, will not use DSM
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: enabling device (0006 -> 
> >> 0007)
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: NVIDIA TU117 (167000a1)
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: bios: version 
> >> 90.17.42.00.36
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: pmu: firmware unavailable
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: fb: 4096 MiB GDDR6
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: sec2(acr): mbox 0007 
> >> 
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: sec2(acr):AHESASC: boot 
> >> failed: -5
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: acr: init failed, -5
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: init failed with -5
> >> [Wed May  8 08:20:07 2024] nouveau: DRM-master::0080: init 
> >> failed with -5
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: DRM-master: Device 
> >> allocation failed: -5
> >> [Wed May  8 08:20:07 2024] nouveau :01:00.0: probe with driver nouveau 
> >> failed with error -5
> >>
> >> With kernel 6.8.9 the equivalent output is :
> >>
> >> Wed May  8 08:51:07 2024] nouveau: detected PR support, will not use DSM
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: enabling device (0006 -> 
> >> 0007)
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: NVIDIA TU117 (167000a1)
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: bios: version 
> >> 90.17.42.00.36
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: pmu: firmware unavailable
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: fb: 4096 MiB GDDR6
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: VRAM: 4096 MiB
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: GART: 536870912 MiB
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: BIT table 'A' not 
> >> found
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: BIT table 'L' not 
> >> found
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: TMDS table version 
> >> 2.0
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: MM: using COPY for 
> >> buffer copies
> >> [Wed May  8 08:51:07 2024] [drm] Initialized nouveau 1.4.0 20120801 for 
> >> :01:00.0 on minor 1
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: [drm] No compatible 
> >> format found
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: [drm] Cannot find any 
> >> crtc or sizes
> >> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: Disabling PCI power 
> >> management to avoid bug
> >>
> >> I've attached the complete dmesg output from 6.9.8-rc7+.
> > 
> > I'm assuming that the working kernel's dmesg shows that the proprietary
> > nvidia driver is loaded, which provides all the services and gl. And now
> > that somehow the nouveau driver loads (but doesn't work correctly for some
> > reason, maybe because the userspace is missing) stuff is on fire.
> > 
> > If this assumption is correct you need to reinstall your nvidia driver
> > stack and bother nvidia with any issues, not upstream.
> > -Sima
> > 
> 
>

Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-16 Thread Daniel Vetter
On Thu, May 09, 2024 at 10:23:16AM +0100, Daniel Stone wrote:
> Hi,
> 
> On Wed, 8 May 2024 at 16:49, Daniel Vetter  wrote:
> > On Wed, May 08, 2024 at 09:38:33AM +0100, Daniel Stone wrote:
> > > Right now, if your platform requires CMA for display, then the app
> > > needs access to the GPU render node and the display node too, in order
> > > to allocate buffers which the compositor can scan out directly. If it
> > > only has access to the render nodes and not the display node, it won't
> > > be able to allocate correctly, so its content will need a composition
> > > pass, i.e. performance penalty for sandboxing. But if it can allocate
> > > correctly, then hey, it can exhaust CMA just like heaps can.
> > >
> > > Personally I think we'd be better off just allowing access and
> > > figuring out cgroups later. It's not like the OOM story is great
> > > generally, and hey, you can get there with just render nodes ...
> >
> > Imo the right fix is to ask the compositor to allocate the buffers in this
> > case, and then maybe have some kind of revoke/purge behaviour on these
> > buffers. Compositor has an actual idea of who's a candidate for direct
> > scanout after all, not the app. Or well at least force migrate the memory
> > from cma to shmem.
> >
> > If you only whack cgroups on this issue you're still stuck in the world
> > where either all apps together can ddos the display or no one can
> > realistically direct scanout.
> 
> Mmm, back to DRI2. I can't say I'm wildly enthused about that, not
> least because a client using GPU/codec/etc for those buffers would
> have to communicate its requirements (alignment etc) forward to the
> compositor in order for the compositor to allocate for it. Obviously
> passing the constraints etc around isn't a solved problem yet, but it
> is at least contained down in clients rather than making it back and
> forth between client and compositor.

I don't think you need the compositor to allocate the buffer from the
requirements, you only need a protocol that a) allocates a buffer of a
given size from a given heap and b) has some kind of revoke provisions so
that the compositor can claw back the memory again when it needs it.
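As a purely hypothetical illustration of points a) and b), such a protocol could be sketched in Wayland XML along these lines (every interface, request and event name here is invented for illustration — this is not an existing protocol):

```xml
<!-- Hypothetical "allocate from heap, compositor may revoke" protocol. -->
<interface name="zwp_heap_allocator_v1" version="1">
  <request name="create_buffer">
    <arg name="id" type="new_id" interface="wl_buffer"/>
    <arg name="heap" type="string" summary="heap name, e.g. a CMA heap"/>
    <arg name="size" type="uint" summary="buffer size in bytes"/>
  </request>
  <event name="revoked">
    <!-- The compositor reclaimed the backing memory (e.g. it needs the
         CMA for its own scanout); the client must reallocate, possibly
         from a different heap, and redraw. -->
    <arg name="buffer" type="object" interface="wl_buffer"/>
  </event>
</interface>
```

The `revoked` event is the part that makes constrained memory like CMA manageable: without it, a buffer handed to a client is pinned until the client cooperates.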

> I'm extremely not-wild about the compositor migrating memory from CMA
> to shmem behind the client's back, and tbh I'm not sure how that would
> even work if the client has it pinned through whatever API it's
> imported into.

Other option is revoke on cma buffers that are allocated by clients, for
the case the compositor needs it.

> Anyway, like Laurent says, if we're deciding that heaps can't be used
> by generic apps (unlike DRM/V4L2/etc), then we need gralloc.

gralloc doesn't really fix this, it's just abstraction around how/where
you allocate?

Anyway the current plan is that we all pretend this issue of CMA-allocated
buffers doesn't exist and we let clients allocate without limits. Given that
we don't even have cgroups to sort out the mess for anything else I
wouldn't worry too much ...
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-16 Thread Daniel Vetter
On Mon, May 13, 2024 at 01:51:23PM +, Simon Ser wrote:
> On Wednesday, May 8th, 2024 at 17:49, Daniel Vetter  wrote:
> 
> > On Wed, May 08, 2024 at 09:38:33AM +0100, Daniel Stone wrote:
> > 
> > > On Wed, 8 May 2024 at 09:33, Daniel Vetter dan...@ffwll.ch wrote:
> > > 
> > > > On Wed, May 08, 2024 at 06:46:53AM +0100, Daniel Stone wrote:
> > > > 
> > > > > That would have the unfortunate side effect of making sandboxed apps
> > > > > less efficient on some platforms, since they wouldn't be able to do
> > > > > direct scanout anymore ...
> > > > 
> > > > I was assuming that everyone goes through pipewire, and ideally that is
> > > > the only one that can even get at these special chardev.
> > > > 
> > > > If pipewire is only for sandboxed apps then yeah this aint great :-/
> > > 
> > > No, PipeWire is fine, I mean graphical apps.
> > > 
> > > Right now, if your platform requires CMA for display, then the app
> > > needs access to the GPU render node and the display node too, in order
> > > to allocate buffers which the compositor can scan out directly. If it
> > > only has access to the render nodes and not the display node, it won't
> > > be able to allocate correctly, so its content will need a composition
> > > pass, i.e. performance penalty for sandboxing. But if it can allocate
> > > correctly, then hey, it can exhaust CMA just like heaps can.
> > > 
> > > Personally I think we'd be better off just allowing access and
> > > figuring out cgroups later. It's not like the OOM story is great
> > > generally, and hey, you can get there with just render nodes ...
> > 
> > Imo the right fix is to ask the compositor to allocate the buffers in this
> > case, and then maybe have some kind of revoke/purge behaviour on these
> > buffers. Compositor has an actual idea of who's a candidate for direct
> > scanout after all, not the app. Or well at least force migrate the memory
> > from cma to shmem.
> > 
> > If you only whack cgroups on this issue you're still stuck in the world
> > where either all apps together can ddos the display or no one can
> > realistically direct scanout.
> > 
> > So yeah on the display side the problem isn't solved either, but we knew
> > that already.
> 
> What makes scanout memory so special?
> 
> The way I see it, any kind of memory will always be a limited resource:
> regular programs can exhaust system memory, as well as GPU VRAM, as well
> as scanout memory. I think we need to have ways to limit/control/arbiter
> the allocations regardless, and I don't think scanout memory should be a
> special case here.

(Long weekend, and I caught a cold)

It's not scanout that's special, it's cma memory that's special. Because
once you've allocated it, it's gone since it cannot be swapped out, and
there's not a lot of it to go around. Which means even if we'd have
cgroups for all the various gpu allocation heaps, you can't use cgroups to
manage cma in a meaningful way:

- You set the cgroup limits so low for apps that it's guaranteed that the
  compositor will always be able to allocate enough scanout memory for
  it's need. That will be low enough that apps can never allocate scanout
  buffers themselves.

- Or you set the limit high enough so that apps can allocate enough, which
  means (as soon as you have more than just one app and not a totally
  bonkers amount of cma) that the compositor might not be able to allocate
  anymore.

It's a kinda shit situation, which is also why you need the compositor to be
able to revoke cma allocations it has handed to clients (like with drm
leases).

Or we just keep the current yolo situation.

For any other memory type than CMA most of the popular drivers at least
implement swapping, which gives you a ton more flexibility in setting up
limits in a way that actually works. But even there we'd need cgroups first
to make sure things don't go wrong too badly in the face of evil apps ...
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 2/2] drm/fourcc.h: Add libcamera to Open Source Waiver

2024-05-09 Thread Daniel Vetter
On Wed, Feb 28, 2024 at 11:22:44AM +0100, Jacopo Mondi wrote:
> The libcamera (www.libcamera.org) project uses the drm/fourcc.h header
> to define its own image formats. Albeit libcamera aims for fully open
> source driver and userspace software stacks, it is licensed with the
> 'GNU L-GPL' license which allows closed source application to link
> against the library.
> 
> Add libcamera to the list projects to which the 'Open Source User
> Waiver' notice applies.
> 
> Signed-off-by: Jacopo Mondi 
> ---
>  include/uapi/drm/drm_fourcc.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
> index 4e6df826946a..beef743ac818 100644
> --- a/include/uapi/drm/drm_fourcc.h
> +++ b/include/uapi/drm/drm_fourcc.h
> @@ -97,6 +97,7 @@ extern "C" {
>   *
>   * - GL
>   * - Vulkan extensions
> + * - libcamera

I think we can bikeshed whether we want to be more specific (like
listing the gl/vk extensions), but imo it's a good start, and it also
totally makes sense to officially list libcamera. On both patches.

Acked-by: Daniel Vetter 

I think we should collect a handful more acks from drm and libcamera folks
and then land this.
-Sima

>   *
>   * and other standards, and hence used both by open source and closed source
>   * driver stacks, the usual requirement for an upstream in-kernel or open 
> source
> -- 
> 2.43.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: /sys/kernel/debug/vgaswitcheroo directory missing

2024-05-08 Thread Daniel Vetter
On Wed, May 08, 2024 at 09:02:02AM +0100, Chris Clayton wrote:
> Hi,
> 
> I'm running the latest development kernel - 6.9.0-rc7+ (HEAD is 
> dccb07f2914cdab2ac3a5b6c98406f765acab803.)
> 
> As I say in $SUBJECT, the directory /sys/kernel/debug/vgaswitcheroo is 
> missing in this release. Perhaps more importantly
> unless it is configured to simply blank the screen, when xscreensaver kicks 
> in an error message flashes rapidly on and
> off complaining that no GL graphics are available. Moreover, if I start 
> scribus from qterminal, I see the message
> "Inconsistent value (1) for DRI_PRIME. Should be < 1 (GPU devices count). 
> Using: 0".
> 
> This same userspace works fine with kernels 6.6.30 and 6.8.9
> 
> lsmod shows that the nouveau module is loaded and lsof shows that 
> libdrm_nouveau is loaded for Xorg and a few desktop
> applications. However, inspecting the nouveau-related output from dmesg 
> reveals:
> 
> [Wed May  8 08:20:07 2024] nouveau: detected PR support, will not use DSM
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: enabling device (0006 -> 
> 0007)
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: NVIDIA TU117 (167000a1)
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: bios: version 90.17.42.00.36
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: pmu: firmware unavailable
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: fb: 4096 MiB GDDR6
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: sec2(acr): mbox 0007 
> 
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: sec2(acr):AHESASC: boot 
> failed: -5
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: acr: init failed, -5
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: init failed with -5
> [Wed May  8 08:20:07 2024] nouveau: DRM-master::0080: init failed 
> with -5
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: DRM-master: Device 
> allocation failed: -5
> [Wed May  8 08:20:07 2024] nouveau :01:00.0: probe with driver nouveau 
> failed with error -5
> 
> With kernel 6.8.9 the equivalent output is :
> 
> Wed May  8 08:51:07 2024] nouveau: detected PR support, will not use DSM
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: enabling device (0006 -> 
> 0007)
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: NVIDIA TU117 (167000a1)
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: bios: version 90.17.42.00.36
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: pmu: firmware unavailable
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: fb: 4096 MiB GDDR6
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: VRAM: 4096 MiB
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: GART: 536870912 MiB
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: BIT table 'A' not found
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: BIT table 'L' not found
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: TMDS table version 2.0
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: MM: using COPY for 
> buffer copies
> [Wed May  8 08:51:07 2024] [drm] Initialized nouveau 1.4.0 20120801 for 
> :01:00.0 on minor 1
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: [drm] No compatible format 
> found
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: [drm] Cannot find any crtc 
> or sizes
> [Wed May  8 08:51:07 2024] nouveau :01:00.0: DRM: Disabling PCI power 
> management to avoid bug
> 
> I've attached the complete dmesg output from 6.9.8-rc7+.

I'm assuming that the working kernel's dmesg shows that the proprietary
nvidia driver is loaded, which provides all the services and gl. And now
that somehow the nouveau driver loads (but doesn't work correctly for some
reason, maybe because the userspace is missing) stuff is on fire.

If this assumption is correct you need to reinstall your nvidia driver
stack and bother nvidia with any issues, not upstream.
-Sima

> 
> Please cc me on any reply as I'm not subscribed.
> 
> Chris
> 

> [Wed May  8 08:20:04 2024] Linux version 6.9.0-rc7+ (chris@laptop) (gcc14 
> (GCC) 14.0.1 20240503 (prerelease), GNU ld (GNU Binutils) 2.42) #283 SMP 
> PREEMPT_DYNAMIC Tue May  7 06:58:55 BST 2024
> [Wed May  8 08:20:04 2024] Command line: BOOT_IMAGE=/boot/vmlinuz-6.9.0-rc7+ 
> ro root=PARTUUID=f927883a-e95c-4cdd-b64e-a0a778216b9f 
> resume=PARTUUID=70ccedc5-d788-42bc-9f13-81e2beb61338 rootfstype=ext4 
> net.ifnames=0 video=1920x1080@60
> [Wed May  8 08:20:04 2024] BIOS-provided physical RAM map:
> [Wed May  8 08:20:04 2024] BIOS-e820: [mem 
> 0x-0x0009efff] usable
> [Wed May  8 08:20:04 2024] BIOS-e820: [mem 
> 0x0009f000-0x000f] reserved
> [Wed May  8 08:20:04 2024] BIOS-e820: [mem 
> 0x0010-0x7e1d8fff] usable
> [Wed May  8 08:20:04 2024] BIOS-e820: [mem 
> 0x7e1d9000-0x7ead8fff] reserved
> [Wed May  8 08:20:04 2024] BIOS-e820: [mem 
> 0x7ead9000-0x8cceefff] usable
> [Wed May  8 08:20:04 2024] BIOS-e820: [mem 
> 0x8ccef000-0x8eedefff] reserved

Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-08 Thread Daniel Vetter
On Wed, May 08, 2024 at 09:38:33AM +0100, Daniel Stone wrote:
> On Wed, 8 May 2024 at 09:33, Daniel Vetter  wrote:
> > On Wed, May 08, 2024 at 06:46:53AM +0100, Daniel Stone wrote:
> > > That would have the unfortunate side effect of making sandboxed apps
> > > less efficient on some platforms, since they wouldn't be able to do
> > > direct scanout anymore ...
> >
> > I was assuming that everyone goes through pipewire, and ideally that is
> > the only one that can even get at these special chardev.
> >
> > If pipewire is only for sandboxed apps then yeah this aint great :-/
> 
> No, PipeWire is fine, I mean graphical apps.
> 
> Right now, if your platform requires CMA for display, then the app
> needs access to the GPU render node and the display node too, in order
> to allocate buffers which the compositor can scan out directly. If it
> only has access to the render nodes and not the display node, it won't
> be able to allocate correctly, so its content will need a composition
> pass, i.e. performance penalty for sandboxing. But if it can allocate
> correctly, then hey, it can exhaust CMA just like heaps can.
> 
> Personally I think we'd be better off just allowing access and
> figuring out cgroups later. It's not like the OOM story is great
> generally, and hey, you can get there with just render nodes ...

Imo the right fix is to ask the compositor to allocate the buffers in this
case, and then maybe have some kind of revoke/purge behaviour on these
buffers. Compositor has an actual idea of who's a candidate for direct
scanout after all, not the app. Or well at least force migrate the memory
from cma to shmem.

If you only whack cgroups on this issue you're still stuck in the world
where either all apps together can ddos the display or no one can
realistically direct scanout.

So yeah on the display side the problem isn't solved either, but we knew
that already.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Linaro-mm-sig] Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

2024-05-08 Thread Daniel Vetter
eak.
> 
> So say we record the fdtable to get ownership of that file descriptor so
> P2 doesn't close anything in (2) that really belongs to P1 to fix that
> problem.
> 
> But afaict, that would break another possible use-case. Namely, where P1
> creates an epoll instance and registeres fds and then fork()s to create
> P2. Now P1 can exit and P2 takes over the epoll loop of P1. This
> wouldn't work anymore because P1 would deregister all fds it owns in
> that epoll instance during exit. I didn't see an immediate nice way of
> fixing that issue.
> 
> But note that taking over an epoll loop from the parent doesn't work
> reliably for some file descriptors. Consider man signalfd(2):
> 
>epoll(7) semantics
>If a process adds (via epoll_ctl(2)) a signalfd file descriptor to an 
> epoll(7) instance,
>then epoll_wait(2) returns events only for signals sent to that 
> process.  In particular,
>if  the process then uses fork(2) to create a child process, then the 
> child will be able
>to read(2) signals that  are  sent  to  it  using  the  signalfd  file 
>  descriptor,  but
>epoll_wait(2)  will  not  indicate  that the signalfd file descriptor 
> is ready.  In this
>scenario, a possible workaround is that after the fork(2), the child 
> process  can  close
>the  signalfd  file descriptor that it inherited from the parent 
> process and then create
>another signalfd file descriptor and add it to the epoll instance.   
> Alternatively,  the
>parent and the child could delay creating their (separate) signalfd 
> file descriptors and
>adding them to the epoll instance until after the call to fork(2).
> 
> So effectively P1 opens a signalfd and registers it in an epoll
> instance. Then it fork()s and creates P2. Now both P1 and P2 call
> epoll_wait(). Since signalfds are always relative to the caller and P1
> did call signalfd_poll() to register the callback only P1 can get
> events. So P2 can't take over signalfds in that epoll loop.
> 
> Honestly, the inheritance semantics of epoll across fork() seem pretty
> wonky and it would've been better if an epoll fd inherited across
> would've returned ESTALE or EINVAL or something. And if that inheritance
> of epoll instances would really be a big use-case there'd be some
> explicit way to enable this.

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH net-next v8 02/14] net: page_pool: create hooks for custom page providers

2024-05-08 Thread Daniel Vetter
On Wed, May 08, 2024 at 12:35:52PM +0100, Pavel Begunkov wrote:
> On 5/8/24 08:16, Daniel Vetter wrote:
> > On Tue, May 07, 2024 at 08:32:47PM -0300, Jason Gunthorpe wrote:
> > > On Tue, May 07, 2024 at 08:35:37PM +0100, Pavel Begunkov wrote:
> > > > On 5/7/24 18:56, Jason Gunthorpe wrote:
> > > > > On Tue, May 07, 2024 at 06:25:52PM +0100, Pavel Begunkov wrote:
> > > > > > On 5/7/24 17:48, Jason Gunthorpe wrote:
> > > > > > > On Tue, May 07, 2024 at 09:42:05AM -0700, Mina Almasry wrote:
> > > > > > > 
> > > > > > > > 1. Align with devmem TCP to use udmabuf for your io_uring 
> > > > > > > > memory. I
> > > > > > > > think in the past you said it's a uapi you don't link but in 
> > > > > > > > the face
> > > > > > > > of this pushback you may want to reconsider.
> > > > > > > 
> > > > > > > dmabuf does not force a uapi, you can acquire your pages however 
> > > > > > > you
> > > > > > > want and wrap them up in a dmabuf. No uapi at all.
> > > > > > > 
> > > > > > > The point is that dmabuf already provides ops that do basically 
> > > > > > > what
> > > > > > > is needed here. We don't need ops calling ops just because 
> > > > > > > dmabuf's
> > > > > > > ops are not understsood or not perfect. Fixup dmabuf.
> > > > > > 
> > > > > > Those ops, for example, are used to efficiently return used buffers
> > > > > > back to the kernel, which is uapi, I don't see how dmabuf can be
> > > > > > fixed up to cover it.
> > > > > 
> > > > > Sure, but that doesn't mean you can't use dma buf for the other parts
> > > > > of the flow. The per-page lifetime is a different topic than the
> > > > > refcounting and access of the entire bulk of memory.
> > > > 
> > > > Ok, so if we're leaving uapi (and ops) and keep per page/sub-buffer as
> > > > is, the rest is resolving uptr -> pages, and passing it to page pool in
> > > > a convenient to page pool format (net_iov).
> > > 
> > > I'm not going to pretend to know about page pool details, but dmabuf
> > > is the way to get the bulk of pages into a pool within the net stack's
> > > allocator and keep that bulk properly refcounted while.
> > > 
> > > An object like dmabuf is needed for the general case because there are
> > > not going to be per-page references or otherwise available.
> > > 
> > > What you seem to want is to alter how the actual allocation flow works
> > > from that bulk of memory and delay the free. It seems like a different
> > > topic to me, and honestly hacking into the allocator free function
> > > seems a bit weird..
> > 
> > Also I don't see how it's an argument against dma-buf as the interface for
> 
> It's not, neither I said it is, but it is an argument against removing
> the network's page pool ops.
> 
> > all these, because e.g. ttm internally does have a page pool because
> > depending upon allocator, that's indeed beneficial. Other drm drivers have
> > more buffer-based concepts for opportunistically memory around, usually
> > by marking buffers that are just kept as cache as purgeable (which is a
> > concept that goes all the way to opengl/vulkan).
> 
> Because in this case it solves nothing and helps with nothing, quite
> the opposite. Just as well we can ask why NVMe doesn't wrap user pages
> into a dmabuf while doing IO.

Because the rules around memory reclaim, gfp nesting and guaranteed
forward progress don't match up for block i/o. I looked quite a bit into
gluing direct i/o into dma-buf because there's vulkan extensions for that,
and it's an absolute mess.
-Sima

> 
> > But these are all internals of the dma-buf exporter, the dma-buf api users
> > don't ever need to care.
> > -Sima
> 
> -- 
> Pavel Begunkov

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/buddy: Fix the range bias clear memory allocation issue

2024-05-08 Thread Daniel Vetter
On Wed, May 08, 2024 at 12:27:20PM +0530, Arunpravin Paneer Selvam wrote:
> Problem statement: During the system boot time, an application request
> for the bulk volume of cleared range bias memory when the clear_avail
> is zero, we dont fallback into normal allocation method as we had an
> unnecessary clear_avail check which prevents the fallback method leads
> to fb allocation failure following system goes into unresponsive state.
> 
> Solution: Remove the unnecessary clear_avail check in the range bias
> allocation function.
> 
> Signed-off-by: Arunpravin Paneer Selvam 
> Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")
> Reviewed-by: Matthew Auld 
> ---
>  drivers/gpu/drm/drm_buddy.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Can you please also add a kunit test case to exercise this corner case and
make sure it stays fixed?

Thanks, Sima
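
For illustration, such a case could look roughly like the sketch below, modelled on the existing tests in drivers/gpu/drm/tests/drm_buddy_test.c; the clear_avail field and the DRM_BUDDY_CLEAR_ALLOCATION flag name are taken from the clear-page series and the exact API should be double-checked against the tree:

```c
/* Sketch of a kunit case for the clear_avail == 0 corner case. */
static void drm_test_buddy_alloc_range_bias_no_clear(struct kunit *test)
{
	const unsigned long size = SZ_4M, ps = SZ_4K;
	struct drm_buddy mm;
	LIST_HEAD(allocated);

	KUNIT_ASSERT_FALSE(test, drm_buddy_init(&mm, size, ps));
	KUNIT_ASSERT_EQ(test, mm.clear_avail, 0);

	/* With clear_avail == 0, a cleared range-bias request must still
	 * fall back to dirty blocks instead of failing outright. */
	KUNIT_ASSERT_FALSE(test,
		drm_buddy_alloc_blocks(&mm, 0, size, size, ps, &allocated,
				       DRM_BUDDY_RANGE_ALLOCATION |
				       DRM_BUDDY_CLEAR_ALLOCATION));

	drm_buddy_free_list(&mm, &allocated, 0);
	drm_buddy_fini(&mm);
}
```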
> 
> diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
> index 284ebae71cc4..831929ac95eb 100644
> --- a/drivers/gpu/drm/drm_buddy.c
> +++ b/drivers/gpu/drm/drm_buddy.c
> @@ -574,7 +574,7 @@ __drm_buddy_alloc_range_bias(struct drm_buddy *mm,
>  
>   block = __alloc_range_bias(mm, start, end, order,
>  flags, fallback);
> - if (IS_ERR(block) && mm->clear_avail)
> + if (IS_ERR(block))
>   return __alloc_range_bias(mm, start, end, order,
> flags, !fallback);
>  
> -- 
> 2.25.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-08 Thread Daniel Vetter
On Tue, May 07, 2024 at 10:59:42PM +0300, Dmitry Baryshkov wrote:
> On Tue, 7 May 2024 at 21:40, Laurent Pinchart
>  wrote:
> >
> > On Tue, May 07, 2024 at 06:19:18PM +0300, Dmitry Baryshkov wrote:
> > > On Tue, 7 May 2024 at 18:15, Bryan O'Donoghue wrote:
> > > > On 07/05/2024 16:09, Dmitry Baryshkov wrote:
> > > > > Ah, I see. Then why do you require the DMA-ble buffer at all? If you 
> > > > > are
> > > > > providing data to VPU or DRM, then you should be able to get the 
> > > > > buffer
> > > > > from the data-consuming device.
> > > >
> > > > Because we don't necessarily know what the consuming device is, if any.
> > > >
> > > > Could be VPU, could be Zoom/Hangouts via pipewire, could for argument
> > > > sake be GPU or DSP.
> > > >
> > > > Also if we introduce a dependency on another device to allocate the
> > > > output buffers - say always taking the output buffer from the GPU, then
> > > > we've added another dependency which is more difficult to guarantee
> > > > across different arches.
> > >
> > > Yes. And it should be expected. It's a consumer who knows the
> > > restrictions on the buffer. As I wrote, Zoom/Hangouts should not
> > > require a DMA buffer at all.
> >
> > Why not ? If you want to capture to a buffer that you then compose on
> > the screen without copying data, dma-buf is the way to go. That's the
> > Linux solution for buffer sharing.
> 
> Yes. But it should be allocated by the DRM driver. As Sima wrote,
> there is no guarantee that the buffer allocated from dma-heaps is
> accessible to the GPU.
> 
> >
> > > Applications should be able to allocate
> > > the buffer out of the generic memory.
> >
> > If applications really want to copy data and degrade performance, they
> > are free to shoot themselves in the foot of course. Applications (or
> > compositors) need to support copying as a fallback in the worst case,
> > but all components should at least aim for the zero-copy case.
> 
> I'd say that they should aim for the optimal case. It might include
> both zero-copying access from another DMA master or simple software
> processing of some kind.
> 
> > > GPUs might also have different
> > > requirements. Consider GPUs with VRAM. It might be beneficial to
> > > allocate a buffer out of VRAM rather than generic DMA mem.
> >
> > Absolutely. For that we need a centralized device memory allocator in
> > userspace. An effort was started by James Jones in 2016, see [1]. It has
> > unfortunately stalled. If I didn't have a camera framework to develop, I
> > would try to tackle that issue :-)
> 
> I'll review the talk. However the fact that the effort has stalled
> most likely means that 'one fits them all' approach didn't really fly
> well. We have too many usecases.

I think there's two reasons:

- It's a really hard problem with many aspects. Where you need to allocate
  the buffer is just one of the myriad of issues a common allocator needs
  to solve.

- Every Linux-based OS has its own solution for these, and the one that
  suffers most has an entirely different one from everyone else: Android
  uses binder services to allow apps to make these allocations, keep track
  of them and make sure there's no abuse. And if there is, it can just
  nuke the app.

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-08 Thread Daniel Vetter
On Tue, May 07, 2024 at 04:07:39PM -0400, Nicolas Dufresne wrote:
> Hi,
> 
> Le mardi 07 mai 2024 à 21:36 +0300, Laurent Pinchart a écrit :
> > Shorter term, we have a problem to solve, and the best option we have
> > found so far is to rely on dma-buf heaps as a backend for the frame
> > buffer allocatro helper in libcamera for the use case described above.
> > This won't work in 100% of the cases, clearly. It's a stop-gap measure
> > until we can do better.
> 
> Considering the security concerned raised on this thread with dmabuf heap
> allocation not be restricted by quotas, you'd get what you want quickly with
> memfd + udmabuf instead (which is accounted already).
> 
> It was raised that distro don't enable udmabuf, but as stated there by Hans, 
> in
> any cases distro needs to take action to make the softISP works. This
> alternative is easy and does not interfere in anyway with your future plan or
> the libcamera API. You could even have both dmabuf heap (for Raspbian) and the
> safer memfd+udmabuf for the distro with security concerns.
> 
> And for the long term plan, we can certainly get closer by fixing that issue
> with accounting. This issue also applied to v4l2 io-ops, so it would be nice 
> to
> find common set of helpers to fix these exporters.

Yeah if this is just for softisp, then memfd + udmabuf is also what I was
about to suggest. Not just as a stopgap, but as the real official thing.

udmabuf does kinda allow you to pin memory, but we can easily fix that by
adding the right accounting and then either let mlock rlimits or cgroups
kernel memory limits enforce good behavior.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-08 Thread Daniel Vetter
On Wed, May 08, 2024 at 06:46:53AM +0100, Daniel Stone wrote:
> Hi,
> 
> On Tue, 7 May 2024 at 12:15, Daniel Vetter  wrote:
> > On Mon, May 06, 2024 at 04:01:42PM +0200, Hans de Goede wrote:
> > > On 5/6/24 3:38 PM, Daniel Vetter wrote:
> > > I agree that bad applications are an issue, but not for the flathub / 
> > > snaps
> > > case. Flatpacks / snaps run sandboxed and don't have access to a full /dev
> > > so those should not be able to open /dev/dma_heap/* independent of
> > > the ACLs on /dev/dma_heap/*. The plan is for cameras using the
> > > libcamera software ISP to always be accessed through pipewire and
> > > the camera portal, so in this case pipewere is taking the place of
> > > the compositor in your kms vs render node example.
> >
> > Yeah essentially if you clarify to "set the permissions such that pipewire
> > can do allocations", then I think that makes sense. And is at the same
> > level as e.g. drm kms giving compositors (but _only_ compositors) special
> > access rights.
> 
> That would have the unfortunate side effect of making sandboxed apps
> less efficient on some platforms, since they wouldn't be able to do
> direct scanout anymore ...

I was assuming that everyone goes through pipewire, and ideally that is
the only one that can even get at these special chardevs.

If pipewire is only for sandboxed apps then yeah this aint great :-/
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Linaro-mm-sig] Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

2024-05-08 Thread Daniel Vetter
On Wed, May 08, 2024 at 07:55:08AM +0200, Christian König wrote:
> Am 07.05.24 um 21:07 schrieb Linus Torvalds:
> > On Tue, 7 May 2024 at 11:04, Daniel Vetter  wrote:
> > > On Tue, May 07, 2024 at 09:46:31AM -0700, Linus Torvalds wrote:
> > > 
> > > > I'd be perfectly ok with adding a generic "FISAME" VFS level ioctl
> > > > too, if this is possibly a more common thing. and not just DRM wants
> > > > it.
> > > > 
> > > > Would something like that work for you?
> > > Yes.
> > > 
> > > Adding Simon and Pekka as two of the usual suspects for this kind of
> > > stuff. Also example code (the int return value is just so that callers know
> > > when kcmp isn't available, they all only care about equality):
> > > 
> > > https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/util/os_file.c#L239
> > That example thing shows that we shouldn't make it a FISAME ioctl - we
> > should make it a fcntl() instead, and it would just be a companion to
> > F_DUPFD.
> > 
> > Doesn't that strike everybody as a *much* cleaner interface? I think
> > F_ISDUP would work very naturally indeed with F_DUPFD.
> > 
> > Yes? No?
> 
> Sounds absolutely sane to me.

Yeah fcntl(fd1, F_ISDUP, fd2); sounds extremely reasonable to me too.

Aside, after some irc discussions I paged a few more of the relevant info
back in, and at least for dma-buf we kinda sorted this out by going away
from the singleton inode in this patch: ed63bb1d1f84 ("dma-buf: give each
buffer a full-fledged inode")

It's uapi now so we can't ever undo that, but with hindsight just the
F_ISDUP is really what we wanted. Because we have no need for that inode
aside from the unique inode number that's only used to compare dma-buf fd
for sameness, e.g.

https://gitlab.freedesktop.org/wlroots/wlroots/-/blob/master/render/vulkan/texture.c#L490

The one question I have is whether this could lead to some exploit tools,
because at least the android conformance test suite verifies that kcmp
isn't available to apps (which is where we need it, because even with all
the binder-based isolation gpu userspace still all run in the application
process due to performance reasons, any ipc at all is just too much).

Otoh if we just add this to drm fd as an ioctl somewhere, then it will
also be available to every android app because they all do need the gpu
for rendering. So going with the full generic fcntl is probably best.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v1 2/2] vfio/pci: Allow MMIO regions to be exported through dma-buf

2024-05-08 Thread Daniel Vetter
On Tue, May 07, 2024 at 09:31:53PM -0300, Jason Gunthorpe wrote:
> On Thu, May 02, 2024 at 07:50:36AM +, Kasireddy, Vivek wrote:
> > Hi Jason,
> > 
> > > 
> > > On Tue, Apr 30, 2024 at 04:24:50PM -0600, Alex Williamson wrote:
> > > > > +static vm_fault_t vfio_pci_dma_buf_fault(struct vm_fault *vmf)
> > > > > +{
> > > > > + struct vm_area_struct *vma = vmf->vma;
> > > > > + struct vfio_pci_dma_buf *priv = vma->vm_private_data;
> > > > > + pgoff_t pgoff = vmf->pgoff;
> > > > > +
> > > > > + if (pgoff >= priv->nr_pages)
> > > > > + return VM_FAULT_SIGBUS;
> > > > > +
> > > > > + return vmf_insert_pfn(vma, vmf->address,
> > > > > +   page_to_pfn(priv->pages[pgoff]));
> > > > > +}
> > > >
> > > > How does this prevent the MMIO space from being mmap'd when disabled
> > > at
> > > > the device?  How is the mmap revoked when the MMIO becomes disabled?
> > > > Is it part of the move protocol?
> > In this case, I think the importers that mmap'd the dmabuf need to be tracked
> > separately and their VMA PTEs need to be zapped when MMIO access is revoked.
> 
> Which, as we know, is quite hard.
> 
> > > Yes, we should not have a mmap handler for dmabuf. vfio memory must be
> > > mmapped in the normal way.
> > Although optional, I think most dmabuf exporters (drm ones) provide a mmap
> > handler. Otherwise, there is no easy way to provide CPU access (backup slow
> > path) to the dmabuf for the importer.
> 
> Here we should not, there is no reason since VFIO already provides a
> mmap mechanism itself. Anything using this API should just call the
> native VFIO function instead of trying to mmap the DMABUF. Yes, it
> will be inconvenient for the scatterlist case you have, but the kernel
> side implementation is much easier ..

Just wanted to confirm that it's entirely legit to not implement dma-buf
mmap. Same for the in-kernel vmap functions. Especially for really funny
buffers like these it's just not a good idea, and the dma-buf interfaces
are intentionally "everything is optional".

Similarly you can (and should) reject any dma_buf_attach to devices where
p2p connectivity isn't there, or well really for any other reason that
makes stuff complicated and is out of scope for your use-case. It's better
to reject strictly rather than accidentally support something really horrible
(we've been there).

The only real rule with all the interfaces is that when attach() worked,
then map must too (except when you're in OOM). Because at least for some
drivers/subsystems, that's how userspace figures out whether a buffer can
be shared.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH net-next v8 02/14] net: page_pool: create hooks for custom page providers

2024-05-08 Thread Daniel Vetter
On Tue, May 07, 2024 at 08:32:47PM -0300, Jason Gunthorpe wrote:
> On Tue, May 07, 2024 at 08:35:37PM +0100, Pavel Begunkov wrote:
> > On 5/7/24 18:56, Jason Gunthorpe wrote:
> > > On Tue, May 07, 2024 at 06:25:52PM +0100, Pavel Begunkov wrote:
> > > > On 5/7/24 17:48, Jason Gunthorpe wrote:
> > > > > On Tue, May 07, 2024 at 09:42:05AM -0700, Mina Almasry wrote:
> > > > > 
> > > > > > 1. Align with devmem TCP to use udmabuf for your io_uring memory. I
> > > > > > think in the past you said it's a uapi you don't like but in the face
> > > > > > of this pushback you may want to reconsider.
> > > > > 
> > > > > dmabuf does not force a uapi, you can acquire your pages however you
> > > > > want and wrap them up in a dmabuf. No uapi at all.
> > > > > 
> > > > > The point is that dmabuf already provides ops that do basically what
> > > > > is needed here. We don't need ops calling ops just because dmabuf's
> > > > > ops are not understood or not perfect. Fixup dmabuf.
> > > > 
> > > > Those ops, for example, are used to efficiently return used buffers
> > > > back to the kernel, which is uapi, I don't see how dmabuf can be
> > > > fixed up to cover it.
> > > 
> > > Sure, but that doesn't mean you can't use dma buf for the other parts
> > > of the flow. The per-page lifetime is a different topic than the
> > > refcounting and access of the entire bulk of memory.
> > 
> > Ok, so if we're leaving uapi (and ops) and keep per page/sub-buffer as
> > is, the rest is resolving uptr -> pages, and passing it to page pool in
> > a convenient to page pool format (net_iov).
> 
> I'm not going to pretend to know about page pool details, but dmabuf
> is the way to get the bulk of pages into a pool within the net stack's
> allocator and keep that bulk properly refcounted while.
> 
> An object like dmabuf is needed for the general case because there are
> not going to be per-page references or otherwise available.
> 
> What you seem to want is to alter how the actual allocation flow works
> from that bulk of memory and delay the free. It seems like a different
> topic to me, and honestly hacking into the allocator free function
> seems a bit weird..

Also I don't see how it's an argument against dma-buf as the interface for
all these, because e.g. ttm internally does have a page pool because
depending upon allocator, that's indeed beneficial. Other drm drivers have
more buffer-based concepts for opportunistically keeping memory around, usually
by marking buffers that are just kept as cache as purgeable (which is a
concept that goes all the way to opengl/vulkan).

But these are all internals of the dma-buf exporter, the dma-buf api users
don't ever need to care.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [Linaro-mm-sig] Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

2024-05-07 Thread Daniel Vetter
On Tue, May 07, 2024 at 09:46:31AM -0700, Linus Torvalds wrote:
> On Tue, 7 May 2024 at 04:03, Daniel Vetter  wrote:
> >
> > It's really annoying that on some distros/builds we don't have that, and
> > for gpu driver stack reasons we _really_ need to know whether a fd is the
> > same as another, due to some messy uniqueness requirements on buffer
> > objects various drivers have.
>
> It's sad that such a simple thing would require two other horrid
> models (EPOLL or KCMP).
>
> There's a reason that KCMP is a config option - *some* of that is
> horrible code - but the "compare file descriptors for equality" is not
> that reason.
>
> Note that KCMP really is a broken mess. It's also a potential security
> hole, even for the simple things, because of how it ends up comparing
> kernel pointers (ie it doesn't just say "same file descriptor", it
> gives an ordering of them, so you can use KCMP to sort things in
> kernel space).
>
> And yes, it orders them after obfuscating the pointer, but it's still
> not something I would consider sane as a baseline interface. It was
> designed for checkpoint-restore, it's the wrong thing to use for some
> "are these file descriptors the same".
>
> The same argument goes for using EPOLL for that. Disgusting hack.
>
> Just what are the requirements for the GPU stack? Is one of the file
> descriptors "trusted", IOW, you know what kind it is?
>
> Because dammit, it's *so* easy to do. You could just add a core DRM
> ioctl for it. Literally just
>
> struct fd f1 = fdget(fd1);
> struct fd f2 = fdget(fd2);
> int same;
>
> same = f1.file && f1.file == f2.file;
> fdput(fd1);
> fdput(fd2);
> return same;
>
> where the only question is if you also would want to deal with O_PATH
> fd's, in which case the "fdget()" would be "fdget_raw()".
>
> Honestly, adding some DRM ioctl for this sounds hacky, but it sounds
> less hacky than relying on EPOLL or KCMP.

Well, in slightly more code (because it's part of the "import this
dma-buf/dma-fence/whatever fd into a driver object" ioctl) this is what we
do.

The issue is that there's generic userspace (like compositors) that sees
these things fly by and would also like to know whether the other side
they receive them from is doing nasty stuff/buggy/evil. And they don't
have access to the device drm fd (since those are a handful of layers away
behind the opengl/vulkan userspace drivers even if the compositor could get
at them, and in some cases not even that).

So if we do this in drm we'd essentially have to create a special
drm_compare_files chardev, put the ioctl there and then tell everyone to
make that thing world-accessible.

Which is just so close to a real syscall that it's offensive, and hey
kcmp does what we want already (but unfortunately also way more). So we
rejected adding that to drm. But we did think about it.

> I'd be perfectly ok with adding a generic "FISAME" VFS level ioctl
> too, if this is possibly a more common thing. and not just DRM wants
> it.
>
> Would something like that work for you?

Yes.

Adding Simon and Pekka as two of the usual suspects for this kind of
stuff. Also example code (the int return value is just so that callers know
when kcmp isn't available, they all only care about equality):

https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/util/os_file.c#L239
-Sima

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-07 Thread Daniel Vetter
On Tue, May 07, 2024 at 04:15:05PM +0100, Bryan O'Donoghue wrote:
> On 07/05/2024 16:09, Dmitry Baryshkov wrote:
> > Ah, I see. Then why do you require the DMA-ble buffer at all? If you are
> > providing data to VPU or DRM, then you should be able to get the buffer
> > from the data-consuming device.
> 
> Because we don't necessarily know what the consuming device is, if any.

Well ... that's an entirely different issue. And it's unsolved.

Currently the approach is to allocate where the constraints are usually
most severe (like display, if you need that, or the camera module for
input) and then just pray the stack works out without too much copying.
All userspace (whether the generic glue or the userspace driver depends a
bit upon the exact api) does need to have a copy fallback for these
sharing cases, ideally with the copying accelerated by hw.

If you try to solve this by just preemptive allocating everything as cma
buffers, then you'll make the situation substantially worse (because now
you're wasting tons of cma memory where you might not even need it).
And without really solving the problem, since for some gpus that memory
might be unusable (because you cannot scan that out on any discrete gpu,
and sometimes not even on an integrated one).
-Sima

> Could be VPU, could be Zoom/Hangouts via pipewire, could for argument's sake
> be GPU or DSP.
> 
> Also if we introduce a dependency on another device to allocate the output
> buffers - say always taking the output buffer from the GPU, then we've added
> another dependency which is more difficult to guarantee across different
> arches.
> 
> ---
> bod

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH net-next v8 02/14] net: page_pool: create hooks for custom page providers

2024-05-07 Thread Daniel Vetter
On Tue, May 07, 2024 at 01:48:38PM -0300, Jason Gunthorpe wrote:
> On Tue, May 07, 2024 at 09:42:05AM -0700, Mina Almasry wrote:
> 
> > 1. Align with devmem TCP to use udmabuf for your io_uring memory. I
> > think in the past you said it's a uapi you don't like but in the face
> > of this pushback you may want to reconsider.
> 
> dmabuf does not force a uapi, you can acquire your pages however you
> want and wrap them up in a dmabuf. No uapi at all.
> 
> The point is that dmabuf already provides ops that do basically what
> is needed here. We don't need ops calling ops just because dmabuf's
> ops are not understood or not perfect. Fixup dmabuf.
> 
> If io_uring wants to take its existing memory pre-registration it can
> wrap that in a dmbauf, and somehow pass it to the netstack. Userspace
> doesn't need to know a dmabuf is being used in the background.

So roughly the current dma-buf design considerations for the users of the
dma-api interfaces:

- It's a memory buffer of fixed length.

- Either that memory is permanently nailed into place with dma_buf_pin
  (and if we add more users than just drm display then we should probably
  figure out the mlock account question for these). For locking hierarchy
  dma_buf_pin uses dma_resv_lock which nests within mmap_sem/vma locks but
  outside of any reclaim/alloc contexts.

- Or the memory is more dynamic, in which case you need to be able to
  dma_resv_lock when you need the memory and make a promise (as a
  dma_fence) that you'll release the memory within finite time and without
  any further allocations once you've unlocked the dma_buf (because
  dma_fence is in GFP_NORECLAIM). That promise can be waiting for memory
  access to finish, but it can also be a pte invalidate+tlb flush, or some
  kind of preemption, or whatever your hw can do really.

  Also, if you do this dynamic model and need to atomically reserve more
  than one dma_buf, you get to do the wait/wound mutex dance, but that's
  really just a bunch of funny looking error handling code and not really
  impacting the overall design or locking hierarchy.

Everything else we can adjust, but I think the above three are not really
changeable or dma-buf becomes unusable for gpu drivers.

Note that exporters of dma-buf can pretty much do whatever they feel like,
including rejecting all the generic interfaces/ops, because we also use
dma-buf as userspace handles for some really special memory.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] dmabuf: fix dmabuf file poll uaf issue

2024-05-07 Thread Daniel Vetter
On Tue, May 07, 2024 at 12:10:07PM +0200, Christian König wrote:
> Am 06.05.24 um 21:04 schrieb T.J. Mercier:
> > On Mon, May 6, 2024 at 2:30 AM Charan Teja Kalla
> >  wrote:
> > > Hi TJ,
> > > 
> > > Seems I have got answers from [1], where it is agreed upon epoll() is
> > > the source of issue.
> > > 
> > > Thanks a lot for the discussion.
> > > 
> > > [1] https://lore.kernel.org/lkml/2d631f0615918...@google.com/
> > > 
> > > Thanks
> > > Charan
> > Oh man, quite a set of threads on this over the weekend. Thanks for the link.
> 
> Yeah and it also has some interesting side conclusion: We should probably
> tell people to stop using DMA-buf with epoll.
> 
> The background is that the mutex approach epoll uses to make files disappear
> from the interest list on close results in the fact that each file can only
> be part of a single epoll at a time.
> 
> Now since DMA-buf is build around the idea that we share the buffer
> representation as file between processes it means that only one process at a
> time can use epoll with each DMA-buf.
> 
> So for example if a window manager uses epoll everything is fine. If a
> client is using epoll everything is fine as well. But if *both* use epoll at
> the same time it won't work.
> 
> This can lead to rather funny and hard to debug combinations of failures and
> I think we need to document this limitation and explicitly point it out.

Ok, I tested this with a small C program, and you're mixing things up.
Here's what I got

- You cannot add a file twice to the same epoll file/fd. So that part is
  correct, and also my understanding from reading the kernel code.

- You can add the same file to two different epoll file instances. Which
  means it's totally fine to use epoll on a dma_buf in different processes
  like both in the compositor and in clients.

- Substantially more entertaining, you can nest epoll instances, and e.g.
  add a 2nd epoll file as an event to the first one. That way you can add
  the same file to both epoll fds, and so end up with the same file
  essentially being added twice to the top-level epoll file. So even
  within one application there's no real issue when e.g. different
  userspace drivers all want to use epoll on the same fd, because you can
  just throw in another level of epoll and it's fine again and you won't
  get an EEXISTS on EPOLL_CTL_ADD.

  But I also don't think we have this issue right now anywhere, since it's
  kind of a general epoll issue that happens with any duplicated file.

So I don't think there's any reason to recommend against using epoll on
dma-buf fd (or sync_file or drm_syncobj or any of the sharing primitives
we have really).
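
A reconstruction of such a test program (not the actual one from this mail,
and using a pipe instead of a dma-buf, since the observable epoll behaviour
is the same for any file):

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Checks the three observations above for an arbitrary fd; returns 0 if
 * they all hold, -1 otherwise. */
int epoll_sharing_experiment(int fd)
{
	struct epoll_event ev = { .events = EPOLLIN };
	int ep1 = epoll_create1(0);
	int ep2 = epoll_create1(0);

	if (ep1 < 0 || ep2 < 0)
		return -1;

	/* 1. the same file can't be added twice to one epoll instance ... */
	if (epoll_ctl(ep1, EPOLL_CTL_ADD, fd, &ev) < 0)
		return -1;
	if (epoll_ctl(ep1, EPOLL_CTL_ADD, fd, &ev) == 0 || errno != EEXIST)
		return -1;

	/* 2. ... but adding it to a second, independent instance is fine,
	 * so e.g. compositor and client can both epoll the same dma-buf */
	if (epoll_ctl(ep2, EPOLL_CTL_ADD, fd, &ev) < 0)
		return -1;

	/* 3. epoll instances nest: ep2 itself becomes an event source of
	 * ep1, so the fd ends up watched twice through the top-level epoll
	 * without ever hitting EEXIST */
	if (epoll_ctl(ep1, EPOLL_CTL_ADD, ep2, &ev) < 0)
		return -1;

	close(ep1);
	close(ep2);
	return 0;
}
```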

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-07 Thread Daniel Vetter
On Mon, May 06, 2024 at 04:01:42PM +0200, Hans de Goede wrote:
> Hi Sima,
> 
> On 5/6/24 3:38 PM, Daniel Vetter wrote:
> > On Mon, May 06, 2024 at 02:05:12PM +0200, Maxime Ripard wrote:
> >> Hi,
> >>
> >> On Mon, May 06, 2024 at 01:49:17PM GMT, Hans de Goede wrote:
> >>> Hi dma-buf maintainers, et.al.,
> >>>
> >>> Various people have been working on making complex/MIPI cameras work OOTB
> >>> with mainline Linux kernels and an opensource userspace stack.
> >>>
> >>> The generic solution adds a software ISP (for Debayering and 3A) to
> >>> libcamera. Libcamera's API guarantees that buffers handed to applications
> >>> using it are dma-bufs so that these can be passed to e.g. a video encoder.
> >>>
> >>> In order to meet this API guarantee the libcamera software ISP allocates
> >>> dma-bufs from userspace through one of the /dev/dma_heap/* heaps. For
> >>> the Fedora COPR repo for the PoC of this:
> >>> https://hansdegoede.dreamwidth.org/28153.html
> >>
> >> For the record, we're also considering using them for ARM KMS devices,
> >> so it would be better if the solution wasn't only considering v4l2
> >> devices.
> >>
> >>> I have added a simple udev rule to give physically present users access
> >>> to the dma_heap-s:
> >>>
> >>> KERNEL=="system", SUBSYSTEM=="dma_heap", TAG+="uaccess"
> >>>
> >>> (and on Rasperry Pi devices any users in the video group get access)
> >>>
> >>> This was just a quick fix for the PoC. Now that we are ready to move out
> >>> of the PoC phase and start actually integrating this into distributions
> >>> the question becomes if this is an acceptable solution; or if we need some
> >>> other way to deal with this ?
> >>>
> >>> Specifically the question is if this will have any negative security
> >>> implications? I can certainly see this being used to do some sort of
> >>> denial of service attack on the system (1). This is especially true for
> >>> the cma heap which generally speaking is a limited resource.
> >>
> >> There's plenty of other ways to exhaust CMA, like allocating too much
> >> KMS or v4l2 buffers. I'm not sure we should consider dma-heaps
> >> differently than those if it's part of our threat model.
> > 
> > So generally for an arm soc where your display needs cma, your render node
> > doesn't. And user applications only have access to the later, while only
> > the compositor gets a kms fd through logind. At least in drm aside from
> > vc4 there's really no render driver that just gives you access to cma and
> > allows you to exhaust that, you need to be a compositor with drm master
> > access to the display.
> > 
> > Which means we're mostly protected against bad applications, and that's
> > not a threat the "user physically sits in front of the machine" model
> > accounts for, and which giving cma access to everyone would open up. And with
> > flathub/snaps/... this is very much an issue.
> 
> I agree that bad applications are an issue, but not for the flathub / snaps
> case. Flatpaks / snaps run sandboxed and don't have access to a full /dev
> so those should not be able to open /dev/dma_heap/* independent of
> the ACLs on /dev/dma_heap/*. The plan is for cameras using the
> libcamera software ISP to always be accessed through pipewire and
> the camera portal, so in this case pipewire is taking the place of
> the compositor in your kms vs render node example.

Yeah essentially if you clarify to "set the permissions such that pipewire
can do allocations", then I think that makes sense. And is at the same
level as e.g. drm kms giving compositors (but _only_ compositors) special
access rights.

> So this reduces the problem to bad apps packaged by regular distributions
> and if any of those misbehave the distros should fix that.
> 
> So I think that for the denial of service side allowing physical
> present users (but not sandboxed apps running as those users) to
> access /dev/dma_heap/* should be ok.
> 
> My bigger worry is if dma_heap (u)dma-bufs can be abused in other
> ways then causing a denial of service.
> 
> I guess that the answer there is that causing other security issues
> should not be possible ?

Well pinned memory exhaustion is a very useful tool to make all kinds of
other kernel issues exploitable. Like if you have that you can weaponize
all kinds of kmalloc error paths (and since it's untracked memory the oom
killer will not get you of these issuees).

I think for the pipewire based desktop it'd be best if you only allow
pipewire to get at an fd for allocating from dma-heaps, kinda like logind
furnishes the kms master fd ... Still has the issue that you can't nuke
these buffers, but that's for another day. But at least from a "limit
attack surface" design pov I think this would be better than just handing
out access to the current user outright. But that's also not the worst
option I guess, as long as snaps/flatpacks only go through the pipewire
service.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] fbdev: Have CONFIG_FB_NOTIFY be tristate

2024-05-07 Thread Daniel Vetter
On Mon, May 06, 2024 at 04:53:47PM +0200, Arnd Bergmann wrote:
> On Mon, May 6, 2024, at 15:14, Daniel Vetter wrote:
> > On Fri, May 03, 2024 at 01:22:10PM -0700, Florian Fainelli wrote:
> >> On 5/3/24 12:45, Arnd Bergmann wrote:
> >> > On Fri, May 3, 2024, at 21:28, Florian Fainelli wrote:
> >> > > Android devices in recovery mode make use of a framebuffer device to
> >> > > provide an user interface. In a GKI configuration that has CONFIG_FB=m,
> >> > > but CONFIG_FB_NOTIFY=y, loading the fb.ko module will fail with:
> >> > > 
> >> > > fb: Unknown symbol fb_notifier_call_chain (err -2)
> >> > > 
> >> > > Have CONFIG_FB_NOTIFY be tristate, just like CONFIG_FB such that both
> >> > > can be loaded as module with fb_notify.ko first, and fb.ko second.
> >> > > 
> >> > > Signed-off-by: Florian Fainelli 
> >> > 
> >> > I see two problems here, so I don't think this is the right
> >> > approach:
> >> > 
> >> >   1. I don't understand your description: Since fb_notifier_call_chain()
> >> >  is an exported symbol, it should work for both FB_NOTIFY=y and
> >> >  FB_NOTIFY=m, unless something in Android drops the exported
> >> >  symbols for some reason.
> >> 
> >> The symbol is still exported in the Android tree. The issue is that the GKI
> >> defconfig does not enable any CONFIG_FB options at all. This is left for SoC
> >> vendors to do in their own "fragment" which would add CONFIG_FB=m. That
> >> implies CONFIG_FB_NOTIFY=y which was not part of the original kernel image
> >> we build/run against, and so we cannot resolve the symbol.
> 
> I see.
> 
> >> This could be resolved by having the GKI kernel have CONFIG_FB_NOTIFY=y but
> >> this is a bit wasteful (not by much since the code is very slim anyway) and
> >> it does require making changes specifically to the Android tree which could
> >> be beneficial upstream, hence my attempt at going upstream first.
> >
> > Making fbdev (the driver subsystem, not the uapi, we'll still happily
> > merge patches for that) more useful is really not the upstream direction :-)
> 
> I'm more worried about the idea of enabling an entire subsystem
> as a loadable module and expecting that to work with an existing
> kernel, specifically when the drm.ko and fb.ko interact with
> one another and are built with different .config files.
> 
> This is the current Android GKI config:
> https://android.googlesource.com/kernel/common.git/+/refs/heads/android-mainline/arch/arm64/configs/gki_defconfig
> where I can see that CONFIG_DRM is built-in, but DRM_FBDEV_EMULATION
> CONFIG_VT, CONFIG_FRAMEBUFFER_CONSOLE, CONFIG_FB_DEVICE and
> CONFIG_FB_CORE are all disabled.
> 
> So the console won't work at all. I think this means that there
> is no way to get the console working, but building a fb.ko module
> to allow using /dev/fb with simplefb.ko (or any other one) happens
> to almost work, but only by dumb luck rather than by design.

So using /dev/fb chardev without fbcon is very much a real idea. This way
you should be able to run old userspace that uses fbdev directly for
drawing, but your console needs are served by a userspace console running
on top of drm.

vt switching gets a bit more entertaining, but I thought logind has all
the glue already to make that happen. Worst case you need a tiny launcher
tool to get your userspace console out of the way while you launch an
fbdev-using application, but I think correctly implementing the vt ioctls to switch
to graphics mode /should/ work automatically.

I do agree that this is only really a good idea with drm drivers, since
those do not rely on any of the fbdev infrastructure like the notifier
under discussion.

> >> > $ git grep -w fb_register_client
> >> > arch/arm/mach-pxa/am200epd.c:   fb_register_client(&am200_fb_notif);
> >> > drivers/leds/trigger/ledtrig-backlight.c:   ret = fb_register_client(&n->notifier);
> >> > drivers/video/backlight/backlight.c:return fb_register_client(&new_bd->fb_notif);
> >> > drivers/video/backlight/lcd.c:  return fb_register_client(&new_ld->fb_notif);
> >> > 
> >> > Somewhat related but not directly addressing your patch, I wonder
> >> > if Android itself could migrate to using FB_CORE=m FB=n FB_NOTIFY=n
> >> > instead and use simpledrm for the console in place of the legacy
> >> > fbdev layer.
> >> 
> >> That is beyond my reach :)
> >
> > This one is. And it do

Re: [Linaro-mm-sig] Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

2024-05-07 Thread Daniel Vetter
On Mon, May 06, 2024 at 04:29:44PM +0200, Christian König wrote:
> Am 04.05.24 um 20:20 schrieb Linus Torvalds:
> > On Sat, 4 May 2024 at 08:32, Linus Torvalds
> >  wrote:
> > > Lookie here, the fundamental issue is that epoll can call '->poll()'
> > > on a file descriptor that is being closed concurrently.
> > Thinking some more about this, and replying to myself...
> > 
> > Actually, I wonder if we could *really* fix this by simply moving the
> > eventpoll_release() to where it really belongs.
> > 
> > If we did it in file_close_fd_locked(),  it would actually make a
> > *lot* more sense. Particularly since eventpoll actually uses this:
> > 
> >  struct epoll_filefd {
> >  struct file *file;
> >  int fd;
> >  } __packed;
> > 
> > ie it doesn't just use the 'struct file *', it uses the 'fd' itself
> > (for ep_find()).
> > 
> > (Strictly speaking, it should also have a pointer to the 'struct
> > files_struct' to make the 'int fd' be meaningful).
> 
> While I completely agree on this I unfortunately have to ruin the idea.
> 
> Before we had KCMP some people relied on the strange behavior of eventpoll
> to compare struct files when the fd is the same.
> 
> I just recently suggested that solution to somebody at AMD as a workaround
> when KCMP is disabled because of security hardening and I'm pretty sure I've
> seen it somewhere else as well.
> 
> So when we change that it would break (undocumented?) UAPI behavior.

Uh extremely aside, but doesn't this mean we should just enable kcmp on
files unconditionally, since there's an alternative? Or at least everywhere
CONFIG_EPOLL is enabled?

It's really annoying that on some distros/builds we don't have that, and
for gpu driver stack reasons we _really_ need to know whether a fd is the
same as another, due to some messy uniqueness requirements on buffer
objects various drivers have.
-Sima

> 
> Regards,
> Christian.
> 
> > 
> > IOW, eventpoll already considers the file _descriptor_ relevant, not
> > just the file pointer, and that's destroyed at *close* time, not at
> > 'fput()' time.
> > 
> > Yeah, yeah, the locking situation in file_close_fd_locked() is a bit
> > inconvenient, but if we can solve that, it would solve the problem in
> > a fundamentally different way: remove the ep iterm before the
> > file->f_count has actually been decremented, so the whole "race with
> > fput()" would just go away entirely.
> > 
> > I dunno. I think that would be the right thing to do, but I wouldn't
> > be surprised if some disgusting eventpoll user then might depend on
> > the current situation where the eventpoll thing stays around even
> > after the close() if you have another copy of the file open.
> > 
> >   Linus
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

2024-05-07 Thread Daniel Vetter
On Mon, May 06, 2024 at 04:46:54PM +0200, Christian Brauner wrote:
> On Mon, May 06, 2024 at 02:47:23PM +0200, Daniel Vetter wrote:
> > On Sun, May 05, 2024 at 01:53:48PM -0700, Linus Torvalds wrote:
> > > On Sun, 5 May 2024 at 13:30, Al Viro  wrote:
> > > >
> > > > 0.  special-cased ->f_count rule for ->poll() is a wart and it's
> > > > better to get rid of it.
> > > >
> > > > 1.  fs/eventpoll.c is a steaming pile of shit and I'd be glad to see
> > > > git rm taken to it.  Short of that, by all means, let's grab reference
> > > > in there around the call of vfs_poll() (see (0)).
> > > 
> > > Agreed on 0/1.
> > > 
> > > > 2.  having ->poll() instances grab extra references to file passed
> > > > to them is not something that should be encouraged; there's a plenty
> > > > of potential problems, and "caller has it pinned, so we are fine with
> > > > grabbing extra refs" is nowhere near enough to eliminate those.
> > > 
> > > So it's not clear why you hate it so much, since those extra
> > > references are totally normal in all the other VFS paths.
> > > 
> > > I mean, they are perhaps not the *common* case, but we have a lot of
> > > random get_file() calls sprinkled around in various places when you
> > > end up passing a file descriptor off to some asynchronous operation
> > > thing.
> > > 
> > > Yeah, I think most of them tend to be special operations (eg the tty
> > > TIOCCONS ioctl to redirect the console), but it's not like vfs_ioctl()
> > > is *that* different from vfs_poll. Different operation, not somehow
> > > "one is more special than the other".
> > > 
> > > cachefiles and backing-file does it for regular IO, and drop it at IO
> > > completion - not that different from what dma-buf does. It's in
> > > ->read_iter() rather than ->poll(), but again: different operations,
> > > but not "one of them is somehow fundamentally different".
> > > 
> > > > 3.  dma-buf uses of get_file() are probably safe (epoll shite 
> > > > aside),
> > > > but they do look fishy.  That has nothing to do with epoll.
> > > 
> > > Now, what dma-buf basically seems to do is to avoid ref-counting its
> > > own fundamental data structure, and replaces that by refcounting the
> > > 'struct file' that *points* to it instead.
> > > 
> > > And it is a bit odd, but it actually makes some amount of sense,
> > > because then what it passes around is that file pointer (and it allows
> > > passing it around from user space *as* that file).
> > > 
> > > And honestly, if you look at why it then needs to add its refcount to
> > > it all, it actually makes sense.  dma-bufs have this notion of
> > > "fences" that are basically completion points for the asynchronous
> > > DMA. Doing a "poll()" operation will add a note to the fence to get
> > > that wakeup when it's done.
> > > 
> > > And yes, logically it takes a ref to the "struct dma_buf", but because
> > > of how the lifetime of the dma_buf is associated with the lifetime of
> > > the 'struct file', that then turns into taking a ref on the file.
> > > 
> > > Unusual? Yes. But not illogical. Not obviously broken. Tying the
> > > lifetime of the dma_buf to the lifetime of a file that is passed along
> > > makes _sense_ for that use.
> > > 
> > > I'm sure dma-bufs could add another level of refcounting on the
> > > 'struct dma_buf' itself, and not make it be 1:1 with the file, but
> > > it's not clear to me what the advantage would really be, or why it
> > > would be wrong to re-use a refcount that is already there.
> > 
> > So there is generally another refcount, because dma_buf is just the
> > cross-driver interface to some kind of real underlying buffer object from
> > the various graphics related subsystems we have.
> > 
> > And since it's a pure file based api thing that ceases to serve any
> > function once the fd/file is gone we tied all the dma_buf refcounting to
> > the refcount struct file already maintains. But the underlying buffer
> > object can easily outlive the dma_buf, and over the lifetime of an
> > underlying buffer object you might actually end up creating different
> > dma_buf api wrappers for it (but at least in drm we guarantee there's at
> > most one, hence why vmwgfx does the atomic_i

Re: [lvc-project] [PATCH] [RFC] dma-buf: fix race condition between poll and close

2024-05-07 Thread Daniel Vetter
On Tue, May 07, 2024 at 11:58:33AM +0200, Christian König wrote:
> On 06.05.24 08:52, Fedor Pchelkin wrote:
> > On Fri, 03. May 14:08, Dmitry Antipov wrote:
> > > On 5/3/24 11:18 AM, Christian König wrote:
> > > 
> > > > Attached is a compile only tested patch, please verify if it fixes your 
> > > > problem.
> > > LGTM, and this is similar to get_file() in __pollwait() and fput() in
> > > free_poll_entry() used in implementation of poll(). Please resubmit to
> > > linux-fsdevel@ including the following:
> > > 
> > > Reported-by: syzbot+5d4cb6b4409edfd18...@syzkaller.appspotmail.com
> > > Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646
> > > Tested-by: Dmitry Antipov 
> > I guess the problem is addressed by commit 4efaa5acf0a1 ("epoll: be better
> > about file lifetimes") which was pushed upstream just before v6.9-rc7.
> > 
> > Link: https://lore.kernel.org/lkml/2d631f0615918...@google.com/
> 
> Yeah, Linus took care of that after convincing Al that this is really a bug.
> 
> The key missing information was that we have a mutex which makes sure that
> fput() blocks for epoll to stop the polling.
> 
> It also means that you should probably re-consider using epoll together with
> shared DMA-bufs. Background is that when both client and display server try
> to use epoll the kernel will return an error because there can only be one
> user of epoll.

I think for dma-buf implicit sync the best option is to use the new fence export
ioctl, which has the added benefit that you get a snapshot and so no funny
livelock issues if someone keeps submitting rendering to a shared buffer.

That aside, why can you not use the same file with multiple epoll files in
different processes? Afaik from a quick look, all the tracking structures
are per epoll file, so both client and compositor using it should work?

I haven't tried, so I just might be extremely blind ...
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: Safety of opening up /dev/dma_heap/* to physically present users (udev uaccess tag) ?

2024-05-06 Thread Daniel Vetter
On Mon, May 06, 2024 at 02:05:12PM +0200, Maxime Ripard wrote:
> Hi,
> 
> On Mon, May 06, 2024 at 01:49:17PM GMT, Hans de Goede wrote:
> > Hi dma-buf maintainers, et.al.,
> > 
> > Various people have been working on making complex/MIPI cameras work OOTB
> > with mainline Linux kernels and an opensource userspace stack.
> > 
> > The generic solution adds a software ISP (for Debayering and 3A) to
> > libcamera. Libcamera's API guarantees that buffers handed to applications
> > using it are dma-bufs so that these can be passed to e.g. a video encoder.
> > 
> > In order to meet this API guarantee the libcamera software ISP allocates
> > dma-bufs from userspace through one of the /dev/dma_heap/* heaps. For
> > the Fedora COPR repo for the PoC of this:
> > https://hansdegoede.dreamwidth.org/28153.html
> 
> For the record, we're also considering using them for ARM KMS devices,
> so it would be better if the solution wasn't only considering v4l2
> devices.
> 
> > I have added a simple udev rule to give physically present users access
> > to the dma_heap-s:
> > 
> > KERNEL=="system", SUBSYSTEM=="dma_heap", TAG+="uaccess"
> > 
> > (and on Rasperry Pi devices any users in the video group get access)
> > 
> > This was just a quick fix for the PoC. Now that we are ready to move out
> > of the PoC phase and start actually integrating this into distributions
> > the question becomes if this is an acceptable solution; or if we need some
> > other way to deal with this ?
> > 
> > Specifically the question is if this will have any negative security
> > implications? I can certainly see this being used to do some sort of
> > denial of service attack on the system (1). This is especially true for
> > the cma heap which generally speaking is a limited resource.
> 
> There's plenty of other ways to exhaust CMA, like allocating too much
> KMS or v4l2 buffers. I'm not sure we should consider dma-heaps
> differently than those if it's part of our threat model.

So generally for an arm soc where your display needs cma, your render node
doesn't. And user applications only have access to the latter, while only
the compositor gets a kms fd through logind. At least in drm aside from
vc4 there's really no render driver that just gives you access to cma and
allows you to exhaust that, you need to be a compositor with drm master
access to the display.

Which means we're mostly protected against bad applications, and that's
not a threat the "user physically sits in front of the machine" model accounts
for, but one which giving cma access to everyone would open up. And with
flathub/snaps/... this is very much an issue.

So you need more, either:

- cgroups limits on dma-buf and dma-buf heaps. This has been bikeshedded
  for years and is just not really moving.

- An allocator service which checks whether you're allowed to allocate
  these special buffers. Android does that through binder.

Probably also some way to nuke applications that refuse to release buffers
when they're no longer the right application. cgroups is a lot more
convenient for that.
-Sima

> > But devices tagged for uaccess are only opened up to users who are
> > physically present behind the machine and those can just hit
> > the powerbutton, so I don't believe that any *on purpose* DOS is part of
> > the threat model.
> 
> How would that work for headless devices?
> 
> Maxime



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] fbdev: Have CONFIG_FB_NOTIFY be tristate

2024-05-06 Thread Daniel Vetter
On Fri, May 03, 2024 at 01:22:10PM -0700, Florian Fainelli wrote:
> On 5/3/24 12:45, Arnd Bergmann wrote:
> > On Fri, May 3, 2024, at 21:28, Florian Fainelli wrote:
> > > Android devices in recovery mode make use of a framebuffer device to
> > > provide an user interface. In a GKI configuration that has CONFIG_FB=m,
> > > but CONFIG_FB_NOTIFY=y, loading the fb.ko module will fail with:
> > > 
> > > fb: Unknown symbol fb_notifier_call_chain (err -2)
> > > 
> > > Have CONFIG_FB_NOTIFY be tristate, just like CONFIG_FB such that both
> > > can be loaded as module with fb_notify.ko first, and fb.ko second.
> > > 
> > > Signed-off-by: Florian Fainelli 
> > 
> > I see two problems here, so I don't think this is the right
> > approach:
> > 
> >   1. I don't understand your description: Since fb_notifier_call_chain()
> >  is an exported symbol, it should work for both FB_NOTIFY=y and
> >  FB_NOTIFY=m, unless something in Android drops the exported
> >  symbols for some reason.
> 
> The symbol is still exported in the Android tree. The issue is that the GKI
> defconfig does not enable any CONFIG_FB options at all. This is left for SoC
> vendors to do in their own "fragment" which would add CONFIG_FB=m. That
> implies CONFIG_FB_NOTIFY=y which was not part of the original kernel image
> we build/run against, and so we cannot resolve the symbol.
> 
> This could be resolved by having the GKI kernel have CONFIG_FB_NOTIFY=y but
> this is a bit wasteful (not by much since the code is very slim anyway) and
> it does require making changes specifically to the Android tree which could
> be beneficial upstream, hence my attempt at going upstream first.

Making fbdev (the driver subsystem, not the uapi, we'll still happily
merge patches for that) more useful is really not the upstream direction :-)

> IMHO it makes sense for all subsystem supporting code to be completely
> modular or completely built-in, or at least allowed to be.
> 
> > 
> >   2. This breaks if any of the four callers of fb_register_client()
> >  are built-in while CONFIG_FB_NOTIFY is set to =m:
> 
> Ah good point, I missed that part, thanks, adding "select FB_NOTIFY" ought
> to be enough for those.
> 
> > 
> > $ git grep -w fb_register_client
> > arch/arm/mach-pxa/am200epd.c:   fb_register_client(_fb_notif);
> > drivers/leds/trigger/ledtrig-backlight.c:   ret = 
> > fb_register_client(>notifier);
> > drivers/video/backlight/backlight.c:return 
> > fb_register_client(>fb_notif);
> > drivers/video/backlight/lcd.c:  return fb_register_client(>fb_notif);
> > 
> > Somewhat related but not directly addressing your patch, I wonder
> > if Android itself could migrate to using FB_CORE=m FB=n FB_NOTIFY=n
> > instead and use simpledrm for the console in place of the legacy
> > fbdev layer.
> 
> That is beyond my reach :)

This one is. And it doesn't need to be simpledrm, just a drm kms driver
with fbdev emulation. Heck even if you have an fbdev driver you should
control the corresponding backlight explicitly, and not rely on the fb
notifier to magically enable/disable some random backlights somewhere.

So please do not encourage using this in any modern code.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 06/23] drm/xe/svm: Introduce a helper to build sg table from hmm range

2024-05-06 Thread Daniel Vetter
On Sat, May 04, 2024 at 11:03:03AM +1000, Dave Airlie wrote:
> > Let me know if this understanding is correct.
> >
> > Or what would you like to do in such situation?
> >
> > >
> > > Not sure how it is really a good idea.
> > >
> > > Adaptive locality of memory is still an unsolved problem in Linux,
> > > sadly.
> > >
> > > > > However, the migration stuff should really not be in the driver
> > > > > either. That should be core DRM logic to manage that. It is so
> > > > > convoluted and full of policy that all the drivers should be working
> > > > > in the same way.
> > > >
> > > > Completely agreed. Moving migration infrastructures to DRM is part
> > > > of our plan. We want to first prove the concept with the xekmd driver,
> > > > then move helpers and infrastructures to DRM. The driver should be as easy
> > > > as implementing a few callback functions for device specific page
> > > > table programming and device migration, and calling some DRM common
> > > > functions during gpu page fault.
> > >
> > > You'd be better to start out this way so people can look at and
> > > understand the core code on its own merits.
> >
> > The two-step approach was agreed with DRM maintainers, see here:
> > https://lore.kernel.org/dri-devel/sa1pr11mb6991045cc69ec8e1c576a71592...@sa1pr11mb6991.namprd11.prod.outlook.com/,
> >  bullet 4)
> 
> After this discussion and the other cross-device HMM stuff I think we
> should probably push more for common up-front, I think doing this in a
> driver without considering the bigger picture might not end up
> extractable, and then I fear the developers will just move onto other
> things due to management pressure to land features over correctness.
> 
> I think we have enough people on the list that can review this stuff,
> and even if the common code ends up being a little xe specific,
> iterating it will be easier outside the driver, as we can clearly
> demark what is inside and outside.

tldr; Yeah concurring.

I think like with the gpu vma stuff we should at least aim for the core
data structures, and more importantly, the locking design and how it
interacts with core mm services to be common code.

I read through amdkfd and I think that one is warning enough that this
area is one of these cases where going with common code aggressively is
much better. Because it will be buggy in terribly "how do we get out of
this design corner again ever?" ways no matter what. But with common code
there will at least be all of dri-devel and hopefully some mm folks
involved in sorting things out.

Most other areas it's indeed better to explore the design space with a few
drivers before going with common code, at the cost of having some really
terrible driver code in upstream. But here the cost of some really bad
design in drivers is just too expensive imo.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] drm/vmwgfx: Re-introduce drm_crtc_helper_funcs::prepare

2024-05-06 Thread Daniel Vetter
On Fri, May 03, 2024 at 10:46:40PM -0400, Zack Rusin wrote:
> On Fri, May 3, 2024 at 6:29 PM Ian Forbes  wrote:
> >
> > This function was removed in the referenced fixes commit and caused a
> > regression. This is because the presence of this function, even though it
> > is a noop, changes the behaviour of disable_outputs in
> > drm_atomic_helper.c:1211.
> >
> > Fixes: 7b0062036c3b ("drm/vmwgfx: Implement virtual crc generation")
> > Signed-off-by: Ian Forbes 
> > ---
> >  drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c 
> > b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
> > index 2041c4d48daa..37223f95cbec 100644
> > --- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
> > +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
> > @@ -409,6 +409,10 @@ static void vmw_stdu_crtc_mode_set_nofb(struct 
> > drm_crtc *crtc)
> >   crtc->x, crtc->y);
> >  }
> >
> > +static void vmw_stdu_crtc_helper_prepare(struct drm_crtc *crtc)
> > +{
> > +}
> > +
> >  static void vmw_stdu_crtc_atomic_disable(struct drm_crtc *crtc,
> >  struct drm_atomic_state *state)
> >  {
> > @@ -1463,6 +1467,7 @@ drm_plane_helper_funcs 
> > vmw_stdu_primary_plane_helper_funcs = {
> >  };
> >
> >  static const struct drm_crtc_helper_funcs vmw_stdu_crtc_helper_funcs = {
> > +   .prepare = vmw_stdu_crtc_helper_prepare,
> > .mode_set_nofb = vmw_stdu_crtc_mode_set_nofb,
> > .atomic_check = vmw_du_crtc_atomic_check,
> > .atomic_begin = vmw_du_crtc_atomic_begin,
> > --
> > 2.34.1
> >
> 
> Thanks, but that doesn't look correct. We do want to make sure the
> drm_crtc_vblank_off is actually called when outputs are disabled.
> Since this is my regression it's perfectly fine if you want to hand it
> off to me and work on something else. In general you always want to
> understand what the patch that you're sending is doing before sending
> it. In this case it's pretty trivial, the commit you mention says that
> it fixes kms_pipe_crc_basic and if you run it with your patch (e.g.
> sudo ./kms_pipe_crc_basic --run-subtest disable-crc-after-crtc) you
> should notice:

Rather aside, but atomic helpers supporting the ->prepare callback in that
special way is kinda a remnant of the conversion helpers to move legacy
kms drivers to atomic.

Which means we might want to look into whether anyone still needs that, or
whether the ->atomic_disable hook is enough and then nuke that if-ladder
of compat code. Because as this bug shows, it's a rather surprising special
case :-/
-Sima

> May 03 22:25:05 fedora.local kernel: [ cut here ]
> May 03 22:25:05 fedora.local kernel: driver forgot to call 
> drm_crtc_vblank_off()
> May 03 22:25:05 fedora.local kernel: WARNING: CPU: 2 PID: 2204 at
> drivers/gpu/drm/drm_atomic_helper.c:1232 disable_outputs+0x345/0x350
> May 03 22:25:05 fedora.local kernel: Modules linked in: snd_seq_dummy
> snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast
> nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
> nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables
> snd_seq_midi snd_seq_midi_event qrtr vsock_loopback
> vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock
> sunrpc binfmt_misc snd_ens1371 snd_ac97_codec ac97_bus snd_seq
> intel_rapl_msr snd_pcm intel_rapl_common vmw_balloon
> intel_uncore_frequency_common isst_if_mbox_msr isst_if_common gameport
> snd_rawmidi snd_seq_device rapl snd_timer snd vmw_vmci pcspkr
> soundcore i2c_piix4 pktcdvd joydev loop nfnetlink zram vmwgfx
> crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni
> polyval_generic nvme ghash_clmulni_intel nvme_core sha512_ssse3
> sha256_ssse3 sha1_ssse3 drm_ttm_helper ttm nvme_auth vmxnet3 serio_raw
> ata_generic pata_acpi fuse i2c_dev
> May 03 22:25:05 fedora.local kernel: CPU: 2 PID: 2204 Comm:
> kms_pipe_crc_ba Not tainted 6.9.0-rc2-vmwgfx #5
> May 03 22:25:05 fedora.local kernel: Hardware name: VMware, Inc.
> VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00
> 11/12/2020
> May 03 22:25:05 fedora.local kernel: RIP: 0010:disable_outputs+0x345/0x350
> ... but in most cases it's not going to be so trivial. Whether you
> decide to work on this one yourself or hand it off to me - we don't
> want to trade bug for bug here, but fix both of those things.
> 
> z

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] Documentation/gpu: Document the situation with unqualified drm-memory-

2024-05-06 Thread Daniel Vetter
On Fri, May 03, 2024 at 06:06:03PM +0100, Tvrtko Ursulin wrote:
> 
> On 03/05/2024 16:58, Alex Deucher wrote:
> > On Fri, May 3, 2024 at 11:33 AM Daniel Vetter  wrote:
> > > 
> > > On Fri, May 03, 2024 at 01:58:38PM +0100, Tvrtko Ursulin wrote:
> > > > 
> > > > [And I forgot dri-devel.. doing well!]
> > > > 
> > > > On 03/05/2024 13:40, Tvrtko Ursulin wrote:
> > > > > 
> > > > > [Correcting Christian's email]
> > > > > 
> > > > > On 03/05/2024 13:36, Tvrtko Ursulin wrote:
> > > > > > From: Tvrtko Ursulin 
> > > > > > 
> > > > > > Currently it is not well defined what is drm-memory- compared to 
> > > > > > other
> > > > > > categories.
> > > > > > 
> > > > > > In practice the only driver which emits these keys is amdgpu and in 
> > > > > > them
> > > > > > exposes the total memory use (including shared).
> > > > > > 
> > > > > > Document that drm-memory- and drm-total-memory- are aliases to
> > > > > > prevent any
> > > > > > confusion in the future.
> > > > > > 
> > > > > > While at it also clarify that the reserved sub-string 'memory' 
> > > > > > refers to
> > > > > > the memory region component.
> > > > > > 
> > > > > > Signed-off-by: Tvrtko Ursulin 
> > > > > > Cc: Alex Deucher 
> > > > > > Cc: Christian König 
> > > > > 
> > > > > Mea culpa, I copied the mistake from
> > > > > 77d17c4cd0bf52eacfad88e63e8932eb45d643c5. :)
> > > > > 
> > > > > Regards,
> > > > > 
> > > > > Tvrtko
> > > > > 
> > > > > > Cc: Rob Clark 
> > > > > > ---
> > > > > >Documentation/gpu/drm-usage-stats.rst | 10 +-
> > > > > >1 file changed, 9 insertions(+), 1 deletion(-)
> > > > > > 
> > > > > > diff --git a/Documentation/gpu/drm-usage-stats.rst
> > > > > > b/Documentation/gpu/drm-usage-stats.rst
> > > > > > index 6dc299343b48..ef5c0a0aa477 100644
> > > > > > --- a/Documentation/gpu/drm-usage-stats.rst
> > > > > > +++ b/Documentation/gpu/drm-usage-stats.rst
> > > > > > @@ -128,7 +128,9 @@ Memory
> > > > > >Each possible memory type which can be used to store buffer
> > > > > > objects by the
> > > > > >GPU in question shall be given a stable and unique name to be
> > > > > > returned as the
> > > > > > -string here.  The name "memory" is reserved to refer to normal
> > > > > > system memory.
> > > > > > +string here.
> > > > > > +
> > > > > > +The region name "memory" is reserved to refer to normal system 
> > > > > > memory.
> > > > > >Value shall reflect the amount of storage currently consumed by
> > > > > > the buffer
> > > > > >objects belong to this client, in the respective memory region.
> > > > > > @@ -136,6 +138,9 @@ objects belong to this client, in the respective
> > > > > > memory region.
> > > > > >Default unit shall be bytes with optional unit specifiers of 
> > > > > > 'KiB'
> > > > > > or 'MiB'
> > > > > >indicating kibi- or mebi-bytes.
> > > > > > +This is an alias for drm-total- and only one of the two
> > > > > > should be
> > > > > > +present.
> > > 
> > > This feels a bit awkward and seems to needlessly complicate fdinfo uapi.
> > > 
> > > - Could we just patch amdgpu to follow everyone else, and avoid the
> > >special case? If there's no tool that relies on the special amdgpu
> > >prefix then that would be a lot easier.
> > > 
> > > - If that's not on the table, could we make everyone (with a suitable
> > >helper or something) just print both variants, so that we again have
> > >consistent fdinfo output? Or does that break a different set of existing
> > >tools?
> > > 
> > > - Finally maybe could we get away with fixing amd by adding the common
> > >format there, deprecating the old, fixing

Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

2024-05-06 Thread Daniel Vetter
On Sun, May 05, 2024 at 01:53:48PM -0700, Linus Torvalds wrote:
> On Sun, 5 May 2024 at 13:30, Al Viro  wrote:
> >
> > 0.  special-cased ->f_count rule for ->poll() is a wart and it's
> > better to get rid of it.
> >
> > 1.  fs/eventpoll.c is a steaming pile of shit and I'd be glad to see
> > git rm taken to it.  Short of that, by all means, let's grab reference
> > in there around the call of vfs_poll() (see (0)).
> 
> Agreed on 0/1.
> 
> > 2.  having ->poll() instances grab extra references to file passed
> > to them is not something that should be encouraged; there's a plenty
> > of potential problems, and "caller has it pinned, so we are fine with
> > grabbing extra refs" is nowhere near enough to eliminate those.
> 
> So it's not clear why you hate it so much, since those extra
> references are totally normal in all the other VFS paths.
> 
> I mean, they are perhaps not the *common* case, but we have a lot of
> random get_file() calls sprinkled around in various places when you
> end up passing a file descriptor off to some asynchronous operation
> thing.
> 
> Yeah, I think most of them tend to be special operations (eg the tty
> TIOCCONS ioctl to redirect the console), but it's not like vfs_ioctl()
> is *that* different from vfs_poll. Different operation, not somehow
> "one is more special than the other".
> 
> cachefiles and backing-file does it for regular IO, and drop it at IO
> completion - not that different from what dma-buf does. It's in
> ->read_iter() rather than ->poll(), but again: different operations,
> but not "one of them is somehow fundamentally different".
> 
> > 3.  dma-buf uses of get_file() are probably safe (epoll shite aside),
> > but they do look fishy.  That has nothing to do with epoll.
> 
> Now, what dma-buf basically seems to do is to avoid ref-counting its
> own fundamental data structure, and replaces that by refcounting the
> 'struct file' that *points* to it instead.
> 
> And it is a bit odd, but it actually makes some amount of sense,
> because then what it passes around is that file pointer (and it allows
> passing it around from user space *as* that file).
> 
> And honestly, if you look at why it then needs to add its refcount to
> it all, it actually makes sense.  dma-bufs have this notion of
> "fences" that are basically completion points for the asynchronous
> DMA. Doing a "poll()" operation will add a note to the fence to get
> that wakeup when it's done.
> 
> And yes, logically it takes a ref to the "struct dma_buf", but because
> of how the lifetime of the dma_buf is associated with the lifetime of
> the 'struct file', that then turns into taking a ref on the file.
> 
> Unusual? Yes. But not illogical. Not obviously broken. Tying the
> lifetime of the dma_buf to the lifetime of a file that is passed along
> makes _sense_ for that use.
> 
> I'm sure dma-bufs could add another level of refcounting on the
> 'struct dma_buf' itself, and not make it be 1:1 with the file, but
> it's not clear to me what the advantage would really be, or why it
> would be wrong to re-use a refcount that is already there.

So there is generally another refcount, because dma_buf is just the
cross-driver interface to some kind of real underlying buffer object from
the various graphics related subsystems we have.

And since it's a pure file based api thing that ceases to serve any
function once the fd/file is gone we tied all the dma_buf refcounting to
the refcount struct file already maintains. But the underlying buffer
object can easily outlive the dma_buf, and over the lifetime of an
underlying buffer object you might actually end up creating different
dma_buf api wrappers for it (but at least in drm we guarantee there's at
most one, hence why vmwgfx does the atomic_inc_unless_zero trick, which I
don't particularly like and isn't really needed).

But we could add another refcount, it just means we have 3 of those then
when only really 2 are needed.

Also maybe here too: dma_fence are bounded like other disk i/o (including
the option of timeouts if things go very wrong), so it's very much not
forever but at most a few seconds worst case (shit hw/driver excluded, as
usual).
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

2024-05-06 Thread Daniel Vetter
On Fri, May 03, 2024 at 04:41:19PM -0700, Linus Torvalds wrote:
> On Fri, 3 May 2024 at 16:23, Kees Cook  wrote:
> >
> > static bool __must_check get_dma_buf_unless_doomed(struct dma_buf *dmabuf)
> > {
> > return atomic_long_inc_not_zero(&dmabuf->file->f_count) != 0L;
> > }
> >
> > If we end up adding epi_fget(), we'll have 2 cases of using
> > "atomic_long_inc_not_zero" for f_count. Do we need some kind of blessed
> > helper to live in file.h or something, with appropriate comments?
> 
> I wonder if we could try to abstract this out a bit more.
> 
> These games with non-ref-counted file structures *feel* a bit like the
> games we play with non-ref-counted (aka "stashed") 'struct dentry'
> that got fairly recently cleaned up with path_from_stashed() when both
> nsfs and pidfs started doing the same thing.
> 
> I'm not loving the TTM use of this thing, but at least the locking and
> logic feels a lot more straightforward (ie the
> atomic_long_inc_not_zero() here is clearly under the 'prime->mutex'
> lock

The one in vmwgfx isn't really needed (I think at least), because all
other drivers that use gem or ttm use the dma_buf export cache in
drm/drm_prime.c, which is protected by a bog standard mutex.

vmwgfx is unfortunately special in a lot of ways due to somewhat parallel
dev history. So there might be an uapi reason why the weak reference is
required. I suspect because vmwgfx is reinventing a lot of its own wheels
it can't play the same tricks as gem_prime.c, which hooks into a few core
drm cleanup/release functions.

tldr; drm really has no architectural need for a get_file_unless_doomed,
and I certainly don't want to spread it further than the vmwgfx
historical special case that was added in 2013.
-Sima

> IOW, the tty use looks correct to me, and it has fairly simple locking
> and is just catching the the race between 'fput()' decrementing the
> refcount and and 'file->f_op->release()' doing the actual release.
> 
> You are right that it's similar to the epoll thing in that sense, it
> just looks a _lot_ more straightforward to me (and, unlike epoll,
> doesn't look actively buggy right now).
> 
> Could we abstract out this kind of "stashed file pointer" so that we'd
> have a *common* form for this? Not just the inc_not_zero part, but the
> locking rule too?
> 
>   Linus

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: get_file() unsafe under epoll (was Re: [syzbot] [fs?] [io-uring?] general protection fault in __ep_remove)

2024-05-06 Thread Daniel Vetter
On Fri, May 03, 2024 at 10:53:03PM +0100, Al Viro wrote:
> On Fri, May 03, 2024 at 02:42:22PM -0700, Linus Torvalds wrote:
> > On Fri, 3 May 2024 at 14:36, Al Viro  wrote:
> > >
> > > ... the last part is no-go - poll_wait() must be able to grab a reference
> > > (well, the callback in it must)
> > 
> > Yeah. I really think that *poll* itself is doing everything right. It
> > knows that it's called with a file pointer with a reference, and it
> > adds its own references as needed.
> 
> Not really.  Note that select's __pollwait() does *NOT* leave a reference
> at the mercy of driver - it's stuck into poll_table_entry->filp and
> the poll_freewait() knows how to take those out.
> 
> 
> dmabuf does something very different - it grabs the damn thing into
> its private data structures and for all we know it could keep it for
> a few hours, until some even materializes.

dma_fence must complete in reasonable amount of time, where "reasonable"
is roughly in line with other i/o (including the option that there's
timeouts if the hw's gone busted).

So definitely not hours (aside from driver bugs when things go really
wrong ofc), but more like a few seconds in a worst case scenario.
-Sima
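The bounded-wait rule above can be sketched with a userspace analogy. This is purely illustrative (a toy `ToyFence` class, not the kernel dma_fence API): the point is that every wait carries a timeout so a hung fence surfaces as an error instead of blocking forever.

```python
import threading

# Analogy only, not kernel code: a dma_fence-like object whose waits
# are always bounded, mirroring the "must complete in reasonable time"
# rule described above.
class ToyFence:
    def __init__(self):
        self._done = threading.Event()

    def signal(self):
        # Corresponds to the hw/driver signalling fence completion.
        self._done.set()

    def wait(self, timeout_s=2.0):
        # Like dma_fence_wait_timeout(): never block unboundedly.
        if not self._done.wait(timeout_s):
            raise TimeoutError("bounded wait expired (hw hung?)")
        return True

f = ToyFence()
threading.Timer(0.01, f.signal).start()  # fence signals shortly after
assert f.wait(timeout_s=1.0) is True
```

The design point is that the timeout path exists even in the happy case, so drivers with broken hardware degrade into a timeout instead of an unkillable wait.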


Re: [PATCH] Documentation/gpu: Document the situation with unqualified drm-memory-

2024-05-03 Thread Daniel Vetter
On Fri, May 03, 2024 at 01:58:38PM +0100, Tvrtko Ursulin wrote:
> 
> [And I forgot dri-devel.. doing well!]
> 
> On 03/05/2024 13:40, Tvrtko Ursulin wrote:
> > 
> > [Correcting Christian's email]
> > 
> > On 03/05/2024 13:36, Tvrtko Ursulin wrote:
> > > From: Tvrtko Ursulin 
> > > 
> > > Currently it is not well defined what is drm-memory- compared to other
> > > categories.
> > > 
> > > In practice the only driver which emits these keys is amdgpu and in them
> > > exposes the total memory use (including shared).
> > > 
> > > Document that drm-memory- and drm-total-memory- are aliases to
> > > prevent any
> > > confusion in the future.
> > > 
> > > While at it also clarify that the reserved sub-string 'memory' refers to
> > > the memory region component.
> > > 
> > > Signed-off-by: Tvrtko Ursulin 
> > > Cc: Alex Deucher 
> > > Cc: Christian König 
> > 
> > Mea culpa, I copied the mistake from
> > 77d17c4cd0bf52eacfad88e63e8932eb45d643c5. :)
> > 
> > Regards,
> > 
> > Tvrtko
> > 
> > > Cc: Rob Clark 
> > > ---
> > >   Documentation/gpu/drm-usage-stats.rst | 10 +-
> > >   1 file changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/Documentation/gpu/drm-usage-stats.rst
> > > b/Documentation/gpu/drm-usage-stats.rst
> > > index 6dc299343b48..ef5c0a0aa477 100644
> > > --- a/Documentation/gpu/drm-usage-stats.rst
> > > +++ b/Documentation/gpu/drm-usage-stats.rst
> > > @@ -128,7 +128,9 @@ Memory
> > >   Each possible memory type which can be used to store buffer
> > > objects by the
> > >   GPU in question shall be given a stable and unique name to be
> > > returned as the
> > > -string here.  The name "memory" is reserved to refer to normal
> > > system memory.
> > > +string here.
> > > +
> > > +The region name "memory" is reserved to refer to normal system memory.
> > >   Value shall reflect the amount of storage currently consumed by
> > > the buffer
> > >   objects belong to this client, in the respective memory region.
> > > @@ -136,6 +138,9 @@ objects belong to this client, in the respective
> > > memory region.
> > >   Default unit shall be bytes with optional unit specifiers of 'KiB'
> > > or 'MiB'
> > >   indicating kibi- or mebi-bytes.
> > > +This is an alias for drm-total- and only one of the two
> > > should be
> > > +present.

This feels a bit awkward and seems to needlessly complicate fdinfo uapi.

- Could we just patch amdgpu to follow everyone else, and avoid the
  special case? If there's no tool that relies on the special amdgpu
  prefix then that would be a lot easier.

- If that's not on the table, could we make everyone (with a suitable
  helper or something) just print both variants, so that we again have
  consistent fdinfo output? Or does that break a different set of existing
  tools?

- Finally, maybe we could get away with fixing amdgpu by adding the common
  format there, deprecating the old one, fixing the tools that would break
  and then, if we're lucky, removing the old keys from amdgpu in a year or
  so?

Uapi that's "either do $foo or on this one driver, do $bar" is just
guaranteed to fragment the ecosystem, so imo that should be the absolute
last resort.
-Sima
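To make the aliasing cost concrete, here is a sketch of what every fdinfo consumer ends up doing under the "either key" scheme. The `drm-memory-<region>` / `drm-total-<region>` key names come from drm-usage-stats.rst; the parser itself is a hypothetical toy, not any existing tool.

```python
# Toy fdinfo parser illustrating the drm-memory-/drm-total- aliasing
# discussed above. Consumers must fold both key prefixes into one value
# per region, because amdgpu emits one spelling and other drivers the
# other.

UNITS = {"": 1, "KiB": 1024, "MiB": 1024 * 1024}

def parse_total(fdinfo_text):
    """Return {region: bytes}, treating drm-memory-<r> and
    drm-total-<r> as aliases."""
    totals = {}
    for line in fdinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        for prefix in ("drm-memory-", "drm-total-"):
            if key.startswith(prefix):
                region = key[len(prefix):]
                amount, _, unit = value.strip().partition(" ")
                totals[region] = int(amount) * UNITS.get(unit, 1)
    return totals

# amdgpu-style output and the common style should parse identically:
amdgpu_style = "drm-memory-vram:\t12345 KiB"
common_style = "drm-total-vram:\t12345 KiB"
assert parse_total(amdgpu_style) == parse_total(common_style)
```

Every tool carrying this two-prefix loop forever is exactly the fragmentation the reply argues against.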

> > > +
> > >   - drm-shared-:  [KiB|MiB]
> > >   The total size of buffers that are shared with another file (e.g.,
> > > have more
> > > @@ -145,6 +150,9 @@ than a single handle).
> > >   The total size of buffers that including shared and private memory.
> > > +This is an alias for drm-memory- and only one of the two
> > > should be
> > > +present.
> > > +
> > >   - drm-resident-:  [KiB|MiB]
> > >   The total size of buffers that are resident in the specified region.



Re: [PATCH v2 1/1] drm: Add ioctl for querying a DRM device's list of open client PIDs

2024-05-02 Thread Daniel Vetter
de_list_lessees_ioctl, 
> DRM_MASTER),
>   DRM_IOCTL_DEF(DRM_IOCTL_MODE_GET_LEASE, drm_mode_get_lease_ioctl, 
> DRM_MASTER),
>   DRM_IOCTL_DEF(DRM_IOCTL_MODE_REVOKE_LEASE, drm_mode_revoke_lease_ioctl, 
> DRM_MASTER),
> +
> + DRM_IOCTL_DEF(DRM_IOCTL_GET_CLIENTS, drm_getclients, DRM_RENDER_ALLOW),

Uh, RENDER_ALLOW sounds like a very bad idea for this; it flat out leaks
information that really paranoid people don't want to have leaked.

The natural approach would be to limit this to ptrace ability, but ptrace
is for processes and fds aren't tied to that. So I'm not sure ...

I think a separate file (whether in procfs or a special chardev) where you
can set suitable access rights feels a lot better for this. Plus putting
it into procfs would also give you the natural property that you can only
read it if you have access to procfs anyway, so imo that feels like the
best place for this ...

And yes that means some lkml bikeshedding with procfs folks, but I do
think that's good here since we're likely not the only ones who need a bit
faster procfs trawling for some special use-cases. So consistency across
subsystems would be nice to at least try to aim for.
-Sima
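For reference, the procfs-side approach the reply points at boils down to classifying fdinfo blobs. A sketch of that helper (the `drm-driver` / `drm-client-id` keys are the documented fdinfo ones; the function itself is hypothetical):

```python
# Sketch of the procfs alternative: deciding from an fdinfo blob whether
# an fd belongs to a DRM client, and which one. In real use this would
# be run over every /proc/<pid>/fdinfo/<fd>, which is the slow scan that
# motivates wanting a faster interface in the first place.

def drm_client_info(fdinfo_text):
    """Return (driver, client_id) if this fdinfo describes a DRM fd,
    else None."""
    fields = {}
    for line in fdinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    if "drm-driver" not in fields:
        return None
    return fields["drm-driver"], int(fields.get("drm-client-id", -1))

sample = "pos:\t0\ndrm-driver:\tamdgpu\ndrm-client-id:\t42"
assert drm_client_info(sample) == ("amdgpu", 42)
assert drm_client_info("pos:\t0") is None
```

Since the parsing itself is trivial, the win of a dedicated file would be avoiding the full /proc walk, not the per-fd work.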

>  };
>  
>  #define DRM_CORE_IOCTL_COUNT ARRAY_SIZE(drm_ioctls)
> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
> index 16122819edfe..c47aa9de51ab 100644
> --- a/include/uapi/drm/drm.h
> +++ b/include/uapi/drm/drm.h
> @@ -1024,6 +1024,11 @@ struct drm_crtc_queue_sequence {
>   __u64 user_data;/* user data passed to event */
>  };
>  
> +struct drm_get_clients {
> + __u64 user_data;
> + __kernel_size_t len;
> +};
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> @@ -1236,6 +1241,8 @@ extern "C" {
>  #define DRM_IOCTL_SYNCOBJ_TRANSFER   DRM_IOWR(0xCC, struct 
> drm_syncobj_transfer)
>  #define DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNALDRM_IOWR(0xCD, struct 
> drm_syncobj_timeline_array)
>  
> +#define DRM_IOCTL_GET_CLIENTSDRM_IOWR(0xD1, struct 
> drm_get_clients)
> +
>  /**
>   * DRM_IOCTL_MODE_GETFB2 - Get framebuffer metadata.
>   *
> -- 
> 2.44.0
> 



Re: [PATCH 06/23] drm/xe/svm: Introduce a helper to build sg table from hmm range

2024-05-02 Thread Daniel Vetter
On Tue, Apr 30, 2024 at 09:09:15PM -0300, Jason Gunthorpe wrote:
> On Tue, Apr 30, 2024 at 08:57:48PM +0200, Daniel Vetter wrote:
> > On Tue, Apr 30, 2024 at 02:30:02PM -0300, Jason Gunthorpe wrote:
> > > On Mon, Apr 29, 2024 at 10:25:48AM +0200, Thomas Hellström wrote:
> > > 
> > > > > Yes there is another common scheme where you bind a window of CPU to
> > > > > a
> > > > > window on the device and mirror a fixed range, but this is a quite
> > > > > different thing. It is not SVA, it has a fixed range, and it is
> > > > > probably bound to a single GPU VMA in a multi-VMA device page table.
> > > > 
> > > > And this above here is exactly what we're implementing, and the GPU
> > > > page-tables are populated using device faults. Regions (large) of the
> > > > mirrored CPU mm need to coexist in the same GPU vm as traditional GPU
> > > > buffer objects.
> > > 
> > > Well, not really, if that was the case you'd have a single VMA over
> > > the entire bound range, not dynamically create them.
> > > 
> > > A single VMA that uses hmm_range_fault() to populate the VM is
> > > completely logical.
> > > 
> > > Having a hidden range of mm binding and then creating/destroying 2M
> > > VMAs dynamicaly is the thing that doesn't make alot of sense.
> > 
> > I only noticed this thread now but fyi I did dig around in the
> > implementation and it's summarily an absolute no-go imo for multiple
> > reasons. It starts with this approach of trying to mirror cpu vma (which I
> > think originated from amdkfd) leading to all kinds of locking fun, and
> > then it gets substantially worse when you dig into the details.
> 
> :(
> 
> Why does the DRM side struggle so much with hmm_range fault? I would
> have thought it should have a fairly straightforward and logical
> connection to the GPU page table.

Short summary is that traditionally gpu memory was managed with buffer
objects, and each individual buffer object owns the page tables for its
va range.

For hmm you don't have that buffer object, and you want the pagetables to
be fairly independent (maybe even with their own locking like linux cpu
pagetables do) from any mapping/backing storage. Getting to that world is
a lot of reshuffling, and so thus far all the code went with the quick
hack route of creating ad-hoc ranges that look like buffer objects to the
rest of the driver code. This includes the merged amdkfd hmm code, and if
you dig around in that it results in some really annoying locking
inversions because that middle layer of fake buffer object lookalikes only
gets in the way and results in a fairly fundamental impedance mismatch
with core linux mm locking.

> FWIW, it does make sense to have both a window and a full MM option
> for hmm_range_fault. ODP does both and it is fine..
> 
> > I think until something more solid shows up you can just ignore this. I do
> > fully agree that for sva the main mirroring primitive needs to be page
> > centric, so dma_map_sg. 
>   ^^
> 
> dma_map_page

Oops yes.

> > There's a bit a question around how to make the
> > necessary batching efficient and the locking/mmu_interval_notifier scale
> > enough, but I had some long chats with Thomas and I think there's enough
> > option to spawn pretty much any possible upstream consensus. So I'm not
> > worried.
> 
> Sure, the new DMA API will bring some more considerations to this as
> well. ODP uses a 512M granual scheme and it seems to be OK. By far the
> worst part of all this is the faulting performance. I've yet hear any
> complains about mmu notifier performance..

Yeah I don't expect there to be any need for performance improvements on
the mmu notifier side of things. All the concerns I've heard felt rather
theoretical, or were just fallout of that fake buffer object layer in the
middle.

At worst I guess the gpu pagetables need per-pgtable locking like the cpu
pagetables have, and then maybe keep track of mmu notifier sequence
numbers on a per-pgtable basis, so that invalidates and faults on
different va ranges have no impact on each another. But even that is most
likely way, way down the road.

> > But first this needs to be page-centric in the fundamental mirroring
> > approach.
> 
> Yes

Ok clarifying consensus on this was the main reason I replied, it felt a
bit like the thread was derailing in details that don't yet matter.

Thanks, Sima


Re: [PATCH] drm/sysfs: Add drm class-wide attribute to get active device clients

2024-05-01 Thread Daniel Vetter
s))
> - return PTR_ERR(drm_class);
> + err = class_register(&drm_class);
> + if (err)
> + return err;
>  
> - err = class_create_file(drm_class, &class_attr_version.attr);
> + err = class_create_file(&drm_class, &class_attr_version.attr);
>   if (err) {
> - class_destroy(drm_class);
> - drm_class = NULL;
> + class_destroy(&drm_class);
>   return err;
>   }
>  
> - drm_class->devnode = drm_devnode;
> + drm_class.devnode = drm_devnode;
> +
> + drm_class_initialised = true;
>  
>   drm_sysfs_acpi_register();
>   return 0;
> @@ -166,12 +221,12 @@ int drm_sysfs_init(void)
>   */
>  void drm_sysfs_destroy(void)
>  {
> - if (IS_ERR_OR_NULL(drm_class))
> + if (!drm_class_initialised)
>   return;
>   drm_sysfs_acpi_unregister();
> - class_remove_file(drm_class, &class_attr_version.attr);
> - class_destroy(drm_class);
> - drm_class = NULL;
> + class_remove_file(&drm_class, &class_attr_version.attr);
> + class_destroy(&drm_class);
> + drm_class_initialised = false;
>  }
>  
>  static void drm_sysfs_release(struct device *dev)
> @@ -372,7 +427,7 @@ int drm_sysfs_connector_add(struct drm_connector 
> *connector)
>   return -ENOMEM;
>  
>   device_initialize(kdev);
> - kdev->class = drm_class;
> + kdev->class = &drm_class;
> kdev->type = &drm_sysfs_device_connector;
>   kdev->parent = dev->primary->kdev;
>   kdev->groups = connector_dev_groups;
> @@ -550,7 +605,7 @@ struct device *drm_sysfs_minor_alloc(struct drm_minor 
> *minor)
>   minor_str = "card%d";
>  
>   kdev->devt = MKDEV(DRM_MAJOR, minor->index);
> - kdev->class = drm_class;
> + kdev->class = &drm_class;
> kdev->type = &drm_sysfs_device_minor;
>   }
>  
> @@ -579,10 +634,10 @@ struct device *drm_sysfs_minor_alloc(struct drm_minor 
> *minor)
>   */
>  int drm_class_device_register(struct device *dev)
>  {
> - if (!drm_class || IS_ERR(drm_class))
> + if (!drm_class_initialised)
>   return -ENOENT;
>  
> - dev->class = drm_class;
> + dev->class = &drm_class;
>   return device_register(dev);
>  }
>  EXPORT_SYMBOL_GPL(drm_class_device_register);
> 
> base-commit: 45c734fdd43db1025910b4c59dd2b8be714a
> -- 
> 2.44.0
> 



Re: [PATCH 06/23] drm/xe/svm: Introduce a helper to build sg table from hmm range

2024-04-30 Thread Daniel Vetter
On Tue, Apr 30, 2024 at 02:30:02PM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 29, 2024 at 10:25:48AM +0200, Thomas Hellström wrote:
> 
> > > Yes there is another common scheme where you bind a window of CPU to
> > > a
> > > window on the device and mirror a fixed range, but this is a quite
> > > different thing. It is not SVA, it has a fixed range, and it is
> > > probably bound to a single GPU VMA in a multi-VMA device page table.
> > 
> > And this above here is exactly what we're implementing, and the GPU
> > page-tables are populated using device faults. Regions (large) of the
> > mirrored CPU mm need to coexist in the same GPU vm as traditional GPU
> > buffer objects.
> 
> Well, not really, if that was the case you'd have a single VMA over
> the entire bound range, not dynamically create them.
> 
> A single VMA that uses hmm_range_fault() to populate the VM is
> completely logical.
> 
> Having a hidden range of mm binding and then creating/destroying 2M
> VMAs dynamicaly is the thing that doesn't make alot of sense.

I only noticed this thread now but fyi I did dig around in the
implementation and it's summarily an absolute no-go imo for multiple
reasons. It starts with this approach of trying to mirror cpu vma (which I
think originated from amdkfd) leading to all kinds of locking fun, and
then it gets substantially worse when you dig into the details.

I think until something more solid shows up you can just ignore this. I do
fully agree that for sva the main mirroring primitive needs to be page
centric, so dma_map_sg. There's a bit a question around how to make the
necessary batching efficient and the locking/mmu_interval_notifier scale
enough, but I had some long chats with Thomas and I think there's enough
option to spawn pretty much any possible upstream consensus. So I'm not
worried.

But first this needs to be page-centric in the fundamental mirroring
approach.
-Sima


Re: [PATCH] drm: deprecate driver date

2024-04-30 Thread Daniel Vetter
On Tue, Apr 30, 2024 at 04:38:55PM +0300, Jani Nikula wrote:
> On Tue, 30 Apr 2024, Daniel Vetter  wrote:
> > Might also be a good idea to wait a bit in case there's any regression
> > reports for really old userspace. But I guess there's not a high chance
> > for that to happen here, so imo fine to just go ahead right away.
> 
> This small bit is definitely easier to revert if needed than the whole
> shebang.

So the reason I'm not super worried is that most likely it's going to be
an old driver for a specific .ko that falls over. So as long as you split
it up per-driver it should still be a fairly minimal revert. But entirely
up to you.
-Sima


drmm vs devm (was Re: [PATCH 2/8] drm/xe: covert sysfs over to devm)

2024-04-30 Thread Daniel Vetter
Adding dri-devel because this is kinda more important.

On Mon, Apr 29, 2024 at 04:28:42PM -0500, Lucas De Marchi wrote:
> On Mon, Apr 29, 2024 at 02:45:26PM GMT, Rodrigo Vivi wrote:
> > On Mon, Apr 29, 2024 at 04:17:54PM +0100, Matthew Auld wrote:
> > > On 29/04/2024 14:52, Lucas De Marchi wrote:
> > > > On Mon, Apr 29, 2024 at 09:28:00AM GMT, Rodrigo Vivi wrote:
> > > > > On Mon, Apr 29, 2024 at 01:14:38PM +0100, Matthew Auld wrote:
> > > > > > Hotunplugging the device seems to result in stuff like:
> > > > > >
> > > > > > kobject_add_internal failed for tile0 with -EEXIST, don't try to
> > > > > > register things with the same name in the same directory.
> > > > > >
> > > > > > We only remove the sysfs as part of drmm, however that is tied to 
> > > > > > the
> > > > > > lifetime of the driver instance and not the device underneath. 
> > > > > > Attempt
> > > > > > to fix by using devm for all of the remaining sysfs stuff related 
> > > > > > to the
> > > > > > device.
> > > > >
> > > > > hmmm... so basically we should use the drmm only for the global module
> > > > > stuff and the devm for things that are per device?
> > > >
> > > > that doesn't make much sense. drmm is supposed to run when the driver
> > > > unbinds from the device... basically when all refcounts are gone with
> > > > drm_dev_put().  Are we keeping a ref we shouldn't?
> > > 
> > > It's run when all refcounts are dropped for that particular drm_device, 
> > > but
> > > that is separate from the physical device underneath (struct device). For
> > > example if something has an open driver fd the drmm release action is not
> > > going to be called until after that is also closed. But in the meantime we
> > > might have already removed the pci device and re-attached it to a newly
> > > allocated drm_device/xe_driver instance, like with hotunplug.
> > > 
> > > For example, currently we don't even call basic stuff like guc_fini() etc.
> > > when removing the pci device, but rather when the drm_device is released,
> > > which sounds quite broken.
> > > 
> > > So roughly drmm is for drm_device software level stuff and devm is for 
> > > stuff
> > > that needs to happen when removing the device. See also the doc for drmm:
> > > https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/gpu/drm/drm_managed.c#L23
> > > 
> > > Also: https://docs.kernel.org/gpu/drm-uapi.html#device-hot-unplug
> 
> yeah... I think you convinced me

So rule of thumb:

- devm is for hardware stuff, so things like removing pci mmaps, releasing
  interrupt handlers, cleaning up anything hw related. Because after devm
  runs at the respective driver unbind, all that stuff is gone, _even_ when
  you hold onto a struct device reference. Because all that struct device
  reference guarantees is that the software structure stays around as a
  valid memory reference.

- devm is also for removing uapi. Unfortunately we're not quite at the world
  where devm_drm_dev_register is possible, because on the unload side that
  must be done first, and there's still a few things drivers need to do
  after that which isn't fully devm/drmm-ified.

- drmm is for anything software related, so data structures and stuff like
  that. If you have a devm_kmalloc, you very, very likely have a bug. This
  is were you tear down all your software datastructures, which means if
  you have that interleaved with the hw teardown in e.g. guc_fini you have
  some serious work cut out for you. drmm stuff is tied to the drm_device
  lifetime as the core drm uapi interface thing which might stick around
  for much longer than the drm_dev_unregister.

- Finally, when going from the sw side to hw side you must wrap such
  access with drm_dev_enter/exit, or you have races. This is also where
  using drmm and devm for everything really helps, because it gives you a
  very strong hint when you're going from the sw world to the hw world.

  As an example, all the callbacks on the various kms objects are in the
  sw world (so need to be cleaned up with drmm), but the moment you access
  hw (e.g. any mmio) you need to protect that with a drm_dev_enter/exit

Using devm for everything means you have a use-after-free on the sw side,
otoh using drmm for everything means you have a use-after-free on the hw
side (like a physical hotunplug might reallocate your mmio range to another
thunderbolt device that has been plugged in meanwhile).

It's definitely big time fun all around :-/

Oh also, please help improve the docs on this stuff
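The ordering rules above can be condensed into a userspace toy model. To be clear, this is an analogy only, not kernel code: two LIFO cleanup stacks, where the devm stack runs at driver unbind and the drmm stack only when the last drm_device reference drops, which can be much later.

```python
# Toy model of devm vs drmm cleanup ordering. devm actions (hw teardown)
# run at unbind; drmm actions (sw teardown) run only when the refcount
# on the drm_device analogue hits zero.

class ToyDevice:
    def __init__(self):
        self.devm = []   # hw-related cleanup, LIFO at unbind
        self.drmm = []   # sw-related cleanup, LIFO at final put
        self.refs = 1
        self.log = []

    def devm_add_action(self, fn): self.devm.append(fn)
    def drmm_add_action(self, fn): self.drmm.append(fn)

    def unbind(self):
        # Physical device is going away: tear down hw state right now.
        for fn in reversed(self.devm):
            self.log.append(fn())

    def get(self): self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            for fn in reversed(self.drmm):
                self.log.append(fn())

dev = ToyDevice()
dev.devm_add_action(lambda: "free irq")       # hw side
dev.drmm_add_action(lambda: "free sw state")  # sw side

dev.get()      # userspace still holds an open fd
dev.unbind()   # hotunplug: hw cleanup runs immediately
dev.put()      # fd closed ...
dev.put()      # ... last reference gone, sw cleanup runs now
assert dev.log == ["free irq", "free sw state"]
```

The window between `unbind()` and the final `put()` is where any sw-to-hw access must be guarded, which is what drm_dev_enter/exit provides in the real code.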

Re: [PATCH] drm: deprecate driver date

2024-04-30 Thread Daniel Vetter
On Mon, Apr 29, 2024 at 08:53:15PM +0300, Jani Nikula wrote:
> On Mon, 29 Apr 2024, Hamza Mahfooz  wrote:
> > On 4/29/24 12:43, Jani Nikula wrote:
> >> The driver date serves no useful purpose, because it's hardly ever
> >> updated. The information is misleading at best.
> >> 
> >> As described in Documentation/gpu/drm-internals.rst:
> >> 
> >>The driver date, formatted as MMDD, is meant to identify the date
> >>of the latest modification to the driver. However, as most drivers
> >>fail to update it, its value is mostly useless. The DRM core prints it
> >>to the kernel log at initialization time and passes it to userspace
> >>through the DRM_IOCTL_VERSION ioctl.
> >> 
> >> Stop printing the driver date at init, and start returning the empty
> >> string "" as driver date through the DRM_IOCTL_VERSION ioctl.
> >> 
> >> The driver date initialization in drivers and the struct drm_driver date
> >> member can be removed in follow-up.
> >> 
> >> Signed-off-by: Jani Nikula 
> >
> > I would prefer if it was dropped entirely in this patch, but if you feel
> > that would require too much back and forth, I'm okay with what is
> > currently proposed.
> 
> I can if that's what people prefer, but decided to start with this for
> the inevitable discussion before putting in the effort. ;)

Might also be a good idea to wait a bit in case there's any regression
reports for really old userspace. But I guess there's not a high chance
for that to happen here, so imo fine to just go ahead right away.
-Sima

> 
> > Reviewed-by: Hamza Mahfooz 
> 
> Thanks,
> Jani.
> 
> 
> -- 
> Jani Nikula, Intel



Re: [PATCH] lib/fonts: Allow to select fonts for drm_panic

2024-04-30 Thread Daniel Vetter
On Fri, Apr 19, 2024 at 03:20:19PM +0200, Jocelyn Falempe wrote:
> drm_panic has been introduced recently, and uses the same fonts as
> FRAMEBUFFER_CONSOLE.
> 
> Signed-off-by: Jocelyn Falempe 

Acked-by: Daniel Vetter 

lib/fonts/ doesn't seem to have a designated maintainer, so please push
this through drm-misc.

Thanks, Sima
> ---
>  lib/fonts/Kconfig | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/fonts/Kconfig b/lib/fonts/Kconfig
> index 7e945fdcbf11..befcb463f738 100644
> --- a/lib/fonts/Kconfig
> +++ b/lib/fonts/Kconfig
> @@ -10,7 +10,7 @@ if FONT_SUPPORT
>  
>  config FONTS
>   bool "Select compiled-in fonts"
> - depends on FRAMEBUFFER_CONSOLE || STI_CONSOLE
> + depends on FRAMEBUFFER_CONSOLE || STI_CONSOLE || DRM_PANIC
>   help
> Say Y here if you would like to use fonts other than the default
> your frame buffer console usually use.
> @@ -23,7 +23,7 @@ config FONTS
>  
>  config FONT_8x8
>   bool "VGA 8x8 font" if FONTS
> - depends on FRAMEBUFFER_CONSOLE || STI_CONSOLE
> + depends on FRAMEBUFFER_CONSOLE || STI_CONSOLE || DRM_PANIC
>   default y if !SPARC && !FONTS
>   help
> This is the "high resolution" font for the VGA frame buffer (the one
> @@ -46,7 +46,7 @@ config FONT_8x16
>  
>  config FONT_6x11
>   bool "Mac console 6x11 font (not supported by all drivers)" if FONTS
> - depends on FRAMEBUFFER_CONSOLE || STI_CONSOLE
> + depends on FRAMEBUFFER_CONSOLE || STI_CONSOLE || DRM_PANIC
>   default y if !SPARC && !FONTS && MAC
>   help
> Small console font with Macintosh-style high-half glyphs.  Some Mac
> @@ -54,7 +54,7 @@ config FONT_6x11
>  
>  config FONT_7x14
>   bool "console 7x14 font (not supported by all drivers)" if FONTS
> - depends on FRAMEBUFFER_CONSOLE
> + depends on FRAMEBUFFER_CONSOLE || DRM_PANIC
>   help
> Console font with characters just a bit smaller than the default.
> If the standard 8x16 font is a little too big for you, say Y.
> @@ -62,7 +62,7 @@ config FONT_7x14
>  
>  config FONT_PEARL_8x8
>   bool "Pearl (old m68k) console 8x8 font" if FONTS
> - depends on FRAMEBUFFER_CONSOLE
> + depends on FRAMEBUFFER_CONSOLE || DRM_PANIC
>   default y if !SPARC && !FONTS && AMIGA
>   help
> Small console font with PC-style control-character and high-half
> @@ -70,7 +70,7 @@ config FONT_PEARL_8x8
>  
>  config FONT_ACORN_8x8
>   bool "Acorn console 8x8 font" if FONTS
> - depends on FRAMEBUFFER_CONSOLE
> + depends on FRAMEBUFFER_CONSOLE || DRM_PANIC
>   default y if !SPARC && !FONTS && ARM && ARCH_ACORN
>   help
> Small console font with PC-style control characters and high-half
> @@ -90,7 +90,7 @@ config FONT_6x10
>  
>  config FONT_10x18
>   bool "console 10x18 font (not supported by all drivers)" if FONTS
> - depends on FRAMEBUFFER_CONSOLE
> + depends on FRAMEBUFFER_CONSOLE || DRM_PANIC
>   help
> This is a high resolution console font for machines with very
> big letters. It fits between the sun 12x22 and the normal 8x16 font.
> @@ -105,7 +105,7 @@ config FONT_SUN8x16
>  
>  config FONT_SUN12x22
>   bool "Sparc console 12x22 font (not supported by all drivers)"
> - depends on FRAMEBUFFER_CONSOLE && (!SPARC && FONTS || SPARC)
> + depends on (FRAMEBUFFER_CONSOLE && (!SPARC && FONTS || SPARC)) || 
> DRM_PANIC
>   help
> This is the high resolution console font for Sun machines with very
> big letters (like the letters used in the SPARC PROM). If the
> @@ -113,7 +113,7 @@ config FONT_SUN12x22
>  
>  config FONT_TER16x32
>   bool "Terminus 16x32 font (not supported by all drivers)"
> - depends on FRAMEBUFFER_CONSOLE && (!SPARC && FONTS || SPARC)
> + depends on (FRAMEBUFFER_CONSOLE && (!SPARC && FONTS || SPARC)) || 
> DRM_PANIC
>   help
> Terminus Font is a clean, fixed width bitmap font, designed
> for long (8 and more hours per day) work with computers.
> @@ -122,7 +122,7 @@ config FONT_TER16x32
>  
>  config FONT_6x8
>   bool "OLED 6x8 font" if FONTS
> - depends on FRAMEBUFFER_CONSOLE
> + depends on FRAMEBUFFER_CONSOLE || DRM_PANIC
>   help
> This font is useful for small displays (OLED).
>  
> 
> base-commit: a35e92ef04c07bd473404b9b73d489aea19a60a8
> -- 
> 2.44.0
> 



Re: [PATCH v6 0/7] Adds support for ConfigFS to VKMS!

2024-04-30 Thread Daniel Vetter
On Tue, Aug 29, 2023 at 05:30:52AM +, Brandon Pollack wrote:
> Since Jim is busy with other work and I'm working on some things that
> rely on this, I've taken up the task of doing the iterations.  I've
> addressed the comments as best I can (those replies are to each
> individual change) and here is the patch set to go with those.
> 
> I added my own signoff to each commit, but I've left jshargo@ as the
> author of all the commits he wrote.  I'm sure there is still more to
> address and the ICT tests that were writtein parallel to this may also
> need some additions, but I'm hoping we're in a good enough state to get
> this in and iterate from there soon.
> 
> Since V6:
> 
> rmdirs for documentation examples
> fix crtc mask for writebacks
> 
> Since V5:
> 
> Fixed some bad merge conflicts and locking behaviours as well as
> clarified some documentation, should be good to go now :)
> 
> Since V4:
> 
> Fixed up some documentation as suggested by Marius
> Fixed up some bad locking as suggested by Marius
> Small fixes here and there (most have email responses to previous chain
> emails)
> 
> Since V3:
> 
> I've added hotplug support in the latest patch.  This has been reviewed some
> and the notes from that review are addressed here as well.
> 
> Relevant/Utilizing work:
> ===
> I've built a whole test framework based on this as proof it functions (though
> I'm sure there may be lingering bugs!).  You can check that out on
> crrev.com if you are interested and need to get started yourself (but be
> aware of any licensing that may differ from the kernel itself!  Make
> sure you understand the license:
> 
> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/LICENSE
> 
> That said, you can see the changes in review on the crrev gerrit:
> 
> https://chromium-review.googlesource.com/c/chromiumos/platform/tast-tests/+/469
> 
> Outro:
> =
> I really appreciate everyone's input and tolerance in getting these
> changes in.  Jim's first patch series was this, and other than some
> small cleanups and documentation, taking over it is also mine.

Sorry for not having looked at this earlier. I think overall it's looking
good, mostly just a bunch of comments on lifetime/locking questions.

I'm also wondering a bit how much we want to go overboard with igt tests,
since the lifetime fun is quite big here. I think at least some basic
tests that try to do nasty things like unbinding the driver in sysfs and
then trying to use configfs, or keeping the vkms_device alive with an open
fd and removing the configfs directory, would be really good.

One thing that's a bit tricky is that configfs is considered uapi, so it
must be stable forever. And I think that's actually the right thing for us,
since we want compositors and other projects to use this for their
testing. So this is unlike igt tests using special debugfs interfaces,
which are ok to be very tightly coupled to kernel releases.

Cheers, Sima
> 
> Thank you everyone :)
> 
> Brandon Pollack (1):
>   drm/vkms Add hotplug support via configfs to VKMS.
> 
> Jim Shargo (6):
>   drm/vkms: Back VKMS with DRM memory management instead of static
> objects
>   drm/vkms: Support multiple DRM objects (crtcs, etc.) per VKMS device
>   drm/vkms: Provide platform data when creating VKMS devices
>   drm/vkms: Add ConfigFS scaffolding to VKMS
>   drm/vkms: Support enabling ConfigFS devices
>   drm/vkms: Add a module param to enable/disable the default device
> 
>  Documentation/gpu/vkms.rst|  20 +-
>  drivers/gpu/drm/Kconfig   |   1 +
>  drivers/gpu/drm/vkms/Makefile |   1 +
>  drivers/gpu/drm/vkms/vkms_composer.c  |  30 +-
>  drivers/gpu/drm/vkms/vkms_configfs.c  | 723 ++
>  drivers/gpu/drm/vkms/vkms_crtc.c  | 102 ++--
>  drivers/gpu/drm/vkms/vkms_drv.c   | 206 +---
>  drivers/gpu/drm/vkms/vkms_drv.h   | 182 +--
>  drivers/gpu/drm/vkms/vkms_output.c| 404 --
>  drivers/gpu/drm/vkms/vkms_plane.c |  44 +-
>  drivers/gpu/drm/vkms/vkms_writeback.c |  42 +-
>  11 files changed, 1514 insertions(+), 241 deletions(-)
>  create mode 100644 drivers/gpu/drm/vkms/vkms_configfs.c
> 
> -- 
> 2.42.0.rc2.253.gd59a3bf2b4-goog
> 



Re: [PATCH v6 7/7] drm/vkms Add hotplug support via configfs to VKMS.

2024-04-30 Thread Daniel Vetter
ms_connector_funcs = {
> + .detect = vkms_connector_detect,
>   .fill_modes = drm_helper_probe_single_connector_modes,
>   .destroy = drm_connector_cleanup,
>   .reset = drm_atomic_helper_connector_reset,
> @@ -19,6 +22,48 @@ static const struct drm_connector_funcs 
> vkms_connector_funcs = {
>   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
>  };
>  
> +static const struct vkms_config_connector *
> +find_config_for_connector(struct drm_connector *connector)
> +{
> + struct vkms_device *vkms = drm_device_to_vkms_device(connector->dev);
> + struct vkms_configfs *configfs = vkms->configfs;
> + struct config_item *item;
> +
> + if (!configfs) {
> + pr_info("Default connector has no configfs entry");
> + return NULL;
> + }
> +
> + list_for_each_entry(item, &configfs->connectors_group.cg_children,
> + ci_entry) {
> + struct vkms_config_connector *config_connector =
> + item_to_config_connector(item);
> + if (config_connector->connector == connector)
> + return config_connector;
> + }
> +
> + pr_warn("Could not find config to match connector %s, but configfs was 
> initialized",
> + connector->name);
> +
> + return NULL;
> +}
> +
> +enum drm_connector_status vkms_connector_detect(struct drm_connector 
> *connector,
> + bool force)
> +{
> + enum drm_connector_status status = connector_status_connected;
> + const struct vkms_config_connector *config_connector =
> + find_config_for_connector(connector);
> +
> + if (!config_connector)
> + return connector_status_connected;
> +
> + if (!config_connector->connected)
> + status = connector_status_disconnected;
> +
> + return status;
> +}
> +
>  static const struct drm_encoder_funcs vkms_encoder_funcs = {
>   .destroy = drm_encoder_cleanup,
>  };
> @@ -280,12 +325,12 @@ int vkms_output_init(struct vkms_device *vkmsdev)
>   struct vkms_config_connector *config_connector =
>   item_to_config_connector(item);
>   struct drm_connector *connector = vkms_connector_init(vkmsdev);
> -
>   if (IS_ERR(connector)) {
>   DRM_ERROR("Failed to init connector from config: %s",
> item->ci_name);
>   return PTR_ERR(connector);
>   }
> + config_connector->connector = connector;
>  
>   for (int j = 0; j < output->num_encoders; j++) {
> + struct encoder_map *encoder = &encoder_map[j];
> -- 
> 2.42.0.rc2.253.gd59a3bf2b4-goog
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 5/7] drm/vkms: Support enabling ConfigFS devices

2024-04-30 Thread Daniel Vetter
DRM_WARN(
> + "Too many primary planes found 
> for crtc %s.",
> + item->ci_name);
> + return -EINVAL;
> + }
> + primary = plane;
> + } else if (plane->type == DRM_PLANE_TYPE_CURSOR) {
> + if (cursor) {
> + DRM_WARN(
> + "Too many cursor planes found 
> for crtc %s.",
> + item->ci_name);
> + return -EINVAL;
> + }
> + cursor = plane;
> + }
> + }
> +
> + if (!primary) {
> + DRM_WARN("No primary plane configured for crtc %s",
> +  item->ci_name);
> + return -EINVAL;
> + }
> +
> + vkms_crtc =
> + vkms_crtc_init(vkmsdev, primary, cursor, item->ci_name);
> + if (IS_ERR(vkms_crtc)) {
> + DRM_WARN("Unable to init crtc from config: %s",
> +  item->ci_name);
> + return PTR_ERR(vkms_crtc);
> + }
> +
> + for (int j = 0; j < output->num_planes; j++) {
> + struct plane_map *plane_entry = &plane_map[j];
> +
> + if (!plane_entry->plane)
> + break;
> +
> + if (is_object_linked(
> + &plane_entry->config_plane->possible_crtcs,
> + config_crtc->crtc_config_idx)) {
> + plane_entry->plane->base.possible_crtcs |=
> + drm_crtc_mask(&vkms_crtc->base);
> + }
> + }
> +
> + for (int j = 0; j < output->num_encoders; j++) {
> + struct encoder_map *encoder_entry = &encoder_map[j];
> +
> + if (is_object_linked(&encoder_entry->config_encoder
> +   ->possible_crtcs,
> +  config_crtc->crtc_config_idx)) {
> + encoder_entry->encoder->possible_crtcs |=
> + drm_crtc_mask(&vkms_crtc->base);
> + }
> + }
> +
> + if (vkmsdev->config.writeback) {

This mixes the default setup code (that uses vkms_config) with the
configfs paths, which I think is really not a good idea. This is why I
think we should have a very strict split here. Alternatively if we go with
using struct vkms_config for configfs too, then I think we must not use
the module options to fill in the missing parameters that configfs does
not yet support.

The reason is that configfs is uapi (yay!), so if we make a mess here just
because it's easier to get things going, we'll bake that mess in forever.
Instead I think it'd be best if we just disable writeback support for the
initial configfs work.

For properly enabling writeback I think we need a new configfs group for
just writeback connectors, since those are fairly special (like you cannot
ever hotplug them, they're always there).
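A minimal sketch of how such a dedicated group could be wired up when the
configfs tree is created (the writeback_group member, the group name and
writeback_group_type are hypothetical, not part of the posted patches):

```c
/*
 * Hypothetical: register a fixed "writeback_connectors" default group
 * next to the existing connectors/crtcs/encoders/planes groups, so
 * writeback connectors get their own directory and, unlike regular
 * connectors, can never be hotplugged.
 */
config_group_init_type_name(&configfs->writeback_group,
			    "writeback_connectors",
			    &writeback_group_type);
configfs_add_default_group(&configfs->writeback_group,
			   &configfs->device_group);
```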

> + int ret = vkms_enable_writeback_connector(vkmsdev,
> +   vkms_crtc);
> + if (ret)
> + DRM_WARN(
> + "Failed to init writeback connector for 
> config crtc: %s. Error code %d",
> + item->ci_name, ret);
> + }
> + }
> +
> + drm_mode_config_reset(dev);
> +
> + return 0;
>  }
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c 
> b/drivers/gpu/drm/vkms/vkms_plane.c
> index 950e6c930273..3198bf0dca73 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: GPL-2.0+
>  
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -215,20 +216,25 @@ static const struct drm_plane_helper_funcs 
> vkms_plane_helper_funcs = {
>  };
>  
>  struct vkms_plane *vkms_plane_init(struct vkms_device *vkmsdev,
> -enum drm_plane_type type)
> +enum drm_plane_type type, char *name, ...)
>  {
>   struct drm_device *dev = &vkmsdev->drm;
>   struct vkms_output *output = &vkmsdev->output;
>   struct vkms_plane *plane;
> + va_list va;
>   int ret;
>  
>   if (output->num_planes >= VKMS_MAX_PLANES)
>   return ERR_PTR(-ENOMEM);
>  
>   plane = &output->planes[output->num_planes++];
> +
> + va_start(va, name);
>   ret = drm_universal_plane_init(dev, &plane->base, 0, &vkms_plane_funcs,
>  vkms_formats, ARRAY_SIZE(vkms_formats),
> -NULL, type, NULL);
> +NULL, type, name, va);
> + va_end(va);
> +
>   if (ret)
>   return ERR_PTR(ret);
>  
> -- 
> 2.42.0.rc2.253.gd59a3bf2b4-goog
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 4/7] drm/vkms: Add ConfigFS scaffolding to VKMS

2024-04-30 Thread Daniel Vetter
  struct vkms_configfs *configfs;
> + struct config_group config_group;
> + struct vkms_config_links possible_crtcs;
> + enum drm_plane_type type;
> +};
> +
> +struct vkms_configfs {
> + /* Directory group containing connector configs, e.g. 
> /config/vkms/device/ */
> + struct config_group device_group;
> + /* Directory group containing connector configs, e.g. 
> /config/vkms/device/connectors/ */
> + struct config_group connectors_group;
> + /* Directory group containing CRTC configs, e.g. 
> /config/vkms/device/crtcs/ */
> + struct config_group crtcs_group;
> + /* Directory group containing encoder configs, e.g. 
> /config/vkms/device/encoders/ */
> + struct config_group encoders_group;
> + /* Directory group containing plane configs, e.g. 
> /config/vkms/device/planes/ */
> + struct config_group planes_group;
> +
> + unsigned long allocated_crtcs;
> + unsigned long allocated_encoders;
> +
> + struct mutex lock;

So this doesn't work, because it doesn't protect against concurrent
add/removal of items to groups and other changes. Instead we need to rely
on configfs_subsystem.su_mutex. To make this work cleanly I think we
should do the following:

- Add a configfs_assert_subsystem_locked wrapper to configfs.h, so that we
  have a nicely abstracted lockdep check using lockdep_assert_held.

  Then use that locking assert everywhere you currently have a
  mutex_lock(vkms_configfs->lock). Note that you have quite a few missing
  places (since really everything we do needs that lock), I'd focus on
  adding it to important helper functions like the xarray wrappers for
  allocating ids (if we do those).

- Then for walking the various lists (both here and in the next patch) we
  should also add proper wrapper macros to configfs.h, which both do the
  right upcasting and also have the lockdep assert.
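A minimal sketch of the suggested assert, assuming it would live in
configfs.h next to the other helpers (the function name follows the
suggestion above; su_mutex is the existing subsystem mutex):

```c
/*
 * Sketch only: a lockdep-backed check that the configfs subsystem
 * mutex is held, usable from item/group walking helpers.
 */
static inline void
configfs_assert_subsystem_locked(struct configfs_subsystem *subsys)
{
	lockdep_assert_held(&subsys->su_mutex);
}
```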

> +
> + /* The platform device if this is registered, otherwise NULL */
> + struct vkms_device *vkms_device;
> +};
> +
>  struct vkms_device_setup {
> - bool is_default;
> + // Is NULL in the case of the default card.
> + struct vkms_configfs *configfs;
>  };
>  
>  struct vkms_device {
>   struct drm_device drm;
>   struct platform_device *platform;
> - bool is_default;
> + // Is NULL in the case of the default card.
> + struct vkms_configfs *configfs;
>   struct vkms_output output;
>   struct vkms_config config;
>  };
> @@ -164,11 +217,42 @@ struct vkms_device {
>  #define to_vkms_plane_state(target)\
>   container_of(target, struct vkms_plane_state, base.base)
>  
> +#define item_to_configfs(item) \
> + container_of(to_config_group(item), struct vkms_configfs, device_group)
> +
> +#define item_to_config_connector(item)\
> + container_of(to_config_group(item), struct vkms_config_connector, \
> +  config_group)
> +
> +#define item_to_config_crtc(item)\
> + container_of(to_config_group(item), struct vkms_config_crtc, \
> +  config_group)
> +
> +#define item_to_config_encoder(item)\
> + container_of(to_config_group(item), struct vkms_config_encoder, \
> +  config_group)
> +
> +#define item_to_config_plane(item)\
> + container_of(to_config_group(item), struct vkms_config_plane, \
> +  config_group)
> +
> +#define item_to_config_links(item) \
> + container_of(to_config_group(item), struct vkms_config_links, group)
> +
> +#define plane_item_to_configfs(item) 
> \
> + container_of(to_config_group(item->ci_parent), struct vkms_configfs, \
> +  planes_group)
> +
> +/* Devices */
> +struct vkms_device *vkms_add_device(struct vkms_configfs *configfs);
> +void vkms_remove_device(struct vkms_device *vkms_device);
> +
>  /* CRTC */
>  struct vkms_crtc *vkms_crtc_init(struct vkms_device *vkmsdev,
>struct drm_plane *primary,
>struct drm_plane *cursor);
>  
> +int vkms_output_init(struct vkms_device *vkmsdev);
>  int vkms_output_init_default(struct vkms_device *vkmsdev);
>  
>  struct vkms_plane *vkms_plane_init(struct vkms_device *vkmsdev,
> @@ -191,4 +275,8 @@ void vkms_writeback_row(struct vkms_writeback_job *wb, 
> const struct line_buffer
>  int vkms_enable_writeback_connector(struct vkms_device *vkmsdev,
>   struct vkms_crtc *vkms_crtc);
>  
> +/* ConfigFS Support */
> +int vkms_init_configfs(void);
> +void vkms_unregister_configfs(void);
> +
>  #endif /* _VKMS_DRV_H_ */
> diff --git a/drivers/gpu/drm/vkms/vkms_output.c 
> b/drivers/gpu/drm/vkms/vkms_output.c
> index bfc2e2362c6d..dc69959c5e1d 100644
> --- a/drivers/gpu/drm/vkms/vkms_output.c
> +++ b/drivers/gpu/drm/vkms/vkms_output.c
> @@ -176,3 +176,8 @@ int vkms_output_init_default(struct vkms_device *vkmsdev)
>  
>   return ret;
>  }
> +
> +int vkms_output_init(struct vkms_device *vkmsdev)
> +{
> + return -EOPNOTSUPP;
> +}
> -- 
> 2.42.0.rc2.253.gd59a3bf2b4-goog
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 3/7] drm/vkms: Provide platform data when creating VKMS devices

2024-04-30 Thread Daniel Vetter
On Tue, Aug 29, 2023 at 05:30:55AM +, Brandon Pollack wrote:
> From: Jim Shargo 
> 
> This is a small refactor to make ConfigFS support easier. This should be
> a no-op refactor.
> 
> Signed-off-by: Jim Shargo 
> Signed-off-by: Brandon Pollack 

This should be part of the series to switch over to a real platform
driver, since we only need that with that design and not with the current
setup/init code.
-Sima

> ---
>  drivers/gpu/drm/vkms/vkms_drv.c| 14 --
>  drivers/gpu/drm/vkms/vkms_drv.h|  9 ++---
>  drivers/gpu/drm/vkms/vkms_output.c |  2 +-
>  3 files changed, 19 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.c b/drivers/gpu/drm/vkms/vkms_drv.c
> index 65b1e2c52106..6c94c2b5d529 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.c
> +++ b/drivers/gpu/drm/vkms/vkms_drv.c
> @@ -9,6 +9,7 @@
>   * the GPU in DRM API tests.
>   */
>  
> +#include "asm-generic/errno-base.h"
>  #include 
>  #include 
>  #include 
> @@ -171,12 +172,14 @@ static int vkms_modeset_init(struct vkms_device 
> *vkmsdev)
>   dev->mode_config.preferred_depth = 0;
>   dev->mode_config.helper_private = &vkms_mode_config_helpers;
>  
> - return vkms_output_init(vkmsdev, 0);
> + return vkmsdev->is_default ? vkms_output_init_default(vkmsdev) :
> +  -EINVAL;
>  }
>  
>  static int vkms_platform_probe(struct platform_device *pdev)
>  {
>   int ret;
> + struct vkms_device_setup *vkms_device_setup = pdev->dev.platform_data;
>   struct vkms_device *vkms_device;
>   void *grp;
>  
> @@ -195,6 +198,7 @@ static int vkms_platform_probe(struct platform_device 
> *pdev)
>   vkms_device->config.cursor = enable_cursor;
>   vkms_device->config.writeback = enable_writeback;
>   vkms_device->config.overlay = enable_overlay;
> + vkms_device->is_default = vkms_device_setup->is_default;
>  
>   ret = dma_coerce_mask_and_coherent(vkms_device->drm.dev,
>  DMA_BIT_MASK(64));
> @@ -258,6 +262,9 @@ static int __init vkms_init(void)
>  {
>   int ret;
>   struct platform_device *pdev;
> + struct vkms_device_setup vkms_device_setup = {
> + .is_default = true,
> + };
>  
>   ret = platform_driver_register(&vkms_platform_driver);
>   if (ret) {
> @@ -265,8 +272,11 @@ static int __init vkms_init(void)
>   return ret;
>   }
>  
> - pdev = platform_device_register_simple(DRIVER_NAME, -1, NULL, 0);
> + pdev = platform_device_register_data(NULL, DRIVER_NAME, 0,
> +  &vkms_device_setup,
> +  sizeof(vkms_device_setup));
>   if (IS_ERR(pdev)) {
> + DRM_ERROR("Unable to register default vkms device\n");
>   platform_driver_unregister(&vkms_platform_driver);
>   return PTR_ERR(pdev);
>   }
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 761cd809617e..4262dcffd7e1 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -132,17 +132,20 @@ struct vkms_output {
>   struct vkms_plane planes[VKMS_MAX_PLANES];
>  };
>  
> -struct vkms_device;
> -
>  struct vkms_config {
>   bool writeback;
>   bool cursor;
>   bool overlay;
>  };
>  
> +struct vkms_device_setup {
> + bool is_default;
> +};
> +
>  struct vkms_device {
>   struct drm_device drm;
>   struct platform_device *platform;
> + bool is_default;
>   struct vkms_output output;
>   struct vkms_config config;
>  };
> @@ -166,7 +169,7 @@ struct vkms_crtc *vkms_crtc_init(struct vkms_device 
> *vkmsdev,
>struct drm_plane *primary,
>struct drm_plane *cursor);
>  
> -int vkms_output_init(struct vkms_device *vkmsdev, int index);
> +int vkms_output_init_default(struct vkms_device *vkmsdev);
>  
>  struct vkms_plane *vkms_plane_init(struct vkms_device *vkmsdev,
>  enum drm_plane_type type);
> diff --git a/drivers/gpu/drm/vkms/vkms_output.c 
> b/drivers/gpu/drm/vkms/vkms_output.c
> index 86faf94f7408..bfc2e2362c6d 100644
> --- a/drivers/gpu/drm/vkms/vkms_output.c
> +++ b/drivers/gpu/drm/vkms/vkms_output.c
> @@ -80,7 +80,7 @@ static struct drm_encoder *vkms_encoder_init(struct 
> vkms_device *vkms_device)
>   return encoder;
>  }
>  
> -int vkms_output_init(struct vkms_device *vkmsdev, int index)
> +int vkms_output_init_default(struct vkms_device *vkmsdev)
>  {
>   struct vkms_output *output = &vkmsdev->output;
>   struct drm_device *dev = &vkmsdev->drm;
> -- 
> 2.42.0.rc2.253.gd59a3bf2b4-goog
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 2/7] drm/vkms: Support multiple DRM objects (crtcs, etc.) per VKMS device

2024-04-30 Thread Daniel Vetter
On Tue, Aug 29, 2023 at 05:30:54AM +, Brandon Pollack wrote:
> From: Jim Shargo 
> 
> This change supports multiple CRTCs, encoders, connectors instead of one
> of each per device.
> 
> Since ConfigFS-based devices will support multiple crtcs, it's useful to
> move all of the writeback/composition data from being per-"output" to
> being per-CRTC.
> 
> Since there's still only ever one CRTC, this should be a no-op refactor.
> 
> Signed-off-by: Jim Shargo 
> Signed-off-by: Brandon Pollack 

> +struct vkms_crtc {
> + struct drm_crtc base;
> +
> + struct drm_writeback_connector wb_connector;
> + struct hrtimer vblank_hrtimer;
> + ktime_t period_ns;
> + struct drm_pending_vblank_event *event;
> + /* ordered wq for composer_work */
> + struct workqueue_struct *composer_workq;
> + /* protects concurrent access to composer */
> + spinlock_t lock;
> + /* guarantees that if the composer is enabled, a job will be queued */
> + struct mutex enabled_lock;
> +
> + /* protected by @enabled_lock */
> + bool composer_enabled;
> + struct vkms_crtc_state *composer_state;
> +
> + spinlock_t composer_lock;
> +};
> +
>  struct vkms_color_lut {
>   struct drm_color_lut *base;
>   size_t lut_length;
> @@ -97,25 +122,14 @@ struct vkms_crtc_state {
>  };
>  
>  struct vkms_output {

I think this structure doesn't make sense anymore. If I didn't misread,
it's really only needed as a temporary structure during the default
vkms_output_init code, and for that case I think we should just
completely delete it. Since vkms is now using drmm_ there's really no
need to track all our kms objects again ourselves.

With that this patch essentially becomes "create vkms_crtc" (which moves
all the composer-related data from vkms_output to this new structure) and
then maybe a 2nd patch which deletes the leftovers of vkms_output.
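For reference, with drmm_-managed objects the CRTCs can simply be looked
up through the drm_device's own lists instead of shadow arrays; a sketch
(not from the patches) of what that looks like:

```c
/* Sketch: no private tracking needed, DRM already keeps the lists. */
struct drm_crtc *crtc;

drm_for_each_crtc(crtc, &vkmsdev->drm) {
	/* inspect or configure each CRTC */
}
```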

> - struct drm_crtc crtc;
> - struct drm_encoder encoder;
> - struct drm_connector connector;
> - struct drm_writeback_connector wb_connector;
> - struct hrtimer vblank_hrtimer;
> - ktime_t period_ns;
> - struct drm_pending_vblank_event *event;
> - /* ordered wq for composer_work */
> - struct workqueue_struct *composer_workq;
> - /* protects concurrent access to composer */
> - spinlock_t lock;
> - /* guarantees that if the composer is enabled, a job will be queued */
> - struct mutex enabled_lock;
> -
> - /* protected by @enabled_lock */
> - bool composer_enabled;
> - struct vkms_crtc_state *composer_state;
> -
> - spinlock_t composer_lock;
> + int num_crtcs;
> + struct vkms_crtc crtcs[VKMS_MAX_OUTPUT_OBJECTS];

Uh can we please directly use the DRM limits here for these? I guess this
is because you have static arrays, but vkms really shouldn't need its own
arrays to keep track of what drm already keeps track of.

Using DRM limits also means we can rely on the drm validation code instead
of having to duplicate that code in the vkms configfs validation
functions.
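Concretely, the hard DRM limit here is that a plane's possible_crtcs is a
32-bit mask, so there can be at most 32 CRTCs per device; a hypothetical
bounds check against DRM's own bookkeeping (not from the patches) would be:

```c
/*
 * Sketch: bound new CRTCs by DRM's own limit (possible_crtcs is u32)
 * instead of a vkms-private VKMS_MAX_* array size.
 */
if (vkmsdev->drm.mode_config.num_crtc >= 32)
	return -EINVAL;
```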
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v6 1/7] drm/vkms: Back VKMS with DRM memory management instead of static objects

2024-04-30 Thread Daniel Vetter
iteback = enable_writeback;
> - config->overlay = enable_overlay;
> + vkms_device = platform_get_drvdata(pdev);
> + if (!vkms_device)
> + return 0;
>  
> - ret = vkms_create(config);
> - if (ret)
> - kfree(config);
> -
> - return ret;
> - drm_dev_unregister(&vkms_device->drm);
> - drm_atomic_helper_shutdown(&vkms_device->drm);
> + return 0;
>  }
>  
> -static void vkms_destroy(struct vkms_config *config)
> +static struct platform_driver vkms_platform_driver = {
> + .probe = vkms_platform_probe,
> + .remove = vkms_platform_remove,
> + .driver.name = DRIVER_NAME,
> +};
> +
> +static int __init vkms_init(void)
>  {
> + int ret;
>   struct platform_device *pdev;
>  
> - if (!config->dev) {
> - DRM_INFO("vkms_device is NULL.\n");
> - return;
> + ret = platform_driver_register(&vkms_platform_driver);
> + if (ret) {
> + DRM_ERROR("Unable to register platform driver\n");
> + return ret;
>   }
>  
> - pdev = config->dev->platform;
> -
> - drm_dev_unregister(&config->dev->drm);
> - drm_atomic_helper_shutdown(&config->dev->drm);
> - devres_release_group(&pdev->dev, NULL);
> - platform_device_unregister(pdev);
> + pdev = platform_device_register_simple(DRIVER_NAME, -1, NULL, 0);
> + if (IS_ERR(pdev)) {
> + platform_driver_unregister(&vkms_platform_driver);
> + return PTR_ERR(pdev);
> + }
>  
> - config->dev = NULL;
> + return 0;
>  }
>  
>  static void __exit vkms_exit(void)
>  {
> - if (default_config->dev)
> - vkms_destroy(default_config);
> + struct device *dev;
> +
> + while ((dev = platform_find_device_by_driver(
> + NULL, &vkms_platform_driver.driver))) {
> + // platform_find_device_by_driver increments the refcount. Drop
> + // it so we don't leak memory.
> + put_device(dev);
> + platform_device_unregister(to_platform_device(dev));
> + }
>  
> - kfree(default_config);
> + platform_driver_unregister(&vkms_platform_driver);
>  }
>  
>  module_init(vkms_init);
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index c7ae6c2ba1df..4c35d6305f2a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -124,15 +124,13 @@ struct vkms_config {
>   bool writeback;
>   bool cursor;
>   bool overlay;
> - /* only set when instantiated */
> - struct vkms_device *dev;
>  };
>  
>  struct vkms_device {
>   struct drm_device drm;
>   struct platform_device *platform;
>   struct vkms_output output;
> - const struct vkms_config *config;
> + struct vkms_config config;
>  };
>  
>  #define drm_crtc_to_vkms_output(target) \
> diff --git a/drivers/gpu/drm/vkms/vkms_output.c 
> b/drivers/gpu/drm/vkms/vkms_output.c
> index 5ce70dd946aa..963a64cf068b 100644
> --- a/drivers/gpu/drm/vkms/vkms_output.c
> +++ b/drivers/gpu/drm/vkms/vkms_output.c
> @@ -62,7 +62,7 @@ int vkms_output_init(struct vkms_device *vkmsdev, int index)
>   if (IS_ERR(primary))
>   return PTR_ERR(primary);
>  
> - if (vkmsdev->config->overlay) {
> + if (vkmsdev->config.overlay) {
>   for (n = 0; n < NUM_OVERLAY_PLANES; n++) {
>   ret = vkms_add_overlay_plane(vkmsdev, index, crtc);
>   if (ret)
> @@ -70,7 +70,7 @@ int vkms_output_init(struct vkms_device *vkmsdev, int index)
>   }
>   }
>  
> - if (vkmsdev->config->cursor) {
> + if (vkmsdev->config.cursor) {
>   cursor = vkms_plane_init(vkmsdev, DRM_PLANE_TYPE_CURSOR, index);
>   if (IS_ERR(cursor))
>   return PTR_ERR(cursor);
> @@ -103,7 +103,7 @@ int vkms_output_init(struct vkms_device *vkmsdev, int 
> index)
>   goto err_attach;
>   }
>  
> - if (vkmsdev->config->writeback) {
> + if (vkmsdev->config.writeback) {
>   writeback = vkms_enable_writeback_connector(vkmsdev);
>   if (writeback)
>   DRM_ERROR("Failed to init writeback connector\n");
> -- 
> 2.42.0.rc2.253.gd59a3bf2b4-goog
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v12 0/9] drm/panic: Add a drm panic handler

2024-04-10 Thread Daniel Vetter
On Tue, Apr 09, 2024 at 06:30:39PM +0200, Jocelyn Falempe wrote:
> drm/panic: Add a drm panic handler
> 
> This introduces a new drm panic handler, which displays a message when a 
> panic occurs.
> So when fbcon is disabled, you can still see a kernel panic.
> 
> This is one of the missing features when disabling VT/fbcon in the kernel:
> https://www.reddit.com/r/linux/comments/10eccv9/config_vtn_in_2023/
> Fbcon can be replaced by a userspace kms console, but the panic screen must 
> be done in the kernel.
> 
> It works with simpledrm, mgag200, ast, and imx.
> 
> To test it, make sure you're using one of the supported driver, and trigger a 
> panic:
> echo c > /proc/sysrq-trigger
> 
> or you can enable CONFIG_DRM_PANIC_DEBUG and echo 1 > 
> /sys/kernel/debug/dri/0/drm_panic_plane_0
> 
> Even if this is not yet useful, it will allow work on more driver 
> support, and better debug information to be added.
> 
> v2:
>  * Use get_scanout_buffer() instead of the drm client API. (Thomas Zimmermann)
>  * Add the panic reason to the panic message (Nerdopolis)
>  * Add an exclamation mark (Nerdopolis)
>  
> v3:
>  * Rework the drawing functions, to write the pixels line by line and
>  to use the drm conversion helper to support other formats.
>  (Thomas Zimmermann)
>  
> v4:
>  * Fully support all simpledrm formats using drm conversion helpers
>  * Rename dpanic_* to drm_panic_*, and have more coherent function name.
>(Thomas Zimmermann)
>  * Use drm_fb_r1_to_32bit for fonts (Thomas Zimmermann)
>  * Remove the default y to DRM_PANIC config option (Thomas Zimmermann)
>  * Add foreground/background color config option
>  * Fix the bottom lines not painted if the framebuffer height
>is not a multiple of the font height.
>  * Automatically register the driver to drm_panic, if the function
>get_scanout_buffer() exists. (Thomas Zimmermann)
>  * Add mgag200 support.
>  
> v5:
>  * Change the drawing API, use drm_fb_blit_from_r1() to draw the font.
>(Thomas Zimmermann)
>  * Also add drm_fb_fill() to fill area with background color.
>  * Add draw_pixel_xy() API for drivers that can't provide a linear buffer.
>  * Add a flush() callback for drivers that needs to synchronize the buffer.
>  * Add a void *private field, so drivers can pass private data to
>draw_pixel_xy() and flush(). 
>  * Add ast support.
>  * Add experimental imx/ipuv3 support, to test on an ARM hw. (Maxime Ripard)
> 
> v6:
>  * Fix sparse and __le32 warnings
>  * Drop the IMX/IPUV3 experiment, it was just to show that it works also on
>ARM devices.
> 
> v7:
>  * Add a check to see if the 4cc format is supported by drm_panic.
>  * Add a drm/plane helper to loop over all visible primary buffer,
>simplifying the get_scanout_buffer implementations
>  * Add a generic implementation for drivers that uses drm_fb_dma. (Maxime 
> Ripard)
>  * Add back the IMX/IPUV3 support, and use the generic implementation. 
> (Maxime Ripard)
> 
> v8:
>  * Directly register each plane to the panic notifier (Sima)
>  * Replace get_scanout_buffer() with set_scanout_buffer() to simplify
>the locking. (Thomas Zimmermann)
>  * Add a debugfs entry, to trigger the drm_panic without a real panic (Sima)
>  * Fix the drm_panic Documentation, and include it in drm-kms.rst
> 
> v9:
>  * Revert to using get_scanout_buffer() (Sima)
>  * Move get_scanout_buffer() and panic_flush() to the plane helper
>functions (Thomas Zimmermann)
>  * Register all planes with get_scanout_buffer() to the panic notifier
>  * Use drm_panic_lock() to protect against race (Sima)
>  * Create a debugfs file for each plane in the device's debugfs
>directory. This allows to test for each plane of each GPU
>independently.
> v10:
>  * Move blit and fill functions back in drm_panic (Thomas Zimmermann).
>  * Simplify the text drawing functions.
>  * Use kmsg_dumper instead of panic_notifier (Sima).
>  * Use spinlock_irqsave/restore (John Ogness)
> 
> v11:
>  * Use macro instead of inline functions for drm_panic_lock/unlock (John 
> Ogness)
> 
> v12:
>  * Use array for map and pitch in struct drm_scanout_buffer
>to support multi-planar format later. (Thomas Zimmermann)
>  * Rename drm_panic_gem_get_scanout_buffer to drm_fb_dma_get_scanout_buffer
>and build it unconditionally (Thomas Zimmermann)
>  * Better indent struct drm_scanout_buffer declaration. (Thomas Zimmermann)

On the series: Acked-by: Daniel Vetter 

And apologies for the detours this patch set took and my part in the (too
many) revisions. I think we should land this and do anything more once
it's merged and we extend the panic support to more drivers.

Thanks a lot to you for seeing this 

Re: [PATCH v12 2/9] drm/panic: Add a drm panic handler

2024-04-10 Thread Daniel Vetter
On Tue, Apr 09, 2024 at 06:30:41PM +0200, Jocelyn Falempe wrote:
> This module displays a user friendly message when a kernel panic
> occurs. It currently doesn't contain any debug information,
> but that can be added later.
> 
> v2
>  * Use get_scanout_buffer() instead of the drm client API.
>   (Thomas Zimmermann)
>  * Add the panic reason to the panic message (Nerdopolis)
>  * Add an exclamation mark (Nerdopolis)
> 
> v3
>  * Rework the drawing functions, to write the pixels line by line and
>  to use the drm conversion helper to support other formats.
>  (Thomas Zimmermann)
> 
> v4
>  * Use drm_fb_r1_to_32bit for fonts (Thomas Zimmermann)
>  * Remove the default y to DRM_PANIC config option (Thomas Zimmermann)
>  * Add foreground/background color config option
>  * Fix the bottom lines not painted if the framebuffer height
>is not a multiple of the font height.
>  * Automatically register the device to drm_panic, if the function
>get_scanout_buffer exists. (Thomas Zimmermann)
> 
> v5
>  * Change the drawing API, use drm_fb_blit_from_r1() to draw the font.
>  * Also add drm_fb_fill() to fill area with background color.
>  * Add draw_pixel_xy() API for drivers that can't provide a linear buffer.
>  * Add a flush() callback for drivers that needs to synchronize the buffer.
>  * Add a void *private field, so drivers can pass private data to
>draw_pixel_xy() and flush().
> 
> v6
>  * Fix sparse warning for panic_msg and logo.
> 
> v7
>  * Add select DRM_KMS_HELPER for the color conversion functions.
> 
> v8
>  * Register directly each plane to the panic notifier (Sima)
>  * Add raw_spinlock to properly handle concurrency (Sima)
>  * Register plane instead of device, to avoid looping through plane
>list, and simplify code.
>  * Replace get_scanout_buffer() logic with drm_panic_set_buffer()
>   (Thomas Zimmermann)
>  * Removed the draw_pixel_xy() API, will see later if it can be added back.
> 
> v9
>  * Revert to using get_scanout_buffer() (Sima)
>  * Move get_scanout_buffer() and panic_flush() to the plane helper
>functions (Thomas Zimmermann)
>  * Register all planes with get_scanout_buffer() to the panic notifier
>  * Use drm_panic_lock() to protect against race (Sima)
> 
> v10
>  * Move blit and fill functions back in drm_panic (Thomas Zimmermann).
>  * Simplify the text drawing functions.
>  * Use kmsg_dumper instead of panic_notifier (Sima).
> 
> v12
>  * Use array for map and pitch in struct drm_scanout_buffer
>to support multi-planar format later. (Thomas Zimmermann)
>  * Better indent struct drm_scanout_buffer declaration. (Thomas Zimmermann)
> 
> Signed-off-by: Jocelyn Falempe 

Some detail suggestions for the kerneldoc but those aside Acked-by: Daniel
Vetter 
> ---
>  Documentation/gpu/drm-kms.rst|  12 +
>  drivers/gpu/drm/Kconfig  |  23 ++
>  drivers/gpu/drm/Makefile |   1 +
>  drivers/gpu/drm/drm_drv.c|   4 +
>  drivers/gpu/drm/drm_panic.c  | 289 +++
>  include/drm/drm_modeset_helper_vtables.h |  37 +++
>  include/drm/drm_panic.h  |  57 +
>  include/drm/drm_plane.h  |   6 +
>  8 files changed, 429 insertions(+)
>  create mode 100644 drivers/gpu/drm/drm_panic.c
> 
> diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
> index 13d3627d8bc0..b64334661aeb 100644
> --- a/Documentation/gpu/drm-kms.rst
> +++ b/Documentation/gpu/drm-kms.rst
> @@ -398,6 +398,18 @@ Plane Damage Tracking Functions Reference
>  .. kernel-doc:: include/drm/drm_damage_helper.h
> :internal:
>  
> +Plane Panic Feature
> +---
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_panic.c
> +   :doc: overview
> +
> +Plane Panic Functions Reference
> +---
> +
> +.. kernel-doc:: drivers/gpu/drm/drm_panic.c
> +   :export:
> +
>  Display Modes Function Reference
>  
>  
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 3914aaf443a8..f8a26423369e 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -104,6 +104,29 @@ config DRM_KMS_HELPER
>   help
> CRTC helpers for KMS drivers.
>  
> +config DRM_PANIC
> + bool "Display a user-friendly message when a kernel panic occurs"
> + depends on DRM && !FRAMEBUFFER_CONSOLE
> + select DRM_KMS_HELPER
> + select FONT_SUPPORT
> + help
> +   Enable a drm panic handler, which will display a user-friendly message
> +   when a kernel panic occurs. It's useful when using a user-space
> +   console instead 

Re: [PATCH v12 4/9] drm/panic: Add debugfs entry to test without triggering panic.

2024-04-10 Thread Daniel Vetter
On Tue, Apr 09, 2024 at 06:30:43PM +0200, Jocelyn Falempe wrote:
> Add a debugfs file, so you can test drm_panic without freezing
> your machine. This is unsafe, and should be enabled only for
> developers or testers.
> 
> To display the drm_panic screen on the device 0:
> echo 1 > /sys/kernel/debug/dri/0/drm_panic_plane_0
> 
> v9:
>  * Create a debugfs file for each plane in the device's debugfs
>directory. This allows to test for each plane of each GPU
>independently.
> 
> Signed-off-by: Jocelyn Falempe 

I was pondering whether this guarantees that the debugfs file disappears
before drm_dev_unregister finishes (otherwise we have a bit of a problem),
and looks like we're good.

Maybe add a todo that it would be nice to simulate nmi context, not sure
lockdep can help here ...

Anyway Reviewed-by: Daniel Vetter 
> ---
>  drivers/gpu/drm/Kconfig |  9 
>  drivers/gpu/drm/drm_panic.c | 43 -
>  2 files changed, 51 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index f8a26423369e..959b19a04101 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -127,6 +127,15 @@ config DRM_PANIC_BACKGROUND_COLOR
>   depends on DRM_PANIC
>   default 0x00
>  
> +config DRM_PANIC_DEBUG
> + bool "Add a debug fs entry to trigger drm_panic"
> + depends on DRM_PANIC && DEBUG_FS
> + help
> +   Add dri/[device]/drm_panic_plane_x in the kernel debugfs, to force the
> +   panic handler to write the panic message to this plane scanout buffer.
> +   This is unsafe and should not be enabled on a production build.
> +   If in doubt, say "N".
> +
>  config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>  bool "Enable refcount backtrace history in the DP MST helpers"
>   depends on STACKTRACE_SUPPORT
> diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c
> index e1ec30b6c04a..78fd6d5d7adc 100644
> --- a/drivers/gpu/drm/drm_panic.c
> +++ b/drivers/gpu/drm/drm_panic.c
> @@ -495,6 +495,45 @@ static void drm_panic(struct kmsg_dumper *dumper, enum 
> kmsg_dump_reason reason)
>   draw_panic_plane(plane);
>  }
>  
> +
> +/*
> + * DEBUG FS, This is currently unsafe.
> + * Create one file per plane, so it's possible to debug one plane at a time.
> + */
> +#ifdef CONFIG_DRM_PANIC_DEBUG
> +#include 
> +
> +static ssize_t debugfs_trigger_write(struct file *file, const char __user 
> *user_buf,
> +  size_t count, loff_t *ppos)
> +{
> + bool run;
> +
> + if (kstrtobool_from_user(user_buf, count, &run) == 0 && run) {
> + struct drm_plane *plane = file->private_data;
> +
> + draw_panic_plane(plane);
> + }
> + return count;
> +}
> +
> +static const struct file_operations dbg_drm_panic_ops = {
> + .owner = THIS_MODULE,
> + .write = debugfs_trigger_write,
> + .open = simple_open,
> +};
> +
> +static void debugfs_register_plane(struct drm_plane *plane, int index)
> +{
> + char fname[32];
> +
> + snprintf(fname, 32, "drm_panic_plane_%d", index);
> + debugfs_create_file(fname, 0200, plane->dev->debugfs_root,
> + plane, &dbg_drm_panic_ops);
> +}
> +#else
> +static void debugfs_register_plane(struct drm_plane *plane, int index) {}
> +#endif /* CONFIG_DRM_PANIC_DEBUG */
> +
>  /**
>   * drm_panic_register() - Initialize DRM panic for a device
>   * @dev: the drm device on which the panic screen will be displayed.
> @@ -514,8 +553,10 @@ void drm_panic_register(struct drm_device *dev)
>   plane->kmsg_panic.max_reason = KMSG_DUMP_PANIC;
> + if (kmsg_dump_register(&plane->kmsg_panic))
>   drm_warn(dev, "Failed to register panic handler\n");
> - else
> + else {
> + debugfs_register_plane(plane, registered_plane);
>   registered_plane++;
> + }
>   }
>   if (registered_plane)
>   drm_info(dev, "Registered %d planes with drm panic\n", 
> registered_plane);
> -- 
> 2.44.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v12 1/9] drm/panic: Add drm panic locking

2024-04-10 Thread Daniel Vetter
On Tue, Apr 09, 2024 at 06:30:40PM +0200, Jocelyn Falempe wrote:
> From: Daniel Vetter 
> 
> Rough sketch for the locking of drm panic printing code. The upshot of
> this approach is that we can pretty much entirely rely on the atomic
> commit flow, with the pair of raw_spin_lock/unlock providing any
> barriers we need, without having to create really big critical
> sections in code.
> 
> This also avoids the need that drivers must explicitly update the
> panic handler state, which they might forget to do, or not do
> consistently, and then we blow up in the worst possible times.
> 
> It is somewhat racy against a concurrent atomic update, and we might
> write into a buffer which the hardware will never display. But there's
> fundamentally no way to avoid that - if we do the panic state update
> explicitly after writing to the hardware, we might instead write to an
> old buffer that the user will barely ever see.
> 
> Note that an rcu protected dereference of plane->state would give us
> the same guarantees, but it has the downside that we then need to
> protect the plane state freeing functions with call_rcu too. Which
> would very widely impact a lot of code and therefore doesn't seem
> worth the complexity compared to a raw spinlock with very tiny
> critical sections. Plus rcu cannot be used to protect access to
> peek/poke registers anyway, so we'd still need it for those cases.
> 
> Peek/poke registers for vram access (or a gart pte reserved just for
> panic code) are also the reason I've gone with a per-device and not
> per-plane spinlock, since usually these things are global for the
> entire display. Going with per-plane locks would mean drivers for such
> hardware would need additional locks, which we don't want, since it
> deviates from the per-console takeover locks design.
> 
> Longer term it might be useful if the panic notifiers grow a bit more
> structure than just the absolute bare
> EXPORT_SYMBOL(panic_notifier_list) - somewhat aside, why is that not
> EXPORT_SYMBOL_GPL ... If panic notifiers would be more like console
> drivers with proper register/unregister interfaces we could perhaps
> reuse the very fancy console lock with all its check and takeover
> semantics that John Ogness is developing to fix the console_lock mess.
> But for the initial cut of a drm panic printing support I don't think
> we need that, because the critical sections are extremely small and
> only happen once per display refresh. So generally just 60 tiny locked
> sections per second, which is nothing compared to a serial console
> running a 115kbaud doing really slow mmio writes for each byte. So for
> now the raw spintrylock in drm panic notifier callback should be good
> enough.
> 
> Another benefit of making panic notifiers more like full blown
> consoles (that are used in panics only) would be that we get the two
> stage design, where first all the safe outputs are used. And then the
> dangerous takeover tricks are deployed (where for display drivers we
> also might try to intercept any in-flight display buffer flips, which
> if we race and misprogram fifos and watermarks can hang the memory
> controller on some hw).
> 
> For context the actual implementation on the drm side is by Jocelyn
> and this patch is meant to be combined with the overall approach in
> v7 (v8 is a bit less flexible, which I think is the wrong direction):
> 
> https://lore.kernel.org/dri-devel/20240104160301.185915-1-jfale...@redhat.com/
> 
> Note that the locking is very much not correct there, hence this
> separate rfc.
> 
> v2:
> - fix authorship, this was all my typing
> - some typo oopsies
> - link to the drm panic work by Jocelyn for context

Please annotate that v10 and later is your work, credit where credit is
due and all that :-)

> v10:
> - Use spinlock_irqsave/restore (John Ogness)
> 
> v11:
> - Use macro instead of inline functions for drm_panic_lock/unlock (John 
> Ogness)
> 
> Signed-off-by: Daniel Vetter 
> Cc: Jocelyn Falempe 
> Cc: Andrew Morton 
> Cc: "Peter Zijlstra (Intel)" 
> Cc: Lukas Wunner 
> Cc: Petr Mladek 
> Cc: Steven Rostedt 
> Cc: John Ogness 
> Cc: Sergey Senozhatsky 
> Cc: Maarten Lankhorst 
> Cc: Maxime Ripard 
> Cc: Thomas Zimmermann 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Signed-off-by: Jocelyn Falempe 
> ---
>  drivers/gpu/drm/drm_atomic_helper.c |   4 ++
>  drivers/gpu/drm/drm_drv.c   |   1 +
>  include/drm/drm_mode_config.h   |  10 +++
>  include/drm/drm_panic.h | 100 
>  4 files changed, 115 insertions(+)
>  create mode 100644 include/drm/drm_panic.h
> 
> diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
> b/drivers/gpu/drm/d

Re: [PATCH v2] drm: Document requirements for driver-specific KMS props in new drivers

2024-03-19 Thread Daniel Vetter
On Thu, Mar 14, 2024 at 11:20:09AM +0100, Maxime Ripard wrote:
> On Mon, Mar 11, 2024 at 04:58:58PM +0100, Sebastian Wick wrote:
> > When extending support for a driver-specific KMS property to additional
> > drivers, we should apply all the requirements for new properties and
> > make sure the semantics are the same and documented.
> > 
> > v2: devs of the driver which introduced property shall help and ack
> > 
> > Signed-off-by: Sebastian Wick 
> 
> Acked-by: Maxime Ripard 
> 
> We probably want to have Dave or Sima ack on that one too

Yeah that's a good idea and defacto how we handled this - additional users
of anything (whether library or uapi or whatever) get to clean up an
existing mess if it's too bad. But for uapi it's good to be really
explicit and document that.

Acked-by: Daniel Vetter 

Cheers, Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v9 1/9] drm/panic: Add drm panic locking

2024-03-14 Thread Daniel Vetter
On Fri, Mar 08, 2024 at 02:45:52PM +0100, Thomas Zimmermann wrote:
> Hi
> 
> Am 07.03.24 um 10:14 schrieb Jocelyn Falempe:
> > From: Daniel Vetter 
> > 
> > Rough sketch for the locking of drm panic printing code. The upshot of
> > this approach is that we can pretty much entirely rely on the atomic
> > commit flow, with the pair of raw_spin_lock/unlock providing any
> > barriers we need, without having to create really big critical
> > sections in code.
> 
> The ast driver has a lock to protect modesetting and concurrent EDID reads
> from each other. [1] That new panic_lock seems to serve the same purpose.
> 
> If we go that route, can we make this a bit more generic and call it
> commit_lock? I could then remove the dedicated lock from ast.

No, because the drm_panic_lock/unlock sections must be as small as
possible for two reasons:

- Anything we do while holding this lock that isn't strictly needed for
  the panic code (like edid reading or more than the scanout address
  registers) will reduce the chances that the drm panic handler can run,
  since all we can do is try_lock.

- it's a raw spinlock, if you do more than a handful of instructions
  you'll really annoy the -rt people, because raw spinlocks are not
  converted to sleeping locks with the realtime config enabled. Reading an
  EDID (which takes upwards of tens of ms) is definitely about 4-5 orders
  of magnitude too much, at least.

The mutex is really the right lock here for protecting modesets against
edid reads here, and the panic lock is at most on top of that very, very
small and specific things.
-Sima

> 
> Best regards
> Thomas
> 
> [1] 
> https://elixir.bootlin.com/linux/v6.7/source/drivers/gpu/drm/ast/ast_drv.h#L195
> 
> > 
> > This also avoids the need that drivers must explicitly update the
> > panic handler state, which they might forget to do, or not do
> > consistently, and then we blow up in the worst possible times.
> > 
> > It is somewhat racy against a concurrent atomic update, and we might
> > write into a buffer which the hardware will never display. But there's
> > fundamentally no way to avoid that - if we do the panic state update
> > explicitly after writing to the hardware, we might instead write to an
> > old buffer that the user will barely ever see.
> > 
> > Note that an rcu protected dereference of plane->state would give us
> > the same guarantees, but it has the downside that we then need to
> > protect the plane state freeing functions with call_rcu too. Which
> > would very widely impact a lot of code and therefore doesn't seem
> > worth the complexity compared to a raw spinlock with very tiny
> > critical sections. Plus rcu cannot be used to protect access to
> > peek/poke registers anyway, so we'd still need it for those cases.
> > 
> > Peek/poke registers for vram access (or a gart pte reserved just for
> > panic code) are also the reason I've gone with a per-device and not
> > per-plane spinlock, since usually these things are global for the
> > entire display. Going with per-plane locks would mean drivers for such
> > hardware would need additional locks, which we don't want, since it
> > deviates from the per-console takeover locks design.
> > 
> > Longer term it might be useful if the panic notifiers grow a bit more
> > structure than just the absolute bare
> > EXPORT_SYMBOL(panic_notifier_list) - somewhat aside, why is that not
> > EXPORT_SYMBOL_GPL ... If panic notifiers would be more like console
> > drivers with proper register/unregister interfaces we could perhaps
> > reuse the very fancy console lock with all its check and takeover
> > semantics that John Ogness is developing to fix the console_lock mess.
> > But for the initial cut of a drm panic printing support I don't think
> > we need that, because the critical sections are extremely small and
> > only happen once per display refresh. So generally just 60 tiny locked
> > sections per second, which is nothing compared to a serial console
> > running a 115kbaud doing really slow mmio writes for each byte. So for
> > now the raw spintrylock in drm panic notifier callback should be good
> > enough.
> > 
> > Another benefit of making panic notifiers more like full blown
> > consoles (that are used in panics only) would be that we get the two
> > stage design, where first all the safe outputs are used. And then the
> > dangerous takeover tricks are deployed (where for display drivers we
> > also might try to intercept any in-flight display buffer flips, which
> > if we race and misprogram fifos and watermarks can hang the memory
> > controller on some hw)

Re: [RFC] drm/panic: Add drm panic locking

2024-03-14 Thread Daniel Vetter
On Fri, Mar 01, 2024 at 02:03:12PM +0100, Jocelyn Falempe wrote:
> Thanks for the patch.
> 
> I think it misses to initialize the lock, so we need to add a
> raw_spin_lock_init() in the drm device initialization.
> 
> Also I'm wondering if it make sense to put that under the CONFIG_DRM_PANIC
> flag, so that if you don't enable it, panic_lock() and panic_unlock() would
> be no-op.
> But that may not work if the driver uses this lock to protect some register
> access.

If we get drivers to use this for some of their own locking we have to
keep it enabled unconditionally. Also I think locking that's only
conditional on Kconfig is just a bit too surprising to be a good idea
irrespective of this specific case.
-Sima

> 
> Best regards,
> 
> -- 
> 
> Jocelyn
> 
> On 01/03/2024 11:39, Daniel Vetter wrote:
> > Rough sketch for the locking of drm panic printing code. The upshot of
> > this approach is that we can pretty much entirely rely on the atomic
> > commit flow, with the pair of raw_spin_lock/unlock providing any
> > barriers we need, without having to create really big critical
> > sections in code.
> > 
> > This also avoids the need that drivers must explicitly update the
> > panic handler state, which they might forget to do, or not do
> > consistently, and then we blow up in the worst possible times.
> > 
> > It is somewhat racy against a concurrent atomic update, and we might
> > write into a buffer which the hardware will never display. But there's
> > fundamentally no way to avoid that - if we do the panic state update
> > explicitly after writing to the hardware, we might instead write to an
> > old buffer that the user will barely ever see.
> > 
> > Note that an rcu protected dereference of plane->state would give us
> > the same guarantees, but it has the downside that we then need to
> > protect the plane state freeing functions with call_rcu too. Which
> > would very widely impact a lot of code and therefore doesn't seem
> > worth the complexity compared to a raw spinlock with very tiny
> > critical sections. Plus rcu cannot be used to protect access to
> > peek/poke registers anyway, so we'd still need it for those cases.
> > 
> > Peek/poke registers for vram access (or a gart pte reserved just for
> > panic code) are also the reason I've gone with a per-device and not
> > per-plane spinlock, since usually these things are global for the
> > entire display. Going with per-plane locks would mean drivers for such
> > hardware would need additional locks, which we don't want, since it
> > deviates from the per-console takeover locks design.
> > 
> > Longer term it might be useful if the panic notifiers grow a bit more
> > structure than just the absolute bare
> > EXPORT_SYMBOL(panic_notifier_list) - somewhat aside, why is that not
> > EXPORT_SYMBOL_GPL ... If panic notifiers would be more like console
> > drivers with proper register/unregister interfaces we could perhaps
> > reuse the very fancy console lock with all its check and takeover
> > semantics that John Ogness is developing to fix the console_lock mess.
> > But for the initial cut of a drm panic printing support I don't think
> > we need that, because the critical sections are extremely small and
> > only happen once per display refresh. So generally just 60 tiny locked
> > sections per second, which is nothing compared to a serial console
> > running a 115kbaud doing really slow mmio writes for each byte. So for
> > now the raw spintrylock in drm panic notifier callback should be good
> > enough.
> > 
> > Another benefit of making panic notifiers more like full blown
> > consoles (that are used in panics only) would be that we get the two
> > stage design, where first all the safe outputs are used. And then the
> > dangerous takeover tricks are deployed (where for display drivers we
> > also might try to intercept any in-flight display buffer flips, which
> > if we race and misprogram fifos and watermarks can hang the memory
> > controller on some hw).
> > 
> > For context the actual implementation on the drm side is by Jocelyn
> > and this patch is meant to be combined with the overall approach in
> > v7 (v8 is a bit less flexible, which I think is the wrong direction):
> > 
> > https://lore.kernel.org/dri-devel/20240104160301.185915-1-jfale...@redhat.com/
> > 
> > Note that the locking is very much not correct there, hence this
> > separate rfc.
> > 
> > v2:
> > - fix authorship, this was all my typing
> > - some typo oopsies
> > - link to the drm panic work by Jocelyn for 

Re: [RFC] drm/panic: Add drm panic locking

2024-03-14 Thread Daniel Vetter
On Tue, Mar 05, 2024 at 09:20:04AM +0106, John Ogness wrote:
> Hi Daniel,
> 
> Great to see this moving forward!
> 
> On 2024-03-01, Daniel Vetter  wrote:
> > But for the initial cut of a drm panic printing support I don't think
> > we need that, because the critical sections are extremely small and
> > only happen once per display refresh. So generally just 60 tiny locked
> > sections per second, which is nothing compared to a serial console
> > running a 115kbaud doing really slow mmio writes for each byte. So for
> > now the raw spintrylock in drm panic notifier callback should be good
> > enough.
> 
> Is there a reason you do not use the irqsave/irqrestore variants? By
> leaving interrupts enabled, there is the risk that a panic from any
> interrupt handler may block the drm panic handler.

tbh I simply did not consider that it could be useful. But yeah, if we're
unlucky and an interrupt happens in here and dies, the drm panic handler
cannot run. And this code is definitely not hot enough to matter, the
usual driver code for a plane flip does a few more irqsafe spinlocks on
top. One more doesn't add anything I think, and I guess if it does we'll
notice :-)

Also irqsave makes drm_panic_lock/unlock a bit more widely useful to
protect driver mmio access since then it also works from irq handlers.
Means we have to pass irqflags around, but that sounds acceptable. So very
much has my vote.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] MAINTAINERS: Update email address for Tvrtko Ursulin

2024-03-05 Thread Daniel Vetter
On Wed, Feb 28, 2024 at 02:22:40PM +, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin 
> 
> I will lose access to my @.*intel.com e-mail addresses soon so let me
> adjust the maintainers entry and update the mailmap too.
> 
> While at it consolidate a few other of my old emails to point to the
> main one.
> 
> Signed-off-by: Tvrtko Ursulin 
> Cc: Daniel Vetter 
> Cc: Dave Airlie 
> Cc: Jani Nikula 
> Cc: Joonas Lahtinen 
> Cc: Rodrigo Vivi 

Directly applied to drm-fixes as requested on irc.
-Sima

> ---
>  .mailmap| 5 +
>  MAINTAINERS | 2 +-
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/.mailmap b/.mailmap
> index b99a238ee3bd..d67e351bce8e 100644
> --- a/.mailmap
> +++ b/.mailmap
> @@ -608,6 +608,11 @@ TripleX Chung  
>  TripleX Chung  
>  Tsuneo Yoshioka 
>  Tudor Ambarus  
> +Tvrtko Ursulin  
> +Tvrtko Ursulin  
> +Tvrtko Ursulin  
> +Tvrtko Ursulin  
> +Tvrtko Ursulin  
>  Tycho Andersen  
>  Tzung-Bi Shih  
>  Uwe Kleine-König 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 19f6f8014f94..b940bfe2a692 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -10734,7 +10734,7 @@ INTEL DRM I915 DRIVER (Meteor Lake, DG2 and older 
> excluding Poulsbo, Moorestown
>  M:   Jani Nikula 
>  M:   Joonas Lahtinen 
>  M:   Rodrigo Vivi 
> -M:   Tvrtko Ursulin 
> +M:   Tvrtko Ursulin 
>  L:   intel-...@lists.freedesktop.org
>  S:   Supported
>  W:   https://drm.pages.freedesktop.org/intel-docs/
> -- 
> 2.40.1
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[RFC] drm/panic: Add drm panic locking

2024-03-01 Thread Daniel Vetter
Rough sketch for the locking of drm panic printing code. The upshot of
this approach is that we can pretty much entirely rely on the atomic
commit flow, with the pair of raw_spin_lock/unlock providing any
barriers we need, without having to create really big critical
sections in code.

This also avoids the need that drivers must explicitly update the
panic handler state, which they might forget to do, or not do
consistently, and then we blow up in the worst possible times.

It is somewhat racy against a concurrent atomic update, and we might
write into a buffer which the hardware will never display. But there's
fundamentally no way to avoid that - if we do the panic state update
explicitly after writing to the hardware, we might instead write to an
old buffer that the user will barely ever see.

Note that an rcu protected dereference of plane->state would give us
the same guarantees, but it has the downside that we then need to
protect the plane state freeing functions with call_rcu too. Which
would very widely impact a lot of code and therefore doesn't seem
worth the complexity compared to a raw spinlock with very tiny
critical sections. Plus rcu cannot be used to protect access to
peek/poke registers anyway, so we'd still need it for those cases.

Peek/poke registers for vram access (or a gart pte reserved just for
panic code) are also the reason I've gone with a per-device and not
per-plane spinlock, since usually these things are global for the
entire display. Going with per-plane locks would mean drivers for such
hardware would need additional locks, which we don't want, since it
deviates from the per-console takeover locks design.

Longer term it might be useful if the panic notifiers grow a bit more
structure than just the absolute bare
EXPORT_SYMBOL(panic_notifier_list) - somewhat aside, why is that not
EXPORT_SYMBOL_GPL ... If panic notifiers would be more like console
drivers with proper register/unregister interfaces we could perhaps
reuse the very fancy console lock with all its check and takeover
semantics that John Ogness is developing to fix the console_lock mess.
But for the initial cut of a drm panic printing support I don't think
we need that, because the critical sections are extremely small and
only happen once per display refresh. So generally just 60 tiny locked
sections per second, which is nothing compared to a serial console
running a 115kbaud doing really slow mmio writes for each byte. So for
now the raw spintrylock in drm panic notifier callback should be good
enough.

Another benefit of making panic notifiers more like full blown
consoles (that are used in panics only) would be that we get the two
stage design, where first all the safe outputs are used. And then the
dangerous takeover tricks are deployed (where for display drivers we
also might try to intercept any in-flight display buffer flips, which
if we race and misprogram fifos and watermarks can hang the memory
controller on some hw).

For context the actual implementation on the drm side is by Jocelyn
and this patch is meant to be combined with the overall approach in
v7 (v8 is a bit less flexible, which I think is the wrong direction):

https://lore.kernel.org/dri-devel/20240104160301.185915-1-jfale...@redhat.com/

Note that the locking is very much not correct there, hence this
separate rfc.

v2:
- fix authorship, this was all my typing
- some typo oopsies
- link to the drm panic work by Jocelyn for context

Signed-off-by: Daniel Vetter 
Cc: Jocelyn Falempe 
Cc: Andrew Morton 
Cc: "Peter Zijlstra (Intel)" 
Cc: Lukas Wunner 
Cc: Petr Mladek 
Cc: Steven Rostedt 
Cc: John Ogness 
Cc: Sergey Senozhatsky 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_atomic_helper.c |  3 +
 include/drm/drm_mode_config.h   | 10 +++
 include/drm/drm_panic.h | 99 +
 3 files changed, 112 insertions(+)
 create mode 100644 include/drm/drm_panic.h

diff --git a/drivers/gpu/drm/drm_atomic_helper.c 
b/drivers/gpu/drm/drm_atomic_helper.c
index 40c2bd3e62e8..5a908c186037 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -3086,6 +3087,7 @@ int drm_atomic_helper_swap_state(struct drm_atomic_state 
*state,
}
}
 
+   drm_panic_lock(state->dev);
for_each_oldnew_plane_in_state(state, plane, old_plane_state, 
new_plane_state, i) {
WARN_ON(plane->state != old_plane_state);
 
@@ -3095,6 +3097,7 @@ int drm_atomic_helper_swap_state(struct drm_atomic_state 
*state,
state->planes[i].state = old_plane_state;
plane->state = new_plane_state;
}
+   drm_panic_unlock(state->dev);
 
for_each_oldnew_private_obj_in_state(state, obj, old_obj_state, 
new_obj_state, i) {
  

[RFC] drm/panic: Add drm panic locking

2024-03-01 Thread Daniel Vetter
From: Jocelyn Falempe 

Rough sketch for the locking of drm panic printing code. The upshot of
this approach is that we can pretty much entirely rely on the atomic
commit flow, with the pair of raw_spin_lock/unlock providing any
barriers we need, without having to create really big critical
sections in code.

This also avoids the need that drivers must explicitly update the
panic handler state, which they might forget to do, or not do
consistently, and then we blow up in the worst possible times.

It is somewhat racy against a concurrent atomic update, and we might
write into a buffer which the hardware will never display. But there's
fundamentally no way to avoid that - if we do the panic state update
explicitly after writing to the hardware, we might instead write to an
old buffer that the user will barely ever see.

Note that an rcu protected dereference of plane->state would give us
the same guarantees, but it has the downside that we then need to
protect the plane state freeing functions with call_rcu too. Which
would very widely impact a lot of code and therefore doesn't seem
worth it compared to a raw spinlock with very tiny critical
sections. Plus rcu cannot be used to protect access to peek/poke registers
anyway, so we'd still need it for those cases.

Peek/poke registers for vram access (or a gart pte reserved just for
panic code) are also the reason I've gone with a per-device and not
per-plane spinlock, since usually these things are global for the
entire display. Going with per-plane locks would mean drivers for such
hardware would need additional locks, which we don't want, since it
deviates from the per-console takeover locks design.

Longer term it might be useful if the panic notifiers grow a bit more
structure than just the absolute bare
EXPORT_SYMBOL(panic_notifier_list) - somewhat aside, why is that not
EXPORT_SYMBOL_GPL ... If panic notifiers would be more like console
drivers with proper register/unregister interfaces we could perhaps
reuse the very fancy console lock with all its check and takeover
semantics that John Ogness is developing to fix the console_lock mess.
But for the initial cut of a drm panic printing support I don't think
we need that, because the critical sections are extremely small and
only happen once per display refresh. So generally just 60 tiny locked
sections per second, which is nothing compared to a serial console
running a 115kbaud doing really slow mmio writes for each byte. So for
now the raw spintrylock in drm panic notifier callback should be good
enough.

Another benefit of making panic notifiers more like full blown
consoles (that are used in panics only) would be that we get the two
stage design, where first all the safe outputs are used. And then the
dangerous takeover tricks are deployed (where for display drivers we
also might try to intercept any in-flight display buffer flips, which
if we race and misprogram fifos and watermarks can hang the memory
controller on some hw).

Signed-off-by: Daniel Vetter 
Cc: Jocelyn Falempe 
Cc: Andrew Morton 
Cc: "Peter Zijlstra (Intel)" 
Cc: Lukas Wunner 
Cc: Petr Mladek 
Cc: Steven Rostedt 
Cc: John Ogness 
Cc: Sergey Senozhatsky 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_atomic_helper.c |  3 +
 include/drm/drm_mode_config.h   | 10 +++
 include/drm/drm_panic.h | 99 +
 3 files changed, 112 insertions(+)
 create mode 100644 include/drm/drm_panic.h

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 40c2bd3e62e8..5a908c186037 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -3086,6 +3087,7 @@ int drm_atomic_helper_swap_state(struct drm_atomic_state *state,
}
}
 
+   drm_panic_lock(state->dev);
for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
WARN_ON(plane->state != old_plane_state);
 
@@ -3095,6 +3097,7 @@ int drm_atomic_helper_swap_state(struct drm_atomic_state *state,
state->planes[i].state = old_plane_state;
plane->state = new_plane_state;
}
+   drm_panic_unlock(state->dev);
 
for_each_oldnew_private_obj_in_state(state, obj, old_obj_state, new_obj_state, i) {
WARN_ON(obj->state != old_obj_state);
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index 973119a9176b..92a390379e85 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -505,6 +505,16 @@ struct drm_mode_config {
 */
struct list_head plane_list;
 
+   /**
+* @panic_lock:
+*
+* Raw spinlock used to protect critical sections of code that access
+* 

Re: [PATCH v8 3/8] drm/panic: Add debugfs entry to test without triggering panic.

2024-02-29 Thread Daniel Vetter
On Tue, Feb 27, 2024 at 11:04:14AM +0100, Jocelyn Falempe wrote:
> Add a debugfs file, so you can test drm_panic without freezing
> your machine. This is unsafe, and should be enabled only for
> developers or testers.
> 
> To display the drm_panic screen, just run:
> echo 1 > /sys/kernel/debug/drm_panic/trigger
> 
> Signed-off-by: Jocelyn Falempe 
> ---
>  drivers/gpu/drm/Kconfig |  9 +++
>  drivers/gpu/drm/drm_panic.c | 47 +
>  2 files changed, 56 insertions(+)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index c17d8a8f6877..8dcea29f595c 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -125,6 +125,15 @@ config DRM_PANIC_BACKGROUND_COLOR
>   depends on DRM_PANIC
>   default 0x00
>  
> +config DRM_PANIC_DEBUG
> + bool "Add a debug fs entry to trigger drm_panic"
> + depends on DRM_PANIC && DEBUG_FS
> + help
> +   Add drm_panic/trigger in the kernel debugfs, to force the panic
> +   handler to write the panic message to the scanout buffer. This is
> +   unsafe and should not be enabled on a production build.
> +   If in doubt, say "N".
> +
>  config DRM_DEBUG_DP_MST_TOPOLOGY_REFS
>  bool "Enable refcount backtrace history in the DP MST helpers"
>   depends on STACKTRACE_SUPPORT
> diff --git a/drivers/gpu/drm/drm_panic.c b/drivers/gpu/drm/drm_panic.c
> index c9f386476ef9..c5d3f725c5f5 100644
> --- a/drivers/gpu/drm/drm_panic.c
> +++ b/drivers/gpu/drm/drm_panic.c
> @@ -398,3 +398,50 @@ void drm_panic_unregister(struct drm_plane *plane)
>  }
>  EXPORT_SYMBOL(drm_panic_unregister);
>  
> +
> +/*
> + * DEBUG, This is currently unsafe.
> + * Also it will call all panic_notifier, since there is no way to filter and
> + * only call the drm_panic notifier.
> + */
> +#ifdef CONFIG_DRM_PANIC_DEBUG
> +#include 
> +
> +static struct dentry *debug_dir;
> +static struct dentry *debug_trigger;
> +
> +static ssize_t dbgfs_trigger_write(struct file *file, const char __user *user_buf,
> +size_t count, loff_t *ppos)
> +{
> + bool run;
> +
> + if (kstrtobool_from_user(user_buf, count, &run) == 0 && run)
> + atomic_notifier_call_chain(&panic_notifier_list, 0, "Test drm panic from debugfs");

Since this is just the general panic notifier it feels very misplaced in
the drm subsystem. I think moving that code into the core panic code makes
a lot more sense, then we'd also have all the right people on Cc: to
figure out how we can best recreate the correct calling context (like nmi
context or whatever) for best case simulation of panic code. John Ogness
definitely needs to see this and ack, wherever we put it.
-Sima

> + return count;
> +}
> +
> +static const struct file_operations dbg_drm_panic_ops = {
> + .owner = THIS_MODULE,
> + .write = dbgfs_trigger_write,
> +};
> +
> +static int __init debugfs_start(void)
> +{
> + debug_dir = debugfs_create_dir("drm_panic", NULL);
> +
> + if (IS_ERR(debug_dir))
> + return PTR_ERR(debug_dir);
> + debug_trigger = debugfs_create_file("trigger", 0200, debug_dir,
> + NULL, &dbg_drm_panic_ops);
> + return 0;
> +}
> +
> +static void __exit debugfs_end(void)
> +{
> + debugfs_remove_recursive(debug_dir);
> +}
> +
> +module_init(debugfs_start);
> +module_exit(debugfs_end);
> +
> +#endif
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH v8 5/8] drm/simpledrm: Add drm_panic support

2024-02-29 Thread Daniel Vetter
On Tue, Feb 27, 2024 at 11:04:16AM +0100, Jocelyn Falempe wrote:
> Add support for the drm_panic module, which displays a user-friendly
> message to the screen when a kernel panic occurs.
> 
> v8:
>  * Replace get_scanout_buffer() with drm_panic_set_buffer()
>(Thomas Zimmermann)
> 
> Signed-off-by: Jocelyn Falempe 
> ---
>  drivers/gpu/drm/tiny/simpledrm.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/gpu/drm/tiny/simpledrm.c 
> b/drivers/gpu/drm/tiny/simpledrm.c
> index 7ce1c4617675..a2190995354a 100644
> --- a/drivers/gpu/drm/tiny/simpledrm.c
> +++ b/drivers/gpu/drm/tiny/simpledrm.c
> @@ -25,6 +25,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #define DRIVER_NAME  "simpledrm"
> @@ -735,6 +736,20 @@ static const struct drm_connector_funcs 
> simpledrm_connector_funcs = {
>   .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
>  };
>  
> +static void simpledrm_init_panic_buffer(struct drm_plane *plane)
> +{
> + struct simpledrm_device *sdev = simpledrm_device_of_dev(plane->dev);
> + struct drm_framebuffer fb;
> +
> + /* Fake framebuffer struct for drm_panic_set_buffer */
> + fb.width = sdev->mode.hdisplay;
> + fb.height = sdev->mode.vdisplay;
> + fb.format = sdev->format;
> + fb.pitches[0] = sdev->pitch;
> +
> + drm_panic_set_buffer(plane->panic_scanout, &fb, &sdev->screen_base);
> +}
> +
>  static const struct drm_mode_config_funcs simpledrm_mode_config_funcs = {
>   .fb_create = drm_gem_fb_create_with_dirty,
>   .atomic_check = drm_atomic_helper_check,
> @@ -945,6 +960,8 @@ static struct simpledrm_device 
> *simpledrm_device_create(struct drm_driver *drv,
>   return ERR_PTR(ret);
>   drm_plane_helper_add(primary_plane, &simpledrm_primary_plane_helper_funcs);
>   drm_plane_enable_fb_damage_clips(primary_plane);
> + drm_panic_register(primary_plane);

Just a quick comment on this:

This does not work, the driver is not ready to handle panic calls at this
stage. Instead we need to automatically register all planes that support
panic handling in drm_dev_register(), and we need to remove them all again
in drm_dev_unregister(). Outside of these functions it is not safe to call
into driver code.

At that point it might be simpler to only register one panic notifier per
drm_device, and push the loop into the panic handler again.

Cheers, Sima

> + simpledrm_init_panic_buffer(primary_plane);
>  
>   /* CRTC */
>  
> -- 
> 2.43.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: UAPI Re: [PATCH 1/3] drm: Add DRM_MODE_TV_MODE_MONOCHROME

2024-02-29 Thread Daniel Vetter
On Wed, Feb 28, 2024 at 04:22:56PM +, Simon Ser wrote:
> On Wednesday, February 28th, 2024 at 17:14, Maxime Ripard 
>  wrote:
> 
> > > I don't know what the rules were 8 years ago, but the current uAPI rules
> > > are what they are, and a new enum entry is new uAPI.
> > 
> > TBF, and even if the wayland compositors support is missing, this
> > property is perfectly usable as it is with upstream, open-source code,
> > through either the command-line or X.org, and it's documented.
> > 
> > So it's fine by me from a UAPI requirement side.
> 
> That is not a valid way to pass the uAPI requirements IMHO. Yes, one
> can program any KMS property via modetest or xrandr. Does that mean that
> none of the new uAPI need a "real" implementation anymore? Does that mean
> that the massive patch adding a color pipeline uAPI doesn't need
> user-space anymore?

xrandr only supports properties on the connector, so it's right out for
the color pipeline.

Also "we use xrandr for color properties" very much doesn't pass the bs
filter of "is it a toy".

My take would be that this escape hatch is also not valid for all
connector property, stuff that is clearly meant to be configured
automatically by the compositors cannot use the "we use xrandr" excuse,
because users can't type fast enough and hit enter precisely enough to
update a property in lockstep with the compositor's redraw loop :-)

> The only thing I'm saying is that this breaks the usual DRM requirements.
> If, as a maintainer, you're fine with breaking the rules and have a good
> motivation to do so, that's fine by me. Rules are meant to be broken from
> time to time depending on the situation. But please don't pretend that
> modetest/xrandr is valid user-space to pass the rules.

I think it bends it pretty badly, because people running native Xorg are
slowly going away, and the modetest hack does not clear the bar for "is it
a joke/test/demo hack" for me.

I think some weston (or whatever compositor you like) config file support
to set a bunch of "really only way to configure is by hand" output
properties would clear the bar here for me. Because that is a feature I
already mentioned that xrandr _does_ have, and which modetest hackery very
much does not.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH RFC 0/4] Support for Simulated Panels

2024-02-29 Thread Daniel Vetter
On Wed, Feb 28, 2024 at 01:49:34PM -0800, Jessica Zhang wrote:
> 
> 
> On 2/2/2024 2:15 AM, Maxime Ripard wrote:
> > On Tue, Jan 30, 2024 at 09:53:13AM +0100, Daniel Vetter wrote:
> > > > > > > > Wouldn't it be simpler if we had a vkms-like panel that we 
> > > > > > > > could either
> > > > > > > > configure from DT or from debugfs that would just be registered 
> > > > > > > > the
> > > > > > > > usual way and would be the only panel we register?
> > > > > > > 
> > > > > > 
> > > > > > No, we need to have validate actual hardware pipeline with the 
> > > > > > simulated
> > > > > > panel. With vkms, actual display pipeline will not be validated. 
> > > > > > With
> > > > > > incorrect display pipeline misconfigurations arising from different 
> > > > > > panel
> > > > > > combinations, this can easily be caught with any existing IGT CRC 
> > > > > > testing.
> > > > > > In addition, all performance related bugs can also be easily caught 
> > > > > > by
> > > > > > simulating high resolution displays.
> > > > > 
> > > > > That's not what I meant. What I meant was that something like a
> > > > > user-configurable, generic, panel driver would be a good idea. Just 
> > > > > like
> > > > > vkms (with the debugfs patches) is for a full blown KMS device.
> > > > > 
> > > > 
> > > > Let me respond for both this question and the one below from you/Jani.
> > > > 
> > > > Certainly having user-configurable information is a goal here. The 
> > > > end-goal
> > > > is to make everything there in the existing panels such as below like I
> > > > wrote:
> > > > 
> > > > 1) Display resolution with timings (drm_display_mode)
> > > > 2) Compression/non-compression
> > > > 3) Command mode/Video mode
> > > > 4) MIPI mode flags
> > > > 5) DCS commands for panel enable/disable and other panel sequences
> > > > 6) Power-up/Power-down sequence for the panel
> > > > 
> > > > But, we also have to see what all is feasible today from the DRM fwk
> > > > standpoint. There are some limitations about what is boot-time 
> > > > configurable
> > > > using bootparams and what is runtime configurable (across a modeset) 
> > > > using
> > > > debugfs.
> > > > 
> > > > 1) Today, everything part of struct mipi_dsi_device needs to be 
> > > > available at
> > > > boot time from what I can see as we need that while calling
> > > > mipi_dsi_attach(). So for that we went with boot-params.
> > > > 
> > > > 2) For the list of modes, we can move this to a debugfs like
> > > > "populate_modes" which the client using a sim panel can call before 
> > > > picking
> > > > a mode and triggering a commit.
> > > > 
> > > > But we need to have some default mode and configuration.
> > > 
> > > Uh, at the risk of sounding a bit like I'm just chasing the latest
> > > buzzwords, but this sounds like something that's screaming for ebpf.
> > 
> > I make a half-joke to Jani on IRC about it, but I was also being
> > half-serious. If the goal we want to have is to fully emulate any panel
> > variation, ebpf really looks like the best and most flexible way
> > forward.
> 
> Hi Maxime and Daniel,
> 
> For our current sim panel requirements, we can go with implementing the
> configfs first then add ebpf if requirements get more complex.

Agreed, this is definitely the pragmatic approach to get this going.
-Sima
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC PATCH v4 00/42] Color Pipeline API w/ VKMS

2024-02-29 Thread Daniel Vetter
drivers/gpu/drm/drm_mode_config.c |   7 +
>  drivers/gpu/drm/drm_plane.c   |  52 ++
>  drivers/gpu/drm/tests/Makefile|   3 +-
>  drivers/gpu/drm/tests/drm_fixp_test.c |  69 ++
>  drivers/gpu/drm/vkms/Kconfig  |  20 +
>  drivers/gpu/drm/vkms/Makefile |   4 +-
>  drivers/gpu/drm/vkms/tests/.kunitconfig   |   4 +
>  drivers/gpu/drm/vkms/tests/vkms_color_tests.c | 449 ++
>  drivers/gpu/drm/vkms/vkms_colorop.c   | 100 +++
>  drivers/gpu/drm/vkms/vkms_composer.c  | 135 ++-
>  drivers/gpu/drm/vkms/vkms_drv.h   |   8 +
>  drivers/gpu/drm/vkms/vkms_luts.c  | 802 ++
>  drivers/gpu/drm/vkms/vkms_luts.h  |  12 +
>  drivers/gpu/drm/vkms/vkms_plane.c |   2 +
>  include/drm/drm_atomic.h  | 122 +++
>  include/drm/drm_atomic_uapi.h |   3 +
>  include/drm/drm_colorop.h | 301 +++
>  include/drm/drm_file.h|   7 +
>  include/drm/drm_fixed.h   |  35 +-
>  include/drm/drm_mode_config.h |  18 +
>  include/drm/drm_plane.h   |  13 +
>  include/uapi/drm/drm.h|  16 +
>  include/uapi/drm/drm_mode.h   |  14 +
>  38 files changed, 3882 insertions(+), 30 deletions(-)
>  create mode 100644 Documentation/gpu/rfc/color_pipeline.rst
>  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_colorop.c
>  create mode 100644 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_colorop.h
>  create mode 100644 drivers/gpu/drm/drm_colorop.c
>  create mode 100644 drivers/gpu/drm/tests/drm_fixp_test.c
>  create mode 100644 drivers/gpu/drm/vkms/Kconfig
>  create mode 100644 drivers/gpu/drm/vkms/tests/.kunitconfig
>  create mode 100644 drivers/gpu/drm/vkms/tests/vkms_color_tests.c
>  create mode 100644 drivers/gpu/drm/vkms/vkms_colorop.c
>  create mode 100644 drivers/gpu/drm/vkms/vkms_luts.c
>  create mode 100644 drivers/gpu/drm/vkms/vkms_luts.h
>  create mode 100644 include/drm/drm_colorop.h
> 
> --
> 2.44.0
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC] drm/fourcc: Add RPI modifiers

2024-02-29 Thread Daniel Vetter
On Wed, Feb 28, 2024 at 01:13:45PM +0200, Laurent Pinchart wrote:
> On Wed, Feb 28, 2024 at 11:41:57AM +0100, Jacopo Mondi wrote:
> > On Tue, Feb 27, 2024 at 03:08:27PM +0200, Laurent Pinchart wrote:
> > > On Mon, Feb 26, 2024 at 04:46:24PM +0100, Daniel Vetter wrote:
> > > > On Mon, 26 Feb 2024 at 16:39, Jacopo Mondi wrote:
> > > > >
> > > > > Add modifiers for the Raspberry Pi PiSP compressed formats.
> > > > >
> > > > > The compressed formats are documented at:
> > > > > Documentation/userspace-api/media/v4l/pixfmt-pisp-comp-rggb.rst
> > > > >
> > > > > and in the PiSP datasheet:
> > > > > https://datasheets.raspberrypi.com/camera/raspberry-pi-image-signal-processor-specification.pdf
> > > > >
> > > > > Signed-off-by: Jacopo Mondi 
> > > > > ---
> > > > >
> > > > > Background:
> > > > > ---
> > > > >
> > > > > The Raspberry Pi PiSP camera subsystem is on its way to upstream 
> > > > > through the
> > > > > Video4Linux2 subsystem:
> > > > > https://patchwork.linuxtv.org/project/linux-media/list/?series=12310
> > > > >
> > > > > The PiSP camera system is composed by a "Front End" and a "Back End".
> > > > > The FrontEnd part is a MIPI CSI-2 receiver that stores frames to 
> > > > > memory and produces statistics, and the BackEnd is a memory-to-memory ISP that 
> > > > > converts
> > > > > images in a format usable by application.
> > > > >
> > > > > The "FrontEnd" is capable of encoding RAW Bayer images as received by 
> > > > > the
> > > > > image sensor in a 'compressed' format defined by Raspberry Pi and 
> > > > > fully
> > > > > documented in the PiSP manual:
> > > > > https://datasheets.raspberrypi.com/camera/raspberry-pi-image-signal-processor-specification.pdf
> > > > >
> > > > > The compression scheme is documented in the in-review patch series 
> > > > > for the BE
> > > > > support at:
> > > > > https://patchwork.linuxtv.org/project/linux-media/patch/20240223163012.300763-7-jacopo.mo...@ideasonboard.com/
> > > > >
> > > > > The "BackEnd" is capable of consuming images in the compressed format 
> > > > > and
> > > > > optionally user application might want to inspect those images for 
> > > > > debugging
> > > > > purposes.
> > > > >
> > > > > Why a DRM modifier
> > > > > --
> > > > >
> > > > > The PiSP support is entirely implemented in libcamera, with the 
> > > > > support of an
> > > > > hw-specific library called 'libpisp'.
> > > > >
> > > > > libcamera uses the fourcc codes defined by DRM to define its formats:
> > > > > https://git.libcamera.org/libcamera/libcamera.git/tree/src/libcamera/formats.yaml
> > > > >
> > > > > And to define a new libcamera format for the Raspberry Pi compressed 
> > > > > ones we
> > > > > need to associate the above proposed modifiers with a RAW Bayer format
> > > > > identifier.
> > > > >
> > > > > In example:
> > > > >
> > > > >   - RGGB16_PISP_COMP1:
> > > > >   fourcc: DRM_FORMAT_SRGGB16
> > >
> > > An "interesting" issue here is that these formats currently live in
> > > libcamera only, we haven't merged them in DRM "yet". This may be a
> > > prerequisite ?
> > >
> > 
> > Ah right! I didn't notice!
> > 
> > I think there are two issues at play here, one to be clarified by the
> > DRM maintainers, the other more technically involved with the
> > definition of the Bayer formats themselves.
> > 
> > - Does DRM want RAW Bayer formats to be listed here, as these are not
> >   typically 'graphic' formats. What's the DRM maintainers opinion here ?
> 
> To give some context, the "historical mistake" I keep referring to
> regarding V4L2 is the decision to combine the bit depth of raw formats
> with the colour filter array (a.k.a. Bayer) pattern into a fourcc. I
> think we should have defined raw pixel formats that only encode a bit
> depth, and conveyed the CFA pa

Re: [RFC] drm/fourcc: Add RPI modifiers

2024-02-29 Thread Daniel Vetter
On Tue, Feb 27, 2024 at 03:10:06PM +0200, Laurent Pinchart wrote:
> On Mon, Feb 26, 2024 at 05:24:41PM +0100, Jacopo Mondi wrote:
> > On Mon, Feb 26, 2024 at 04:46:24PM +0100, Daniel Vetter wrote:
> > > On Mon, 26 Feb 2024 at 16:39, Jacopo Mondi wrote:
> > > >
> > > > Add modifiers for the Raspberry Pi PiSP compressed formats.
> > > >
> > > > The compressed formats are documented at:
> > > > Documentation/userspace-api/media/v4l/pixfmt-pisp-comp-rggb.rst
> > > >
> > > > and in the PiSP datasheet:
> > > > https://datasheets.raspberrypi.com/camera/raspberry-pi-image-signal-processor-specification.pdf
> > > >
> > > > Signed-off-by: Jacopo Mondi 
> > > > ---
> > > >
> > > > Background:
> > > > ---
> > > >
> > > > The Raspberry Pi PiSP camera subsystem is on its way to upstream 
> > > > through the
> > > > Video4Linux2 subsystem:
> > > > https://patchwork.linuxtv.org/project/linux-media/list/?series=12310
> > > >
> > > > The PiSP camera system is composed by a "Front End" and a "Back End".
> > > > The FrontEnd part is a MIPI CSI-2 receiver that stores frames to memory 
> > > > and produces statistics, and the BackEnd is a memory-to-memory ISP that 
> > > > converts
> > > > images in a format usable by application.
> > > >
> > > > The "FrontEnd" is capable of encoding RAW Bayer images as received by 
> > > > the
> > > > image sensor in a 'compressed' format defined by Raspberry Pi and fully
> > > > documented in the PiSP manual:
> > > > https://datasheets.raspberrypi.com/camera/raspberry-pi-image-signal-processor-specification.pdf
> > > >
> > > > The compression scheme is documented in the in-review patch series for 
> > > > the BE
> > > > support at:
> > > > https://patchwork.linuxtv.org/project/linux-media/patch/20240223163012.300763-7-jacopo.mo...@ideasonboard.com/
> > > >
> > > > The "BackEnd" is capable of consuming images in the compressed format 
> > > > and
> > > > optionally user application might want to inspect those images for 
> > > > debugging
> > > > purposes.
> > > >
> > > > Why a DRM modifier
> > > > --
> > > >
> > > > The PiSP support is entirely implemented in libcamera, with the support 
> > > > of an
> > > > hw-specific library called 'libpisp'.
> > > >
> > > > libcamera uses the fourcc codes defined by DRM to define its formats:
> > > > https://git.libcamera.org/libcamera/libcamera.git/tree/src/libcamera/formats.yaml
> > > >
> > > > And to define a new libcamera format for the Raspberry Pi compressed 
> > > > ones we
> > > > need to associate the above proposed modifiers with a RAW Bayer format
> > > > identifier.
> > > >
> > > > In example:
> > > >
> > > >   - RGGB16_PISP_COMP1:
> > > >   fourcc: DRM_FORMAT_SRGGB16
> > > >   mod: PISP_FORMAT_MOD_COMPRESS_MODE1
> > > >   - GRBG16_PISP_COMP1:
> > > >   fourcc: DRM_FORMAT_SGRBG16
> > > >   mod: PISP_FORMAT_MOD_COMPRESS_MODE1
> > > >   - GBRG16_PISP_COMP1:
> > > >   fourcc: DRM_FORMAT_SGBRG16
> > > >   mod: PISP_FORMAT_MOD_COMPRESS_MODE1
> > > >   - BGGR16_PISP_COMP1:
> > > >   fourcc: DRM_FORMAT_SBGGR16
> > > >   mod: PISP_FORMAT_MOD_COMPRESS_MODE1
> > > >   - MONO_PISP_COMP1:
> > > >   fourcc: DRM_FORMAT_R16
> > > >   mod: PISP_FORMAT_MOD_COMPRESS_MODE1
> > > >
> > > > See
> > > > https://patchwork.libcamera.org/patch/19503/
> > > >
> > > > Would if be acceptable for DRM to include the above proposed modifiers 
> > > > for the
> > > > purpose of defining the above presented libcamera formats ? There will 
> > > > be no
> > > > graphic format associated with these modifiers as their purpose it not
> > > > displaying images but rather exchange them between the components of the
> > > > camera subsystem (and possibly be inspected by specialized test 
> > > > applications).
> > >
> > > Yeah I think libcamera using drm-fourcc formats and modifiers is
> > >

Re: drm-misc migration to Gitlab server

2024-02-28 Thread Daniel Vetter
Hi Stephen!

On Wed, Feb 21, 2024 at 09:46:43AM +1100, Stephen Rothwell wrote:
> On Tue, 20 Feb 2024 11:25:05 + Daniel Stone  wrote:
> >
> > On Tue, 20 Feb 2024 at 09:05, Maxime Ripard  wrote:
> > > On Tue, Feb 20, 2024 at 09:49:25AM +0100, Maxime Ripard wrote:  
> > > > This will be mostly transparent to current committers and users: we'll
> > > > still use dim, in the exact same way, the only change will be the URL of
> > > > the repo. This will also be transparent to linux-next, since the
> > > > linux-next branch lives in its own repo and is pushed by dim when
> > > > pushing a branch.  
> > >
> > > Actually, I double-checked and linux-next pulls our branches directly,
> > > so once the transition is over we'll have to notify them too.  
> > 
> > cc sfr - once we move the DRM repos to a different location, what's
> > the best way to update linux-next?
> > 
> > That being said, we could set up read-only pull mirrors in the old
> > location ... something I want to do in March (because what else are
> > you going to do on holiday?) is to kill the write repos on kemper
> > (git.fd.o), move them to being on molly (cgit/anongit.fd.o) only, and
> > just have a cronjob that regularly pulls from all the gl.fd.o repos,
> > rather than pushing from GitLab.
> 
> These are (I think) all the drm trees/branches that I fetch every day:
> 
> git://anongit.freedesktop.org/drm-intel#for-linux-next
> git://anongit.freedesktop.org/drm-intel#for-linux-next-fixes
> git://anongit.freedesktop.org/drm/drm-misc#for-linux-next
> git://anongit.freedesktop.org/drm/drm-misc#for-linux-next-fixes
> git://git.freedesktop.org/git/drm/drm.git#drm-fixes
> git://git.freedesktop.org/git/drm/drm.git#drm-next

To test out the process we've moved drm.git first. It's now here:

https://gitlab.freedesktop.org/drm/kernel.git

Still the same two branches. Can you please update the url? We haven't
enabled the auto-mirror for this one, since we want to make sure the
upgrade path in the tooling works and people do switch over to the new
repo.

For the others the plan is to keep the old places automatically mirrored, at
least until the dust has settled.

Thanks!


> git://git.freedesktop.org/git/drm/drm.git#topic/drm-ci
> git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos.git#for-linux-next
> https://gitlab.freedesktop.org/agd5f/linux#drm-next
> https://gitlab.freedesktop.org/drm/msm.git#msm-next
> https://gitlab.freedesktop.org/drm/tegra.git#for-next
> https://gitlab.freedesktop.org/lumag/msm.git#msm-next-lumag
> 
> If someone could just send me all the new equivalent URLs when the
> change happens, I will fix them up in my config.
> 
> -- 
> Cheers,
> Stephen Rothwell



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
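
For anyone updating a fetch config by hand, switching an existing
remote over is a one-liner; a sketch (the remote name "drm" is a
placeholder, the URLs are the ones from this thread):

```shell
# Repoint an existing remote at the new gitlab location.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git remote add drm git://git.freedesktop.org/git/drm/drm.git
git remote set-url drm https://gitlab.freedesktop.org/drm/kernel.git
git remote get-url drm   # prints https://gitlab.freedesktop.org/drm/kernel.git
```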


Re: [PATCH] drm/scheduler: Simplify the allocation of slab caches in drm_sched_fence_slab_init

2024-02-28 Thread Daniel Vetter
On Wed, Feb 21, 2024 at 04:55:58PM +0800, Kunwu Chan wrote:
> Use the new KMEM_CACHE() macro instead of direct kmem_cache_create
> to simplify the creation of SLAB caches.
> 
> Signed-off-by: Kunwu Chan 

Applied to drm-misc-next, thanks for your patch!
-Sima

> ---
>  drivers/gpu/drm/scheduler/sched_fence.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_fence.c 
> b/drivers/gpu/drm/scheduler/sched_fence.c
> index 06cedfe4b486..0f35f009b9d3 100644
> --- a/drivers/gpu/drm/scheduler/sched_fence.c
> +++ b/drivers/gpu/drm/scheduler/sched_fence.c
> @@ -33,9 +33,7 @@ static struct kmem_cache *sched_fence_slab;
>  
>  static int __init drm_sched_fence_slab_init(void)
>  {
> - sched_fence_slab = kmem_cache_create(
> - "drm_sched_fence", sizeof(struct drm_sched_fence), 0,
> - SLAB_HWCACHE_ALIGN, NULL);
> + sched_fence_slab = KMEM_CACHE(drm_sched_fence, SLAB_HWCACHE_ALIGN);
>   if (!sched_fence_slab)
>   return -ENOMEM;
>  
> -- 
> 2.39.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: drm-misc migration to Gitlab server

2024-02-28 Thread Daniel Vetter
Hi Maxime

Just wanted to chime in with a big thank you for volunteering to push
this forward! Best vacations when you come back and are surprised that the
5-year-old project is magically moving :-)
-Sima

On Tue, Feb 20, 2024 at 09:49:25AM +0100, Maxime Ripard wrote:
> Hi,
> 
> As you might have noticed in your mails, Daniel Stone and I have been
> working on adding all the drm-misc maintainers and committers to Gitlab.
> 
> The current repository was still in the cgit instance and was creating
> an unnecessary burden on the admins.
> 
> For example, any new user had to create an issue and go through Daniel
> to create an cgit account, even though that user already needed to have
> a gitlab account to create the issue in the first place. Adding an SSH
> key was a similar story. By moving to Gitlab, we'll remove most of that
> burden.
> 
> This will be mostly transparent to current committers and users: we'll
> still use dim, in the exact same way, the only change will be the URL of
> the repo. This will also be transparent to linux-next, since the
> linux-next branch lives in its own repo and is pushed by dim when
> pushing a branch.
> 
> In the next few days, you might notice conflicting notifications. As we
> figured out the drm-misc group and repo structure, we've added members
> at multiple levels and we will clean things up in the next few days. The
> final organization is that every drm-misc committer and maintainer will
> have permissions over the drm-misc group and its projects, so if it's
> not the case please let us know.
> 
> # What we do next
> 
> ## Adding the remaining users
> 
> I was able to identify most of the users with an account on the old git
> server. However, there's a few I couldn't match with certainty to a
> gitlab account:
> 
> * andr2000
> * jsarha
> 
> Please let me know your Gitlab user so I can add them to the group.
> 
> ## Changing the default location repo
> 
> Dim gets its repos list in the drm-rerere nightly.conf file. We will
> need to change that file to match the gitlab repo, and drop the old cgit
> URLs to avoid people pushing to the wrong place once the transition is
> made.
> 
> I guess the next merge window is a good time to do so, it's usually a
> quiet time for us and a small disruption would be easier to handle. I'll
> be off-duty during that time too, so I'll have time to handle any
> complication.
> 
> ## Updating the documentation
> 
> The documentation currently mentions the old process to request a
> drm-misc access. It will all go through Gitlab now, so it will change a
> few things. We will also need to update and move the issue template to
> the new repo to maintain consistency.
> 
> I would expect the transition (if everything goes smoothly) to occur in
> the merge-window time frame (11/03 -> 24/03).
> 
> Let me know if you have any questions, or if there's anything we missed,
> Maxime



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 0/1] Always record job cycle and timestamp information

2024-02-28 Thread Daniel Vetter
On Wed, Feb 21, 2024 at 03:13:41PM +, Adrián Larumbe wrote:
> > On 21.02.2024 14:34, Tvrtko Ursulin wrote:
> > 
> > On 21/02/2024 09:40, Adrián Larumbe wrote:
> > > Hi,
> > > 
> > > I just wanted to make sure we're on the same page on this matter. So in
> > > Panfrost, and I guess in almost every other single driver out there, HW 
> > > perf
> > > counters and their uapi interface are orthogonal to fdinfo's reporting on 
> > > drm
> > > engine utilisation.
> > > 
> > > At the moment it seems like HW perfcounters and the way they're exposed 
> > > to UM
> > are very idiosyncratic and any attempt to unify their interface into a 
> > > common
> > > set of ioctl's sounds like a gargantuan task I wouldn't like to be faced 
> > > with.
> > 
> > I share the same feeling on this sub-topic.
> > 
> > > As for fdinfo, I guess there's more room for coming up with common 
> > > helpers that
> > > could handle the toggling of HW support for drm engine calculations, but 
> > > I'd at
> > > least have to see how things are being done in let's say, Freedreno or 
> > > Intel.
> > 
> > For Intel we don't need this ability, well at least for pre-GuC platforms.
> > Stat collection is super cheap and permanently enabled there.
> > 
> > But let me copy Umesh because something at the back of my mind is telling me
> > that perhaps there was something expensive about collecting these stats with
> > the GuC backend? If so maybe a toggle would be beneficial there.
> > 
> > > Right now there's a pressing need to get rid of the debugfs knob for 
> > > fdinfo's
> > > drm engine profiling sources in Panfrost, after which I could perhaps 
> > > draw up an
> > > RFC for how to generalise this onto other drivers.
> > 
> > There is a knob currently meaning fdinfo does not work by default? If that 
> > is
> > so, I would have at least expected someone had submitted a patch for gputop 
> > to
> > handle this toggle. It being kind of a common reference implementation I 
> > don't
> > think it is great if it does not work out of the box.
> 
> It does sound like I forgot to document this knob at the time I submitted 
> fdinfo
> support for Panfrost.  I'll make a point of mentioning it in a new patch 
> where I
> drop debugfs support and enable toggling from sysfs instead.
> 
> > The toggle as an idea sounds a bit annoying, but if there is no other
> > realistic way maybe it is not too bad. As long as it is documented in the
> > drm-usage-stats.rst, doesn't live in debugfs, and has some common plumbing
> > implemented both on the kernel side and for the aforementioned gputop /
> > igt_drm_fdinfo / igt_drm_clients. Where and how exactly TBD.
> 
> As soon as the new patch is merged, I'll go and reflect the driver uAPI 
> changes
> in all three of these.

Would be good (and kinda proper per process rules) to implement the code
in at least e.g. gputop for this. To make sure it actually works for that
use-case, and there's not an oversight that breaks it all.
-Sima

> 
> > Regards,
> > 
> > Tvrtko
> > 
> 
> Cheers,
> Adrian
> 
> > > On 16.02.2024 17:43, Tvrtko Ursulin wrote:
> > > > 
> > > > On 16/02/2024 16:57, Daniel Vetter wrote:
> > > > > On Wed, Feb 14, 2024 at 01:52:05PM +, Steven Price wrote:
> > > > > > Hi Adrián,
> > > > > > 
> > > > > > On 14/02/2024 12:14, Adrián Larumbe wrote:
> > > > > > > A driver user expressed interest in being able to access engine 
> > > > > > > usage stats
> > > > > > > through fdinfo when debugfs is not built into their kernel. In 
> > > > > > > the current
> > > > > > > implementation, this wasn't possible, because it was assumed even 
> > > > > > > for
> > > > > > > inflight jobs enabling the cycle counter and timestamp registers 
> > > > > > > would
> > > > > > > incur in additional power consumption, so both were kept disabled 
> > > > > > > until
> > > > > > > toggled through debugfs.
> > > > > > > 
> > > > > > > A second read of the TRM made me think otherwise, but this is 
> > > > > > > something
> > > > > > > that would be best clarified by someone from ARM's side.
> > > > > > 
> > > > >

Re: [PATCH v4 0/2] drm: Check polling initialized before

2024-02-28 Thread Daniel Vetter
On Mon, Feb 19, 2024 at 10:02:17PM -0800, Shradha Gupta wrote:
> Gentle reminder to consume this patchset.

Apologies, I've assumed you have commit rights or know someone, but seems
like no one from Microsoft can push to drm-misc :-/

Applied to drm-misc-next now, thanks for your patches!
-Sima

> 
> On Tue, Feb 06, 2024 at 03:07:47PM +0100, Daniel Vetter wrote:
> > On Thu, Feb 01, 2024 at 10:42:56PM -0800, Shradha Gupta wrote:
> > > This patchset consists of sanity checks before enabling/disabling
> > > output polling to make sure we do not call polling enable and disable
> > > functions when polling for the device is not initialized or is now
> > > uninitialized(by drm_kms_helper_poll_fini() function)
> > > 
> > > The first patch consists of these checks in
> > > drm_kms_helper_poll_disable() and drm_kms_helper_poll_enable() calls.
> > > It further flags a warning if a caller violates this. It also adds
> > > these checks in drm_mode_config_helper_resume() and
> > > drm_mode_config_helper_suspend() calls to avoid this warning.
> > > 
> > > The second patch adds a similar missing check in
> > > drm_helper_probe_single_connector_modes() function that is exposed by
> > > the new warning introduced in the first patch.
> > > 
> > > Shradha Gupta (2):
> > >   drm: Check output polling initialized before disabling
> > >   drm: Check polling initialized before enabling in
> > > drm_helper_probe_single_connector_modes
> > 
> > On the series:
> > 
> > Reviewed-by: Daniel Vetter 
> > 
> > > 
> > >  drivers/gpu/drm/drm_modeset_helper.c | 19 ---
> > >  drivers/gpu/drm/drm_probe_helper.c   | 21 +
> > >  2 files changed, 33 insertions(+), 7 deletions(-)
> > > 
> > > -- 
> > > 2.34.1
> > > 
> > 
> > -- 
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] fbcon: always restore the old font data in fbcon_do_set_font()

2024-02-26 Thread Daniel Vetter
On Thu, 8 Feb 2024 at 12:44, Jiri Slaby (SUSE)  wrote:
>
> Commit a5a923038d70 (fbdev: fbcon: Properly revert changes when
> vc_resize() failed) started restoring old font data upon failure (of
> vc_resize()). But it does so only for user fonts. It means that the
> "system"/internal fonts are not restored at all. So as a result, the very
> first call to fbcon_do_set_font() performs no restore at all upon
> failing vc_resize().
>
> This can be reproduced by Syzkaller to crash the system on the next
> invocation of font_get(). It's rather hard to hit the allocation failure
> in vc_resize() on the first font_set(), but not impossible. Esp. if
> fault injection is used to aid the execution/failure. It was
> demonstrated by Sirius:
>   BUG: unable to handle page fault for address: fff8
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x) - not-present page
>   PGD cb7b067 P4D cb7b067 PUD cb7d067 PMD 0
>   Oops:  [#1] PREEMPT SMP KASAN
>   CPU: 1 PID: 8007 Comm: poc Not tainted 6.7.0-g9d1694dc91ce #20
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 
> 04/01/2014
>   RIP: 0010:fbcon_get_font+0x229/0x800 drivers/video/fbdev/core/fbcon.c:2286
>   Call Trace:
>
>con_font_get drivers/tty/vt/vt.c:4558 [inline]
>con_font_op+0x1fc/0xf20 drivers/tty/vt/vt.c:4673
>vt_k_ioctl drivers/tty/vt/vt_ioctl.c:474 [inline]
>vt_ioctl+0x632/0x2ec0 drivers/tty/vt/vt_ioctl.c:752
>tty_ioctl+0x6f8/0x1570 drivers/tty/tty_io.c:2803
>vfs_ioctl fs/ioctl.c:51 [inline]
>   ...
>
> So restore the font data in any case, not only for user fonts. Note the
> later 'if' is now protected by 'old_userfont' and not 'old_data' as the
> latter is always set now. (And it is supposed to be non-NULL. Otherwise
> we would see the bug above again.)
>
> Signed-off-by: Jiri Slaby (SUSE) 
> Fixes: a5a923038d70 ("fbdev: fbcon: Properly revert changes when vc_resize() 
> failed")
> Cc: Ubisectech Sirius 
> Cc: Daniel Vetter 
> Cc: Helge Deller 
> Cc: linux-fb...@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org

Reviewing patches to code where assignments in if conditions are still
cool is a pain :-/

Merged to drm-misc-fixes with reported/tested-by credit tag for sirius added.

Thanks a lot!
-Sima

> ---
>  drivers/video/fbdev/core/fbcon.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/video/fbdev/core/fbcon.c 
> b/drivers/video/fbdev/core/fbcon.c
> index 17a9fc80b4e4..98d0e2dbcd2f 100644
> --- a/drivers/video/fbdev/core/fbcon.c
> +++ b/drivers/video/fbdev/core/fbcon.c
> @@ -2395,11 +2395,9 @@ static int fbcon_do_set_font(struct vc_data *vc, int 
> w, int h, int charcount,
> struct fbcon_ops *ops = info->fbcon_par;
> struct fbcon_display *p = &fb_display[vc->vc_num];
> int resize, ret, old_userfont, old_width, old_height, old_charcount;
> -   char *old_data = NULL;
> +   u8 *old_data = vc->vc_font.data;
>
> resize = (w != vc->vc_font.width) || (h != vc->vc_font.height);
> -   if (p->userfont)
> -   old_data = vc->vc_font.data;
> vc->vc_font.data = (void *)(p->fontdata = data);
> old_userfont = p->userfont;
> if ((p->userfont = userfont))
> @@ -2433,13 +2431,13 @@ static int fbcon_do_set_font(struct vc_data *vc, int 
> w, int h, int charcount,
> update_screen(vc);
> }
>
> -   if (old_data && (--REFCOUNT(old_data) == 0))
> +   if (old_userfont && (--REFCOUNT(old_data) == 0))
> kfree(old_data - FONT_EXTRA_WORDS * sizeof(int));
> return 0;
>
>  err_out:
> p->fontdata = old_data;
> -   vc->vc_font.data = (void *)old_data;
> +   vc->vc_font.data = old_data;
>
> if (userfont) {
> p->userfont = old_userfont;
> --
> 2.43.0
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [RFC] drm/fourcc: Add RPI modifiers

2024-02-26 Thread Daniel Vetter
rcc_canonicalize_nvidia_format_mod(__u64 
> modifier)
>  #define AMD_FMT_MOD_CLEAR(field) \
> (~((__u64)AMD_FMT_MOD_##field##_MASK << AMD_FMT_MOD_##field##_SHIFT))
>
> +/* RPI (Raspberry Pi) modifiers */
> +#define PISP_FORMAT_MOD_COMPRESS_MODE1 fourcc_mod_code(RPI, 1)
> +#define PISP_FORMAT_MOD_COMPRESS_MODE2 fourcc_mod_code(RPI, 2)
> +
>  #if defined(__cplusplus)
>  }
>  #endif
> --
> 2.43.0
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [rerere PATCH] nightly.conf: Switch drm.git to gitlab

2024-02-26 Thread Daniel Vetter
On Mon, 26 Feb 2024 at 16:16, Maxime Ripard  wrote:
>
> Start the big migration with drm.git.
>
> Existing remotes need to be adjusted with
>
> git remote set-url drm ssh://g...@gitlab.freedesktop.org:drm/kernel.git
>
> or
>
> git remote set-url drm https://gitlab.freedesktop.org/drm/kernel.git
>
> Signed-off-by: Maxime Ripard 

Acked.
-Sima

> ---
>  nightly.conf | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/nightly.conf b/nightly.conf
> index c189f2ccad17..68ac687a5c7f 100644
> --- a/nightly.conf
> +++ b/nightly.conf
> @@ -45,10 +45,8 @@ https://anongit.freedesktop.org/git/drm/drm-misc
>  https://anongit.freedesktop.org/git/drm/drm-misc.git
>  "
>  drm_tip_repos[drm]="
> -ssh://git.freedesktop.org/git/drm/drm
> -git://anongit.freedesktop.org/drm/drm
> -https://anongit.freedesktop.org/git/drm/drm
> -https://anongit.freedesktop.org/git/drm/drm.git
> +https://gitlab.freedesktop.org/drm/kernel.git
> +ssh://g...@gitlab.freedesktop.org:drm/kernel.git
>  "
>  drm_tip_repos[linux-upstream]="
>  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> --
> 2.43.2
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH] MAINTAINERS: Update drm.git URL

2024-02-26 Thread Daniel Vetter
On Mon, 26 Feb 2024 at 16:21, Maxime Ripard  wrote:
>
> Now that the main DRM tree has moved to Gitlab, adjust the MAINTAINERS
> git trees to reflect the location change.
>
> Signed-off-by: Maxime Ripard 

Acked-by: Daniel Vetter 

> ---
>  MAINTAINERS | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7e7e7c378913..00e8a8ff627e 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -614,7 +614,7 @@ AGPGART DRIVER
>  M: David Airlie 
>  L: dri-devel@lists.freedesktop.org
>  S: Maintained
> -T: git git://anongit.freedesktop.org/drm/drm
> +T: git https://gitlab.freedesktop.org/drm/kernel.git
>  F: drivers/char/agp/
>  F: include/linux/agp*
>  F: include/uapi/linux/agp*
> @@ -6996,7 +6996,7 @@ L:dri-devel@lists.freedesktop.org
>  S: Maintained
>  B: https://gitlab.freedesktop.org/drm
>  C: irc://irc.oftc.net/dri-devel
> -T: git git://anongit.freedesktop.org/drm/drm
> +T: git https://gitlab.freedesktop.org/drm/kernel.git
>  F: Documentation/devicetree/bindings/display/
>  F: Documentation/devicetree/bindings/gpu/
>  F: Documentation/gpu/
> --
> 2.43.2
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: drm-misc migration to Gitlab server

2024-02-26 Thread Daniel Vetter
On Wed, Feb 21, 2024 at 09:46:43AM +1100, Stephen Rothwell wrote:
> Hi Daniel,
> 
> On Tue, 20 Feb 2024 11:25:05 + Daniel Stone  wrote:
> >
> > On Tue, 20 Feb 2024 at 09:05, Maxime Ripard  wrote:
> > > On Tue, Feb 20, 2024 at 09:49:25AM +0100, Maxime Ripard wrote:  
> > > > This will be mostly transparent to current committers and users: we'll
> > > > still use dim, in the exact same way, the only change will be the URL of
> > > > the repo. This will also be transparent to linux-next, since the
> > > > linux-next branch lives in its own repo and is pushed by dim when
> > > > pushing a branch.  
> > >
> > > Actually, I double-checked and linux-next pulls our branches directly,
> > > so once the transition is over we'll have to notify them too.  
> > 
> > cc sfr - once we move the DRM repos to a different location, what's
> > the best way to update linux-next?
> > 
> > That being said, we could set up read-only pull mirrors in the old
> > location ... something I want to do in March (because what else are
> > you going to do on holiday?) is to kill the write repos on kemper
> > (git.fd.o), move them to being on molly (cgit/anongit.fd.o) only, and
> > just have a cronjob that regularly pulls from all the gl.fd.o repos,
> > rather than pushing from GitLab.
> 
> These are (I think) all the drm trees/branches that I fetch every day:
> 
> git://anongit.freedesktop.org/drm-intel#for-linux-next
> git://anongit.freedesktop.org/drm-intel#for-linux-next-fixes
> git://anongit.freedesktop.org/drm/drm-misc#for-linux-next
> git://anongit.freedesktop.org/drm/drm-misc#for-linux-next-fixes
> git://git.freedesktop.org/git/drm/drm.git#drm-fixes
> git://git.freedesktop.org/git/drm/drm.git#drm-next
> git://git.freedesktop.org/git/drm/drm.git#topic/drm-ci

This one you can drop right away, it's all merged, apologies for not
telling you earlier.
-Sima

> git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos.git#for-linux-next
> https://gitlab.freedesktop.org/agd5f/linux#drm-next
> https://gitlab.freedesktop.org/drm/msm.git#msm-next
> https://gitlab.freedesktop.org/drm/tegra.git#for-next
> https://gitlab.freedesktop.org/lumag/msm.git#msm-next-lumag
> 
> If someone could just send me all the new equivalent URLs when the
> change happens, I will fix them up in my config.
> 
> -- 
> Cheers,
> Stephen Rothwell



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [git pull] habanalabs for drm-next-6.9

2024-02-26 Thread Daniel Vetter
++-
>  drivers/accel/habanalabs/common/mmu/mmu_v1.c   | 354 
> +++--
>  drivers/accel/habanalabs/common/mmu/mmu_v2.c   | 338 
>  drivers/accel/habanalabs/common/mmu/mmu_v2_hr.c|  24 +-
>  drivers/accel/habanalabs/common/security.c |  33 +-
>  drivers/accel/habanalabs/common/security.h |   3 +-
>  drivers/accel/habanalabs/gaudi/gaudi.c |   9 +-
>  drivers/accel/habanalabs/gaudi2/gaudi2.c   | 308 --
>  drivers/accel/habanalabs/gaudi2/gaudi2P.h  |  15 +-
>  drivers/accel/habanalabs/goya/goya.c   |  12 +-
>  drivers/accel/habanalabs/goya/goya_coresight.c |   3 +-
>  .../habanalabs/include/hw_ip/mmu/mmu_general.h |   2 +
>  21 files changed, 1008 insertions(+), 510 deletions(-)
>  create mode 100644 drivers/accel/habanalabs/common/mmu/mmu_v2.c

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PULL] drm-xe-next

2024-02-26 Thread Daniel Vetter
>  drivers/gpu/drm/xe/xe_uc_fw.c  |  60 +-
>  drivers/gpu/drm/xe/xe_uc_fw_types.h|   9 +-
>  drivers/gpu/drm/xe/xe_vm.c | 287 ++-
>  drivers/gpu/drm/xe/xe_vm.h |   7 +-
>  drivers/gpu/drm/xe/xe_vm_types.h   |  18 +-
>  drivers/gpu/drm/xe/xe_vram_freq.c  | 128 +++
>  drivers/gpu/drm/xe/xe_vram_freq.h  |  13 +
>  drivers/gpu/drm/xe/xe_wa.c | 191 +
>  drivers/gpu/drm/xe/xe_wa_oob.rules |  12 +-
>  drivers/gpu/drm/xe/xe_wait_user_fence.c|   2 +-
>  drivers/gpu/drm/xe/xe_wopcm_types.h|   4 +-
>  include/uapi/drm/xe_drm.h  |  34 +-
>  141 files changed, 6518 insertions(+), 1187 deletions(-)
>  create mode 100644 drivers/gpu/drm/xe/abi/gsc_proxy_commands_abi.h
>  create mode 100644 drivers/gpu/drm/xe/abi/guc_actions_sriov_abi.h
>  create mode 100644 drivers/gpu/drm/xe/abi/guc_relay_actions_abi.h
>  create mode 100644 drivers/gpu/drm/xe/abi/guc_relay_communication_abi.h
>  rename drivers/gpu/drm/xe/{ => display}/xe_display.c (99%)
>  rename drivers/gpu/drm/xe/{ => display}/xe_display.h (100%)
>  create mode 100644 drivers/gpu/drm/xe/regs/xe_pcode_regs.h
>  create mode 100644 drivers/gpu/drm/xe/tests/xe_guc_db_mgr_test.c
>  create mode 100644 drivers/gpu/drm/xe/tests/xe_guc_relay_test.c
>  create mode 100644 drivers/gpu/drm/xe/tests/xe_kunit_helpers.c
>  create mode 100644 drivers/gpu/drm/xe/tests/xe_kunit_helpers.h
>  create mode 100644 drivers/gpu/drm/xe/tests/xe_test_mod.c
>  create mode 100644 drivers/gpu/drm/xe/xe_gsc_proxy.c
>  create mode 100644 drivers/gpu/drm/xe/xe_gsc_proxy.h
>  create mode 100644 drivers/gpu/drm/xe/xe_gt_sriov_printk.h
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_db_mgr.c
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_db_mgr.h
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_hxg_helpers.h
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_relay.c
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_relay.h
>  create mode 100644 drivers/gpu/drm/xe/xe_guc_relay_types.h
>  create mode 100644 drivers/gpu/drm/xe/xe_memirq.c
>  create mode 100644 drivers/gpu/drm/xe/xe_memirq.h
>  create mode 100644 drivers/gpu/drm/xe/xe_memirq_types.h
>  create mode 100644 drivers/gpu/drm/xe/xe_vram_freq.c
>  create mode 100644 drivers/gpu/drm/xe/xe_vram_freq.h

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PULL] drm-misc-next

2024-02-26 Thread Daniel Vetter
pu/drm/panel/Makefile |1 +
>  drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c |2 +
>  drivers/gpu/drm/panel/panel-edp.c  |   19 +-
>  drivers/gpu/drm/panel/panel-himax-hx83112a.c   |  372 
>  drivers/gpu/drm/panel/panel-leadtek-ltk500hd1829.c |  265 ++-
>  drivers/gpu/drm/panel/panel-simple.c   |   20 +
>  drivers/gpu/drm/renesas/Kconfig|1 +
>  drivers/gpu/drm/renesas/Makefile   |1 +
>  drivers/gpu/drm/renesas/rz-du/Kconfig  |   12 +
>  drivers/gpu/drm/renesas/rz-du/Makefile |8 +
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_crtc.c  |  422 +
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_crtc.h  |   89 +
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_drv.c   |  175 ++
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_drv.h   |   78 +
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_encoder.c   |   72 +
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_encoder.h   |   32 +
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.c   |  371 
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.h   |   43 +
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_vsp.c   |  349 
>  drivers/gpu/drm/renesas/rz-du/rzg2l_du_vsp.h   |   82 +
>  drivers/gpu/drm/xe/xe_drm_client.c |2 +-
>  drivers/gpu/host1x/cdma.c  |3 +-
>  include/drm/drm_bridge.h   |2 +-
>  include/drm/drm_gem.h  |   13 +
>  70 files changed, 3748 insertions(+), 1279 deletions(-)
>  create mode 100644 
> Documentation/devicetree/bindings/display/panel/himax,hx83112a.yaml
>  create mode 100644 
> Documentation/devicetree/bindings/display/renesas,rzg2l-du.yaml
>  create mode 100644 
> drivers/gpu/drm/ci/xfails/msm-sc7180-trogdor-kingoftown-skips.txt
>  create mode 100644 
> drivers/gpu/drm/ci/xfails/msm-sc7180-trogdor-lazor-limozeen-skips.txt
>  create mode 100644 drivers/gpu/drm/panel/panel-himax-hx83112a.c
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/Kconfig
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/Makefile
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_crtc.c
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_crtc.h
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_drv.c
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_drv.h
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_encoder.c
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_encoder.h
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.c
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_kms.h
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_vsp.c
>  create mode 100644 drivers/gpu/drm/renesas/rz-du/rzg2l_du_vsp.h
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Frankenstrasse 146, 90461 Nuernberg, Germany
> GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
> HRB 36809 (AG Nuernberg)

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

