Re: [PATCH v3 6/9] dma-buf/fence-array: Add fence deadline support
Am 08.09.21 um 20:00 schrieb Daniel Vetter: On Fri, Sep 03, 2021 at 11:47:57AM -0700, Rob Clark wrote: From: Rob Clark Signed-off-by: Rob Clark --- drivers/dma-buf/dma-fence-array.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c index d3fbd950be94..8d194b09ee3d 100644 --- a/drivers/dma-buf/dma-fence-array.c +++ b/drivers/dma-buf/dma-fence-array.c @@ -119,12 +119,23 @@ static void dma_fence_array_release(struct dma_fence *fence) dma_fence_free(fence); } +static void dma_fence_array_set_deadline(struct dma_fence *fence, +ktime_t deadline) +{ + struct dma_fence_array *array = to_dma_fence_array(fence); + unsigned i; + + for (i = 0; i < array->num_fences; ++i) + dma_fence_set_deadline(array->fences[i], deadline); Hm I wonder whether this can go wrong, and whether we need Christian's massive fence iterator that I've seen flying around. If you nest these things too much it could all go wrong I think. I looked at other users which inspect dma_fence_array and none of them have a risk for unbounded recursion. That should work fine or at least doesn't add anything new which could go boom. The dma_fence_array() can't contain other dma_fence_array or dma_fence_chain objects or it could end up in a recursion and corrupt the kernel stack. That's a well known limitation for other code paths as well. So Reviewed-by: Christian König for this one. Regards, Christian. Maybe check with Christian. -Daniel +} + const struct dma_fence_ops dma_fence_array_ops = { .get_driver_name = dma_fence_array_get_driver_name, .get_timeline_name = dma_fence_array_get_timeline_name, .enable_signaling = dma_fence_array_enable_signaling, .signaled = dma_fence_array_signaled, .release = dma_fence_array_release, + .set_deadline = dma_fence_array_set_deadline, }; EXPORT_SYMBOL(dma_fence_array_ops); -- 2.31.1
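For readers skimming the series, a minimal caller-side sketch of how the deadline hint is meant to be consumed. This is illustrative only: example_hint_deadline() and the vblank-derived deadline are assumptions, while dma_fence_set_deadline() is the helper added earlier in this series, which dispatches to ops->set_deadline so a dma_fence_array fans the hint out to every contained fence as implemented above.

#include <linux/dma-fence.h>

/* Sketch only, not part of the patch. */
static void example_hint_deadline(struct dma_fence *in_fence, ktime_t next_vblank)
{
	/* Tell the fence producer(s) when the result is needed, e.g. the next flip. */
	dma_fence_set_deadline(in_fence, next_vblank);

	/* Waiting (or adding a callback) is unchanged. */
	dma_fence_wait(in_fence, false);
}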
Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support
Am 08.09.21 um 20:45 schrieb Daniel Vetter: On Wed, Sep 08, 2021 at 11:19:15AM -0700, Rob Clark wrote: On Wed, Sep 8, 2021 at 10:54 AM Daniel Vetter wrote: On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote: From: Rob Clark Signed-off-by: Rob Clark --- drivers/dma-buf/dma-fence-chain.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c index 1b4cb3e5cec9..736a9ad3ea6d 100644 --- a/drivers/dma-buf/dma-fence-chain.c +++ b/drivers/dma-buf/dma-fence-chain.c @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence *fence) dma_fence_free(fence); } + +static void dma_fence_chain_set_deadline(struct dma_fence *fence, + ktime_t deadline) +{ + dma_fence_chain_for_each(fence, fence) { + struct dma_fence_chain *chain = to_dma_fence_chain(fence); + struct dma_fence *f = chain ? chain->fence : fence; Doesn't this just end up calling set_deadline on a chain, potentially resulting in recursion? Also I don't think this should ever happen, why did you add that? Tbh the fence-chain was the part I was a bit fuzzy about, and the main reason I added igt tests. The iteration is similar to how, for ex, dma_fence_chain_signaled() works, and according to the igt test it does what was intended. Huh indeed. Maybe something we should fix, like why does the dma_fence_chain_for_each not give you the upcast chain pointer ... I guess this also needs more Christian and less me. Yeah I was also already thinking about having a dma_fence_chain_for_each_contained() macro which directly returns the containing fence, just didn't have time to implement/clean that up. And yes the patch is correct as it is and avoids the recursion, so Reviewed-by: Christian König for this one. Regards, Christian. -Daniel BR, -R -Daniel + + dma_fence_set_deadline(f, deadline); + } +} + const struct dma_fence_ops dma_fence_chain_ops = { .use_64bit_seqno = true, .get_driver_name = dma_fence_chain_get_driver_name, @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = { .enable_signaling = dma_fence_chain_enable_signaling, .signaled = dma_fence_chain_signaled, .release = dma_fence_chain_release, + .set_deadline = dma_fence_chain_set_deadline, }; EXPORT_SYMBOL(dma_fence_chain_ops); -- 2.31.1 -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
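To make Christian's suggestion concrete, here is a rough sketch of the kind of helper he describes. The names dma_fence_chain_contained() and example_chain_set_deadline() are assumptions, not an existing API; only dma_fence_chain_for_each() and to_dma_fence_chain() are in the tree today.

#include <linux/dma-fence-chain.h>

/*
 * Hypothetical helper in the spirit of the dma_fence_chain_for_each_contained()
 * idea above: return the fence contained in a chain node, or the fence itself
 * once the iterator has reached a non-chain fence.
 */
static inline struct dma_fence *
dma_fence_chain_contained(struct dma_fence *fence)
{
	struct dma_fence_chain *chain = to_dma_fence_chain(fence);

	return chain ? chain->fence : fence;
}

/* The callback from the patch could then shrink to roughly this: */
static void example_chain_set_deadline(struct dma_fence *fence, ktime_t deadline)
{
	dma_fence_chain_for_each(fence, fence)
		dma_fence_set_deadline(dma_fence_chain_contained(fence), deadline);
}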
Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support
Am 08.09.21 um 19:45 schrieb Daniel Vetter: On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote: From: Rob Clark As the finished fence is the one that is exposed to userspace, and therefore the one that other operations, like atomic update, would block on, we need to propagate the deadline from the finished fence to the actual hw fence. v2: Split into drm_sched_fence_set_parent() (ckoenig) Signed-off-by: Rob Clark --- drivers/gpu/drm/scheduler/sched_fence.c | 34 + drivers/gpu/drm/scheduler/sched_main.c | 2 +- include/drm/gpu_scheduler.h | 8 ++ 3 files changed, 43 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c index bcea035cf4c6..4fc41a71d1c7 100644 --- a/drivers/gpu/drm/scheduler/sched_fence.c +++ b/drivers/gpu/drm/scheduler/sched_fence.c @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f) dma_fence_put(&fence->scheduled); } +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f, + ktime_t deadline) +{ + struct drm_sched_fence *fence = to_drm_sched_fence(f); + unsigned long flags; + + spin_lock_irqsave(&fence->lock, flags); + + /* If we already have an earlier deadline, keep it: */ + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) && + ktime_before(fence->deadline, deadline)) { + spin_unlock_irqrestore(&fence->lock, flags); + return; + } + + fence->deadline = deadline; + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags); + + spin_unlock_irqrestore(&fence->lock, flags); + + if (fence->parent) + dma_fence_set_deadline(fence->parent, deadline); +} + static const struct dma_fence_ops drm_sched_fence_ops_scheduled = { .get_driver_name = drm_sched_fence_get_driver_name, .get_timeline_name = drm_sched_fence_get_timeline_name, @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = { .get_driver_name = drm_sched_fence_get_driver_name, .get_timeline_name = drm_sched_fence_get_timeline_name, .release = drm_sched_fence_release_finished, + .set_deadline = drm_sched_fence_set_deadline_finished, }; struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f) @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f) } EXPORT_SYMBOL(to_drm_sched_fence); +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence, + struct dma_fence *fence) +{ + s_fence->parent = dma_fence_get(fence); + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, +&s_fence->finished.flags)) Don't you need the spinlock here too to avoid races? test_bit is very unordered, so guarantees nothing. Spinlock would need to be both around ->parent = and the test_bit. Entirely aside, but there's discussions going on to preallocate the hw fence somehow. If we do that we could make the deadline forwarding lockless here. Having a spinlock just to set the parent is a bit annoying. Well, previously we didn't need the spinlock to set the parent because the parent wasn't used outside of the main thread. This becomes an issue now because we can race with setting the deadline. So yeah we probably do need the spinlock here now. Christian. ... Alternative is that you do this locklessly with barriers and a _lot_ of comments. Would be good to benchmark whether the overhead matters though first.
-Daniel + dma_fence_set_deadline(fence, s_fence->deadline); +} + struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity, void *owner) { diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 595e47ff7d06..27bf0ac0625f 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -978,7 +978,7 @@ static int drm_sched_main(void *param) drm_sched_fence_scheduled(s_fence); if (!IS_ERR_OR_NULL(fence)) { - s_fence->parent = dma_fence_get(fence); + drm_sched_fence_set_parent(s_fence, fence); r = dma_fence_add_callback(fence, &sched_job->cb, drm_sched_job_done_cb); if (r == -ENOENT) diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 7f77a455722c..158ddd662469 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -238,6 +238,12 @@ struct drm_sched_fence { */ struct dma_fencefinished; + /** +* @deadline: deadline set on &drm_sched_fence.finished which +* potentially needs to be pro
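A sketch of the locked variant the review converges on, assuming one possible fix rather than Rob's actual follow-up: take the drm_sched_fence's own lock around both the parent assignment and the deadline test so it cannot race with drm_sched_fence_set_deadline_finished(). The example_ prefix marks it as illustrative; &s_fence->lock, s_fence->deadline and DMA_FENCE_FLAG_HAS_DEADLINE_BIT are the fields/flag used by the patch above.

#include <drm/gpu_scheduler.h>

/* Sketch only, not the upstream function. */
static void example_sched_fence_set_parent(struct drm_sched_fence *s_fence,
					   struct dma_fence *fence)
{
	unsigned long flags;
	ktime_t deadline;
	bool has_deadline;

	spin_lock_irqsave(&s_fence->lock, flags);
	s_fence->parent = dma_fence_get(fence);
	has_deadline = test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
				&s_fence->finished.flags);
	deadline = s_fence->deadline;
	spin_unlock_irqrestore(&s_fence->lock, flags);

	/* Forward outside the lock; set_deadline may take its own locks. */
	if (has_deadline)
		dma_fence_set_deadline(fence, deadline);
}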
Re: [PATCH] kernel/locking: Add context to ww_mutex_trylock.
Op 08-09-2021 om 12:14 schreef Peter Zijlstra: > On Tue, Sep 07, 2021 at 03:20:44PM +0200, Maarten Lankhorst wrote: >> i915 will soon gain an eviction path that trylock a whole lot of locks >> for eviction, getting dmesg failures like below: >> >> BUG: MAX_LOCK_DEPTH too low! >> turning off the locking correctness validator. >> depth: 48 max: 48! >> 48 locks held by i915_selftest/5776: >> #0: 888101a79240 (&dev->mutex){}-{3:3}, at: >> __driver_attach+0x88/0x160 >> #1: c99778c0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: >> i915_vma_pin.constprop.63+0x39/0x1b0 [i915] >> #2: 88800cf74de8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_vma_pin.constprop.63+0x5f/0x1b0 [i915] >> #3: 88810c7f9e38 (&vm->mutex/1){+.+.}-{3:3}, at: >> i915_vma_pin_ww+0x1c4/0x9d0 [i915] >> #4: 88810bad5768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_gem_evict_something+0x110/0x860 [i915] >> #5: 88810bad60e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_gem_evict_something+0x110/0x860 [i915] >> ... >> #46: 88811964d768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_gem_evict_something+0x110/0x860 [i915] >> #47: 88811964e0e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_gem_evict_something+0x110/0x860 [i915] >> INFO: lockdep is turned off. >> As an intermediate solution, add an acquire context to ww_mutex_trylock, >> which allows us to do proper nesting annotations on the trylocks, making >> the above lockdep splat disappear. > Fair enough I suppose. > >> +/** >> + * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire >> context >> + * @lock: mutex to lock >> + * @ctx: optional w/w acquire context >> + * >> + * Trylocks a mutex with the optional acquire context; no deadlock >> detection is >> + * possible. Returns 1 if the mutex has been acquired successfully, 0 >> otherwise. >> + * >> + * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a >> @ctx is >> + * specified, -EALREADY and -EDEADLK handling may happen in calls to >> ww_mutex_lock. >> + * >> + * A mutex acquired with this function must be released with >> ww_mutex_unlock. >> + */ >> +int __sched >> +ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ctx) >> +{ >> +bool locked; >> + >> +if (!ctx) >> +return mutex_trylock(&ww->base); >> + >> +#ifdef CONFIG_DEBUG_MUTEXES >> +DEBUG_LOCKS_WARN_ON(ww->base.magic != &ww->base); >> +#endif >> + >> +preempt_disable(); >> +locked = __mutex_trylock(&ww->base); >> + >> +if (locked) { >> +ww_mutex_set_context_fastpath(ww, ctx); >> +mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ctx->dep_map, >> _RET_IP_); >> +} >> +preempt_enable(); >> + >> +return locked; >> +} >> +EXPORT_SYMBOL(ww_mutex_trylock); > You'll need a similar hunk in ww_rt_mutex.c What tree has that file?
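For context, a minimal sketch of the call pattern the new @ctx argument is meant to annotate. The example_ struct and helpers are made up for illustration; only ww_mutex_trylock()/ww_mutex_unlock() (with the semantics from the patch above) and the list helpers are real API.

#include <linux/list.h>
#include <linux/ww_mutex.h>

/* Hypothetical object type used only for this illustration. */
struct example_obj {
	struct ww_mutex resv_lock;
	struct list_head lru_link;
};

static void example_try_evict(struct example_obj *obj) { /* placeholder */ }

static void example_evict_scan(struct list_head *lru, struct ww_acquire_ctx *ctx)
{
	struct example_obj *obj;

	list_for_each_entry(obj, lru, lru_link) {
		/* With @ctx, lockdep nests this under one acquire context. */
		if (!ww_mutex_trylock(&obj->resv_lock, ctx))
			continue;	/* contended, skip this candidate */

		example_try_evict(obj);
		ww_mutex_unlock(&obj->resv_lock);
	}
}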
Re: [PATCH] drm/amd/amdkfd: fix possible memory leak in svm_range_restore_pages
Hi Xiyu Yang, This bug was already fixed by this commit: https://gitlab.freedesktop.org/agd5f/linux/-/commit/598a118db0d85a432f8cd541a6a5d31e31c56b6b Regards, Felix Am 2021-09-09 um 12:27 a.m. schrieb Xiyu Yang: > The memory leak issue may take place in an error handling path. When > p->xnack_enabled is NULL, the function simply returns with -EFAULT and > forgets to decrement the reference count of a kfd_process object bumped > by kfd_lookup_process_by_pasid, which may incur memory leaks. > > Fix it by jumping to label "out", in which kfd_unref_process() decreases > the refcount. > > Signed-off-by: Xiyu Yang > Signed-off-by: Xin Xiong > Signed-off-by: Xin Tan > --- > drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > index e883731c3f8f..0f7f1e5621ea 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > @@ -2426,7 +2426,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, > unsigned int pasid, > } > if (!p->xnack_enabled) { > pr_debug("XNACK not enabled for pasid 0x%x\n", pasid); > - return -EFAULT; > + r = -EFAULT; > + goto out; > } > svms = &p->svms; >
Re: [PATCH v1 03/14] mm: add iomem vma selection for memory migration
Am 2021-09-01 um 9:14 p.m. schrieb Dave Chinner: > On Wed, Sep 01, 2021 at 07:07:34PM -0400, Felix Kuehling wrote: >> On 2021-09-01 6:03 p.m., Dave Chinner wrote: >>> On Wed, Sep 01, 2021 at 11:40:43AM -0400, Felix Kuehling wrote: Am 2021-09-01 um 4:29 a.m. schrieb Christoph Hellwig: > On Mon, Aug 30, 2021 at 01:04:43PM -0400, Felix Kuehling wrote: driver code is not really involved in updating the CPU mappings. Maybe it's something we need to do in the migration helpers. >>> It looks like I'm totally misunderstanding what you are adding here >>> then. Why do we need any special treatment at all for memory that >>> has normal struct pages and is part of the direct kernel map? >> The pages are like normal memory for purposes of mapping them in CPU >> page tables and for coherent access from the CPU. > That's the user page tables. What about the kernel direct map? > If there is a normal kernel struct page backing there really should > be no need for the pgmap. I'm not sure. The physical address ranges are in the UEFI system address map as special-purpose memory. Does Linux create the struct pages and kernel direct map for that without a pgmap call? I didn't see that last time I went digging through that code. >> From an application >> perspective, we want file-backed and anonymous mappings to be able to >> use DEVICE_PUBLIC pages with coherent CPU access. The goal is to >> optimize performance for GPU heavy workloads while minimizing the need >> to migrate data back-and-forth between system memory and device memory. > I don't really understand that part. file backed pages are always > allocated by the file system using the pagecache helpers, that is > using the page allocator. Anonymouns memory also always comes from > the page allocator. I'm coming at this from my experience with DEVICE_PRIVATE. Both anonymous and file-backed pages should be migrateable to DEVICE_PRIVATE memory by the migrate_vma_* helpers for more efficient access by our GPU. (*) It's part of the basic premise of HMM as I understand it. I would expect the same thing to work for DEVICE_PUBLIC memory. (*) I believe migrating file-backed pages to DEVICE_PRIVATE doesn't currently work, but that's something I'm hoping to fix at some point. >>> FWIW, I'd love to see the architecture documents that define how >>> filesystems are supposed to interact with this device private >>> memory. This whole "hand filesystem controlled memory to other >>> devices" is a minefield that is trivial to get wrong iand very >>> difficult to fix - just look at the historical mess that RDMA >>> to/from file backed and/or DAX pages has been. >>> >>> So, really, from my perspective as a filesystem engineer, I want to >>> see an actual specification for how this new memory type is going to >>> interact with filesystem and the page cache so everyone has some >>> idea of how this is going to work and can point out how it doesn't >>> work before code that simply doesn't work is pushed out into >>> production systems and then merged >> OK. To be clear, that's not part of this patch series. And I have no >> authority to push anything in this part of the kernel, so you have nothing >> to fear. ;) > I know this isn't part of the series. but this patchset is laying > the foundation for future work that will impact subsystems that > currently have zero visibility and/or knowledge of these changes. I don't think this patchset is the foundation. 
Jerome Glisse's work on HMM is, which was merged 4 years ago and is being used by multiple drivers now, with the AMD GPU driver being a fairly recent addition. > There must be an overall architectural plan for this functionality, > regardless of the current state of implementation. It's that overall > architectural plan I'm asking about here, because I need to > understand that before I can sanely comment on the page > cache/filesystem aspect of the proposed functionality... The overall HMM and ZONE_DEVICE architecture is documented to some extent in Documentation/vm/hmm.rst, though it may not go into the level of detail you are looking for. > >> FWIW, we already have the ability to map file-backed system memory pages >> into device page tables with HMM and interval notifiers. But we cannot >> currently migrate them to ZONE_DEVICE pages. > Sure, but sharing page cache pages allocated and managed by the > filesystem is not what you are talking about. You're talking about > migrating page cache data to completely different memory allocated > by a different memory manager that the filesystems currently have no > knowledge of or have any way of interfacing with. This is not part of the current patch series. It is only my intention to look into ways to migrate file-backed pages to ZONE_DEVICE memory in the future. > > So I'm asking basic, fundamental questions abou
[PATCH v2 2/2] dt-bindings: display: Add binding for LG.Philips SW43101
Add a device tree binding for LG.Philips SW43101. Signed-off-by: Yassine Oudjana --- Changes since v1: - Add regulator support. - Add MAINTAINERS entry. - Dual-license DT binding. .../display/panel/lgphilips,sw43101.yaml | 75 +++ MAINTAINERS | 1 + 2 files changed, 76 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml diff --git a/Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml b/Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml new file mode 100644 index ..6f67130bab8b --- /dev/null +++ b/Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml @@ -0,0 +1,75 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/panel/lgphilips,sw43101.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: LG.Philips SW43101 1080x1920 OLED panel + +maintainers: + - Yassine Oudjana + +allOf: + - $ref: panel-common.yaml# + +properties: + compatible: +const: lgphilips,sw43101 + + reg: true + reset-gpios: true + + vdd-supply: +description: Digital power supply + + avdd-supply: +description: Analog power supply + + elvdd-supply: +description: Positive electroluminescence power supply + + elvss-supply: +description: Negative electroluminescence power supply + + port: true + +required: + - compatible + - reg + - reset-gpios + - vdd-supply + - avdd-supply + - elvdd-supply + - elvss-supply + - port + +additionalProperties: false + +examples: + - | +#include + +dsi { +#address-cells = <1>; +#size-cells = <0>; + +panel@0 { +compatible = "lgphilips,sw43101"; +reg = <0>; + +reset-gpios = <&msmgpio 8 GPIO_ACTIVE_LOW>; + +vdd-supply = <&vreg_l14a_1p8>; +avdd-supply = <&vlin1_7v3>; +elvdd-supply = <&elvdd>; +elvss-supply = <&elvss>; + +port { +panel_in: endpoint { +remote-endpoint = <&dsi_out>; +}; +}; +}; +}; + +... diff --git a/MAINTAINERS b/MAINTAINERS index 46431e8ad373..aab9f057e8d7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5902,6 +5902,7 @@ F:include/uapi/drm/i810_drm.h DRM DRIVER FOR LG.PHILIPS SW43101 PANEL M: Yassine Oudjana S: Maintained +F: Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml F: drivers/gpu/drm/panel/panel-lgphilips-sw43101.c DRM DRIVER FOR LVDS PANELS -- 2.33.0
[PATCH v2 1/2] drm/panel: Add driver for LG.Philips SW43101 DSI video mode panel
Add a driver for the LG.Philips SW43101 FHD (1080x1920) OLED DSI video mode panel. This driver has been generated using linux-mdss-dsi-panel-driver-generator. Signed-off-by: Yassine Oudjana --- Changes since v1: - Add regulator support. - Add MAINTAINERS entry. MAINTAINERS | 5 + drivers/gpu/drm/panel/Kconfig | 10 + drivers/gpu/drm/panel/Makefile| 1 + .../gpu/drm/panel/panel-lgphilips-sw43101.c | 363 ++ 4 files changed, 379 insertions(+) create mode 100644 drivers/gpu/drm/panel/panel-lgphilips-sw43101.c diff --git a/MAINTAINERS b/MAINTAINERS index f58dad1a1922..46431e8ad373 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5899,6 +5899,11 @@ S: Orphan / Obsolete F: drivers/gpu/drm/i810/ F: include/uapi/drm/i810_drm.h +DRM DRIVER FOR LG.PHILIPS SW43101 PANEL +M: Yassine Oudjana +S: Maintained +F: drivers/gpu/drm/panel/panel-lgphilips-sw43101.c + DRM DRIVER FOR LVDS PANELS M: Laurent Pinchart L: dri-devel@lists.freedesktop.org diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig index beb581b96ecd..d8741c35bbfc 100644 --- a/drivers/gpu/drm/panel/Kconfig +++ b/drivers/gpu/drm/panel/Kconfig @@ -226,6 +226,16 @@ config DRM_PANEL_SAMSUNG_LD9040 depends on OF && SPI select VIDEOMODE_HELPERS +config DRM_PANEL_LGPHILIPS_SW43101 + tristate "LG.Philips SW43101 DSI video mode panel" + depends on OF + depends on DRM_MIPI_DSI + depends on BACKLIGHT_CLASS_DEVICE + help + Say Y here if you want to enable support for the LG.Philips SW43101 FHD + (1080x1920) OLED DSI video mode panel found on the Xiaomi Mi Note 2. + To compile this driver as a module, choose M here. + config DRM_PANEL_LG_LB035Q02 tristate "LG LB035Q024573 RGB panel" depends on GPIOLIB && OF && SPI diff --git a/drivers/gpu/drm/panel/Makefile b/drivers/gpu/drm/panel/Makefile index c8132050bcec..e79143ad14dd 100644 --- a/drivers/gpu/drm/panel/Makefile +++ b/drivers/gpu/drm/panel/Makefile @@ -20,6 +20,7 @@ obj-$(CONFIG_DRM_PANEL_KHADAS_TS050) += panel-khadas-ts050.o obj-$(CONFIG_DRM_PANEL_KINGDISPLAY_KD097D04) += panel-kingdisplay-kd097d04.o obj-$(CONFIG_DRM_PANEL_LEADTEK_LTK050H3146W) += panel-leadtek-ltk050h3146w.o obj-$(CONFIG_DRM_PANEL_LEADTEK_LTK500HD1829) += panel-leadtek-ltk500hd1829.o +obj-$(CONFIG_DRM_PANEL_LGPHILIPS_SW43101) += panel-lgphilips-sw43101.o obj-$(CONFIG_DRM_PANEL_LG_LB035Q02) += panel-lg-lb035q02.o obj-$(CONFIG_DRM_PANEL_LG_LG4573) += panel-lg-lg4573.o obj-$(CONFIG_DRM_PANEL_NEC_NL8048HL11) += panel-nec-nl8048hl11.o diff --git a/drivers/gpu/drm/panel/panel-lgphilips-sw43101.c b/drivers/gpu/drm/panel/panel-lgphilips-sw43101.c new file mode 100644 index ..b7adf1d4a8ab --- /dev/null +++ b/drivers/gpu/drm/panel/panel-lgphilips-sw43101.c @@ -0,0 +1,363 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * LG.Philips SW43101 OLED Panel driver + * Generated with linux-mdss-dsi-panel-driver-generator + * + * Copyright (c) 2020 Yassine Oudjana + */ + +#include +#include +#include +#include +#include +#include + +#include + +#include +#include +#include + +static const char * const regulator_names[] = { + "vdd", + "avdd", + "elvdd", + "elvss", +}; + +struct sw43101_device { + struct drm_panel panel; + struct mipi_dsi_device *dsi; + struct gpio_desc *reset_gpio; + struct regulator_bulk_data supplies[ARRAY_SIZE(regulator_names)]; + bool prepared; +}; + +static inline +struct sw43101_device *to_sw43101_device(struct drm_panel *panel) +{ + return container_of(panel, struct sw43101_device, panel); +} + +#define dsi_dcs_write_seq(dsi, seq...) 
do {\ + static const u8 d[] = { seq }; \ + int ret;\ + ret = mipi_dsi_dcs_write_buffer(dsi, d, ARRAY_SIZE(d)); \ + if (ret < 0)\ + return ret; \ + } while (0) + +static void sw43101_reset(struct sw43101_device *ctx) +{ + gpiod_set_value_cansleep(ctx->reset_gpio, 0); + usleep_range(1, 11000); + gpiod_set_value_cansleep(ctx->reset_gpio, 1); + usleep_range(1000, 2000); + gpiod_set_value_cansleep(ctx->reset_gpio, 0); + msleep(20); +} + +static int sw43101_on(struct sw43101_device *ctx) +{ + struct mipi_dsi_device *dsi = ctx->dsi; + struct device *dev = &dsi->dev; + int ret; + + dsi->mode_flags |= MIPI_DSI_MODE_LPM; + + dsi_dcs_write_seq(dsi, 0xb0, 0x5a); + usleep_range(1000, 2000); + dsi_dcs_write_seq(dsi, 0xb2, 0x13, 0x12, 0x40, 0xd0, 0xff, 0xff, 0x15); + dsi_dcs_write_seq(dsi, 0xe3, 0x01); + usleep_range(1000, 2000); + dsi_d
[PATCH v2 0/2] drm/panel: Add support for LG.Philips SW43101 DSI video mode panel
This adds a driver for the LG.Philips SW43101 FHD (1080x1920) 58Hz OLED DSI video mode panel, found on the Xiaomi Mi Note 2. Changes since v1: - Add regulator support. - Add MAINTAINERS entry. - Dual-license DT binding. Yassine Oudjana (2): drm/panel: Add driver for LG.Philips SW43101 DSI video mode panel dt-bindings: display: Add binding for LG.Philips SW43101 .../display/panel/lgphilips,sw43101.yaml | 75 MAINTAINERS | 6 + drivers/gpu/drm/panel/Kconfig | 10 + drivers/gpu/drm/panel/Makefile| 1 + .../gpu/drm/panel/panel-lgphilips-sw43101.c | 363 ++ 5 files changed, 455 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml create mode 100644 drivers/gpu/drm/panel/panel-lgphilips-sw43101.c -- 2.33.0
Re: [PATCH v1 03/14] mm: add iomem vma selection for memory migration
Am 2021-09-02 um 4:18 a.m. schrieb Christoph Hellwig: > On Wed, Sep 01, 2021 at 11:40:43AM -0400, Felix Kuehling wrote: > It looks like I'm totally misunderstanding what you are adding here > then. Why do we need any special treatment at all for memory that > has normal struct pages and is part of the direct kernel map? The pages are like normal memory for purposes of mapping them in CPU page tables and for coherent access from the CPU. >>> That's the user page tables. What about the kernel direct map? >>> If there is a normal kernel struct page backing there really should >>> be no need for the pgmap. >> I'm not sure. The physical address ranges are in the UEFI system address >> map as special-purpose memory. Does Linux create the struct pages and >> kernel direct map for that without a pgmap call? I didn't see that last >> time I went digging through that code. > So doing some googling finds a patch from Dan that claims to hand EFI > special purpose memory to the device dax driver. But when I try to > follow the version that got merged it looks it is treated simply as an > MMIO region to be claimed by drivers, which would not get a struct page. > > Dan, did I misunderstand how E820_TYPE_SOFT_RESERVED works? > From an application perspective, we want file-backed and anonymous mappings to be able to use DEVICE_PUBLIC pages with coherent CPU access. The goal is to optimize performance for GPU heavy workloads while minimizing the need to migrate data back-and-forth between system memory and device memory. >>> I don't really understand that part. file backed pages are always >>> allocated by the file system using the pagecache helpers, that is >>> using the page allocator. Anonymouns memory also always comes from >>> the page allocator. >> I'm coming at this from my experience with DEVICE_PRIVATE. Both >> anonymous and file-backed pages should be migrateable to DEVICE_PRIVATE >> memory by the migrate_vma_* helpers for more efficient access by our >> GPU. (*) It's part of the basic premise of HMM as I understand it. I >> would expect the same thing to work for DEVICE_PUBLIC memory. > Ok, so you want to migrate to and from them. Not use DEVICE_PUBLIC > for the actual page cache pages. That maks a lot more sense. > >> I see DEVICE_PUBLIC as an improved version of DEVICE_PRIVATE that allows >> the CPU to map the device memory coherently to minimize the need for >> migrations when CPU and GPU access the same memory concurrently or >> alternatingly. But we're not going as far as putting that memory >> entirely under the management of the Linux memory manager and VM >> subsystem. Our (and HPE's) system architects decided that this memory is >> not suitable to be used like regular NUMA system memory by the Linux >> memory manager. > So yes. It is a Memory Mapped I/O region, which unlike the PCIe BARs > that people typically deal with is fully cache coherent. I think this > does make more sense as a description. > > But to go back to what start this discussion: If these are memory > mapped I/O pfn_valid should generally not return true for them. As I understand it, pfn_valid should be true for any pfn that's part of the kernel's physical memory map, i.e. is returned by page_to_pfn or works with pfn_to_page. Both the hmm_range_fault and the migrate_vma_* APIs use pfns to refer to regular system memory and ZONE_DEVICE pages (even DEVICE_PRIVATE). Therefore I believe pfn_valid should be true for ZONE_DEVICE pages as well. 
Regards, Felix > > And as you already pointed out in reply to Alex we need to tighten the > selection criteria one way or another.
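For readers unfamiliar with the helpers Felix refers to, a rough sketch of the migrate_vma_* flow a DEVICE_PRIVATE driver typically uses. This is a simplification and an assumption, not code from this series; the function name, EXAMPLE_NPAGES and the elided device copy are placeholders, and lib/test_hmm.c is the in-tree reference implementation.

#include <linux/migrate.h>
#include <linux/mm.h>

#define EXAMPLE_NPAGES 16	/* arbitrary size for illustration */

static int example_migrate_range_to_device(struct vm_area_struct *vma,
					   unsigned long start,
					   void *pgmap_owner)
{
	unsigned long src[EXAMPLE_NPAGES] = { 0 };
	unsigned long dst[EXAMPLE_NPAGES] = { 0 };
	struct migrate_vma args = {
		.vma		= vma,
		.start		= start,
		.end		= start + EXAMPLE_NPAGES * PAGE_SIZE,
		.src		= src,
		.dst		= dst,
		.pgmap_owner	= pgmap_owner,
		.flags		= MIGRATE_VMA_SELECT_SYSTEM,
	};
	int ret;

	ret = migrate_vma_setup(&args);		/* collect and isolate source pages */
	if (ret)
		return ret;

	/*
	 * A real driver now allocates device pages, fills dst[] with
	 * migrate_pfn() values and copies the data with its DMA engine.
	 */

	migrate_vma_pages(&args);		/* install the new pages */
	migrate_vma_finalize(&args);		/* drop refs, roll back failures */
	return 0;
}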
[Bug 213917] Screen starts flickering when laptop(amdgpu) wakes up after suspend.
https://bugzilla.kernel.org/show_bug.cgi?id=213917 Samuel Sieb (samuel-kb...@sieb.net) changed: What|Removed |Added CC||samuel-kb...@sieb.net --- Comment #1 from Samuel Sieb (samuel-kb...@sieb.net) --- I've just run into this as well. I upgraded the laptop to 5.13.12 and the display doesn't resume properly as described. The cpu is "AMD A10-9600P RADEON R5". The only relevant messages in the journal are 4 lines like: amdgpu :00:01.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0x10aa25720 flags=0x0070] with a different value for address in each one.
Re: [PULL] drm-misc-fixes
On Thu, 9 Sept 2021 at 03:44, Thomas Zimmermann wrote: > > Hi Dave and Daniel, > > here's this week's PR for drm-misc-fixes. One patch fixes a potential deadlock > in TTM, the other enables an additional plane in kmb. I'm slightly unhappy > that the latter one ended up in -fixes as it's not a bugfix AFAICT. To avoid a messy merge window, I'm not pulling this until after rc1 unless there is some major reason? The current drm-next doesn't have v5.14 in it, and the merge is rather ugly right now. (maybe I should always pull it in before sending to Linus to avoid this in future). Dave.
[PATCH] dma-buf: system_heap: Avoid warning on mid-order allocations
When trying to do mid-order allocations, set __GFP_NOWARN to avoid warning messages if the allocation fails, as we will still fall back to single page allocations in that case. This is similar to what we already do for large order allocations. Cc: Daniel Vetter Cc: Christian Koenig Cc: Sumit Semwal Cc: Liam Mark Cc: Chris Goldsworthy Cc: Laura Abbott Cc: Brian Starkey Cc: Hridya Valsaraju Cc: Suren Baghdasaryan Cc: Sandeep Patil Cc: Daniel Mentz Cc: Ørjan Eide Cc: Robin Murphy Cc: Simon Ser Cc: James Jones Cc: Leo Yan Cc: linux-me...@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: John Stultz --- drivers/dma-buf/heaps/system_heap.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c index 23a7e74ef966..f57a39ddd063 100644 --- a/drivers/dma-buf/heaps/system_heap.c +++ b/drivers/dma-buf/heaps/system_heap.c @@ -40,11 +40,12 @@ struct dma_heap_attachment { bool mapped; }; +#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP) +#define MID_ORDER_GFP (LOW_ORDER_GFP | __GFP_NOWARN) #define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \ | __GFP_NORETRY) & ~__GFP_RECLAIM) \ | __GFP_COMP) -#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP) -static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP}; +static gfp_t order_flags[] = {HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP}; /* * The selection of the orders used for allocation (1MB, 64K, 4K) is designed * to match with the sizes often found in IOMMUs. Using order 4 pages instead -- 2.25.1
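To illustrate why only the final order needs to warn, here is a simplified sketch of the order fallback such a heap performs. It is condensed for illustration, not the driver's exact code: example_alloc_largest_available() is an assumed name, and HIGH/MID/LOW_ORDER_GFP are the defines from the patch above.

#include <linux/gfp.h>
#include <linux/kernel.h>

/* Sketch only: higher orders are opportunistic, so their failures stay quiet. */
static struct page *example_alloc_largest_available(unsigned long nr_pages_left)
{
	static const unsigned int orders[] = { 8, 4, 0 };
	static const gfp_t flags[] = { HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP };
	struct page *page;
	unsigned int i;

	for (i = 0; i < ARRAY_SIZE(orders); i++) {
		if (nr_pages_left < (1UL << orders[i]))
			continue;	/* request too small for this order */

		page = alloc_pages(flags[i], orders[i]);
		if (page)
			return page;
		/* Fall through to the next, smaller order. */
	}
	return NULL;
}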
[PATCH AUTOSEL 5.14 010/252] dma-buf: fix dma_resv_test_signaled test_all handling v2
From: Christian König [ Upstream commit 9d38814d1e346ea37a51cbf31f4424c9d059459e ] As the name implies if testing all fences is requested we should indeed test all fences and not skip the exclusive one because we see shared ones. v2: fix logic once more Signed-off-by: Christian König Reviewed-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20210702111642.17259-3-christian.koe...@amd.com Signed-off-by: Sasha Levin --- drivers/dma-buf/dma-resv.c | 33 - 1 file changed, 12 insertions(+), 21 deletions(-) diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index f26c71747d43..e744fd87c63c 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -615,25 +615,21 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence) */ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) { - unsigned int seq, shared_count; + struct dma_fence *fence; + unsigned int seq; int ret; rcu_read_lock(); retry: ret = true; - shared_count = 0; seq = read_seqcount_begin(&obj->seq); if (test_all) { struct dma_resv_list *fobj = dma_resv_shared_list(obj); - unsigned int i; - - if (fobj) - shared_count = fobj->shared_count; + unsigned int i, shared_count; + shared_count = fobj ? fobj->shared_count : 0; for (i = 0; i < shared_count; ++i) { - struct dma_fence *fence; - fence = rcu_dereference(fobj->shared[i]); ret = dma_resv_test_signaled_single(fence); if (ret < 0) @@ -641,24 +637,19 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) else if (!ret) break; } - - if (read_seqcount_retry(&obj->seq, seq)) - goto retry; } - if (!shared_count) { - struct dma_fence *fence_excl = dma_resv_excl_fence(obj); - - if (fence_excl) { - ret = dma_resv_test_signaled_single(fence_excl); - if (ret < 0) - goto retry; + fence = dma_resv_excl_fence(obj); + if (ret && fence) { + ret = dma_resv_test_signaled_single(fence); + if (ret < 0) + goto retry; - if (read_seqcount_retry(&obj->seq, seq)) - goto retry; - } } + if (read_seqcount_retry(&obj->seq, seq)) + goto retry; + rcu_read_unlock(); return ret; } -- 2.30.2
[PATCH AUTOSEL 5.14 008/252] drm/amdgpu: Fix koops when accessing RAS EEPROM
From: Luben Tuikov [ Upstream commit 1d9d2ca85b32605ac9c74c8fa42d0c1cfbe019d4 ] Debugfs RAS EEPROM files are available when the ASIC supports RAS, and when debugfs is enabled, and also when the "ras_enable" module parameter is set to 0. However in this case, we get a kernel oops when accessing some of the "ras_..." controls in debugfs. The reason for this is that struct amdgpu_ras::adev is unset. This commit sets it, thus enabling access to those facilities. Note that this facilitates EEPROM access and not necessarily RAS features or functionality. Cc: Alexander Deucher Cc: John Clements Cc: Hawking Zhang Signed-off-by: Luben Tuikov Acked-by: Alexander Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index fc66aca28594..95d5842385b3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -1966,11 +1966,20 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev) bool exc_err_limit = false; int ret; - if (adev->ras_enabled && con) - data = &con->eh_data; - else + if (!con) + return 0; + + /* Allow access to RAS EEPROM via debugfs, when the ASIC +* supports RAS and debugfs is enabled, but when +* adev->ras_enabled is unset, i.e. when "ras_enable" +* module parameter is set to 0. +*/ + con->adev = adev; + + if (!adev->ras_enabled) return 0; + data = &con->eh_data; *data = kmalloc(sizeof(**data), GFP_KERNEL | __GFP_ZERO); if (!*data) { ret = -ENOMEM; @@ -1980,7 +1989,6 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev) mutex_init(&con->recovery_lock); INIT_WORK(&con->recovery_work, amdgpu_ras_do_recovery); atomic_set(&con->in_recovery, 0); - con->adev = adev; max_eeprom_records_len = amdgpu_ras_eeprom_get_record_max_length(); amdgpu_ras_validate_threshold(adev, max_eeprom_records_len); -- 2.30.2
[PATCH AUTOSEL 5.14 009/252] drm: vc4: Fix pixel-wrap issue with DVP teardown
From: Tim Gover [ Upstream commit 0b066a6809d0f8fd9868e383add36aa5a2fa409d ] Adjust the DVP enable/disable sequence to avoid a pixel getting stuck in an internal, non resettable FIFO within PixelValve when changing HDMI resolution. The blank pixels features of the DVP can prevent signals back to pixelvalve causing it to not clear the FIFO. Adjust the ordering and timing of operations to ensure the clear signal makes it through to pixelvalve. Signed-off-by: Tim Gover Signed-off-by: Maxime Ripard Link: https://patchwork.freedesktop.org/patch/msgid/20210628130533.144617-1-max...@cerno.tech Signed-off-by: Sasha Levin --- drivers/gpu/drm/vc4/vc4_hdmi.c | 15 --- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c index ad92dbb128b3..f91d37beb113 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.c +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c @@ -613,12 +613,12 @@ static void vc4_hdmi_encoder_post_crtc_disable(struct drm_encoder *encoder, HDMI_WRITE(HDMI_RAM_PACKET_CONFIG, 0); - HDMI_WRITE(HDMI_VID_CTL, HDMI_READ(HDMI_VID_CTL) | - VC4_HD_VID_CTL_CLRRGB | VC4_HD_VID_CTL_CLRSYNC); + HDMI_WRITE(HDMI_VID_CTL, HDMI_READ(HDMI_VID_CTL) | VC4_HD_VID_CTL_CLRRGB); - HDMI_WRITE(HDMI_VID_CTL, - HDMI_READ(HDMI_VID_CTL) | VC4_HD_VID_CTL_BLANKPIX); + mdelay(1); + HDMI_WRITE(HDMI_VID_CTL, + HDMI_READ(HDMI_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE); vc4_hdmi_disable_scrambling(encoder); } @@ -628,12 +628,12 @@ static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder, struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder); int ret; + HDMI_WRITE(HDMI_VID_CTL, + HDMI_READ(HDMI_VID_CTL) | VC4_HD_VID_CTL_BLANKPIX); + if (vc4_hdmi->variant->phy_disable) vc4_hdmi->variant->phy_disable(vc4_hdmi); - HDMI_WRITE(HDMI_VID_CTL, - HDMI_READ(HDMI_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE); - clk_disable_unprepare(vc4_hdmi->pixel_bvb_clock); clk_disable_unprepare(vc4_hdmi->pixel_clock); @@ -1015,6 +1015,7 @@ static void vc4_hdmi_encoder_post_crtc_enable(struct drm_encoder *encoder, HDMI_WRITE(HDMI_VID_CTL, VC4_HD_VID_CTL_ENABLE | + VC4_HD_VID_CTL_CLRRGB | VC4_HD_VID_CTL_UNDERFLOW_ENABLE | VC4_HD_VID_CTL_FRAME_COUNTER_RESET | (vsync_pos ? 0 : VC4_HD_VID_CTL_VSYNC_LOW) | -- 2.30.2
[PATCH AUTOSEL 5.14 007/252] drm/amdgpu: Fix amdgpu_ras_eeprom_init()
From: Luben Tuikov [ Upstream commit dce4400e6516d18313d23de45b5be8a18980b00e ] No need to account for the 2 bytes of EEPROM address--this is now well abstracted away by the fixes to the lower layers. Cc: Andrey Grodzovsky Cc: Alexander Deucher Signed-off-by: Luben Tuikov Acked-by: Alexander Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 38222de921d1..8dd151c9e459 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -325,7 +325,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control, return ret; } - __decode_table_header_from_buff(hdr, &buff[2]); + __decode_table_header_from_buff(hdr, buff); if (hdr->header == EEPROM_TABLE_HDR_VAL) { control->num_recs = (hdr->tbl_size - EEPROM_TABLE_HEADER_SIZE) / -- 2.30.2
[PATCH AUTOSEL 5.14 006/252] drm/omap: Follow implicit fencing in prepare_fb
From: Daniel Vetter [ Upstream commit 942d8344d5f14b9ea2ae43756f319b9f44216ba4 ] I guess no one ever tried running omap together with lima or panfrost, not even sure that's possible. Anyway for consistency, fix this. Reviewed-by: Tomi Valkeinen Signed-off-by: Daniel Vetter Cc: Tomi Valkeinen Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-12-daniel.vet...@ffwll.ch Signed-off-by: Sasha Levin --- drivers/gpu/drm/omapdrm/omap_plane.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/omapdrm/omap_plane.c b/drivers/gpu/drm/omapdrm/omap_plane.c index 801da917507d..512af976b7e9 100644 --- a/drivers/gpu/drm/omapdrm/omap_plane.c +++ b/drivers/gpu/drm/omapdrm/omap_plane.c @@ -6,6 +6,7 @@ #include #include +#include #include #include "omap_dmm_tiler.h" @@ -29,6 +30,8 @@ static int omap_plane_prepare_fb(struct drm_plane *plane, if (!new_state->fb) return 0; + drm_gem_plane_helper_prepare_fb(plane, new_state); + return omap_framebuffer_pin(new_state->fb); } -- 2.30.2
[PATCH AUTOSEL 5.14 005/252] drm/ttm: Fix multihop assert on eviction.
From: Andrey Grodzovsky [ Upstream commit 403797925768d9fa870f5b1ebcd20016b397083b ] Problem: Under memory pressure when GTT domain is almost full multihop assert will come up when trying to evict LRU BO from VRAM to SYSTEM. Fix: Don't assert on multihop error in evict code but rather do a retry as we do in ttm_bo_move_buffer Signed-off-by: Andrey Grodzovsky Reviewed-by: Christian König Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-6-andrey.grodzov...@amd.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/ttm/ttm_bo.c | 63 +++- 1 file changed, 34 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 8d7fd65ccced..32202385073a 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -488,6 +488,31 @@ void ttm_bo_unlock_delayed_workqueue(struct ttm_device *bdev, int resched) } EXPORT_SYMBOL(ttm_bo_unlock_delayed_workqueue); +static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo, +struct ttm_resource **mem, +struct ttm_operation_ctx *ctx, +struct ttm_place *hop) +{ + struct ttm_placement hop_placement; + struct ttm_resource *hop_mem; + int ret; + + hop_placement.num_placement = hop_placement.num_busy_placement = 1; + hop_placement.placement = hop_placement.busy_placement = hop; + + /* find space in the bounce domain */ + ret = ttm_bo_mem_space(bo, &hop_placement, &hop_mem, ctx); + if (ret) + return ret; + /* move to the bounce domain */ + ret = ttm_bo_handle_move_mem(bo, hop_mem, false, ctx, NULL); + if (ret) { + ttm_resource_free(bo, &hop_mem); + return ret; + } + return 0; +} + static int ttm_bo_evict(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx) { @@ -527,12 +552,17 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo, goto out; } +bounce: ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop); - if (unlikely(ret)) { - WARN(ret == -EMULTIHOP, "Unexpected multihop in eviction - likely driver bug\n"); - if (ret != -ERESTARTSYS) + if (ret == -EMULTIHOP) { + ret = ttm_bo_bounce_temp_buffer(bo, &evict_mem, ctx, &hop); + if (ret) { pr_err("Buffer eviction failed\n"); - ttm_resource_free(bo, &evict_mem); + ttm_resource_free(bo, &evict_mem); + goto out; + } + /* try and move to final place now. */ + goto bounce; } out: return ret; @@ -847,31 +877,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo, } EXPORT_SYMBOL(ttm_bo_mem_space); -static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo, -struct ttm_resource **mem, -struct ttm_operation_ctx *ctx, -struct ttm_place *hop) -{ - struct ttm_placement hop_placement; - struct ttm_resource *hop_mem; - int ret; - - hop_placement.num_placement = hop_placement.num_busy_placement = 1; - hop_placement.placement = hop_placement.busy_placement = hop; - - /* find space in the bounce domain */ - ret = ttm_bo_mem_space(bo, &hop_placement, &hop_mem, ctx); - if (ret) - return ret; - /* move to the bounce domain */ - ret = ttm_bo_handle_move_mem(bo, hop_mem, false, ctx, NULL); - if (ret) { - ttm_resource_free(bo, &hop_mem); - return ret; - } - return 0; -} - static int ttm_bo_move_buffer(struct ttm_buffer_object *bo, struct ttm_placement *placement, struct ttm_operation_ctx *ctx) -- 2.30.2
[PATCH AUTOSEL 5.14 004/252] drm/vc4: hdmi: Set HD_CTL_WHOLSMP and HD_CTL_CHALIGN_SET
From: Dom Cobley [ Upstream commit 1698ecb218eb82587dbfc71a2e26ded66e5ecf59 ] Symptom is random switching of speakers when using multichannel. Repeatedly running speakertest -c8 occasionally starts with channels jumbled. This is fixed with HD_CTL_WHOLSMP. The other bit looks beneficial and appears harmless in testing so I'd suggest adding it too. Documentation says: HD_CTL_WHILSMP_SET Wait for whole sample. When this bit is set MAI transmit will start only when there is at least one whole sample available in the fifo. Documentation says: HD_CTL_CHALIGN_SET Channel Align When Overflow. This bit is used to realign the audio channels in case of an overflow. If this bit is set, after the detection of an overflow, equal amount of dummy words to the missing words will be written to fifo, filling up the broken sample and maintaining alignment. Signed-off-by: Dom Cobley Signed-off-by: Maxime Ripard Reviewed-by: Nicolas Saenz Julienne Link: https://patchwork.freedesktop.org/patch/msgid/20210525132354.297468-7-max...@cerno.tech Signed-off-by: Sasha Levin --- drivers/gpu/drm/vc4/vc4_hdmi.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c index c2876731ee2d..ad92dbb128b3 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.c +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c @@ -1372,7 +1372,9 @@ static int vc4_hdmi_audio_trigger(struct snd_pcm_substream *substream, int cmd, HDMI_WRITE(HDMI_MAI_CTL, VC4_SET_FIELD(vc4_hdmi->audio.channels, VC4_HD_MAI_CTL_CHNUM) | - VC4_HD_MAI_CTL_ENABLE); +VC4_HD_MAI_CTL_WHOLSMP | +VC4_HD_MAI_CTL_CHALIGN | +VC4_HD_MAI_CTL_ENABLE); break; case SNDRV_PCM_TRIGGER_STOP: HDMI_WRITE(HDMI_MAI_CTL, -- 2.30.2
[PATCH AUTOSEL 5.14 003/252] drm/vmwgfx: Fix some static checker warnings
From: Zack Rusin [ Upstream commit 74231041d14030f1ae6582b9233bfe782ac23e33 ] Fix some minor issues that Coverity spotted in the code. None of that are serious but they're all valid concerns so fixing them makes sense. Signed-off-by: Zack Rusin Reviewed-by: Roland Scheidegger Reviewed-by: Martin Krastev Link: https://patchwork.freedesktop.org/patch/msgid/20210609172307.131929-5-za...@vmware.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/vmwgfx/ttm_memory.c| 2 ++ drivers/gpu/drm/vmwgfx/vmwgfx_binding.c| 20 drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c | 4 +++- drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c| 2 ++ drivers/gpu/drm/vmwgfx/vmwgfx_mob.c| 4 +++- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c| 6 -- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 8 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_so.c | 3 ++- drivers/gpu/drm/vmwgfx/vmwgfx_validation.c | 4 ++-- 10 files changed, 33 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/ttm_memory.c b/drivers/gpu/drm/vmwgfx/ttm_memory.c index aeb0a22a2c34..edd17c30d5a5 100644 --- a/drivers/gpu/drm/vmwgfx/ttm_memory.c +++ b/drivers/gpu/drm/vmwgfx/ttm_memory.c @@ -435,8 +435,10 @@ int ttm_mem_global_init(struct ttm_mem_global *glob, struct device *dev) si_meminfo(&si); + spin_lock(&glob->lock); /* set it as 0 by default to keep original behavior of OOM */ glob->lower_mem_limit = 0; + spin_unlock(&glob->lock); ret = ttm_mem_init_kernel_zone(glob, &si); if (unlikely(ret != 0)) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c b/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c index 05b324825900..ea6d8c86985f 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c @@ -715,7 +715,7 @@ static int vmw_binding_scrub_cb(struct vmw_ctx_bindinfo *bi, bool rebind) * without checking which bindings actually need to be emitted * * @cbs: Pointer to the context's struct vmw_ctx_binding_state - * @bi: Pointer to where the binding info array is stored in @cbs + * @biv: Pointer to where the binding info array is stored in @cbs * @max_num: Maximum number of entries in the @bi array. * * Scans the @bi array for bindings and builds a buffer of view id data. @@ -725,11 +725,9 @@ static int vmw_binding_scrub_cb(struct vmw_ctx_bindinfo *bi, bool rebind) * contains the command data. */ static void vmw_collect_view_ids(struct vmw_ctx_binding_state *cbs, -const struct vmw_ctx_bindinfo *bi, +const struct vmw_ctx_bindinfo_view *biv, u32 max_num) { - const struct vmw_ctx_bindinfo_view *biv = - container_of(bi, struct vmw_ctx_bindinfo_view, bi); unsigned long i; cbs->bind_cmd_count = 0; @@ -838,7 +836,7 @@ static int vmw_emit_set_sr(struct vmw_ctx_binding_state *cbs, */ static int vmw_emit_set_rt(struct vmw_ctx_binding_state *cbs) { - const struct vmw_ctx_bindinfo *loc = &cbs->render_targets[0].bi; + const struct vmw_ctx_bindinfo_view *loc = &cbs->render_targets[0]; struct { SVGA3dCmdHeader header; SVGA3dCmdDXSetRenderTargets body; @@ -874,7 +872,7 @@ static int vmw_emit_set_rt(struct vmw_ctx_binding_state *cbs) * without checking which bindings actually need to be emitted * * @cbs: Pointer to the context's struct vmw_ctx_binding_state - * @bi: Pointer to where the binding info array is stored in @cbs + * @biso: Pointer to where the binding info array is stored in @cbs * @max_num: Maximum number of entries in the @bi array. * * Scans the @bi array for bindings and builds a buffer of SVGA3dSoTarget data. 
@@ -884,11 +882,9 @@ static int vmw_emit_set_rt(struct vmw_ctx_binding_state *cbs) * contains the command data. */ static void vmw_collect_so_targets(struct vmw_ctx_binding_state *cbs, - const struct vmw_ctx_bindinfo *bi, + const struct vmw_ctx_bindinfo_so_target *biso, u32 max_num) { - const struct vmw_ctx_bindinfo_so_target *biso = - container_of(bi, struct vmw_ctx_bindinfo_so_target, bi); unsigned long i; SVGA3dSoTarget *so_buffer = (SVGA3dSoTarget *) cbs->bind_cmd_buffer; @@ -919,7 +915,7 @@ static void vmw_collect_so_targets(struct vmw_ctx_binding_state *cbs, */ static int vmw_emit_set_so_target(struct vmw_ctx_binding_state *cbs) { - const struct vmw_ctx_bindinfo *loc = &cbs->so_targets[0].bi; + const struct vmw_ctx_bindinfo_so_target *loc = &cbs->so_targets[0]; struct { SVGA3dCmdHeader header; SVGA3dCmdDXSetSOTargets body; @@ -1066,7 +1062,7 @@ static int vmw_emit_set_vb(struct vmw_ctx_binding_state *cbs) static int vmw_emit_set_uav(
[PATCH AUTOSEL 5.14 002/252] drm/vmwgfx: Fix subresource updates with new contexts
From: Zack Rusin [ Upstream commit a12be0277316ed923411c9c80b2899ee74d2b033 ] The has_dx variable was only set during the initialization which meant that UPDATE_SUBRESOURCE was never used. We were emulating it with UPDATE_GB_IMAGE but that's always been a stop-gap. Instead of has_dx which has been deprecated a long time ago we need to check for whether shader model 4.0 or newer is available to the device. Signed-off-by: Zack Rusin Reviewed-by: Roland Scheidegger Reviewed-by: Martin Krastev Link: https://patchwork.freedesktop.org/patch/msgid/20210609172307.131929-4-za...@vmware.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c index 0835468bb2ee..47c03a276515 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c @@ -1872,7 +1872,6 @@ static void vmw_surface_dirty_range_add(struct vmw_resource *res, size_t start, static int vmw_surface_dirty_sync(struct vmw_resource *res) { struct vmw_private *dev_priv = res->dev_priv; - bool has_dx = 0; u32 i, num_dirty; struct vmw_surface_dirty *dirty = (struct vmw_surface_dirty *) res->dirty; @@ -1899,7 +1898,7 @@ static int vmw_surface_dirty_sync(struct vmw_resource *res) if (!num_dirty) goto out; - alloc_size = num_dirty * ((has_dx) ? sizeof(*cmd1) : sizeof(*cmd2)); + alloc_size = num_dirty * ((has_sm4_context(dev_priv)) ? sizeof(*cmd1) : sizeof(*cmd2)); cmd = VMW_CMD_RESERVE(dev_priv, alloc_size); if (!cmd) return -ENOMEM; @@ -1917,7 +1916,7 @@ static int vmw_surface_dirty_sync(struct vmw_resource *res) * DX_UPDATE_SUBRESOURCE is aware of array surfaces. * UPDATE_GB_IMAGE is not. */ - if (has_dx) { + if (has_sm4_context(dev_priv)) { cmd1->header.id = SVGA_3D_CMD_DX_UPDATE_SUBRESOURCE; cmd1->header.size = sizeof(cmd1->body); cmd1->body.sid = res->id; -- 2.30.2
[PATCH AUTOSEL 5.14 001/252] drm/bridge: ti-sn65dsi86: Don't read EDID blob over DDC
From: Douglas Anderson [ Upstream commit a70e558c151043ce46a5e5999f4310e0b3551f57 ] This is really just a revert of commit 58074b08c04a ("drm/bridge: ti-sn65dsi86: Read EDID blob over DDC"), resolving conflicts. The old code failed to read the EDID properly in a very important case: before the bridge's pre_enable() was called. The way things need to work: 1. Read the EDID. 2. Based on the EDID, decide on video settings and pixel clock. 3. Enable the bridge w/ the desired settings. The way things were working: 1. Try to read the EDID but fail; fall back to hardcoded values. 2. Based on hardcoded values, decide on video settings and pixel clock. 3. Enable the bridge w/ the desired settings. 4. Try again to read the EDID, it works now! 5. Realize that the hardcoded settings weren't quite right. 6. Disable / reenable the bridge w/ the right settings. The reasons for the failures were twofold: a) Since we never ran the bridge chip's pre-enable then we never set the bit to ignore HPD. This meant the bridge chip didn't even _try_ to go out on the bus and communicate with the panel. b) Even if we fixed things to ignore HPD, the EDID still wouldn't read if the panel wasn't on. Instead of reverting the code, we could fix it to set the HPD bit and also power on the panel. However, it also works nicely to just let the panel code read the EDID. Now that we've split the driver up we can expose the DDC AUX channel bus to the panel node. The panel can take charge of reading the EDID. NOTE: in order for things to work, anyone that needs to read the EDID will need to instantiate their panel using the new DP AUX bus (AKA by listing their panel under the "aux-bus" node of the bridge chip in the device tree). In the future if we want to use the bridge chip to provide a full external DP port (which won't have a panel) then we will have to conditionally add EDID reading back in. Suggested-by: Andrzej Hajda Signed-off-by: Douglas Anderson Reviewed-by: Bjorn Andersson Link: https://patchwork.freedesktop.org/patch/msgid/20210611101711.v10.9.I9330684c25f65bb318eff57f0616500f83eac3cc@changeid Signed-off-by: Sasha Levin --- drivers/gpu/drm/bridge/ti-sn65dsi86.c | 22 -- 1 file changed, 22 deletions(-) diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c index 45a2969afb2b..aef850296756 100644 --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c @@ -124,7 +124,6 @@ * @connector:Our connector. * @host_node:Remote DSI node. * @dsi: Our MIPI DSI source. - * @edid: Detected EDID of eDP panel. * @refclk: Our reference clock. * @panel:Our panel. * @enable_gpio: The GPIO we toggle to enable the bridge.
@@ -154,7 +153,6 @@ struct ti_sn65dsi86 { struct drm_dp_aux aux; struct drm_bridge bridge; struct drm_connectorconnector; - struct edid *edid; struct device_node *host_node; struct mipi_dsi_device *dsi; struct clk *refclk; @@ -403,24 +401,6 @@ connector_to_ti_sn65dsi86(struct drm_connector *connector) static int ti_sn_bridge_connector_get_modes(struct drm_connector *connector) { struct ti_sn65dsi86 *pdata = connector_to_ti_sn65dsi86(connector); - struct edid *edid = pdata->edid; - int num, ret; - - if (!edid) { - pm_runtime_get_sync(pdata->dev); - edid = pdata->edid = drm_get_edid(connector, &pdata->aux.ddc); - pm_runtime_put_autosuspend(pdata->dev); - } - - if (edid && drm_edid_is_valid(edid)) { - ret = drm_connector_update_edid_property(connector, edid); - if (!ret) { - num = drm_add_edid_modes(connector, edid); - if (num) - return num; - } - } - return drm_panel_get_modes(pdata->panel, connector); } @@ -1358,8 +1338,6 @@ static void ti_sn_bridge_remove(struct auxiliary_device *adev) mipi_dsi_device_unregister(pdata->dsi); } - kfree(pdata->edid); - drm_bridge_remove(&pdata->bridge); of_node_put(pdata->host_node); -- 2.30.2
[PATCH v1 12/12] drm/virtio: implement context init: advertise feature to userspace
This advertises the context init feature to userspace, along with a mask of supported capabilities. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index fdaa7f3d9eeb..5618a1d5879c 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -286,6 +286,12 @@ static int virtio_gpu_getparam_ioctl(struct drm_device *dev, void *data, case VIRTGPU_PARAM_CROSS_DEVICE: value = vgdev->has_resource_assign_uuid ? 1 : 0; break; + case VIRTGPU_PARAM_CONTEXT_INIT: + value = vgdev->has_context_init ? 1 : 0; + break; + case VIRTGPU_PARAM_SUPPORTED_CAPSET_IDs: + value = vgdev->capset_id_mask; + break; default: return -EINVAL; } -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 11/12] drm/virtio: implement context init: add virtio_gpu_fence_event
Similar to DRM_VMW_EVENT_FENCE_SIGNALED. Sends a pollable event to the DRM file descriptor when a fence on a specific ring is signaled. One difference is the event is not exposed via the UAPI -- this is because host responses are on a shared memory buffer of type BLOB_MEM_GUEST [this is the common way to receive responses with virtgpu]. As such, there is no context specific read(..) implementation either -- just a poll(..) implementation. Signed-off-by: Gurchetan Singh Acked-by: Nicholas Verne --- drivers/gpu/drm/virtio/virtgpu_drv.c | 43 +- drivers/gpu/drm/virtio/virtgpu_drv.h | 7 + drivers/gpu/drm/virtio/virtgpu_fence.c | 10 ++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 4 files changed, 93 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c index 9d963f1fda8f..749db18dcfa2 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.c +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c @@ -29,6 +29,8 @@ #include #include #include +#include +#include #include #include @@ -155,6 +157,35 @@ static void virtio_gpu_config_changed(struct virtio_device *vdev) schedule_work(&vgdev->config_changed_work); } +static __poll_t virtio_gpu_poll(struct file *filp, + struct poll_table_struct *wait) +{ + struct drm_file *drm_file = filp->private_data; + struct virtio_gpu_fpriv *vfpriv = drm_file->driver_priv; + struct drm_device *dev = drm_file->minor->dev; + struct drm_pending_event *e = NULL; + __poll_t mask = 0; + + if (!vfpriv->ring_idx_mask) + return drm_poll(filp, wait); + + poll_wait(filp, &drm_file->event_wait, wait); + + if (!list_empty(&drm_file->event_list)) { + spin_lock_irq(&dev->event_lock); + e = list_first_entry(&drm_file->event_list, +struct drm_pending_event, link); + drm_file->event_space += e->event->length; + list_del(&e->link); + spin_unlock_irq(&dev->event_lock); + + kfree(e); + mask |= EPOLLIN | EPOLLRDNORM; + } + + return mask; +} + static struct virtio_device_id id_table[] = { { VIRTIO_ID_GPU, VIRTIO_DEV_ANY_ID }, { 0 }, @@ -194,7 +225,17 @@ MODULE_AUTHOR("Dave Airlie "); MODULE_AUTHOR("Gerd Hoffmann "); MODULE_AUTHOR("Alon Levy"); -DEFINE_DRM_GEM_FOPS(virtio_gpu_driver_fops); +static const struct file_operations virtio_gpu_driver_fops = { + .owner = THIS_MODULE, + .open = drm_open, + .release= drm_release, + .unlocked_ioctl = drm_ioctl, + .compat_ioctl = drm_compat_ioctl, + .poll = virtio_gpu_poll, + .read = drm_read, + .llseek = noop_llseek, + .mmap = drm_gem_mmap +}; static const struct drm_driver driver = { .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC, diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index cb60d52c2bd1..e0265fe74aa5 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -138,11 +138,18 @@ struct virtio_gpu_fence_driver { spinlock_t lock; }; +#define VIRTGPU_EVENT_FENCE_SIGNALED_INTERNAL 0x1000 +struct virtio_gpu_fence_event { + struct drm_pending_event base; + struct drm_event event; +}; + struct virtio_gpu_fence { struct dma_fence f; uint32_t ring_idx; uint64_t fence_id; bool emit_fence_info; + struct virtio_gpu_fence_event *e; struct virtio_gpu_fence_driver *drv; struct list_head node; }; diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index 98a00c1e654d..f28357dbde35 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -152,11 +152,21 @@ void virtio_gpu_fence_event_process(struct virtio_gpu_device *vgdev, 
continue; dma_fence_signal_locked(&curr->f); + if (curr->e) { + drm_send_event(vgdev->ddev, &curr->e->base); + curr->e = NULL; + } + list_del(&curr->node); dma_fence_put(&curr->f); } dma_fence_signal_locked(&signaled->f); + if (signaled->e) { + drm_send_event(vgdev->ddev, &signaled->e->base); + signaled->e = NULL; + } + list_del(&signaled->node); dma_fence_put(&signaled->f); break; diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index be7b22a03884..fdaa
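To show what this buys userspace, a minimal hedged sketch (not from the series): once a poll mask has been set on the context (the POLL_RINGS_MASK patch that follows), a plain poll() on the DRM fd blocks until a fence on one of those rings signals. The kernel consumes the pending event inside its poll() implementation, so there is nothing to read() afterwards.

#include <poll.h>

/* Returns 1 if a fence event arrived, 0 on timeout, -1 on error. */
static int wait_for_ring_event(int drm_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = drm_fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret <= 0)
		return ret;

	return (pfd.revents & POLLIN) ? 1 : 0;
}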
[PATCH v1 10/12] drm/virtio: implement context init: handle VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK
For the Sommelier guest Wayland proxy, it's desirable for the DRM fd to be pollable in response to an host compositor event. This can also be used by the 3D driver to poll events on a CPU timeline. This enables the DRM fd associated with a particular 3D context to be polled independent of KMS events. The parameter VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK specifies the pollable rings. Signed-off-by: Gurchetan Singh Acked-by: Nicholas Verne --- drivers/gpu/drm/virtio/virtgpu_drv.h | 1 + drivers/gpu/drm/virtio/virtgpu_ioctl.c | 22 +- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index cca9ab505deb..cb60d52c2bd1 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -266,6 +266,7 @@ struct virtio_gpu_fpriv { bool context_created; uint32_t num_rings; uint64_t base_fence_ctx; + uint64_t ring_idx_mask; struct mutex context_lock; }; diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index 262f79210283..be7b22a03884 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -694,6 +694,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, { int ret = 0; uint32_t num_params, i, param, value; + uint64_t valid_ring_mask; size_t len; struct drm_virtgpu_context_set_param *ctx_set_params = NULL; struct virtio_gpu_device *vgdev = dev->dev_private; @@ -707,7 +708,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, return -EINVAL; /* Number of unique parameters supported at this time. */ - if (num_params > 2) + if (num_params > 3) return -EINVAL; ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), @@ -761,12 +762,31 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, vfpriv->base_fence_ctx = dma_fence_context_alloc(value); vfpriv->num_rings = value; break; + case VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK: + if (vfpriv->ring_idx_mask) { + ret = -EINVAL; + goto out_unlock; + } + + vfpriv->ring_idx_mask = value; + break; default: ret = -EINVAL; goto out_unlock; } } + if (vfpriv->ring_idx_mask) { + valid_ring_mask = 0; + for (i = 0; i < vfpriv->num_rings; i++) + valid_ring_mask |= 1 << i; + + if (~valid_ring_mask & vfpriv->ring_idx_mask) { + ret = -EINVAL; + goto out_unlock; + } + } + virtio_gpu_create_context_locked(vgdev, vfpriv); virtio_gpu_notify(vgdev); -- 2.33.0.153.gba50c8fa24-goog
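For illustration only (the function name and ring counts are made up; the uapi structs and defines come from patch 02 of this series), a sketch of how a proxy like Sommelier might opt in at context creation time:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/virtgpu_drm.h>

/* Create a context with two rings and make ring 1 pollable. Any mask bit at
 * or above NUM_RINGS makes the ioctl fail with -EINVAL, per the check above.
 */
static int init_pollable_context(int drm_fd)
{
	struct drm_virtgpu_context_set_param params[] = {
		{ .param = VIRTGPU_CONTEXT_PARAM_NUM_RINGS,       .value = 2 },
		{ .param = VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK, .value = 1 << 1 },
	};
	struct drm_virtgpu_context_init init = {
		.num_params = 2,
		.ctx_set_params = (uint64_t)(uintptr_t)params,
	};

	return ioctl(drm_fd, DRM_IOCTL_VIRTGPU_CONTEXT_INIT, &init);
}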
[PATCH v1 09/12] drm/virtio: implement context init: allocate an array of fence contexts
We don't want fences from different 3D contexts (virgl, gfxstream, venus) to be on the same timeline. With explicit context creation, we can specify the number of rings each context wants. Execbuffer can specify which ring to use. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 -- 2 files changed, 35 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index a5142d60c2fa..cca9ab505deb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -56,6 +56,7 @@ #define STATE_ERR 2 #define MAX_CAPSET_ID 63 +#define MAX_RINGS 64 struct virtio_gpu_object_params { unsigned long size; @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { uint32_t ctx_id; uint32_t context_init; bool context_created; + uint32_t num_rings; + uint64_t base_fence_ctx; struct mutex context_lock; }; diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index f51f3393a194..262f79210283 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, int in_fence_fd = exbuf->fence_fd; int out_fence_fd = -1; void *buf; + uint64_t fence_ctx; + uint32_t ring_idx; + + fence_ctx = vgdev->fence_drv.context; + ring_idx = 0; if (vgdev->has_virgl_3d == false) return -ENOSYS; @@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) return -EINVAL; + if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) { + if (exbuf->ring_idx >= vfpriv->num_rings) + return -EINVAL; + + if (!vfpriv->base_fence_ctx) + return -EINVAL; + + fence_ctx = vfpriv->base_fence_ctx; + ring_idx = exbuf->ring_idx; + } + exbuf->fence_fd = -1; virtio_gpu_create_context(dev, file); @@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_memdup; } - out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); + out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); if(!out_fence) { ret = -ENOMEM; goto out_unresv; } @@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, return -EINVAL; /* Number of unique parameters supported at this time. */ - if (num_params > 1) + if (num_params > 2) return -EINVAL; ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), @@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, vfpriv->context_init |= value; break; + case VIRTGPU_CONTEXT_PARAM_NUM_RINGS: + if (vfpriv->base_fence_ctx) { + ret = -EINVAL; + goto out_unlock; + } + + if (value > MAX_RINGS) { + ret = -EINVAL; + goto out_unlock; + } + + vfpriv->base_fence_ctx = dma_fence_context_alloc(value); + vfpriv->num_rings = value; + break; default: ret = -EINVAL; goto out_unlock; -- 2.33.0.153.gba50c8fa24-goog
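A hedged userspace sketch (the function name and buffer handling are made up; the flag and field come from the uapi patch in this series) of submitting on a specific ring once the context was created with VIRTGPU_CONTEXT_PARAM_NUM_RINGS:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/virtgpu_drm.h>

/* Submit a protocol-specific command stream on ring 'ring_idx'. The context
 * must have been created with NUM_RINGS > ring_idx, otherwise the kernel
 * returns -EINVAL.
 */
static int submit_on_ring(int drm_fd, void *cmd, uint32_t cmd_size,
			  uint32_t ring_idx)
{
	struct drm_virtgpu_execbuffer exbuf = {
		.flags = VIRTGPU_EXECBUF_RING_IDX,
		.size = cmd_size,
		.command = (uint64_t)(uintptr_t)cmd,
		.ring_idx = ring_idx,
		.fence_fd = -1,
	};

	return ioctl(drm_fd, DRM_IOCTL_VIRTGPU_EXECBUFFER, &exbuf);
}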
[PATCH v1 05/12] drm/virtio: implement context init: support init ioctl
From: Anthoine Bourgeois This implements the context initialization ioctl. A list of params is passed in by userspace, and kernel driver validates them. The only currently supported param is VIRTGPU_CONTEXT_PARAM_CAPSET_ID. If the context has already been initialized, -EEXIST is returned. This happens after Linux userspace does dumb_create + followed by opening the Mesa virgl driver with the same virtgpu instance. However, for most applications, 3D contexts will be explicitly initialized when the feature is available. Signed-off-by: Anthoine Bourgeois Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 6 +- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 96 -- drivers/gpu/drm/virtio/virtgpu_vq.c| 4 +- 3 files changed, 98 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 5e1958a522ff..9996abf60e3a 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -259,12 +259,13 @@ struct virtio_gpu_device { struct virtio_gpu_fpriv { uint32_t ctx_id; + uint32_t context_init; bool context_created; struct mutex context_lock; }; /* virtgpu_ioctl.c */ -#define DRM_VIRTIO_NUM_IOCTLS 11 +#define DRM_VIRTIO_NUM_IOCTLS 12 extern struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS]; void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file); @@ -342,7 +343,8 @@ int virtio_gpu_cmd_get_capset(struct virtio_gpu_device *vgdev, struct virtio_gpu_drv_cap_cache **cache_p); int virtio_gpu_cmd_get_edids(struct virtio_gpu_device *vgdev); void virtio_gpu_cmd_context_create(struct virtio_gpu_device *vgdev, uint32_t id, - uint32_t nlen, const char *name); + uint32_t context_init, uint32_t nlen, + const char *name); void virtio_gpu_cmd_context_destroy(struct virtio_gpu_device *vgdev, uint32_t id); void virtio_gpu_cmd_context_attach_resource(struct virtio_gpu_device *vgdev, diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index 5c1ad1596889..f5281d1e30e1 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -38,20 +38,30 @@ VIRTGPU_BLOB_FLAG_USE_SHAREABLE | \ VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE) +/* Must be called with &virtio_gpu_fpriv.struct_mutex held. 
*/ +static void virtio_gpu_create_context_locked(struct virtio_gpu_device *vgdev, +struct virtio_gpu_fpriv *vfpriv) +{ + char dbgname[TASK_COMM_LEN]; + + get_task_comm(dbgname, current); + virtio_gpu_cmd_context_create(vgdev, vfpriv->ctx_id, + vfpriv->context_init, strlen(dbgname), + dbgname); + + vfpriv->context_created = true; +} + void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file) { struct virtio_gpu_device *vgdev = dev->dev_private; struct virtio_gpu_fpriv *vfpriv = file->driver_priv; - char dbgname[TASK_COMM_LEN]; mutex_lock(&vfpriv->context_lock); if (vfpriv->context_created) goto out_unlock; - get_task_comm(dbgname, current); - virtio_gpu_cmd_context_create(vgdev, vfpriv->ctx_id, - strlen(dbgname), dbgname); - vfpriv->context_created = true; + virtio_gpu_create_context_locked(vgdev, vfpriv); out_unlock: mutex_unlock(&vfpriv->context_lock); @@ -662,6 +672,79 @@ static int virtio_gpu_resource_create_blob_ioctl(struct drm_device *dev, return 0; } +static int virtio_gpu_context_init_ioctl(struct drm_device *dev, +void *data, struct drm_file *file) +{ + int ret = 0; + uint32_t num_params, i, param, value; + size_t len; + struct drm_virtgpu_context_set_param *ctx_set_params = NULL; + struct virtio_gpu_device *vgdev = dev->dev_private; + struct virtio_gpu_fpriv *vfpriv = file->driver_priv; + struct drm_virtgpu_context_init *args = data; + + num_params = args->num_params; + len = num_params * sizeof(struct drm_virtgpu_context_set_param); + + if (!vgdev->has_context_init || !vgdev->has_virgl_3d) + return -EINVAL; + + /* Number of unique parameters supported at this time. */ + if (num_params > 1) + return -EINVAL; + + ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), +len); + + if (IS_ERR(ctx_set_params)) + return PTR_ERR(ctx_set_params); + + mutex_lock(&vfpriv->context_lock); + if (vfpriv->context
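A minimal hedged sketch of the new ioctl from the guest side (not from the series; the capset id value is whatever the host advertises, see the capset mask patches elsewhere in this set):

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/virtgpu_drm.h>

/* Pick the context type before the first execbuffer; calling this a second
 * time on the same DRM fd returns -EEXIST.
 */
static int init_context_for_capset(int drm_fd, uint64_t capset_id)
{
	struct drm_virtgpu_context_set_param param = {
		.param = VIRTGPU_CONTEXT_PARAM_CAPSET_ID,
		.value = capset_id,
	};
	struct drm_virtgpu_context_init init = {
		.num_params = 1,
		.ctx_set_params = (uint64_t)(uintptr_t)&param,
	};

	return ioctl(drm_fd, DRM_IOCTL_VIRTGPU_CONTEXT_INIT, &init);
}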
[PATCH v1 08/12] drm/virtio: implement context init: stop using drv->context when creating fence
The plumbing is all here to do this. Since we always use the default fence context when allocating a fence, this makes no functional difference. We can't process just the largest fence id anymore, since it's associated with different timelines. It's fine for fence_id 260 to signal before 259. As such, process each fence_id individually. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_fence.c | 16 ++-- drivers/gpu/drm/virtio/virtgpu_vq.c| 15 +++ 2 files changed, 17 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index 24c728b65d21..98a00c1e654d 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -75,20 +75,25 @@ struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, uint64_t base_fence_ctx, uint32_t ring_idx) { + uint64_t fence_context = base_fence_ctx + ring_idx; struct virtio_gpu_fence_driver *drv = &vgdev->fence_drv; struct virtio_gpu_fence *fence = kzalloc(sizeof(struct virtio_gpu_fence), GFP_KERNEL); + if (!fence) return fence; fence->drv = drv; + fence->ring_idx = ring_idx; + fence->emit_fence_info = !(base_fence_ctx == drv->context); /* This only partially initializes the fence because the seqno is * unknown yet. The fence must not be used outside of the driver * until virtio_gpu_fence_emit is called. */ - dma_fence_init(&fence->f, &virtio_gpu_fence_ops, &drv->lock, drv->context, - 0); + + dma_fence_init(&fence->f, &virtio_gpu_fence_ops, &drv->lock, + fence_context, 0); return fence; } @@ -110,6 +115,13 @@ void virtio_gpu_fence_emit(struct virtio_gpu_device *vgdev, cmd_hdr->flags |= cpu_to_le32(VIRTIO_GPU_FLAG_FENCE); cmd_hdr->fence_id = cpu_to_le64(fence->fence_id); + + /* Only currently defined fence param. */ + if (fence->emit_fence_info) { + cmd_hdr->flags |= + cpu_to_le32(VIRTIO_GPU_FLAG_INFO_RING_IDX); + cmd_hdr->ring_idx = (u8)fence->ring_idx; + } } void virtio_gpu_fence_event_process(struct virtio_gpu_device *vgdev, diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c index 496f8ce4cd41..938331554632 100644 --- a/drivers/gpu/drm/virtio/virtgpu_vq.c +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c @@ -205,7 +205,7 @@ void virtio_gpu_dequeue_ctrl_func(struct work_struct *work) struct list_head reclaim_list; struct virtio_gpu_vbuffer *entry, *tmp; struct virtio_gpu_ctrl_hdr *resp; - u64 fence_id = 0; + u64 fence_id; INIT_LIST_HEAD(&reclaim_list); spin_lock(&vgdev->ctrlq.qlock); @@ -232,23 +232,14 @@ void virtio_gpu_dequeue_ctrl_func(struct work_struct *work) DRM_DEBUG("response 0x%x\n", le32_to_cpu(resp->type)); } if (resp->flags & cpu_to_le32(VIRTIO_GPU_FLAG_FENCE)) { - u64 f = le64_to_cpu(resp->fence_id); - - if (fence_id > f) { - DRM_ERROR("%s: Oops: fence %llx -> %llx\n", - __func__, fence_id, f); - } else { - fence_id = f; - } + fence_id = le64_to_cpu(resp->fence_id); + virtio_gpu_fence_event_process(vgdev, fence_id); } if (entry->resp_cb) entry->resp_cb(vgdev, entry); } wake_up(&vgdev->ctrlq.ack_queue); - if (fence_id) - virtio_gpu_fence_event_process(vgdev, fence_id); - list_for_each_entry_safe(entry, tmp, &reclaim_list, list) { if (entry->objs) virtio_gpu_array_put_free_delayed(vgdev, entry->objs); -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 07/12] drm/virtio: implement context init: plumb {base_fence_ctx, ring_idx} to virtio_gpu_fence_alloc
These were defined in the previous commit. We'll need these parameters when allocating a dma_fence. The use case for this is multiple synchronizations timelines. The maximum number of timelines per 3D instance will be 32. Usually, only 2 are needed -- one for CPU commands, and another for GPU commands. As such, we'll need to specify these parameters when allocating a dma_fence. vgdev->fence_drv.context is the "default" fence context for 2D mode and old userspace. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 5 +++-- drivers/gpu/drm/virtio/virtgpu_fence.c | 4 +++- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 + drivers/gpu/drm/virtio/virtgpu_plane.c | 3 ++- 4 files changed, 13 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 401aec1a5efb..a5142d60c2fa 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -426,8 +426,9 @@ struct drm_plane *virtio_gpu_plane_init(struct virtio_gpu_device *vgdev, int index); /* virtgpu_fence.c */ -struct virtio_gpu_fence *virtio_gpu_fence_alloc( - struct virtio_gpu_device *vgdev); +struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, + uint64_t base_fence_ctx, + uint32_t ring_idx); void virtio_gpu_fence_emit(struct virtio_gpu_device *vgdev, struct virtio_gpu_ctrl_hdr *cmd_hdr, struct virtio_gpu_fence *fence); diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index d28e25e8409b..24c728b65d21 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -71,7 +71,9 @@ static const struct dma_fence_ops virtio_gpu_fence_ops = { .timeline_value_str = virtio_gpu_timeline_value_str, }; -struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev) +struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, + uint64_t base_fence_ctx, + uint32_t ring_idx) { struct virtio_gpu_fence_driver *drv = &vgdev->fence_drv; struct virtio_gpu_fence *fence = kzalloc(sizeof(struct virtio_gpu_fence), diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index f5281d1e30e1..f51f3393a194 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -173,7 +173,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_memdup; } - out_fence = virtio_gpu_fence_alloc(vgdev); + out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); if(!out_fence) { ret = -ENOMEM; goto out_unresv; @@ -288,7 +288,7 @@ static int virtio_gpu_resource_create_ioctl(struct drm_device *dev, void *data, if (params.size == 0) params.size = PAGE_SIZE; - fence = virtio_gpu_fence_alloc(vgdev); + fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); if (!fence) return -ENOMEM; ret = virtio_gpu_object_create(vgdev, ¶ms, &qobj, fence); @@ -367,7 +367,7 @@ static int virtio_gpu_transfer_from_host_ioctl(struct drm_device *dev, if (ret != 0) goto err_put_free; - fence = virtio_gpu_fence_alloc(vgdev); + fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); if (!fence) { ret = -ENOMEM; goto err_unlock; @@ -427,7 +427,8 @@ static int virtio_gpu_transfer_to_host_ioctl(struct drm_device *dev, void *data, goto err_put_free; ret = -ENOMEM; - fence = virtio_gpu_fence_alloc(vgdev); + fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, + 0); if (!fence) goto 
err_unlock; diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c index a49fd9480381..6d3cc9e238a4 100644 --- a/drivers/gpu/drm/virtio/virtgpu_plane.c +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c @@ -256,7 +256,8 @@ static int virtio_gpu_plane_prepare_fb(struct drm_plane *plane, return 0; if (bo->dumb && (plane->state->fb != new_state->fb)) { - vgfb->fence = virtio_gpu_fence_alloc(vgdev); + vgfb->fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, +0); if (!vgfb->fence) return -ENOMEM; } -- 2.33
[PATCH v1 06/12] drm/virtio: implement context init: track {ring_idx, emit_fence_info} in virtio_gpu_fence
Each fence should be associated with a [fence ID, fence_context, seqno]. The seqno is just the fence ID. To get the fence context, we add the ring_idx to the 3D context's base_fence_ctx. The ring_idx is between 0 and 31, inclusive. Each 3D context will have its own base_fence_ctx. The ring_idx will be emitted to host userspace when emit_fence_info is true. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 9996abf60e3a..401aec1a5efb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -139,7 +139,9 @@ struct virtio_gpu_fence_driver { struct virtio_gpu_fence { struct dma_fence f; + uint32_t ring_idx; uint64_t fence_id; + bool emit_fence_info; struct virtio_gpu_fence_driver *drv; struct list_head node; }; -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 04/12] drm/virtio: implement context init: probe for feature
From: Anthoine Bourgeois Let's probe for VIRTIO_GPU_F_CONTEXT_INIT. Create a new DRM_INFO(..) line since the current one is getting too long. Signed-off-by: Anthoine Bourgeois Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.h | 1 + drivers/gpu/drm/virtio/virtgpu_kms.c | 8 +++- 4 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_debugfs.c b/drivers/gpu/drm/virtio/virtgpu_debugfs.c index c2b20e0ee030..b6954e2f75e6 100644 --- a/drivers/gpu/drm/virtio/virtgpu_debugfs.c +++ b/drivers/gpu/drm/virtio/virtgpu_debugfs.c @@ -52,6 +52,7 @@ static int virtio_gpu_features(struct seq_file *m, void *data) vgdev->has_resource_assign_uuid); virtio_gpu_add_bool(m, "blob resources", vgdev->has_resource_blob); + virtio_gpu_add_bool(m, "context init", vgdev->has_context_init); virtio_gpu_add_int(m, "cap sets", vgdev->num_capsets); virtio_gpu_add_int(m, "scanouts", vgdev->num_scanouts); if (vgdev->host_visible_region.len) { diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c index ed85a7863256..9d963f1fda8f 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.c +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c @@ -172,6 +172,7 @@ static unsigned int features[] = { VIRTIO_GPU_F_EDID, VIRTIO_GPU_F_RESOURCE_UUID, VIRTIO_GPU_F_RESOURCE_BLOB, + VIRTIO_GPU_F_CONTEXT_INIT, }; static struct virtio_driver virtio_gpu_driver = { .feature_table = features, diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 3023e16be0d6..5e1958a522ff 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -236,6 +236,7 @@ struct virtio_gpu_device { bool has_resource_assign_uuid; bool has_resource_blob; bool has_host_visible; + bool has_context_init; struct virtio_shm_region host_visible_region; struct drm_mm host_visible_mm; diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c index 58a65121c200..21f410901694 100644 --- a/drivers/gpu/drm/virtio/virtgpu_kms.c +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c @@ -191,13 +191,19 @@ int virtio_gpu_init(struct drm_device *dev) (unsigned long)vgdev->host_visible_region.addr, (unsigned long)vgdev->host_visible_region.len); } + if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_CONTEXT_INIT)) { + vgdev->has_context_init = true; + } - DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible\n", + DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible", vgdev->has_virgl_3d? '+' : '-', vgdev->has_edid? '+' : '-', vgdev->has_resource_blob ? '+' : '-', vgdev->has_host_visible ? '+' : '-'); + DRM_INFO("features: %ccontext_init\n", +vgdev->has_context_init ? '+' : '-'); + ret = virtio_find_vqs(vgdev->vdev, 2, vqs, callbacks, names, NULL); if (ret) { DRM_ERROR("failed to find virt queues\n"); -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 03/12] drm/virtio: implement context init: track valid capabilities in a mask
The valid capability IDs are between 1 and 63, and are defined in the virtio-gpu spec. This is used for error checking in the subsequent patches. We're currently only using 2 capability IDs, so this should be plenty for the immediate future. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_kms.c | 18 +- 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 0c4810982530..3023e16be0d6 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -55,6 +55,8 @@ #define STATE_OK 1 #define STATE_ERR 2 +#define MAX_CAPSET_ID 63 + struct virtio_gpu_object_params { unsigned long size; bool dumb; @@ -245,6 +247,7 @@ struct virtio_gpu_device { struct virtio_gpu_drv_capset *capsets; uint32_t num_capsets; + uint64_t capset_id_mask; struct list_head cap_cache; /* protects uuid state when exporting */ diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c index f3379059f324..58a65121c200 100644 --- a/drivers/gpu/drm/virtio/virtgpu_kms.c +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c @@ -65,6 +65,7 @@ static void virtio_gpu_get_capsets(struct virtio_gpu_device *vgdev, int num_capsets) { int i, ret; + bool invalid_capset_id = false; vgdev->capsets = kcalloc(num_capsets, sizeof(struct virtio_gpu_drv_capset), @@ -78,19 +79,34 @@ static void virtio_gpu_get_capsets(struct virtio_gpu_device *vgdev, virtio_gpu_notify(vgdev); ret = wait_event_timeout(vgdev->resp_wq, vgdev->capsets[i].id > 0, 5 * HZ); - if (ret == 0) { + /* +* Capability ids are defined in the virtio-gpu spec and are +* between 1 to 63, inclusive. +*/ + if (!vgdev->capsets[i].id || + vgdev->capsets[i].id > MAX_CAPSET_ID) + invalid_capset_id = true; + + if (ret == 0) DRM_ERROR("timed out waiting for cap set %d\n", i); + else if (invalid_capset_id) + DRM_ERROR("invalid capset id %u", vgdev->capsets[i].id); + + if (ret == 0 || invalid_capset_id) { spin_lock(&vgdev->display_info_lock); kfree(vgdev->capsets); vgdev->capsets = NULL; spin_unlock(&vgdev->display_info_lock); return; } + + vgdev->capset_id_mask |= 1 << vgdev->capsets[i].id; DRM_INFO("cap set %d: id %d, max-version %d, max-size %d\n", i, vgdev->capsets[i].id, vgdev->capsets[i].max_version, vgdev->capsets[i].max_size); } + vgdev->num_capsets = num_capsets; } -- 2.33.0.153.gba50c8fa24-goog
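Sketch of the intended mask semantics (illustrative helper, not kernel code): bit N of capset_id_mask set means the host reported capability set id N.

#include <stdbool.h>
#include <stdint.h>

static bool capset_id_supported(uint64_t capset_id_mask, uint32_t id)
{
	return id >= 1 && id <= 63 && (capset_id_mask & (1ull << id));
}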
[PATCH v1 02/12] drm/virtgpu api: create context init feature
This change allows creating contexts depending on a set of context parameters. The meaning of each of the parameters is listed below: 1) VIRTGPU_CONTEXT_PARAM_CAPSET_ID This determines the type of a context based on the capability set ID. For example, the current capsets: VIRTIO_GPU_CAPSET_VIRGL VIRTIO_GPU_CAPSET_VIRGL2 define a Gallium, TGSI based "virgl" context. We only need 1 capset ID per context type, though virgl has two due to a bug that has since been fixed. The use case is the "gfxstream" rendering library and "venus" renderer. gfxstream doesn't do Gallium/TGSI translation and mostly relies on auto-generated API streaming. Certain users prefer gfxstream over virgl for GLES on GLES emulation. {gfxstream vk}/{venus} are also required for Vulkan emulation. The maximum capset ID is 63. The goal is for guest userspace to choose the optimal context type depending on the situation/hardware. 2) VIRTGPU_CONTEXT_PARAM_NUM_RINGS This tells the number of independent command rings that the context will use. This value may be zero and is inferred to be zero if VIRTGPU_CONTEXT_PARAM_NUM_RINGS is not passed in. This is for backwards compatibility for virgl, which has one big giant command ring for all commands. The maximum number of rings is 64. In practice, multi-queue or multi-ring submission is used for powerful dGPUs and virtio-gpu may not be the best option in that case (see PCI passthrough or rendernode forwarding). 3) VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK This is a mask of ring indices for which the DRM fd is pollable. For example, if VIRTGPU_CONTEXT_PARAM_NUM_RINGS is 2, then the mask may be:

[ring idx] | [1 << ring_idx] | final mask
-----------------------------------------
     0     |        1        |     1
     1     |        2        |     3

The "Sommelier" guest Wayland proxy uses this to poll for events from the host compositor. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang Acked-by: Nicholas Verne --- include/uapi/drm/virtgpu_drm.h | 27 +++ 1 file changed, 27 insertions(+) diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h index b9ec26e9c646..a13e20cc66b4 100644 --- a/include/uapi/drm/virtgpu_drm.h +++ b/include/uapi/drm/virtgpu_drm.h @@ -47,12 +47,15 @@ extern "C" { #define DRM_VIRTGPU_WAIT 0x08 #define DRM_VIRTGPU_GET_CAPS 0x09 #define DRM_VIRTGPU_RESOURCE_CREATE_BLOB 0x0a +#define DRM_VIRTGPU_CONTEXT_INIT 0x0b #define VIRTGPU_EXECBUF_FENCE_FD_IN0x01 #define VIRTGPU_EXECBUF_FENCE_FD_OUT 0x02 +#define VIRTGPU_EXECBUF_RING_IDX 0x04 #define VIRTGPU_EXECBUF_FLAGS (\ VIRTGPU_EXECBUF_FENCE_FD_IN |\ VIRTGPU_EXECBUF_FENCE_FD_OUT |\ + VIRTGPU_EXECBUF_RING_IDX |\ 0) struct drm_virtgpu_map { @@ -68,6 +71,8 @@ struct drm_virtgpu_execbuffer { __u64 bo_handles; __u32 num_bo_handles; __s32 fence_fd; /* in/out fence fd (see VIRTGPU_EXECBUF_FENCE_FD_IN/OUT) */ + __u32 ring_idx; /* command ring index (see VIRTGPU_EXECBUF_RING_IDX) */ + __u32 pad; }; #define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */ @@ -75,6 +80,8 @@ struct drm_virtgpu_execbuffer { #define VIRTGPU_PARAM_RESOURCE_BLOB 3 /* DRM_VIRTGPU_RESOURCE_CREATE_BLOB */ #define VIRTGPU_PARAM_HOST_VISIBLE 4 /* Host blob resources are mappable */ #define VIRTGPU_PARAM_CROSS_DEVICE 5 /* Cross virtio-device resource sharing */ +#define VIRTGPU_PARAM_CONTEXT_INIT 6 /* DRM_VIRTGPU_CONTEXT_INIT */ +#define VIRTGPU_PARAM_SUPPORTED_CAPSET_IDs 7 /* Bitmask of supported capability set ids */ struct drm_virtgpu_getparam { __u64 param; __u64 value; }; @@ -173,6 +180,22 @@ struct drm_virtgpu_resource_create_blob { __u64 blob_id; }; +#define VIRTGPU_CONTEXT_PARAM_CAPSET_ID 0x0001 +#define
VIRTGPU_CONTEXT_PARAM_NUM_RINGS 0x0002 +#define VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK 0x0003 +struct drm_virtgpu_context_set_param { + __u64 param; + __u64 value; +}; + +struct drm_virtgpu_context_init { + __u32 num_params; + __u32 pad; + + /* pointer to drm_virtgpu_context_set_param array */ + __u64 ctx_set_params; +}; + #define DRM_IOCTL_VIRTGPU_MAP \ DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_MAP, struct drm_virtgpu_map) @@ -212,6 +235,10 @@ struct drm_virtgpu_resource_create_blob { DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_RESOURCE_CREATE_BLOB, \ struct drm_virtgpu_resource_create_blob) +#define DRM_IOCTL_VIRTGPU_CONTEXT_INIT \ + DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_CONTEXT_INIT, \ + struct drm_virtgpu_context_init) + #if defined(__cplusplus) } #endif -- 2.33.0.153.gba50c8fa24-goog
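To make the mask table above concrete, a small hedged helper (illustrative only, not from the series) that builds the VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK value for the first num_rings rings; num_rings = 2 yields 0x3, matching the table.

#include <stdint.h>

static uint64_t poll_rings_mask(uint32_t num_rings)
{
	uint64_t mask = 0;
	uint32_t i;

	for (i = 0; i < num_rings; i++)
		mask |= 1ull << i;

	return mask;
}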
[PATCH v1 00/12] Context types
Version 1 of context types: https://lists.oasis-open.org/archives/virtio-dev/202108/msg00141.html Changes since RFC: * le32 info --> {u8 ring_idx + u8 padding[3]). * Max rings is now 64. Anthoine Bourgeois (2): drm/virtio: implement context init: probe for feature drm/virtio: implement context init: support init ioctl Gurchetan Singh (10): virtio-gpu api: multiple context types with explicit initialization drm/virtgpu api: create context init feature drm/virtio: implement context init: track valid capabilities in a mask drm/virtio: implement context init: track {ring_idx, emit_fence_info} in virtio_gpu_fence drm/virtio: implement context init: plumb {base_fence_ctx, ring_idx} to virtio_gpu_fence_alloc drm/virtio: implement context init: stop using drv->context when creating fence drm/virtio: implement context init: allocate an array of fence contexts drm/virtio: implement context init: handle VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK drm/virtio: implement context init: add virtio_gpu_fence_event drm/virtio: implement context init: advertise feature to userspace drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.c | 44 - drivers/gpu/drm/virtio/virtgpu_drv.h | 28 +++- drivers/gpu/drm/virtio/virtgpu_fence.c | 30 +++- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 195 +-- drivers/gpu/drm/virtio/virtgpu_kms.c | 26 ++- drivers/gpu/drm/virtio/virtgpu_plane.c | 3 +- drivers/gpu/drm/virtio/virtgpu_vq.c | 19 +-- include/uapi/drm/virtgpu_drm.h | 27 include/uapi/linux/virtio_gpu.h | 18 ++- 10 files changed, 355 insertions(+), 36 deletions(-) -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 01/12] virtio-gpu api: multiple context types with explicit initialization
This feature allows each virtio-gpu 3D context to be created with a "context_init" variable. This variable can specify: - the type of protocol used by the context via the capset id. This is useful for differentiating virgl, gfxstream, and venus protocols by host userspace. - other things in the future, such as the version of the context. In addition, each different context needs one or more timelines, so for example a virgl context's waiting can be independent of a gfxstream context's waiting. VIRTIO_GPU_FLAG_INFO_RING_IDX is introduced to tell the host which per-context command ring (or "hardware queue", distinct from the virtio-queue) the fence should be associated with. The new capability sets (gfxstream, venus etc.) are only defined in the virtio-gpu spec and not defined in the header. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- include/uapi/linux/virtio_gpu.h | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/include/uapi/linux/virtio_gpu.h b/include/uapi/linux/virtio_gpu.h index 97523a95781d..b0e3d91dfab7 100644 --- a/include/uapi/linux/virtio_gpu.h +++ b/include/uapi/linux/virtio_gpu.h @@ -59,6 +59,11 @@ * VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB */ #define VIRTIO_GPU_F_RESOURCE_BLOB 3 +/* + * VIRTIO_GPU_CMD_CREATE_CONTEXT with + * context_init and multiple timelines + */ +#define VIRTIO_GPU_F_CONTEXT_INIT4 enum virtio_gpu_ctrl_type { VIRTIO_GPU_UNDEFINED = 0, @@ -122,14 +127,20 @@ enum virtio_gpu_shm_id { VIRTIO_GPU_SHM_ID_HOST_VISIBLE = 1 }; -#define VIRTIO_GPU_FLAG_FENCE (1 << 0) +#define VIRTIO_GPU_FLAG_FENCE (1 << 0) +/* + * If the following flag is set, then ring_idx contains the index + * of the command ring that needs to used when creating the fence + */ +#define VIRTIO_GPU_FLAG_INFO_RING_IDX (1 << 1) struct virtio_gpu_ctrl_hdr { __le32 type; __le32 flags; __le64 fence_id; __le32 ctx_id; - __le32 padding; + u8 ring_idx; + u8 padding[3]; }; /* data passed in the cursor vq */ @@ -269,10 +280,11 @@ struct virtio_gpu_resource_create_3d { }; /* VIRTIO_GPU_CMD_CTX_CREATE */ +#define VIRTIO_GPU_CONTEXT_INIT_CAPSET_ID_MASK 0x00ff struct virtio_gpu_ctx_create { struct virtio_gpu_ctrl_hdr hdr; __le32 nlen; - __le32 padding; + __le32 context_init; char debug_name[64]; }; -- 2.33.0.153.gba50c8fa24-goog
[PATCH v2] drm/plane-helper: fix uninitialized variable reference
drivers/gpu/drm/drm_plane_helper.c: In function 'drm_primary_helper_update': drivers/gpu/drm/drm_plane_helper.c:113:32: error: 'visible' is used uninitialized [-Werror=uninitialized] 113 | struct drm_plane_state plane_state = { |^~~ drivers/gpu/drm/drm_plane_helper.c:178:14: note: 'visible' was declared here 178 | bool visible; | ^~~ cc1: all warnings being treated as errors visible is an output, not an input. in practice this use might turn out OK but it's still UB. Fixes: df86af9133 ("drm/plane-helper: Add drm_plane_helper_check_state()") Signed-off-by: Alex Xu (Hello71) --- drivers/gpu/drm/drm_plane_helper.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/drm_plane_helper.c b/drivers/gpu/drm/drm_plane_helper.c index 5b2d0ca03705..838b32b70bce 100644 --- a/drivers/gpu/drm/drm_plane_helper.c +++ b/drivers/gpu/drm/drm_plane_helper.c @@ -123,7 +123,6 @@ static int drm_plane_helper_check_update(struct drm_plane *plane, .crtc_w = drm_rect_width(dst), .crtc_h = drm_rect_height(dst), .rotation = rotation, - .visible = *visible, }; struct drm_crtc_state crtc_state = { .crtc = crtc, -- 2.33.0
[PATCH 1/4] drm/i915: rename debugfs_gt files
We shouldn't be using debugfs_ namespace for this functionality. Rename debugfs_gt.[ch] to intel_gt_debugfs.[ch] and then make functions, defines and structs follow suit. While at it and since we are renaming the header, sort the includes alphabetically. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/Makefile | 2 +- drivers/gpu/drm/i915/gt/debugfs_engines.c | 6 +++--- drivers/gpu/drm/i915/gt/debugfs_gt_pm.c| 14 +++--- drivers/gpu/drm/i915/gt/intel_gt.c | 6 +++--- .../gt/{debugfs_gt.c => intel_gt_debugfs.c}| 8 .../gt/{debugfs_gt.h => intel_gt_debugfs.h}| 14 +++--- drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 10 +- drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c | 18 +- .../gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c | 8 drivers/gpu/drm/i915/gt/uc/intel_huc_debugfs.c | 6 +++--- drivers/gpu/drm/i915/gt/uc/intel_uc_debugfs.c | 6 +++--- 11 files changed, 49 insertions(+), 49 deletions(-) rename drivers/gpu/drm/i915/gt/{debugfs_gt.c => intel_gt_debugfs.c} (87%) rename drivers/gpu/drm/i915/gt/{debugfs_gt.h => intel_gt_debugfs.h} (71%) diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index c36c8a4f0716..3e171f0b5f6a 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -80,7 +80,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o # "Graphics Technology" (aka we talk to the gpu) gt-y += \ gt/debugfs_engines.o \ - gt/debugfs_gt.o \ gt/debugfs_gt_pm.o \ gt/gen2_engine_cs.o \ gt/gen6_engine_cs.o \ @@ -101,6 +100,7 @@ gt-y += \ gt/intel_gt.o \ gt/intel_gt_buffer_pool.o \ gt/intel_gt_clock_utils.o \ + gt/intel_gt_debugfs.o \ gt/intel_gt_irq.o \ gt/intel_gt_pm.o \ gt/intel_gt_pm_irq.o \ diff --git a/drivers/gpu/drm/i915/gt/debugfs_engines.c b/drivers/gpu/drm/i915/gt/debugfs_engines.c index 5e3725e62241..2980dac5b171 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_engines.c +++ b/drivers/gpu/drm/i915/gt/debugfs_engines.c @@ -7,9 +7,9 @@ #include #include "debugfs_engines.h" -#include "debugfs_gt.h" #include "i915_drv.h" /* for_each_engine! 
*/ #include "intel_engine.h" +#include "intel_gt_debugfs.h" static int engines_show(struct seq_file *m, void *data) { @@ -24,11 +24,11 @@ static int engines_show(struct seq_file *m, void *data) return 0; } -DEFINE_GT_DEBUGFS_ATTRIBUTE(engines); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(engines); void debugfs_engines_register(struct intel_gt *gt, struct dentry *root) { - static const struct debugfs_gt_file files[] = { + static const struct intel_gt_debugfs_file files[] = { { "engines", &engines_fops }, }; diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c index f6733f279890..9222cf68c56c 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c @@ -6,11 +6,11 @@ #include -#include "debugfs_gt.h" #include "debugfs_gt_pm.h" #include "i915_drv.h" #include "intel_gt.h" #include "intel_gt_clock_utils.h" +#include "intel_gt_debugfs.h" #include "intel_gt_pm.h" #include "intel_llc.h" #include "intel_rc6.h" @@ -36,7 +36,7 @@ static int fw_domains_show(struct seq_file *m, void *data) return 0; } -DEFINE_GT_DEBUGFS_ATTRIBUTE(fw_domains); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(fw_domains); static void print_rc6_res(struct seq_file *m, const char *title, @@ -238,7 +238,7 @@ static int drpc_show(struct seq_file *m, void *unused) return err; } -DEFINE_GT_DEBUGFS_ATTRIBUTE(drpc); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(drpc); static int frequency_show(struct seq_file *m, void *unused) { @@ -480,7 +480,7 @@ static int frequency_show(struct seq_file *m, void *unused) return 0; } -DEFINE_GT_DEBUGFS_ATTRIBUTE(frequency); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(frequency); static int llc_show(struct seq_file *m, void *data) { @@ -533,7 +533,7 @@ static bool llc_eval(void *data) return HAS_LLC(gt->i915); } -DEFINE_GT_DEBUGFS_ATTRIBUTE(llc); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(llc); static const char *rps_power_to_str(unsigned int power) { @@ -612,11 +612,11 @@ static bool rps_eval(void *data) return HAS_RPS(gt->i915); } -DEFINE_GT_DEBUGFS_ATTRIBUTE(rps_boost); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rps_boost); void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root) { - static const struct debugfs_gt_file files[] = { + static const struct intel_gt_debugfs_file files[] = { { "drpc", &drpc_fops, NULL }, { "frequency", &frequency_fops, NULL }, { "forcewake", &fw_domains_fops, NULL }, diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 2aeaae036a6f..9dda17553e12 100644 --- a/drivers/gpu/drm
[PATCH 3/4] drm/i915: rename debugfs_gt_pm files
We shouldn't be using debugfs_ namespace for this functionality. Rename debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make functions, defines and structs follow suit. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/Makefile | 2 +- drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 -- drivers/gpu/drm/i915/gt/intel_gt_debugfs.c | 4 ++-- .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} | 4 ++-- drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h | 14 ++ 5 files changed, 19 insertions(+), 19 deletions(-) delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} (99%) create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 232c9673a2e5..dd656f2d7721 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o # "Graphics Technology" (aka we talk to the gpu) gt-y += \ - gt/debugfs_gt_pm.o \ gt/gen2_engine_cs.o \ gt/gen6_engine_cs.o \ gt/gen6_ppgtt.o \ @@ -103,6 +102,7 @@ gt-y += \ gt/intel_gt_engines_debugfs.o \ gt/intel_gt_irq.o \ gt/intel_gt_pm.o \ + gt/intel_gt_pm_debugfs.o \ gt/intel_gt_pm_irq.o \ gt/intel_gt_requests.o \ gt/intel_gtt.o \ diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h deleted file mode 100644 index 4cf5f5c9da7d.. --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h +++ /dev/null @@ -1,14 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2019 Intel Corporation - */ - -#ifndef DEBUGFS_GT_PM_H -#define DEBUGFS_GT_PM_H - -struct intel_gt; -struct dentry; - -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root); - -#endif /* DEBUGFS_GT_PM_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c index e5d173c235a3..4096ee893b69 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c @@ -5,10 +5,10 @@ #include -#include "debugfs_gt_pm.h" #include "i915_drv.h" #include "intel_gt_debugfs.h" #include "intel_gt_engines_debugfs.h" +#include "intel_gt_pm_debugfs.h" #include "intel_sseu_debugfs.h" #include "uc/intel_uc_debugfs.h" @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt *gt) return; intel_gt_engines_register_debugfs(gt, root); - debugfs_gt_pm_register(gt, root); + intel_gt_pm_register_debugfs(gt, root); intel_sseu_debugfs_register(gt, root); intel_uc_debugfs_register(>->uc, root); diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c similarity index 99% rename from drivers/gpu/drm/i915/gt/debugfs_gt_pm.c rename to drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c index 9222cf68c56c..baca153c05dd 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c @@ -6,12 +6,12 @@ #include -#include "debugfs_gt_pm.h" #include "i915_drv.h" #include "intel_gt.h" #include "intel_gt_clock_utils.h" #include "intel_gt_debugfs.h" #include "intel_gt_pm.h" +#include "intel_gt_pm_debugfs.h" #include "intel_llc.h" #include "intel_rc6.h" #include "intel_rps.h" @@ -614,7 +614,7 @@ static bool rps_eval(void *data) DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rps_boost); -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root) +void intel_gt_pm_register_debugfs(struct intel_gt *gt, struct dentry *root) { static const struct intel_gt_debugfs_file files[] = { { "drpc", &drpc_fops, NULL }, 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h new file mode 100644 index ..f44894579604 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2019 Intel Corporation + */ + +#ifndef INTEL_GT_PM_DEBUGFS_H +#define INTEL_GT_PM_DEBUGFS_H + +struct intel_gt; +struct dentry; + +void intel_gt_pm_register_debugfs(struct intel_gt *gt, struct dentry *root); + +#endif /* INTEL_GT_PM_DEBUGFS_H */ -- 2.32.0
[PATCH 2/4] drm/i915: rename debugfs_engines files
We shouldn't be using debugfs_ namespace for this functionality. Rename debugfs_engines.[ch] to intel_gt_engines_debugfs.[ch] and then make functions, defines and structs follow suit. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/Makefile | 2 +- drivers/gpu/drm/i915/gt/debugfs_engines.h | 14 -- drivers/gpu/drm/i915/gt/intel_gt_debugfs.c | 4 ++-- ...ebugfs_engines.c => intel_gt_engines_debugfs.c} | 4 ++-- drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h | 14 ++ 5 files changed, 19 insertions(+), 19 deletions(-) delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_engines.h rename drivers/gpu/drm/i915/gt/{debugfs_engines.c => intel_gt_engines_debugfs.c} (85%) create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 3e171f0b5f6a..232c9673a2e5 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o # "Graphics Technology" (aka we talk to the gpu) gt-y += \ - gt/debugfs_engines.o \ gt/debugfs_gt_pm.o \ gt/gen2_engine_cs.o \ gt/gen6_engine_cs.o \ @@ -101,6 +100,7 @@ gt-y += \ gt/intel_gt_buffer_pool.o \ gt/intel_gt_clock_utils.o \ gt/intel_gt_debugfs.o \ + gt/intel_gt_engines_debugfs.o \ gt/intel_gt_irq.o \ gt/intel_gt_pm.o \ gt/intel_gt_pm_irq.o \ diff --git a/drivers/gpu/drm/i915/gt/debugfs_engines.h b/drivers/gpu/drm/i915/gt/debugfs_engines.h deleted file mode 100644 index f69257eaa1cc.. --- a/drivers/gpu/drm/i915/gt/debugfs_engines.h +++ /dev/null @@ -1,14 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2019 Intel Corporation - */ - -#ifndef DEBUGFS_ENGINES_H -#define DEBUGFS_ENGINES_H - -struct intel_gt; -struct dentry; - -void debugfs_engines_register(struct intel_gt *gt, struct dentry *root); - -#endif /* DEBUGFS_ENGINES_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c index a27ba11605d8..e5d173c235a3 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c @@ -5,10 +5,10 @@ #include -#include "debugfs_engines.h" #include "debugfs_gt_pm.h" #include "i915_drv.h" #include "intel_gt_debugfs.h" +#include "intel_gt_engines_debugfs.h" #include "intel_sseu_debugfs.h" #include "uc/intel_uc_debugfs.h" @@ -23,7 +23,7 @@ void intel_gt_register_debugfs(struct intel_gt *gt) if (IS_ERR(root)) return; - debugfs_engines_register(gt, root); + intel_gt_engines_register_debugfs(gt, root); debugfs_gt_pm_register(gt, root); intel_sseu_debugfs_register(gt, root); diff --git a/drivers/gpu/drm/i915/gt/debugfs_engines.c b/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.c similarity index 85% rename from drivers/gpu/drm/i915/gt/debugfs_engines.c rename to drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.c index 2980dac5b171..44b22384fcb2 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_engines.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.c @@ -6,10 +6,10 @@ #include -#include "debugfs_engines.h" #include "i915_drv.h" /* for_each_engine! 
*/ #include "intel_engine.h" #include "intel_gt_debugfs.h" +#include "intel_gt_engines_debugfs.h" static int engines_show(struct seq_file *m, void *data) { @@ -26,7 +26,7 @@ static int engines_show(struct seq_file *m, void *data) } DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(engines); -void debugfs_engines_register(struct intel_gt *gt, struct dentry *root) +void intel_gt_engines_register_debugfs(struct intel_gt *gt, struct dentry *root) { static const struct intel_gt_debugfs_file files[] = { { "engines", &engines_fops }, diff --git a/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h b/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h new file mode 100644 index ..4163b496937b --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2019 Intel Corporation + */ + +#ifndef INTEL_GT_ENGINES_DEBUGFS_H +#define INTEL_GT_ENGINES_DEBUGFS_H + +struct intel_gt; +struct dentry; + +void intel_gt_engines_register_debugfs(struct intel_gt *gt, struct dentry *root); + +#endif /* INTEL_GT_ENGINES_DEBUGFS_H */ -- 2.32.0
[PATCH 4/4] drm/i915: deduplicate frequency dump on debugfs
Although commit 9dd4b065446a ("drm/i915/gt: Move pm debug files into a gt aware debugfs") says it was moving debug files to gt/, the i915_frequency_info file was left behind and its implementation copied into drivers/gpu/drm/i915/gt/debugfs_gt_pm.c. Over time we had several patches having to change both places to keep them in sync (and some patches failing to do so). The initial idea was to remove i915_frequency_info, but there are user space tools using it. From a quick code search there are other scripts and test tools besides igt, so it's not simply updating igt to get rid of the older file. Here we export a function using drm_printer as parameter and make both show() implementations to call this same function. Aside from a few variable name differences, for i915_frequency_info this brings a few lines that were not previously printed: RP UP EI, RP UP THRESHOLD, RP DOWN THRESHOLD and RP DOWN EI. These came in as part of commit 9c878557b1eb ("drm/i915/gt: Use the RPM config register to determine clk frequencies"), which didn't change both places. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c | 127 +- drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h | 2 + drivers/gpu/drm/i915/i915_debugfs.c | 231 +- 3 files changed, 76 insertions(+), 284 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c index baca153c05dd..31d334d3b3b5 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c @@ -240,9 +240,8 @@ static int drpc_show(struct seq_file *m, void *unused) } DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(drpc); -static int frequency_show(struct seq_file *m, void *unused) +void intel_gt_pm_frequency_dump(struct intel_gt *gt, struct drm_printer *p) { - struct intel_gt *gt = m->private; struct drm_i915_private *i915 = gt->i915; struct intel_uncore *uncore = gt->uncore; struct intel_rps *rps = >->rps; @@ -254,21 +253,21 @@ static int frequency_show(struct seq_file *m, void *unused) u16 rgvswctl = intel_uncore_read16(uncore, MEMSWCTL); u16 rgvstat = intel_uncore_read16(uncore, MEMSTAT_ILK); - seq_printf(m, "Requested P-state: %d\n", (rgvswctl >> 8) & 0xf); - seq_printf(m, "Requested VID: %d\n", rgvswctl & 0x3f); - seq_printf(m, "Current VID: %d\n", (rgvstat & MEMSTAT_VID_MASK) >> + drm_printf(p, "Requested P-state: %d\n", (rgvswctl >> 8) & 0xf); + drm_printf(p, "Requested VID: %d\n", rgvswctl & 0x3f); + drm_printf(p, "Current VID: %d\n", (rgvstat & MEMSTAT_VID_MASK) >> MEMSTAT_VID_SHIFT); - seq_printf(m, "Current P-state: %d\n", + drm_printf(p, "Current P-state: %d\n", (rgvstat & MEMSTAT_PSTATE_MASK) >> MEMSTAT_PSTATE_SHIFT); } else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) { u32 rpmodectl, freq_sts; rpmodectl = intel_uncore_read(uncore, GEN6_RP_CONTROL); - seq_printf(m, "Video Turbo Mode: %s\n", + drm_printf(p, "Video Turbo Mode: %s\n", yesno(rpmodectl & GEN6_RP_MEDIA_TURBO)); - seq_printf(m, "HW control enabled: %s\n", + drm_printf(p, "HW control enabled: %s\n", yesno(rpmodectl & GEN6_RP_ENABLE)); - seq_printf(m, "SW control enabled: %s\n", + drm_printf(p, "SW control enabled: %s\n", yesno((rpmodectl & GEN6_RP_MEDIA_MODE_MASK) == GEN6_RP_MEDIA_SW_MODE)); @@ -276,25 +275,25 @@ static int frequency_show(struct seq_file *m, void *unused) freq_sts = vlv_punit_read(i915, PUNIT_REG_GPU_FREQ_STS); vlv_punit_put(i915); - seq_printf(m, "PUNIT_REG_GPU_FREQ_STS: 0x%08x\n", freq_sts); - seq_printf(m, "DDR freq: %d MHz\n", i915->mem_freq); + drm_printf(p, 
"PUNIT_REG_GPU_FREQ_STS: 0x%08x\n", freq_sts); + drm_printf(p, "DDR freq: %d MHz\n", i915->mem_freq); - seq_printf(m, "actual GPU freq: %d MHz\n", + drm_printf(p, "actual GPU freq: %d MHz\n", intel_gpu_freq(rps, (freq_sts >> 8) & 0xff)); - seq_printf(m, "current GPU freq: %d MHz\n", + drm_printf(p, "current GPU freq: %d MHz\n", intel_gpu_freq(rps, rps->cur_freq)); - seq_printf(m, "max GPU freq: %d MHz\n", + drm_printf(p, "max GPU freq: %d MHz\n", intel_gpu_freq(rps, rps->max_freq)); - seq_printf(m, "min GPU freq: %d MHz\n", + drm_printf(p, "min GPU freq: %d MHz\n", intel_gpu_freq(rps, rps->min_freq));
Re: [PATCH v3 00/16] eDP: Support probing eDP panels dynamically instead of hardcoding
Hi, On Sun, Sep 5, 2021 at 11:55 AM Sam Ravnborg wrote: > > Hi Douglas, > > On Wed, Sep 01, 2021 at 01:19:18PM -0700, Douglas Anderson wrote: > > The goal of this patch series is to move away from hardcoding exact > > eDP panels in device tree files. As discussed in the various patches > > in this series (I'm not repeating everything here), most eDP panels > > are 99% probable and we can get that last 1% by allowing two "power > > up" delays to be specified in the device tree file and then using the > > panel ID (found in the EDID) to look up additional power sequencing > > delays for the panel. > > > > This patch series is the logical contiunation of a previous patch > > series where I proposed solving this problem by adding a > > board-specific compatible string [1]. In the discussion that followed > > it sounded like people were open to something like the solution > > proposed in this new series. > > > > In version 2 I got rid of the idea that we could have a "fallback" > > compatible string that we'd use if we didn't recognize the ID in the > > EDID. This simplifies the bindings a lot and the implementation > > somewhat. As a result of not having a "fallback", though, I'm not > > confident in transitioning any existing boards over to this since > > we'll have to fallback to very conservative timings if we don't > > recognize the ID from the EDID and I can't guarantee that I've seen > > every panel that might have shipped on an existing product. The plan > > is to use "edp-panel" only on new boards or new revisions of old > > boards where we can guarantee that every EDID that ships out of the > > factory has an ID in the table. > > > > Version 3 of this series now splits out all eDP panels to their own > > driver and adds the generic eDP panel support to this new driver. I > > believe this is what Sam was looking for [2]. > > > > [1] https://lore.kernel.org/r/yfkqaxomowyye...@google.com/ > > [2] https://lore.kernel.org/r/yrtsfntn%2ft8fl...@ravnborg.org/ > > > > Changes in v3: > > - Decode hex product ID w/ same endianness as everyone else. > > - ("Reorder logicpd_type_28...") patch new for v3. > > - Split eDP panels patch new for v3. > > - Move wayward panels patch new for v3. > > - ("Non-eDP panels don't need "HPD" handling") new for v3. > > - Split the delay structure out patch just on eDP now. > > - ("Better describe eDP panel delays") new for v3. > > - Fix "prepare_to_enable" patch new for v3. > > - ("Don't re-read the EDID every time") moved to eDP only patch. > > - Generic "edp-panel" handled by the eDP panel driver now. > > - Change init order to we power at the end. > > - Adjust endianness of product ID. > > - Fallback to conservative delays if panel not recognized. > > - Add Sharp LQ116M1JW10 to table. > > - Add AUO B116XAN06.1 to table. > > - Rename delays more generically so they can be reused. > > > > Changes in v2: > > - No longer allow fallback to panel-simple. > > - Add "-ms" suffix to delays. > > - Don't support a "fallback" panel. Probed panels must be probed. > > - Not based on patch to copy "desc"--just allocate for probed panels. > > - Add "-ms" suffix to delays. 
> > > > Douglas Anderson (16): > > dt-bindings: drm/panel-simple-edp: Introduce generic eDP panels > > drm/edid: Break out reading block 0 of the EDID > > drm/edid: Allow the querying/working with the panel ID from the EDID > > drm/panel-simple: Reorder logicpd_type_28 / mitsubishi_aa070mc01 > > drm/panel-simple-edp: Split eDP panels out of panel-simple > > ARM: configs: Everyone who had PANEL_SIMPLE now gets PANEL_SIMPLE_EDP > > arm64: defconfig: Everyone who had PANEL_SIMPLE now gets > > PANEL_SIMPLE_EDP > > MIPS: configs: Everyone who had PANEL_SIMPLE now gets PANEL_SIMPLE_EDP > > drm/panel-simple-edp: Move some wayward panels to the eDP driver > > drm/panel-simple: Non-eDP panels don't need "HPD" handling > > drm/panel-simple-edp: Split the delay structure out > > drm/panel-simple-edp: Better describe eDP panel delays > > drm/panel-simple-edp: hpd_reliable shouldn't be subtracted from > > hpd_absent > > drm/panel-simple-edp: Fix "prepare_to_enable" if panel doesn't handle > > HPD > > drm/panel-simple-edp: Don't re-read the EDID every time we power off > > the panel > > drm/panel-simple-edp: Implement generic "edp-panel"s probed by EDID > > Thanks for looking into this. I really like the outcome. > We have panel-simple that now (mostly) handles simple panels, > and thus all the eDP functionality is in a separate driver. > > I have provided a few nits. > My only take on this is the naming - as we do not want to confuse > panel-simple and panel-edp I strongly suggest renaming the driver to > panel-edp. Sure, I'll do that. I was trying to express the fact that the new "panel-edp" driver won't actually handle _all_ eDP panels, only the eDP panels that are (comparatively) simpler. For instance, I'm not planning to handle panel-samsung-atna33xc20.c in "panel-edp". I guess people
Re: [PATCH v3 03/16] drm/edid: Allow the querying/working with the panel ID from the EDID
Hi, On Mon, Sep 6, 2021 at 3:05 AM Jani Nikula wrote: > > > +{ > > + struct edid *edid; > > + u32 val; > > + > > + edid = drm_do_get_edid_blk0(drm_do_probe_ddc_edid, adapter, NULL, > > NULL); > > + > > + /* > > + * There are no manufacturer IDs of 0, so if there is a problem > > reading > > + * the EDID then we'll just return 0. > > + */ > > + if (IS_ERR_OR_NULL(edid)) > > + return 0; > > + > > + /* > > + * In theory we could try to de-obfuscate this like edid_get_quirks() > > + * does, but it's easier to just deal with a 32-bit number. > > Hmm, but is it, really? AFAICT this is just an internal representation > for a table, where it could just as well be stored in a struct that > could be just as compact now, but extensible later. You populate the > table via an encoding macro, then decode the id using a function - while > it could be in a format that's directly usable without the decode. If > suitably chosen, the struct could perhaps be reused between the quirks > code and your code. I'm not 100% sure, but I think you're suggesting having this function return a `struct edid_panel_id` or something like that. Is that right? Maybe that would look something like this? struct edid_panel_id { char vendor[4]; u16 product_id; } ...or perhaps this (untested, but I think it works): struct edid_panel_id { u16 vend_c1:5; u16 vend_c2:5; u16 vend_c3:5; u16 product_id; } ...and then change `struct edid_quirk` to something like this: static const struct edid_quirk { struct edid_panel_id panel_id; u32 quirks; } ... Is that correct? There are a few downsides that I can see: a) I think the biggest downside is the inability to compare with "==". I don't believe it's legal to compare structs with "==" in C. Yeah, we can use memcmp() but that feels more awkward to me. b) Unless you use the bitfield approach, it takes up more space. I know it's not a huge deal, but the format in the EDID is pretty much _forced_ to fit in 32-bits. The bitfield approach seems like it'd be more awkward than my encoding macros. -Doug
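For reference, a 32-bit encoding along the lines Doug describes could look like the sketch below; the macro and table names are illustrative only (they are not the ones from this series) and the product IDs are placeholders. The three vendor letters are 5-bit values ('A' encodes as 1), so the whole panel ID packs into a u32 that can be compared directly with ==:

#include <linux/types.h>

/* Illustrative encoding macro: 3 x 5-bit vendor letters + 16-bit product ID. */
#define EXAMPLE_EDID_PANEL_ID(c0, c1, c2, prod)		\
	((((u32)(c0) - '@') & 0x1f) << 26 |		\
	 (((u32)(c1) - '@') & 0x1f) << 21 |		\
	 (((u32)(c2) - '@') & 0x1f) << 16 |		\
	 ((u32)(prod) & 0xffff))

static const struct example_panel_entry {
	u32 panel_id;
	/* per-panel power sequencing delays would live here */
} example_panel_table[] = {
	{ .panel_id = EXAMPLE_EDID_PANEL_ID('A', 'U', 'O', 0x1234) },
	{ .panel_id = EXAMPLE_EDID_PANEL_ID('S', 'H', 'P', 0xabcd) },
};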
Re: [PATCH v5 11/16] drm/mediatek: add display MDP RDMA support for MT8195
Hi, Nancy: Nancy.Lin wrote on Mon, Sep 6, 2021 at 3:15 PM: > > Add MDP_RDMA driver for MT8195. MDP_RDMA is the DMA engine of > the ovl_adaptor component. > > Signed-off-by: Nancy.Lin > --- > drivers/gpu/drm/mediatek/Makefile | 3 +- > drivers/gpu/drm/mediatek/mtk_disp_drv.h | 7 + > drivers/gpu/drm/mediatek/mtk_mdp_rdma.c | 301 > drivers/gpu/drm/mediatek/mtk_mdp_rdma.h | 37 +++ > 4 files changed, 347 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/mediatek/mtk_mdp_rdma.c > create mode 100644 drivers/gpu/drm/mediatek/mtk_mdp_rdma.h > > diff --git a/drivers/gpu/drm/mediatek/Makefile > b/drivers/gpu/drm/mediatek/Makefile > index a38e88e82d12..6e604a933ed0 100644 > --- a/drivers/gpu/drm/mediatek/Makefile > +++ b/drivers/gpu/drm/mediatek/Makefile > @@ -13,7 +13,8 @@ mediatek-drm-y := mtk_disp_aal.o \ > mtk_drm_gem.o \ > mtk_drm_plane.o \ > mtk_dsi.o \ > - mtk_dpi.o > + mtk_dpi.o \ > + mtk_mdp_rdma.o > > obj-$(CONFIG_DRM_MEDIATEK) += mediatek-drm.o > > diff --git a/drivers/gpu/drm/mediatek/mtk_disp_drv.h > b/drivers/gpu/drm/mediatek/mtk_disp_drv.h > index a33b13fe2b6e..b3a372cab0bd 100644 > --- a/drivers/gpu/drm/mediatek/mtk_disp_drv.h > +++ b/drivers/gpu/drm/mediatek/mtk_disp_drv.h > @@ -8,6 +8,7 @@ > > #include > #include "mtk_drm_plane.h" > +#include "mtk_mdp_rdma.h" > > int mtk_aal_clk_enable(struct device *dev); > void mtk_aal_clk_disable(struct device *dev); > @@ -106,4 +107,10 @@ void mtk_rdma_enable_vblank(struct device *dev, > void *vblank_cb_data); > void mtk_rdma_disable_vblank(struct device *dev); > > +int mtk_mdp_rdma_clk_enable(struct device *dev); > +void mtk_mdp_rdma_clk_disable(struct device *dev); > +void mtk_mdp_rdma_start(struct device *dev, struct cmdq_pkt *cmdq_pkt); > +void mtk_mdp_rdma_stop(struct device *dev, struct cmdq_pkt *cmdq_pkt); > +void mtk_mdp_rdma_config(struct device *dev, struct mtk_mdp_rdma_cfg *cfg, > +struct cmdq_pkt *cmdq_pkt); > #endif > diff --git a/drivers/gpu/drm/mediatek/mtk_mdp_rdma.c > b/drivers/gpu/drm/mediatek/mtk_mdp_rdma.c > new file mode 100644 > index ..052434d960b9 > --- /dev/null > +++ b/drivers/gpu/drm/mediatek/mtk_mdp_rdma.c > @@ -0,0 +1,301 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Copyright (c) 2021 MediaTek Inc. > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "mtk_drm_drv.h" > +#include "mtk_disp_drv.h" > +#include "mtk_mdp_rdma.h" > + > +#define MDP_RDMA_EN 0x000 > + #define FLD_ROT_ENABLE BIT(0) Maybe my description is not good; I like the style of the rdma driver [1].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/mediatek/mtk_disp_rdma.c?h=v5.14 > + > +#define MDP_RDMA_RESET 0x008 > + > +#define MDP_RDMA_CON 0x020 > + #define FLD_OUTPUT_10B BIT(5) > + #define FLD_SIMPLE_MODE BIT(4) > + > +#define MDP_RDMA_GMCIF_CON 0x028 > + #define FLD_COMMAND_DIV BIT(0) > + #define FLD_EXT_PREULTRA_EN BIT(3) > + #define FLD_RD_REQ_TYPE GENMASK(7, 4) > + #define VAL_RD_REQ_TYPE_BURST_8_ACCESS 7 > + #define FLD_ULTRA_EN GENMASK(13, 12) > + #define VAL_ULTRA_EN_ENABLE 1 > + #define FLD_PRE_ULTRA_EN GENMASK(17, 16) > + #define VAL_PRE_ULTRA_EN_ENABLE 1 > + #define FLD_EXT_ULTRA_EN BIT(18) > + > +#define MDP_RDMA_SRC_CON 0x030 > + #define FLD_OUTPUT_ARGB BIT(25) > + #define FLD_BIT_NUMBER GENMASK(19, 18) > + #define FLD_UNIFORM_CONFIG BIT(17) > + #define FLD_SWAP BIT(14) > + #define FLD_SRC_FORMAT GENMASK(3, 0) > + > +#define MDP_RDMA_COMP_CON 0x038 > + #define FLD_AFBC_EN BIT(22) > + #define FLD_AFBC_YUV_TRANSFORM BIT(21) > + #define FLD_UFBDC_EN BIT(12) > + > +#define MDP_RDMA_MF_BKGD_SIZE_IN_BYTE 0x060 > + #define FLD_MF_BKGD_WB GENMASK(22, 0) > + > +#define MDP_RDMA_MF_SRC_SIZE 0x070 > + #define FLD_MF_SRC_H GENMASK(30, 16) > + #define FLD_MF_SR
Re: [PATCH v5 25/25] drm/i915/guc: Add GuC kernel doc
On 9/3/2021 12:59, Daniele Ceraolo Spurio wrote: From: Matthew Brost Add GuC kernel doc for all structures added thus far for GuC submission and update the main GuC submission section with the new interface details. v2: - Drop guc_active.lock DOC v3: - Fixup a few kernel doc comments (Daniele) v4 (Daniele): - Implement doc suggestions from John - Add kerneldoc for all members of the GuC structure and pull the file in i915.rst v5 (Daniele): - Implement new doc suggestions from John Signed-off-by: Matthew Brost Signed-off-by: Daniele Ceraolo Spurio Cc: John Harrison Reviewed-by: John Harrison --- Documentation/gpu/i915.rst| 2 + drivers/gpu/drm/i915/gt/intel_context_types.h | 43 +--- drivers/gpu/drm/i915/gt/uc/intel_guc.h| 75 ++--- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 100 ++ drivers/gpu/drm/i915/i915_request.h | 21 ++-- 5 files changed, 181 insertions(+), 60 deletions(-) diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst index 101dde3eb1ea..311e10400708 100644 --- a/Documentation/gpu/i915.rst +++ b/Documentation/gpu/i915.rst @@ -495,6 +495,8 @@ GuC .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_guc.c :doc: GuC +.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_guc.h + GuC Firmware Layout ~~~ diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 5285d660eacf..930569a1a01f 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -156,40 +156,51 @@ struct intel_context { u8 wa_bb_page; /* if set, page num reserved for context workarounds */ struct { - /** lock: protects everything in guc_state */ + /** @lock: protects everything in guc_state */ spinlock_t lock; /** -* sched_state: scheduling state of this context using GuC +* @sched_state: scheduling state of this context using GuC * submission */ u32 sched_state; /* -* fences: maintains of list of requests that have a submit -* fence related to GuC submission +* @fences: maintains a list of requests that are currently +* being fenced until a GuC operation completes */ struct list_head fences; - /* GuC context blocked fence */ + /** +* @blocked: fence used to signal when the blocking of a +* context's submissions is complete. 
+*/ struct i915_sw_fence blocked; - /* GuC committed requests */ + /** @number_committed_requests: number of committed requests */ int number_committed_requests; - /** requests: active requests on this context */ + /** @requests: list of active requests on this context */ struct list_head requests; - /* -* GuC priority management -*/ + /** @prio: the context's current guc priority */ u8 prio; + /** +* @prio_count: a counter of the number requests in flight in +* each priority bucket +*/ u32 prio_count[GUC_CLIENT_PRIORITY_NUM]; } guc_state; struct { - /* GuC LRC descriptor ID */ + /** +* @id: handle which is used to uniquely identify this context +* with the GuC, protected by guc->contexts_lock +*/ u16 id; - - /* GuC LRC descriptor reference count */ + /** +* @ref: the number of references to the guc_id, when +* transitioning in and out of zero protected by +* guc->contexts_lock +*/ atomic_t ref; - - /* -* GuC ID link - in list when unpinned but guc_id still valid in GuC + /** +* @link: in guc->guc_id_list when the guc_id has no refs but is +* still valid, protected by guc->contexts_lock */ struct list_head link; } guc_id; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 2e27fe59786b..5dd174babf7a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -22,74 +22,121 @@ struct __guc_ads_blob; -/* - * Top level structure of GuC. It handles firmware loading and manages client - * pool. intel_guc owns a intel_guc_client to replace the legacy ExecList - * submission. +/** + * struct intel_guc - Top level structure of GuC. + * + * It handl
Re: [PATCH v3 06/16] ARM: configs: Everyone who had PANEL_SIMPLE now gets PANEL_SIMPLE_EDP
On Wed, Sep 8, 2021 at 3:36 PM Doug Anderson wrote: > > Hi, > > On Fri, Sep 3, 2021 at 1:38 PM Stephen Boyd wrote: > > > > Quoting Doug Anderson (2021-09-01 16:10:15) > > > Hi, > > > > > > On Wed, Sep 1, 2021 at 2:12 PM Olof Johansson wrote: > > > > > > > > On Wed, Sep 1, 2021 at 1:20 PM Douglas Anderson > > > > wrote: > > > > > > > > > > In the patch ("drm/panel-simple-edp: Split eDP panels out of > > > > > panel-simple") we split the PANEL_SIMPLE driver in 2. By default let's > > > > > give everyone who had the old driver enabled the new driver too. If > > > > > folks want to opt-out of one or the other they always can later. > > > > > > > > > > Signed-off-by: Douglas Anderson > > > > > > > > Isn't this a case where the new option should just have had the old > > > > option as the default value to avoid this kind of churn and possibly > > > > broken platforms? > > > > > > I'm happy to go either way. I guess I didn't do that originally > > > because logically there's not any reason to link the two drivers going > > > forward. Said another way, someone enabling the "simple panel" driver > > > for non-eDP panels wouldn't expect that the "simple panel" driver for > > > DP panels would also get enabled by default. They really have nothing > > > to do with one another. Enabling by default for something like this > > > also seems like it would lead to bloat. I could have sworn that > > > periodically people get yelled at for marking drivers on by default > > > when it doesn't make sense. > > > > > > ...that being said, I'm happy to change the default as you suggest. > > > Just let me know. > > > > Having the default will help olddefconfig users seamlessly migrate to > > the new Kconfig. Sadly they don't notice that they should probably > > disable the previous Kconfig symbol, but oh well. At least with the > > default they don't go on a hunt/bisect to figure out that some Kconfig > > needed to be enabled now that they're using a new kernel version. > > > > Maybe the default should have a TODO comment next to it indicating we > > should remove the default in a year or two. > > OK, so I'm trying to figure out how to do this without just "kicking > the can" down the road. I guess your idea is that for the next year > this will be the default and that anyone who really wants > "CONFIG_DRM_PANEL_EDP" will "opt-in" to keep it by adding > "CONFIG_DRM_PANEL_EDP=y" to their config? ...and then after a year > passes we remove the default? ...but that won't work, will it? Since > "CONFIG_DRM_PANEL_EDP" will be the default for the next year then you > really can't add it to the "defconfig", at least if you ever > "normalize" it. The "defconfig" by definition has everything stripped > from it that's already the "default", so for the next year anyone who > tries to opt-in will get their preference stripped. > > Hrm, so let me explain options as I see them. Maybe someone can point > out something that I missed. I'll assume that we'll change the config > option from CONFIG_DRM_PANEL_SIMPLE_EDP to CONFIG_DRM_PANEL_EDP > (remove the "SIMPLE" part). > > == > > Where we were before my series: > > * One config "CONFIG_DRM_PANEL_SIMPLE" and it enables simple non-eDP > and eDP drivers. > > == > > Option 1: update everyone's configs (this patch) > > * Keep old config "CONFIG_DRM_PANEL_SIMPLE" but it now only means > enable the panel-simple (non-eDP) driver. 
> * Anyone who wants eDP panels must opt-in to "CONFIG_DRM_PANEL_EDP" > * Update all configs in mainline; any out-of mainline configs must > figure this out themselves. > > Pros: > * no long term baggage > > Cons: > * patch upstream is a bit of "churn" > * anyone with downstream config will have to figure out what happened. > > == > > Option 2: kick the can down the road + accept cruft > > * Keep old config "CONFIG_DRM_PANEL_SIMPLE" and it means enable the > panel-simple (non-eDP) driver. > * Anyone with "CONFIG_DRM_PANEL_SIMPLE" is opted in by default to > "CONFIG_DRM_PANEL_EDP" > > AKA: > config DRM_PANEL_EDP > default DRM_PANEL_SIMPLE > > Pros: > * no config patches needed upstream at all and everything just works! > > Cons: > * people are opted in to extra cruft by default and need to know to turn it > off. > * unclear if we can change the default without the same problems. > > == > > Option 3: try to be clever > > * Add _two_ new configs. CONFIG_DRM_PANEL_SIMPLE_V2 and CONFIG_DRM_PANEL_EDP. > * Old config "CONFIG_DRM_PANEL_SIMPLE" gets marked as "deprecated". > * Both new configs have "default CONFIG_DRM_PANEL_SIMPLE" > > Now anyone old will magically get both the new config options by > default. Anyone looking at this in the future _won't_ set the > deprecated CONFIG_DRM_PANEL_SIMPLE but will instead choose if they > want either the eDP or "simple" driver. > > Pros: > * No long term baggage. > * Everyone is transitioned automatically by default with no cruft patches. > > Cons: > * I can't think of a better name than "CONFIG_DRM_PANEL_SIMPLE_V2" and > that name is ugl
[PATCH v3 8/8] treewide: Replace the use of mem_encrypt_active() with cc_platform_has()
Replace uses of mem_encrypt_active() with calls to cc_platform_has() with the CC_ATTR_MEM_ENCRYPT attribute. Remove the implementation of mem_encrypt_active() across all arches. For s390, since the default implementation of the cc_platform_has() matches the s390 implementation of mem_encrypt_active(), cc_platform_has() does not need to be implemented in s390 (the config option ARCH_HAS_CC_PLATFORM is not set). Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: David Airlie Cc: Daniel Vetter Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: VMware Graphics Cc: Joerg Roedel Cc: Will Deacon Cc: Dave Young Cc: Baoquan He Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Christian Borntraeger Signed-off-by: Tom Lendacky --- arch/powerpc/include/asm/mem_encrypt.h | 5 - arch/powerpc/platforms/pseries/svm.c| 5 +++-- arch/s390/include/asm/mem_encrypt.h | 2 -- arch/x86/include/asm/mem_encrypt.h | 5 - arch/x86/kernel/head64.c| 4 ++-- arch/x86/mm/ioremap.c | 4 ++-- arch/x86/mm/mem_encrypt.c | 2 +- arch/x86/mm/pat/set_memory.c| 3 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++- drivers/gpu/drm/drm_cache.c | 4 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 4 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +++--- drivers/iommu/amd/iommu.c | 3 ++- drivers/iommu/amd/iommu_v2.c| 3 ++- drivers/iommu/iommu.c | 3 ++- fs/proc/vmcore.c| 6 +++--- include/linux/mem_encrypt.h | 4 kernel/dma/swiotlb.c| 4 ++-- 18 files changed, 31 insertions(+), 40 deletions(-) diff --git a/arch/powerpc/include/asm/mem_encrypt.h b/arch/powerpc/include/asm/mem_encrypt.h index ba9dab07c1be..2f26b8fc8d29 100644 --- a/arch/powerpc/include/asm/mem_encrypt.h +++ b/arch/powerpc/include/asm/mem_encrypt.h @@ -10,11 +10,6 @@ #include -static inline bool mem_encrypt_active(void) -{ - return is_secure_guest(); -} - static inline bool force_dma_unencrypted(struct device *dev) { return is_secure_guest(); diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c index 87f001b4c4e4..c083ecbbae4d 100644 --- a/arch/powerpc/platforms/pseries/svm.c +++ b/arch/powerpc/platforms/pseries/svm.c @@ -8,6 +8,7 @@ #include #include +#include #include #include #include @@ -63,7 +64,7 @@ void __init svm_swiotlb_init(void) int set_memory_encrypted(unsigned long addr, int numpages) { - if (!mem_encrypt_active()) + if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT)) return 0; if (!PAGE_ALIGNED(addr)) @@ -76,7 +77,7 @@ int set_memory_encrypted(unsigned long addr, int numpages) int set_memory_decrypted(unsigned long addr, int numpages) { - if (!mem_encrypt_active()) + if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT)) return 0; if (!PAGE_ALIGNED(addr)) diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h index 2542cbf7e2d1..08a8b96606d7 100644 --- a/arch/s390/include/asm/mem_encrypt.h +++ b/arch/s390/include/asm/mem_encrypt.h @@ -4,8 +4,6 @@ #ifndef __ASSEMBLY__ -static inline bool mem_encrypt_active(void) { return false; } - int set_memory_encrypted(unsigned long addr, int numpages); int set_memory_decrypted(unsigned long addr, int numpages); diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 499440781b39..ed954aa5c448 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -98,11 +98,6 @@ static inline void mem_encrypt_free_decrypted_mem(void) { } extern char __start_bss_decrypted[], __end_bss_decrypted[], 
__start_bss_decrypted_unused[]; -static inline bool mem_encrypt_active(void) -{ - return sme_me_mask; -} - static inline u64 sme_get_me_mask(void) { return sme_me_mask; diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index de01903c3735..f98c76a1d16c 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include @@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long physaddr, * there is no need to zero it after changing the memory encryption * attribute. */ - if (mem_encrypt_active()) { + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) { vaddr = (unsigned long)__start_bss_decrypted; vaddr_end = (unsigned long)__end_bss_decrypted; for (; vaddr < vaddr_end; vaddr += PMD_SIZE) { diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index b59a5cbc6bc5..026031b3b782 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/
[PATCH v3 7/8] x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()
Replace uses of sev_es_active() with the more generic cc_platform_has() using CC_ATTR_GUEST_STATE_ENCRYPT. If future support is added for other memory encryption technologies, the use of CC_ATTR_GUEST_STATE_ENCRYPT can be updated, as required. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Signed-off-by: Tom Lendacky --- arch/x86/include/asm/mem_encrypt.h | 2 -- arch/x86/kernel/sev.c | 6 +++--- arch/x86/mm/mem_encrypt.c | 14 -- arch/x86/realmode/init.c | 3 +-- 4 files changed, 8 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index f440eebeeb2c..499440781b39 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void); void __init mem_encrypt_init(void); void __init sev_es_init_vc_handling(void); -bool sev_es_active(void); bool amd_cc_platform_has(enum cc_attr attr); #define __bss_decrypted __section(".bss..decrypted") @@ -75,7 +74,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct boot_params *bp) { } static inline void sev_es_init_vc_handling(void) { } -static inline bool sev_es_active(void) { return false; } static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; } static inline int __init diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index a6895e440bc3..53a6837d354b 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -11,7 +11,7 @@ #include /* For show_regs() */ #include -#include +#include #include #include #include @@ -615,7 +615,7 @@ int __init sev_es_efi_map_ghcbs(pgd_t *pgd) int cpu; u64 pfn; - if (!sev_es_active()) + if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) return 0; pflags = _PAGE_NX | _PAGE_RW; @@ -774,7 +774,7 @@ void __init sev_es_init_vc_handling(void) BUILD_BUG_ON(offsetof(struct sev_es_runtime_data, ghcb_page) % PAGE_SIZE); - if (!sev_es_active()) + if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) return; if (!sev_es_check_cpu_features()) diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c index 22d4e152a6de..47d571a2cd28 100644 --- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -373,13 +373,6 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size) * up under SME the trampoline area cannot be encrypted, whereas under SEV * the trampoline area must be encrypted. */ - -/* Needs to be called from non-instrumentable code */ -bool noinstr sev_es_active(void) -{ - return sev_status & MSR_AMD64_SEV_ES_ENABLED; -} - bool amd_cc_platform_has(enum cc_attr attr) { switch (attr) { @@ -393,7 +386,7 @@ bool amd_cc_platform_has(enum cc_attr attr) return sev_status & MSR_AMD64_SEV_ENABLED; case CC_ATTR_GUEST_STATE_ENCRYPT: - return sev_es_active(); + return sev_status & MSR_AMD64_SEV_ES_ENABLED; default: return false; @@ -469,7 +462,7 @@ static void print_mem_encrypt_feature_info(void) pr_cont(" SEV"); /* Encrypted Register State */ - if (sev_es_active()) + if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) pr_cont(" SEV-ES"); pr_cont("\n"); @@ -488,7 +481,8 @@ void __init mem_encrypt_init(void) * With SEV, we need to unroll the rep string I/O instructions, * but SEV-ES supports them through the #VC handler.
*/ - if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && !sev_es_active()) + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && + !cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) static_branch_enable(&sev_enable_key); print_mem_encrypt_feature_info(); diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c index c878c5ee5a4c..4a3da7592b99 100644 --- a/arch/x86/realmode/init.c +++ b/arch/x86/realmode/init.c @@ -2,7 +2,6 @@ #include #include #include -#include #include #include @@ -48,7 +47,7 @@ static void sme_sev_setup_real_mode(struct trampoline_header *th) if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) th->flags |= TH_FLAGS_SME_ACTIVE; - if (sev_es_active()) { + if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) { /* * Skip the call to verify_cpu() in secondary_startup_64 as it * will cause #VC exceptions when the AP can't handle them yet. -- 2.33.0
[PATCH v3 6/8] x86/sev: Replace occurrences of sev_active() with cc_platform_has()
Replace uses of sev_active() with the more generic cc_platform_has() using CC_ATTR_GUEST_MEM_ENCRYPT. If future support is added for other memory encryption technologies, the use of CC_ATTR_GUEST_MEM_ENCRYPT can be updated, as required. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Ard Biesheuvel Signed-off-by: Tom Lendacky --- arch/x86/include/asm/mem_encrypt.h | 2 -- arch/x86/kernel/crash_dump_64.c| 4 +++- arch/x86/kernel/kvm.c | 3 ++- arch/x86/kernel/kvmclock.c | 4 ++-- arch/x86/kernel/machine_kexec_64.c | 4 ++-- arch/x86/kvm/svm/svm.c | 3 ++- arch/x86/mm/ioremap.c | 6 +++--- arch/x86/mm/mem_encrypt.c | 25 ++--- arch/x86/platform/efi/efi_64.c | 9 + 9 files changed, 29 insertions(+), 31 deletions(-) diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 8c4f0dfe63f9..f440eebeeb2c 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void); void __init mem_encrypt_init(void); void __init sev_es_init_vc_handling(void); -bool sev_active(void); bool sev_es_active(void); bool amd_cc_platform_has(enum cc_attr attr); @@ -76,7 +75,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct boot_params *bp) { } static inline void sev_es_init_vc_handling(void) { } -static inline bool sev_active(void) { return false; } static inline bool sev_es_active(void) { return false; } static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; } diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c index 045e82e8945b..a7f617a3981d 100644 --- a/arch/x86/kernel/crash_dump_64.c +++ b/arch/x86/kernel/crash_dump_64.c @@ -10,6 +10,7 @@ #include #include #include +#include static ssize_t __copy_oldmem_page(unsigned long pfn, char *buf, size_t csize, unsigned long offset, int userbuf, @@ -73,5 +74,6 @@ ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf, size_t csize, ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos) { - return read_from_oldmem(buf, count, ppos, 0, sev_active()); + return read_from_oldmem(buf, count, ppos, 0, + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)); } diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index a26643dc6bd6..509a578f56a0 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include @@ -418,7 +419,7 @@ static void __init sev_map_percpu_data(void) { int cpu; - if (!sev_active()) + if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) return; for_each_possible_cpu(cpu) { diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index ad273e5861c1..fc3930c5db1b 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -16,9 +16,9 @@ #include #include #include +#include #include -#include #include #include @@ -232,7 +232,7 @@ static void __init kvmclock_init_mem(void) * hvclock is shared between the guest and the hypervisor, must * be mapped decrypted. 
*/ - if (sev_active()) { + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) { r = set_memory_decrypted((unsigned long) hvclock_mem, 1UL << order); if (r) { diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 7040c0fa921c..f5da4a18070a 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -167,7 +167,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd) } pte = pte_offset_kernel(pmd, vaddr); - if (sev_active()) + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) prot = PAGE_KERNEL_EXEC; set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot)); @@ -207,7 +207,7 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable) level4p = (pgd_t *)__va(start_pgtable); clear_page(level4p); - if (sev_active()) { + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) { info.page_flag |= _PAGE_ENC; info.kernpg_flag |= _PAGE_ENC; } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 69639f9624f5..eb3669154b48 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -457,7 +458,7 @@ static int has_svm(void) return 0; } - if (sev_active()) { + if (cc_platf
[PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
Replace uses of sme_active() with the more generic cc_platform_has() using CC_ATTR_HOST_MEM_ENCRYPT. If future support is added for other memory encryption technologies, the use of CC_ATTR_HOST_MEM_ENCRYPT can be updated, as required. This also replaces two usages of sev_active() that are really geared towards detecting if SME is active. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Joerg Roedel Cc: Will Deacon Signed-off-by: Tom Lendacky --- arch/x86/include/asm/kexec.h | 2 +- arch/x86/include/asm/mem_encrypt.h | 2 -- arch/x86/kernel/machine_kexec_64.c | 15 --- arch/x86/kernel/pci-swiotlb.c| 9 - arch/x86/kernel/relocate_kernel_64.S | 2 +- arch/x86/mm/ioremap.c| 6 +++--- arch/x86/mm/mem_encrypt.c| 15 +-- arch/x86/mm/mem_encrypt_identity.c | 3 ++- arch/x86/realmode/init.c | 5 +++-- drivers/iommu/amd/init.c | 7 --- 10 files changed, 31 insertions(+), 35 deletions(-) diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h index 0a6e34b07017..11b7c06e2828 100644 --- a/arch/x86/include/asm/kexec.h +++ b/arch/x86/include/asm/kexec.h @@ -129,7 +129,7 @@ relocate_kernel(unsigned long indirection_page, unsigned long page_list, unsigned long start_address, unsigned int preserve_context, - unsigned int sme_active); + unsigned int host_mem_enc_active); #endif #define ARCH_HAS_KIMAGE_ARCH diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 3d8a5e8b2e3f..8c4f0dfe63f9 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void); void __init mem_encrypt_init(void); void __init sev_es_init_vc_handling(void); -bool sme_active(void); bool sev_active(void); bool sev_es_active(void); bool amd_cc_platform_has(enum cc_attr attr); @@ -77,7 +76,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct boot_params *bp) { } static inline void sev_es_init_vc_handling(void) { } -static inline bool sme_active(void) { return false; } static inline bool sev_active(void) { return false; } static inline bool sev_es_active(void) { return false; } static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; } diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 131f30fdcfbd..7040c0fa921c 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -358,7 +359,7 @@ void machine_kexec(struct kimage *image) (unsigned long)page_list, image->start, image->preserve_context, - sme_active()); + cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)); #ifdef CONFIG_KEXEC_JUMP if (image->preserve_context) @@ -569,12 +570,12 @@ void arch_kexec_unprotect_crashkres(void) */ int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp) { - if (sev_active()) + if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) return 0; /* -* If SME is active we need to be sure that kexec pages are -* not encrypted because when we boot to the new kernel the +* If host memory encryption is active we need to be sure that kexec +* pages are not encrypted because when we boot to the new kernel the * pages won't be accessed encrypted (initially). 
*/ return set_memory_decrypted((unsigned long)vaddr, pages); @@ -582,12 +583,12 @@ int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp) void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) { - if (sev_active()) + if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) return; /* -* If SME is active we need to reset the pages back to being -* an encrypted mapping before freeing them. +* If host memory encryption is active we need to reset the pages back +* to being an encrypted mapping before freeing them. */ set_memory_encrypted((unsigned long)vaddr, pages); } diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c index c2cfa5e7c152..814ab46a0dad 100644 --- a/arch/x86/kernel/pci-swiotlb.c +++ b/arch/x86/kernel/pci-swiotlb.c @@ -6,7 +6,7 @@ #include #include #include -#include +#include #include #include @@ -45,11 +45,10 @@ int __init pci_swiotlb_detect_4gb(void) swiotlb = 1; /* -*
[PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
Introduce a powerpc version of the cc_platform_has() function. This will be used to replace the powerpc mem_encrypt_active() implementation, so the implementation will initially only support the CC_ATTR_MEM_ENCRYPT attribute. Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Signed-off-by: Tom Lendacky --- arch/powerpc/platforms/pseries/Kconfig | 1 + arch/powerpc/platforms/pseries/Makefile | 2 ++ arch/powerpc/platforms/pseries/cc_platform.c | 26 3 files changed, 29 insertions(+) create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 5e037df2a3a1..2e57391e0778 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -159,6 +159,7 @@ config PPC_SVM select SWIOTLB select ARCH_HAS_MEM_ENCRYPT select ARCH_HAS_FORCE_DMA_UNENCRYPTED + select ARCH_HAS_CC_PLATFORM help There are certain POWER platforms which support secure guests using the Protected Execution Facility, with the help of an Ultravisor diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile index 4cda0ef87be0..41d8aee98da4 100644 --- a/arch/powerpc/platforms/pseries/Makefile +++ b/arch/powerpc/platforms/pseries/Makefile @@ -31,3 +31,5 @@ obj-$(CONFIG_FA_DUMP) += rtas-fadump.o obj-$(CONFIG_SUSPEND) += suspend.o obj-$(CONFIG_PPC_VAS) += vas.o + +obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o diff --git a/arch/powerpc/platforms/pseries/cc_platform.c b/arch/powerpc/platforms/pseries/cc_platform.c new file mode 100644 index ..e8021af83a19 --- /dev/null +++ b/arch/powerpc/platforms/pseries/cc_platform.c @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Confidential Computing Platform Capability checks + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. + * + * Author: Tom Lendacky + */ + +#include +#include + +#include +#include + +bool cc_platform_has(enum cc_attr attr) +{ + switch (attr) { + case CC_ATTR_MEM_ENCRYPT: + return is_secure_guest(); + + default: + return false; + } +} +EXPORT_SYMBOL_GPL(cc_platform_has); -- 2.33.0
[PATCH v3 3/8] x86/sev: Add an x86 version of cc_platform_has()
Introduce an x86 version of the cc_platform_has() function. This will be used to replace vendor specific calls like sme_active(), sev_active(), etc. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Co-developed-by: Andi Kleen Signed-off-by: Andi Kleen Co-developed-by: Kuppuswamy Sathyanarayanan Signed-off-by: Kuppuswamy Sathyanarayanan Signed-off-by: Tom Lendacky --- arch/x86/Kconfig | 1 + arch/x86/include/asm/mem_encrypt.h | 3 +++ arch/x86/kernel/Makefile | 3 +++ arch/x86/kernel/cc_platform.c | 21 + arch/x86/mm/mem_encrypt.c | 21 + 5 files changed, 49 insertions(+) create mode 100644 arch/x86/kernel/cc_platform.c diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 4e001425..2b2a9639d8ae 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1513,6 +1513,7 @@ config AMD_MEM_ENCRYPT select ARCH_HAS_FORCE_DMA_UNENCRYPTED select INSTRUCTION_DECODER select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS + select ARCH_HAS_CC_PLATFORM help Say yes to enable support for the encryption of system memory. This requires an AMD processor that supports Secure Memory diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 9c80c68d75b5..3d8a5e8b2e3f 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -13,6 +13,7 @@ #ifndef __ASSEMBLY__ #include +#include #include @@ -53,6 +54,7 @@ void __init sev_es_init_vc_handling(void); bool sme_active(void); bool sev_active(void); bool sev_es_active(void); +bool amd_cc_platform_has(enum cc_attr attr); #define __bss_decrypted __section(".bss..decrypted") @@ -78,6 +80,7 @@ static inline void sev_es_init_vc_handling(void) { } static inline bool sme_active(void) { return false; } static inline bool sev_active(void) { return false; } static inline bool sev_es_active(void) { return false; } +static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; } static inline int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; } diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 8f4e8fa6ed75..f91403a78594 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -147,6 +147,9 @@ obj-$(CONFIG_UNWINDER_FRAME_POINTER)+= unwind_frame.o obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o + +obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o + ### # 64 bit specific files ifeq ($(CONFIG_X86_64),y) diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c new file mode 100644 index ..3c9bacd3c3f3 --- /dev/null +++ b/arch/x86/kernel/cc_platform.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Confidential Computing Platform Capability checks + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. 
+ * + * Author: Tom Lendacky + */ + +#include +#include +#include + +bool cc_platform_has(enum cc_attr attr) +{ + if (sme_me_mask) + return amd_cc_platform_has(attr); + + return false; +} +EXPORT_SYMBOL_GPL(cc_platform_has); diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c index ff08dc463634..18fe19916bc3 100644 --- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -389,6 +390,26 @@ bool noinstr sev_es_active(void) return sev_status & MSR_AMD64_SEV_ES_ENABLED; } +bool amd_cc_platform_has(enum cc_attr attr) +{ + switch (attr) { + case CC_ATTR_MEM_ENCRYPT: + return sme_me_mask != 0; + + case CC_ATTR_HOST_MEM_ENCRYPT: + return sme_active(); + + case CC_ATTR_GUEST_MEM_ENCRYPT: + return sev_active(); + + case CC_ATTR_GUEST_STATE_ENCRYPT: + return sev_es_active(); + + default: + return false; + } +} + /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */ bool force_dma_unencrypted(struct device *dev) { -- 2.33.0
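The dispatch above is also the natural extension point if another x86 confidential computing technology selects ARCH_HAS_CC_PLATFORM later. Purely as an illustration (intel_cc_platform_has() and the TDX detection check below are assumptions, not part of this series), the function could grow a second vendor hook:

#include <linux/cc_platform.h>
#include <asm/cpufeature.h>
#include <asm/mem_encrypt.h>

/* Hypothetical future shape of the x86 dispatcher; for illustration only. */
bool cc_platform_has(enum cc_attr attr)
{
	if (sme_me_mask)
		return amd_cc_platform_has(attr);

	if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST))	/* assumed feature flag */
		return intel_cc_platform_has(attr);	/* assumed vendor hook */

	return false;
}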
[PATCH v3 2/8] mm: Introduce a function to check for confidential computing features
In prep for other confidential computing technologies, introduce a generic helper function, cc_platform_has(), that can be used to check for specific active confidential computing attributes, like memory encryption. This is intended to eliminate having to add multiple technology-specific checks to the code (e.g. if (sev_active() || tdx_active())). Co-developed-by: Andi Kleen Signed-off-by: Andi Kleen Co-developed-by: Kuppuswamy Sathyanarayanan Signed-off-by: Kuppuswamy Sathyanarayanan Signed-off-by: Tom Lendacky --- arch/Kconfig| 3 ++ include/linux/cc_platform.h | 88 + 2 files changed, 91 insertions(+) create mode 100644 include/linux/cc_platform.h diff --git a/arch/Kconfig b/arch/Kconfig index 3743174da870..ca7c359e5da8 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -1234,6 +1234,9 @@ config RELR config ARCH_HAS_MEM_ENCRYPT bool +config ARCH_HAS_CC_PLATFORM + bool + config HAVE_SPARSE_SYSCALL_NR bool help diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h new file mode 100644 index ..253f3ea66cd8 --- /dev/null +++ b/include/linux/cc_platform.h @@ -0,0 +1,88 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Confidential Computing Platform Capability checks + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. + * + * Author: Tom Lendacky + */ + +#ifndef _CC_PLATFORM_H +#define _CC_PLATFORM_H + +#include +#include + +/** + * enum cc_attr - Confidential computing attributes + * + * These attributes represent confidential computing features that are + * currently active. + */ +enum cc_attr { + /** +* @CC_ATTR_MEM_ENCRYPT: Memory encryption is active +* +* The platform/OS is running with active memory encryption. This +* includes running either as a bare-metal system or a hypervisor +* and actively using memory encryption or as a guest/virtual machine +* and actively using memory encryption. +* +* Examples include SME, SEV and SEV-ES. +*/ + CC_ATTR_MEM_ENCRYPT, + + /** +* @CC_ATTR_HOST_MEM_ENCRYPT: Host memory encryption is active +* +* The platform/OS is running as a bare-metal system or a hypervisor +* and actively using memory encryption. +* +* Examples include SME. +*/ + CC_ATTR_HOST_MEM_ENCRYPT, + + /** +* @CC_ATTR_GUEST_MEM_ENCRYPT: Guest memory encryption is active +* +* The platform/OS is running as a guest/virtual machine and actively +* using memory encryption. +* +* Examples include SEV and SEV-ES. +*/ + CC_ATTR_GUEST_MEM_ENCRYPT, + + /** +* @CC_ATTR_GUEST_STATE_ENCRYPT: Guest state encryption is active +* +* The platform/OS is running as a guest/virtual machine and actively +* using memory encryption and register state encryption. +* +* Examples include SEV-ES. +*/ + CC_ATTR_GUEST_STATE_ENCRYPT, +}; + +#ifdef CONFIG_ARCH_HAS_CC_PLATFORM + +/** + * cc_platform_has() - Checks if the specified cc_attr attribute is active + * @attr: Confidential computing attribute to check + * + * The cc_platform_has() function will return an indicator as to whether the + * specified Confidential Computing attribute is currently active. + * + * Context: Any context + * Return: + * * TRUE - Specified Confidential Computing attribute is active + * * FALSE - Specified Confidential Computing attribute is not active + */ +bool cc_platform_has(enum cc_attr attr); + +#else /* !CONFIG_ARCH_HAS_CC_PLATFORM */ + +static inline bool cc_platform_has(enum cc_attr attr) { return false; } + +#endif /* CONFIG_ARCH_HAS_CC_PLATFORM */ + +#endif /* _CC_PLATFORM_H */ -- 2.33.0
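As a usage illustration, mirroring the call-site conversions later in the series (this is not code from any particular driver, and the helper name is made up): code that needs to share a page with an unencrypted consumer only checks one attribute instead of open-coding per-technology tests.

#include <linux/cc_platform.h>
#include <linux/set_memory.h>

/* Hypothetical helper: map one page decrypted when guest memory encryption is active. */
static int example_share_page_with_host(void *vaddr)
{
	if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
		return set_memory_decrypted((unsigned long)vaddr, 1);

	return 0;
}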
[PATCH v3 1/8] x86/ioremap: Selectively build arch override encryption functions
In prep for other uses of the cc_platform_has() function besides AMD's memory encryption support, selectively build the AMD memory encryption architecture override functions only when CONFIG_AMD_MEM_ENCRYPT=y. These functions are: - early_memremap_pgprot_adjust() - arch_memremap_can_ram_remap() Additionally, routines that are only invoked by these architecture override functions can also be conditionally built. These functions are: - memremap_should_map_decrypted() - memremap_is_efi_data() - memremap_is_setup_data() - early_memremap_is_setup_data() And finally, phys_mem_access_encrypted() is conditionally built as well, but requires a static inline version of it when CONFIG_AMD_MEM_ENCRYPT is not set. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Signed-off-by: Tom Lendacky --- arch/x86/include/asm/io.h | 8 arch/x86/mm/ioremap.c | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h index 841a5d104afa..5c6a4af0b911 100644 --- a/arch/x86/include/asm/io.h +++ b/arch/x86/include/asm/io.h @@ -391,6 +391,7 @@ extern void arch_io_free_memtype_wc(resource_size_t start, resource_size_t size) #define arch_io_reserve_memtype_wc arch_io_reserve_memtype_wc #endif +#ifdef CONFIG_AMD_MEM_ENCRYPT extern bool arch_memremap_can_ram_remap(resource_size_t offset, unsigned long size, unsigned long flags); @@ -398,6 +399,13 @@ extern bool arch_memremap_can_ram_remap(resource_size_t offset, extern bool phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size); +#else +static inline bool phys_mem_access_encrypted(unsigned long phys_addr, +unsigned long size) +{ + return true; +} +#endif /** * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 60ade7dd71bd..ccff76cedd8f 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -508,6 +508,7 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr) memunmap((void *)((unsigned long)addr & PAGE_MASK)); } +#ifdef CONFIG_AMD_MEM_ENCRYPT /* * Examine the physical address to determine if it is an area of memory * that should be mapped decrypted. If the memory is not part of the @@ -746,7 +747,6 @@ bool phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size) return arch_memremap_can_ram_remap(phys_addr, size, 0); } -#ifdef CONFIG_AMD_MEM_ENCRYPT /* Remap memory with encryption */ void __init *early_memremap_encrypted(resource_size_t phys_addr, unsigned long size) -- 2.33.0
[PATCH v3 0/8] Implement generic cc_platform_has() helper function
This patch series provides a generic helper function, cc_platform_has(), to replace the sme_active(), sev_active(), sev_es_active() and mem_encrypt_active() functions. It is expected that as new confidential computing technologies are added to the kernel, they can all be covered by a single function call instead of a collection of specific function calls all called from the same locations. The powerpc and s390 patches have been compile tested only. Can the folks copied on this series verify that nothing breaks for them? Also, a new file, arch/powerpc/platforms/pseries/cc_platform.c, has been created for powerpc to hold the out of line function. Cc: Andi Kleen Cc: Andy Lutomirski Cc: Ard Biesheuvel Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Christian Borntraeger Cc: Daniel Vetter Cc: Dave Hansen Cc: Dave Young Cc: David Airlie Cc: Heiko Carstens Cc: Ingo Molnar Cc: Joerg Roedel Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Michael Ellerman Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Thomas Zimmermann Cc: Vasily Gorbik Cc: VMware Graphics Cc: Will Deacon Cc: Christoph Hellwig --- Patches based on: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master 4b93c544e90e ("thunderbolt: test: split up test cases in tb_test_credit_alloc_all") Changes since v2: - Changed the name from prot_guest_has() to cc_platform_has() - Took the cc_platform_has() function out of line. Created two new files, cc_platform.c, in both x86 and ppc to implement the function. As a result, also changed the attribute defines into enums. - Removed any received Reviewed-by's and Acked-by's given changes in this version. - Added removal of new instances of mem_encrypt_active() usage in powerpc arch. - Based on latest Linux tree to pick up powerpc changes related to the mem_encrypt_active() function. Changes since v1: - Moved some arch ioremap functions within #ifdef CONFIG_AMD_MEM_ENCRYPT in prep for use of prot_guest_has() by TDX. - Added type includes to the protected_guest.h header file to prevent build errors outside of x86. - Made amd_prot_guest_has() EXPORT_SYMBOL_GPL - Used amd_prot_guest_has() in place of checking sme_me_mask in the arch/x86/mm/mem_encrypt.c file.
Tom Lendacky (8): x86/ioremap: Selectively build arch override encryption functions mm: Introduce a function to check for confidential computing features x86/sev: Add an x86 version of cc_platform_has() powerpc/pseries/svm: Add a powerpc version of cc_platform_has() x86/sme: Replace occurrences of sme_active() with cc_platform_has() x86/sev: Replace occurrences of sev_active() with cc_platform_has() x86/sev: Replace occurrences of sev_es_active() with cc_platform_has() treewide: Replace the use of mem_encrypt_active() with cc_platform_has() arch/Kconfig | 3 + arch/powerpc/include/asm/mem_encrypt.h | 5 -- arch/powerpc/platforms/pseries/Kconfig | 1 + arch/powerpc/platforms/pseries/Makefile | 2 + arch/powerpc/platforms/pseries/cc_platform.c | 26 ++ arch/powerpc/platforms/pseries/svm.c | 5 +- arch/s390/include/asm/mem_encrypt.h | 2 - arch/x86/Kconfig | 1 + arch/x86/include/asm/io.h| 8 ++ arch/x86/include/asm/kexec.h | 2 +- arch/x86/include/asm/mem_encrypt.h | 14 +--- arch/x86/kernel/Makefile | 3 + arch/x86/kernel/cc_platform.c| 21 + arch/x86/kernel/crash_dump_64.c | 4 +- arch/x86/kernel/head64.c | 4 +- arch/x86/kernel/kvm.c| 3 +- arch/x86/kernel/kvmclock.c | 4 +- arch/x86/kernel/machine_kexec_64.c | 19 +++-- arch/x86/kernel/pci-swiotlb.c| 9 +- arch/x86/kernel/relocate_kernel_64.S | 2 +- arch/x86/kernel/sev.c| 6 +- arch/x86/kvm/svm/svm.c | 3 +- arch/x86/mm/ioremap.c| 18 ++-- arch/x86/mm/mem_encrypt.c| 57 +++-- arch/x86/mm/mem_encrypt_identity.c | 3 +- arch/x86/mm/pat/set_memory.c | 3 +- arch/x86/platform/efi/efi_64.c | 9 +- arch/x86/realmode/init.c | 8 +- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +- drivers/gpu/drm/drm_cache.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +- drivers/iommu/amd/init.c | 7 +- drivers/iommu/amd/iommu.c| 3 +- drivers/iommu/amd/iommu_v2.c | 3 +- drivers/iommu/iommu.c| 3 +- fs/proc/vmcore.c | 6 +- include/linux/cc_platform.h | 88 include/linux/mem_encrypt.h | 4 -
Re: [PATCH v3 06/16] ARM: configs: Everyone who had PANEL_SIMPLE now gets PANEL_SIMPLE_EDP
Hi, On Fri, Sep 3, 2021 at 1:38 PM Stephen Boyd wrote: > > Quoting Doug Anderson (2021-09-01 16:10:15) > > Hi, > > > > On Wed, Sep 1, 2021 at 2:12 PM Olof Johansson wrote: > > > > > > On Wed, Sep 1, 2021 at 1:20 PM Douglas Anderson > > > wrote: > > > > > > > > In the patch ("drm/panel-simple-edp: Split eDP panels out of > > > > panel-simple") we split the PANEL_SIMPLE driver in 2. By default let's > > > > give everyone who had the old driver enabled the new driver too. If > > > > folks want to opt-out of one or the other they always can later. > > > > > > > > Signed-off-by: Douglas Anderson > > > > > > Isn't this a case where the new option should just have had the old > > > option as the default value to avoid this kind of churn and possibly > > > broken platforms? > > > > I'm happy to go either way. I guess I didn't do that originally > > because logically there's not any reason to link the two drivers going > > forward. Said another way, someone enabling the "simple panel" driver > > for non-eDP panels wouldn't expect that the "simple panel" driver for > > DP panels would also get enabled by default. They really have nothing > > to do with one another. Enabling by default for something like this > > also seems like it would lead to bloat. I could have sworn that > > periodically people get yelled at for marking drivers on by default > > when it doesn't make sense. > > > > ...that being said, I'm happy to change the default as you suggest. > > Just let me know. > > Having the default will help olddefconfig users seamlessly migrate to > the new Kconfig. Sadly they don't notice that they should probably > disable the previous Kconfig symbol, but oh well. At least with the > default they don't go on a hunt/bisect to figure out that some Kconfig > needed to be enabled now that they're using a new kernel version. > > Maybe the default should have a TODO comment next to it indicating we > should remove the default in a year or two. OK, so I'm trying to figure out how to do this without just "kicking the can" down the road. I guess your idea is that for the next year this will be the default and that anyone who really wants "CONFIG_DRM_PANEL_EDP" will "opt-in" to keep it by adding "CONFIG_DRM_PANEL_EDP=y" to their config? ...and then after a year passes we remove the default? ...but that won't work, will it? Since "CONFIG_DRM_PANEL_EDP" will be the default for the next year then you really can't add it to the "defconfig", at least if you ever "normalize" it. The "defconfig" by definition has everything stripped from it that's already the "default", so for the next year anyone who tries to opt-in will get their preference stripped. Hrm, so let me explain options as I see them. Maybe someone can point out something that I missed. I'll assume that we'll change the config option from CONFIG_DRM_PANEL_SIMPLE_EDP to CONFIG_DRM_PANEL_EDP (remove the "SIMPLE" part). == Where we were before my series: * One config "CONFIG_DRM_PANEL_SIMPLE" and it enables simple non-eDP and eDP drivers. == Option 1: update everyone's configs (this patch) * Keep old config "CONFIG_DRM_PANEL_SIMPLE" but it now only means enable the panel-simple (non-eDP) driver. * Anyone who wants eDP panels must opt-in to "CONFIG_DRM_PANEL_EDP" * Update all configs in mainline; any out-of mainline configs must figure this out themselves. Pros: * no long term baggage Cons: * patch upstream is a bit of "churn" * anyone with downstream config will have to figure out what happened. 
== Option 2: kick the can down the road + accept cruft * Keep old config "CONFIG_DRM_PANEL_SIMPLE" and it means enable the panel-simple (non-eDP) driver. * Anyone with "CONFIG_DRM_PANEL_SIMPLE" is opted in by default to "CONFIG_DRM_PANEL_EDP" AKA: config DRM_PANEL_EDP default DRM_PANEL_SIMPLE Pros: * no config patches needed upstream at all and everything just works! Cons: * people are opted in to extra cruft by default and need to know to turn it off. * unclear if we can change the default without the same problems. == Option 3: try to be clever * Add _two_ new configs. CONFIG_DRM_PANEL_SIMPLE_V2 and CONFIG_DRM_PANEL_EDP. * Old config "CONFIG_DRM_PANEL_SIMPLE" gets marked as "deprecated". * Both new configs have "default CONFIG_DRM_PANEL_SIMPLE" Now anyone old will magically get both the new config options by default. Anyone looking at this in the future _won't_ set the deprecated CONFIG_DRM_PANEL_SIMPLE but will instead choose if they want either the eDP or "simple" driver. Pros: * No long term baggage. * Everyone is transitioned automatically by default with no cruft patches. Cons: * I can't think of a better name than "CONFIG_DRM_PANEL_SIMPLE_V2" and that name is ugly. == Option 4: shave a yak When thinking about this I came up with a clever idea of stashing the kernel version in a defconfig when it's generated. Then you could do something like: config DRM_PANEL_EDP default DRM_PANEL_SIMPLE if DEFCONFIG_GENERATED_AT <= 0x00050f00 That feels
Re: [PATCH 1/2] drm/nouveau/ga102-: support ttm buffer moves via copy engine
On Thu, 9 Sept 2021 at 04:19, Daniel Vetter wrote: > > On Mon, Sep 06, 2021 at 10:56:27AM +1000, Ben Skeggs wrote: > > From: Ben Skeggs > > > > We don't currently have any kind of real acceleration on Ampere GPUs, > > but the TTM memcpy() fallback paths aren't really designed to handle > > copies between different devices, such as on Optimus systems, and > > result in a kernel OOPS. > > Is this just for moving a buffer from vram to system memory when you pin > it for dma-buf? I'm kinda lost what you even use ttm bo moves for if > there's no one using the gpu. It occurs when we attempt to move the buffer into vram for scanout, through the modeset paths. > > Also I guess memcpy goes boom if you can't mmap it because it's outside > the gart? Or just that it's very slow. We're trying to use ttm memcpy as > fallback, so want to know how this can all go wrong :-) Neither ttm_kmap_iter_linear_io_init() nor ttm_kmap_iter_tt_init() are able to work with the imported dma-buf object, which can obviously be fixed. But. I then attempted to hack that up with a custom memcpy() for that situation to test it, using dma_buf_vmap(), and get stuck forever inside i915 waiting for the gem object lock. Ben. > -Daniel > > > > > A few options were investigated to try and fix this, but didn't work > > out, and likely would have resulted in a very unpleasant experience > > for users anyway. > > > > This commit adds just enough support for setting up a single channel > > connected to a copy engine, which the kernel can use to accelerate > > the buffer copies between devices. Userspace has no access to this > > incomplete channel support, but it's suitable for TTM's needs. > > > > A more complete implementation of host(fifo) for Ampere GPUs is in > > the works, but the required changes are so invasive that they > > would be unsuitable to backport to fix this issue on current kernels.
> > > > Signed-off-by: Ben Skeggs > > Cc: Lyude Paul > > Cc: Karol Herbst > > Cc: # v5.12+ > > --- > > drivers/gpu/drm/nouveau/include/nvif/class.h | 2 + > > .../drm/nouveau/include/nvkm/engine/fifo.h| 1 + > > drivers/gpu/drm/nouveau/nouveau_bo.c | 1 + > > drivers/gpu/drm/nouveau/nouveau_chan.c| 6 +- > > drivers/gpu/drm/nouveau/nouveau_drm.c | 4 + > > drivers/gpu/drm/nouveau/nv84_fence.c | 2 +- > > .../gpu/drm/nouveau/nvkm/engine/device/base.c | 3 + > > .../gpu/drm/nouveau/nvkm/engine/fifo/Kbuild | 1 + > > .../gpu/drm/nouveau/nvkm/engine/fifo/ga102.c | 308 ++ > > .../gpu/drm/nouveau/nvkm/subdev/top/ga100.c | 7 +- > > 10 files changed, 329 insertions(+), 6 deletions(-) > > create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fifo/ga102.c > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvif/class.h > > b/drivers/gpu/drm/nouveau/include/nvif/class.h > > index c68cc957248e..a582c0cb0cb0 100644 > > --- a/drivers/gpu/drm/nouveau/include/nvif/class.h > > +++ b/drivers/gpu/drm/nouveau/include/nvif/class.h > > @@ -71,6 +71,7 @@ > > #define PASCAL_CHANNEL_GPFIFO_A /* cla06f.h */ > > 0xc06f > > #define VOLTA_CHANNEL_GPFIFO_A/* clc36f.h */ > > 0xc36f > > #define TURING_CHANNEL_GPFIFO_A /* clc36f.h */ > > 0xc46f > > +#define AMPERE_CHANNEL_GPFIFO_B /* clc36f.h */ > > 0xc76f > > > > #define NV50_DISP /* cl5070.h */ > > 0x5070 > > #define G82_DISP /* cl5070.h */ > > 0x8270 > > @@ -200,6 +201,7 @@ > > #define PASCAL_DMA_COPY_B > > 0xc1b5 > > #define VOLTA_DMA_COPY_A > > 0xc3b5 > > #define TURING_DMA_COPY_A > > 0xc5b5 > > +#define AMPERE_DMA_COPY_B > > 0xc7b5 > > > > #define FERMI_DECOMPRESS > > 0x90b8 > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > > b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > > index 54fab7cc36c1..64ee82c7c1be 100644 > > --- a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > > @@ -77,4 +77,5 @@ int gp100_fifo_new(struct nvkm_device *, enum > > nvkm_subdev_type, int inst, struct > > int gp10b_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > > struct nvkm_fifo **); > > int gv100_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > > struct nvkm_fifo **); > > int tu102_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > > struct nvkm_fifo **); > > +int ga102_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > > struct nvkm_fifo **); > > #endif > > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c > > b/drivers/gpu/drm/nouveau/nouveau_bo.c >
Re: [PATCH 2/2] drm/bridge: parade-ps8640: Add support for AUX channel
Quoting Philip Chen (2021-09-08 11:18:06) > diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c > b/drivers/gpu/drm/bridge/parade-ps8640.c > index a16725dbf912..3f0241a60357 100644 > --- a/drivers/gpu/drm/bridge/parade-ps8640.c > +++ b/drivers/gpu/drm/bridge/parade-ps8640.c > @@ -93,6 +115,102 @@ static inline struct ps8640 *bridge_to_ps8640(struct > drm_bridge *e) > return container_of(e, struct ps8640, bridge); > } > > +static inline struct ps8640 *aux_to_ps8640(struct drm_dp_aux *aux) > +{ > + return container_of(aux, struct ps8640, aux); > +} > + > +static ssize_t ps8640_aux_transfer(struct drm_dp_aux *aux, > + struct drm_dp_aux_msg *msg) > +{ > + struct ps8640 *ps_bridge = aux_to_ps8640(aux); > + struct i2c_client *client = ps_bridge->page[PAGE0_DP_CNTL]; > + struct regmap *map = ps_bridge->regmap[PAGE0_DP_CNTL]; > + unsigned int len = msg->size; > + unsigned int data; > + int ret; > + u8 request = msg->request & > +~(DP_AUX_I2C_MOT | DP_AUX_I2C_WRITE_STATUS_UPDATE); > + u8 *buf = msg->buffer; > + bool is_native_aux = false; > + > + if (len > DP_AUX_MAX_PAYLOAD_BYTES) > + return -EINVAL; > + > + pm_runtime_get_sync(&client->dev); Is this driver using runtime PM? Probably can't add this until it is actually runtime PM enabled. > + > + switch (request) { > + case DP_AUX_NATIVE_WRITE: > + case DP_AUX_NATIVE_READ: > + is_native_aux = true; > + case DP_AUX_I2C_WRITE: > + case DP_AUX_I2C_READ: > + regmap_write(map, PAGE0_AUXCH_CFG3, AUXCH_CFG3_RESET); > + break; > + default: > + ret = -EINVAL; > + goto exit; > + } > + > + /* Assume it's good */ > + msg->reply = 0; > + > + data = ((request << 4) & AUX_CMD_MASK) | > + ((msg->address >> 16) & AUX_ADDR_19_16_MASK); > + regmap_write(map, PAGE0_AUX_ADDR_23_16, data); > + data = (msg->address >> 8) & 0xff; > + regmap_write(map, PAGE0_AUX_ADDR_15_8, data); > + data = msg->address & 0xff; > + regmap_write(map, PAGE0_AUX_ADDR_7_0, msg->address & 0xff); Can we pack this into a three byte buffer and write it in one regmap_bulk_write()? That would be nice because it looks like the addresses are all next to each other in the i2c address space. > + > + data = (len - 1) & AUX_LENGTH_MASK; > + regmap_write(map, PAGE0_AUX_LENGTH, data); > + > + if (request == DP_AUX_NATIVE_WRITE || request == DP_AUX_I2C_WRITE) { > + ret = regmap_noinc_write(map, PAGE0_AUX_WDATA, buf, len); > + if (ret < 0) { > + DRM_ERROR("failed to write PAGE0_AUX_WDATA"); Needs a newline. > + goto exit; > + } > + } > + > + regmap_write(map, PAGE0_AUX_CTRL, AUX_START); > + > + regmap_read(map, PAGE0_AUX_STATUS, &data); > + switch (data & AUX_STATUS_MASK) { > + case AUX_STATUS_DEFER: > + if (is_native_aux) > + msg->reply |= DP_AUX_NATIVE_REPLY_DEFER; > + else > + msg->reply |= DP_AUX_I2C_REPLY_DEFER; > + goto exit; > + case AUX_STATUS_NACK: > + if (is_native_aux) > + msg->reply |= DP_AUX_NATIVE_REPLY_NACK; > + else > + msg->reply |= DP_AUX_I2C_REPLY_NACK; > + goto exit; > + case AUX_STATUS_TIMEOUT: > + ret = -ETIMEDOUT; > + goto exit; > + } > + > + if (request == DP_AUX_NATIVE_READ || request == DP_AUX_I2C_READ) { > + ret = regmap_noinc_read(map, PAGE0_AUX_RDATA, buf, len); > + if (ret < 0) > + DRM_ERROR("failed to read PAGE0_AUX_RDATA"); Needs a newline. > + } > + > +exit: > + pm_runtime_mark_last_busy(&client->dev); > + pm_runtime_put_autosuspend(&client->dev); > + > + if (ret) > + return ret; > + > + return len; > +} > + > static int ps8640_bridge_vdo_control(struct ps8640 *ps_bridge, > const enum ps8640_vdo_control ctrl) > {
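For reference, a minimal sketch of the single bulk write suggested above, assuming PAGE0_AUX_ADDR_23_16, PAGE0_AUX_ADDR_15_8 and PAGE0_AUX_ADDR_7_0 really are consecutive registers; the names and masks come from the patch itself, and this is illustrative only, not tested against hardware:

    /*
     * Hypothetical replacement for the three regmap_write() calls in
     * ps8640_aux_transfer(): pack the command nibble and the 20-bit AUX
     * address into one buffer and write the consecutive registers with a
     * single bulk transfer.
     */
    u8 addr_buf[3];

    addr_buf[0] = ((request << 4) & AUX_CMD_MASK) |
                  ((msg->address >> 16) & AUX_ADDR_19_16_MASK);
    addr_buf[1] = (msg->address >> 8) & 0xff;
    addr_buf[2] = msg->address & 0xff;

    ret = regmap_bulk_write(map, PAGE0_AUX_ADDR_23_16, addr_buf,
                            ARRAY_SIZE(addr_buf));
    if (ret < 0)
        goto exit;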
Re: [PATCH 1/2] drm/bridge: parade-ps8640: Use regmap APIs
Quoting Philip Chen (2021-09-08 11:18:05) > diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c > b/drivers/gpu/drm/bridge/parade-ps8640.c > index 685e9c38b2db..a16725dbf912 100644 > --- a/drivers/gpu/drm/bridge/parade-ps8640.c > +++ b/drivers/gpu/drm/bridge/parade-ps8640.c > @@ -64,12 +65,29 @@ struct ps8640 { > struct drm_bridge *panel_bridge; > struct mipi_dsi_device *dsi; > struct i2c_client *page[MAX_DEVS]; > + struct regmap *regmap[MAX_DEVS]; > struct regulator_bulk_data supplies[2]; > struct gpio_desc *gpio_reset; > struct gpio_desc *gpio_powerdown; > bool powered; > }; > > +static const struct regmap_range ps8640_volatile_ranges[] = { > + { .range_min = 0, .range_max = 0xff }, Is the plan to fill this out later or is 0xff the max register? If it's the latter then I think adding the max register to regmap_config is simpler. > +}; > + > +static const struct regmap_access_table ps8640_volatile_table = { > + .yes_ranges = ps8640_volatile_ranges, > + .n_yes_ranges = ARRAY_SIZE(ps8640_volatile_ranges), > +}; > + > +static const struct regmap_config ps8640_regmap_config = { > + .reg_bits = 8, > + .val_bits = 8, > + .volatile_table = &ps8640_volatile_table, > + .cache_type = REGCACHE_NONE, > +}; > + > static inline struct ps8640 *bridge_to_ps8640(struct drm_bridge *e) > { > return container_of(e, struct ps8640, bridge); > @@ -78,13 +96,13 @@ static inline struct ps8640 *bridge_to_ps8640(struct > drm_bridge *e) > static int ps8640_bridge_vdo_control(struct ps8640 *ps_bridge, > const enum ps8640_vdo_control ctrl) > { > - struct i2c_client *client = ps_bridge->page[PAGE3_DSI_CNTL1]; > - u8 vdo_ctrl_buf[] = { VDO_CTL_ADD, ctrl }; > + struct regmap *map = ps_bridge->regmap[PAGE3_DSI_CNTL1]; > + u8 vdo_ctrl_buf[] = {VDO_CTL_ADD, ctrl}; Nitpick: Add a space after { and before }. > int ret; > > - ret = i2c_smbus_write_i2c_block_data(client, PAGE3_SET_ADD, > -sizeof(vdo_ctrl_buf), > -vdo_ctrl_buf); > + ret = regmap_bulk_write(map, PAGE3_SET_ADD, > + vdo_ctrl_buf, sizeof(vdo_ctrl_buf)); > + > if (ret < 0) { > DRM_ERROR("failed to %sable VDO: %d\n", > ctrl == ENABLE ? "en" : "dis", ret);
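As a point of comparison, a sketch of the simpler shape hinted at above if 0xff really is the last register: drop the volatile table and bound the map with max_register instead. Purely illustrative; whether that still matches the device's caching needs is for the driver author to judge:

    static const struct regmap_config ps8640_regmap_config = {
        .reg_bits = 8,
        .val_bits = 8,
        .max_register = 0xff,
        .cache_type = REGCACHE_NONE,
    };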
Re: [PATCH] drm: mxsfb: Fix NULL pointer dereference crash on unload
On 9/8/21 8:24 PM, Daniel Vetter wrote: On Tue, Sep 07, 2021 at 04:49:00AM +0200, Marek Vasut wrote: The mxsfb->crtc.funcs may already be NULL when unloading the driver, in which case calling mxsfb_irq_disable() via drm_irq_uninstall() from mxsfb_unload() leads to NULL pointer dereference. Since all we care about is masking the IRQ and mxsfb->base is still valid, just use that to clear and mask the IRQ. Fixes: ae1ed00932819 ("drm: mxsfb: Stop using DRM simple display pipeline helper") Signed-off-by: Marek Vasut Cc: Daniel Abrecht Cc: Emil Velikov Cc: Laurent Pinchart Cc: Sam Ravnborg Cc: Stefan Agner You probably want a drm_atomic_helper_shutdown instead of trying to do all that manually. We've also added a bunch more devm and drmm_ functions to automate the cleanup a lot more here, e.g. your drm_mode_config_cleanup is in the wrong place. Also I'm confused because I'm not even seeing this function anywhere in upstream. It is still here: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpu/drm/mxsfb/mxsfb_drv.c#n171 as of: 999569d59a0aa ("Add linux-next specific files for 20210908") Is there some other tree I should be looking at ?
Re: [PATCH] drm/bridge: ti-sn65dsi83: Check link status register after enabling the bridge
W dniu 08.09.2021 o 13:11, Dave Stevenson pisze: > Hi Marek and Andrzej > > On Tue, 7 Sept 2021 at 22:24, Marek Vasut wrote: >> On 9/7/21 7:29 PM, Andrzej Hajda wrote: >>> W dniu 07.09.2021 o 16:25, Marek Vasut pisze: On 9/7/21 9:31 AM, Andrzej Hajda wrote: > On 07.09.2021 04:39, Marek Vasut wrote: >> In rare cases, the bridge may not start up correctly, which usually >> leads to no display output. In case this happens, warn about it in >> the kernel log. >> >> Signed-off-by: Marek Vasut >> Cc: Jagan Teki >> Cc: Laurent Pinchart >> Cc: Linus Walleij >> Cc: Robert Foss >> Cc: Sam Ravnborg >> Cc: dri-devel@lists.freedesktop.org >> --- >> NOTE: See the following: >> https://e2e.ti.com/support/interface-group/interface/f/interface-forum/942005/sn65dsi83-dsi83-lvds-bridge---sporadic-behavior---no-video >> >> https://community.nxp.com/t5/i-MX-Processors/i-MX8M-MIPI-DSI-Interface-LVDS-Bridge-Initialization/td-p/1156533 >> >> --- >> drivers/gpu/drm/bridge/ti-sn65dsi83.c | 5 + >> 1 file changed, 5 insertions(+) >> >> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c >> b/drivers/gpu/drm/bridge/ti-sn65dsi83.c >> index a32f70bc68ea4..4ea71d7f0bfbc 100644 >> --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c >> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c >> @@ -520,6 +520,11 @@ static void sn65dsi83_atomic_enable(struct >> drm_bridge *bridge, >> /* Clear all errors that got asserted during initialization. */ >> regmap_read(ctx->regmap, REG_IRQ_STAT, &pval); >> regmap_write(ctx->regmap, REG_IRQ_STAT, pval); > > It does not look as correct error handling, maybe it would be good to > analyze and optionally report 'unexpected' errors here as well. The above is correct -- it clears the status register because the setup might've set random bits in that register. Then we wait a bit, let the link run, and read them again to get the real link status in this new piece of code below, hence the usleep_range there. And then if the link indicates a problem, we know it is a problem. >>> >>> Usually such registers are cleared on very beginning of the >>> initialization, and tested (via irq handler, or via reading), during >>> initalization, if initialization phase goes well. If it is not the case >>> forgive me. >> The init just flips the bit at random in the IRQ_STAT register, so no, >> that's not really viable here. That's why we clear them at the end, and >> then wait a bit, and then check whether something new appeared in them. >> >> If not, all is great. >> >> Sure, we could generate an IRQ, but then IRQ line is not always >> connected to this chip on all hardware I have available. So this gives >> the user at least some indication that something is wrong with their HW. >> >> + >> +usleep_range(1, 12000); >> +regmap_read(ctx->regmap, REG_IRQ_STAT, &pval); >> +if (pval) >> +dev_err(ctx->dev, "Unexpected link status 0x%02x\n", pval); > > I am not sure what is the case here but it looks like 'we do not know > what is going on, so let's add some diagnostic messages to gather info > and figure it out later'. That's pretty much the case, see the two links above in the NOTE section. If something goes wrong, we print the value for the user (usually developer) so they can fix their problems. We cannot do much better in the attach callback. The issue I ran into (and where this would be helpful information to me during debugging, since the issue happened real seldom, see also the NOTE links above) is that the DSI controller driver started streaming video on the data lanes before the DSI83 had a chance to initialize. 
This worked most of the time, except for a few exceptions here and there, where the video didn't start. This does set link status bits consistently. In the meantime, I fixed the controller driver (so far downstream, due to ongoing discussion). >>> >>> Maybe drm_connector_set_link_status_property(conn, >>> DRM_MODE_LINK_STATUS_BAD) would be usefule here. >> Hmm, this works on connector, the dsi83 is a bridge and it can be stuck >> between two other bridges. That doesn't seem like the right tool, no ? >> > Whole driver lacks IRQ handler which IMO could perform better diagnosis, > and I guess it could also help in recovery, but this is just my guess. > So if this patch is enough for now you can add: No, IRQ won't help you here, because by the time you get the IRQ, the DSI host already started streaming video on data lanes and you won't be able to correctly reinit the DSI83 unless you communicate to the DSI host that it should switch the data lanes back to LP11. And for that, there is a bigger chunk missing really. What needs to be added is a way for th
Re: [PATCH v3 10/16] drm/panel-simple: Non-eDP panels don't need "HPD" handling
Hi, On Sun, Sep 5, 2021 at 11:46 AM Sam Ravnborg wrote: > > On Wed, Sep 01, 2021 at 01:19:28PM -0700, Douglas Anderson wrote: > > All of the "HPD" handling added to panel-simple recently was for eDP > > panels. Remove it from panel-simple now that panel-simple-edp handles > > eDP panels. The "prepare_to_enable" delay only makes sense in the > > context of HPD, so remove it too. No non-eDP panels used it anyway. > > > > Signed-off-by: Douglas Anderson > > Maybe merge this with the patch that moved all the functionality > from panel-simple to panel-edp? Unless you feel strongly about it, I'm going to keep it separate still in the next version. To try to make diffing easier, I tried hard to make the minimal changes in the "split the driver in two" patch. -Doug
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Wed, Sep 8, 2021 at 9:36 PM Rob Clark wrote: > On Wed, Sep 8, 2021 at 11:49 AM Daniel Vetter wrote: > > On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote: > > > On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter wrote: > > > > > > > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > > > > > From: Rob Clark > > > > > > > > > > The initial purpose is for igt tests, but this would also be useful > > > > > for > > > > > compositors that wait until close to vblank deadline to make decisions > > > > > about which frame to show. > > > > > > > > > > Signed-off-by: Rob Clark > > > > > > > > Needs userspace and I think ideally also some igts to make sure it works > > > > and doesn't go boom. > > > > > > See cover-letter.. there are igt tests, although currently that is the > > > only user. > > > > Ah sorry missed that. It would be good to record that in the commit too > > that adds the uapi. git blame doesn't find cover letters at all, unlike on > > gitlab where you get the MR request with everything. > > > > Ok there is the Link: thing, but since that only points at the last > > version all the interesting discussion is still usually lost, so I tend to > > not bother looking there. > > > > > I'd be ok to otherwise initially restrict this and the sw_sync UABI > > > (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are > > > both needed by the igt tests > > > > Hm really awkward, uapi for igts in cross vendor stuff like this isn't > > great. I think hiding it in vgem is semi-ok (we have fences there > > already). But it's all a bit silly ... > > > > For the tests, should we instead have a selftest/Kunit thing to exercise > > this stuff? igt probably not quite the right thing. Or combine with a page > > flip if you want to test msm. > > Hmm, IIRC we have used CONFIG_BROKEN or something along those lines > for UABI in other places where we weren't willing to commit to yet? > > I suppose if we had to I could make this a sw_sync ioctl instead. But > OTOH there are kind of a limited # of ways this ioctl could look. And > we already know that at least some wayland compositors are going to > want this. Hm I was trying to think up a few ways this could work, but didn't come up with anything reasonable. Forcing the compositor to boost the entire chain (for gl composited primary plane fallback) is something the kernel can easily do too. Also only makes sense for priority boost, not so much for clock boosting, since clock boosting only really needs the final element to be boosted. > I guess I can look at non-igt options. But the igt test is already a > pretty convenient way to contrive situations (like loops, which is a > thing I need to add) Yeah it's definitely very useful for testing ... One option could be a hacky debugfs interface, where you write a fd number and deadline and the debugfs read function does the deadline setting. Horribly, but since it's debugfs no one ever cares. That's at least where we're hiding all the i915 hacks that igts need. 
-Daniel > BR, > -R > > > > -Daniel > > > > > > > > BR, > > > -R > > > > > > > -Daniel > > > > > > > > > --- > > > > > drivers/dma-buf/sync_file.c| 19 +++ > > > > > include/uapi/linux/sync_file.h | 20 > > > > > 2 files changed, 39 insertions(+) > > > > > > > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > > > > > index 394e6e1e9686..f295772d5169 100644 > > > > > --- a/drivers/dma-buf/sync_file.c > > > > > +++ b/drivers/dma-buf/sync_file.c > > > > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct > > > > > sync_file *sync_file, > > > > > return ret; > > > > > } > > > > > > > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > > > > > + unsigned long arg) > > > > > +{ > > > > > + struct sync_set_deadline ts; > > > > > + > > > > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > > > > > + return -EFAULT; > > > > > + > > > > > + if (ts.pad) > > > > > + return -EINVAL; > > > > > + > > > > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > > > > > ts.tv_nsec)); > > > > > + > > > > > + return 0; > > > > > +} > > > > > + > > > > > static long sync_file_ioctl(struct file *file, unsigned int cmd, > > > > > unsigned long arg) > > > > > { > > > > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, > > > > > unsigned int cmd, > > > > > case SYNC_IOC_FILE_INFO: > > > > > return sync_file_ioctl_fence_info(sync_file, arg); > > > > > > > > > > + case SYNC_IOC_SET_DEADLINE: > > > > > + return sync_file_ioctl_set_deadline(sync_file, arg); > > > > > + > > > > > default: > > > > > return -ENOTTY; > > > > > } > > > > > diff --git a/include/uapi/linux/sync_file.h > > > > > b/include/uapi/linux/sync_file.h > > > > > inde
Re: [PATCH 2/8] drm/i915/xehp: CCS shares the render reset domain
On Wed, Sep 08, 2021 at 11:07:07AM +0100, Tvrtko Ursulin wrote: > > On 07/09/2021 18:19, Matt Roper wrote: > > The reset domain is shared between render and all compute engines, > > so resetting one will affect the others. > > > > Note: Before performing a reset on an RCS or CCS engine, the GuC will > > attempt to preempt-to-idle the other non-hung RCS/CCS engines to avoid > > impacting other clients (since some shared modules will be reset). If > > other engines are executing non-preemptable workloads, the impact is > > unavoidable and some work may be lost. > > Since here it talks about engine reset, should this patch add warning if > same is attempted by i915 on a GuC platform - to document it is not Did you mean "on a *non* GuC platform" here? We aren't going to have compute engine support on any platforms where GuC submission isn't the default operating model, so the only way to get compute engines + execlist submission is to force an override via module parameters (e.g., enable_guc=0). Doing so will taint the kernel, so I think the current consensus from offline discussion is that the user has already put themselves into a configuration where it's easier than usual to shoot themselves in the foot; it's not too much different than the kind of trouble a user could get themselves into if they loaded the driver with hangcheck disabled or something. Matt > implemented/supported? Or perhaps later in the series, or future series > works better. > > Reviewed-by: Tvrtko Ursulin > > Regards, > > Tvrtko > > > Bspec: 52549 > > Original-patch-by: Michel Thierry > > Cc: Tvrtko Ursulin > > Cc: Vinay Belgaumkar > > Signed-off-by: Daniele Ceraolo Spurio > > Signed-off-by: Aravind Iddamsetty > > Signed-off-by: Matt Roper > > --- > > drivers/gpu/drm/i915/gt/intel_reset.c | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c > > b/drivers/gpu/drm/i915/gt/intel_reset.c > > index 91200c43951f..30598c1d070c 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_reset.c > > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c > > @@ -507,6 +507,10 @@ static int gen11_reset_engines(struct intel_gt *gt, > > [VECS1] = GEN11_GRDOM_VECS2, > > [VECS2] = GEN11_GRDOM_VECS3, > > [VECS3] = GEN11_GRDOM_VECS4, > > + [CCS0] = GEN11_GRDOM_RENDER, > > + [CCS1] = GEN11_GRDOM_RENDER, > > + [CCS2] = GEN11_GRDOM_RENDER, > > + [CCS3] = GEN11_GRDOM_RENDER, > > }; > > struct intel_engine_cs *engine; > > intel_engine_mask_t tmp; > > -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Wed, Sep 8, 2021 at 11:49 AM Daniel Vetter wrote: > > On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote: > > On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter wrote: > > > > > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > > > > From: Rob Clark > > > > > > > > The initial purpose is for igt tests, but this would also be useful for > > > > compositors that wait until close to vblank deadline to make decisions > > > > about which frame to show. > > > > > > > > Signed-off-by: Rob Clark > > > > > > Needs userspace and I think ideally also some igts to make sure it works > > > and doesn't go boom. > > > > See cover-letter.. there are igt tests, although currently that is the > > only user. > > Ah sorry missed that. It would be good to record that in the commit too > that adds the uapi. git blame doesn't find cover letters at all, unlike on > gitlab where you get the MR request with everything. > > Ok there is the Link: thing, but since that only points at the last > version all the interesting discussion is still usually lost, so I tend to > not bother looking there. > > > I'd be ok to otherwise initially restrict this and the sw_sync UABI > > (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are > > both needed by the igt tests > > Hm really awkward, uapi for igts in cross vendor stuff like this isn't > great. I think hiding it in vgem is semi-ok (we have fences there > already). But it's all a bit silly ... > > For the tests, should we instead have a selftest/Kunit thing to exercise > this stuff? igt probably not quite the right thing. Or combine with a page > flip if you want to test msm. Hmm, IIRC we have used CONFIG_BROKEN or something along those lines for UABI in other places where we weren't willing to commit to yet? I suppose if we had to I could make this a sw_sync ioctl instead. But OTOH there are kind of a limited # of ways this ioctl could look. And we already know that at least some wayland compositors are going to want this. I guess I can look at non-igt options. 
But the igt test is already a pretty convenient way to contrive situations (like loops, which is a thing I need to add) BR, -R > -Daniel > > > > > BR, > > -R > > > > > -Daniel > > > > > > > --- > > > > drivers/dma-buf/sync_file.c| 19 +++ > > > > include/uapi/linux/sync_file.h | 20 > > > > 2 files changed, 39 insertions(+) > > > > > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > > > > index 394e6e1e9686..f295772d5169 100644 > > > > --- a/drivers/dma-buf/sync_file.c > > > > +++ b/drivers/dma-buf/sync_file.c > > > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct > > > > sync_file *sync_file, > > > > return ret; > > > > } > > > > > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > > > > + unsigned long arg) > > > > +{ > > > > + struct sync_set_deadline ts; > > > > + > > > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > > > > + return -EFAULT; > > > > + > > > > + if (ts.pad) > > > > + return -EINVAL; > > > > + > > > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > > > > ts.tv_nsec)); > > > > + > > > > + return 0; > > > > +} > > > > + > > > > static long sync_file_ioctl(struct file *file, unsigned int cmd, > > > > unsigned long arg) > > > > { > > > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, > > > > unsigned int cmd, > > > > case SYNC_IOC_FILE_INFO: > > > > return sync_file_ioctl_fence_info(sync_file, arg); > > > > > > > > + case SYNC_IOC_SET_DEADLINE: > > > > + return sync_file_ioctl_set_deadline(sync_file, arg); > > > > + > > > > default: > > > > return -ENOTTY; > > > > } > > > > diff --git a/include/uapi/linux/sync_file.h > > > > b/include/uapi/linux/sync_file.h > > > > index ee2dcfb3d660..f67d4ffe7566 100644 > > > > --- a/include/uapi/linux/sync_file.h > > > > +++ b/include/uapi/linux/sync_file.h > > > > @@ -67,6 +67,18 @@ struct sync_file_info { > > > > __u64 sync_fence_info; > > > > }; > > > > > > > > +/** > > > > + * struct sync_set_deadline - set a deadline on a fence > > > > + * @tv_sec: seconds elapsed since epoch > > > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec > > > > + * @pad: must be zero > > > > + */ > > > > +struct sync_set_deadline { > > > > + __s64 tv_sec; > > > > + __s32 tv_nsec; > > > > + __u32 pad; > > > > +}; > > > > + > > > > #define SYNC_IOC_MAGIC '>' > > > > > > > > /** > > > > @@ -95,4 +107,12 @@ struct sync_file_info { > > > > */ > > > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct > > > > sync_file_info) > > > > > > > > + > > > > +/** > > > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence > > > > + * > >
[drm:i915-uncore-vfunc 30/31] drivers/gpu/drm/i915/selftests/mock_uncore.c:47:2: error: implicit declaration of function 'ASSIGN_RAW_WRITE_MMIO_VFUNCS'; did you mean 'MMIO_RAW_WRITE_VFUNCS'?
tree: git://people.freedesktop.org/~airlied/linux.git i915-uncore-vfunc head: b42168f90718a90b11f2d52306d9aeaa9468 commit: 99aebd17891290abfca80c48eca01f4e02413fb3 [30/31] drm/i915/uncore: constify the register vtables. config: i386-allyesconfig (attached as .config) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 reproduce (this is a W=1 build): git remote add drm git://people.freedesktop.org/~airlied/linux.git git fetch --no-tags drm i915-uncore-vfunc git checkout 99aebd17891290abfca80c48eca01f4e02413fb3 # save the attached .config to linux build tree make W=1 ARCH=i386 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): In file included from drivers/gpu/drm/i915/intel_uncore.c:2630: drivers/gpu/drm/i915/selftests/mock_uncore.c: In function 'mock_uncore_init': >> drivers/gpu/drm/i915/selftests/mock_uncore.c:47:2: error: implicit >> declaration of function 'ASSIGN_RAW_WRITE_MMIO_VFUNCS'; did you mean >> 'MMIO_RAW_WRITE_VFUNCS'? [-Werror=implicit-function-declaration] 47 | ASSIGN_RAW_WRITE_MMIO_VFUNCS(uncore, nop); | ^~~~ | MMIO_RAW_WRITE_VFUNCS >> drivers/gpu/drm/i915/selftests/mock_uncore.c:47:39: error: 'nop' undeclared >> (first use in this function); did you mean 'nopv'? 47 | ASSIGN_RAW_WRITE_MMIO_VFUNCS(uncore, nop); | ^~~ | nopv drivers/gpu/drm/i915/selftests/mock_uncore.c:47:39: note: each undeclared identifier is reported only once for each function it appears in >> drivers/gpu/drm/i915/selftests/mock_uncore.c:48:2: error: implicit >> declaration of function 'ASSIGN_RAW_READ_MMIO_VFUNCS' >> [-Werror=implicit-function-declaration] 48 | ASSIGN_RAW_READ_MMIO_VFUNCS(uncore, nop); | ^~~ At top level: >> drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: error: 'nop_read64' >> defined but not used [-Werror=unused-function] 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: note: in definition of macro '__nop_read' 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: error: 'nop_read32' >> defined but not used [-Werror=unused-function] 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: note: in definition of macro '__nop_read' 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: error: 'nop_read16' >> defined but not used [-Werror=unused-function] 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: note: in definition of macro '__nop_read' 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: error: 'nop_read8' >> defined but not used [-Werror=unused-function] 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: note: in definition of macro '__nop_read' 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: error: 'nop_write32' >> defined but not used [-Werror=unused-function] 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ 
drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: note: in definition of macro '__nop_write' 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: error: 'nop_write16' >> defined but not used [-Werror=unused-function] 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: note: in definition of macro '__nop_write' 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: error: 'nop_write8' >> defined but not used [-Werror=unused-function] 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: note: in def
Re: [PATCH 13/14] drm/kmb: Enable alpha blended second plane
Hi Thomas, On Wed, Sep 08, 2021 at 07:50:42PM +0200, Thomas Zimmermann wrote: > Hi > > Am 03.08.21 um 07:10 schrieb Sam Ravnborg: > > Hi Anitha, > > > > On Mon, Aug 02, 2021 at 08:44:26PM +, Chrisanthus, Anitha wrote: > > > Hi Sam, > > > Thanks. Where should this go, drm-misc-fixes or drm-misc-next? > > > > Looks like a drm-misc-next candidate to me. > > It may improve something for existing users, but it does not look like it > > fixes an existing bug. > > I found this patch in drm-misc-fixes, although it doesn't look like a > bugfix. It should have gone into drm-misc-next. See [1]. If it indeed > belongs into drm-misc-fixes, it certainly should have contained a Fixes tag. The patch fixes some warnings that have become errors during the last week. Anitha pinged me about it, but I failed to follow up. So in the end it was applied to shut up the warnings => errors. Sam
[PATCH] drm/nouveau/nvkm: Replace -ENOSYS with -ENODEV
nvkm test builds fail with the following error. drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c: In function 'nvkm_control_mthd_pstate_info': drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c:60:35: error: overflow in conversion from 'int' to '__s8' {aka 'signed char'} changes value from '-251' to '5' The code builds on most architectures, but fails on parisc where ENOSYS is defined as 251. Replace the error code with -ENODEV (-19). The actual error code does not really matter and is not passed to userspace - it just has to be negative. Fixes: 7238eca4cf18 ("drm/nouveau: expose pstate selection per-power source in sysfs") Signed-off-by: Guenter Roeck --- drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c index b0ece71aefde..ce774579c89d 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c @@ -57,7 +57,7 @@ nvkm_control_mthd_pstate_info(struct nvkm_control *ctrl, void *data, u32 size) args->v0.count = 0; args->v0.ustate_ac = NVIF_CONTROL_PSTATE_INFO_V0_USTATE_DISABLE; args->v0.ustate_dc = NVIF_CONTROL_PSTATE_INFO_V0_USTATE_DISABLE; - args->v0.pwrsrc = -ENOSYS; + args->v0.pwrsrc = -ENODEV; args->v0.pstate = NVIF_CONTROL_PSTATE_INFO_V0_PSTATE_UNKNOWN; } -- 2.33.0
[PATCH 1/2] drm/i915/gt: Queue and wait for the irq_work item.
Disabling interrupts and invoking the irq_work function directly breaks on PREEMPT_RT. PREEMPT_RT does not invoke all irq_work from hardirq context because some of the user have spinlock_t locking in the callback function. These locks are then turned into a sleeping locks which can not be acquired with disabled interrupts. Using irq_work_queue() has the benefit that the irqwork will be invoked in the regular context. In general there is "no" delay between enqueuing the callback and its invocation because the interrupt is raised right away on architectures which support it (which includes x86). Use irq_work_queue() + irq_work_sync() instead invoking the callback directly. Reported-by: Clark Williams Signed-off-by: Sebastian Andrzej Siewior --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 38cc42783dfb2..594dec2f76954 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -318,10 +318,9 @@ void __intel_breadcrumbs_park(struct intel_breadcrumbs *b) /* Kick the work once more to drain the signalers, and disarm the irq */ irq_work_sync(&b->irq_work); while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) { - local_irq_disable(); - signal_irq_work(&b->irq_work); - local_irq_enable(); + irq_work_queue(&b->irq_work); cond_resched(); + irq_work_sync(&b->irq_work); } } -- 2.33.0
[RFC PATCH 2/2] drm/i915/gt: Use spin_lock_irq() instead of local_irq_disable() + spin_lock()
execlists_dequeue() is invoked from a function which uses local_irq_disable() to disable interrupts so the spin_lock() behaves like spin_lock_irq(). This breaks PREEMPT_RT because local_irq_disable() + spin_lock() is not the same as spin_lock_irq(). execlists_dequeue_irq() and execlists_dequeue() has each one caller only. If intel_engine_cs::active::lock is acquired and released with the _irq suffix then it behaves almost as if execlists_dequeue() would be invoked with disabled interrupts. The difference is the last part of the function which is then invoked with enabled interrupts. I can't tell if this makes a difference. From looking at it, it might work to move the last unlock at the end of the function as I didn't find anything that would acquire the lock again. Reported-by: Clark Williams Signed-off-by: Sebastian Andrzej Siewior --- .../drm/i915/gt/intel_execlists_submission.c| 17 + 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index fc77592d88a96..2ec1dd352960b 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -1265,7 +1265,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * and context switches) submission. */ - spin_lock(&engine->active.lock); + spin_lock_irq(&engine->active.lock); /* * If the queue is higher priority than the last @@ -1365,7 +1365,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * Even if ELSP[1] is occupied and not worthy * of timeslices, our queue might be. */ - spin_unlock(&engine->active.lock); + spin_unlock_irq(&engine->active.lock); return; } } @@ -1391,7 +1391,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) if (last && !can_merge_rq(last, rq)) { spin_unlock(&ve->base.active.lock); - spin_unlock(&engine->active.lock); + spin_unlock_irq(&engine->active.lock); return; /* leave this for another sibling */ } @@ -1552,7 +1552,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * interrupt for secondary ports). */ execlists->queue_priority_hint = queue_prio(execlists); - spin_unlock(&engine->active.lock); + spin_unlock_irq(&engine->active.lock); /* * We can skip poking the HW if we ended up with exactly the same set @@ -1578,13 +1578,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine) } } -static void execlists_dequeue_irq(struct intel_engine_cs *engine) -{ - local_irq_disable(); /* Suspend interrupts across request submission */ - execlists_dequeue(engine); - local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */ -} - static void clear_ports(struct i915_request **ports, int count) { memset_p((void **)ports, NULL, count); @@ -2377,7 +2370,7 @@ static void execlists_submission_tasklet(struct tasklet_struct *t) } if (!engine->execlists.pending[0]) { - execlists_dequeue_irq(engine); + execlists_dequeue(engine); start_timeslice(engine); } -- 2.33.0
[PATCH 0/2] drm/i915/gt: Locking splats PREEMPT_RT
Clark Williams reported two issues with the i915 driver running on PREEMPT_RT. While #1 looks simple, I have no idea about #2, hence the RFC. Sebastian
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote: > On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter wrote: > > > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > > > From: Rob Clark > > > > > > The initial purpose is for igt tests, but this would also be useful for > > > compositors that wait until close to vblank deadline to make decisions > > > about which frame to show. > > > > > > Signed-off-by: Rob Clark > > > > Needs userspace and I think ideally also some igts to make sure it works > > and doesn't go boom. > > See cover-letter.. there are igt tests, although currently that is the > only user. Ah sorry missed that. It would be good to record that in the commit too that adds the uapi. git blame doesn't find cover letters at all, unlike on gitlab where you get the MR request with everything. Ok there is the Link: thing, but since that only points at the last version all the interesting discussion is still usually lost, so I tend to not bother looking there. > I'd be ok to otherwise initially restrict this and the sw_sync UABI > (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are > both needed by the igt tests Hm really awkward, uapi for igts in cross vendor stuff like this isn't great. I think hiding it in vgem is semi-ok (we have fences there already). But it's all a bit silly ... For the tests, should we instead have a selftest/Kunit thing to exercise this stuff? igt probably not quite the right thing. Or combine with a page flip if you want to test msm. -Daniel > > BR, > -R > > > -Daniel > > > > > --- > > > drivers/dma-buf/sync_file.c| 19 +++ > > > include/uapi/linux/sync_file.h | 20 > > > 2 files changed, 39 insertions(+) > > > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > > > index 394e6e1e9686..f295772d5169 100644 > > > --- a/drivers/dma-buf/sync_file.c > > > +++ b/drivers/dma-buf/sync_file.c > > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct > > > sync_file *sync_file, > > > return ret; > > > } > > > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > > > + unsigned long arg) > > > +{ > > > + struct sync_set_deadline ts; > > > + > > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > > > + return -EFAULT; > > > + > > > + if (ts.pad) > > > + return -EINVAL; > > > + > > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > > > ts.tv_nsec)); > > > + > > > + return 0; > > > +} > > > + > > > static long sync_file_ioctl(struct file *file, unsigned int cmd, > > > unsigned long arg) > > > { > > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, > > > unsigned int cmd, > > > case SYNC_IOC_FILE_INFO: > > > return sync_file_ioctl_fence_info(sync_file, arg); > > > > > > + case SYNC_IOC_SET_DEADLINE: > > > + return sync_file_ioctl_set_deadline(sync_file, arg); > > > + > > > default: > > > return -ENOTTY; > > > } > > > diff --git a/include/uapi/linux/sync_file.h > > > b/include/uapi/linux/sync_file.h > > > index ee2dcfb3d660..f67d4ffe7566 100644 > > > --- a/include/uapi/linux/sync_file.h > > > +++ b/include/uapi/linux/sync_file.h > > > @@ -67,6 +67,18 @@ struct sync_file_info { > > > __u64 sync_fence_info; > > > }; > > > > > > +/** > > > + * struct sync_set_deadline - set a deadline on a fence > > > + * @tv_sec: seconds elapsed since epoch > > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec > > > + * @pad: must be zero > > > + */ > > > +struct sync_set_deadline { > > > + __s64 tv_sec; > > > + __s32 tv_nsec; > > > 
+ __u32 pad; > > > +}; > > > + > > > #define SYNC_IOC_MAGIC '>' > > > > > > /** > > > @@ -95,4 +107,12 @@ struct sync_file_info { > > > */ > > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct > > > sync_file_info) > > > > > > + > > > +/** > > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence > > > + * > > > + * Allows userspace to set a deadline on a fence, see > > > dma_fence_set_deadline() > > > + */ > > > +#define SYNC_IOC_SET_DEADLINE_IOW(SYNC_IOC_MAGIC, 5, struct > > > sync_set_deadline) > > > + > > > #endif /* _UAPI_LINUX_SYNC_H */ > > > -- > > > 2.31.1 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
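For illustration, a hedged userspace sketch of how a compositor or an igt-style test might drive the proposed ioctl on a sync_file fd. The struct layout and ioctl number below simply mirror the uapi in the patch, and fence_fd is assumed to come from elsewhere (for example an out-fence from an earlier submission):

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/ioctl.h>
    #include <time.h>

    /* Mirrors the proposed uapi; use <linux/sync_file.h> once it lands. */
    struct sync_set_deadline {
        int64_t  tv_sec;
        int32_t  tv_nsec;
        uint32_t pad;
    };

    #define SYNC_IOC_MAGIC        '>'
    #define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)

    /* Ask the kernel to treat @deadline as the latest useful signal time. */
    static int set_fence_deadline(int fence_fd, const struct timespec *deadline)
    {
        struct sync_set_deadline ts;

        memset(&ts, 0, sizeof(ts));   /* keeps .pad zero, as required */
        ts.tv_sec  = deadline->tv_sec;
        ts.tv_nsec = deadline->tv_nsec;

        return ioctl(fence_fd, SYNC_IOC_SET_DEADLINE, &ts);
    }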
Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support
On Wed, Sep 08, 2021 at 11:19:15AM -0700, Rob Clark wrote: > On Wed, Sep 8, 2021 at 10:54 AM Daniel Vetter wrote: > > > > On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote: > > > From: Rob Clark > > > > > > Signed-off-by: Rob Clark > > > --- > > > drivers/dma-buf/dma-fence-chain.c | 13 + > > > 1 file changed, 13 insertions(+) > > > > > > diff --git a/drivers/dma-buf/dma-fence-chain.c > > > b/drivers/dma-buf/dma-fence-chain.c > > > index 1b4cb3e5cec9..736a9ad3ea6d 100644 > > > --- a/drivers/dma-buf/dma-fence-chain.c > > > +++ b/drivers/dma-buf/dma-fence-chain.c > > > @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence > > > *fence) > > > dma_fence_free(fence); > > > } > > > > > > + > > > +static void dma_fence_chain_set_deadline(struct dma_fence *fence, > > > + ktime_t deadline) > > > +{ > > > + dma_fence_chain_for_each(fence, fence) { > > > + struct dma_fence_chain *chain = to_dma_fence_chain(fence); > > > + struct dma_fence *f = chain ? chain->fence : fence; > > > > Doesn't this just end up calling set_deadline on a chain, potenetially > > resulting in recursion? Also I don't think this should ever happen, why > > did you add that? > > Tbh the fence-chain was the part I was a bit fuzzy about, and the main > reason I added igt tests. The iteration is similar to how, for ex, > dma_fence_chain_signaled() work, and according to the igt test it does > what was intended Huh indeed. Maybe something we should fix, like why does the dma_fence_chain_for_each not give you the upcast chain pointer ... I guess this also needs more Christian and less me. -Daniel > > BR, > -R > > > -Daniel > > > > > + > > > + dma_fence_set_deadline(f, deadline); > > > + } > > > +} > > > + > > > const struct dma_fence_ops dma_fence_chain_ops = { > > > .use_64bit_seqno = true, > > > .get_driver_name = dma_fence_chain_get_driver_name, > > > @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = { > > > .enable_signaling = dma_fence_chain_enable_signaling, > > > .signaled = dma_fence_chain_signaled, > > > .release = dma_fence_chain_release, > > > + .set_deadline = dma_fence_chain_set_deadline, > > > }; > > > EXPORT_SYMBOL(dma_fence_chain_ops); > > > > > > -- > > > 2.31.1 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm/plane-helper: fix uninitialized variable reference
On Tue, Sep 07, 2021 at 10:08:36AM -0400, Alex Xu (Hello71) wrote: > drivers/gpu/drm/drm_plane_helper.c: In function 'drm_primary_helper_update': > drivers/gpu/drm/drm_plane_helper.c:113:32: error: 'visible' is used > uninitialized [-Werror=uninitialized] > 113 | struct drm_plane_state plane_state = { > |^~~ > drivers/gpu/drm/drm_plane_helper.c:178:14: note: 'visible' was declared here > 178 | bool visible; > | ^~~ > cc1: all warnings being treated as errors > > visible is an output, not an input. in practice this use might turn out > OK but it's still UB. > > Fixes: df86af9133 ("drm/plane-helper: Add drm_plane_helper_check_state()") I need a signed-off-by from you before I can merge this. See https://dri.freedesktop.org/docs/drm/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin Patch lgtm otherwise. -Daniel > --- > drivers/gpu/drm/drm_plane_helper.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_plane_helper.c > b/drivers/gpu/drm/drm_plane_helper.c > index 5b2d0ca03705..838b32b70bce 100644 > --- a/drivers/gpu/drm/drm_plane_helper.c > +++ b/drivers/gpu/drm/drm_plane_helper.c > @@ -123,7 +123,6 @@ static int drm_plane_helper_check_update(struct drm_plane > *plane, > .crtc_w = drm_rect_width(dst), > .crtc_h = drm_rect_height(dst), > .rotation = rotation, > - .visible = *visible, > }; > struct drm_crtc_state crtc_state = { > .crtc = crtc, > -- > 2.33.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] kernel/locking: Add context to ww_mutex_trylock.
On Wed, Sep 08, 2021 at 12:14:23PM +0200, Peter Zijlstra wrote: > On Tue, Sep 07, 2021 at 03:20:44PM +0200, Maarten Lankhorst wrote: > > i915 will soon gain an eviction path that trylock a whole lot of locks > > for eviction, getting dmesg failures like below: > > > > BUG: MAX_LOCK_DEPTH too low! > > turning off the locking correctness validator. > > depth: 48 max: 48! > > 48 locks held by i915_selftest/5776: > > #0: 888101a79240 (&dev->mutex){}-{3:3}, at: > > __driver_attach+0x88/0x160 > > #1: c99778c0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: > > i915_vma_pin.constprop.63+0x39/0x1b0 [i915] > > #2: 88800cf74de8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_vma_pin.constprop.63+0x5f/0x1b0 [i915] > > #3: 88810c7f9e38 (&vm->mutex/1){+.+.}-{3:3}, at: > > i915_vma_pin_ww+0x1c4/0x9d0 [i915] > > #4: 88810bad5768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_gem_evict_something+0x110/0x860 [i915] > > #5: 88810bad60e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_gem_evict_something+0x110/0x860 [i915] > > ... > > #46: 88811964d768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_gem_evict_something+0x110/0x860 [i915] > > #47: 88811964e0e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_gem_evict_something+0x110/0x860 [i915] > > INFO: lockdep is turned off. > > > As an intermediate solution, add an acquire context to ww_mutex_trylock, > > which allows us to do proper nesting annotations on the trylocks, making > > the above lockdep splat disappear. > > Fair enough I suppose. What's maybe missing from the commit message - we'll probably use this for ttm too eventually - even when we add full ww_mutex locking we'll still have the trylock fastpath. This is because we have a lock inversion against list locks in these eviction paths, and the slow path unroll to drop that list lock is a bit nasty (and defintely expensive). iow even long term this here is needed in some form I think. -Daniel > > > +/** > > + * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire > > context > > + * @lock: mutex to lock > > + * @ctx: optional w/w acquire context > > + * > > + * Trylocks a mutex with the optional acquire context; no deadlock > > detection is > > + * possible. Returns 1 if the mutex has been acquired successfully, 0 > > otherwise. > > + * > > + * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a > > @ctx is > > + * specified, -EALREADY and -EDEADLK handling may happen in calls to > > ww_mutex_lock. > > + * > > + * A mutex acquired with this function must be released with > > ww_mutex_unlock. > > + */ > > +int __sched > > +ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ctx) > > +{ > > + bool locked; > > + > > + if (!ctx) > > + return mutex_trylock(&ww->base); > > + > > +#ifdef CONFIG_DEBUG_MUTEXES > > + DEBUG_LOCKS_WARN_ON(ww->base.magic != &ww->base); > > +#endif > > + > > + preempt_disable(); > > + locked = __mutex_trylock(&ww->base); > > + > > + if (locked) { > > + ww_mutex_set_context_fastpath(ww, ctx); > > + mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ctx->dep_map, > > _RET_IP_); > > + } > > + preempt_enable(); > > + > > + return locked; > > +} > > +EXPORT_SYMBOL(ww_mutex_trylock); > > You'll need a similar hunk in ww_rt_mutex.c -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
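To make the intended use a bit more concrete, here is a hedged sketch of the sort of eviction loop this trylock is aimed at. Only ww_mutex_trylock(), ww_mutex_unlock() and the ww_acquire_ctx are real API; struct evict_obj and evict_one() are made up for illustration:

    struct evict_obj {
        struct ww_mutex lock;
        struct list_head link;
    };

    /*
     * Trylock each candidate under the caller's acquire context so the
     * locks carry proper ww/ctx nesting annotations for lockdep, as the
     * commit message describes; contended objects are simply skipped.
     */
    static void evict_candidates(struct list_head *candidates,
                                 struct ww_acquire_ctx *ctx)
    {
        struct evict_obj *obj, *tmp;

        list_for_each_entry_safe(obj, tmp, candidates, link) {
            if (!ww_mutex_trylock(&obj->lock, ctx))
                continue;           /* busy, leave it for someone else */

            evict_one(obj);         /* hypothetical helper, may unlink obj */
            ww_mutex_unlock(&obj->lock);
        }
    }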
Re: [resend PATCH] drm/ttm: Fix a deadlock if the target BO is not idle during swap
On Tue, Sep 07, 2021 at 11:28:23AM +0200, Christian König wrote: > Am 07.09.21 um 11:05 schrieb Daniel Vetter: > > On Tue, Sep 07, 2021 at 08:22:20AM +0200, Christian König wrote: > > > Added a Fixes tag and pushed this to drm-misc-fixes. > > We're in the merge window, this should have been drm-misc-next-fixes. I'll > > poke misc maintainers so it's not lost. > > Hui? It's a fix for a problem in stable and not in drm-misc-next. Ah the flow chart is confusing. There is no current -rc, so it's always -next-fixes. Or you're running the risk that it's lost until after -rc1. Maybe we should clarify that "is the bug in current -rc?" only applies if there is a current -rc. Anyway Thomas sent out a pr, so it's all good. -Daniel > > Christian. > > > -Daniel > > > > > It will take a while until it cycles back into the development branches, > > > so > > > feel free to push some version to amd-staging-drm-next as well. Just ping > > > Alex when you do this. > > > > > > Thanks, > > > Christian. > > > > > > Am 07.09.21 um 06:08 schrieb xinhui pan: > > > > The ret value might be -EBUSY, caller will think lru lock is still > > > > locked but actually NOT. So return -ENOSPC instead. Otherwise we hit > > > > list corruption. > > > > > > > > ttm_bo_cleanup_refs might fail too if BO is not idle. If we return 0, > > > > caller(ttm_tt_populate -> ttm_global_swapout ->ttm_device_swapout) will > > > > be stuck as we actually did not free any BO memory. This usually happens > > > > when the fence is not signaled for a long time. > > > > > > > > Signed-off-by: xinhui pan > > > > Reviewed-by: Christian König > > > > --- > > > >drivers/gpu/drm/ttm/ttm_bo.c | 6 +++--- > > > >1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > > > > index 8d7fd65ccced..23f906941ac9 100644 > > > > --- a/drivers/gpu/drm/ttm/ttm_bo.c > > > > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > > > > @@ -1152,9 +1152,9 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, > > > > struct ttm_operation_ctx *ctx, > > > > } > > > > if (bo->deleted) { > > > > - ttm_bo_cleanup_refs(bo, false, false, locked); > > > > + ret = ttm_bo_cleanup_refs(bo, false, false, locked); > > > > ttm_bo_put(bo); > > > > - return 0; > > > > + return ret == -EBUSY ? -ENOSPC : ret; > > > > } > > > > ttm_bo_del_from_lru(bo); > > > > @@ -1208,7 +1208,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, > > > > struct ttm_operation_ctx *ctx, > > > > if (locked) > > > > dma_resv_unlock(bo->base.resv); > > > > ttm_bo_put(bo); > > > > - return ret; > > > > + return ret == -EBUSY ? -ENOSPC : ret; > > > >} > > > >void ttm_bo_tt_destroy(struct ttm_buffer_object *bo) > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm: mxsfb: Fix NULL pointer dereference crash on unload
On Tue, Sep 07, 2021 at 04:49:00AM +0200, Marek Vasut wrote: > The mxsfb->crtc.funcs may already be NULL when unloading the driver, > in which case calling mxsfb_irq_disable() via drm_irq_uninstall() from > mxsfb_unload() leads to NULL pointer dereference. > > Since all we care about is masking the IRQ and mxsfb->base is still > valid, just use that to clear and mask the IRQ. > > Fixes: ae1ed00932819 ("drm: mxsfb: Stop using DRM simple display pipeline > helper") > Signed-off-by: Marek Vasut > Cc: Daniel Abrecht > Cc: Emil Velikov > Cc: Laurent Pinchart > Cc: Sam Ravnborg > Cc: Stefan Agner You probably want a drm_atomic_helper_shutdown instead of trying to do all that manually. We've also added a bunch more devm and drmm_ functions to automate the cleanup a lot more here, e.g. your drm_mode_config_cleanup is in the wrong place. Also I'm confused because I'm not even seeing this function anywhere in upstream. -Daniel > --- > drivers/gpu/drm/mxsfb/mxsfb_drv.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/mxsfb/mxsfb_drv.c > b/drivers/gpu/drm/mxsfb/mxsfb_drv.c > index ec0432fe1bdf8..86d78634a9799 100644 > --- a/drivers/gpu/drm/mxsfb/mxsfb_drv.c > +++ b/drivers/gpu/drm/mxsfb/mxsfb_drv.c > @@ -173,7 +173,11 @@ static void mxsfb_irq_disable(struct drm_device *drm) > struct mxsfb_drm_private *mxsfb = drm->dev_private; > > mxsfb_enable_axi_clk(mxsfb); > - mxsfb->crtc.funcs->disable_vblank(&mxsfb->crtc); > + > + /* Disable and clear VBLANK IRQ */ > + writel(CTRL1_CUR_FRAME_DONE_IRQ_EN, mxsfb->base + LCDC_CTRL1 + REG_CLR); > + writel(CTRL1_CUR_FRAME_DONE_IRQ, mxsfb->base + LCDC_CTRL1 + REG_CLR); > + > mxsfb_disable_axi_clk(mxsfb); > } > > -- > 2.33.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
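For what it's worth, a minimal sketch of the drm_atomic_helper_shutdown() approach suggested above, assuming the unload path still has the drm_device at hand; this is only the shape of it, not a drop-in replacement for the mxsfb teardown:

    static void mxsfb_unload(struct drm_device *drm)
    {
        /*
         * Quiesce the pipeline through the atomic helpers instead of
         * poking CTRL1 by hand: all planes and CRTCs are disabled via a
         * regular atomic commit, so the vblank IRQ ends up masked the
         * same way it would be at runtime.
         */
        drm_atomic_helper_shutdown(drm);

        /* ... remaining teardown: irq, clocks, mode config ... */
    }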
Re: [PATCH 1/2] drm/nouveau/ga102-: support ttm buffer moves via copy engine
On Mon, Sep 06, 2021 at 10:56:27AM +1000, Ben Skeggs wrote: > From: Ben Skeggs > > We don't currently have any kind of real acceleration on Ampere GPUs, > but the TTM memcpy() fallback paths aren't really designed to handle > copies between different devices, such as on Optimus systems, and > result in a kernel OOPS. Is this just for moving a buffer from vram to system memory when you pin it for dma-buf? I'm kinda lost what you even use ttm bo moves for if there's no one using the gpu. Also I guess memcpy goes boom if you can't mmap it because it's outside the gart? Or just that it's very slow. We're trying to use ttm memcyp as fallback, so want to know how this can all go wrong :-) -Daniel > > A few options were investigated to try and fix this, but didn't work > out, and likely would have resulted in a very unpleasant experience > for users anyway. > > This commit adds just enough support for setting up a single channel > connected to a copy engine, which the kernel can use to accelerate > the buffer copies between devices. Userspace has no access to this > incomplete channel support, but it's suitable for TTM's needs. > > A more complete implementation of host(fifo) for Ampere GPUs is in > the works, but the required changes are far too invasive that they > would be unsuitable to backport to fix this issue on current kernels. > > Signed-off-by: Ben Skeggs > Cc: Lyude Paul > Cc: Karol Herbst > Cc: # v5.12+ > --- > drivers/gpu/drm/nouveau/include/nvif/class.h | 2 + > .../drm/nouveau/include/nvkm/engine/fifo.h| 1 + > drivers/gpu/drm/nouveau/nouveau_bo.c | 1 + > drivers/gpu/drm/nouveau/nouveau_chan.c| 6 +- > drivers/gpu/drm/nouveau/nouveau_drm.c | 4 + > drivers/gpu/drm/nouveau/nv84_fence.c | 2 +- > .../gpu/drm/nouveau/nvkm/engine/device/base.c | 3 + > .../gpu/drm/nouveau/nvkm/engine/fifo/Kbuild | 1 + > .../gpu/drm/nouveau/nvkm/engine/fifo/ga102.c | 308 ++ > .../gpu/drm/nouveau/nvkm/subdev/top/ga100.c | 7 +- > 10 files changed, 329 insertions(+), 6 deletions(-) > create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fifo/ga102.c > > diff --git a/drivers/gpu/drm/nouveau/include/nvif/class.h > b/drivers/gpu/drm/nouveau/include/nvif/class.h > index c68cc957248e..a582c0cb0cb0 100644 > --- a/drivers/gpu/drm/nouveau/include/nvif/class.h > +++ b/drivers/gpu/drm/nouveau/include/nvif/class.h > @@ -71,6 +71,7 @@ > #define PASCAL_CHANNEL_GPFIFO_A /* cla06f.h */ > 0xc06f > #define VOLTA_CHANNEL_GPFIFO_A/* clc36f.h */ > 0xc36f > #define TURING_CHANNEL_GPFIFO_A /* clc36f.h */ > 0xc46f > +#define AMPERE_CHANNEL_GPFIFO_B /* clc36f.h */ > 0xc76f > > #define NV50_DISP /* cl5070.h */ > 0x5070 > #define G82_DISP /* cl5070.h */ > 0x8270 > @@ -200,6 +201,7 @@ > #define PASCAL_DMA_COPY_B > 0xc1b5 > #define VOLTA_DMA_COPY_A > 0xc3b5 > #define TURING_DMA_COPY_A > 0xc5b5 > +#define AMPERE_DMA_COPY_B > 0xc7b5 > > #define FERMI_DECOMPRESS > 0x90b8 > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > index 54fab7cc36c1..64ee82c7c1be 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > @@ -77,4 +77,5 @@ int gp100_fifo_new(struct nvkm_device *, enum > nvkm_subdev_type, int inst, struct > int gp10b_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > struct nvkm_fifo **); > int gv100_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > struct nvkm_fifo **); > int tu102_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > struct nvkm_fifo 
**); > +int ga102_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > struct nvkm_fifo **); > #endif > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c > b/drivers/gpu/drm/nouveau/nouveau_bo.c > index 4a7cebac8060..b3e4f555fa05 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > @@ -844,6 +844,7 @@ nouveau_bo_move_init(struct nouveau_drm *drm) > struct ttm_resource *, struct ttm_resource *); > int (*init)(struct nouveau_channel *, u32 handle); > } _methods[] = { > + { "COPY", 4, 0xc7b5, nve0_bo_move_copy, nve0_bo_move_init }, > { "COPY", 4, 0xc5b5, nve0_bo_move_copy, nve0_bo_move_init }, > { "GRCE", 0, 0xc5b5, nve0_bo_move_copy, nvc0_bo_move_init }, > { "COPY", 4, 0xc3b5, nve0_bo_move_copy, nve0_bo_move_i
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter wrote: > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > > From: Rob Clark > > > > The initial purpose is for igt tests, but this would also be useful for > > compositors that wait until close to vblank deadline to make decisions > > about which frame to show. > > > > Signed-off-by: Rob Clark > > Needs userspace and I think ideally also some igts to make sure it works > and doesn't go boom. See cover-letter.. there are igt tests, although currently that is the only user. I'd be ok to otherwise initially restrict this and the sw_sync UABI (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are both needed by the igt tests BR, -R > -Daniel > > > --- > > drivers/dma-buf/sync_file.c| 19 +++ > > include/uapi/linux/sync_file.h | 20 > > 2 files changed, 39 insertions(+) > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > > index 394e6e1e9686..f295772d5169 100644 > > --- a/drivers/dma-buf/sync_file.c > > +++ b/drivers/dma-buf/sync_file.c > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct > > sync_file *sync_file, > > return ret; > > } > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > > + unsigned long arg) > > +{ > > + struct sync_set_deadline ts; > > + > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > > + return -EFAULT; > > + > > + if (ts.pad) > > + return -EINVAL; > > + > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > > ts.tv_nsec)); > > + > > + return 0; > > +} > > + > > static long sync_file_ioctl(struct file *file, unsigned int cmd, > > unsigned long arg) > > { > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned > > int cmd, > > case SYNC_IOC_FILE_INFO: > > return sync_file_ioctl_fence_info(sync_file, arg); > > > > + case SYNC_IOC_SET_DEADLINE: > > + return sync_file_ioctl_set_deadline(sync_file, arg); > > + > > default: > > return -ENOTTY; > > } > > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h > > index ee2dcfb3d660..f67d4ffe7566 100644 > > --- a/include/uapi/linux/sync_file.h > > +++ b/include/uapi/linux/sync_file.h > > @@ -67,6 +67,18 @@ struct sync_file_info { > > __u64 sync_fence_info; > > }; > > > > +/** > > + * struct sync_set_deadline - set a deadline on a fence > > + * @tv_sec: seconds elapsed since epoch > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec > > + * @pad: must be zero > > + */ > > +struct sync_set_deadline { > > + __s64 tv_sec; > > + __s32 tv_nsec; > > + __u32 pad; > > +}; > > + > > #define SYNC_IOC_MAGIC '>' > > > > /** > > @@ -95,4 +107,12 @@ struct sync_file_info { > > */ > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct > > sync_file_info) > > > > + > > +/** > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence > > + * > > + * Allows userspace to set a deadline on a fence, see > > dma_fence_set_deadline() > > + */ > > +#define SYNC_IOC_SET_DEADLINE_IOW(SYNC_IOC_MAGIC, 5, struct > > sync_set_deadline) > > + > > #endif /* _UAPI_LINUX_SYNC_H */ > > -- > > 2.31.1 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
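Since userspace for the new ioctl keeps coming up in this thread, here is a minimal sketch of what a caller (an igt test or a compositor) could look like. The structure layout and ioctl number are copied from the patch quoted above (a real build would get them from the updated <linux/sync_file.h>); the helper name is made up, and CLOCK_MONOTONIC is an assumption based on the fence side comparing deadlines against ktime_get() elsewhere in this series.

/*
 * Minimal userspace sketch: set a deadline ns_from_now nanoseconds out on a
 * sync_file fd.  Error handling reduced to the bare minimum.
 */
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

struct sync_set_deadline {
        int64_t  tv_sec;
        int32_t  tv_nsec;
        uint32_t pad;           /* must be zero */
};

#define SYNC_IOC_MAGIC          '>'
#define SYNC_IOC_SET_DEADLINE   _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)

/* Ask the signaler to try to finish within ns_from_now nanoseconds. */
static int sync_file_set_deadline(int sync_fd, int64_t ns_from_now)
{
        struct sync_set_deadline arg;
        struct timespec now;
        int64_t ns;

        clock_gettime(CLOCK_MONOTONIC, &now);
        ns = now.tv_nsec + ns_from_now;

        memset(&arg, 0, sizeof(arg));
        arg.tv_sec = now.tv_sec + ns / 1000000000LL;
        arg.tv_nsec = ns % 1000000000LL;

        return ioctl(sync_fd, SYNC_IOC_SET_DEADLINE, &arg);
}

A compositor would call something like this on the fence of the frame it is about to pick, roughly one refresh period before its flip deadline.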
[PATCH 2/2] drm/bridge: parade-ps8640: Add support for AUX channel
Implement the first version of AUX support, which will be useful as we expand the driver to support varied use cases. Signed-off-by: Philip Chen --- drivers/gpu/drm/bridge/parade-ps8640.c | 123 + 1 file changed, 123 insertions(+) diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c b/drivers/gpu/drm/bridge/parade-ps8640.c index a16725dbf912..3f0241a60357 100644 --- a/drivers/gpu/drm/bridge/parade-ps8640.c +++ b/drivers/gpu/drm/bridge/parade-ps8640.c @@ -9,15 +9,36 @@ #include #include #include +#include #include #include #include +#include #include #include #include #include +#define PAGE0_AUXCH_CFG3 0x76 +#define AUXCH_CFG3_RESET 0xff +#define PAGE0_AUX_ADDR_7_0 0x7d +#define PAGE0_AUX_ADDR_15_80x7e +#define PAGE0_AUX_ADDR_23_16 0x7f +#define AUX_ADDR_19_16_MASK GENMASK(3, 0) +#define AUX_CMD_MASK GENMASK(7, 4) +#define PAGE0_AUX_LENGTH 0x80 +#define AUX_LENGTH_MASK GENMASK(3, 0) +#define PAGE0_AUX_WDATA0x81 +#define PAGE0_AUX_RDATA0x82 +#define PAGE0_AUX_CTRL 0x83 +#define AUX_START 0x01 +#define PAGE0_AUX_STATUS 0x84 +#define AUX_STATUS_MASK GENMASK(7, 5) +#define AUX_STATUS_TIMEOUT(0x7 << 5) +#define AUX_STATUS_DEFER (0x2 << 5) +#define AUX_STATUS_NACK (0x1 << 5) + #define PAGE2_GPIO_H 0xa7 #define PS_GPIO9 BIT(1) #define PAGE2_I2C_BYPASS 0xea @@ -63,6 +84,7 @@ enum ps8640_vdo_control { struct ps8640 { struct drm_bridge bridge; struct drm_bridge *panel_bridge; + struct drm_dp_aux aux; struct mipi_dsi_device *dsi; struct i2c_client *page[MAX_DEVS]; struct regmap *regmap[MAX_DEVS]; @@ -93,6 +115,102 @@ static inline struct ps8640 *bridge_to_ps8640(struct drm_bridge *e) return container_of(e, struct ps8640, bridge); } +static inline struct ps8640 *aux_to_ps8640(struct drm_dp_aux *aux) +{ + return container_of(aux, struct ps8640, aux); +} + +static ssize_t ps8640_aux_transfer(struct drm_dp_aux *aux, + struct drm_dp_aux_msg *msg) +{ + struct ps8640 *ps_bridge = aux_to_ps8640(aux); + struct i2c_client *client = ps_bridge->page[PAGE0_DP_CNTL]; + struct regmap *map = ps_bridge->regmap[PAGE0_DP_CNTL]; + unsigned int len = msg->size; + unsigned int data; + int ret; + u8 request = msg->request & +~(DP_AUX_I2C_MOT | DP_AUX_I2C_WRITE_STATUS_UPDATE); + u8 *buf = msg->buffer; + bool is_native_aux = false; + + if (len > DP_AUX_MAX_PAYLOAD_BYTES) + return -EINVAL; + + pm_runtime_get_sync(&client->dev); + + switch (request) { + case DP_AUX_NATIVE_WRITE: + case DP_AUX_NATIVE_READ: + is_native_aux = true; + case DP_AUX_I2C_WRITE: + case DP_AUX_I2C_READ: + regmap_write(map, PAGE0_AUXCH_CFG3, AUXCH_CFG3_RESET); + break; + default: + ret = -EINVAL; + goto exit; + } + + /* Assume it's good */ + msg->reply = 0; + + data = ((request << 4) & AUX_CMD_MASK) | + ((msg->address >> 16) & AUX_ADDR_19_16_MASK); + regmap_write(map, PAGE0_AUX_ADDR_23_16, data); + data = (msg->address >> 8) & 0xff; + regmap_write(map, PAGE0_AUX_ADDR_15_8, data); + data = msg->address & 0xff; + regmap_write(map, PAGE0_AUX_ADDR_7_0, msg->address & 0xff); + + data = (len - 1) & AUX_LENGTH_MASK; + regmap_write(map, PAGE0_AUX_LENGTH, data); + + if (request == DP_AUX_NATIVE_WRITE || request == DP_AUX_I2C_WRITE) { + ret = regmap_noinc_write(map, PAGE0_AUX_WDATA, buf, len); + if (ret < 0) { + DRM_ERROR("failed to write PAGE0_AUX_WDATA"); + goto exit; + } + } + + regmap_write(map, PAGE0_AUX_CTRL, AUX_START); + + regmap_read(map, PAGE0_AUX_STATUS, &data); + switch (data & AUX_STATUS_MASK) { + case AUX_STATUS_DEFER: + if (is_native_aux) + msg->reply |= DP_AUX_NATIVE_REPLY_DEFER; + else + msg->reply |= DP_AUX_I2C_REPLY_DEFER; + goto exit; + case 
AUX_STATUS_NACK: + if (is_native_aux) + msg->reply |= DP_AUX_NATIVE_REPLY_NACK; + else + msg->reply |= DP_AUX_I2C_REPLY_NACK; + goto exit; + case AUX_STATUS_TIMEOUT: + ret = -ETIMEDOUT; + goto exit; + } + + if (request == DP_AUX_NATIVE_READ || request == DP_AUX_I2C_READ) { + ret = regmap_noinc_read(map, PAGE0_AUX_RDATA, buf, len); + if (ret < 0) + DRM_ERROR("failed to read PAGE0_AUX_RDATA"); + } + +exit: + pm_runtime_mark_last_busy
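For context, this is roughly how the transfer hook above gets wired into the DRM DP helpers and then used. The registration step sits in the part of the patch that is truncated above, so the code below is a sketch of the usual drm_dp_aux pattern rather than a quote of the actual patch; the function names are illustrative.

#include <drm/drm_dp_helper.h>

static void ps8640_aux_setup(struct ps8640 *ps_bridge, struct device *dev)
{
        ps_bridge->aux.name = "parade-ps8640-aux";
        ps_bridge->aux.dev = dev;
        ps_bridge->aux.transfer = ps8640_aux_transfer;
        drm_dp_aux_init(&ps_bridge->aux);
        /* drm_dp_aux_register() follows later, from bridge attach, once
         * aux.drm_dev can point at the DRM device. */
}

/* Consumers then go through the generic DPCD helpers, which land in
 * ps8640_aux_transfer() above: */
static int ps8640_read_dpcd_rev(struct ps8640 *ps_bridge, u8 *rev)
{
        ssize_t ret = drm_dp_dpcd_readb(&ps_bridge->aux, DP_DPCD_REV, rev);

        return ret < 0 ? ret : 0;
}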
[PATCH 1/2] drm/bridge: parade-ps8640: Use regmap APIs
Replace the direct i2c access (i2c_smbus_* functions) with regmap APIs, which will simplify the future update on ps8640 driver. Signed-off-by: Philip Chen --- drivers/gpu/drm/bridge/parade-ps8640.c | 66 +++--- 1 file changed, 39 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c b/drivers/gpu/drm/bridge/parade-ps8640.c index 685e9c38b2db..a16725dbf912 100644 --- a/drivers/gpu/drm/bridge/parade-ps8640.c +++ b/drivers/gpu/drm/bridge/parade-ps8640.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include @@ -64,12 +65,29 @@ struct ps8640 { struct drm_bridge *panel_bridge; struct mipi_dsi_device *dsi; struct i2c_client *page[MAX_DEVS]; + struct regmap *regmap[MAX_DEVS]; struct regulator_bulk_data supplies[2]; struct gpio_desc *gpio_reset; struct gpio_desc *gpio_powerdown; bool powered; }; +static const struct regmap_range ps8640_volatile_ranges[] = { + { .range_min = 0, .range_max = 0xff }, +}; + +static const struct regmap_access_table ps8640_volatile_table = { + .yes_ranges = ps8640_volatile_ranges, + .n_yes_ranges = ARRAY_SIZE(ps8640_volatile_ranges), +}; + +static const struct regmap_config ps8640_regmap_config = { + .reg_bits = 8, + .val_bits = 8, + .volatile_table = &ps8640_volatile_table, + .cache_type = REGCACHE_NONE, +}; + static inline struct ps8640 *bridge_to_ps8640(struct drm_bridge *e) { return container_of(e, struct ps8640, bridge); @@ -78,13 +96,13 @@ static inline struct ps8640 *bridge_to_ps8640(struct drm_bridge *e) static int ps8640_bridge_vdo_control(struct ps8640 *ps_bridge, const enum ps8640_vdo_control ctrl) { - struct i2c_client *client = ps_bridge->page[PAGE3_DSI_CNTL1]; - u8 vdo_ctrl_buf[] = { VDO_CTL_ADD, ctrl }; + struct regmap *map = ps_bridge->regmap[PAGE3_DSI_CNTL1]; + u8 vdo_ctrl_buf[] = {VDO_CTL_ADD, ctrl}; int ret; - ret = i2c_smbus_write_i2c_block_data(client, PAGE3_SET_ADD, -sizeof(vdo_ctrl_buf), -vdo_ctrl_buf); + ret = regmap_bulk_write(map, PAGE3_SET_ADD, + vdo_ctrl_buf, sizeof(vdo_ctrl_buf)); + if (ret < 0) { DRM_ERROR("failed to %sable VDO: %d\n", ctrl == ENABLE ? "en" : "dis", ret); @@ -96,8 +114,7 @@ static int ps8640_bridge_vdo_control(struct ps8640 *ps_bridge, static void ps8640_bridge_poweron(struct ps8640 *ps_bridge) { - struct i2c_client *client = ps_bridge->page[PAGE2_TOP_CNTL]; - unsigned long timeout; + struct regmap *map = ps_bridge->regmap[PAGE2_TOP_CNTL]; int ret, status; if (ps_bridge->powered) @@ -121,18 +138,12 @@ static void ps8640_bridge_poweron(struct ps8640 *ps_bridge) */ msleep(200); - timeout = jiffies + msecs_to_jiffies(200) + 1; + ret = regmap_read_poll_timeout(map, PAGE2_GPIO_H, status, + status & PS_GPIO9, 20 * 1000, 200 * 1000); - while (time_is_after_jiffies(timeout)) { - status = i2c_smbus_read_byte_data(client, PAGE2_GPIO_H); - if (status < 0) { - DRM_ERROR("failed read PAGE2_GPIO_H: %d\n", status); - goto err_regulators_disable; - } - if ((status & PS_GPIO9) == PS_GPIO9) - break; - - msleep(20); + if (ret < 0) { + DRM_ERROR("failed read PAGE2_GPIO_H: %d\n", ret); + goto err_regulators_disable; } msleep(50); @@ -144,22 +155,15 @@ static void ps8640_bridge_poweron(struct ps8640 *ps_bridge) * disabled by the manufacturer. Once disabled, all MCS commands are * ignored by the display interface. 
*/ - status = i2c_smbus_read_byte_data(client, PAGE2_MCS_EN); - if (status < 0) { - DRM_ERROR("failed read PAGE2_MCS_EN: %d\n", status); - goto err_regulators_disable; - } - ret = i2c_smbus_write_byte_data(client, PAGE2_MCS_EN, - status & ~MCS_EN); + ret = regmap_update_bits(map, PAGE2_MCS_EN, MCS_EN, 0); if (ret < 0) { DRM_ERROR("failed write PAGE2_MCS_EN: %d\n", ret); goto err_regulators_disable; } /* Switch access edp panel's edid through i2c */ - ret = i2c_smbus_write_byte_data(client, PAGE2_I2C_BYPASS, - I2C_BYPASS_EN); + ret = regmap_write(map, PAGE2_I2C_BYPASS, I2C_BYPASS_EN); if (ret < 0) { DRM_ERROR("failed write PAGE2_I2C_BYPASS: %d\n", ret); goto err_regulators_disable; @@ -361,6 +365,10 @@ static int ps8640_probe(struct i2c_client *client)
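The visible half of the conversion is regmap_read_poll_timeout() replacing the hand-rolled jiffies polling loop and regmap_update_bits() folding the PAGE2_MCS_EN read-modify-write into a single call. The probe-side half, creating the regmaps, is cut off above; the sketch below shows what it presumably looks like, one regmap per page client using the config from the patch, though the real hunk may be shaped differently.

#include <linux/regmap.h>

static int ps8640_init_regmaps(struct ps8640 *ps_bridge)
{
        unsigned int i;

        for (i = 0; i < MAX_DEVS; i++) {
                ps_bridge->regmap[i] =
                        devm_regmap_init_i2c(ps_bridge->page[i],
                                             &ps8640_regmap_config);
                if (IS_ERR(ps_bridge->regmap[i]))
                        return PTR_ERR(ps_bridge->regmap[i]);
        }

        return 0;
}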
Re: [PATCH] doc: gpu: Add document describing buffer exchange
On Sun, Sep 05, 2021 at 01:27:42PM +0100, Daniel Stone wrote: > Since there's a lot of confusion around this, document both the rules > and the best practice around negotiating, allocating, importing, and > using buffers when crossing context/process/device/subsystem boundaries. > > This ties up all of dmabuf, formats and modifiers, and their usage. > > Signed-off-by: Daniel Stone > --- > > This is just a quick first draft, inspired by: > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3197#note_1048637 > > It's not complete or perfect, but I'm off to eat a roast then have a > nice walk in the sun, so figured it'd be better to dash it off rather > than let it rot on my hard drive. > > > .../gpu/exchanging-pixel-buffers.rst | 285 ++ I think we should stuff this into the dma-buf.rst page instead of hiding it in gpu? Maybe then link to it from everywhere, so from the prime stuff in gpu, and from whatever doc there is for the v4l import/export ioctls. > Documentation/gpu/index.rst | 1 + > 2 files changed, 286 insertions(+) > create mode 100644 Documentation/gpu/exchanging-pixel-buffers.rst > > diff --git a/Documentation/gpu/exchanging-pixel-buffers.rst > b/Documentation/gpu/exchanging-pixel-buffers.rst > new file mode 100644 > index 000000000000..75c4de13d5c8 > --- /dev/null > +++ b/Documentation/gpu/exchanging-pixel-buffers.rst > @@ -0,0 +1,285 @@ > +.. Copyright 2021 Collabora Ltd. > + > +======================== > +Exchanging pixel buffers > +======================== > + > +As originally designed, the Linux graphics subsystem had extremely limited > +support for sharing pixel-buffer allocations between processes, devices, and > +subsystems. Modern systems require extensive integration between all three > +classes; this document details how applications and kernel subsystems should > +approach this sharing for two-dimensional image data. > + > +It is written with reference to the DRM subsystem for GPU and display > devices, > +V4L2 for media devices, and also to Vulkan, EGL and Wayland, for userspace > +support, however any other subsystems should also follow this design and > advice. > + > + > +Formats and modifiers > +===================== > + > +Each buffer must have an underlying format. This format describes the data > which > +can be stored and loaded for each pixel. Although each subsystem has its own > +format descriptions (e.g. V4L2 and fbdev), the `DRM_FORMAT_*` tokens should > be > +reused wherever possible, as they are the standard descriptions used for > +interchange. > + > +Each `DRM_FORMAT_*` token describes the per-pixel data available, in terms of > +the translation between one or more pixels in memory, and the color data > +contained within that memory. The number and type of color channels are > +described: whether they are RGB or YUV, integer or floating-point, the size > +of each channel and their locations within the pixel memory, and the > +relationship between color planes. > + > +For example, `DRM_FORMAT_ARGB8888` describes a format in which each pixel > has a > +single 32-bit value in memory. Alpha, red, green, and blue, color channels > are > +available at 8-bit precision per channel, ordered respectively from most to > +least significant bits in little-endian storage. As a more complex example, > +`DRM_FORMAT_NV12` describes a format in which luma and chroma YUV samples are > +stored in separate memory planes, where the chroma plane is stored at half > the > +resolution in both dimensions (i.e. one U/V chroma sample is stored for each > 2x2 > +pixel grouping).
> + > +Format modifiers describe a translation mechanism between these per-pixel > memory > +samples, and the actual memory storage for the buffer. The most > straightforward > +modifier is `DRM_FORMAT_MOD_LINEAR`, describing a scheme in which each pixel > has > +contiguous storage beginning at (0,0); each pixel's location in memory will > be > +`base + (y * stride) + (x * bpp)`. This is considered the baseline > interchange > +format, and most convenient for CPU access. > + > +Modern hardware employs much more sophisticated access mechanisms, typically > +making use of tiled access and possibly also compression. For example, the > +`DRM_FORMAT_MOD_VIVANTE_TILED` modifier describes memory storage where pixels > +are stored in 4x4 blocks arranged in row-major ordering, i.e. the first tile > in > +memory stores pixels (0,0) to (3,3) inclusive, and the second tile in memory > +stores pixels (4,0) to (7,3) inclusive. > + > +Some modifiers may modify the number of memory buffers required to store the > +data; for example, the `I915_FORMAT_MOD_Y_TILED_CCS` modifier adds a second > +memory buffer to RGB formats in which it stores data about the status of > every > +tile, notably including whether the tile is fully populated with pixel data, > or > +can be expanded from a single solid color. > + > +These extended layouts are highly vendor-spe
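A small worked example of the addressing rules described in the quoted document: the linear case follows the base + (y * stride) + (x * bpp) formula for DRM_FORMAT_ARGB8888 (4 bytes per pixel), and the tiled case follows the 4x4 row-major tile ordering used in the Vivante illustration. The within-tile pixel order is assumed row-major purely for the sake of the example.

#include <stdint.h>

/* Linear: base + (y * stride) + (x * bpp), bpp = 4 for ARGB8888 */
static inline uint32_t *argb8888_linear_pixel(void *base, uint32_t stride,
                                              uint32_t x, uint32_t y)
{
        return (uint32_t *)((uint8_t *)base + (uint64_t)y * stride + x * 4);
}

/* 4x4 tiles laid out row-major: the first tile holds (0,0)..(3,3), the
 * second (4,0)..(7,3), and so on; width assumed to be a multiple of 4. */
static inline uint32_t *argb8888_tile4x4_pixel(void *base, uint32_t width,
                                               uint32_t x, uint32_t y)
{
        uint32_t tiles_per_row = width / 4;
        uint32_t tile_index = (y / 4) * tiles_per_row + (x / 4);
        uint32_t offset_in_tile = (y % 4) * 4 + (x % 4);   /* assumed order */

        return (uint32_t *)base + tile_index * 16 + offset_in_tile;
}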
Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support
On Wed, Sep 8, 2021 at 10:54 AM Daniel Vetter wrote: > > On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote: > > From: Rob Clark > > > > Signed-off-by: Rob Clark > > --- > > drivers/dma-buf/dma-fence-chain.c | 13 + > > 1 file changed, 13 insertions(+) > > > > diff --git a/drivers/dma-buf/dma-fence-chain.c > > b/drivers/dma-buf/dma-fence-chain.c > > index 1b4cb3e5cec9..736a9ad3ea6d 100644 > > --- a/drivers/dma-buf/dma-fence-chain.c > > +++ b/drivers/dma-buf/dma-fence-chain.c > > @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence > > *fence) > > dma_fence_free(fence); > > } > > > > + > > +static void dma_fence_chain_set_deadline(struct dma_fence *fence, > > + ktime_t deadline) > > +{ > > + dma_fence_chain_for_each(fence, fence) { > > + struct dma_fence_chain *chain = to_dma_fence_chain(fence); > > + struct dma_fence *f = chain ? chain->fence : fence; > > Doesn't this just end up calling set_deadline on a chain, potenetially > resulting in recursion? Also I don't think this should ever happen, why > did you add that? Tbh the fence-chain was the part I was a bit fuzzy about, and the main reason I added igt tests. The iteration is similar to how, for ex, dma_fence_chain_signaled() work, and according to the igt test it does what was intended BR, -R > -Daniel > > > + > > + dma_fence_set_deadline(f, deadline); > > + } > > +} > > + > > const struct dma_fence_ops dma_fence_chain_ops = { > > .use_64bit_seqno = true, > > .get_driver_name = dma_fence_chain_get_driver_name, > > @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = { > > .enable_signaling = dma_fence_chain_enable_signaling, > > .signaled = dma_fence_chain_signaled, > > .release = dma_fence_chain_release, > > + .set_deadline = dma_fence_chain_set_deadline, > > }; > > EXPORT_SYMBOL(dma_fence_chain_ops); > > > > -- > > 2.31.1 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
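To spell out Rob's answer, the callback from the patch is repeated below with comments on what each step operates on. The comments assume the usual dma-fence-chain convention that the fences contained in the links are not themselves chain fences.

#include <linux/dma-fence-chain.h>

static void dma_fence_chain_set_deadline(struct dma_fence *fence,
                                         ktime_t deadline)
{
        /* dma_fence_chain_for_each() walks the chain links themselves,
         * following ->prev, rather than calling back into the chain's
         * dma_fence_ops. */
        dma_fence_chain_for_each(fence, fence) {
                struct dma_fence_chain *chain = to_dma_fence_chain(fence);
                /* A chain link upcasts, and we forward the deadline to the
                 * fence it contains; the final ->prev element may be a
                 * plain fence, which is forwarded as-is. */
                struct dma_fence *f = chain ? chain->fence : fence;

                dma_fence_set_deadline(f, deadline);
        }
}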
Re: [PATCH v2 7/7] drm/gud: Add module parameter to control emulation: xrgb8888
Hi Am 07.09.21 um 13:57 schrieb Noralf Trønnes: For devices that don't support XRGB give the user the ability to choose what's most important: Color depth or frames per second. Add an 'xrgb' module parameter to override the emulation format. Assume the user wants full control if xrgb is set and don't set DRM_CAP_DUMB_PREFERRED_DEPTH if RGB565 is supported (AFAIK only X.org supports this). More of a general statement: wouldn't it make more sense to auto-detect this entirely? The GUD protocol could order the list of supported formats by preference (maybe it does already). Or you could take the type of USB connection into account. Additionally, xrgb is really a fall-back for lazy userspace programs, but userspace should do better IMHO. Best regards Thomas Signed-off-by: Noralf Trønnes --- drivers/gpu/drm/gud/gud_drv.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c index 3f9d4b9a1e3d..60d27ee5ddbd 100644 --- a/drivers/gpu/drm/gud/gud_drv.c +++ b/drivers/gpu/drm/gud/gud_drv.c @@ -30,6 +30,10 @@ #include "gud_internal.h" +static int gud_xrgb; +module_param_named(xrgb, gud_xrgb, int, 0644); +MODULE_PARM_DESC(xrgb, "XRGB emulation format: GUD_PIXEL_FORMAT_* value, 0=auto, -1=disable [default=auto]"); + /* Only used internally */ static const struct drm_format_info gud_drm_format_r1 = { .format = GUD_DRM_FORMAT_R1, @@ -530,12 +534,12 @@ static int gud_probe(struct usb_interface *intf, const struct usb_device_id *id) case DRM_FORMAT_RGB332: fallthrough; case DRM_FORMAT_RGB888: - if (!xrgb_emulation_format) + if (!gud_xrgb && !xrgb_emulation_format) xrgb_emulation_format = info; break; case DRM_FORMAT_RGB565: rgb565_supported = true; - if (!xrgb_emulation_format) + if (!gud_xrgb && !xrgb_emulation_format) xrgb_emulation_format = info; break; case DRM_FORMAT_XRGB: @@ -543,6 +547,9 @@ static int gud_probe(struct usb_interface *intf, const struct usb_device_id *id) break; } + if (gud_xrgb == formats_dev[i]) + xrgb_emulation_format = info; + fmt_buf_size = drm_format_info_min_pitch(info, 0, drm->mode_config.max_width) * drm->mode_config.max_height; max_buffer_size = max(max_buffer_size, fmt_buf_size); @@ -559,7 +566,7 @@ static int gud_probe(struct usb_interface *intf, const struct usb_device_id *id) } /* Prefer speed over color depth */ - if (rgb565_supported) + if (!gud_xrgb && rgb565_supported) drm->mode_config.preferred_depth = 16; if (!xrgb_supported && xrgb_emulation_format) { -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH] drm/rockchip: Update crtc fixup to account for fractional clk change
On Wed, Sep 08, 2021 at 08:53:56AM -0500, Chris Morgan wrote: > From: Chris Morgan > > After commit 928f9e268611 ("clk: fractional-divider: Hide > clk_fractional_divider_ops from wide audience") was merged it appears > that the DSI panel on my Odroid Go Advance stopped working. Upon closer > examination of the problem, it looks like the fixup in the > rockchip_drm_vop.c file was causing the issue. The changes made to the > clk driver appear to change some assumptions made in the fixup. > > After debugging the working 5.14 kernel and the no-longer working > 5.15 kernel, it looks like this was broken all along but still > worked, whereas after the fractional clock change it stopped > working despite the issue (it went from sort-of broken to very broken). > > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to > 17000999 on my board. The clock driver was taking the value of the > parent clock and attempting to divide the requested value from it > (17000000/17000999 = 0), then subtracting 1 from it (making it -1), > and running it through fls_long to get 64. It would then subtract > the value of fd->mwidth from it to get 48, and then bit shift > 17000999 to the left by 48, coming up with a very large number of > 7649082492112076800. This resulted in a numerator of 65535 and a > denominator of 1 from the clk driver. The driver seemingly would > try again and get a correct 1:1 value later, and then move on. > > Output from my 5.14 kernel (with some printfs for good measure): > [2.830066] rockchip-drm display-subsystem: bound ff460000.vop (ops > vop_component_ops) > [2.839431] rockchip-drm display-subsystem: bound ff450000.dsi (ops > dw_mipi_dsi_rockchip_ops) > [2.855980] Clock is dclk_vopb_frac > [2.856004] Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent > Rate 17000000, Best Numerator 65535, Best Denominator 1, fd->mwidth 16 > [2.903529] Clock is dclk_vopb_frac > [2.903556] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate > 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > [2.903579] Clock is dclk_vopb_frac > [2.903583] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate > 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > Contrast this with 5.15 after the clk change where the rate of 17000999 > was getting passed and resulted in numerators/denominators of 17001/ > 17000. > > Output from my 5.15 kernel (with some printfs added for good measure): > [2.817571] rockchip-drm display-subsystem: bound ff460000.vop (ops > vop_component_ops) > [2.826975] rockchip-drm display-subsystem: bound ff450000.dsi (ops > dw_mipi_dsi_rockchip_ops) > [2.843430] Rate 17000999, Parent Rate 17000000, Best Numerator 17018, > Best Denominator 17017 > [2.891073] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, > Best Denominator 17000 > [2.891269] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, > Best Denominator 17000 > [2.891281] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, > Best Denominator 17000 > > After tracing through the code it appeared that this function here was > adding a 999 to the requested frequency because of how the clk driver > was rounding/accepting those frequencies. I believe after the changes > made in the commit listed above the assumptions listed in this driver > are no longer true. When I remove the + 999 from the driver the DSI > panel begins to work again.
> > Output from my 5.15 kernel with 999 removed (printfs added): > [2.852054] rockchip-drm display-subsystem: bound ff460000.vop (ops > vop_component_ops) > [2.864483] rockchip-drm display-subsystem: bound ff450000.dsi (ops > dw_mipi_dsi_rockchip_ops) > [2.880869] Clock is dclk_vopb_frac > [2.880892] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best > Denominator 1 > [2.928521] Clock is dclk_vopb_frac > [2.928551] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best > Denominator 1 > [2.928570] Clock is dclk_vopb_frac > [2.928574] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best > Denominator 1 > > I have tested the change extensively on my Odroid Go Advance (Rockchip > RK3326) and it appears to work well. However, this change will affect > all Rockchip SoCs that use this driver so I believe further testing > is warranted. Please note that without this change I can confirm > at least all PX30s with DSI panels will stop working with the 5.15 > kernel. To me it all makes a lot of sense, thank you for deep analysis of the issue! In any case I think we will need a Fixes tag to something (either one of clk-fractional-divider.c series or preexisted). Anyway, FWIW, Reviewed-by: Andy Shevchenko > Signed-off-by: Chris Morgan > --- > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 21 +++-- > 1 file changed, 3 insertions(+), 18 deletions(-) > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
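The arithmetic in the commit message can be reproduced with a few lines of userspace C, using the values from the 5.14 log above (17000000 Hz parent, 17000999 Hz request, fd->mwidth of 16). This mimics the described computation only; it is not the actual clk-fractional-divider code.

#include <stdio.h>

/* Userspace stand-in for the kernel's fls_long(); assumes 64-bit long. */
static int fls_long_emul(unsigned long x)
{
        return x ? 64 - __builtin_clzl(x) : 0;
}

int main(void)
{
        unsigned long parent = 17000000, rate = 17000999, mwidth = 16;

        /* 17000000 / 17000999 = 0; subtracting 1 wraps to ULONG_MAX */
        unsigned long div = parent / rate - 1;
        int scale = fls_long_emul(div);         /* 64 */
        int shift = scale - mwidth;             /* 48 */

        /* 17000999 << 48 truncates on 64-bit to 7649082492112076800 */
        unsigned long long scaled = (unsigned long long)rate << shift;

        printf("scale=%d shift=%d scaled_rate=%llu\n", scale, shift, scaled);
        return 0;
}

This prints scale=64 shift=48 scaled_rate=7649082492112076800, matching the first 5.14 log line; feeding that scaled rate and the 17000000 parent into a best-rational search bounded by 16-bit numerator/denominator is what produces the 65535/1 result in the same line.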
Re: [PATCH v3 6/9] dma-buf/fence-array: Add fence deadline support
On Fri, Sep 03, 2021 at 11:47:57AM -0700, Rob Clark wrote: > From: Rob Clark > > Signed-off-by: Rob Clark > --- > drivers/dma-buf/dma-fence-array.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/dma-buf/dma-fence-array.c > b/drivers/dma-buf/dma-fence-array.c > index d3fbd950be94..8d194b09ee3d 100644 > --- a/drivers/dma-buf/dma-fence-array.c > +++ b/drivers/dma-buf/dma-fence-array.c > @@ -119,12 +119,23 @@ static void dma_fence_array_release(struct dma_fence > *fence) > dma_fence_free(fence); > } > > +static void dma_fence_array_set_deadline(struct dma_fence *fence, > + ktime_t deadline) > +{ > + struct dma_fence_array *array = to_dma_fence_array(fence); > + unsigned i; > + > + for (i = 0; i < array->num_fences; ++i) > + dma_fence_set_deadline(array->fences[i], deadline); Hm I wonder whether this can go wrong, and whether we need Christian's massive fence iterator that I've seen flying around. If you nest these things too much it could all go wrong I think. I looked at other users which inspect dma_fence_array and none of them have a risk for unbounded recursion. Maybe check with Christian. -Daniel > +} > + > const struct dma_fence_ops dma_fence_array_ops = { > .get_driver_name = dma_fence_array_get_driver_name, > .get_timeline_name = dma_fence_array_get_timeline_name, > .enable_signaling = dma_fence_array_enable_signaling, > .signaled = dma_fence_array_signaled, > .release = dma_fence_array_release, > + .set_deadline = dma_fence_array_set_deadline, > }; > EXPORT_SYMBOL(dma_fence_array_ops); > > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
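A sketch of the nesting case the question is about: nothing in the dma_fence_array API prevents an array from containing another array, in which case the new callback descends one C stack frame per level. The helper below is purely illustrative; real users build arrays from leaf fences.

#include <linux/dma-fence-array.h>
#include <linux/slab.h>

static struct dma_fence *wrap_in_array(struct dma_fence *f, u64 context,
                                       unsigned int seqno)
{
        struct dma_fence **fences;
        struct dma_fence_array *array;

        fences = kmalloc_array(1, sizeof(*fences), GFP_KERNEL);
        if (!fences)
                return NULL;

        fences[0] = dma_fence_get(f);
        /* dma_fence_array_create() takes ownership of @fences and the refs */
        array = dma_fence_array_create(1, fences, context, seqno, false);
        if (!array) {
                dma_fence_put(fences[0]);
                kfree(fences);
                return NULL;
        }

        return &array->base;
}

/* wrap_in_array(wrap_in_array(leaf, ctx, 1), ctx, 2) yields a two-level
 * container; dma_fence_set_deadline() on the outer fence then runs
 * dma_fence_array_set_deadline() once per level before reaching the leaf. */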
Re: [PATCH v3 1/9] dma-fence: Add deadline awareness
On Fri, Sep 03, 2021 at 11:47:52AM -0700, Rob Clark wrote: > From: Rob Clark > > Add a way to hint to the fence signaler of an upcoming deadline, such as > vblank, which the fence waiter would prefer not to miss. This is to aid > the fence signaler in making power management decisions, like boosting > frequency as the deadline approaches and awareness of missing deadlines > so that can be factored in to the frequency scaling. > > v2: Drop dma_fence::deadline and related logic to filter duplicate > deadlines, to avoid increasing dma_fence size. The fence-context > implementation will need similar logic to track deadlines of all > the fences on the same timeline. [ckoenig] > > Signed-off-by: Rob Clark > Reviewed-by: Christian König > Signed-off-by: Rob Clark > --- > drivers/dma-buf/dma-fence.c | 20 > include/linux/dma-fence.h | 16 > 2 files changed, 36 insertions(+) > > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c > index ce0f5eff575d..1f444863b94d 100644 > --- a/drivers/dma-buf/dma-fence.c > +++ b/drivers/dma-buf/dma-fence.c > @@ -910,6 +910,26 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, > uint32_t count, > } > EXPORT_SYMBOL(dma_fence_wait_any_timeout); > > + > +/** > + * dma_fence_set_deadline - set desired fence-wait deadline > + * @fence:the fence that is to be waited on > + * @deadline: the time by which the waiter hopes for the fence to be > + *signaled > + * > + * Inform the fence signaler of an upcoming deadline, such as vblank, by > + * which point the waiter would prefer the fence to be signaled by. This > + * is intended to give feedback to the fence signaler to aid in power > + * management decisions, such as boosting GPU frequency if a periodic > + * vblank deadline is approaching. > + */ > +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) > +{ > + if (fence->ops->set_deadline && !dma_fence_is_signaled(fence)) > + fence->ops->set_deadline(fence, deadline); > +} > +EXPORT_SYMBOL(dma_fence_set_deadline); > + > /** > * dma_fence_init - Initialize a custom fence. > * @fence: the fence to initialize > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h > index 6ffb4b2c6371..9c809f0d5d0a 100644 > --- a/include/linux/dma-fence.h > +++ b/include/linux/dma-fence.h > @@ -99,6 +99,7 @@ enum dma_fence_flag_bits { > DMA_FENCE_FLAG_SIGNALED_BIT, > DMA_FENCE_FLAG_TIMESTAMP_BIT, > DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, > + DMA_FENCE_FLAG_HAS_DEADLINE_BIT, > DMA_FENCE_FLAG_USER_BITS, /* must always be last member */ > }; > > @@ -261,6 +262,19 @@ struct dma_fence_ops { >*/ > void (*timeline_value_str)(struct dma_fence *fence, > char *str, int size); > + > + /** > + * @set_deadline: > + * > + * Callback to allow a fence waiter to inform the fence signaler of an > + * upcoming deadline, such as vblank, by which point the waiter would > + * prefer the fence to be signaled by. This is intended to give > feedback > + * to the fence signaler to aid in power management decisions, such as > + * boosting GPU frequency. Please add here that this callback is called without &dma_fence.lock held, and that locking is up to callers if they have some state to manage. I realized that while scratching some heads over your later patches. -Daniel > + * > + * This callback is optional. 
> + */ > + void (*set_deadline)(struct dma_fence *fence, ktime_t deadline); > }; > > void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, > @@ -586,6 +600,8 @@ static inline signed long dma_fence_wait(struct dma_fence > *fence, bool intr) > return ret < 0 ? ret : 0; > } > > +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline); > + > struct dma_fence *dma_fence_get_stub(void); > struct dma_fence *dma_fence_allocate_private_stub(void); > u64 dma_fence_context_alloc(unsigned num); > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
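To illustrate the locking note requested above: because the core calls ->set_deadline() without the fence lock held, a driver that tracks an earliest deadline needs its own serialisation. The "foo" structures below are invented for the sketch and are not from any real driver.

#include <linux/dma-fence.h>
#include <linux/ktime.h>
#include <linux/spinlock.h>

struct foo_timeline {
        spinlock_t lock;                /* protects the two fields below */
        ktime_t earliest_deadline;
        bool has_deadline;
};

struct foo_fence {
        struct dma_fence base;
        struct foo_timeline *tl;
};

/* Plugged into the driver's dma_fence_ops as .set_deadline */
static void foo_fence_set_deadline(struct dma_fence *f, ktime_t deadline)
{
        struct foo_fence *ff = container_of(f, struct foo_fence, base);
        struct foo_timeline *tl = ff->tl;
        unsigned long flags;

        /* The core does not hold f->lock here, so take our own lock. */
        spin_lock_irqsave(&tl->lock, flags);
        if (!tl->has_deadline ||
            ktime_before(deadline, tl->earliest_deadline)) {
                tl->has_deadline = true;
                tl->earliest_deadline = deadline;
                /* e.g. kick a worker that bumps clocks, as in the later
                 * patches of this series */
        }
        spin_unlock_irqrestore(&tl->lock, flags);
}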
Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support
On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote: > From: Rob Clark > > Signed-off-by: Rob Clark > --- > drivers/dma-buf/dma-fence-chain.c | 13 + > 1 file changed, 13 insertions(+) > > diff --git a/drivers/dma-buf/dma-fence-chain.c > b/drivers/dma-buf/dma-fence-chain.c > index 1b4cb3e5cec9..736a9ad3ea6d 100644 > --- a/drivers/dma-buf/dma-fence-chain.c > +++ b/drivers/dma-buf/dma-fence-chain.c > @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence > *fence) > dma_fence_free(fence); > } > > + > +static void dma_fence_chain_set_deadline(struct dma_fence *fence, > + ktime_t deadline) > +{ > + dma_fence_chain_for_each(fence, fence) { > + struct dma_fence_chain *chain = to_dma_fence_chain(fence); > + struct dma_fence *f = chain ? chain->fence : fence; Doesn't this just end up calling set_deadline on a chain, potentially resulting in recursion? Also I don't think this should ever happen, why did you add that? -Daniel > + > + dma_fence_set_deadline(f, deadline); > + } > +} > + > const struct dma_fence_ops dma_fence_chain_ops = { > .use_64bit_seqno = true, > .get_driver_name = dma_fence_chain_get_driver_name, > @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = { > .enable_signaling = dma_fence_chain_enable_signaling, > .signaled = dma_fence_chain_signaled, > .release = dma_fence_chain_release, > + .set_deadline = dma_fence_chain_set_deadline, > }; > EXPORT_SYMBOL(dma_fence_chain_ops); > > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v3 5/9] drm/msm: Add deadline based boost support
On Wed, Sep 8, 2021 at 10:48 AM Daniel Vetter wrote: > > On Fri, Sep 03, 2021 at 11:47:56AM -0700, Rob Clark wrote: > > From: Rob Clark > > > > Signed-off-by: Rob Clark > > Why do you need a kthread_work here? Is this just to make sure you're > running at realtime prio? Maybe a comment to that effect would be good. Mostly because we are already using a kthread_worker for things the GPU needs to kick off to a different context.. but I think this is something we'd want at a realtime prio BR, -R > -Daniel > > > --- > > drivers/gpu/drm/msm/msm_fence.c | 76 +++ > > drivers/gpu/drm/msm/msm_fence.h | 20 +++ > > drivers/gpu/drm/msm/msm_gpu.h | 1 + > > drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++ > > 4 files changed, 117 insertions(+) > > > > diff --git a/drivers/gpu/drm/msm/msm_fence.c > > b/drivers/gpu/drm/msm/msm_fence.c > > index f2cece542c3f..67c2a96e1c85 100644 > > --- a/drivers/gpu/drm/msm/msm_fence.c > > +++ b/drivers/gpu/drm/msm/msm_fence.c > > @@ -8,6 +8,37 @@ > > > > #include "msm_drv.h" > > #include "msm_fence.h" > > +#include "msm_gpu.h" > > + > > +static inline bool fence_completed(struct msm_fence_context *fctx, > > uint32_t fence); > > + > > +static struct msm_gpu *fctx2gpu(struct msm_fence_context *fctx) > > +{ > > + struct msm_drm_private *priv = fctx->dev->dev_private; > > + return priv->gpu; > > +} > > + > > +static enum hrtimer_restart deadline_timer(struct hrtimer *t) > > +{ > > + struct msm_fence_context *fctx = container_of(t, > > + struct msm_fence_context, deadline_timer); > > + > > + kthread_queue_work(fctx2gpu(fctx)->worker, &fctx->deadline_work); > > + > > + return HRTIMER_NORESTART; > > +} > > + > > +static void deadline_work(struct kthread_work *work) > > +{ > > + struct msm_fence_context *fctx = container_of(work, > > + struct msm_fence_context, deadline_work); > > + > > + /* If deadline fence has already passed, nothing to do: */ > > + if (fence_completed(fctx, fctx->next_deadline_fence)) > > + return; > > + > > + msm_devfreq_boost(fctx2gpu(fctx), 2); > > +} > > > > > > struct msm_fence_context * > > @@ -26,6 +57,13 @@ msm_fence_context_alloc(struct drm_device *dev, volatile > > uint32_t *fenceptr, > > fctx->fenceptr = fenceptr; > > spin_lock_init(&fctx->spinlock); > > > > + hrtimer_init(&fctx->deadline_timer, CLOCK_MONOTONIC, > > HRTIMER_MODE_ABS); > > + fctx->deadline_timer.function = deadline_timer; > > + > > + kthread_init_work(&fctx->deadline_work, deadline_work); > > + > > + fctx->next_deadline = ktime_get(); > > + > > return fctx; > > } > > > > @@ -49,6 +87,8 @@ void msm_update_fence(struct msm_fence_context *fctx, > > uint32_t fence) > > { > > spin_lock(&fctx->spinlock); > > fctx->completed_fence = max(fence, fctx->completed_fence); > > + if (fence_completed(fctx, fctx->next_deadline_fence)) > > + hrtimer_cancel(&fctx->deadline_timer); > > spin_unlock(&fctx->spinlock); > > } > > > > @@ -79,10 +119,46 @@ static bool msm_fence_signaled(struct dma_fence *fence) > > return fence_completed(f->fctx, f->base.seqno); > > } > > > > +static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t > > deadline) > > +{ > > + struct msm_fence *f = to_msm_fence(fence); > > + struct msm_fence_context *fctx = f->fctx; > > + unsigned long flags; > > + ktime_t now; > > + > > + spin_lock_irqsave(&fctx->spinlock, flags); > > + now = ktime_get(); > > + > > + if (ktime_after(now, fctx->next_deadline) || > > + ktime_before(deadline, fctx->next_deadline)) { > > + fctx->next_deadline = deadline; > > + fctx->next_deadline_fence = > > + max(fctx->next_deadline_fence, > > 
(uint32_t)fence->seqno); > > + > > + /* > > + * Set timer to trigger boost 3ms before deadline, or > > + * if we are already less than 3ms before the deadline > > + * schedule boost work immediately. > > + */ > > + deadline = ktime_sub(deadline, ms_to_ktime(3)); > > + > > + if (ktime_after(now, deadline)) { > > + kthread_queue_work(fctx2gpu(fctx)->worker, > > + &fctx->deadline_work); > > + } else { > > + hrtimer_start(&fctx->deadline_timer, deadline, > > + HRTIMER_MODE_ABS); > > + } > > + } > > + > > + spin_unlock_irqrestore(&fctx->spinlock, flags); > > +} > > + > > static const struct dma_fence_ops msm_fence_ops = { > > .get_driver_name = msm_fence_get_driver_name, > > .get_timeline_name = msm_fence_get_timeline_name, > > .signaled = msm_fence_signaled, > > + .set_deadline = msm_fence_set_deadline, > > }; > > > > struct dma_fence * > > diff --git a/d
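For reference, the pattern Rob describes, a dedicated kthread_worker whose task is moved into the realtime class so boost work is not starved by SCHED_OTHER load, looks roughly like the sketch below. The names are illustrative and this is not the actual msm code.

#include <linux/device.h>
#include <linux/err.h>
#include <linux/kthread.h>
#include <linux/sched.h>

static struct kthread_worker *foo_create_boost_worker(struct device *dev)
{
        struct kthread_worker *worker;

        worker = kthread_create_worker(0, "%s/boost", dev_name(dev));
        if (IS_ERR(worker))
                return worker;

        /* SCHED_FIFO at the lowest RT priority: gets ahead of normal tasks
         * without competing with hard realtime users. */
        sched_set_fifo_low(worker->task);

        return worker;
}

Work items queued with kthread_queue_work() on that worker, such as the deadline boost work in the quoted patch, then run from the realtime thread.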
Re: [PATCH 13/14] drm/kmb: Enable alpha blended second plane
Hi Am 03.08.21 um 07:10 schrieb Sam Ravnborg: Hi Anitha, On Mon, Aug 02, 2021 at 08:44:26PM +0000, Chrisanthus, Anitha wrote: Hi Sam, Thanks. Where should this go, drm-misc-fixes or drm-misc-next? Looks like a drm-misc-next candidate to me. It may improve something for existing users, but it does not look like it fixes an existing bug. I found this patch in drm-misc-fixes, although it doesn't look like a bugfix. It should have gone into drm-misc-next. See [1]. If it indeed belongs into drm-misc-fixes, it certainly should have contained a Fixes tag. Best regards Thomas [1] https://drm.pages.freedesktop.org/maintainer-tools/committer-drm-misc.html#where-do-i-apply-my-patch Sam -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > From: Rob Clark > > The initial purpose is for igt tests, but this would also be useful for > compositors that wait until close to vblank deadline to make decisions > about which frame to show. > > Signed-off-by: Rob Clark Needs userspace and I think ideally also some igts to make sure it works and doesn't go boom. -Daniel > --- > drivers/dma-buf/sync_file.c| 19 +++ > include/uapi/linux/sync_file.h | 20 > 2 files changed, 39 insertions(+) > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > index 394e6e1e9686..f295772d5169 100644 > --- a/drivers/dma-buf/sync_file.c > +++ b/drivers/dma-buf/sync_file.c > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file > *sync_file, > return ret; > } > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > + unsigned long arg) > +{ > + struct sync_set_deadline ts; > + > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > + return -EFAULT; > + > + if (ts.pad) > + return -EINVAL; > + > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > ts.tv_nsec)); > + > + return 0; > +} > + > static long sync_file_ioctl(struct file *file, unsigned int cmd, > unsigned long arg) > { > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned > int cmd, > case SYNC_IOC_FILE_INFO: > return sync_file_ioctl_fence_info(sync_file, arg); > > + case SYNC_IOC_SET_DEADLINE: > + return sync_file_ioctl_set_deadline(sync_file, arg); > + > default: > return -ENOTTY; > } > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h > index ee2dcfb3d660..f67d4ffe7566 100644 > --- a/include/uapi/linux/sync_file.h > +++ b/include/uapi/linux/sync_file.h > @@ -67,6 +67,18 @@ struct sync_file_info { > __u64 sync_fence_info; > }; > > +/** > + * struct sync_set_deadline - set a deadline on a fence > + * @tv_sec: seconds elapsed since epoch > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec > + * @pad: must be zero > + */ > +struct sync_set_deadline { > + __s64 tv_sec; > + __s32 tv_nsec; > + __u32 pad; > +}; > + > #define SYNC_IOC_MAGIC '>' > > /** > @@ -95,4 +107,12 @@ struct sync_file_info { > */ > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info) > > + > +/** > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence > + * > + * Allows userspace to set a deadline on a fence, see > dma_fence_set_deadline() > + */ > +#define SYNC_IOC_SET_DEADLINE_IOW(SYNC_IOC_MAGIC, 5, struct > sync_set_deadline) > + > #endif /* _UAPI_LINUX_SYNC_H */ > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v3 5/9] drm/msm: Add deadline based boost support
On Fri, Sep 03, 2021 at 11:47:56AM -0700, Rob Clark wrote: > From: Rob Clark > > Signed-off-by: Rob Clark Why do you need a kthread_work here? Is this just to make sure you're running at realtime prio? Maybe a comment to that effect would be good. -Daniel > --- > drivers/gpu/drm/msm/msm_fence.c | 76 +++ > drivers/gpu/drm/msm/msm_fence.h | 20 +++ > drivers/gpu/drm/msm/msm_gpu.h | 1 + > drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++ > 4 files changed, 117 insertions(+) > > diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c > index f2cece542c3f..67c2a96e1c85 100644 > --- a/drivers/gpu/drm/msm/msm_fence.c > +++ b/drivers/gpu/drm/msm/msm_fence.c > @@ -8,6 +8,37 @@ > > #include "msm_drv.h" > #include "msm_fence.h" > +#include "msm_gpu.h" > + > +static inline bool fence_completed(struct msm_fence_context *fctx, uint32_t > fence); > + > +static struct msm_gpu *fctx2gpu(struct msm_fence_context *fctx) > +{ > + struct msm_drm_private *priv = fctx->dev->dev_private; > + return priv->gpu; > +} > + > +static enum hrtimer_restart deadline_timer(struct hrtimer *t) > +{ > + struct msm_fence_context *fctx = container_of(t, > + struct msm_fence_context, deadline_timer); > + > + kthread_queue_work(fctx2gpu(fctx)->worker, &fctx->deadline_work); > + > + return HRTIMER_NORESTART; > +} > + > +static void deadline_work(struct kthread_work *work) > +{ > + struct msm_fence_context *fctx = container_of(work, > + struct msm_fence_context, deadline_work); > + > + /* If deadline fence has already passed, nothing to do: */ > + if (fence_completed(fctx, fctx->next_deadline_fence)) > + return; > + > + msm_devfreq_boost(fctx2gpu(fctx), 2); > +} > > > struct msm_fence_context * > @@ -26,6 +57,13 @@ msm_fence_context_alloc(struct drm_device *dev, volatile > uint32_t *fenceptr, > fctx->fenceptr = fenceptr; > spin_lock_init(&fctx->spinlock); > > + hrtimer_init(&fctx->deadline_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS); > + fctx->deadline_timer.function = deadline_timer; > + > + kthread_init_work(&fctx->deadline_work, deadline_work); > + > + fctx->next_deadline = ktime_get(); > + > return fctx; > } > > @@ -49,6 +87,8 @@ void msm_update_fence(struct msm_fence_context *fctx, > uint32_t fence) > { > spin_lock(&fctx->spinlock); > fctx->completed_fence = max(fence, fctx->completed_fence); > + if (fence_completed(fctx, fctx->next_deadline_fence)) > + hrtimer_cancel(&fctx->deadline_timer); > spin_unlock(&fctx->spinlock); > } > > @@ -79,10 +119,46 @@ static bool msm_fence_signaled(struct dma_fence *fence) > return fence_completed(f->fctx, f->base.seqno); > } > > +static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) > +{ > + struct msm_fence *f = to_msm_fence(fence); > + struct msm_fence_context *fctx = f->fctx; > + unsigned long flags; > + ktime_t now; > + > + spin_lock_irqsave(&fctx->spinlock, flags); > + now = ktime_get(); > + > + if (ktime_after(now, fctx->next_deadline) || > + ktime_before(deadline, fctx->next_deadline)) { > + fctx->next_deadline = deadline; > + fctx->next_deadline_fence = > + max(fctx->next_deadline_fence, (uint32_t)fence->seqno); > + > + /* > + * Set timer to trigger boost 3ms before deadline, or > + * if we are already less than 3ms before the deadline > + * schedule boost work immediately. 
> + */ > + deadline = ktime_sub(deadline, ms_to_ktime(3)); > + > + if (ktime_after(now, deadline)) { > + kthread_queue_work(fctx2gpu(fctx)->worker, > + &fctx->deadline_work); > + } else { > + hrtimer_start(&fctx->deadline_timer, deadline, > + HRTIMER_MODE_ABS); > + } > + } > + > + spin_unlock_irqrestore(&fctx->spinlock, flags); > +} > + > static const struct dma_fence_ops msm_fence_ops = { > .get_driver_name = msm_fence_get_driver_name, > .get_timeline_name = msm_fence_get_timeline_name, > .signaled = msm_fence_signaled, > + .set_deadline = msm_fence_set_deadline, > }; > > struct dma_fence * > diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h > index 4783db528bcc..d34e853c555a 100644 > --- a/drivers/gpu/drm/msm/msm_fence.h > +++ b/drivers/gpu/drm/msm/msm_fence.h > @@ -50,6 +50,26 @@ struct msm_fence_context { > volatile uint32_t *fenceptr; > > spinlock_t spinlock; > + > + /* > + * TODO this doesn't really deal with multiple deadlines, like > + * if userspace got multiple frames ahead.. OTOH atomic updates > + * don't queue, so maybe that is
Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support
On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote: > From: Rob Clark > > As the finished fence is the one that is exposed to userspace, and > therefore the one that other operations, like atomic update, would > block on, we need to propagate the deadline from from the finished > fence to the actual hw fence. > > v2: Split into drm_sched_fence_set_parent() (ckoenig) > > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/scheduler/sched_fence.c | 34 + > drivers/gpu/drm/scheduler/sched_main.c | 2 +- > include/drm/gpu_scheduler.h | 8 ++ > 3 files changed, 43 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c > b/drivers/gpu/drm/scheduler/sched_fence.c > index bcea035cf4c6..4fc41a71d1c7 100644 > --- a/drivers/gpu/drm/scheduler/sched_fence.c > +++ b/drivers/gpu/drm/scheduler/sched_fence.c > @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct > dma_fence *f) > dma_fence_put(&fence->scheduled); > } > > +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f, > + ktime_t deadline) > +{ > + struct drm_sched_fence *fence = to_drm_sched_fence(f); > + unsigned long flags; > + > + spin_lock_irqsave(&fence->lock, flags); > + > + /* If we already have an earlier deadline, keep it: */ > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) && > + ktime_before(fence->deadline, deadline)) { > + spin_unlock_irqrestore(&fence->lock, flags); > + return; > + } > + > + fence->deadline = deadline; > + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags); > + > + spin_unlock_irqrestore(&fence->lock, flags); > + > + if (fence->parent) > + dma_fence_set_deadline(fence->parent, deadline); > +} > + > static const struct dma_fence_ops drm_sched_fence_ops_scheduled = { > .get_driver_name = drm_sched_fence_get_driver_name, > .get_timeline_name = drm_sched_fence_get_timeline_name, > @@ -138,6 +162,7 @@ static const struct dma_fence_ops > drm_sched_fence_ops_finished = { > .get_driver_name = drm_sched_fence_get_driver_name, > .get_timeline_name = drm_sched_fence_get_timeline_name, > .release = drm_sched_fence_release_finished, > + .set_deadline = drm_sched_fence_set_deadline_finished, > }; > > struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f) > @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct > dma_fence *f) > } > EXPORT_SYMBOL(to_drm_sched_fence); > > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence, > + struct dma_fence *fence) > +{ > + s_fence->parent = dma_fence_get(fence); > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, > + &s_fence->finished.flags)) Don't you need the spinlock here too to avoid races? test_bit is very unordered, so guarantees nothing. Spinlock would need to be both around ->parent = and the test_bit. Entirely aside, but there's discussions going on to preallocate the hw fence somehow. If we do that we could make the deadline forwarding lockless here. Having a spinlock just to set the parent is a bit annoying ... Alternative is that you do this locklessly with barriers and a _lot_ of comments. Would be good to benchmark whether the overhead matters though first. 
-Daniel > + dma_fence_set_deadline(fence, s_fence->deadline); > +} > + > struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity > *entity, > void *owner) > { > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index 595e47ff7d06..27bf0ac0625f 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -978,7 +978,7 @@ static int drm_sched_main(void *param) > drm_sched_fence_scheduled(s_fence); > > if (!IS_ERR_OR_NULL(fence)) { > - s_fence->parent = dma_fence_get(fence); > + drm_sched_fence_set_parent(s_fence, fence); > r = dma_fence_add_callback(fence, &sched_job->cb, > drm_sched_job_done_cb); > if (r == -ENOENT) > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h > index 7f77a455722c..158ddd662469 100644 > --- a/include/drm/gpu_scheduler.h > +++ b/include/drm/gpu_scheduler.h > @@ -238,6 +238,12 @@ struct drm_sched_fence { > */ > struct dma_fencefinished; > > + /** > + * @deadline: deadline set on &drm_sched_fence.finished which > + * potentially needs to be propagated to &drm_sched_fence.parent > + */ > + ktime_t deadline; > + > /** > * @parent: the fence returned by &drm_sched_backend_ops.run_
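One possible shape for the locked variant suggested above, as it would sit in sched_fence.c: publish the parent and sample the deadline state under the same fence->lock that drm_sched_fence_set_deadline_finished() takes in the quoted patch. This is a sketch against that patch, not a tested change, and it deliberately leaves open whether the parent lookup in the deadline path needs the same treatment.

#include <linux/dma-fence.h>
#include <linux/spinlock.h>
#include <drm/gpu_scheduler.h>

void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
                                struct dma_fence *fence)
{
        ktime_t deadline;
        bool has_deadline;
        unsigned long flags;

        spin_lock_irqsave(&s_fence->lock, flags);
        s_fence->parent = dma_fence_get(fence);
        has_deadline = test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
                                &s_fence->finished.flags);
        deadline = s_fence->deadline;
        spin_unlock_irqrestore(&s_fence->lock, flags);

        if (has_deadline)
                dma_fence_set_deadline(fence, deadline);
}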
[PULL] drm-misc-fixes
Hi Dave and Daniel, here's this week's PR for drm-misc-fixes. One patch fixes a potential deadlock in TTM, the other enables an additional plane in kmb. I'm slightly unhappy that the latter one ended up in -fixes as it's not a bugfix AFAICT. Best regards Thomas drm-misc-fixes-2021-09-08: Short summary of fixes pull: * kmb: Enable second plane * ttm: Fix potential deadlock during swap The following changes since commit fa0b1ef5f7a694f48e00804a391245f3471aa155: drm: Copy drm_wait_vblank to user before returning (2021-08-17 13:56:03 -0400) are available in the Git repository at: git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2021-09-08 for you to fetch changes up to c8704b7ec182f9293e6a994310c7d4203428cdfb: drm/kmb: Enable alpha blended second plane (2021-09-07 10:10:30 -0700) Short summary of fixes pull: * kmb: Enable second plane * ttm: Fix potential deadlock during swap Edmund Dea (1): drm/kmb: Enable alpha blended second plane xinhui pan (1): drm/ttm: Fix a deadlock if the target BO is not idle during swap drivers/gpu/drm/kmb/kmb_drv.c | 8 ++-- drivers/gpu/drm/kmb/kmb_drv.h | 5 +++ drivers/gpu/drm/kmb/kmb_plane.c | 81 - drivers/gpu/drm/kmb/kmb_plane.h | 5 ++- drivers/gpu/drm/kmb/kmb_regs.h | 3 ++ drivers/gpu/drm/ttm/ttm_bo.c| 6 +-- 6 files changed, 90 insertions(+), 18 deletions(-) -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer
Re: [PATCH v2 1/2] drm: document drm_mode_create_lease object requirements
On Fri, Sep 03, 2021 at 01:00:16PM +, Simon Ser wrote: > validate_lease expects one CRTC, one connector and one plane. > > Signed-off-by: Simon Ser > Cc: Daniel Vetter > Cc: Pekka Paalanen > Cc: Keith Packard Reviewed-by: Daniel Vetter > --- > include/uapi/drm/drm_mode.h | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h > index 90c55383f1ee..e4a2570a6058 100644 > --- a/include/uapi/drm/drm_mode.h > +++ b/include/uapi/drm/drm_mode.h > @@ -1110,6 +1110,9 @@ struct drm_mode_destroy_blob { > * struct drm_mode_create_lease - Create lease > * > * Lease mode resources, creating another drm_master. > + * > + * The @object_ids array must reference at least one CRTC, one connector and > + * one plane if &DRM_CLIENT_CAP_UNIVERSAL_PLANES is enabled. > */ > struct drm_mode_create_lease { > /** @object_ids: Pointer to array of object ids (__u32) */ > -- > 2.33.0 > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
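From the userspace side, the documented requirement translates into passing all three object IDs to the lease ioctl. The snippet below uses the libdrm wrapper; the helper name, the placeholder object IDs and the O_CLOEXEC flag choice are illustrative.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static int create_minimal_lease(int fd, uint32_t crtc_id,
                                uint32_t connector_id, uint32_t plane_id)
{
        uint32_t objects[] = { crtc_id, connector_id, plane_id };
        uint32_t lessee_id;
        int lease_fd;

        /* With universal planes enabled, the plane must be part of the
         * lease as well, per the documentation above. */
        drmSetClientCap(fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);

        lease_fd = drmModeCreateLease(fd, objects, 3, O_CLOEXEC, &lessee_id);
        if (lease_fd < 0)
                fprintf(stderr, "lease creation failed: %d\n", lease_fd);
        else
                printf("lessee %u, lease fd %d\n", lessee_id, lease_fd);

        return lease_fd;
}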
Re: [PATCH] drm/i915/request: fix early tracepoints
On Fri, Sep 03, 2021 at 12:24:05PM +0100, Matthew Auld wrote: > Currently we blow up in trace_dma_fence_init, when calling into > get_driver_name or get_timeline_name, since both the engine and context > might be NULL(or contain some garbage address) in the case of newly > allocated slab objects via the request ctor. Note that we also use > SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately > freed, but delay freeing the underlying page by an RCU grace period. > With this scheme requests can be re-allocated, at the same time as they > are also being read by some lockless RCU lookup mechanism. > > One possible fix, since we don't yet have a fully initialised request > when in the ctor, is just setting the context/engine as NULL and adding > some extra handling in get_driver_name etc. And since the ctor is only > called for new slab objects(i.e allocate new page and call the ctor for > each object) it's safe to reset the context/engine prior to calling into > dma_fence_init, since we can be certain that no one is doing an RCU > lookup which might depend on peeking at the engine/context, like in > active_engine(), since the object can't yet be externally visible. > > In the recycled case(which might also be externally visible) the request > refcount always transitions from 0->1 after we set the context/engine > etc, which should ensure it's valid to dereference the engine for > example, when doing an RCU list-walk, so long as we can also increment > the refcount first. If the refcount is already zero, then the request is > considered complete/released. If it's non-zero, then the request might > be in the process of being re-allocated, or potentially still in flight, > however after successfully incrementing the refcount, it's possible to > carefully inspect the request state, to determine if the request is > still what we were looking for. Note that all externally visible > requests returned to the cache must have zero refcount. The commit message here is a bit confusing, since you start out with describing a solution that you're not actually implementing it. I usually do this by putting alternate solutions at the bottom, starting with "An alternate solution would be ..." or so. And then closing with why we don't do that, here it would be that we do no longer have a need for these partially set up i915_requests, and therefore just reverting that complication is the simplest solution. > An alternative fix then is to instead move the dma_fence_init out from > the request ctor. Originally this was how it was done, but it was moved > in: > > commit 855e39e65cfc33a73724f1cc644ffc5754864a20 > Author: Chris Wilson > Date: Mon Feb 3 09:41:48 2020 + > > drm/i915: Initialise basic fence before acquiring seqno > > where it looks like intel_timeline_get_seqno() relied on some of the > rq->fence state, but that is no longer the case since: > > commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc > Author: Maarten Lankhorst > Date: Tue Mar 23 16:49:50 2021 +0100 > > drm/i915: Do not share hwsp across contexts any more, v8. > > intel_timeline_get_seqno() could also be cleaned up slightly by dropping > the request argument. > > Moving dma_fence_init back out of the ctor, should ensure we have enough > of the request initialised in case of trace_dma_fence_init. 
> Functionally this should be the same, and is effectively what we were > already open coding before, except now we also assign the fence->lock > and fence->ops, but since these are invariant for recycled > requests(which might be externally visible), and will therefore already > hold the same value, it shouldn't matter. We still leave the > spin_lock_init() in the ctor, since we can't re-init the rq->lock in > case it is already held. Holding rq->lock without having a full reference to it sounds like really bad taste. I think it would be good to have a (kerneldoc) comment next to i915_request.lock about this, with a FIXME. But separate patch. > Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring > seqno") > Signed-off-by: Matthew Auld > Cc: Michael Mason > Cc: Daniel Vetter With the commit message restructured a bit, and assuming this one actually works: Reviewed-by: Daniel Vetter But I'm really not confident :-( -Daniel > --- > drivers/gpu/drm/i915/i915_request.c | 11 ++- > 1 file changed, 2 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_request.c > b/drivers/gpu/drm/i915/i915_request.c > index ce446716d092..79da5eca60af 100644 > --- a/drivers/gpu/drm/i915/i915_request.c > +++ b/drivers/gpu/drm/i915/i915_request.c > @@ -829,8 +829,6 @@ static void __i915_request_ctor(void *arg) > i915_sw_fence_init(&rq->submit, submit_notify); > i915_sw_fence_init(&rq->semaphore, semaphore_notify); > > - dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0); > - > rq->capture_list = NULL; > > init_llist_head(&rq->execute_cb); > @@ -905,
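As an aside for readers less familiar with the SLAB_TYPESAFE_BY_RCU scheme the commit message leans on, the lockless lookup it has to survive follows a get-then-recheck pattern roughly like the sketch below; the wrapper function and the slot pointer are hypothetical, only the pattern itself is taken from the description above:

/* Hypothetical lockless lookup under SLAB_TYPESAFE_BY_RCU: the request may
 * be freed and recycled at any point, so it may only be inspected after a
 * reference has been taken, and must then be re-checked. */
static struct i915_request *
get_active_request(struct i915_request __rcu **slot)
{
        struct i915_request *rq;

        rcu_read_lock();
        rq = rcu_dereference(*slot);

        /* A refcount of zero means the request is complete/released;
         * dma_fence_get_rcu() only succeeds while it is still non-zero. */
        if (rq && !dma_fence_get_rcu(&rq->fence))
                rq = NULL;

        /* The slab object may have been recycled between the load and the
         * get, so verify it is still the request published in the slot. */
        if (rq && rq != rcu_access_pointer(*slot)) {
                i915_request_put(rq);
                rq = NULL;
        }
        rcu_read_unlock();

        return rq;
}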
Re: [PATCH] drm/sched: Fix drm_sched_fence_free() so it can be passed an uninitialized fence
On Fri, Sep 03, 2021 at 02:05:54PM +0200, Boris Brezillon wrote: > drm_sched_job_cleanup() will pass an uninitialized fence to > drm_sched_fence_free(), which will cause to_drm_sched_fence() to return > a NULL fence object, causing a NULL pointer deref when this NULL object > is passed to kmem_cache_free(). > > Let's create a new drm_sched_fence_free() function that takes a > drm_sched_fence pointer and suffix the old function with _rcu. While at > it, complain if drm_sched_fence_free() is passed an initialized fence > or if drm_sched_fence_free_rcu() is passed an uninitialized fence. > > Fixes: dbe48d030b28 ("drm/sched: Split drm_sched_job_init") > Signed-off-by: Boris Brezillon > --- > Found while debugging another issue in panfrost causing a failure in > the submit ioctl and exercising the error path (path that calls > drm_sched_job_cleanup() on an unarmed job). Reviewed-by: Daniel Vetter I already provided an irc r-b, just here for the record too. -Daniel > --- > drivers/gpu/drm/scheduler/sched_fence.c | 29 - > drivers/gpu/drm/scheduler/sched_main.c | 2 +- > include/drm/gpu_scheduler.h | 2 +- > 3 files changed, 21 insertions(+), 12 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c > b/drivers/gpu/drm/scheduler/sched_fence.c > index db3fd1303fc4..7fd869520ef2 100644 > --- a/drivers/gpu/drm/scheduler/sched_fence.c > +++ b/drivers/gpu/drm/scheduler/sched_fence.c > @@ -69,19 +69,28 @@ static const char > *drm_sched_fence_get_timeline_name(struct dma_fence *f) > return (const char *)fence->sched->name; > } > > -/** > - * drm_sched_fence_free - free up the fence memory > - * > - * @rcu: RCU callback head > - * > - * Free up the fence memory after the RCU grace period. > - */ > -void drm_sched_fence_free(struct rcu_head *rcu) > +static void drm_sched_fence_free_rcu(struct rcu_head *rcu) > { > struct dma_fence *f = container_of(rcu, struct dma_fence, rcu); > struct drm_sched_fence *fence = to_drm_sched_fence(f); > > - kmem_cache_free(sched_fence_slab, fence); > + if (!WARN_ON_ONCE(!fence)) > + kmem_cache_free(sched_fence_slab, fence); > +} > + > +/** > + * drm_sched_fence_free - free up an uninitialized fence > + * > + * @fence: fence to free > + * > + * Free up the fence memory. Should only be used if drm_sched_fence_init() > + * has not been called yet. > + */ > +void drm_sched_fence_free(struct drm_sched_fence *fence) > +{ > + /* This function should not be called if the fence has been > initialized. 
*/ > + if (!WARN_ON_ONCE(fence->sched)) > + kmem_cache_free(sched_fence_slab, fence); > } > > /** > @@ -97,7 +106,7 @@ static void drm_sched_fence_release_scheduled(struct > dma_fence *f) > struct drm_sched_fence *fence = to_drm_sched_fence(f); > > dma_fence_put(fence->parent); > - call_rcu(&fence->finished.rcu, drm_sched_fence_free); > + call_rcu(&fence->finished.rcu, drm_sched_fence_free_rcu); > } > > /** > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index fbbd3b03902f..6987d412a946 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -750,7 +750,7 @@ void drm_sched_job_cleanup(struct drm_sched_job *job) > dma_fence_put(&job->s_fence->finished); > } else { > /* aborted job before committing to run it */ > - drm_sched_fence_free(&job->s_fence->finished.rcu); > + drm_sched_fence_free(job->s_fence); > } > > job->s_fence = NULL; > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h > index 7f77a455722c..f011e4c407f2 100644 > --- a/include/drm/gpu_scheduler.h > +++ b/include/drm/gpu_scheduler.h > @@ -509,7 +509,7 @@ struct drm_sched_fence *drm_sched_fence_alloc( > struct drm_sched_entity *s_entity, void *owner); > void drm_sched_fence_init(struct drm_sched_fence *fence, > struct drm_sched_entity *entity); > -void drm_sched_fence_free(struct rcu_head *rcu); > +void drm_sched_fence_free(struct drm_sched_fence *fence); > > void drm_sched_fence_scheduled(struct drm_sched_fence *fence); > void drm_sched_fence_finished(struct drm_sched_fence *fence); > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
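For context, the "aborted job before committing to run it" branch touched above corresponds to the usual submit-path shape sketched below; the driver-side helpers are made up, and the sketch only shows where an unarmed job ends up in drm_sched_job_cleanup():

/* Sketch of a driver submit path: anything failing after
 * drm_sched_job_init() but before drm_sched_job_arm() leaves the
 * scheduler fence allocated but uninitialized, which is exactly the
 * case drm_sched_fence_free() now handles from drm_sched_job_cleanup(). */
static int driver_submit(struct driver_job *job, struct drm_sched_entity *entity)
{
        int ret;

        ret = drm_sched_job_init(&job->base, entity, NULL);
        if (ret)
                return ret;

        ret = driver_lookup_and_pin_bos(job);   /* hypothetical setup step */
        if (ret) {
                drm_sched_job_cleanup(&job->base);      /* unarmed job */
                return ret;
        }

        drm_sched_job_arm(&job->base);  /* scheduler fence initialized here */
        /* ... followed by pushing the armed job to the entity's queue ... */
        return 0;
}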
Re: [PATCH][next] drm/i915: clean up inconsistent indenting
On Thu, Sep 02, 2021 at 10:57:37PM +0100, Colin King wrote: > From: Colin Ian King > > There is a statement that is indented one character too deeply, > clean this up. > > Signed-off-by: Colin Ian King Queued to drm-intel-gt-next, thanks for patch. -Daniel > --- > drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > index de5f9c86b9a4..aeb324b701ec 100644 > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > @@ -2565,7 +2565,7 @@ __execlists_context_pre_pin(struct intel_context *ce, > if (!__test_and_set_bit(CONTEXT_INIT_BIT, &ce->flags)) { > lrc_init_state(ce, engine, *vaddr); > > - __i915_gem_object_flush_map(ce->state->obj, 0, > engine->context_size); > + __i915_gem_object_flush_map(ce->state->obj, 0, > engine->context_size); > } > > return 0; > -- > 2.32.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm/ttm: provide default page protection for UML
On Sat, Sep 04, 2021 at 11:50:37AM +0800, David Gow wrote: > On Thu, Sep 2, 2021 at 10:46 PM Daniel Vetter wrote: > > > > On Thu, Sep 02, 2021 at 07:19:01AM +0100, Anton Ivanov wrote: > > > On 02/09/2021 06:52, Randy Dunlap wrote: > > > > On 9/1/21 10:48 PM, Anton Ivanov wrote: > > > > > On 02/09/2021 03:01, Randy Dunlap wrote: > > > > > > boot_cpu_data [struct cpuinfo_um (on UML)] does not have a struct > > > > > > member named 'x86', so provide a default page protection mode > > > > > > for CONFIG_UML. > > > > > > > > > > > > Mends this build error: > > > > > > ../drivers/gpu/drm/ttm/ttm_module.c: In function > > > > > > ‘ttm_prot_from_caching’: > > > > > > ../drivers/gpu/drm/ttm/ttm_module.c:59:24: error: ‘struct > > > > > > cpuinfo_um’ has no member named ‘x86’ > > > > > >else if (boot_cpu_data.x86 > 3) > > > > > > ^ > > > > > > > > > > > > Fixes: 3bf3710e3718 ("drm/ttm: Add a generic TTM memcpy move for > > > > > > page-based iomem") > > > > > > Signed-off-by: Randy Dunlap > > > > > > Cc: Thomas Hellström > > > > > > Cc: Christian König > > > > > > Cc: Huang Rui > > > > > > Cc: dri-devel@lists.freedesktop.org > > > > > > Cc: Jeff Dike > > > > > > Cc: Richard Weinberger > > > > > > Cc: Anton Ivanov > > > > > > Cc: linux...@lists.infradead.org > > > > > > Cc: David Airlie > > > > > > Cc: Daniel Vetter > > > > > > --- > > > > > > drivers/gpu/drm/ttm/ttm_module.c |4 > > > > > > 1 file changed, 4 insertions(+) > > > > > > > > > > > > --- linux-next-20210901.orig/drivers/gpu/drm/ttm/ttm_module.c > > > > > > +++ linux-next-20210901/drivers/gpu/drm/ttm/ttm_module.c > > > > > > @@ -53,6 +53,9 @@ pgprot_t ttm_prot_from_caching(enum ttm_ > > > > > > if (caching == ttm_cached) > > > > > > return tmp; > > > > > > +#ifdef CONFIG_UML > > > > > > +tmp = pgprot_noncached(tmp); > > > > > > +#else > > > > > > #if defined(__i386__) || defined(__x86_64__) > > > > > > if (caching == ttm_write_combined) > > > > > > tmp = pgprot_writecombine(tmp); > > > > > > @@ -69,6 +72,7 @@ pgprot_t ttm_prot_from_caching(enum ttm_ > > > > > > #if defined(__sparc__) > > > > > > tmp = pgprot_noncached(tmp); > > > > > > #endif > > > > > > +#endif > > > > > > return tmp; > > > > > > } > > > > > > > > > > Patch looks OK. > > > > > > > > > > I have a question though - why all of DRM is not !UML in config. Not > > > > > like we can use them. > > > > > > > > I have no idea about that. > > > > Hopefully one of the (other) UML maintainers can answer you. > > > > > > Touche. > > > > > > We will discuss that and possibly push a patch to !UML that part of the > > > tree. IMHO it is not applicable. > > > > I thought kunit is based on top of uml, and we do want to eventually adopt > > that. Especially for helper libraries like ttm. > > UML is not actually a dependency for KUnit, so it's definitely > possible to test things which aren't compatible with UML. (In fact, > there's even now some tooling support to use qemu instead on a number > of architectures.) > > That being said, the KUnit tooling does use UML by default, so if it's > not too difficult to keep some level of UML support, it'll make it a > little easier (and faster) for people to run any KUnit tests. Yeah my understanding is that uml is the quickest way to spawn a new kernel, which kunit needs to run. And I really do like that idea, because having virtualization support in cloud CI systems (which use containers themselves) is a bit a fun exercise. The less we rely on virtual machines in containers for that, the better. Hence why I really like the uml approach for kunit. 
-Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
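For anyone wanting to poke at this, the KUnit wrapper being discussed defaults to building and booting a UML kernel, and the QEMU mode mentioned above is selected via an architecture flag; the invocations below reflect the tooling of roughly this era and may differ in other trees:

# Default: configure, build and run the tests in a UML kernel
./tools/testing/kunit/kunit.py run

# Same tests, but booted under QEMU instead of UML (architecture illustrative)
./tools/testing/kunit/kunit.py run --arch=x86_64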
Re: [Intel-gfx] [PATCH v2] drm/i915: Handle Intel igfx + Intel dgfx hybrid graphics setup
On Thu, Sep 02, 2021 at 04:01:40PM +0100, Tvrtko Ursulin wrote: > > On 02/09/2021 15:33, Daniel Vetter wrote: > > On Tue, Aug 31, 2021 at 02:18:15PM +0100, Tvrtko Ursulin wrote: > > > > > > On 31/08/2021 13:43, Daniel Vetter wrote: > > > > On Tue, Aug 31, 2021 at 10:15:03AM +0100, Tvrtko Ursulin wrote: > > > > > > > > > > On 30/08/2021 09:26, Daniel Vetter wrote: > > > > > > On Fri, Aug 27, 2021 at 03:44:42PM +0100, Tvrtko Ursulin wrote: > > > > > > > > > > > > > > On 27/08/2021 15:39, Tvrtko Ursulin wrote: > > > > > > > > From: Tvrtko Ursulin > > > > > > > > > > > > > > > > In short this makes i915 work for hybrid setups (DRI_PRIME=1 > > > > > > > > with Mesa) > > > > > > > > when rendering is done on Intel dgfx and scanout/composition on > > > > > > > > Intel > > > > > > > > igfx. > > > > > > > > > > > > > > > > Before this patch the driver was not quite ready for that > > > > > > > > setup, mainly > > > > > > > > because it was able to emit a semaphore wait between the two > > > > > > > > GPUs, which > > > > > > > > results in deadlocks because semaphore target location in HWSP > > > > > > > > is neither > > > > > > > > shared between the two, nor mapped in both GGTT spaces. > > > > > > > > > > > > > > > > To fix it the patch adds an additional check to a couple of > > > > > > > > relevant code > > > > > > > > paths in order to prevent using semaphores for inter-engine > > > > > > > > synchronisation between different driver instances. > > > > > > > > > > > > > > > > Patch also moves singly used i915_gem_object_last_write_engine > > > > > > > > to be > > > > > > > > private in its only calling unit (debugfs), while modifying it > > > > > > > > to only > > > > > > > > show activity belonging to the respective driver instance. > > > > > > > > > > > > > > > > What remains in this problem space is the question of the GEM > > > > > > > > busy ioctl. > > > > > > > > We have a somewhat ambigous comment there saying only status of > > > > > > > > native > > > > > > > > fences will be reported, which could be interpreted as either > > > > > > > > i915, or > > > > > > > > native to the drm fd. For now I have decided to leave that as > > > > > > > > is, meaning > > > > > > > > any i915 instance activity continues to be reported. > > > > > > > > > > > > > > > > v2: > > > > > > > > * Avoid adding rq->i915. (Chris) > > > > > > > > > > > > > > > > Signed-off-by: Tvrtko Ursulin > > > > > > > > > > > > Can't we just delete semaphore code and done? > > > > > > - GuC won't have it > > > > > > - media team benchmarked on top of softpin media driver, found no > > > > > > difference > > > > > > > > > > You have S-curve for saturated workloads or something else? How > > > > > thorough and > > > > > which media team I guess. > > > > > > > > > > From memory it was a nice win for some benchmarks (non-saturated > > > > > ones), but > > > > > as I have told you previously, we haven't been putting numbers in > > > > > commit > > > > > messages since it wasn't allowed. I may be able to dig out some more > > > > > details > > > > > if I went trawling through GEM channel IRC logs, although probably > > > > > not the > > > > > actual numbers since those were usually on pastebin. Or you go an > > > > > talk with > > > > > Chris since he probably remembers more details. Or you just decide > > > > > you don't > > > > > care and remove it. I wouldn't do that without putting the complete > > > > > story in > > > > > writing, but it's your call after all. 
> > > > > > > > Media has also changed, they're not using relocations anymore. > > > > > > Meaning you think it changes the benchmarking story? When coupled with > > > removal of GPU relocations then possibly yes. > > > > > > > Unless there's solid data performance tuning of any kind that gets in > > > > the > > > > way simply needs to be removed. Yes this is radical, but the codebase is > > > > in a state to require this. > > > > > > > > So either way we'd need to rebenchmark this if it's really required. > > > > Also > > > > > > Therefore can you share what benchmarks have been done or is it secret? > > > As > > > said, I think the non-saturated case was the more interesting one here. > > > > > > > if we really need this code still someone needs to fix the design, the > > > > current code is making layering violations an art form. > > > > > > > > > Anyway, without the debugfs churn it is more or less two line patch > > > > > to fix > > > > > igfx + dgfx hybrid setup. So while mulling it over this could go in. > > > > > I'd > > > > > just refine it to use a GGTT check instead of GT. And unless DG1 ends > > > > > up > > > > > being GuC only. > > > > > > > > The minimal robust fix here is imo to stop us from upcasting dma_fence > > > > to > > > > i915_request if it's not for our device. Not sprinkle code here into the > > > > semaphore code. We shouldn't even get this far with foreign fences. > > > > > > Device check does not w