Re: [PATCH v3 6/9] dma-buf/fence-array: Add fence deadline support
Am 08.09.21 um 20:00 schrieb Daniel Vetter: On Fri, Sep 03, 2021 at 11:47:57AM -0700, Rob Clark wrote: From: Rob Clark Signed-off-by: Rob Clark --- drivers/dma-buf/dma-fence-array.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-buf/dma-fence-array.c index d3fbd950be94..8d194b09ee3d 100644 --- a/drivers/dma-buf/dma-fence-array.c +++ b/drivers/dma-buf/dma-fence-array.c @@ -119,12 +119,23 @@ static void dma_fence_array_release(struct dma_fence *fence) dma_fence_free(fence); } +static void dma_fence_array_set_deadline(struct dma_fence *fence, +ktime_t deadline) +{ + struct dma_fence_array *array = to_dma_fence_array(fence); + unsigned i; + + for (i = 0; i < array->num_fences; ++i) + dma_fence_set_deadline(array->fences[i], deadline); Hm I wonder whether this can go wrong, and whether we need Christian's massive fence iterator that I've seen flying around. If you nest these things too much it could all go wrong I think. I looked at other users which inspect dma_fence_array and none of them have a risk for unbounded recursion. That should work fine or at least doesn't add anything new which could go boom. The dma_fence_array() can't contain other dma_fence_array or dma_fence_chain objects or it could end up in a recursion and corrupt the kernel stack. That's a well known limitation for other code paths as well. So Reviewed-by: Christian König for this one. Regards, Christian. Maybe check with Christian. -Daniel +} + const struct dma_fence_ops dma_fence_array_ops = { .get_driver_name = dma_fence_array_get_driver_name, .get_timeline_name = dma_fence_array_get_timeline_name, .enable_signaling = dma_fence_array_enable_signaling, .signaled = dma_fence_array_signaled, .release = dma_fence_array_release, + .set_deadline = dma_fence_array_set_deadline, }; EXPORT_SYMBOL(dma_fence_array_ops); -- 2.31.1
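For readers skimming the series, a minimal caller-side sketch of how the deadline hint is meant to be consumed. This is illustrative only: example_hint_deadline() and the vblank-derived deadline are assumptions, while dma_fence_set_deadline() is the helper added earlier in this series, which dispatches to ops->set_deadline so a dma_fence_array fans the hint out to every contained fence as implemented above.

#include <linux/dma-fence.h>

/* Sketch only, not part of the patch. */
static void example_hint_deadline(struct dma_fence *in_fence, ktime_t next_vblank)
{
	/* Tell the fence producer(s) when the result is needed, e.g. the next flip. */
	dma_fence_set_deadline(in_fence, next_vblank);

	/* Waiting (or adding a callback) is unchanged. */
	dma_fence_wait(in_fence, false);
}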
Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support
Am 08.09.21 um 20:45 schrieb Daniel Vetter: On Wed, Sep 08, 2021 at 11:19:15AM -0700, Rob Clark wrote: On Wed, Sep 8, 2021 at 10:54 AM Daniel Vetter wrote: On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote: From: Rob Clark Signed-off-by: Rob Clark --- drivers/dma-buf/dma-fence-chain.c | 13 + 1 file changed, 13 insertions(+) diff --git a/drivers/dma-buf/dma-fence-chain.c b/drivers/dma-buf/dma-fence-chain.c index 1b4cb3e5cec9..736a9ad3ea6d 100644 --- a/drivers/dma-buf/dma-fence-chain.c +++ b/drivers/dma-buf/dma-fence-chain.c @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence *fence) dma_fence_free(fence); } + +static void dma_fence_chain_set_deadline(struct dma_fence *fence, + ktime_t deadline) +{ + dma_fence_chain_for_each(fence, fence) { + struct dma_fence_chain *chain = to_dma_fence_chain(fence); + struct dma_fence *f = chain ? chain->fence : fence; Doesn't this just end up calling set_deadline on a chain, potentially resulting in recursion? Also I don't think this should ever happen, why did you add that? Tbh the fence-chain was the part I was a bit fuzzy about, and the main reason I added igt tests. The iteration is similar to how, for ex, dma_fence_chain_signaled() works, and according to the igt test it does what was intended. Huh indeed. Maybe something we should fix, like why does the dma_fence_chain_for_each not give you the upcast chain pointer ... I guess this also needs more Christian and less me. Yeah I was also already thinking about having a dma_fence_chain_for_each_contained() macro which directly returns the containing fence, just didn't have time to implement/clean that up. And yes the patch is correct as it is and avoids the recursion, so Reviewed-by: Christian König for this one. Regards, Christian. -Daniel BR, -R -Daniel + + dma_fence_set_deadline(f, deadline); + } +} + const struct dma_fence_ops dma_fence_chain_ops = { .use_64bit_seqno = true, .get_driver_name = dma_fence_chain_get_driver_name, @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = { .enable_signaling = dma_fence_chain_enable_signaling, .signaled = dma_fence_chain_signaled, .release = dma_fence_chain_release, + .set_deadline = dma_fence_chain_set_deadline, }; EXPORT_SYMBOL(dma_fence_chain_ops); -- 2.31.1 -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
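To make Christian's suggestion concrete, here is a rough sketch of the kind of helper he describes. The names dma_fence_chain_contained() and example_chain_set_deadline() are assumptions, not an existing API; only dma_fence_chain_for_each() and to_dma_fence_chain() are in the tree today.

#include <linux/dma-fence-chain.h>

/*
 * Hypothetical helper in the spirit of the dma_fence_chain_for_each_contained()
 * idea above: return the fence contained in a chain node, or the fence itself
 * once the iterator has reached a non-chain fence.
 */
static inline struct dma_fence *
dma_fence_chain_contained(struct dma_fence *fence)
{
	struct dma_fence_chain *chain = to_dma_fence_chain(fence);

	return chain ? chain->fence : fence;
}

/* The callback from the patch could then shrink to roughly this: */
static void example_chain_set_deadline(struct dma_fence *fence, ktime_t deadline)
{
	dma_fence_chain_for_each(fence, fence)
		dma_fence_set_deadline(dma_fence_chain_contained(fence), deadline);
}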
Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support
Am 08.09.21 um 19:45 schrieb Daniel Vetter: On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote: From: Rob Clark As the finished fence is the one that is exposed to userspace, and therefore the one that other operations, like atomic update, would block on, we need to propagate the deadline from the finished fence to the actual hw fence. v2: Split into drm_sched_fence_set_parent() (ckoenig) Signed-off-by: Rob Clark --- drivers/gpu/drm/scheduler/sched_fence.c | 34 + drivers/gpu/drm/scheduler/sched_main.c | 2 +- include/drm/gpu_scheduler.h | 8 ++ 3 files changed, 43 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/scheduler/sched_fence.c b/drivers/gpu/drm/scheduler/sched_fence.c index bcea035cf4c6..4fc41a71d1c7 100644 --- a/drivers/gpu/drm/scheduler/sched_fence.c +++ b/drivers/gpu/drm/scheduler/sched_fence.c @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct dma_fence *f) dma_fence_put(&fence->scheduled); } +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f, + ktime_t deadline) +{ + struct drm_sched_fence *fence = to_drm_sched_fence(f); + unsigned long flags; + + spin_lock_irqsave(&fence->lock, flags); + + /* If we already have an earlier deadline, keep it: */ + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) && + ktime_before(fence->deadline, deadline)) { + spin_unlock_irqrestore(&fence->lock, flags); + return; + } + + fence->deadline = deadline; + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags); + + spin_unlock_irqrestore(&fence->lock, flags); + + if (fence->parent) + dma_fence_set_deadline(fence->parent, deadline); +} + static const struct dma_fence_ops drm_sched_fence_ops_scheduled = { .get_driver_name = drm_sched_fence_get_driver_name, .get_timeline_name = drm_sched_fence_get_timeline_name, @@ -138,6 +162,7 @@ static const struct dma_fence_ops drm_sched_fence_ops_finished = { .get_driver_name = drm_sched_fence_get_driver_name, .get_timeline_name = drm_sched_fence_get_timeline_name, .release = drm_sched_fence_release_finished, + .set_deadline = drm_sched_fence_set_deadline_finished, }; struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f) @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f) } EXPORT_SYMBOL(to_drm_sched_fence); +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence, + struct dma_fence *fence) +{ + s_fence->parent = dma_fence_get(fence); + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, +&s_fence->finished.flags)) Don't you need the spinlock here too to avoid races? test_bit is very unordered, so guarantees nothing. Spinlock would need to be both around ->parent = and the test_bit. Entirely aside, but there's discussions going on to preallocate the hw fence somehow. If we do that we could make the deadline forwarding lockless here. Having a spinlock just to set the parent is a bit annoying. Well, previously we didn't need the spinlock to set the parent because the parent wasn't used outside of the main thread. This becomes an issue now because we can race with setting the deadline. So yeah we probably do need the spinlock here now. Christian. ... Alternative is that you do this locklessly with barriers and a _lot_ of comments. Would be good to benchmark whether the overhead matters though first.
-Daniel + dma_fence_set_deadline(fence, s_fence->deadline); +} + struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity *entity, void *owner) { diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 595e47ff7d06..27bf0ac0625f 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -978,7 +978,7 @@ static int drm_sched_main(void *param) drm_sched_fence_scheduled(s_fence); if (!IS_ERR_OR_NULL(fence)) { - s_fence->parent = dma_fence_get(fence); + drm_sched_fence_set_parent(s_fence, fence); r = dma_fence_add_callback(fence, &sched_job->cb, drm_sched_job_done_cb); if (r == -ENOENT) diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 7f77a455722c..158ddd662469 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -238,6 +238,12 @@ struct drm_sched_fence { */ struct dma_fencefinished; + /** +* @deadline: deadline set on &drm_sched_fence.finished which +* potentially needs to be pro
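A sketch of the locked variant the review converges on, assuming one possible fix rather than Rob's actual follow-up: take the drm_sched_fence's own lock around both the parent assignment and the deadline test so it cannot race with drm_sched_fence_set_deadline_finished(). The example_ prefix marks it as illustrative; &s_fence->lock, s_fence->deadline and DMA_FENCE_FLAG_HAS_DEADLINE_BIT are the fields/flag used by the patch above.

#include <drm/gpu_scheduler.h>

/* Sketch only, not the upstream function. */
static void example_sched_fence_set_parent(struct drm_sched_fence *s_fence,
					   struct dma_fence *fence)
{
	unsigned long flags;
	ktime_t deadline;
	bool has_deadline;

	spin_lock_irqsave(&s_fence->lock, flags);
	s_fence->parent = dma_fence_get(fence);
	has_deadline = test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
				&s_fence->finished.flags);
	deadline = s_fence->deadline;
	spin_unlock_irqrestore(&s_fence->lock, flags);

	/* Forward outside the lock; set_deadline may take its own locks. */
	if (has_deadline)
		dma_fence_set_deadline(fence, deadline);
}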
Re: [PATCH] kernel/locking: Add context to ww_mutex_trylock.
Op 08-09-2021 om 12:14 schreef Peter Zijlstra: > On Tue, Sep 07, 2021 at 03:20:44PM +0200, Maarten Lankhorst wrote: >> i915 will soon gain an eviction path that trylock a whole lot of locks >> for eviction, getting dmesg failures like below: >> >> BUG: MAX_LOCK_DEPTH too low! >> turning off the locking correctness validator. >> depth: 48 max: 48! >> 48 locks held by i915_selftest/5776: >> #0: 888101a79240 (&dev->mutex){}-{3:3}, at: >> __driver_attach+0x88/0x160 >> #1: c99778c0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: >> i915_vma_pin.constprop.63+0x39/0x1b0 [i915] >> #2: 88800cf74de8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_vma_pin.constprop.63+0x5f/0x1b0 [i915] >> #3: 88810c7f9e38 (&vm->mutex/1){+.+.}-{3:3}, at: >> i915_vma_pin_ww+0x1c4/0x9d0 [i915] >> #4: 88810bad5768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_gem_evict_something+0x110/0x860 [i915] >> #5: 88810bad60e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_gem_evict_something+0x110/0x860 [i915] >> ... >> #46: 88811964d768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_gem_evict_something+0x110/0x860 [i915] >> #47: 88811964e0e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: >> i915_gem_evict_something+0x110/0x860 [i915] >> INFO: lockdep is turned off. >> As an intermediate solution, add an acquire context to ww_mutex_trylock, >> which allows us to do proper nesting annotations on the trylocks, making >> the above lockdep splat disappear. > Fair enough I suppose. > >> +/** >> + * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire >> context >> + * @lock: mutex to lock >> + * @ctx: optional w/w acquire context >> + * >> + * Trylocks a mutex with the optional acquire context; no deadlock >> detection is >> + * possible. Returns 1 if the mutex has been acquired successfully, 0 >> otherwise. >> + * >> + * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a >> @ctx is >> + * specified, -EALREADY and -EDEADLK handling may happen in calls to >> ww_mutex_lock. >> + * >> + * A mutex acquired with this function must be released with >> ww_mutex_unlock. >> + */ >> +int __sched >> +ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ctx) >> +{ >> +bool locked; >> + >> +if (!ctx) >> +return mutex_trylock(&ww->base); >> + >> +#ifdef CONFIG_DEBUG_MUTEXES >> +DEBUG_LOCKS_WARN_ON(ww->base.magic != &ww->base); >> +#endif >> + >> +preempt_disable(); >> +locked = __mutex_trylock(&ww->base); >> + >> +if (locked) { >> +ww_mutex_set_context_fastpath(ww, ctx); >> +mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ctx->dep_map, >> _RET_IP_); >> +} >> +preempt_enable(); >> + >> +return locked; >> +} >> +EXPORT_SYMBOL(ww_mutex_trylock); > You'll need a similar hunk in ww_rt_mutex.c What tree has that file?
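For context, a minimal sketch of the call pattern the new @ctx argument is meant to annotate. The example_ struct and helpers are made up for illustration; only ww_mutex_trylock()/ww_mutex_unlock() (with the semantics from the patch above) and the list helpers are real API.

#include <linux/list.h>
#include <linux/ww_mutex.h>

/* Hypothetical object type used only for this illustration. */
struct example_obj {
	struct ww_mutex resv_lock;
	struct list_head lru_link;
};

static void example_try_evict(struct example_obj *obj) { /* placeholder */ }

static void example_evict_scan(struct list_head *lru, struct ww_acquire_ctx *ctx)
{
	struct example_obj *obj;

	list_for_each_entry(obj, lru, lru_link) {
		/* With @ctx, lockdep nests this under one acquire context. */
		if (!ww_mutex_trylock(&obj->resv_lock, ctx))
			continue;	/* contended, skip this candidate */

		example_try_evict(obj);
		ww_mutex_unlock(&obj->resv_lock);
	}
}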
Re: [PATCH] drm/amd/amdkfd: fix possible memory leak in svm_range_restore_pages
Hi Xiyu Yang, This bug was already fixed by this commit: https://gitlab.freedesktop.org/agd5f/linux/-/commit/598a118db0d85a432f8cd541a6a5d31e31c56b6b Regards, Felix Am 2021-09-09 um 12:27 a.m. schrieb Xiyu Yang: > The memory leak issue may take place in an error handling path. When > p->xnack_enabled is NULL, the function simply returns with -EFAULT and > forgets to decrement the reference count of a kfd_process object bumped > by kfd_lookup_process_by_pasid, which may incur memory leaks. > > Fix it by jumping to label "out", in which kfd_unref_process() decreases > the refcount. > > Signed-off-by: Xiyu Yang > Signed-off-by: Xin Xiong > Signed-off-by: Xin Tan > --- > drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > index e883731c3f8f..0f7f1e5621ea 100644 > --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c > @@ -2426,7 +2426,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, > unsigned int pasid, > } > if (!p->xnack_enabled) { > pr_debug("XNACK not enabled for pasid 0x%x\n", pasid); > - return -EFAULT; > + r = -EFAULT; > + goto out; > } > svms = &p->svms; >
Re: [PATCH v1 03/14] mm: add iomem vma selection for memory migration
Am 2021-09-01 um 9:14 p.m. schrieb Dave Chinner: > On Wed, Sep 01, 2021 at 07:07:34PM -0400, Felix Kuehling wrote: >> On 2021-09-01 6:03 p.m., Dave Chinner wrote: >>> On Wed, Sep 01, 2021 at 11:40:43AM -0400, Felix Kuehling wrote: Am 2021-09-01 um 4:29 a.m. schrieb Christoph Hellwig: > On Mon, Aug 30, 2021 at 01:04:43PM -0400, Felix Kuehling wrote: driver code is not really involved in updating the CPU mappings. Maybe it's something we need to do in the migration helpers. >>> It looks like I'm totally misunderstanding what you are adding here >>> then. Why do we need any special treatment at all for memory that >>> has normal struct pages and is part of the direct kernel map? >> The pages are like normal memory for purposes of mapping them in CPU >> page tables and for coherent access from the CPU. > That's the user page tables. What about the kernel direct map? > If there is a normal kernel struct page backing there really should > be no need for the pgmap. I'm not sure. The physical address ranges are in the UEFI system address map as special-purpose memory. Does Linux create the struct pages and kernel direct map for that without a pgmap call? I didn't see that last time I went digging through that code. >> From an application >> perspective, we want file-backed and anonymous mappings to be able to >> use DEVICE_PUBLIC pages with coherent CPU access. The goal is to >> optimize performance for GPU heavy workloads while minimizing the need >> to migrate data back-and-forth between system memory and device memory. > I don't really understand that part. file backed pages are always > allocated by the file system using the pagecache helpers, that is > using the page allocator. Anonymouns memory also always comes from > the page allocator. I'm coming at this from my experience with DEVICE_PRIVATE. Both anonymous and file-backed pages should be migrateable to DEVICE_PRIVATE memory by the migrate_vma_* helpers for more efficient access by our GPU. (*) It's part of the basic premise of HMM as I understand it. I would expect the same thing to work for DEVICE_PUBLIC memory. (*) I believe migrating file-backed pages to DEVICE_PRIVATE doesn't currently work, but that's something I'm hoping to fix at some point. >>> FWIW, I'd love to see the architecture documents that define how >>> filesystems are supposed to interact with this device private >>> memory. This whole "hand filesystem controlled memory to other >>> devices" is a minefield that is trivial to get wrong iand very >>> difficult to fix - just look at the historical mess that RDMA >>> to/from file backed and/or DAX pages has been. >>> >>> So, really, from my perspective as a filesystem engineer, I want to >>> see an actual specification for how this new memory type is going to >>> interact with filesystem and the page cache so everyone has some >>> idea of how this is going to work and can point out how it doesn't >>> work before code that simply doesn't work is pushed out into >>> production systems and then merged >> OK. To be clear, that's not part of this patch series. And I have no >> authority to push anything in this part of the kernel, so you have nothing >> to fear. ;) > I know this isn't part of the series. but this patchset is laying > the foundation for future work that will impact subsystems that > currently have zero visibility and/or knowledge of these changes. I don't think this patchset is the foundation. 
Jerome Glisse's work on HMM is, which was merged 4 years ago and is being used by multiple drivers now, with the AMD GPU driver being a fairly recent addition. > There must be an overall architectural plan for this functionality, > regardless of the current state of implementation. It's that overall > architectural plan I'm asking about here, because I need to > understand that before I can sanely comment on the page > cache/filesystem aspect of the proposed functionality... The overall HMM and ZONE_DEVICE architecture is documented to some extent in Documentation/vm/hmm.rst, though it may not go into the level of detail you are looking for. > >> FWIW, we already have the ability to map file-backed system memory pages >> into device page tables with HMM and interval notifiers. But we cannot >> currently migrate them to ZONE_DEVICE pages. > Sure, but sharing page cache pages allocated and managed by the > filesystem is not what you are talking about. You're talking about > migrating page cache data to completely different memory allocated > by a different memory manager that the filesystems currently have no > knowledge of or have any way of interfacing with. This is not part of the current patch series. It is only my intention to look into ways to migrate file-backed pages to ZONE_DEVICE memory in the future. > > So I'm asking basic, fundamental questions abou
[PATCH v2 2/2] dt-bindings: display: Add binding for LG.Philips SW43101
Add a device tree binding for LG.Philips SW43101. Signed-off-by: Yassine Oudjana --- Changes since v1: - Add regulator support. - Add MAINTAINERS entry. - Dual-license DT binding. .../display/panel/lgphilips,sw43101.yaml | 75 +++ MAINTAINERS | 1 + 2 files changed, 76 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml diff --git a/Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml b/Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml new file mode 100644 index ..6f67130bab8b --- /dev/null +++ b/Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml @@ -0,0 +1,75 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/display/panel/lgphilips,sw43101.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: LG.Philips SW43101 1080x1920 OLED panel + +maintainers: + - Yassine Oudjana + +allOf: + - $ref: panel-common.yaml# + +properties: + compatible: +const: lgphilips,sw43101 + + reg: true + reset-gpios: true + + vdd-supply: +description: Digital power supply + + avdd-supply: +description: Analog power supply + + elvdd-supply: +description: Positive electroluminescence power supply + + elvss-supply: +description: Negative electroluminescence power supply + + port: true + +required: + - compatible + - reg + - reset-gpios + - vdd-supply + - avdd-supply + - elvdd-supply + - elvss-supply + - port + +additionalProperties: false + +examples: + - | +#include + +dsi { +#address-cells = <1>; +#size-cells = <0>; + +panel@0 { +compatible = "lgphilips,sw43101"; +reg = <0>; + +reset-gpios = <&msmgpio 8 GPIO_ACTIVE_LOW>; + +vdd-supply = <&vreg_l14a_1p8>; +avdd-supply = <&vlin1_7v3>; +elvdd-supply = <&elvdd>; +elvss-supply = <&elvss>; + +port { +panel_in: endpoint { +remote-endpoint = <&dsi_out>; +}; +}; +}; +}; + +... diff --git a/MAINTAINERS b/MAINTAINERS index 46431e8ad373..aab9f057e8d7 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5902,6 +5902,7 @@ F:include/uapi/drm/i810_drm.h DRM DRIVER FOR LG.PHILIPS SW43101 PANEL M: Yassine Oudjana S: Maintained +F: Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml F: drivers/gpu/drm/panel/panel-lgphilips-sw43101.c DRM DRIVER FOR LVDS PANELS -- 2.33.0
[PATCH v2 1/2] drm/panel: Add driver for LG.Philips SW43101 DSI video mode panel
Add a driver for the LG.Philips SW43101 FHD (1080x1920) OLED DSI video mode panel. This driver has been generated using linux-mdss-dsi-panel-driver-generator. Signed-off-by: Yassine Oudjana --- Changes since v1: - Add regulator support. - Add MAINTAINERS entry. MAINTAINERS | 5 + drivers/gpu/drm/panel/Kconfig | 10 + drivers/gpu/drm/panel/Makefile| 1 + .../gpu/drm/panel/panel-lgphilips-sw43101.c | 363 ++ 4 files changed, 379 insertions(+) create mode 100644 drivers/gpu/drm/panel/panel-lgphilips-sw43101.c diff --git a/MAINTAINERS b/MAINTAINERS index f58dad1a1922..46431e8ad373 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -5899,6 +5899,11 @@ S: Orphan / Obsolete F: drivers/gpu/drm/i810/ F: include/uapi/drm/i810_drm.h +DRM DRIVER FOR LG.PHILIPS SW43101 PANEL +M: Yassine Oudjana +S: Maintained +F: drivers/gpu/drm/panel/panel-lgphilips-sw43101.c + DRM DRIVER FOR LVDS PANELS M: Laurent Pinchart L: dri-devel@lists.freedesktop.org diff --git a/drivers/gpu/drm/panel/Kconfig b/drivers/gpu/drm/panel/Kconfig index beb581b96ecd..d8741c35bbfc 100644 --- a/drivers/gpu/drm/panel/Kconfig +++ b/drivers/gpu/drm/panel/Kconfig @@ -226,6 +226,16 @@ config DRM_PANEL_SAMSUNG_LD9040 depends on OF && SPI select VIDEOMODE_HELPERS +config DRM_PANEL_LGPHILIPS_SW43101 + tristate "LG.Philips SW43101 DSI video mode panel" + depends on OF + depends on DRM_MIPI_DSI + depends on BACKLIGHT_CLASS_DEVICE + help + Say Y here if you want to enable support for the LG.Philips SW43101 FHD + (1080x1920) OLED DSI video mode panel found on the Xiaomi Mi Note 2. + To compile this driver as a module, choose M here. + config DRM_PANEL_LG_LB035Q02 tristate "LG LB035Q024573 RGB panel" depends on GPIOLIB && OF && SPI diff --git a/drivers/gpu/drm/panel/Makefile b/drivers/gpu/drm/panel/Makefile index c8132050bcec..e79143ad14dd 100644 --- a/drivers/gpu/drm/panel/Makefile +++ b/drivers/gpu/drm/panel/Makefile @@ -20,6 +20,7 @@ obj-$(CONFIG_DRM_PANEL_KHADAS_TS050) += panel-khadas-ts050.o obj-$(CONFIG_DRM_PANEL_KINGDISPLAY_KD097D04) += panel-kingdisplay-kd097d04.o obj-$(CONFIG_DRM_PANEL_LEADTEK_LTK050H3146W) += panel-leadtek-ltk050h3146w.o obj-$(CONFIG_DRM_PANEL_LEADTEK_LTK500HD1829) += panel-leadtek-ltk500hd1829.o +obj-$(CONFIG_DRM_PANEL_LGPHILIPS_SW43101) += panel-lgphilips-sw43101.o obj-$(CONFIG_DRM_PANEL_LG_LB035Q02) += panel-lg-lb035q02.o obj-$(CONFIG_DRM_PANEL_LG_LG4573) += panel-lg-lg4573.o obj-$(CONFIG_DRM_PANEL_NEC_NL8048HL11) += panel-nec-nl8048hl11.o diff --git a/drivers/gpu/drm/panel/panel-lgphilips-sw43101.c b/drivers/gpu/drm/panel/panel-lgphilips-sw43101.c new file mode 100644 index ..b7adf1d4a8ab --- /dev/null +++ b/drivers/gpu/drm/panel/panel-lgphilips-sw43101.c @@ -0,0 +1,363 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * LG.Philips SW43101 OLED Panel driver + * Generated with linux-mdss-dsi-panel-driver-generator + * + * Copyright (c) 2020 Yassine Oudjana + */ + +#include +#include +#include +#include +#include +#include + +#include + +#include +#include +#include + +static const char * const regulator_names[] = { + "vdd", + "avdd", + "elvdd", + "elvss", +}; + +struct sw43101_device { + struct drm_panel panel; + struct mipi_dsi_device *dsi; + struct gpio_desc *reset_gpio; + struct regulator_bulk_data supplies[ARRAY_SIZE(regulator_names)]; + bool prepared; +}; + +static inline +struct sw43101_device *to_sw43101_device(struct drm_panel *panel) +{ + return container_of(panel, struct sw43101_device, panel); +} + +#define dsi_dcs_write_seq(dsi, seq...) 
do {\ + static const u8 d[] = { seq }; \ + int ret;\ + ret = mipi_dsi_dcs_write_buffer(dsi, d, ARRAY_SIZE(d)); \ + if (ret < 0)\ + return ret; \ + } while (0) + +static void sw43101_reset(struct sw43101_device *ctx) +{ + gpiod_set_value_cansleep(ctx->reset_gpio, 0); + usleep_range(1, 11000); + gpiod_set_value_cansleep(ctx->reset_gpio, 1); + usleep_range(1000, 2000); + gpiod_set_value_cansleep(ctx->reset_gpio, 0); + msleep(20); +} + +static int sw43101_on(struct sw43101_device *ctx) +{ + struct mipi_dsi_device *dsi = ctx->dsi; + struct device *dev = &dsi->dev; + int ret; + + dsi->mode_flags |= MIPI_DSI_MODE_LPM; + + dsi_dcs_write_seq(dsi, 0xb0, 0x5a); + usleep_range(1000, 2000); + dsi_dcs_write_seq(dsi, 0xb2, 0x13, 0x12, 0x40, 0xd0, 0xff, 0xff, 0x15); + dsi_dcs_write_seq(dsi, 0xe3, 0x01); + usleep_range(1000, 2000); + dsi_d
[PATCH v2 0/2] drm/panel: Add support for LG.Philips SW43101 DSI video mode panel
This adds a driver for the LG.Philips SW43101 FHD (1080x1920) 58Hz OLED DSI video mode panel, found on the Xiaomi Mi Note 2. Changes since v1: - Add regulator support. - Add MAINTAINERS entry. - Dual-license DT binding. Yassine Oudjana (2): drm/panel: Add driver for LG.Philips SW43101 DSI video mode panel dt-bindings: display: Add binding for LG.Philips SW43101 .../display/panel/lgphilips,sw43101.yaml | 75 MAINTAINERS | 6 + drivers/gpu/drm/panel/Kconfig | 10 + drivers/gpu/drm/panel/Makefile| 1 + .../gpu/drm/panel/panel-lgphilips-sw43101.c | 363 ++ 5 files changed, 455 insertions(+) create mode 100644 Documentation/devicetree/bindings/display/panel/lgphilips,sw43101.yaml create mode 100644 drivers/gpu/drm/panel/panel-lgphilips-sw43101.c -- 2.33.0
Re: [PATCH v1 03/14] mm: add iomem vma selection for memory migration
Am 2021-09-02 um 4:18 a.m. schrieb Christoph Hellwig: > On Wed, Sep 01, 2021 at 11:40:43AM -0400, Felix Kuehling wrote: > It looks like I'm totally misunderstanding what you are adding here > then. Why do we need any special treatment at all for memory that > has normal struct pages and is part of the direct kernel map? The pages are like normal memory for purposes of mapping them in CPU page tables and for coherent access from the CPU. >>> That's the user page tables. What about the kernel direct map? >>> If there is a normal kernel struct page backing there really should >>> be no need for the pgmap. >> I'm not sure. The physical address ranges are in the UEFI system address >> map as special-purpose memory. Does Linux create the struct pages and >> kernel direct map for that without a pgmap call? I didn't see that last >> time I went digging through that code. > So doing some googling finds a patch from Dan that claims to hand EFI > special purpose memory to the device dax driver. But when I try to > follow the version that got merged it looks it is treated simply as an > MMIO region to be claimed by drivers, which would not get a struct page. > > Dan, did I misunderstand how E820_TYPE_SOFT_RESERVED works? > From an application perspective, we want file-backed and anonymous mappings to be able to use DEVICE_PUBLIC pages with coherent CPU access. The goal is to optimize performance for GPU heavy workloads while minimizing the need to migrate data back-and-forth between system memory and device memory. >>> I don't really understand that part. file backed pages are always >>> allocated by the file system using the pagecache helpers, that is >>> using the page allocator. Anonymouns memory also always comes from >>> the page allocator. >> I'm coming at this from my experience with DEVICE_PRIVATE. Both >> anonymous and file-backed pages should be migrateable to DEVICE_PRIVATE >> memory by the migrate_vma_* helpers for more efficient access by our >> GPU. (*) It's part of the basic premise of HMM as I understand it. I >> would expect the same thing to work for DEVICE_PUBLIC memory. > Ok, so you want to migrate to and from them. Not use DEVICE_PUBLIC > for the actual page cache pages. That maks a lot more sense. > >> I see DEVICE_PUBLIC as an improved version of DEVICE_PRIVATE that allows >> the CPU to map the device memory coherently to minimize the need for >> migrations when CPU and GPU access the same memory concurrently or >> alternatingly. But we're not going as far as putting that memory >> entirely under the management of the Linux memory manager and VM >> subsystem. Our (and HPE's) system architects decided that this memory is >> not suitable to be used like regular NUMA system memory by the Linux >> memory manager. > So yes. It is a Memory Mapped I/O region, which unlike the PCIe BARs > that people typically deal with is fully cache coherent. I think this > does make more sense as a description. > > But to go back to what start this discussion: If these are memory > mapped I/O pfn_valid should generally not return true for them. As I understand it, pfn_valid should be true for any pfn that's part of the kernel's physical memory map, i.e. is returned by page_to_pfn or works with pfn_to_page. Both the hmm_range_fault and the migrate_vma_* APIs use pfns to refer to regular system memory and ZONE_DEVICE pages (even DEVICE_PRIVATE). Therefore I believe pfn_valid should be true for ZONE_DEVICE pages as well. 
Regards, Felix > > And as you already pointed out in reply to Alex we need to tighten the > selection criteria one way or another.
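For readers unfamiliar with the helpers Felix refers to, a rough sketch of the migrate_vma_* flow a DEVICE_PRIVATE driver typically uses. This is a simplification and an assumption, not code from this series; the function name, EXAMPLE_NPAGES and the elided device copy are placeholders, and lib/test_hmm.c is the in-tree reference implementation.

#include <linux/migrate.h>
#include <linux/mm.h>

#define EXAMPLE_NPAGES 16	/* arbitrary size for illustration */

static int example_migrate_range_to_device(struct vm_area_struct *vma,
					   unsigned long start,
					   void *pgmap_owner)
{
	unsigned long src[EXAMPLE_NPAGES] = { 0 };
	unsigned long dst[EXAMPLE_NPAGES] = { 0 };
	struct migrate_vma args = {
		.vma		= vma,
		.start		= start,
		.end		= start + EXAMPLE_NPAGES * PAGE_SIZE,
		.src		= src,
		.dst		= dst,
		.pgmap_owner	= pgmap_owner,
		.flags		= MIGRATE_VMA_SELECT_SYSTEM,
	};
	int ret;

	ret = migrate_vma_setup(&args);		/* collect and isolate source pages */
	if (ret)
		return ret;

	/*
	 * A real driver now allocates device pages, fills dst[] with
	 * migrate_pfn() values and copies the data with its DMA engine.
	 */

	migrate_vma_pages(&args);		/* install the new pages */
	migrate_vma_finalize(&args);		/* drop refs, roll back failures */
	return 0;
}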
[Bug 213917] Screen starts flickering when laptop(amdgpu) wakes up after suspend.
https://bugzilla.kernel.org/show_bug.cgi?id=213917 Samuel Sieb (samuel-kb...@sieb.net) changed: What|Removed |Added CC||samuel-kb...@sieb.net --- Comment #1 from Samuel Sieb (samuel-kb...@sieb.net) --- I've just run into this as well. I upgraded the laptop to 5.13.12 and the display doesn't resume properly as described. The cpu is "AMD A10-9600P RADEON R5". The only relevant messages in the journal are 4 lines like: amdgpu :00:01.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x address=0x10aa25720 flags=0x0070] with a different value for address in each one.
Re: [PULL] drm-misc-fixes
On Thu, 9 Sept 2021 at 03:44, Thomas Zimmermann wrote: > > Hi Dave and Daniel, > > here's this week's PR for drm-misc-fixes. One patch fixes a potential deadlock > in TTM, the other enables an additional plane in kmb. I'm slightly unhappy > that the latter one ended up in -fixes as it's not a bugfix AFAICT. To avoid a messy merge window, I'm not pulling this until after rc1 unless there is some major reason? The current drm-next doesn't have v5.14 in it, and the merge is rather ugly right now. (maybe I should always pull it in before sending to Linus to avoid this in future). Dave.
[PATCH] dma-buf: system_heap: Avoid warning on mid-order allocations
When trying to do mid-order allocations, set __GFP_NOWARN to avoid warning messages if the allocation fails, as we will still fall back to single page allocations in that case. This is similar to what we already do for large order allocations. Cc: Daniel Vetter Cc: Christian Koenig Cc: Sumit Semwal Cc: Liam Mark Cc: Chris Goldsworthy Cc: Laura Abbott Cc: Brian Starkey Cc: Hridya Valsaraju Cc: Suren Baghdasaryan Cc: Sandeep Patil Cc: Daniel Mentz Cc: Ørjan Eide Cc: Robin Murphy Cc: Simon Ser Cc: James Jones Cc: Leo Yan Cc: linux-me...@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: John Stultz --- drivers/dma-buf/heaps/system_heap.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c index 23a7e74ef966..f57a39ddd063 100644 --- a/drivers/dma-buf/heaps/system_heap.c +++ b/drivers/dma-buf/heaps/system_heap.c @@ -40,11 +40,12 @@ struct dma_heap_attachment { bool mapped; }; +#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP) +#define MID_ORDER_GFP (LOW_ORDER_GFP | __GFP_NOWARN) #define HIGH_ORDER_GFP (((GFP_HIGHUSER | __GFP_ZERO | __GFP_NOWARN \ | __GFP_NORETRY) & ~__GFP_RECLAIM) \ | __GFP_COMP) -#define LOW_ORDER_GFP (GFP_HIGHUSER | __GFP_ZERO | __GFP_COMP) -static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, LOW_ORDER_GFP}; +static gfp_t order_flags[] = {HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP}; /* * The selection of the orders used for allocation (1MB, 64K, 4K) is designed * to match with the sizes often found in IOMMUs. Using order 4 pages instead -- 2.25.1
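To illustrate why only the final order needs to warn, here is a simplified sketch of the order fallback such a heap performs. It is condensed for illustration, not the driver's exact code: example_alloc_largest_available() is an assumed name, and HIGH/MID/LOW_ORDER_GFP are the defines from the patch above.

#include <linux/gfp.h>
#include <linux/kernel.h>

/* Sketch only: higher orders are opportunistic, so their failures stay quiet. */
static struct page *example_alloc_largest_available(unsigned long nr_pages_left)
{
	static const unsigned int orders[] = { 8, 4, 0 };
	static const gfp_t flags[] = { HIGH_ORDER_GFP, MID_ORDER_GFP, LOW_ORDER_GFP };
	struct page *page;
	unsigned int i;

	for (i = 0; i < ARRAY_SIZE(orders); i++) {
		if (nr_pages_left < (1UL << orders[i]))
			continue;	/* request too small for this order */

		page = alloc_pages(flags[i], orders[i]);
		if (page)
			return page;
		/* Fall through to the next, smaller order. */
	}
	return NULL;
}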
[PATCH AUTOSEL 5.14 010/252] dma-buf: fix dma_resv_test_signaled test_all handling v2
From: Christian König [ Upstream commit 9d38814d1e346ea37a51cbf31f4424c9d059459e ] As the name implies if testing all fences is requested we should indeed test all fences and not skip the exclusive one because we see shared ones. v2: fix logic once more Signed-off-by: Christian König Reviewed-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20210702111642.17259-3-christian.koe...@amd.com Signed-off-by: Sasha Levin --- drivers/dma-buf/dma-resv.c | 33 - 1 file changed, 12 insertions(+), 21 deletions(-) diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c index f26c71747d43..e744fd87c63c 100644 --- a/drivers/dma-buf/dma-resv.c +++ b/drivers/dma-buf/dma-resv.c @@ -615,25 +615,21 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence) */ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) { - unsigned int seq, shared_count; + struct dma_fence *fence; + unsigned int seq; int ret; rcu_read_lock(); retry: ret = true; - shared_count = 0; seq = read_seqcount_begin(&obj->seq); if (test_all) { struct dma_resv_list *fobj = dma_resv_shared_list(obj); - unsigned int i; - - if (fobj) - shared_count = fobj->shared_count; + unsigned int i, shared_count; + shared_count = fobj ? fobj->shared_count : 0; for (i = 0; i < shared_count; ++i) { - struct dma_fence *fence; - fence = rcu_dereference(fobj->shared[i]); ret = dma_resv_test_signaled_single(fence); if (ret < 0) @@ -641,24 +637,19 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all) else if (!ret) break; } - - if (read_seqcount_retry(&obj->seq, seq)) - goto retry; } - if (!shared_count) { - struct dma_fence *fence_excl = dma_resv_excl_fence(obj); - - if (fence_excl) { - ret = dma_resv_test_signaled_single(fence_excl); - if (ret < 0) - goto retry; + fence = dma_resv_excl_fence(obj); + if (ret && fence) { + ret = dma_resv_test_signaled_single(fence); + if (ret < 0) + goto retry; - if (read_seqcount_retry(&obj->seq, seq)) - goto retry; - } } + if (read_seqcount_retry(&obj->seq, seq)) + goto retry; + rcu_read_unlock(); return ret; } -- 2.30.2
[PATCH AUTOSEL 5.14 008/252] drm/amdgpu: Fix koops when accessing RAS EEPROM
From: Luben Tuikov [ Upstream commit 1d9d2ca85b32605ac9c74c8fa42d0c1cfbe019d4 ] Debugfs RAS EEPROM files are available when the ASIC supports RAS, and when debugfs is enabled, and also when the "ras_enable" module parameter is set to 0. However in this case, we get a kernel oops when accessing some of the "ras_..." controls in debugfs. The reason for this is that struct amdgpu_ras::adev is unset. This commit sets it, thus enabling access to those facilities. Note that this facilitates EEPROM access and not necessarily RAS features or functionality. Cc: Alexander Deucher Cc: John Clements Cc: Hawking Zhang Signed-off-by: Luben Tuikov Acked-by: Alexander Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 16 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index fc66aca28594..95d5842385b3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -1966,11 +1966,20 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev) bool exc_err_limit = false; int ret; - if (adev->ras_enabled && con) - data = &con->eh_data; - else + if (!con) + return 0; + + /* Allow access to RAS EEPROM via debugfs, when the ASIC +* supports RAS and debugfs is enabled, but when +* adev->ras_enabled is unset, i.e. when "ras_enable" +* module parameter is set to 0. +*/ + con->adev = adev; + + if (!adev->ras_enabled) return 0; + data = &con->eh_data; *data = kmalloc(sizeof(**data), GFP_KERNEL | __GFP_ZERO); if (!*data) { ret = -ENOMEM; @@ -1980,7 +1989,6 @@ int amdgpu_ras_recovery_init(struct amdgpu_device *adev) mutex_init(&con->recovery_lock); INIT_WORK(&con->recovery_work, amdgpu_ras_do_recovery); atomic_set(&con->in_recovery, 0); - con->adev = adev; max_eeprom_records_len = amdgpu_ras_eeprom_get_record_max_length(); amdgpu_ras_validate_threshold(adev, max_eeprom_records_len); -- 2.30.2
[PATCH AUTOSEL 5.14 009/252] drm: vc4: Fix pixel-wrap issue with DVP teardown
From: Tim Gover [ Upstream commit 0b066a6809d0f8fd9868e383add36aa5a2fa409d ] Adjust the DVP enable/disable sequence to avoid a pixel getting stuck in an internal, non resettable FIFO within PixelValve when changing HDMI resolution. The blank pixels features of the DVP can prevent signals back to pixelvalve causing it to not clear the FIFO. Adjust the ordering and timing of operations to ensure the clear signal makes it through to pixelvalve. Signed-off-by: Tim Gover Signed-off-by: Maxime Ripard Link: https://patchwork.freedesktop.org/patch/msgid/20210628130533.144617-1-max...@cerno.tech Signed-off-by: Sasha Levin --- drivers/gpu/drm/vc4/vc4_hdmi.c | 15 --- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c index ad92dbb128b3..f91d37beb113 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.c +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c @@ -613,12 +613,12 @@ static void vc4_hdmi_encoder_post_crtc_disable(struct drm_encoder *encoder, HDMI_WRITE(HDMI_RAM_PACKET_CONFIG, 0); - HDMI_WRITE(HDMI_VID_CTL, HDMI_READ(HDMI_VID_CTL) | - VC4_HD_VID_CTL_CLRRGB | VC4_HD_VID_CTL_CLRSYNC); + HDMI_WRITE(HDMI_VID_CTL, HDMI_READ(HDMI_VID_CTL) | VC4_HD_VID_CTL_CLRRGB); - HDMI_WRITE(HDMI_VID_CTL, - HDMI_READ(HDMI_VID_CTL) | VC4_HD_VID_CTL_BLANKPIX); + mdelay(1); + HDMI_WRITE(HDMI_VID_CTL, + HDMI_READ(HDMI_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE); vc4_hdmi_disable_scrambling(encoder); } @@ -628,12 +628,12 @@ static void vc4_hdmi_encoder_post_crtc_powerdown(struct drm_encoder *encoder, struct vc4_hdmi *vc4_hdmi = encoder_to_vc4_hdmi(encoder); int ret; + HDMI_WRITE(HDMI_VID_CTL, + HDMI_READ(HDMI_VID_CTL) | VC4_HD_VID_CTL_BLANKPIX); + if (vc4_hdmi->variant->phy_disable) vc4_hdmi->variant->phy_disable(vc4_hdmi); - HDMI_WRITE(HDMI_VID_CTL, - HDMI_READ(HDMI_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE); - clk_disable_unprepare(vc4_hdmi->pixel_bvb_clock); clk_disable_unprepare(vc4_hdmi->pixel_clock); @@ -1015,6 +1015,7 @@ static void vc4_hdmi_encoder_post_crtc_enable(struct drm_encoder *encoder, HDMI_WRITE(HDMI_VID_CTL, VC4_HD_VID_CTL_ENABLE | + VC4_HD_VID_CTL_CLRRGB | VC4_HD_VID_CTL_UNDERFLOW_ENABLE | VC4_HD_VID_CTL_FRAME_COUNTER_RESET | (vsync_pos ? 0 : VC4_HD_VID_CTL_VSYNC_LOW) | -- 2.30.2
[PATCH AUTOSEL 5.14 007/252] drm/amdgpu: Fix amdgpu_ras_eeprom_init()
From: Luben Tuikov [ Upstream commit dce4400e6516d18313d23de45b5be8a18980b00e ] No need to account for the 2 bytes of EEPROM address--this is now well abstracted away by the fixes to the lower layers. Cc: Andrey Grodzovsky Cc: Alexander Deucher Signed-off-by: Luben Tuikov Acked-by: Alexander Deucher Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c index 38222de921d1..8dd151c9e459 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c @@ -325,7 +325,7 @@ int amdgpu_ras_eeprom_init(struct amdgpu_ras_eeprom_control *control, return ret; } - __decode_table_header_from_buff(hdr, &buff[2]); + __decode_table_header_from_buff(hdr, buff); if (hdr->header == EEPROM_TABLE_HDR_VAL) { control->num_recs = (hdr->tbl_size - EEPROM_TABLE_HEADER_SIZE) / -- 2.30.2
[PATCH AUTOSEL 5.14 006/252] drm/omap: Follow implicit fencing in prepare_fb
From: Daniel Vetter [ Upstream commit 942d8344d5f14b9ea2ae43756f319b9f44216ba4 ] I guess no one ever tried running omap together with lima or panfrost, not even sure that's possible. Anyway for consistency, fix this. Reviewed-by: Tomi Valkeinen Signed-off-by: Daniel Vetter Cc: Tomi Valkeinen Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-12-daniel.vet...@ffwll.ch Signed-off-by: Sasha Levin --- drivers/gpu/drm/omapdrm/omap_plane.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/omapdrm/omap_plane.c b/drivers/gpu/drm/omapdrm/omap_plane.c index 801da917507d..512af976b7e9 100644 --- a/drivers/gpu/drm/omapdrm/omap_plane.c +++ b/drivers/gpu/drm/omapdrm/omap_plane.c @@ -6,6 +6,7 @@ #include #include +#include #include #include "omap_dmm_tiler.h" @@ -29,6 +30,8 @@ static int omap_plane_prepare_fb(struct drm_plane *plane, if (!new_state->fb) return 0; + drm_gem_plane_helper_prepare_fb(plane, new_state); + return omap_framebuffer_pin(new_state->fb); } -- 2.30.2
[PATCH AUTOSEL 5.14 005/252] drm/ttm: Fix multihop assert on eviction.
From: Andrey Grodzovsky [ Upstream commit 403797925768d9fa870f5b1ebcd20016b397083b ] Problem: Under memory pressure when GTT domain is almost full multihop assert will come up when trying to evict LRU BO from VRAM to SYSTEM. Fix: Don't assert on multihop error in evict code but rather do a retry as we do in ttm_bo_move_buffer Signed-off-by: Andrey Grodzovsky Reviewed-by: Christian König Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-6-andrey.grodzov...@amd.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/ttm/ttm_bo.c | 63 +++- 1 file changed, 34 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 8d7fd65ccced..32202385073a 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -488,6 +488,31 @@ void ttm_bo_unlock_delayed_workqueue(struct ttm_device *bdev, int resched) } EXPORT_SYMBOL(ttm_bo_unlock_delayed_workqueue); +static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo, +struct ttm_resource **mem, +struct ttm_operation_ctx *ctx, +struct ttm_place *hop) +{ + struct ttm_placement hop_placement; + struct ttm_resource *hop_mem; + int ret; + + hop_placement.num_placement = hop_placement.num_busy_placement = 1; + hop_placement.placement = hop_placement.busy_placement = hop; + + /* find space in the bounce domain */ + ret = ttm_bo_mem_space(bo, &hop_placement, &hop_mem, ctx); + if (ret) + return ret; + /* move to the bounce domain */ + ret = ttm_bo_handle_move_mem(bo, hop_mem, false, ctx, NULL); + if (ret) { + ttm_resource_free(bo, &hop_mem); + return ret; + } + return 0; +} + static int ttm_bo_evict(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx) { @@ -527,12 +552,17 @@ static int ttm_bo_evict(struct ttm_buffer_object *bo, goto out; } +bounce: ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, &hop); - if (unlikely(ret)) { - WARN(ret == -EMULTIHOP, "Unexpected multihop in eviction - likely driver bug\n"); - if (ret != -ERESTARTSYS) + if (ret == -EMULTIHOP) { + ret = ttm_bo_bounce_temp_buffer(bo, &evict_mem, ctx, &hop); + if (ret) { pr_err("Buffer eviction failed\n"); - ttm_resource_free(bo, &evict_mem); + ttm_resource_free(bo, &evict_mem); + goto out; + } + /* try and move to final place now. */ + goto bounce; } out: return ret; @@ -847,31 +877,6 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo, } EXPORT_SYMBOL(ttm_bo_mem_space); -static int ttm_bo_bounce_temp_buffer(struct ttm_buffer_object *bo, -struct ttm_resource **mem, -struct ttm_operation_ctx *ctx, -struct ttm_place *hop) -{ - struct ttm_placement hop_placement; - struct ttm_resource *hop_mem; - int ret; - - hop_placement.num_placement = hop_placement.num_busy_placement = 1; - hop_placement.placement = hop_placement.busy_placement = hop; - - /* find space in the bounce domain */ - ret = ttm_bo_mem_space(bo, &hop_placement, &hop_mem, ctx); - if (ret) - return ret; - /* move to the bounce domain */ - ret = ttm_bo_handle_move_mem(bo, hop_mem, false, ctx, NULL); - if (ret) { - ttm_resource_free(bo, &hop_mem); - return ret; - } - return 0; -} - static int ttm_bo_move_buffer(struct ttm_buffer_object *bo, struct ttm_placement *placement, struct ttm_operation_ctx *ctx) -- 2.30.2
[PATCH AUTOSEL 5.14 004/252] drm/vc4: hdmi: Set HD_CTL_WHOLSMP and HD_CTL_CHALIGN_SET
From: Dom Cobley [ Upstream commit 1698ecb218eb82587dbfc71a2e26ded66e5ecf59 ] Symptom is random switching of speakers when using multichannel. Repeatedly running speakertest -c8 occasionally starts with channels jumbled. This is fixed with HD_CTL_WHOLSMP. The other bit looks beneficial and appears harmless in testing so I'd suggest adding it too. Documentation says: HD_CTL_WHILSMP_SET Wait for whole sample. When this bit is set MAI transmit will start only when there is at least one whole sample available in the fifo. Documentation says: HD_CTL_CHALIGN_SET Channel Align When Overflow. This bit is used to realign the audio channels in case of an overflow. If this bit is set, after the detection of an overflow, equal amount of dummy words to the missing words will be written to fifo, filling up the broken sample and maintaining alignment. Signed-off-by: Dom Cobley Signed-off-by: Maxime Ripard Reviewed-by: Nicolas Saenz Julienne Link: https://patchwork.freedesktop.org/patch/msgid/20210525132354.297468-7-max...@cerno.tech Signed-off-by: Sasha Levin --- drivers/gpu/drm/vc4/vc4_hdmi.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c index c2876731ee2d..ad92dbb128b3 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.c +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c @@ -1372,7 +1372,9 @@ static int vc4_hdmi_audio_trigger(struct snd_pcm_substream *substream, int cmd, HDMI_WRITE(HDMI_MAI_CTL, VC4_SET_FIELD(vc4_hdmi->audio.channels, VC4_HD_MAI_CTL_CHNUM) | - VC4_HD_MAI_CTL_ENABLE); +VC4_HD_MAI_CTL_WHOLSMP | +VC4_HD_MAI_CTL_CHALIGN | +VC4_HD_MAI_CTL_ENABLE); break; case SNDRV_PCM_TRIGGER_STOP: HDMI_WRITE(HDMI_MAI_CTL, -- 2.30.2
[PATCH AUTOSEL 5.14 003/252] drm/vmwgfx: Fix some static checker warnings
From: Zack Rusin [ Upstream commit 74231041d14030f1ae6582b9233bfe782ac23e33 ] Fix some minor issues that Coverity spotted in the code. None of that are serious but they're all valid concerns so fixing them makes sense. Signed-off-by: Zack Rusin Reviewed-by: Roland Scheidegger Reviewed-by: Martin Krastev Link: https://patchwork.freedesktop.org/patch/msgid/20210609172307.131929-5-za...@vmware.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/vmwgfx/ttm_memory.c| 2 ++ drivers/gpu/drm/vmwgfx/vmwgfx_binding.c| 20 drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf_res.c | 4 +++- drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c| 2 ++ drivers/gpu/drm/vmwgfx/vmwgfx_mob.c| 4 +++- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c| 6 -- drivers/gpu/drm/vmwgfx/vmwgfx_resource.c | 8 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_so.c | 3 ++- drivers/gpu/drm/vmwgfx/vmwgfx_validation.c | 4 ++-- 10 files changed, 33 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/ttm_memory.c b/drivers/gpu/drm/vmwgfx/ttm_memory.c index aeb0a22a2c34..edd17c30d5a5 100644 --- a/drivers/gpu/drm/vmwgfx/ttm_memory.c +++ b/drivers/gpu/drm/vmwgfx/ttm_memory.c @@ -435,8 +435,10 @@ int ttm_mem_global_init(struct ttm_mem_global *glob, struct device *dev) si_meminfo(&si); + spin_lock(&glob->lock); /* set it as 0 by default to keep original behavior of OOM */ glob->lower_mem_limit = 0; + spin_unlock(&glob->lock); ret = ttm_mem_init_kernel_zone(glob, &si); if (unlikely(ret != 0)) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c b/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c index 05b324825900..ea6d8c86985f 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_binding.c @@ -715,7 +715,7 @@ static int vmw_binding_scrub_cb(struct vmw_ctx_bindinfo *bi, bool rebind) * without checking which bindings actually need to be emitted * * @cbs: Pointer to the context's struct vmw_ctx_binding_state - * @bi: Pointer to where the binding info array is stored in @cbs + * @biv: Pointer to where the binding info array is stored in @cbs * @max_num: Maximum number of entries in the @bi array. * * Scans the @bi array for bindings and builds a buffer of view id data. @@ -725,11 +725,9 @@ static int vmw_binding_scrub_cb(struct vmw_ctx_bindinfo *bi, bool rebind) * contains the command data. */ static void vmw_collect_view_ids(struct vmw_ctx_binding_state *cbs, -const struct vmw_ctx_bindinfo *bi, +const struct vmw_ctx_bindinfo_view *biv, u32 max_num) { - const struct vmw_ctx_bindinfo_view *biv = - container_of(bi, struct vmw_ctx_bindinfo_view, bi); unsigned long i; cbs->bind_cmd_count = 0; @@ -838,7 +836,7 @@ static int vmw_emit_set_sr(struct vmw_ctx_binding_state *cbs, */ static int vmw_emit_set_rt(struct vmw_ctx_binding_state *cbs) { - const struct vmw_ctx_bindinfo *loc = &cbs->render_targets[0].bi; + const struct vmw_ctx_bindinfo_view *loc = &cbs->render_targets[0]; struct { SVGA3dCmdHeader header; SVGA3dCmdDXSetRenderTargets body; @@ -874,7 +872,7 @@ static int vmw_emit_set_rt(struct vmw_ctx_binding_state *cbs) * without checking which bindings actually need to be emitted * * @cbs: Pointer to the context's struct vmw_ctx_binding_state - * @bi: Pointer to where the binding info array is stored in @cbs + * @biso: Pointer to where the binding info array is stored in @cbs * @max_num: Maximum number of entries in the @bi array. * * Scans the @bi array for bindings and builds a buffer of SVGA3dSoTarget data. 
@@ -884,11 +882,9 @@ static int vmw_emit_set_rt(struct vmw_ctx_binding_state *cbs) * contains the command data. */ static void vmw_collect_so_targets(struct vmw_ctx_binding_state *cbs, - const struct vmw_ctx_bindinfo *bi, + const struct vmw_ctx_bindinfo_so_target *biso, u32 max_num) { - const struct vmw_ctx_bindinfo_so_target *biso = - container_of(bi, struct vmw_ctx_bindinfo_so_target, bi); unsigned long i; SVGA3dSoTarget *so_buffer = (SVGA3dSoTarget *) cbs->bind_cmd_buffer; @@ -919,7 +915,7 @@ static void vmw_collect_so_targets(struct vmw_ctx_binding_state *cbs, */ static int vmw_emit_set_so_target(struct vmw_ctx_binding_state *cbs) { - const struct vmw_ctx_bindinfo *loc = &cbs->so_targets[0].bi; + const struct vmw_ctx_bindinfo_so_target *loc = &cbs->so_targets[0]; struct { SVGA3dCmdHeader header; SVGA3dCmdDXSetSOTargets body; @@ -1066,7 +1062,7 @@ static int vmw_emit_set_vb(struct vmw_ctx_binding_state *cbs) static int vmw_emit_set_uav(
[PATCH AUTOSEL 5.14 002/252] drm/vmwgfx: Fix subresource updates with new contexts
From: Zack Rusin [ Upstream commit a12be0277316ed923411c9c80b2899ee74d2b033 ] The has_dx variable was only set during the initialization which meant that UPDATE_SUBRESOURCE was never used. We were emulating it with UPDATE_GB_IMAGE but that's always been a stop-gap. Instead of has_dx which has been deprecated a long time ago we need to check for whether shader model 4.0 or newer is available to the device. Signed-off-by: Zack Rusin Reviewed-by: Roland Scheidegger Reviewed-by: Martin Krastev Link: https://patchwork.freedesktop.org/patch/msgid/20210609172307.131929-4-za...@vmware.com Signed-off-by: Sasha Levin --- drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c index 0835468bb2ee..47c03a276515 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c @@ -1872,7 +1872,6 @@ static void vmw_surface_dirty_range_add(struct vmw_resource *res, size_t start, static int vmw_surface_dirty_sync(struct vmw_resource *res) { struct vmw_private *dev_priv = res->dev_priv; - bool has_dx = 0; u32 i, num_dirty; struct vmw_surface_dirty *dirty = (struct vmw_surface_dirty *) res->dirty; @@ -1899,7 +1898,7 @@ static int vmw_surface_dirty_sync(struct vmw_resource *res) if (!num_dirty) goto out; - alloc_size = num_dirty * ((has_dx) ? sizeof(*cmd1) : sizeof(*cmd2)); + alloc_size = num_dirty * ((has_sm4_context(dev_priv)) ? sizeof(*cmd1) : sizeof(*cmd2)); cmd = VMW_CMD_RESERVE(dev_priv, alloc_size); if (!cmd) return -ENOMEM; @@ -1917,7 +1916,7 @@ static int vmw_surface_dirty_sync(struct vmw_resource *res) * DX_UPDATE_SUBRESOURCE is aware of array surfaces. * UPDATE_GB_IMAGE is not. */ - if (has_dx) { + if (has_sm4_context(dev_priv)) { cmd1->header.id = SVGA_3D_CMD_DX_UPDATE_SUBRESOURCE; cmd1->header.size = sizeof(cmd1->body); cmd1->body.sid = res->id; -- 2.30.2
[PATCH AUTOSEL 5.14 001/252] drm/bridge: ti-sn65dsi86: Don't read EDID blob over DDC
From: Douglas Anderson [ Upstream commit a70e558c151043ce46a5e5999f4310e0b3551f57 ] This is really just a revert of commit 58074b08c04a ("drm/bridge: ti-sn65dsi86: Read EDID blob over DDC"), resolving conflicts. The old code failed to read the EDID properly in a very important case: before the bridge's pre_enable() was called. The way things need to work: 1. Read the EDID. 2. Based on the EDID, decide on video settings and pixel clock. 3. Enable the bridge w/ the desired settings. The way things were working: 1. Try to read the EDID but fail; fall back to hardcoded values. 2. Based on hardcoded values, decide on video settings and pixel clock. 3. Enable the bridge w/ the desired settings. 4. Try again to read the EDID, it works now! 5. Realize that the hardcoded settings weren't quite right. 6. Disable / reenable the bridge w/ the right settings. The reasons for the failures were twofold: a) Since we never ran the bridge chip's pre-enable then we never set the bit to ignore HPD. This meant the bridge chip didn't even _try_ to go out on the bus and communicate with the panel. b) Even if we fixed things to ignore HPD, the EDID still wouldn't read if the panel wasn't on. Instead of reverting the code, we could fix it to set the HPD bit and also power on the panel. However, it also works nicely to just let the panel code read the EDID. Now that we've split the driver up we can expose the DDC AUX channel bus to the panel node. The panel can take charge of reading the EDID. NOTE: in order for things to work, anyone that needs to read the EDID will need to instantiate their panel using the new DP AUX bus (AKA by listing their panel under the "aux-bus" node of the bridge chip in the device tree). In the future if we want to use the bridge chip to provide a full external DP port (which won't have a panel) then we will have to conditionally add EDID reading back in. Suggested-by: Andrzej Hajda Signed-off-by: Douglas Anderson Reviewed-by: Bjorn Andersson Link: https://patchwork.freedesktop.org/patch/msgid/20210611101711.v10.9.I9330684c25f65bb318eff57f0616500f83eac3cc@changeid Signed-off-by: Sasha Levin --- drivers/gpu/drm/bridge/ti-sn65dsi86.c | 22 -- 1 file changed, 22 deletions(-) diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi86.c b/drivers/gpu/drm/bridge/ti-sn65dsi86.c index 45a2969afb2b..aef850296756 100644 --- a/drivers/gpu/drm/bridge/ti-sn65dsi86.c +++ b/drivers/gpu/drm/bridge/ti-sn65dsi86.c @@ -124,7 +124,6 @@ * @connector:Our connector. * @host_node:Remote DSI node. * @dsi: Our MIPI DSI source. - * @edid: Detected EDID of eDP panel. * @refclk: Our reference clock. * @panel:Our panel. * @enable_gpio: The GPIO we toggle to enable the bridge.
@@ -154,7 +153,6 @@ struct ti_sn65dsi86 { struct drm_dp_aux aux; struct drm_bridge bridge; struct drm_connectorconnector; - struct edid *edid; struct device_node *host_node; struct mipi_dsi_device *dsi; struct clk *refclk; @@ -403,24 +401,6 @@ connector_to_ti_sn65dsi86(struct drm_connector *connector) static int ti_sn_bridge_connector_get_modes(struct drm_connector *connector) { struct ti_sn65dsi86 *pdata = connector_to_ti_sn65dsi86(connector); - struct edid *edid = pdata->edid; - int num, ret; - - if (!edid) { - pm_runtime_get_sync(pdata->dev); - edid = pdata->edid = drm_get_edid(connector, &pdata->aux.ddc); - pm_runtime_put_autosuspend(pdata->dev); - } - - if (edid && drm_edid_is_valid(edid)) { - ret = drm_connector_update_edid_property(connector, edid); - if (!ret) { - num = drm_add_edid_modes(connector, edid); - if (num) - return num; - } - } - return drm_panel_get_modes(pdata->panel, connector); } @@ -1358,8 +1338,6 @@ static void ti_sn_bridge_remove(struct auxiliary_device *adev) mipi_dsi_device_unregister(pdata->dsi); } - kfree(pdata->edid); - drm_bridge_remove(&pdata->bridge); of_node_put(pdata->host_node); -- 2.30.2
[PATCH v1 12/12] drm/virtio: implement context init: advertise feature to userspace
This advertises the context init feature to userspace, along with a mask of supported capabilities. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index fdaa7f3d9eeb..5618a1d5879c 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -286,6 +286,12 @@ static int virtio_gpu_getparam_ioctl(struct drm_device *dev, void *data, case VIRTGPU_PARAM_CROSS_DEVICE: value = vgdev->has_resource_assign_uuid ? 1 : 0; break; + case VIRTGPU_PARAM_CONTEXT_INIT: + value = vgdev->has_context_init ? 1 : 0; + break; + case VIRTGPU_PARAM_SUPPORTED_CAPSET_IDs: + value = vgdev->capset_id_mask; + break; default: return -EINVAL; } -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 11/12] drm/virtio: implement context init: add virtio_gpu_fence_event
Similar to DRM_VMW_EVENT_FENCE_SIGNALED. Sends a pollable event to the DRM file descriptor when a fence on a specific ring is signaled. One difference is the event is not exposed via the UAPI -- this is because host responses are on a shared memory buffer of type BLOB_MEM_GUEST [this is the common way to receive responses with virtgpu]. As such, there is no context specific read(..) implementation either -- just a poll(..) implementation. Signed-off-by: Gurchetan Singh Acked-by: Nicholas Verne --- drivers/gpu/drm/virtio/virtgpu_drv.c | 43 +- drivers/gpu/drm/virtio/virtgpu_drv.h | 7 + drivers/gpu/drm/virtio/virtgpu_fence.c | 10 ++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 4 files changed, 93 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c index 9d963f1fda8f..749db18dcfa2 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.c +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c @@ -29,6 +29,8 @@ #include #include #include +#include +#include #include #include @@ -155,6 +157,35 @@ static void virtio_gpu_config_changed(struct virtio_device *vdev) schedule_work(&vgdev->config_changed_work); } +static __poll_t virtio_gpu_poll(struct file *filp, + struct poll_table_struct *wait) +{ + struct drm_file *drm_file = filp->private_data; + struct virtio_gpu_fpriv *vfpriv = drm_file->driver_priv; + struct drm_device *dev = drm_file->minor->dev; + struct drm_pending_event *e = NULL; + __poll_t mask = 0; + + if (!vfpriv->ring_idx_mask) + return drm_poll(filp, wait); + + poll_wait(filp, &drm_file->event_wait, wait); + + if (!list_empty(&drm_file->event_list)) { + spin_lock_irq(&dev->event_lock); + e = list_first_entry(&drm_file->event_list, +struct drm_pending_event, link); + drm_file->event_space += e->event->length; + list_del(&e->link); + spin_unlock_irq(&dev->event_lock); + + kfree(e); + mask |= EPOLLIN | EPOLLRDNORM; + } + + return mask; +} + static struct virtio_device_id id_table[] = { { VIRTIO_ID_GPU, VIRTIO_DEV_ANY_ID }, { 0 }, @@ -194,7 +225,17 @@ MODULE_AUTHOR("Dave Airlie "); MODULE_AUTHOR("Gerd Hoffmann "); MODULE_AUTHOR("Alon Levy"); -DEFINE_DRM_GEM_FOPS(virtio_gpu_driver_fops); +static const struct file_operations virtio_gpu_driver_fops = { + .owner = THIS_MODULE, + .open = drm_open, + .release= drm_release, + .unlocked_ioctl = drm_ioctl, + .compat_ioctl = drm_compat_ioctl, + .poll = virtio_gpu_poll, + .read = drm_read, + .llseek = noop_llseek, + .mmap = drm_gem_mmap +}; static const struct drm_driver driver = { .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC, diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index cb60d52c2bd1..e0265fe74aa5 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -138,11 +138,18 @@ struct virtio_gpu_fence_driver { spinlock_t lock; }; +#define VIRTGPU_EVENT_FENCE_SIGNALED_INTERNAL 0x1000 +struct virtio_gpu_fence_event { + struct drm_pending_event base; + struct drm_event event; +}; + struct virtio_gpu_fence { struct dma_fence f; uint32_t ring_idx; uint64_t fence_id; bool emit_fence_info; + struct virtio_gpu_fence_event *e; struct virtio_gpu_fence_driver *drv; struct list_head node; }; diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index 98a00c1e654d..f28357dbde35 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -152,11 +152,21 @@ void virtio_gpu_fence_event_process(struct virtio_gpu_device *vgdev, 
continue; dma_fence_signal_locked(&curr->f); + if (curr->e) { + drm_send_event(vgdev->ddev, &curr->e->base); + curr->e = NULL; + } + list_del(&curr->node); dma_fence_put(&curr->f); } dma_fence_signal_locked(&signaled->f); + if (signaled->e) { + drm_send_event(vgdev->ddev, &signaled->e->base); + signaled->e = NULL; + } + list_del(&signaled->node); dma_fence_put(&signaled->f); break; diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index be7b22a03884..fdaa
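To show what this buys userspace, a minimal hedged sketch (not from the series): once a poll mask has been set on the context (the POLL_RINGS_MASK patch that follows), a plain poll() on the DRM fd blocks until a fence on one of those rings signals. The kernel consumes the pending event inside its poll() implementation, so there is nothing to read() afterwards.

#include <poll.h>

/* Returns 1 if a fence event arrived, 0 on timeout, -1 on error. */
static int wait_for_ring_event(int drm_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = drm_fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret <= 0)
		return ret;

	return (pfd.revents & POLLIN) ? 1 : 0;
}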
[PATCH v1 10/12] drm/virtio: implement context init: handle VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK
For the Sommelier guest Wayland proxy, it's desirable for the DRM fd to be pollable in response to an host compositor event. This can also be used by the 3D driver to poll events on a CPU timeline. This enables the DRM fd associated with a particular 3D context to be polled independent of KMS events. The parameter VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK specifies the pollable rings. Signed-off-by: Gurchetan Singh Acked-by: Nicholas Verne --- drivers/gpu/drm/virtio/virtgpu_drv.h | 1 + drivers/gpu/drm/virtio/virtgpu_ioctl.c | 22 +- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index cca9ab505deb..cb60d52c2bd1 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -266,6 +266,7 @@ struct virtio_gpu_fpriv { bool context_created; uint32_t num_rings; uint64_t base_fence_ctx; + uint64_t ring_idx_mask; struct mutex context_lock; }; diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index 262f79210283..be7b22a03884 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -694,6 +694,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, { int ret = 0; uint32_t num_params, i, param, value; + uint64_t valid_ring_mask; size_t len; struct drm_virtgpu_context_set_param *ctx_set_params = NULL; struct virtio_gpu_device *vgdev = dev->dev_private; @@ -707,7 +708,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, return -EINVAL; /* Number of unique parameters supported at this time. */ - if (num_params > 2) + if (num_params > 3) return -EINVAL; ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), @@ -761,12 +762,31 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, vfpriv->base_fence_ctx = dma_fence_context_alloc(value); vfpriv->num_rings = value; break; + case VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK: + if (vfpriv->ring_idx_mask) { + ret = -EINVAL; + goto out_unlock; + } + + vfpriv->ring_idx_mask = value; + break; default: ret = -EINVAL; goto out_unlock; } } + if (vfpriv->ring_idx_mask) { + valid_ring_mask = 0; + for (i = 0; i < vfpriv->num_rings; i++) + valid_ring_mask |= 1 << i; + + if (~valid_ring_mask & vfpriv->ring_idx_mask) { + ret = -EINVAL; + goto out_unlock; + } + } + virtio_gpu_create_context_locked(vgdev, vfpriv); virtio_gpu_notify(vgdev); -- 2.33.0.153.gba50c8fa24-goog
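For illustration only (the function name and ring counts are made up; the uapi structs and defines come from patch 02 of this series), a sketch of how a proxy like Sommelier might opt in at context creation time:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/virtgpu_drm.h>

/* Create a context with two rings and make ring 1 pollable. Any mask bit at
 * or above NUM_RINGS makes the ioctl fail with -EINVAL, per the check above.
 */
static int init_pollable_context(int drm_fd)
{
	struct drm_virtgpu_context_set_param params[] = {
		{ .param = VIRTGPU_CONTEXT_PARAM_NUM_RINGS,       .value = 2 },
		{ .param = VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK, .value = 1 << 1 },
	};
	struct drm_virtgpu_context_init init = {
		.num_params = 2,
		.ctx_set_params = (uint64_t)(uintptr_t)params,
	};

	return ioctl(drm_fd, DRM_IOCTL_VIRTGPU_CONTEXT_INIT, &init);
}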
[PATCH v1 09/12] drm/virtio: implement context init: allocate an array of fence contexts
We don't want fences from different 3D contexts (virgl, gfxstream, venus) to be on the same timeline. With explicit context creation, we can specify the number of rings each context wants. Execbuffer can specify which ring to use. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_ioctl.c | 34 -- 2 files changed, 35 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index a5142d60c2fa..cca9ab505deb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -56,6 +56,7 @@ #define STATE_ERR 2 #define MAX_CAPSET_ID 63 +#define MAX_RINGS 64 struct virtio_gpu_object_params { unsigned long size; @@ -263,6 +264,8 @@ struct virtio_gpu_fpriv { uint32_t ctx_id; uint32_t context_init; bool context_created; + uint32_t num_rings; + uint64_t base_fence_ctx; struct mutex context_lock; }; diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index f51f3393a194..262f79210283 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -99,6 +99,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, int in_fence_fd = exbuf->fence_fd; int out_fence_fd = -1; void *buf; + uint64_t fence_ctx; + uint32_t ring_idx; + + fence_ctx = vgdev->fence_drv.context; + ring_idx = 0; if (vgdev->has_virgl_3d == false) return -ENOSYS; @@ -106,6 +111,17 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) return -EINVAL; + if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) { + if (exbuf->ring_idx >= vfpriv->num_rings) + return -EINVAL; + + if (!vfpriv->base_fence_ctx) + return -EINVAL; + + fence_ctx = vfpriv->base_fence_ctx; + ring_idx = exbuf->ring_idx; + } + exbuf->fence_fd = -1; virtio_gpu_create_context(dev, file); @@ -173,7 +189,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_memdup; } - out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); + out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); if(!out_fence) { ret = -ENOMEM; goto out_unresv; } @@ -691,7 +707,7 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, return -EINVAL; /* Number of unique parameters supported at this time. */ - if (num_params > 1) + if (num_params > 2) return -EINVAL; ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), @@ -731,6 +747,20 @@ static int virtio_gpu_context_init_ioctl(struct drm_device *dev, vfpriv->context_init |= value; break; + case VIRTGPU_CONTEXT_PARAM_NUM_RINGS: + if (vfpriv->base_fence_ctx) { + ret = -EINVAL; + goto out_unlock; + } + + if (value > MAX_RINGS) { + ret = -EINVAL; + goto out_unlock; + } + + vfpriv->base_fence_ctx = dma_fence_context_alloc(value); + vfpriv->num_rings = value; + break; default: ret = -EINVAL; goto out_unlock; -- 2.33.0.153.gba50c8fa24-goog
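A hedged userspace sketch (the function name and buffer handling are made up; the flag and field come from the uapi patch in this series) of submitting on a specific ring once the context was created with VIRTGPU_CONTEXT_PARAM_NUM_RINGS:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/virtgpu_drm.h>

/* Submit a protocol-specific command stream on ring 'ring_idx'. The context
 * must have been created with NUM_RINGS > ring_idx, otherwise the kernel
 * returns -EINVAL.
 */
static int submit_on_ring(int drm_fd, void *cmd, uint32_t cmd_size,
			  uint32_t ring_idx)
{
	struct drm_virtgpu_execbuffer exbuf = {
		.flags = VIRTGPU_EXECBUF_RING_IDX,
		.size = cmd_size,
		.command = (uint64_t)(uintptr_t)cmd,
		.ring_idx = ring_idx,
		.fence_fd = -1,
	};

	return ioctl(drm_fd, DRM_IOCTL_VIRTGPU_EXECBUFFER, &exbuf);
}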
[PATCH v1 05/12] drm/virtio: implement context init: support init ioctl
From: Anthoine Bourgeois This implements the context initialization ioctl. A list of params is passed in by userspace, and kernel driver validates them. The only currently supported param is VIRTGPU_CONTEXT_PARAM_CAPSET_ID. If the context has already been initialized, -EEXIST is returned. This happens after Linux userspace does dumb_create + followed by opening the Mesa virgl driver with the same virtgpu instance. However, for most applications, 3D contexts will be explicitly initialized when the feature is available. Signed-off-by: Anthoine Bourgeois Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 6 +- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 96 -- drivers/gpu/drm/virtio/virtgpu_vq.c| 4 +- 3 files changed, 98 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 5e1958a522ff..9996abf60e3a 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -259,12 +259,13 @@ struct virtio_gpu_device { struct virtio_gpu_fpriv { uint32_t ctx_id; + uint32_t context_init; bool context_created; struct mutex context_lock; }; /* virtgpu_ioctl.c */ -#define DRM_VIRTIO_NUM_IOCTLS 11 +#define DRM_VIRTIO_NUM_IOCTLS 12 extern struct drm_ioctl_desc virtio_gpu_ioctls[DRM_VIRTIO_NUM_IOCTLS]; void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file); @@ -342,7 +343,8 @@ int virtio_gpu_cmd_get_capset(struct virtio_gpu_device *vgdev, struct virtio_gpu_drv_cap_cache **cache_p); int virtio_gpu_cmd_get_edids(struct virtio_gpu_device *vgdev); void virtio_gpu_cmd_context_create(struct virtio_gpu_device *vgdev, uint32_t id, - uint32_t nlen, const char *name); + uint32_t context_init, uint32_t nlen, + const char *name); void virtio_gpu_cmd_context_destroy(struct virtio_gpu_device *vgdev, uint32_t id); void virtio_gpu_cmd_context_attach_resource(struct virtio_gpu_device *vgdev, diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index 5c1ad1596889..f5281d1e30e1 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -38,20 +38,30 @@ VIRTGPU_BLOB_FLAG_USE_SHAREABLE | \ VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE) +/* Must be called with &virtio_gpu_fpriv.struct_mutex held. 
*/ +static void virtio_gpu_create_context_locked(struct virtio_gpu_device *vgdev, +struct virtio_gpu_fpriv *vfpriv) +{ + char dbgname[TASK_COMM_LEN]; + + get_task_comm(dbgname, current); + virtio_gpu_cmd_context_create(vgdev, vfpriv->ctx_id, + vfpriv->context_init, strlen(dbgname), + dbgname); + + vfpriv->context_created = true; +} + void virtio_gpu_create_context(struct drm_device *dev, struct drm_file *file) { struct virtio_gpu_device *vgdev = dev->dev_private; struct virtio_gpu_fpriv *vfpriv = file->driver_priv; - char dbgname[TASK_COMM_LEN]; mutex_lock(&vfpriv->context_lock); if (vfpriv->context_created) goto out_unlock; - get_task_comm(dbgname, current); - virtio_gpu_cmd_context_create(vgdev, vfpriv->ctx_id, - strlen(dbgname), dbgname); - vfpriv->context_created = true; + virtio_gpu_create_context_locked(vgdev, vfpriv); out_unlock: mutex_unlock(&vfpriv->context_lock); @@ -662,6 +672,79 @@ static int virtio_gpu_resource_create_blob_ioctl(struct drm_device *dev, return 0; } +static int virtio_gpu_context_init_ioctl(struct drm_device *dev, +void *data, struct drm_file *file) +{ + int ret = 0; + uint32_t num_params, i, param, value; + size_t len; + struct drm_virtgpu_context_set_param *ctx_set_params = NULL; + struct virtio_gpu_device *vgdev = dev->dev_private; + struct virtio_gpu_fpriv *vfpriv = file->driver_priv; + struct drm_virtgpu_context_init *args = data; + + num_params = args->num_params; + len = num_params * sizeof(struct drm_virtgpu_context_set_param); + + if (!vgdev->has_context_init || !vgdev->has_virgl_3d) + return -EINVAL; + + /* Number of unique parameters supported at this time. */ + if (num_params > 1) + return -EINVAL; + + ctx_set_params = memdup_user(u64_to_user_ptr(args->ctx_set_params), +len); + + if (IS_ERR(ctx_set_params)) + return PTR_ERR(ctx_set_params); + + mutex_lock(&vfpriv->context_lock); + if (vfpriv->context
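A minimal hedged sketch of the new ioctl from the guest side (not from the series; the capset id value is whatever the host advertises, see the capset mask patches elsewhere in this set):

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/virtgpu_drm.h>

/* Pick the context type before the first execbuffer; calling this a second
 * time on the same DRM fd returns -EEXIST.
 */
static int init_context_for_capset(int drm_fd, uint64_t capset_id)
{
	struct drm_virtgpu_context_set_param param = {
		.param = VIRTGPU_CONTEXT_PARAM_CAPSET_ID,
		.value = capset_id,
	};
	struct drm_virtgpu_context_init init = {
		.num_params = 1,
		.ctx_set_params = (uint64_t)(uintptr_t)&param,
	};

	return ioctl(drm_fd, DRM_IOCTL_VIRTGPU_CONTEXT_INIT, &init);
}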
[PATCH v1 08/12] drm/virtio: implement context init: stop using drv->context when creating fence
The plumbing is all here to do this. Since we always use the default fence context when allocating a fence, this makes no functional difference. We can't process just the largest fence id anymore, since it's associated with different timelines. It's fine for fence_id 260 to signal before 259. As such, process each fence_id individually. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_fence.c | 16 ++-- drivers/gpu/drm/virtio/virtgpu_vq.c| 15 +++ 2 files changed, 17 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index 24c728b65d21..98a00c1e654d 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -75,20 +75,25 @@ struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, uint64_t base_fence_ctx, uint32_t ring_idx) { + uint64_t fence_context = base_fence_ctx + ring_idx; struct virtio_gpu_fence_driver *drv = &vgdev->fence_drv; struct virtio_gpu_fence *fence = kzalloc(sizeof(struct virtio_gpu_fence), GFP_KERNEL); + if (!fence) return fence; fence->drv = drv; + fence->ring_idx = ring_idx; + fence->emit_fence_info = !(base_fence_ctx == drv->context); /* This only partially initializes the fence because the seqno is * unknown yet. The fence must not be used outside of the driver * until virtio_gpu_fence_emit is called. */ - dma_fence_init(&fence->f, &virtio_gpu_fence_ops, &drv->lock, drv->context, - 0); + + dma_fence_init(&fence->f, &virtio_gpu_fence_ops, &drv->lock, + fence_context, 0); return fence; } @@ -110,6 +115,13 @@ void virtio_gpu_fence_emit(struct virtio_gpu_device *vgdev, cmd_hdr->flags |= cpu_to_le32(VIRTIO_GPU_FLAG_FENCE); cmd_hdr->fence_id = cpu_to_le64(fence->fence_id); + + /* Only currently defined fence param. */ + if (fence->emit_fence_info) { + cmd_hdr->flags |= + cpu_to_le32(VIRTIO_GPU_FLAG_INFO_RING_IDX); + cmd_hdr->ring_idx = (u8)fence->ring_idx; + } } void virtio_gpu_fence_event_process(struct virtio_gpu_device *vgdev, diff --git a/drivers/gpu/drm/virtio/virtgpu_vq.c b/drivers/gpu/drm/virtio/virtgpu_vq.c index 496f8ce4cd41..938331554632 100644 --- a/drivers/gpu/drm/virtio/virtgpu_vq.c +++ b/drivers/gpu/drm/virtio/virtgpu_vq.c @@ -205,7 +205,7 @@ void virtio_gpu_dequeue_ctrl_func(struct work_struct *work) struct list_head reclaim_list; struct virtio_gpu_vbuffer *entry, *tmp; struct virtio_gpu_ctrl_hdr *resp; - u64 fence_id = 0; + u64 fence_id; INIT_LIST_HEAD(&reclaim_list); spin_lock(&vgdev->ctrlq.qlock); @@ -232,23 +232,14 @@ void virtio_gpu_dequeue_ctrl_func(struct work_struct *work) DRM_DEBUG("response 0x%x\n", le32_to_cpu(resp->type)); } if (resp->flags & cpu_to_le32(VIRTIO_GPU_FLAG_FENCE)) { - u64 f = le64_to_cpu(resp->fence_id); - - if (fence_id > f) { - DRM_ERROR("%s: Oops: fence %llx -> %llx\n", - __func__, fence_id, f); - } else { - fence_id = f; - } + fence_id = le64_to_cpu(resp->fence_id); + virtio_gpu_fence_event_process(vgdev, fence_id); } if (entry->resp_cb) entry->resp_cb(vgdev, entry); } wake_up(&vgdev->ctrlq.ack_queue); - if (fence_id) - virtio_gpu_fence_event_process(vgdev, fence_id); - list_for_each_entry_safe(entry, tmp, &reclaim_list, list) { if (entry->objs) virtio_gpu_array_put_free_delayed(vgdev, entry->objs); -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 07/12] drm/virtio: implement context init: plumb {base_fence_ctx, ring_idx} to virtio_gpu_fence_alloc
These were defined in the previous commit. We'll need these parameters when allocating a dma_fence. The use case for this is multiple synchronizations timelines. The maximum number of timelines per 3D instance will be 32. Usually, only 2 are needed -- one for CPU commands, and another for GPU commands. As such, we'll need to specify these parameters when allocating a dma_fence. vgdev->fence_drv.context is the "default" fence context for 2D mode and old userspace. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 5 +++-- drivers/gpu/drm/virtio/virtgpu_fence.c | 4 +++- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 + drivers/gpu/drm/virtio/virtgpu_plane.c | 3 ++- 4 files changed, 13 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 401aec1a5efb..a5142d60c2fa 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -426,8 +426,9 @@ struct drm_plane *virtio_gpu_plane_init(struct virtio_gpu_device *vgdev, int index); /* virtgpu_fence.c */ -struct virtio_gpu_fence *virtio_gpu_fence_alloc( - struct virtio_gpu_device *vgdev); +struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, + uint64_t base_fence_ctx, + uint32_t ring_idx); void virtio_gpu_fence_emit(struct virtio_gpu_device *vgdev, struct virtio_gpu_ctrl_hdr *cmd_hdr, struct virtio_gpu_fence *fence); diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c b/drivers/gpu/drm/virtio/virtgpu_fence.c index d28e25e8409b..24c728b65d21 100644 --- a/drivers/gpu/drm/virtio/virtgpu_fence.c +++ b/drivers/gpu/drm/virtio/virtgpu_fence.c @@ -71,7 +71,9 @@ static const struct dma_fence_ops virtio_gpu_fence_ops = { .timeline_value_str = virtio_gpu_timeline_value_str, }; -struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev) +struct virtio_gpu_fence *virtio_gpu_fence_alloc(struct virtio_gpu_device *vgdev, + uint64_t base_fence_ctx, + uint32_t ring_idx) { struct virtio_gpu_fence_driver *drv = &vgdev->fence_drv; struct virtio_gpu_fence *fence = kzalloc(sizeof(struct virtio_gpu_fence), diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index f5281d1e30e1..f51f3393a194 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -173,7 +173,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, goto out_memdup; } - out_fence = virtio_gpu_fence_alloc(vgdev); + out_fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); if(!out_fence) { ret = -ENOMEM; goto out_unresv; @@ -288,7 +288,7 @@ static int virtio_gpu_resource_create_ioctl(struct drm_device *dev, void *data, if (params.size == 0) params.size = PAGE_SIZE; - fence = virtio_gpu_fence_alloc(vgdev); + fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); if (!fence) return -ENOMEM; ret = virtio_gpu_object_create(vgdev, ¶ms, &qobj, fence); @@ -367,7 +367,7 @@ static int virtio_gpu_transfer_from_host_ioctl(struct drm_device *dev, if (ret != 0) goto err_put_free; - fence = virtio_gpu_fence_alloc(vgdev); + fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, 0); if (!fence) { ret = -ENOMEM; goto err_unlock; @@ -427,7 +427,8 @@ static int virtio_gpu_transfer_to_host_ioctl(struct drm_device *dev, void *data, goto err_put_free; ret = -ENOMEM; - fence = virtio_gpu_fence_alloc(vgdev); + fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, + 0); if (!fence) goto 
err_unlock; diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c index a49fd9480381..6d3cc9e238a4 100644 --- a/drivers/gpu/drm/virtio/virtgpu_plane.c +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c @@ -256,7 +256,8 @@ static int virtio_gpu_plane_prepare_fb(struct drm_plane *plane, return 0; if (bo->dumb && (plane->state->fb != new_state->fb)) { - vgfb->fence = virtio_gpu_fence_alloc(vgdev); + vgfb->fence = virtio_gpu_fence_alloc(vgdev, vgdev->fence_drv.context, +0); if (!vgfb->fence) return -ENOMEM; } -- 2.33
[PATCH v1 06/12] drm/virtio: implement context init: track {ring_idx, emit_fence_info} in virtio_gpu_fence
Each fence should be associated with a [fence ID, fence_context, seqno]. The seqno is just the fence ID. To get the fence context, we add the ring_idx to the 3D context's base_fence_ctx. The ring_idx is between 0 and 31, inclusive. Each 3D context will have its own base_fence_ctx. The ring_idx will be emitted to host userspace when emit_fence_info is true. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 9996abf60e3a..401aec1a5efb 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -139,7 +139,9 @@ struct virtio_gpu_fence_driver { struct virtio_gpu_fence { struct dma_fence f; + uint32_t ring_idx; uint64_t fence_id; + bool emit_fence_info; struct virtio_gpu_fence_driver *drv; struct list_head node; }; -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 04/12] drm/virtio: implement context init: probe for feature
From: Anthoine Bourgeois Let's probe for VIRTIO_GPU_F_CONTEXT_INIT. Create a new DRM_INFO(..) line since the current one is getting too long. Signed-off-by: Anthoine Bourgeois Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.h | 1 + drivers/gpu/drm/virtio/virtgpu_kms.c | 8 +++- 4 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_debugfs.c b/drivers/gpu/drm/virtio/virtgpu_debugfs.c index c2b20e0ee030..b6954e2f75e6 100644 --- a/drivers/gpu/drm/virtio/virtgpu_debugfs.c +++ b/drivers/gpu/drm/virtio/virtgpu_debugfs.c @@ -52,6 +52,7 @@ static int virtio_gpu_features(struct seq_file *m, void *data) vgdev->has_resource_assign_uuid); virtio_gpu_add_bool(m, "blob resources", vgdev->has_resource_blob); + virtio_gpu_add_bool(m, "context init", vgdev->has_context_init); virtio_gpu_add_int(m, "cap sets", vgdev->num_capsets); virtio_gpu_add_int(m, "scanouts", vgdev->num_scanouts); if (vgdev->host_visible_region.len) { diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c index ed85a7863256..9d963f1fda8f 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.c +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c @@ -172,6 +172,7 @@ static unsigned int features[] = { VIRTIO_GPU_F_EDID, VIRTIO_GPU_F_RESOURCE_UUID, VIRTIO_GPU_F_RESOURCE_BLOB, + VIRTIO_GPU_F_CONTEXT_INIT, }; static struct virtio_driver virtio_gpu_driver = { .feature_table = features, diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 3023e16be0d6..5e1958a522ff 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -236,6 +236,7 @@ struct virtio_gpu_device { bool has_resource_assign_uuid; bool has_resource_blob; bool has_host_visible; + bool has_context_init; struct virtio_shm_region host_visible_region; struct drm_mm host_visible_mm; diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c index 58a65121c200..21f410901694 100644 --- a/drivers/gpu/drm/virtio/virtgpu_kms.c +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c @@ -191,13 +191,19 @@ int virtio_gpu_init(struct drm_device *dev) (unsigned long)vgdev->host_visible_region.addr, (unsigned long)vgdev->host_visible_region.len); } + if (virtio_has_feature(vgdev->vdev, VIRTIO_GPU_F_CONTEXT_INIT)) { + vgdev->has_context_init = true; + } - DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible\n", + DRM_INFO("features: %cvirgl %cedid %cresource_blob %chost_visible", vgdev->has_virgl_3d? '+' : '-', vgdev->has_edid? '+' : '-', vgdev->has_resource_blob ? '+' : '-', vgdev->has_host_visible ? '+' : '-'); + DRM_INFO("features: %ccontext_init\n", +vgdev->has_context_init ? '+' : '-'); + ret = virtio_find_vqs(vgdev->vdev, 2, vqs, callbacks, names, NULL); if (ret) { DRM_ERROR("failed to find virt queues\n"); -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 03/12] drm/virtio: implement context init: track valid capabilities in a mask
The valid capability IDs are between 1 and 63, and are defined in the virtio-gpu spec. This is used for error checking in the subsequent patches. We're currently only using 2 capability IDs, so this should be plenty for the immediate future. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- drivers/gpu/drm/virtio/virtgpu_drv.h | 3 +++ drivers/gpu/drm/virtio/virtgpu_kms.c | 18 +- 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index 0c4810982530..3023e16be0d6 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -55,6 +55,8 @@ #define STATE_OK 1 #define STATE_ERR 2 +#define MAX_CAPSET_ID 63 + struct virtio_gpu_object_params { unsigned long size; bool dumb; @@ -245,6 +247,7 @@ struct virtio_gpu_device { struct virtio_gpu_drv_capset *capsets; uint32_t num_capsets; + uint64_t capset_id_mask; struct list_head cap_cache; /* protects uuid state when exporting */ diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c index f3379059f324..58a65121c200 100644 --- a/drivers/gpu/drm/virtio/virtgpu_kms.c +++ b/drivers/gpu/drm/virtio/virtgpu_kms.c @@ -65,6 +65,7 @@ static void virtio_gpu_get_capsets(struct virtio_gpu_device *vgdev, int num_capsets) { int i, ret; + bool invalid_capset_id = false; vgdev->capsets = kcalloc(num_capsets, sizeof(struct virtio_gpu_drv_capset), @@ -78,19 +79,34 @@ static void virtio_gpu_get_capsets(struct virtio_gpu_device *vgdev, virtio_gpu_notify(vgdev); ret = wait_event_timeout(vgdev->resp_wq, vgdev->capsets[i].id > 0, 5 * HZ); - if (ret == 0) { + /* +* Capability ids are defined in the virtio-gpu spec and are +* between 1 to 63, inclusive. +*/ + if (!vgdev->capsets[i].id || + vgdev->capsets[i].id > MAX_CAPSET_ID) + invalid_capset_id = true; + + if (ret == 0) DRM_ERROR("timed out waiting for cap set %d\n", i); + else if (invalid_capset_id) + DRM_ERROR("invalid capset id %u", vgdev->capsets[i].id); + + if (ret == 0 || invalid_capset_id) { spin_lock(&vgdev->display_info_lock); kfree(vgdev->capsets); vgdev->capsets = NULL; spin_unlock(&vgdev->display_info_lock); return; } + + vgdev->capset_id_mask |= 1 << vgdev->capsets[i].id; DRM_INFO("cap set %d: id %d, max-version %d, max-size %d\n", i, vgdev->capsets[i].id, vgdev->capsets[i].max_version, vgdev->capsets[i].max_size); } + vgdev->num_capsets = num_capsets; } -- 2.33.0.153.gba50c8fa24-goog
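Sketch of the intended mask semantics (illustrative helper, not kernel code): bit N of capset_id_mask set means the host reported capability set id N.

#include <stdbool.h>
#include <stdint.h>

static bool capset_id_supported(uint64_t capset_id_mask, uint32_t id)
{
	return id >= 1 && id <= 63 && (capset_id_mask & (1ull << id));
}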
[PATCH v1 02/12] drm/virtgpu api: create context init feature
This change allows creating contexts depending on a set of context parameters. The meaning of each of the parameters is listed below: 1) VIRTGPU_CONTEXT_PARAM_CAPSET_ID This determines the type of a context based on the capability set ID. For example, the current capsets: VIRTIO_GPU_CAPSET_VIRGL VIRTIO_GPU_CAPSET_VIRGL2 define a Gallium, TGSI based "virgl" context. We only need 1 capset ID per context type, though virgl has two due to a bug that has since been fixed. The use case is the "gfxstream" rendering library and "venus" renderer. gfxstream doesn't do Gallium/TGSI translation and mostly relies on auto-generated API streaming. Certain users prefer gfxstream over virgl for GLES on GLES emulation. {gfxstream vk}/{venus} are also required for Vulkan emulation. The maximum capset ID is 63. The goal is for guest userspace to choose the optimal context type depending on the situation/hardware. 2) VIRTGPU_CONTEXT_PARAM_NUM_RINGS This tells the number of independent command rings that the context will use. This value may be zero and is inferred to be zero if VIRTGPU_CONTEXT_PARAM_NUM_RINGS is not passed in. This is for backwards compatibility for virgl, which has one big giant command ring for all commands. The maximum number of rings is 64. In practice, multi-queue or multi-ring submission is used for powerful dGPUs and virtio-gpu may not be the best option in that case (see PCI passthrough or rendernode forwarding). 3) VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK This is a mask of ring indices for which the DRM fd is pollable. For example, if VIRTGPU_CONTEXT_PARAM_NUM_RINGS is 2, then the mask may be:

[ring idx] | [1 << ring_idx] | final mask
-----------------------------------------
     0     |        1        |     1
     1     |        2        |     3

The "Sommelier" guest Wayland proxy uses this to poll for events from the host compositor. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang Acked-by: Nicholas Verne --- include/uapi/drm/virtgpu_drm.h | 27 +++ 1 file changed, 27 insertions(+) diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h index b9ec26e9c646..a13e20cc66b4 100644 --- a/include/uapi/drm/virtgpu_drm.h +++ b/include/uapi/drm/virtgpu_drm.h @@ -47,12 +47,15 @@ extern "C" { #define DRM_VIRTGPU_WAIT 0x08 #define DRM_VIRTGPU_GET_CAPS 0x09 #define DRM_VIRTGPU_RESOURCE_CREATE_BLOB 0x0a +#define DRM_VIRTGPU_CONTEXT_INIT 0x0b #define VIRTGPU_EXECBUF_FENCE_FD_IN0x01 #define VIRTGPU_EXECBUF_FENCE_FD_OUT 0x02 +#define VIRTGPU_EXECBUF_RING_IDX 0x04 #define VIRTGPU_EXECBUF_FLAGS (\ VIRTGPU_EXECBUF_FENCE_FD_IN |\ VIRTGPU_EXECBUF_FENCE_FD_OUT |\ + VIRTGPU_EXECBUF_RING_IDX |\ 0) struct drm_virtgpu_map { @@ -68,6 +71,8 @@ struct drm_virtgpu_execbuffer { __u64 bo_handles; __u32 num_bo_handles; __s32 fence_fd; /* in/out fence fd (see VIRTGPU_EXECBUF_FENCE_FD_IN/OUT) */ + __u32 ring_idx; /* command ring index (see VIRTGPU_EXECBUF_RING_IDX) */ + __u32 pad; }; #define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */ @@ -75,6 +80,8 @@ struct drm_virtgpu_execbuffer { #define VIRTGPU_PARAM_RESOURCE_BLOB 3 /* DRM_VIRTGPU_RESOURCE_CREATE_BLOB */ #define VIRTGPU_PARAM_HOST_VISIBLE 4 /* Host blob resources are mappable */ #define VIRTGPU_PARAM_CROSS_DEVICE 5 /* Cross virtio-device resource sharing */ +#define VIRTGPU_PARAM_CONTEXT_INIT 6 /* DRM_VIRTGPU_CONTEXT_INIT */ +#define VIRTGPU_PARAM_SUPPORTED_CAPSET_IDs 7 /* Bitmask of supported capability set ids */ struct drm_virtgpu_getparam { __u64 param; __u64 value; }; @@ -173,6 +180,22 @@ struct drm_virtgpu_resource_create_blob { __u64 blob_id; }; +#define VIRTGPU_CONTEXT_PARAM_CAPSET_ID 0x0001 +#define
VIRTGPU_CONTEXT_PARAM_NUM_RINGS 0x0002 +#define VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK 0x0003 +struct drm_virtgpu_context_set_param { + __u64 param; + __u64 value; +}; + +struct drm_virtgpu_context_init { + __u32 num_params; + __u32 pad; + + /* pointer to drm_virtgpu_context_set_param array */ + __u64 ctx_set_params; +}; + #define DRM_IOCTL_VIRTGPU_MAP \ DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_MAP, struct drm_virtgpu_map) @@ -212,6 +235,10 @@ struct drm_virtgpu_resource_create_blob { DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_RESOURCE_CREATE_BLOB, \ struct drm_virtgpu_resource_create_blob) +#define DRM_IOCTL_VIRTGPU_CONTEXT_INIT \ + DRM_IOWR(DRM_COMMAND_BASE + DRM_VIRTGPU_CONTEXT_INIT, \ + struct drm_virtgpu_context_init) + #if defined(__cplusplus) } #endif -- 2.33.0.153.gba50c8fa24-goog
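To make the mask table above concrete, a small hedged helper (illustrative only, not from the series) that builds the VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK value for the first num_rings rings; num_rings = 2 yields 0x3, matching the table.

#include <stdint.h>

static uint64_t poll_rings_mask(uint32_t num_rings)
{
	uint64_t mask = 0;
	uint32_t i;

	for (i = 0; i < num_rings; i++)
		mask |= 1ull << i;

	return mask;
}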
[PATCH v1 00/12] Context types
Version 1 of context types: https://lists.oasis-open.org/archives/virtio-dev/202108/msg00141.html Changes since RFC: * le32 info --> {u8 ring_idx + u8 padding[3]). * Max rings is now 64. Anthoine Bourgeois (2): drm/virtio: implement context init: probe for feature drm/virtio: implement context init: support init ioctl Gurchetan Singh (10): virtio-gpu api: multiple context types with explicit initialization drm/virtgpu api: create context init feature drm/virtio: implement context init: track valid capabilities in a mask drm/virtio: implement context init: track {ring_idx, emit_fence_info} in virtio_gpu_fence drm/virtio: implement context init: plumb {base_fence_ctx, ring_idx} to virtio_gpu_fence_alloc drm/virtio: implement context init: stop using drv->context when creating fence drm/virtio: implement context init: allocate an array of fence contexts drm/virtio: implement context init: handle VIRTGPU_CONTEXT_PARAM_POLL_RINGS_MASK drm/virtio: implement context init: add virtio_gpu_fence_event drm/virtio: implement context init: advertise feature to userspace drivers/gpu/drm/virtio/virtgpu_debugfs.c | 1 + drivers/gpu/drm/virtio/virtgpu_drv.c | 44 - drivers/gpu/drm/virtio/virtgpu_drv.h | 28 +++- drivers/gpu/drm/virtio/virtgpu_fence.c | 30 +++- drivers/gpu/drm/virtio/virtgpu_ioctl.c | 195 +-- drivers/gpu/drm/virtio/virtgpu_kms.c | 26 ++- drivers/gpu/drm/virtio/virtgpu_plane.c | 3 +- drivers/gpu/drm/virtio/virtgpu_vq.c | 19 +-- include/uapi/drm/virtgpu_drm.h | 27 include/uapi/linux/virtio_gpu.h | 18 ++- 10 files changed, 355 insertions(+), 36 deletions(-) -- 2.33.0.153.gba50c8fa24-goog
[PATCH v1 01/12] virtio-gpu api: multiple context types with explicit initialization
This feature allows each virtio-gpu 3D context to be created with a "context_init" variable. This variable can specify: - the type of protocol used by the context via the capset id. This is useful for differentiating virgl, gfxstream, and venus protocols by host userspace. - other things in the future, such as the version of the context. In addition, each different context needs one or more timelines, so for example a virgl context's waiting can be independent of a gfxstream context's waiting. VIRTIO_GPU_FLAG_INFO_RING_IDX is introduced to tell the host which per-context command ring (or "hardware queue", distinct from the virtio-queue) the fence should be associated with. The new capability sets (gfxstream, venus etc.) are only defined in the virtio-gpu spec and not defined in the header. Signed-off-by: Gurchetan Singh Acked-by: Lingfeng Yang --- include/uapi/linux/virtio_gpu.h | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/include/uapi/linux/virtio_gpu.h b/include/uapi/linux/virtio_gpu.h index 97523a95781d..b0e3d91dfab7 100644 --- a/include/uapi/linux/virtio_gpu.h +++ b/include/uapi/linux/virtio_gpu.h @@ -59,6 +59,11 @@ * VIRTIO_GPU_CMD_RESOURCE_CREATE_BLOB */ #define VIRTIO_GPU_F_RESOURCE_BLOB 3 +/* + * VIRTIO_GPU_CMD_CREATE_CONTEXT with + * context_init and multiple timelines + */ +#define VIRTIO_GPU_F_CONTEXT_INIT4 enum virtio_gpu_ctrl_type { VIRTIO_GPU_UNDEFINED = 0, @@ -122,14 +127,20 @@ enum virtio_gpu_shm_id { VIRTIO_GPU_SHM_ID_HOST_VISIBLE = 1 }; -#define VIRTIO_GPU_FLAG_FENCE (1 << 0) +#define VIRTIO_GPU_FLAG_FENCE (1 << 0) +/* + * If the following flag is set, then ring_idx contains the index + * of the command ring that needs to used when creating the fence + */ +#define VIRTIO_GPU_FLAG_INFO_RING_IDX (1 << 1) struct virtio_gpu_ctrl_hdr { __le32 type; __le32 flags; __le64 fence_id; __le32 ctx_id; - __le32 padding; + u8 ring_idx; + u8 padding[3]; }; /* data passed in the cursor vq */ @@ -269,10 +280,11 @@ struct virtio_gpu_resource_create_3d { }; /* VIRTIO_GPU_CMD_CTX_CREATE */ +#define VIRTIO_GPU_CONTEXT_INIT_CAPSET_ID_MASK 0x00ff struct virtio_gpu_ctx_create { struct virtio_gpu_ctrl_hdr hdr; __le32 nlen; - __le32 padding; + __le32 context_init; char debug_name[64]; }; -- 2.33.0.153.gba50c8fa24-goog
[PATCH v2] drm/plane-helper: fix uninitialized variable reference
drivers/gpu/drm/drm_plane_helper.c: In function 'drm_primary_helper_update': drivers/gpu/drm/drm_plane_helper.c:113:32: error: 'visible' is used uninitialized [-Werror=uninitialized] 113 | struct drm_plane_state plane_state = { |^~~ drivers/gpu/drm/drm_plane_helper.c:178:14: note: 'visible' was declared here 178 | bool visible; | ^~~ cc1: all warnings being treated as errors visible is an output, not an input. in practice this use might turn out OK but it's still UB. Fixes: df86af9133 ("drm/plane-helper: Add drm_plane_helper_check_state()") Signed-off-by: Alex Xu (Hello71) --- drivers/gpu/drm/drm_plane_helper.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/drm_plane_helper.c b/drivers/gpu/drm/drm_plane_helper.c index 5b2d0ca03705..838b32b70bce 100644 --- a/drivers/gpu/drm/drm_plane_helper.c +++ b/drivers/gpu/drm/drm_plane_helper.c @@ -123,7 +123,6 @@ static int drm_plane_helper_check_update(struct drm_plane *plane, .crtc_w = drm_rect_width(dst), .crtc_h = drm_rect_height(dst), .rotation = rotation, - .visible = *visible, }; struct drm_crtc_state crtc_state = { .crtc = crtc, -- 2.33.0
[PATCH 1/4] drm/i915: rename debugfs_gt files
We shouldn't be using debugfs_ namespace for this functionality. Rename debugfs_gt.[ch] to intel_gt_debugfs.[ch] and then make functions, defines and structs follow suit. While at it and since we are renaming the header, sort the includes alphabetically. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/Makefile | 2 +- drivers/gpu/drm/i915/gt/debugfs_engines.c | 6 +++--- drivers/gpu/drm/i915/gt/debugfs_gt_pm.c| 14 +++--- drivers/gpu/drm/i915/gt/intel_gt.c | 6 +++--- .../gt/{debugfs_gt.c => intel_gt_debugfs.c}| 8 .../gt/{debugfs_gt.h => intel_gt_debugfs.h}| 14 +++--- drivers/gpu/drm/i915/gt/intel_sseu_debugfs.c | 10 +- drivers/gpu/drm/i915/gt/uc/intel_guc_debugfs.c | 18 +- .../gpu/drm/i915/gt/uc/intel_guc_log_debugfs.c | 8 drivers/gpu/drm/i915/gt/uc/intel_huc_debugfs.c | 6 +++--- drivers/gpu/drm/i915/gt/uc/intel_uc_debugfs.c | 6 +++--- 11 files changed, 49 insertions(+), 49 deletions(-) rename drivers/gpu/drm/i915/gt/{debugfs_gt.c => intel_gt_debugfs.c} (87%) rename drivers/gpu/drm/i915/gt/{debugfs_gt.h => intel_gt_debugfs.h} (71%) diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index c36c8a4f0716..3e171f0b5f6a 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -80,7 +80,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o # "Graphics Technology" (aka we talk to the gpu) gt-y += \ gt/debugfs_engines.o \ - gt/debugfs_gt.o \ gt/debugfs_gt_pm.o \ gt/gen2_engine_cs.o \ gt/gen6_engine_cs.o \ @@ -101,6 +100,7 @@ gt-y += \ gt/intel_gt.o \ gt/intel_gt_buffer_pool.o \ gt/intel_gt_clock_utils.o \ + gt/intel_gt_debugfs.o \ gt/intel_gt_irq.o \ gt/intel_gt_pm.o \ gt/intel_gt_pm_irq.o \ diff --git a/drivers/gpu/drm/i915/gt/debugfs_engines.c b/drivers/gpu/drm/i915/gt/debugfs_engines.c index 5e3725e62241..2980dac5b171 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_engines.c +++ b/drivers/gpu/drm/i915/gt/debugfs_engines.c @@ -7,9 +7,9 @@ #include #include "debugfs_engines.h" -#include "debugfs_gt.h" #include "i915_drv.h" /* for_each_engine! 
*/ #include "intel_engine.h" +#include "intel_gt_debugfs.h" static int engines_show(struct seq_file *m, void *data) { @@ -24,11 +24,11 @@ static int engines_show(struct seq_file *m, void *data) return 0; } -DEFINE_GT_DEBUGFS_ATTRIBUTE(engines); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(engines); void debugfs_engines_register(struct intel_gt *gt, struct dentry *root) { - static const struct debugfs_gt_file files[] = { + static const struct intel_gt_debugfs_file files[] = { { "engines", &engines_fops }, }; diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c index f6733f279890..9222cf68c56c 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c @@ -6,11 +6,11 @@ #include -#include "debugfs_gt.h" #include "debugfs_gt_pm.h" #include "i915_drv.h" #include "intel_gt.h" #include "intel_gt_clock_utils.h" +#include "intel_gt_debugfs.h" #include "intel_gt_pm.h" #include "intel_llc.h" #include "intel_rc6.h" @@ -36,7 +36,7 @@ static int fw_domains_show(struct seq_file *m, void *data) return 0; } -DEFINE_GT_DEBUGFS_ATTRIBUTE(fw_domains); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(fw_domains); static void print_rc6_res(struct seq_file *m, const char *title, @@ -238,7 +238,7 @@ static int drpc_show(struct seq_file *m, void *unused) return err; } -DEFINE_GT_DEBUGFS_ATTRIBUTE(drpc); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(drpc); static int frequency_show(struct seq_file *m, void *unused) { @@ -480,7 +480,7 @@ static int frequency_show(struct seq_file *m, void *unused) return 0; } -DEFINE_GT_DEBUGFS_ATTRIBUTE(frequency); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(frequency); static int llc_show(struct seq_file *m, void *data) { @@ -533,7 +533,7 @@ static bool llc_eval(void *data) return HAS_LLC(gt->i915); } -DEFINE_GT_DEBUGFS_ATTRIBUTE(llc); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(llc); static const char *rps_power_to_str(unsigned int power) { @@ -612,11 +612,11 @@ static bool rps_eval(void *data) return HAS_RPS(gt->i915); } -DEFINE_GT_DEBUGFS_ATTRIBUTE(rps_boost); +DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rps_boost); void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root) { - static const struct debugfs_gt_file files[] = { + static const struct intel_gt_debugfs_file files[] = { { "drpc", &drpc_fops, NULL }, { "frequency", &frequency_fops, NULL }, { "forcewake", &fw_domains_fops, NULL }, diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 2aeaae036a6f..9dda17553e12 100644 --- a/drivers/gpu/drm
[PATCH 3/4] drm/i915: rename debugfs_gt_pm files
We shouldn't be using debugfs_ namespace for this functionality. Rename debugfs_gt_pm.[ch] to intel_gt_pm_debugfs.[ch] and then make functions, defines and structs follow suit. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/Makefile | 2 +- drivers/gpu/drm/i915/gt/debugfs_gt_pm.h| 14 -- drivers/gpu/drm/i915/gt/intel_gt_debugfs.c | 4 ++-- .../gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} | 4 ++-- drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h | 14 ++ 5 files changed, 19 insertions(+), 19 deletions(-) delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_gt_pm.h rename drivers/gpu/drm/i915/gt/{debugfs_gt_pm.c => intel_gt_pm_debugfs.c} (99%) create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 232c9673a2e5..dd656f2d7721 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o # "Graphics Technology" (aka we talk to the gpu) gt-y += \ - gt/debugfs_gt_pm.o \ gt/gen2_engine_cs.o \ gt/gen6_engine_cs.o \ gt/gen6_ppgtt.o \ @@ -103,6 +102,7 @@ gt-y += \ gt/intel_gt_engines_debugfs.o \ gt/intel_gt_irq.o \ gt/intel_gt_pm.o \ + gt/intel_gt_pm_debugfs.o \ gt/intel_gt_pm_irq.o \ gt/intel_gt_requests.o \ gt/intel_gtt.o \ diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h b/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h deleted file mode 100644 index 4cf5f5c9da7d.. --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.h +++ /dev/null @@ -1,14 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2019 Intel Corporation - */ - -#ifndef DEBUGFS_GT_PM_H -#define DEBUGFS_GT_PM_H - -struct intel_gt; -struct dentry; - -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root); - -#endif /* DEBUGFS_GT_PM_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c index e5d173c235a3..4096ee893b69 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c @@ -5,10 +5,10 @@ #include -#include "debugfs_gt_pm.h" #include "i915_drv.h" #include "intel_gt_debugfs.h" #include "intel_gt_engines_debugfs.h" +#include "intel_gt_pm_debugfs.h" #include "intel_sseu_debugfs.h" #include "uc/intel_uc_debugfs.h" @@ -24,7 +24,7 @@ void intel_gt_register_debugfs(struct intel_gt *gt) return; intel_gt_engines_register_debugfs(gt, root); - debugfs_gt_pm_register(gt, root); + intel_gt_pm_register_debugfs(gt, root); intel_sseu_debugfs_register(gt, root); intel_uc_debugfs_register(>->uc, root); diff --git a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c similarity index 99% rename from drivers/gpu/drm/i915/gt/debugfs_gt_pm.c rename to drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c index 9222cf68c56c..baca153c05dd 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_gt_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c @@ -6,12 +6,12 @@ #include -#include "debugfs_gt_pm.h" #include "i915_drv.h" #include "intel_gt.h" #include "intel_gt_clock_utils.h" #include "intel_gt_debugfs.h" #include "intel_gt_pm.h" +#include "intel_gt_pm_debugfs.h" #include "intel_llc.h" #include "intel_rc6.h" #include "intel_rps.h" @@ -614,7 +614,7 @@ static bool rps_eval(void *data) DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rps_boost); -void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root) +void intel_gt_pm_register_debugfs(struct intel_gt *gt, struct dentry *root) { static const struct intel_gt_debugfs_file files[] = { { "drpc", &drpc_fops, NULL }, 
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h new file mode 100644 index ..f44894579604 --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2019 Intel Corporation + */ + +#ifndef INTEL_GT_PM_DEBUGFS_H +#define INTEL_GT_PM_DEBUGFS_H + +struct intel_gt; +struct dentry; + +void intel_gt_pm_register_debugfs(struct intel_gt *gt, struct dentry *root); + +#endif /* INTEL_GT_PM_DEBUGFS_H */ -- 2.32.0
[PATCH 2/4] drm/i915: rename debugfs_engines files
We shouldn't be using debugfs_ namespace for this functionality. Rename debugfs_engines.[ch] to intel_gt_engines_debugfs.[ch] and then make functions, defines and structs follow suit. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/Makefile | 2 +- drivers/gpu/drm/i915/gt/debugfs_engines.h | 14 -- drivers/gpu/drm/i915/gt/intel_gt_debugfs.c | 4 ++-- ...ebugfs_engines.c => intel_gt_engines_debugfs.c} | 4 ++-- drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h | 14 ++ 5 files changed, 19 insertions(+), 19 deletions(-) delete mode 100644 drivers/gpu/drm/i915/gt/debugfs_engines.h rename drivers/gpu/drm/i915/gt/{debugfs_engines.c => intel_gt_engines_debugfs.c} (85%) create mode 100644 drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 3e171f0b5f6a..232c9673a2e5 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -79,7 +79,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o # "Graphics Technology" (aka we talk to the gpu) gt-y += \ - gt/debugfs_engines.o \ gt/debugfs_gt_pm.o \ gt/gen2_engine_cs.o \ gt/gen6_engine_cs.o \ @@ -101,6 +100,7 @@ gt-y += \ gt/intel_gt_buffer_pool.o \ gt/intel_gt_clock_utils.o \ gt/intel_gt_debugfs.o \ + gt/intel_gt_engines_debugfs.o \ gt/intel_gt_irq.o \ gt/intel_gt_pm.o \ gt/intel_gt_pm_irq.o \ diff --git a/drivers/gpu/drm/i915/gt/debugfs_engines.h b/drivers/gpu/drm/i915/gt/debugfs_engines.h deleted file mode 100644 index f69257eaa1cc.. --- a/drivers/gpu/drm/i915/gt/debugfs_engines.h +++ /dev/null @@ -1,14 +0,0 @@ -/* SPDX-License-Identifier: MIT */ -/* - * Copyright © 2019 Intel Corporation - */ - -#ifndef DEBUGFS_ENGINES_H -#define DEBUGFS_ENGINES_H - -struct intel_gt; -struct dentry; - -void debugfs_engines_register(struct intel_gt *gt, struct dentry *root); - -#endif /* DEBUGFS_ENGINES_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c index a27ba11605d8..e5d173c235a3 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_debugfs.c @@ -5,10 +5,10 @@ #include -#include "debugfs_engines.h" #include "debugfs_gt_pm.h" #include "i915_drv.h" #include "intel_gt_debugfs.h" +#include "intel_gt_engines_debugfs.h" #include "intel_sseu_debugfs.h" #include "uc/intel_uc_debugfs.h" @@ -23,7 +23,7 @@ void intel_gt_register_debugfs(struct intel_gt *gt) if (IS_ERR(root)) return; - debugfs_engines_register(gt, root); + intel_gt_engines_register_debugfs(gt, root); debugfs_gt_pm_register(gt, root); intel_sseu_debugfs_register(gt, root); diff --git a/drivers/gpu/drm/i915/gt/debugfs_engines.c b/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.c similarity index 85% rename from drivers/gpu/drm/i915/gt/debugfs_engines.c rename to drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.c index 2980dac5b171..44b22384fcb2 100644 --- a/drivers/gpu/drm/i915/gt/debugfs_engines.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.c @@ -6,10 +6,10 @@ #include -#include "debugfs_engines.h" #include "i915_drv.h" /* for_each_engine! 
*/ #include "intel_engine.h" #include "intel_gt_debugfs.h" +#include "intel_gt_engines_debugfs.h" static int engines_show(struct seq_file *m, void *data) { @@ -26,7 +26,7 @@ static int engines_show(struct seq_file *m, void *data) } DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(engines); -void debugfs_engines_register(struct intel_gt *gt, struct dentry *root) +void intel_gt_engines_register_debugfs(struct intel_gt *gt, struct dentry *root) { static const struct intel_gt_debugfs_file files[] = { { "engines", &engines_fops }, diff --git a/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h b/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h new file mode 100644 index ..4163b496937b --- /dev/null +++ b/drivers/gpu/drm/i915/gt/intel_gt_engines_debugfs.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: MIT */ +/* + * Copyright © 2019 Intel Corporation + */ + +#ifndef INTEL_GT_ENGINES_DEBUGFS_H +#define INTEL_GT_ENGINES_DEBUGFS_H + +struct intel_gt; +struct dentry; + +void intel_gt_engines_register_debugfs(struct intel_gt *gt, struct dentry *root); + +#endif /* INTEL_GT_ENGINES_DEBUGFS_H */ -- 2.32.0
[PATCH 4/4] drm/i915: deduplicate frequency dump on debugfs
Although commit 9dd4b065446a ("drm/i915/gt: Move pm debug files into a gt aware debugfs") says it was moving debug files to gt/, the i915_frequency_info file was left behind and its implementation copied into drivers/gpu/drm/i915/gt/debugfs_gt_pm.c. Over time we had several patches having to change both places to keep them in sync (and some patches failing to do so). The initial idea was to remove i915_frequency_info, but there are user space tools using it. From a quick code search there are other scripts and test tools besides igt, so it's not simply updating igt to get rid of the older file. Here we export a function using drm_printer as parameter and make both show() implementations to call this same function. Aside from a few variable name differences, for i915_frequency_info this brings a few lines that were not previously printed: RP UP EI, RP UP THRESHOLD, RP DOWN THRESHOLD and RP DOWN EI. These came in as part of commit 9c878557b1eb ("drm/i915/gt: Use the RPM config register to determine clk frequencies"), which didn't change both places. Signed-off-by: Lucas De Marchi --- drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c | 127 +- drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.h | 2 + drivers/gpu/drm/i915/i915_debugfs.c | 231 +- 3 files changed, 76 insertions(+), 284 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c index baca153c05dd..31d334d3b3b5 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c +++ b/drivers/gpu/drm/i915/gt/intel_gt_pm_debugfs.c @@ -240,9 +240,8 @@ static int drpc_show(struct seq_file *m, void *unused) } DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(drpc); -static int frequency_show(struct seq_file *m, void *unused) +void intel_gt_pm_frequency_dump(struct intel_gt *gt, struct drm_printer *p) { - struct intel_gt *gt = m->private; struct drm_i915_private *i915 = gt->i915; struct intel_uncore *uncore = gt->uncore; struct intel_rps *rps = >->rps; @@ -254,21 +253,21 @@ static int frequency_show(struct seq_file *m, void *unused) u16 rgvswctl = intel_uncore_read16(uncore, MEMSWCTL); u16 rgvstat = intel_uncore_read16(uncore, MEMSTAT_ILK); - seq_printf(m, "Requested P-state: %d\n", (rgvswctl >> 8) & 0xf); - seq_printf(m, "Requested VID: %d\n", rgvswctl & 0x3f); - seq_printf(m, "Current VID: %d\n", (rgvstat & MEMSTAT_VID_MASK) >> + drm_printf(p, "Requested P-state: %d\n", (rgvswctl >> 8) & 0xf); + drm_printf(p, "Requested VID: %d\n", rgvswctl & 0x3f); + drm_printf(p, "Current VID: %d\n", (rgvstat & MEMSTAT_VID_MASK) >> MEMSTAT_VID_SHIFT); - seq_printf(m, "Current P-state: %d\n", + drm_printf(p, "Current P-state: %d\n", (rgvstat & MEMSTAT_PSTATE_MASK) >> MEMSTAT_PSTATE_SHIFT); } else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) { u32 rpmodectl, freq_sts; rpmodectl = intel_uncore_read(uncore, GEN6_RP_CONTROL); - seq_printf(m, "Video Turbo Mode: %s\n", + drm_printf(p, "Video Turbo Mode: %s\n", yesno(rpmodectl & GEN6_RP_MEDIA_TURBO)); - seq_printf(m, "HW control enabled: %s\n", + drm_printf(p, "HW control enabled: %s\n", yesno(rpmodectl & GEN6_RP_ENABLE)); - seq_printf(m, "SW control enabled: %s\n", + drm_printf(p, "SW control enabled: %s\n", yesno((rpmodectl & GEN6_RP_MEDIA_MODE_MASK) == GEN6_RP_MEDIA_SW_MODE)); @@ -276,25 +275,25 @@ static int frequency_show(struct seq_file *m, void *unused) freq_sts = vlv_punit_read(i915, PUNIT_REG_GPU_FREQ_STS); vlv_punit_put(i915); - seq_printf(m, "PUNIT_REG_GPU_FREQ_STS: 0x%08x\n", freq_sts); - seq_printf(m, "DDR freq: %d MHz\n", i915->mem_freq); + drm_printf(p, 
"PUNIT_REG_GPU_FREQ_STS: 0x%08x\n", freq_sts); + drm_printf(p, "DDR freq: %d MHz\n", i915->mem_freq); - seq_printf(m, "actual GPU freq: %d MHz\n", + drm_printf(p, "actual GPU freq: %d MHz\n", intel_gpu_freq(rps, (freq_sts >> 8) & 0xff)); - seq_printf(m, "current GPU freq: %d MHz\n", + drm_printf(p, "current GPU freq: %d MHz\n", intel_gpu_freq(rps, rps->cur_freq)); - seq_printf(m, "max GPU freq: %d MHz\n", + drm_printf(p, "max GPU freq: %d MHz\n", intel_gpu_freq(rps, rps->max_freq)); - seq_printf(m, "min GPU freq: %d MHz\n", + drm_printf(p, "min GPU freq: %d MHz\n", intel_gpu_freq(rps, rps->min_freq));
Re: [PATCH v3 00/16] eDP: Support probing eDP panels dynamically instead of hardcoding
Hi, On Sun, Sep 5, 2021 at 11:55 AM Sam Ravnborg wrote: > > Hi Douglas, > > On Wed, Sep 01, 2021 at 01:19:18PM -0700, Douglas Anderson wrote: > > The goal of this patch series is to move away from hardcoding exact > > eDP panels in device tree files. As discussed in the various patches > > in this series (I'm not repeating everything here), most eDP panels > > are 99% probable and we can get that last 1% by allowing two "power > > up" delays to be specified in the device tree file and then using the > > panel ID (found in the EDID) to look up additional power sequencing > > delays for the panel. > > > > This patch series is the logical contiunation of a previous patch > > series where I proposed solving this problem by adding a > > board-specific compatible string [1]. In the discussion that followed > > it sounded like people were open to something like the solution > > proposed in this new series. > > > > In version 2 I got rid of the idea that we could have a "fallback" > > compatible string that we'd use if we didn't recognize the ID in the > > EDID. This simplifies the bindings a lot and the implementation > > somewhat. As a result of not having a "fallback", though, I'm not > > confident in transitioning any existing boards over to this since > > we'll have to fallback to very conservative timings if we don't > > recognize the ID from the EDID and I can't guarantee that I've seen > > every panel that might have shipped on an existing product. The plan > > is to use "edp-panel" only on new boards or new revisions of old > > boards where we can guarantee that every EDID that ships out of the > > factory has an ID in the table. > > > > Version 3 of this series now splits out all eDP panels to their own > > driver and adds the generic eDP panel support to this new driver. I > > believe this is what Sam was looking for [2]. > > > > [1] https://lore.kernel.org/r/yfkqaxomowyye...@google.com/ > > [2] https://lore.kernel.org/r/yrtsfntn%2ft8fl...@ravnborg.org/ > > > > Changes in v3: > > - Decode hex product ID w/ same endianness as everyone else. > > - ("Reorder logicpd_type_28...") patch new for v3. > > - Split eDP panels patch new for v3. > > - Move wayward panels patch new for v3. > > - ("Non-eDP panels don't need "HPD" handling") new for v3. > > - Split the delay structure out patch just on eDP now. > > - ("Better describe eDP panel delays") new for v3. > > - Fix "prepare_to_enable" patch new for v3. > > - ("Don't re-read the EDID every time") moved to eDP only patch. > > - Generic "edp-panel" handled by the eDP panel driver now. > > - Change init order to we power at the end. > > - Adjust endianness of product ID. > > - Fallback to conservative delays if panel not recognized. > > - Add Sharp LQ116M1JW10 to table. > > - Add AUO B116XAN06.1 to table. > > - Rename delays more generically so they can be reused. > > > > Changes in v2: > > - No longer allow fallback to panel-simple. > > - Add "-ms" suffix to delays. > > - Don't support a "fallback" panel. Probed panels must be probed. > > - Not based on patch to copy "desc"--just allocate for probed panels. > > - Add "-ms" suffix to delays. 
> > > > Douglas Anderson (16): > > dt-bindings: drm/panel-simple-edp: Introduce generic eDP panels > > drm/edid: Break out reading block 0 of the EDID > > drm/edid: Allow the querying/working with the panel ID from the EDID > > drm/panel-simple: Reorder logicpd_type_28 / mitsubishi_aa070mc01 > > drm/panel-simple-edp: Split eDP panels out of panel-simple > > ARM: configs: Everyone who had PANEL_SIMPLE now gets PANEL_SIMPLE_EDP > > arm64: defconfig: Everyone who had PANEL_SIMPLE now gets > > PANEL_SIMPLE_EDP > > MIPS: configs: Everyone who had PANEL_SIMPLE now gets PANEL_SIMPLE_EDP > > drm/panel-simple-edp: Move some wayward panels to the eDP driver > > drm/panel-simple: Non-eDP panels don't need "HPD" handling > > drm/panel-simple-edp: Split the delay structure out > > drm/panel-simple-edp: Better describe eDP panel delays > > drm/panel-simple-edp: hpd_reliable shouldn't be subtracted from > > hpd_absent > > drm/panel-simple-edp: Fix "prepare_to_enable" if panel doesn't handle > > HPD > > drm/panel-simple-edp: Don't re-read the EDID every time we power off > > the panel > > drm/panel-simple-edp: Implement generic "edp-panel"s probed by EDID > > Thanks for looking into this. I really like the outcome. > We have panel-simple that now (mostly) handles simple panels, > and thus all the eDP functionality is in a separate driver. > > I have provided a few nits. > My only take on this is the naming - as we do not want to confuse > panel-simple and panel-edp I strongly suggest renaming the driver to > panel-edp. Sure, I'll do that. I was trying to express the fact that the new "panel-edp" driver won't actually handle _all_ eDP panels, only the eDP panels that are (comparatively) simpler. For instance, I'm not planning to handle panel-samsung-atna33xc20.c in "panel-edp". I guess people
Re: [PATCH v3 03/16] drm/edid: Allow the querying/working with the panel ID from the EDID
Hi, On Mon, Sep 6, 2021 at 3:05 AM Jani Nikula wrote: > > > +{ > > + struct edid *edid; > > + u32 val; > > + > > + edid = drm_do_get_edid_blk0(drm_do_probe_ddc_edid, adapter, NULL, > > NULL); > > + > > + /* > > + * There are no manufacturer IDs of 0, so if there is a problem > > reading > > + * the EDID then we'll just return 0. > > + */ > > + if (IS_ERR_OR_NULL(edid)) > > + return 0; > > + > > + /* > > + * In theory we could try to de-obfuscate this like edid_get_quirks() > > + * does, but it's easier to just deal with a 32-bit number. > > Hmm, but is it, really? AFAICT this is just an internal representation > for a table, where it could just as well be stored in a struct that > could be just as compact now, but extensible later. You populate the > table via an encoding macro, then decode the id using a function - while > it could be in a format that's directly usable without the decode. If > suitably chosen, the struct could perhaps be reused between the quirks > code and your code. I'm not 100% sure, but I think you're suggesting having this function return a `struct edid_panel_id` or something like that. Is that right? Maybe that would look something like this? struct edid_panel_id { char vendor[4]; u16 product_id; } ...or perhaps this (untested, but I think it works): struct edid_panel_id { u16 vend_c1:5; u16 vend_c2:5; u16 vend_c3:5; u16 product_id; } ...and then change `struct edid_quirk` to something like this: static const struct edid_quirk { struct edid_panel_id panel_id; u32 quirks; } ... Is that correct? There are a few downsides that I can see: a) I think the biggest downside is the inability to compare with "==". I don't believe it's legal to compare structs with "==" in C. Yeah, we can use memcmp() but that feels more awkward to me. b) Unless you use the bitfield approach, it takes up more space. I know it's not a huge deal, but the format in the EDID is pretty much _forced_ to fit in 32-bits. The bitfield approach seems like it'd be more awkward than my encoding macros. -Doug
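For reference, a 32-bit encoding along the lines Doug describes could look like the sketch below; the macro and table names are illustrative only (they are not the ones from this series) and the product IDs are placeholders. The three vendor letters are 5-bit values ('A' encodes as 1), so the whole panel ID packs into a u32 that can be compared directly with ==:

#include <linux/types.h>

/* Illustrative encoding macro: 3 x 5-bit vendor letters + 16-bit product ID. */
#define EXAMPLE_EDID_PANEL_ID(c0, c1, c2, prod)		\
	((((u32)(c0) - '@') & 0x1f) << 26 |		\
	 (((u32)(c1) - '@') & 0x1f) << 21 |		\
	 (((u32)(c2) - '@') & 0x1f) << 16 |		\
	 ((u32)(prod) & 0xffff))

static const struct example_panel_entry {
	u32 panel_id;
	/* per-panel power sequencing delays would live here */
} example_panel_table[] = {
	{ .panel_id = EXAMPLE_EDID_PANEL_ID('A', 'U', 'O', 0x1234) },
	{ .panel_id = EXAMPLE_EDID_PANEL_ID('S', 'H', 'P', 0xabcd) },
};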
Re: [PATCH v5 11/16] drm/mediatek: add display MDP RDMA support for MT8195
Hi, Nancy: Nancy.Lin wrote on Mon, Sep 6, 2021 at 3:15 PM: > > Add MDP_RDMA driver for MT8195. MDP_RDMA is the DMA engine of > the ovl_adaptor component. > > Signed-off-by: Nancy.Lin > --- > drivers/gpu/drm/mediatek/Makefile | 3 +- > drivers/gpu/drm/mediatek/mtk_disp_drv.h | 7 + > drivers/gpu/drm/mediatek/mtk_mdp_rdma.c | 301 > drivers/gpu/drm/mediatek/mtk_mdp_rdma.h | 37 +++ > 4 files changed, 347 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/mediatek/mtk_mdp_rdma.c > create mode 100644 drivers/gpu/drm/mediatek/mtk_mdp_rdma.h > > diff --git a/drivers/gpu/drm/mediatek/Makefile > b/drivers/gpu/drm/mediatek/Makefile > index a38e88e82d12..6e604a933ed0 100644 > --- a/drivers/gpu/drm/mediatek/Makefile > +++ b/drivers/gpu/drm/mediatek/Makefile > @@ -13,7 +13,8 @@ mediatek-drm-y := mtk_disp_aal.o \ > mtk_drm_gem.o \ > mtk_drm_plane.o \ > mtk_dsi.o \ > - mtk_dpi.o > + mtk_dpi.o \ > + mtk_mdp_rdma.o > > obj-$(CONFIG_DRM_MEDIATEK) += mediatek-drm.o > > diff --git a/drivers/gpu/drm/mediatek/mtk_disp_drv.h > b/drivers/gpu/drm/mediatek/mtk_disp_drv.h > index a33b13fe2b6e..b3a372cab0bd 100644 > --- a/drivers/gpu/drm/mediatek/mtk_disp_drv.h > +++ b/drivers/gpu/drm/mediatek/mtk_disp_drv.h > @@ -8,6 +8,7 @@ > > #include > #include "mtk_drm_plane.h" > +#include "mtk_mdp_rdma.h" > > int mtk_aal_clk_enable(struct device *dev); > void mtk_aal_clk_disable(struct device *dev); > @@ -106,4 +107,10 @@ void mtk_rdma_enable_vblank(struct device *dev, > void *vblank_cb_data); > void mtk_rdma_disable_vblank(struct device *dev); > > +int mtk_mdp_rdma_clk_enable(struct device *dev); > +void mtk_mdp_rdma_clk_disable(struct device *dev); > +void mtk_mdp_rdma_start(struct device *dev, struct cmdq_pkt *cmdq_pkt); > +void mtk_mdp_rdma_stop(struct device *dev, struct cmdq_pkt *cmdq_pkt); > +void mtk_mdp_rdma_config(struct device *dev, struct mtk_mdp_rdma_cfg *cfg, > +struct cmdq_pkt *cmdq_pkt); > #endif > diff --git a/drivers/gpu/drm/mediatek/mtk_mdp_rdma.c > b/drivers/gpu/drm/mediatek/mtk_mdp_rdma.c > new file mode 100644 > index ..052434d960b9 > --- /dev/null > +++ b/drivers/gpu/drm/mediatek/mtk_mdp_rdma.c > @@ -0,0 +1,301 @@ > +// SPDX-License-Identifier: GPL-2.0-only > +/* > + * Copyright (c) 2021 MediaTek Inc. > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "mtk_drm_drv.h" > +#include "mtk_disp_drv.h" > +#include "mtk_mdp_rdma.h" > + > +#define MDP_RDMA_EN 0x000 > + #define FLD_ROT_ENABLE BIT(0) Maybe my description is not good; I like the style of the rdma driver [1].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/mediatek/mtk_disp_rdma.c?h=v5.14 > + > +#define MDP_RDMA_RESET 0x008 > + > +#define MDP_RDMA_CON 0x020 > + #define FLD_OUTPUT_10B BIT(5) > + #define FLD_SIMPLE_MODE BIT(4) > + > +#define MDP_RDMA_GMCIF_CON 0x028 > + #define FLD_COMMAND_DIV BIT(0) > + #define FLD_EXT_PREULTRA_EN BIT(3) > + #define FLD_RD_REQ_TYPE GENMASK(7, 4) > + #define VAL_RD_REQ_TYPE_BURST_8_ACCESS 7 > + #define FLD_ULTRA_EN GENMASK(13, 12) > + #define VAL_ULTRA_EN_ENABLE 1 > + #define FLD_PRE_ULTRA_EN GENMASK(17, 16) > + #define VAL_PRE_ULTRA_EN_ENABLE 1 > + #define FLD_EXT_ULTRA_EN BIT(18) > + > +#define MDP_RDMA_SRC_CON 0x030 > + #define FLD_OUTPUT_ARGB BIT(25) > + #define FLD_BIT_NUMBER GENMASK(19, 18) > + #define FLD_UNIFORM_CONFIG BIT(17) > + #define FLD_SWAP BIT(14) > + #define FLD_SRC_FORMAT GENMASK(3, 0) > + > +#define MDP_RDMA_COMP_CON 0x038 > + #define FLD_AFBC_EN BIT(22) > + #define FLD_AFBC_YUV_TRANSFORM BIT(21) > + #define FLD_UFBDC_EN BIT(12) > + > +#define MDP_RDMA_MF_BKGD_SIZE_IN_BYTE 0x060 > + #define FLD_MF_BKGD_WB GENMASK(22, 0) > + > +#define MDP_RDMA_MF_SRC_SIZE 0x070 > + #define FLD_MF_SRC_H GENMASK(30, 16) > + #define FLD_MF_SR
Re: [PATCH v5 25/25] drm/i915/guc: Add GuC kernel doc
On 9/3/2021 12:59, Daniele Ceraolo Spurio wrote: From: Matthew Brost Add GuC kernel doc for all structures added thus far for GuC submission and update the main GuC submission section with the new interface details. v2: - Drop guc_active.lock DOC v3: - Fixup a few kernel doc comments (Daniele) v4 (Daniele): - Implement doc suggestions from John - Add kerneldoc for all members of the GuC structure and pull the file in i915.rst v5 (Daniele): - Implement new doc suggestions from John Signed-off-by: Matthew Brost Signed-off-by: Daniele Ceraolo Spurio Cc: John Harrison Reviewed-by: John Harrison --- Documentation/gpu/i915.rst| 2 + drivers/gpu/drm/i915/gt/intel_context_types.h | 43 +--- drivers/gpu/drm/i915/gt/uc/intel_guc.h| 75 ++--- .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 100 ++ drivers/gpu/drm/i915/i915_request.h | 21 ++-- 5 files changed, 181 insertions(+), 60 deletions(-) diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst index 101dde3eb1ea..311e10400708 100644 --- a/Documentation/gpu/i915.rst +++ b/Documentation/gpu/i915.rst @@ -495,6 +495,8 @@ GuC .. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_guc.c :doc: GuC +.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_guc.h + GuC Firmware Layout ~~~ diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index 5285d660eacf..930569a1a01f 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -156,40 +156,51 @@ struct intel_context { u8 wa_bb_page; /* if set, page num reserved for context workarounds */ struct { - /** lock: protects everything in guc_state */ + /** @lock: protects everything in guc_state */ spinlock_t lock; /** -* sched_state: scheduling state of this context using GuC +* @sched_state: scheduling state of this context using GuC * submission */ u32 sched_state; /* -* fences: maintains of list of requests that have a submit -* fence related to GuC submission +* @fences: maintains a list of requests that are currently +* being fenced until a GuC operation completes */ struct list_head fences; - /* GuC context blocked fence */ + /** +* @blocked: fence used to signal when the blocking of a +* context's submissions is complete. 
+*/ struct i915_sw_fence blocked; - /* GuC committed requests */ + /** @number_committed_requests: number of committed requests */ int number_committed_requests; - /** requests: active requests on this context */ + /** @requests: list of active requests on this context */ struct list_head requests; - /* -* GuC priority management -*/ + /** @prio: the context's current guc priority */ u8 prio; + /** +* @prio_count: a counter of the number requests in flight in +* each priority bucket +*/ u32 prio_count[GUC_CLIENT_PRIORITY_NUM]; } guc_state; struct { - /* GuC LRC descriptor ID */ + /** +* @id: handle which is used to uniquely identify this context +* with the GuC, protected by guc->contexts_lock +*/ u16 id; - - /* GuC LRC descriptor reference count */ + /** +* @ref: the number of references to the guc_id, when +* transitioning in and out of zero protected by +* guc->contexts_lock +*/ atomic_t ref; - - /* -* GuC ID link - in list when unpinned but guc_id still valid in GuC + /** +* @link: in guc->guc_id_list when the guc_id has no refs but is +* still valid, protected by guc->contexts_lock */ struct list_head link; } guc_id; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc.h b/drivers/gpu/drm/i915/gt/uc/intel_guc.h index 2e27fe59786b..5dd174babf7a 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc.h +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc.h @@ -22,74 +22,121 @@ struct __guc_ads_blob; -/* - * Top level structure of GuC. It handles firmware loading and manages client - * pool. intel_guc owns a intel_guc_client to replace the legacy ExecList - * submission. +/** + * struct intel_guc - Top level structure of GuC. + * + * It handl
Re: [PATCH v3 06/16] ARM: configs: Everyone who had PANEL_SIMPLE now gets PANEL_SIMPLE_EDP
On Wed, Sep 8, 2021 at 3:36 PM Doug Anderson wrote: > > Hi, > > On Fri, Sep 3, 2021 at 1:38 PM Stephen Boyd wrote: > > > > Quoting Doug Anderson (2021-09-01 16:10:15) > > > Hi, > > > > > > On Wed, Sep 1, 2021 at 2:12 PM Olof Johansson wrote: > > > > > > > > On Wed, Sep 1, 2021 at 1:20 PM Douglas Anderson > > > > wrote: > > > > > > > > > > In the patch ("drm/panel-simple-edp: Split eDP panels out of > > > > > panel-simple") we split the PANEL_SIMPLE driver in 2. By default let's > > > > > give everyone who had the old driver enabled the new driver too. If > > > > > folks want to opt-out of one or the other they always can later. > > > > > > > > > > Signed-off-by: Douglas Anderson > > > > > > > > Isn't this a case where the new option should just have had the old > > > > option as the default value to avoid this kind of churn and possibly > > > > broken platforms? > > > > > > I'm happy to go either way. I guess I didn't do that originally > > > because logically there's not any reason to link the two drivers going > > > forward. Said another way, someone enabling the "simple panel" driver > > > for non-eDP panels wouldn't expect that the "simple panel" driver for > > > DP panels would also get enabled by default. They really have nothing > > > to do with one another. Enabling by default for something like this > > > also seems like it would lead to bloat. I could have sworn that > > > periodically people get yelled at for marking drivers on by default > > > when it doesn't make sense. > > > > > > ...that being said, I'm happy to change the default as you suggest. > > > Just let me know. > > > > Having the default will help olddefconfig users seamlessly migrate to > > the new Kconfig. Sadly they don't notice that they should probably > > disable the previous Kconfig symbol, but oh well. At least with the > > default they don't go on a hunt/bisect to figure out that some Kconfig > > needed to be enabled now that they're using a new kernel version. > > > > Maybe the default should have a TODO comment next to it indicating we > > should remove the default in a year or two. > > OK, so I'm trying to figure out how to do this without just "kicking > the can" down the road. I guess your idea is that for the next year > this will be the default and that anyone who really wants > "CONFIG_DRM_PANEL_EDP" will "opt-in" to keep it by adding > "CONFIG_DRM_PANEL_EDP=y" to their config? ...and then after a year > passes we remove the default? ...but that won't work, will it? Since > "CONFIG_DRM_PANEL_EDP" will be the default for the next year then you > really can't add it to the "defconfig", at least if you ever > "normalize" it. The "defconfig" by definition has everything stripped > from it that's already the "default", so for the next year anyone who > tries to opt-in will get their preference stripped. > > Hrm, so let me explain options as I see them. Maybe someone can point > out something that I missed. I'll assume that we'll change the config > option from CONFIG_DRM_PANEL_SIMPLE_EDP to CONFIG_DRM_PANEL_EDP > (remove the "SIMPLE" part). > > == > > Where we were before my series: > > * One config "CONFIG_DRM_PANEL_SIMPLE" and it enables simple non-eDP > and eDP drivers. > > == > > Option 1: update everyone's configs (this patch) > > * Keep old config "CONFIG_DRM_PANEL_SIMPLE" but it now only means > enable the panel-simple (non-eDP) driver. 
> * Anyone who wants eDP panels must opt-in to "CONFIG_DRM_PANEL_EDP" > * Update all configs in mainline; any out-of mainline configs must > figure this out themselves. > > Pros: > * no long term baggage > > Cons: > * patch upstream is a bit of "churn" > * anyone with downstream config will have to figure out what happened. > > == > > Option 2: kick the can down the road + accept cruft > > * Keep old config "CONFIG_DRM_PANEL_SIMPLE" and it means enable the > panel-simple (non-eDP) driver. > * Anyone with "CONFIG_DRM_PANEL_SIMPLE" is opted in by default to > "CONFIG_DRM_PANEL_EDP" > > AKA: > config DRM_PANEL_EDP > default DRM_PANEL_SIMPLE > > Pros: > * no config patches needed upstream at all and everything just works! > > Cons: > * people are opted in to extra cruft by default and need to know to turn it > off. > * unclear if we can change the default without the same problems. > > == > > Option 3: try to be clever > > * Add _two_ new configs. CONFIG_DRM_PANEL_SIMPLE_V2 and CONFIG_DRM_PANEL_EDP. > * Old config "CONFIG_DRM_PANEL_SIMPLE" gets marked as "deprecated". > * Both new configs have "default CONFIG_DRM_PANEL_SIMPLE" > > Now anyone old will magically get both the new config options by > default. Anyone looking at this in the future _won't_ set the > deprecated CONFIG_DRM_PANEL_SIMPLE but will instead choose if they > want either the eDP or "simple" driver. > > Pros: > * No long term baggage. > * Everyone is transitioned automatically by default with no cruft patches. > > Cons: > * I can't think of a better name than "CONFIG_DRM_PANEL_SIMPLE_V2" and > that name is ugl
[PATCH v3 8/8] treewide: Replace the use of mem_encrypt_active() with cc_platform_has()
Replace uses of mem_encrypt_active() with calls to cc_platform_has() with the CC_ATTR_MEM_ENCRYPT attribute. Remove the implementation of mem_encrypt_active() across all arches. For s390, since the default implementation of the cc_platform_has() matches the s390 implementation of mem_encrypt_active(), cc_platform_has() does not need to be implemented in s390 (the config option ARCH_HAS_CC_PLATFORM is not set). Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: David Airlie Cc: Daniel Vetter Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: VMware Graphics Cc: Joerg Roedel Cc: Will Deacon Cc: Dave Young Cc: Baoquan He Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Heiko Carstens Cc: Vasily Gorbik Cc: Christian Borntraeger Signed-off-by: Tom Lendacky --- arch/powerpc/include/asm/mem_encrypt.h | 5 - arch/powerpc/platforms/pseries/svm.c| 5 +++-- arch/s390/include/asm/mem_encrypt.h | 2 -- arch/x86/include/asm/mem_encrypt.h | 5 - arch/x86/kernel/head64.c| 4 ++-- arch/x86/mm/ioremap.c | 4 ++-- arch/x86/mm/mem_encrypt.c | 2 +- arch/x86/mm/pat/set_memory.c| 3 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +++- drivers/gpu/drm/drm_cache.c | 4 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 4 ++-- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +++--- drivers/iommu/amd/iommu.c | 3 ++- drivers/iommu/amd/iommu_v2.c| 3 ++- drivers/iommu/iommu.c | 3 ++- fs/proc/vmcore.c| 6 +++--- include/linux/mem_encrypt.h | 4 kernel/dma/swiotlb.c| 4 ++-- 18 files changed, 31 insertions(+), 40 deletions(-) diff --git a/arch/powerpc/include/asm/mem_encrypt.h b/arch/powerpc/include/asm/mem_encrypt.h index ba9dab07c1be..2f26b8fc8d29 100644 --- a/arch/powerpc/include/asm/mem_encrypt.h +++ b/arch/powerpc/include/asm/mem_encrypt.h @@ -10,11 +10,6 @@ #include -static inline bool mem_encrypt_active(void) -{ - return is_secure_guest(); -} - static inline bool force_dma_unencrypted(struct device *dev) { return is_secure_guest(); diff --git a/arch/powerpc/platforms/pseries/svm.c b/arch/powerpc/platforms/pseries/svm.c index 87f001b4c4e4..c083ecbbae4d 100644 --- a/arch/powerpc/platforms/pseries/svm.c +++ b/arch/powerpc/platforms/pseries/svm.c @@ -8,6 +8,7 @@ #include #include +#include #include #include #include @@ -63,7 +64,7 @@ void __init svm_swiotlb_init(void) int set_memory_encrypted(unsigned long addr, int numpages) { - if (!mem_encrypt_active()) + if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT)) return 0; if (!PAGE_ALIGNED(addr)) @@ -76,7 +77,7 @@ int set_memory_encrypted(unsigned long addr, int numpages) int set_memory_decrypted(unsigned long addr, int numpages) { - if (!mem_encrypt_active()) + if (!cc_platform_has(CC_ATTR_MEM_ENCRYPT)) return 0; if (!PAGE_ALIGNED(addr)) diff --git a/arch/s390/include/asm/mem_encrypt.h b/arch/s390/include/asm/mem_encrypt.h index 2542cbf7e2d1..08a8b96606d7 100644 --- a/arch/s390/include/asm/mem_encrypt.h +++ b/arch/s390/include/asm/mem_encrypt.h @@ -4,8 +4,6 @@ #ifndef __ASSEMBLY__ -static inline bool mem_encrypt_active(void) { return false; } - int set_memory_encrypted(unsigned long addr, int numpages); int set_memory_decrypted(unsigned long addr, int numpages); diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 499440781b39..ed954aa5c448 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -98,11 +98,6 @@ static inline void mem_encrypt_free_decrypted_mem(void) { } extern char __start_bss_decrypted[], __end_bss_decrypted[], 
__start_bss_decrypted_unused[]; -static inline bool mem_encrypt_active(void) -{ - return sme_me_mask; -} - static inline u64 sme_get_me_mask(void) { return sme_me_mask; diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index de01903c3735..f98c76a1d16c 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -19,7 +19,7 @@ #include #include #include -#include +#include #include #include @@ -285,7 +285,7 @@ unsigned long __head __startup_64(unsigned long physaddr, * there is no need to zero it after changing the memory encryption * attribute. */ - if (mem_encrypt_active()) { + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) { vaddr = (unsigned long)__start_bss_decrypted; vaddr_end = (unsigned long)__end_bss_decrypted; for (; vaddr < vaddr_end; vaddr += PMD_SIZE) { diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index b59a5cbc6bc5..026031b3b782 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/
[PATCH v3 7/8] x86/sev: Replace occurrences of sev_es_active() with cc_platform_has()
Replace uses of sev_es_active() with the more generic cc_platform_has() using CC_ATTR_GUEST_STATE_ENCRYPT. If future support is added for other memory encryption technologies, the use of CC_ATTR_GUEST_STATE_ENCRYPT can be updated, as required. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Signed-off-by: Tom Lendacky --- arch/x86/include/asm/mem_encrypt.h | 2 -- arch/x86/kernel/sev.c | 6 +++--- arch/x86/mm/mem_encrypt.c | 14 -- arch/x86/realmode/init.c | 3 +-- 4 files changed, 8 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index f440eebeeb2c..499440781b39 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void); void __init mem_encrypt_init(void); void __init sev_es_init_vc_handling(void); -bool sev_es_active(void); bool amd_cc_platform_has(enum cc_attr attr); #define __bss_decrypted __section(".bss..decrypted") @@ -75,7 +74,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct boot_params *bp) { } static inline void sev_es_init_vc_handling(void) { } -static inline bool sev_es_active(void) { return false; } static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; } static inline int __init diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c index a6895e440bc3..53a6837d354b 100644 --- a/arch/x86/kernel/sev.c +++ b/arch/x86/kernel/sev.c @@ -11,7 +11,7 @@ #include /* For show_regs() */ #include -#include +#include #include #include #include @@ -615,7 +615,7 @@ int __init sev_es_efi_map_ghcbs(pgd_t *pgd) int cpu; u64 pfn; - if (!sev_es_active()) + if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) return 0; pflags = _PAGE_NX | _PAGE_RW; @@ -774,7 +774,7 @@ void __init sev_es_init_vc_handling(void) BUILD_BUG_ON(offsetof(struct sev_es_runtime_data, ghcb_page) % PAGE_SIZE); - if (!sev_es_active()) + if (!cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) return; if (!sev_es_check_cpu_features()) diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c index 22d4e152a6de..47d571a2cd28 100644 --- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -373,13 +373,6 @@ int __init early_set_memory_encrypted(unsigned long vaddr, unsigned long size) * up under SME the trampoline area cannot be encrypted, whereas under SEV * the trampoline area must be encrypted. */ - -/* Needs to be called from non-instrumentable code */ -bool noinstr sev_es_active(void) -{ - return sev_status & MSR_AMD64_SEV_ES_ENABLED; -} - bool amd_cc_platform_has(enum cc_attr attr) { switch (attr) { @@ -393,7 +386,7 @@ bool amd_cc_platform_has(enum cc_attr attr) return sev_status & MSR_AMD64_SEV_ENABLED; case CC_ATTR_GUEST_STATE_ENCRYPT: - return sev_es_active(); + return sev_status & MSR_AMD64_SEV_ES_ENABLED; default: return false; @@ -469,7 +462,7 @@ static void print_mem_encrypt_feature_info(void) pr_cont(" SEV"); /* Encrypted Register State */ - if (sev_es_active()) + if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) pr_cont(" SEV-ES"); pr_cont("\n"); @@ -488,7 +481,8 @@ void __init mem_encrypt_init(void) * With SEV, we need to unroll the rep string I/O instructions, * but SEV-ES supports them through the #VC handler.
*/ - if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && !sev_es_active()) + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && + !cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) static_branch_enable(&sev_enable_key); print_mem_encrypt_feature_info(); diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c index c878c5ee5a4c..4a3da7592b99 100644 --- a/arch/x86/realmode/init.c +++ b/arch/x86/realmode/init.c @@ -2,7 +2,6 @@ #include #include #include -#include #include #include @@ -48,7 +47,7 @@ static void sme_sev_setup_real_mode(struct trampoline_header *th) if (cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) th->flags |= TH_FLAGS_SME_ACTIVE; - if (sev_es_active()) { + if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) { /* * Skip the call to verify_cpu() in secondary_startup_64 as it * will cause #VC exceptions when the AP can't handle them yet. -- 2.33.0
[PATCH v3 6/8] x86/sev: Replace occurrences of sev_active() with cc_platform_has()
Replace uses of sev_active() with the more generic cc_platform_has() using CC_ATTR_GUEST_MEM_ENCRYPT. If future support is added for other memory encryption technologies, the use of CC_ATTR_GUEST_MEM_ENCRYPT can be updated, as required. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Ard Biesheuvel Signed-off-by: Tom Lendacky --- arch/x86/include/asm/mem_encrypt.h | 2 -- arch/x86/kernel/crash_dump_64.c| 4 +++- arch/x86/kernel/kvm.c | 3 ++- arch/x86/kernel/kvmclock.c | 4 ++-- arch/x86/kernel/machine_kexec_64.c | 4 ++-- arch/x86/kvm/svm/svm.c | 3 ++- arch/x86/mm/ioremap.c | 6 +++--- arch/x86/mm/mem_encrypt.c | 25 ++--- arch/x86/platform/efi/efi_64.c | 9 + 9 files changed, 29 insertions(+), 31 deletions(-) diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 8c4f0dfe63f9..f440eebeeb2c 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void); void __init mem_encrypt_init(void); void __init sev_es_init_vc_handling(void); -bool sev_active(void); bool sev_es_active(void); bool amd_cc_platform_has(enum cc_attr attr); @@ -76,7 +75,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct boot_params *bp) { } static inline void sev_es_init_vc_handling(void) { } -static inline bool sev_active(void) { return false; } static inline bool sev_es_active(void) { return false; } static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; } diff --git a/arch/x86/kernel/crash_dump_64.c b/arch/x86/kernel/crash_dump_64.c index 045e82e8945b..a7f617a3981d 100644 --- a/arch/x86/kernel/crash_dump_64.c +++ b/arch/x86/kernel/crash_dump_64.c @@ -10,6 +10,7 @@ #include #include #include +#include static ssize_t __copy_oldmem_page(unsigned long pfn, char *buf, size_t csize, unsigned long offset, int userbuf, @@ -73,5 +74,6 @@ ssize_t copy_oldmem_page_encrypted(unsigned long pfn, char *buf, size_t csize, ssize_t elfcorehdr_read(char *buf, size_t count, u64 *ppos) { - return read_from_oldmem(buf, count, ppos, 0, sev_active()); + return read_from_oldmem(buf, count, ppos, 0, + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)); } diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index a26643dc6bd6..509a578f56a0 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include @@ -418,7 +419,7 @@ static void __init sev_map_percpu_data(void) { int cpu; - if (!sev_active()) + if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) return; for_each_possible_cpu(cpu) { diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index ad273e5861c1..fc3930c5db1b 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -16,9 +16,9 @@ #include #include #include +#include #include -#include #include #include @@ -232,7 +232,7 @@ static void __init kvmclock_init_mem(void) * hvclock is shared between the guest and the hypervisor, must * be mapped decrypted. 
*/ - if (sev_active()) { + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) { r = set_memory_decrypted((unsigned long) hvclock_mem, 1UL << order); if (r) { diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 7040c0fa921c..f5da4a18070a 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -167,7 +167,7 @@ static int init_transition_pgtable(struct kimage *image, pgd_t *pgd) } pte = pte_offset_kernel(pmd, vaddr); - if (sev_active()) + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) prot = PAGE_KERNEL_EXEC; set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot)); @@ -207,7 +207,7 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable) level4p = (pgd_t *)__va(start_pgtable); clear_page(level4p); - if (sev_active()) { + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) { info.page_flag |= _PAGE_ENC; info.kernpg_flag |= _PAGE_ENC; } diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 69639f9624f5..eb3669154b48 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include @@ -457,7 +458,7 @@ static int has_svm(void) return 0; } - if (sev_active()) { + if (cc_platf
[PATCH v3 5/8] x86/sme: Replace occurrences of sme_active() with cc_platform_has()
Replace uses of sme_active() with the more generic cc_platform_has() using CC_ATTR_HOST_MEM_ENCRYPT. If future support is added for other memory encryption technologies, the use of CC_ATTR_HOST_MEM_ENCRYPT can be updated, as required. This also replaces two usages of sev_active() that are really geared towards detecting if SME is active. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Cc: Joerg Roedel Cc: Will Deacon Signed-off-by: Tom Lendacky --- arch/x86/include/asm/kexec.h | 2 +- arch/x86/include/asm/mem_encrypt.h | 2 -- arch/x86/kernel/machine_kexec_64.c | 15 --- arch/x86/kernel/pci-swiotlb.c| 9 - arch/x86/kernel/relocate_kernel_64.S | 2 +- arch/x86/mm/ioremap.c| 6 +++--- arch/x86/mm/mem_encrypt.c| 15 +-- arch/x86/mm/mem_encrypt_identity.c | 3 ++- arch/x86/realmode/init.c | 5 +++-- drivers/iommu/amd/init.c | 7 --- 10 files changed, 31 insertions(+), 35 deletions(-) diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h index 0a6e34b07017..11b7c06e2828 100644 --- a/arch/x86/include/asm/kexec.h +++ b/arch/x86/include/asm/kexec.h @@ -129,7 +129,7 @@ relocate_kernel(unsigned long indirection_page, unsigned long page_list, unsigned long start_address, unsigned int preserve_context, - unsigned int sme_active); + unsigned int host_mem_enc_active); #endif #define ARCH_HAS_KIMAGE_ARCH diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 3d8a5e8b2e3f..8c4f0dfe63f9 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -51,7 +51,6 @@ void __init mem_encrypt_free_decrypted_mem(void); void __init mem_encrypt_init(void); void __init sev_es_init_vc_handling(void); -bool sme_active(void); bool sev_active(void); bool sev_es_active(void); bool amd_cc_platform_has(enum cc_attr attr); @@ -77,7 +76,6 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { } static inline void __init sme_enable(struct boot_params *bp) { } static inline void sev_es_init_vc_handling(void) { } -static inline bool sme_active(void) { return false; } static inline bool sev_active(void) { return false; } static inline bool sev_es_active(void) { return false; } static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; } diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c index 131f30fdcfbd..7040c0fa921c 100644 --- a/arch/x86/kernel/machine_kexec_64.c +++ b/arch/x86/kernel/machine_kexec_64.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -358,7 +359,7 @@ void machine_kexec(struct kimage *image) (unsigned long)page_list, image->start, image->preserve_context, - sme_active()); + cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)); #ifdef CONFIG_KEXEC_JUMP if (image->preserve_context) @@ -569,12 +570,12 @@ void arch_kexec_unprotect_crashkres(void) */ int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp) { - if (sev_active()) + if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) return 0; /* -* If SME is active we need to be sure that kexec pages are -* not encrypted because when we boot to the new kernel the +* If host memory encryption is active we need to be sure that kexec +* pages are not encrypted because when we boot to the new kernel the * pages won't be accessed encrypted (initially). 
*/ return set_memory_decrypted((unsigned long)vaddr, pages); @@ -582,12 +583,12 @@ int arch_kexec_post_alloc_pages(void *vaddr, unsigned int pages, gfp_t gfp) void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) { - if (sev_active()) + if (!cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)) return; /* -* If SME is active we need to reset the pages back to being -* an encrypted mapping before freeing them. +* If host memory encryption is active we need to reset the pages back +* to being an encrypted mapping before freeing them. */ set_memory_encrypted((unsigned long)vaddr, pages); } diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c index c2cfa5e7c152..814ab46a0dad 100644 --- a/arch/x86/kernel/pci-swiotlb.c +++ b/arch/x86/kernel/pci-swiotlb.c @@ -6,7 +6,7 @@ #include #include #include -#include +#include #include #include @@ -45,11 +45,10 @@ int __init pci_swiotlb_detect_4gb(void) swiotlb = 1; /* -*
[PATCH v3 4/8] powerpc/pseries/svm: Add a powerpc version of cc_platform_has()
Introduce a powerpc version of the cc_platform_has() function. This will be used to replace the powerpc mem_encrypt_active() implementation, so the implementation will initially only support the CC_ATTR_MEM_ENCRYPT attribute. Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Signed-off-by: Tom Lendacky --- arch/powerpc/platforms/pseries/Kconfig | 1 + arch/powerpc/platforms/pseries/Makefile | 2 ++ arch/powerpc/platforms/pseries/cc_platform.c | 26 3 files changed, 29 insertions(+) create mode 100644 arch/powerpc/platforms/pseries/cc_platform.c diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 5e037df2a3a1..2e57391e0778 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -159,6 +159,7 @@ config PPC_SVM select SWIOTLB select ARCH_HAS_MEM_ENCRYPT select ARCH_HAS_FORCE_DMA_UNENCRYPTED + select ARCH_HAS_CC_PLATFORM help There are certain POWER platforms which support secure guests using the Protected Execution Facility, with the help of an Ultravisor diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile index 4cda0ef87be0..41d8aee98da4 100644 --- a/arch/powerpc/platforms/pseries/Makefile +++ b/arch/powerpc/platforms/pseries/Makefile @@ -31,3 +31,5 @@ obj-$(CONFIG_FA_DUMP) += rtas-fadump.o obj-$(CONFIG_SUSPEND) += suspend.o obj-$(CONFIG_PPC_VAS) += vas.o + +obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o diff --git a/arch/powerpc/platforms/pseries/cc_platform.c b/arch/powerpc/platforms/pseries/cc_platform.c new file mode 100644 index ..e8021af83a19 --- /dev/null +++ b/arch/powerpc/platforms/pseries/cc_platform.c @@ -0,0 +1,26 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Confidential Computing Platform Capability checks + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. + * + * Author: Tom Lendacky + */ + +#include +#include + +#include +#include + +bool cc_platform_has(enum cc_attr attr) +{ + switch (attr) { + case CC_ATTR_MEM_ENCRYPT: + return is_secure_guest(); + + default: + return false; + } +} +EXPORT_SYMBOL_GPL(cc_platform_has); -- 2.33.0
[PATCH v3 3/8] x86/sev: Add an x86 version of cc_platform_has()
Introduce an x86 version of the cc_platform_has() function. This will be used to replace vendor specific calls like sme_active(), sev_active(), etc. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Co-developed-by: Andi Kleen Signed-off-by: Andi Kleen Co-developed-by: Kuppuswamy Sathyanarayanan Signed-off-by: Kuppuswamy Sathyanarayanan Signed-off-by: Tom Lendacky --- arch/x86/Kconfig | 1 + arch/x86/include/asm/mem_encrypt.h | 3 +++ arch/x86/kernel/Makefile | 3 +++ arch/x86/kernel/cc_platform.c | 21 + arch/x86/mm/mem_encrypt.c | 21 + 5 files changed, 49 insertions(+) create mode 100644 arch/x86/kernel/cc_platform.c diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 4e001425..2b2a9639d8ae 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1513,6 +1513,7 @@ config AMD_MEM_ENCRYPT select ARCH_HAS_FORCE_DMA_UNENCRYPTED select INSTRUCTION_DECODER select ARCH_HAS_RESTRICTED_VIRTIO_MEMORY_ACCESS + select ARCH_HAS_CC_PLATFORM help Say yes to enable support for the encryption of system memory. This requires an AMD processor that supports Secure Memory diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 9c80c68d75b5..3d8a5e8b2e3f 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -13,6 +13,7 @@ #ifndef __ASSEMBLY__ #include +#include #include @@ -53,6 +54,7 @@ void __init sev_es_init_vc_handling(void); bool sme_active(void); bool sev_active(void); bool sev_es_active(void); +bool amd_cc_platform_has(enum cc_attr attr); #define __bss_decrypted __section(".bss..decrypted") @@ -78,6 +80,7 @@ static inline void sev_es_init_vc_handling(void) { } static inline bool sme_active(void) { return false; } static inline bool sev_active(void) { return false; } static inline bool sev_es_active(void) { return false; } +static inline bool amd_cc_platform_has(enum cc_attr attr) { return false; } static inline int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; } diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 8f4e8fa6ed75..f91403a78594 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -147,6 +147,9 @@ obj-$(CONFIG_UNWINDER_FRAME_POINTER)+= unwind_frame.o obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o + +obj-$(CONFIG_ARCH_HAS_CC_PLATFORM) += cc_platform.o + ### # 64 bit specific files ifeq ($(CONFIG_X86_64),y) diff --git a/arch/x86/kernel/cc_platform.c b/arch/x86/kernel/cc_platform.c new file mode 100644 index ..3c9bacd3c3f3 --- /dev/null +++ b/arch/x86/kernel/cc_platform.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Confidential Computing Platform Capability checks + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. 
+ * + * Author: Tom Lendacky + */ + +#include +#include +#include + +bool cc_platform_has(enum cc_attr attr) +{ + if (sme_me_mask) + return amd_cc_platform_has(attr); + + return false; +} +EXPORT_SYMBOL_GPL(cc_platform_has); diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c index ff08dc463634..18fe19916bc3 100644 --- a/arch/x86/mm/mem_encrypt.c +++ b/arch/x86/mm/mem_encrypt.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include @@ -389,6 +390,26 @@ bool noinstr sev_es_active(void) return sev_status & MSR_AMD64_SEV_ES_ENABLED; } +bool amd_cc_platform_has(enum cc_attr attr) +{ + switch (attr) { + case CC_ATTR_MEM_ENCRYPT: + return sme_me_mask != 0; + + case CC_ATTR_HOST_MEM_ENCRYPT: + return sme_active(); + + case CC_ATTR_GUEST_MEM_ENCRYPT: + return sev_active(); + + case CC_ATTR_GUEST_STATE_ENCRYPT: + return sev_es_active(); + + default: + return false; + } +} + /* Override for DMA direct allocation check - ARCH_HAS_FORCE_DMA_UNENCRYPTED */ bool force_dma_unencrypted(struct device *dev) { -- 2.33.0
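The dispatch above is also the natural extension point if another x86 confidential computing technology selects ARCH_HAS_CC_PLATFORM later. Purely as an illustration (intel_cc_platform_has() and the TDX detection check below are assumptions, not part of this series), the function could grow a second vendor hook:

#include <linux/cc_platform.h>
#include <asm/cpufeature.h>
#include <asm/mem_encrypt.h>

/* Hypothetical future shape of the x86 dispatcher; for illustration only. */
bool cc_platform_has(enum cc_attr attr)
{
	if (sme_me_mask)
		return amd_cc_platform_has(attr);

	if (cpu_feature_enabled(X86_FEATURE_TDX_GUEST))	/* assumed feature flag */
		return intel_cc_platform_has(attr);	/* assumed vendor hook */

	return false;
}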
[PATCH v3 2/8] mm: Introduce a function to check for confidential computing features
In prep for other confidential computing technologies, introduce a generic helper function, cc_platform_has(), that can be used to check for specific active confidential computing attributes, like memory encryption. This is intended to eliminate having to add multiple technology-specific checks to the code (e.g. if (sev_active() || tdx_active())). Co-developed-by: Andi Kleen Signed-off-by: Andi Kleen Co-developed-by: Kuppuswamy Sathyanarayanan Signed-off-by: Kuppuswamy Sathyanarayanan Signed-off-by: Tom Lendacky --- arch/Kconfig| 3 ++ include/linux/cc_platform.h | 88 + 2 files changed, 91 insertions(+) create mode 100644 include/linux/cc_platform.h diff --git a/arch/Kconfig b/arch/Kconfig index 3743174da870..ca7c359e5da8 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -1234,6 +1234,9 @@ config RELR config ARCH_HAS_MEM_ENCRYPT bool +config ARCH_HAS_CC_PLATFORM + bool + config HAVE_SPARSE_SYSCALL_NR bool help diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h new file mode 100644 index ..253f3ea66cd8 --- /dev/null +++ b/include/linux/cc_platform.h @@ -0,0 +1,88 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Confidential Computing Platform Capability checks + * + * Copyright (C) 2021 Advanced Micro Devices, Inc. + * + * Author: Tom Lendacky + */ + +#ifndef _CC_PLATFORM_H +#define _CC_PLATFORM_H + +#include +#include + +/** + * enum cc_attr - Confidential computing attributes + * + * These attributes represent confidential computing features that are + * currently active. + */ +enum cc_attr { + /** +* @CC_ATTR_MEM_ENCRYPT: Memory encryption is active +* +* The platform/OS is running with active memory encryption. This +* includes running either as a bare-metal system or a hypervisor +* and actively using memory encryption or as a guest/virtual machine +* and actively using memory encryption. +* +* Examples include SME, SEV and SEV-ES. +*/ + CC_ATTR_MEM_ENCRYPT, + + /** +* @CC_ATTR_HOST_MEM_ENCRYPT: Host memory encryption is active +* +* The platform/OS is running as a bare-metal system or a hypervisor +* and actively using memory encryption. +* +* Examples include SME. +*/ + CC_ATTR_HOST_MEM_ENCRYPT, + + /** +* @CC_ATTR_GUEST_MEM_ENCRYPT: Guest memory encryption is active +* +* The platform/OS is running as a guest/virtual machine and actively +* using memory encryption. +* +* Examples include SEV and SEV-ES. +*/ + CC_ATTR_GUEST_MEM_ENCRYPT, + + /** +* @CC_ATTR_GUEST_STATE_ENCRYPT: Guest state encryption is active +* +* The platform/OS is running as a guest/virtual machine and actively +* using memory encryption and register state encryption. +* +* Examples include SEV-ES. +*/ + CC_ATTR_GUEST_STATE_ENCRYPT, +}; + +#ifdef CONFIG_ARCH_HAS_CC_PLATFORM + +/** + * cc_platform_has() - Checks if the specified cc_attr attribute is active + * @attr: Confidential computing attribute to check + * + * The cc_platform_has() function will return an indicator as to whether the + * specified Confidential Computing attribute is currently active. + * + * Context: Any context + * Return: + * * TRUE - Specified Confidential Computing attribute is active + * * FALSE - Specified Confidential Computing attribute is not active + */ +bool cc_platform_has(enum cc_attr attr); + +#else /* !CONFIG_ARCH_HAS_CC_PLATFORM */ + +static inline bool cc_platform_has(enum cc_attr attr) { return false; } + +#endif /* CONFIG_ARCH_HAS_CC_PLATFORM */ + +#endif /* _CC_PLATFORM_H */ -- 2.33.0
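As a usage illustration, mirroring the call-site conversions later in the series (this is not code from any particular driver, and the helper name is made up): code that needs to share a page with an unencrypted consumer only checks one attribute instead of open-coding per-technology tests.

#include <linux/cc_platform.h>
#include <linux/set_memory.h>

/* Hypothetical helper: map one page decrypted when guest memory encryption is active. */
static int example_share_page_with_host(void *vaddr)
{
	if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
		return set_memory_decrypted((unsigned long)vaddr, 1);

	return 0;
}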
[PATCH v3 1/8] x86/ioremap: Selectively build arch override encryption functions
In prep for other uses of the cc_platform_has() function besides AMD's memory encryption support, selectively build the AMD memory encryption architecture override functions only when CONFIG_AMD_MEM_ENCRYPT=y. These functions are: - early_memremap_pgprot_adjust() - arch_memremap_can_ram_remap() Additionally, routines that are only invoked by these architecture override functions can also be conditionally built. These functions are: - memremap_should_map_decrypted() - memremap_is_efi_data() - memremap_is_setup_data() - early_memremap_is_setup_data() And finally, phys_mem_access_encrypted() is conditionally built as well, but requires a static inline version of it when CONFIG_AMD_MEM_ENCRYPT is not set. Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: Andy Lutomirski Cc: Peter Zijlstra Signed-off-by: Tom Lendacky --- arch/x86/include/asm/io.h | 8 arch/x86/mm/ioremap.c | 2 +- 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h index 841a5d104afa..5c6a4af0b911 100644 --- a/arch/x86/include/asm/io.h +++ b/arch/x86/include/asm/io.h @@ -391,6 +391,7 @@ extern void arch_io_free_memtype_wc(resource_size_t start, resource_size_t size) #define arch_io_reserve_memtype_wc arch_io_reserve_memtype_wc #endif +#ifdef CONFIG_AMD_MEM_ENCRYPT extern bool arch_memremap_can_ram_remap(resource_size_t offset, unsigned long size, unsigned long flags); @@ -398,6 +399,13 @@ extern bool arch_memremap_can_ram_remap(resource_size_t offset, extern bool phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size); +#else +static inline bool phys_mem_access_encrypted(unsigned long phys_addr, +unsigned long size) +{ + return true; +} +#endif /** * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c index 60ade7dd71bd..ccff76cedd8f 100644 --- a/arch/x86/mm/ioremap.c +++ b/arch/x86/mm/ioremap.c @@ -508,6 +508,7 @@ void unxlate_dev_mem_ptr(phys_addr_t phys, void *addr) memunmap((void *)((unsigned long)addr & PAGE_MASK)); } +#ifdef CONFIG_AMD_MEM_ENCRYPT /* * Examine the physical address to determine if it is an area of memory * that should be mapped decrypted. If the memory is not part of the @@ -746,7 +747,6 @@ bool phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size) return arch_memremap_can_ram_remap(phys_addr, size, 0); } -#ifdef CONFIG_AMD_MEM_ENCRYPT /* Remap memory with encryption */ void __init *early_memremap_encrypted(resource_size_t phys_addr, unsigned long size) -- 2.33.0
[PATCH v3 0/8] Implement generic cc_platform_has() helper function
This patch series provides a generic helper function, cc_platform_has(), to replace the sme_active(), sev_active(), sev_es_active() and mem_encrypt_active() functions. It is expected that as new confidential computing technologies are added to the kernel, they can all be covered by a single function call instead of a collection of specific function calls all called from the same locations. The powerpc and s390 patches have been compile tested only. Can the folks copied on this series verify that nothing breaks for them? Also, a new file, arch/powerpc/platforms/pseries/cc_platform.c, has been created for powerpc to hold the out of line function. Cc: Andi Kleen Cc: Andy Lutomirski Cc: Ard Biesheuvel Cc: Baoquan He Cc: Benjamin Herrenschmidt Cc: Borislav Petkov Cc: Christian Borntraeger Cc: Daniel Vetter Cc: Dave Hansen Cc: Dave Young Cc: David Airlie Cc: Heiko Carstens Cc: Ingo Molnar Cc: Joerg Roedel Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Michael Ellerman Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Thomas Zimmermann Cc: Vasily Gorbik Cc: VMware Graphics Cc: Will Deacon Cc: Christoph Hellwig --- Patches based on: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master 4b93c544e90e ("thunderbolt: test: split up test cases in tb_test_credit_alloc_all") Changes since v2: - Changed the name from prot_guest_has() to cc_platform_has() - Took the cc_platform_has() function out of line. Created two new files, cc_platform.c, in both x86 and ppc to implement the function. As a result, also changed the attribute defines into enums. - Removed any received Reviewed-by's and Acked-by's given changes in this version. - Added removal of new instances of mem_encrypt_active() usage in powerpc arch. - Based on latest Linux tree to pick up powerpc changes related to the mem_encrypt_active() function. Changes since v1: - Moved some arch ioremap functions within #ifdef CONFIG_AMD_MEM_ENCRYPT in prep for use of prot_guest_has() by TDX. - Added type includes to the protected_guest.h header file to prevent build errors outside of x86. - Made amd_prot_guest_has() EXPORT_SYMBOL_GPL - Used amd_prot_guest_has() in place of checking sme_me_mask in the arch/x86/mm/mem_encrypt.c file.
Tom Lendacky (8): x86/ioremap: Selectively build arch override encryption functions mm: Introduce a function to check for confidential computing features x86/sev: Add an x86 version of cc_platform_has() powerpc/pseries/svm: Add a powerpc version of cc_platform_has() x86/sme: Replace occurrences of sme_active() with cc_platform_has() x86/sev: Replace occurrences of sev_active() with cc_platform_has() x86/sev: Replace occurrences of sev_es_active() with cc_platform_has() treewide: Replace the use of mem_encrypt_active() with cc_platform_has() arch/Kconfig | 3 + arch/powerpc/include/asm/mem_encrypt.h | 5 -- arch/powerpc/platforms/pseries/Kconfig | 1 + arch/powerpc/platforms/pseries/Makefile | 2 + arch/powerpc/platforms/pseries/cc_platform.c | 26 ++ arch/powerpc/platforms/pseries/svm.c | 5 +- arch/s390/include/asm/mem_encrypt.h | 2 - arch/x86/Kconfig | 1 + arch/x86/include/asm/io.h| 8 ++ arch/x86/include/asm/kexec.h | 2 +- arch/x86/include/asm/mem_encrypt.h | 14 +--- arch/x86/kernel/Makefile | 3 + arch/x86/kernel/cc_platform.c| 21 + arch/x86/kernel/crash_dump_64.c | 4 +- arch/x86/kernel/head64.c | 4 +- arch/x86/kernel/kvm.c| 3 +- arch/x86/kernel/kvmclock.c | 4 +- arch/x86/kernel/machine_kexec_64.c | 19 +++-- arch/x86/kernel/pci-swiotlb.c| 9 +- arch/x86/kernel/relocate_kernel_64.S | 2 +- arch/x86/kernel/sev.c| 6 +- arch/x86/kvm/svm/svm.c | 3 +- arch/x86/mm/ioremap.c| 18 ++-- arch/x86/mm/mem_encrypt.c| 57 +++-- arch/x86/mm/mem_encrypt_identity.c | 3 +- arch/x86/mm/pat/set_memory.c | 3 +- arch/x86/platform/efi/efi_64.c | 9 +- arch/x86/realmode/init.c | 8 +- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 4 +- drivers/gpu/drm/drm_cache.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 4 +- drivers/gpu/drm/vmwgfx/vmwgfx_msg.c | 6 +- drivers/iommu/amd/init.c | 7 +- drivers/iommu/amd/iommu.c| 3 +- drivers/iommu/amd/iommu_v2.c | 3 +- drivers/iommu/iommu.c| 3 +- fs/proc/vmcore.c | 6 +- include/linux/cc_platform.h | 88 include/linux/mem_encrypt.h | 4 -
Re: [PATCH v3 06/16] ARM: configs: Everyone who had PANEL_SIMPLE now gets PANEL_SIMPLE_EDP
Hi, On Fri, Sep 3, 2021 at 1:38 PM Stephen Boyd wrote: > > Quoting Doug Anderson (2021-09-01 16:10:15) > > Hi, > > > > On Wed, Sep 1, 2021 at 2:12 PM Olof Johansson wrote: > > > > > > On Wed, Sep 1, 2021 at 1:20 PM Douglas Anderson > > > wrote: > > > > > > > > In the patch ("drm/panel-simple-edp: Split eDP panels out of > > > > panel-simple") we split the PANEL_SIMPLE driver in 2. By default let's > > > > give everyone who had the old driver enabled the new driver too. If > > > > folks want to opt-out of one or the other they always can later. > > > > > > > > Signed-off-by: Douglas Anderson > > > > > > Isn't this a case where the new option should just have had the old > > > option as the default value to avoid this kind of churn and possibly > > > broken platforms? > > > > I'm happy to go either way. I guess I didn't do that originally > > because logically there's not any reason to link the two drivers going > > forward. Said another way, someone enabling the "simple panel" driver > > for non-eDP panels wouldn't expect that the "simple panel" driver for > > DP panels would also get enabled by default. They really have nothing > > to do with one another. Enabling by default for something like this > > also seems like it would lead to bloat. I could have sworn that > > periodically people get yelled at for marking drivers on by default > > when it doesn't make sense. > > > > ...that being said, I'm happy to change the default as you suggest. > > Just let me know. > > Having the default will help olddefconfig users seamlessly migrate to > the new Kconfig. Sadly they don't notice that they should probably > disable the previous Kconfig symbol, but oh well. At least with the > default they don't go on a hunt/bisect to figure out that some Kconfig > needed to be enabled now that they're using a new kernel version. > > Maybe the default should have a TODO comment next to it indicating we > should remove the default in a year or two. OK, so I'm trying to figure out how to do this without just "kicking the can" down the road. I guess your idea is that for the next year this will be the default and that anyone who really wants "CONFIG_DRM_PANEL_EDP" will "opt-in" to keep it by adding "CONFIG_DRM_PANEL_EDP=y" to their config? ...and then after a year passes we remove the default? ...but that won't work, will it? Since "CONFIG_DRM_PANEL_EDP" will be the default for the next year then you really can't add it to the "defconfig", at least if you ever "normalize" it. The "defconfig" by definition has everything stripped from it that's already the "default", so for the next year anyone who tries to opt-in will get their preference stripped. Hrm, so let me explain options as I see them. Maybe someone can point out something that I missed. I'll assume that we'll change the config option from CONFIG_DRM_PANEL_SIMPLE_EDP to CONFIG_DRM_PANEL_EDP (remove the "SIMPLE" part). == Where we were before my series: * One config "CONFIG_DRM_PANEL_SIMPLE" and it enables simple non-eDP and eDP drivers. == Option 1: update everyone's configs (this patch) * Keep old config "CONFIG_DRM_PANEL_SIMPLE" but it now only means enable the panel-simple (non-eDP) driver. * Anyone who wants eDP panels must opt-in to "CONFIG_DRM_PANEL_EDP" * Update all configs in mainline; any out-of mainline configs must figure this out themselves. Pros: * no long term baggage Cons: * patch upstream is a bit of "churn" * anyone with downstream config will have to figure out what happened. 
== Option 2: kick the can down the road + accept cruft * Keep old config "CONFIG_DRM_PANEL_SIMPLE" and it means enable the panel-simple (non-eDP) driver. * Anyone with "CONFIG_DRM_PANEL_SIMPLE" is opted in by default to "CONFIG_DRM_PANEL_EDP" AKA: config DRM_PANEL_EDP default DRM_PANEL_SIMPLE Pros: * no config patches needed upstream at all and everything just works! Cons: * people are opted in to extra cruft by default and need to know to turn it off. * unclear if we can change the default without the same problems. == Option 3: try to be clever * Add _two_ new configs. CONFIG_DRM_PANEL_SIMPLE_V2 and CONFIG_DRM_PANEL_EDP. * Old config "CONFIG_DRM_PANEL_SIMPLE" gets marked as "deprecated". * Both new configs have "default CONFIG_DRM_PANEL_SIMPLE" Now anyone old will magically get both the new config options by default. Anyone looking at this in the future _won't_ set the deprecated CONFIG_DRM_PANEL_SIMPLE but will instead choose if they want either the eDP or "simple" driver. Pros: * No long term baggage. * Everyone is transitioned automatically by default with no cruft patches. Cons: * I can't think of a better name than "CONFIG_DRM_PANEL_SIMPLE_V2" and that name is ugly. == Option 4: shave a yak When thinking about this I came up with a clever idea of stashing the kernel version in a defconfig when it's generated. Then you could do something like: config DRM_PANEL_EDP default DRM_PANEL_SIMPLE if DEFCONFIG_GENERATED_AT <= 0x00050f00 That feels
Re: [PATCH 1/2] drm/nouveau/ga102-: support ttm buffer moves via copy engine
On Thu, 9 Sept 2021 at 04:19, Daniel Vetter wrote: > > On Mon, Sep 06, 2021 at 10:56:27AM +1000, Ben Skeggs wrote: > > From: Ben Skeggs > > > > We don't currently have any kind of real acceleration on Ampere GPUs, > > but the TTM memcpy() fallback paths aren't really designed to handle > > copies between different devices, such as on Optimus systems, and > > result in a kernel OOPS. > > Is this just for moving a buffer from vram to system memory when you pin > it for dma-buf? I'm kinda lost what you even use ttm bo moves for if > there's no one using the gpu. It occurs when we attempt to move the buffer into vram for scanout, through the modeset paths. > > Also I guess memcpy goes boom if you can't mmap it because it's outside > the gart? Or just that it's very slow. We're trying to use ttm memcpy as > fallback, so want to know how this can all go wrong :-) Neither ttm_kmap_iter_linear_io_init() nor ttm_kmap_iter_tt_init() are able to work with the imported dma-buf object, which can obviously be fixed. But. I then attempted to hack that up with a custom memcpy() for that situation to test it, using dma_buf_vmap(), and get stuck forever inside i915 waiting for the gem object lock. Ben. > -Daniel > > > > > A few options were investigated to try and fix this, but didn't work > > out, and likely would have resulted in a very unpleasant experience > > for users anyway. > > > > This commit adds just enough support for setting up a single channel > > connected to a copy engine, which the kernel can use to accelerate > > the buffer copies between devices. Userspace has no access to this > > incomplete channel support, but it's suitable for TTM's needs. > > > > A more complete implementation of host(fifo) for Ampere GPUs is in > > the works, but the required changes are so invasive that they > > would be unsuitable to backport to fix this issue on current kernels.
> > > > Signed-off-by: Ben Skeggs > > Cc: Lyude Paul > > Cc: Karol Herbst > > Cc: # v5.12+ > > --- > > drivers/gpu/drm/nouveau/include/nvif/class.h | 2 + > > .../drm/nouveau/include/nvkm/engine/fifo.h| 1 + > > drivers/gpu/drm/nouveau/nouveau_bo.c | 1 + > > drivers/gpu/drm/nouveau/nouveau_chan.c| 6 +- > > drivers/gpu/drm/nouveau/nouveau_drm.c | 4 + > > drivers/gpu/drm/nouveau/nv84_fence.c | 2 +- > > .../gpu/drm/nouveau/nvkm/engine/device/base.c | 3 + > > .../gpu/drm/nouveau/nvkm/engine/fifo/Kbuild | 1 + > > .../gpu/drm/nouveau/nvkm/engine/fifo/ga102.c | 308 ++ > > .../gpu/drm/nouveau/nvkm/subdev/top/ga100.c | 7 +- > > 10 files changed, 329 insertions(+), 6 deletions(-) > > create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fifo/ga102.c > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvif/class.h > > b/drivers/gpu/drm/nouveau/include/nvif/class.h > > index c68cc957248e..a582c0cb0cb0 100644 > > --- a/drivers/gpu/drm/nouveau/include/nvif/class.h > > +++ b/drivers/gpu/drm/nouveau/include/nvif/class.h > > @@ -71,6 +71,7 @@ > > #define PASCAL_CHANNEL_GPFIFO_A /* cla06f.h */ > > 0xc06f > > #define VOLTA_CHANNEL_GPFIFO_A/* clc36f.h */ > > 0xc36f > > #define TURING_CHANNEL_GPFIFO_A /* clc36f.h */ > > 0xc46f > > +#define AMPERE_CHANNEL_GPFIFO_B /* clc36f.h */ > > 0xc76f > > > > #define NV50_DISP /* cl5070.h */ > > 0x5070 > > #define G82_DISP /* cl5070.h */ > > 0x8270 > > @@ -200,6 +201,7 @@ > > #define PASCAL_DMA_COPY_B > > 0xc1b5 > > #define VOLTA_DMA_COPY_A > > 0xc3b5 > > #define TURING_DMA_COPY_A > > 0xc5b5 > > +#define AMPERE_DMA_COPY_B > > 0xc7b5 > > > > #define FERMI_DECOMPRESS > > 0x90b8 > > > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > > b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > > index 54fab7cc36c1..64ee82c7c1be 100644 > > --- a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > > +++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > > @@ -77,4 +77,5 @@ int gp100_fifo_new(struct nvkm_device *, enum > > nvkm_subdev_type, int inst, struct > > int gp10b_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > > struct nvkm_fifo **); > > int gv100_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > > struct nvkm_fifo **); > > int tu102_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > > struct nvkm_fifo **); > > +int ga102_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > > struct nvkm_fifo **); > > #endif > > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c > > b/drivers/gpu/drm/nouveau/nouveau_bo.c >
Re: [PATCH 2/2] drm/bridge: parade-ps8640: Add support for AUX channel
Quoting Philip Chen (2021-09-08 11:18:06) > diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c > b/drivers/gpu/drm/bridge/parade-ps8640.c > index a16725dbf912..3f0241a60357 100644 > --- a/drivers/gpu/drm/bridge/parade-ps8640.c > +++ b/drivers/gpu/drm/bridge/parade-ps8640.c > @@ -93,6 +115,102 @@ static inline struct ps8640 *bridge_to_ps8640(struct > drm_bridge *e) > return container_of(e, struct ps8640, bridge); > } > > +static inline struct ps8640 *aux_to_ps8640(struct drm_dp_aux *aux) > +{ > + return container_of(aux, struct ps8640, aux); > +} > + > +static ssize_t ps8640_aux_transfer(struct drm_dp_aux *aux, > + struct drm_dp_aux_msg *msg) > +{ > + struct ps8640 *ps_bridge = aux_to_ps8640(aux); > + struct i2c_client *client = ps_bridge->page[PAGE0_DP_CNTL]; > + struct regmap *map = ps_bridge->regmap[PAGE0_DP_CNTL]; > + unsigned int len = msg->size; > + unsigned int data; > + int ret; > + u8 request = msg->request & > +~(DP_AUX_I2C_MOT | DP_AUX_I2C_WRITE_STATUS_UPDATE); > + u8 *buf = msg->buffer; > + bool is_native_aux = false; > + > + if (len > DP_AUX_MAX_PAYLOAD_BYTES) > + return -EINVAL; > + > + pm_runtime_get_sync(&client->dev); Is this driver using runtime PM? Probably can't add this until it is actually runtime PM enabled. > + > + switch (request) { > + case DP_AUX_NATIVE_WRITE: > + case DP_AUX_NATIVE_READ: > + is_native_aux = true; > + case DP_AUX_I2C_WRITE: > + case DP_AUX_I2C_READ: > + regmap_write(map, PAGE0_AUXCH_CFG3, AUXCH_CFG3_RESET); > + break; > + default: > + ret = -EINVAL; > + goto exit; > + } > + > + /* Assume it's good */ > + msg->reply = 0; > + > + data = ((request << 4) & AUX_CMD_MASK) | > + ((msg->address >> 16) & AUX_ADDR_19_16_MASK); > + regmap_write(map, PAGE0_AUX_ADDR_23_16, data); > + data = (msg->address >> 8) & 0xff; > + regmap_write(map, PAGE0_AUX_ADDR_15_8, data); > + data = msg->address & 0xff; > + regmap_write(map, PAGE0_AUX_ADDR_7_0, msg->address & 0xff); Can we pack this into a three byte buffer and write it in one regmap_bulk_write()? That would be nice because it looks like the addresses are all next to each other in the i2c address space. > + > + data = (len - 1) & AUX_LENGTH_MASK; > + regmap_write(map, PAGE0_AUX_LENGTH, data); > + > + if (request == DP_AUX_NATIVE_WRITE || request == DP_AUX_I2C_WRITE) { > + ret = regmap_noinc_write(map, PAGE0_AUX_WDATA, buf, len); > + if (ret < 0) { > + DRM_ERROR("failed to write PAGE0_AUX_WDATA"); Needs a newline. > + goto exit; > + } > + } > + > + regmap_write(map, PAGE0_AUX_CTRL, AUX_START); > + > + regmap_read(map, PAGE0_AUX_STATUS, &data); > + switch (data & AUX_STATUS_MASK) { > + case AUX_STATUS_DEFER: > + if (is_native_aux) > + msg->reply |= DP_AUX_NATIVE_REPLY_DEFER; > + else > + msg->reply |= DP_AUX_I2C_REPLY_DEFER; > + goto exit; > + case AUX_STATUS_NACK: > + if (is_native_aux) > + msg->reply |= DP_AUX_NATIVE_REPLY_NACK; > + else > + msg->reply |= DP_AUX_I2C_REPLY_NACK; > + goto exit; > + case AUX_STATUS_TIMEOUT: > + ret = -ETIMEDOUT; > + goto exit; > + } > + > + if (request == DP_AUX_NATIVE_READ || request == DP_AUX_I2C_READ) { > + ret = regmap_noinc_read(map, PAGE0_AUX_RDATA, buf, len); > + if (ret < 0) > + DRM_ERROR("failed to read PAGE0_AUX_RDATA"); Needs a newline. > + } > + > +exit: > + pm_runtime_mark_last_busy(&client->dev); > + pm_runtime_put_autosuspend(&client->dev); > + > + if (ret) > + return ret; > + > + return len; > +} > + > static int ps8640_bridge_vdo_control(struct ps8640 *ps_bridge, > const enum ps8640_vdo_control ctrl) > {
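For reference, a minimal sketch of the single bulk write suggested above, assuming PAGE0_AUX_ADDR_23_16, PAGE0_AUX_ADDR_15_8 and PAGE0_AUX_ADDR_7_0 really are consecutive registers; the names and masks come from the patch itself, and this is illustrative only, not tested against hardware:

    /*
     * Hypothetical replacement for the three regmap_write() calls in
     * ps8640_aux_transfer(): pack the command nibble and the 20-bit AUX
     * address into one buffer and write the consecutive registers with a
     * single bulk transfer.
     */
    u8 addr_buf[3];

    addr_buf[0] = ((request << 4) & AUX_CMD_MASK) |
                  ((msg->address >> 16) & AUX_ADDR_19_16_MASK);
    addr_buf[1] = (msg->address >> 8) & 0xff;
    addr_buf[2] = msg->address & 0xff;

    ret = regmap_bulk_write(map, PAGE0_AUX_ADDR_23_16, addr_buf,
                            ARRAY_SIZE(addr_buf));
    if (ret < 0)
        goto exit;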
Re: [PATCH 1/2] drm/bridge: parade-ps8640: Use regmap APIs
Quoting Philip Chen (2021-09-08 11:18:05) > diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c > b/drivers/gpu/drm/bridge/parade-ps8640.c > index 685e9c38b2db..a16725dbf912 100644 > --- a/drivers/gpu/drm/bridge/parade-ps8640.c > +++ b/drivers/gpu/drm/bridge/parade-ps8640.c > @@ -64,12 +65,29 @@ struct ps8640 { > struct drm_bridge *panel_bridge; > struct mipi_dsi_device *dsi; > struct i2c_client *page[MAX_DEVS]; > + struct regmap *regmap[MAX_DEVS]; > struct regulator_bulk_data supplies[2]; > struct gpio_desc *gpio_reset; > struct gpio_desc *gpio_powerdown; > bool powered; > }; > > +static const struct regmap_range ps8640_volatile_ranges[] = { > + { .range_min = 0, .range_max = 0xff }, Is the plan to fill this out later or is 0xff the max register? If it's the latter then I think adding the max register to regmap_config is simpler. > +}; > + > +static const struct regmap_access_table ps8640_volatile_table = { > + .yes_ranges = ps8640_volatile_ranges, > + .n_yes_ranges = ARRAY_SIZE(ps8640_volatile_ranges), > +}; > + > +static const struct regmap_config ps8640_regmap_config = { > + .reg_bits = 8, > + .val_bits = 8, > + .volatile_table = &ps8640_volatile_table, > + .cache_type = REGCACHE_NONE, > +}; > + > static inline struct ps8640 *bridge_to_ps8640(struct drm_bridge *e) > { > return container_of(e, struct ps8640, bridge); > @@ -78,13 +96,13 @@ static inline struct ps8640 *bridge_to_ps8640(struct > drm_bridge *e) > static int ps8640_bridge_vdo_control(struct ps8640 *ps_bridge, > const enum ps8640_vdo_control ctrl) > { > - struct i2c_client *client = ps_bridge->page[PAGE3_DSI_CNTL1]; > - u8 vdo_ctrl_buf[] = { VDO_CTL_ADD, ctrl }; > + struct regmap *map = ps_bridge->regmap[PAGE3_DSI_CNTL1]; > + u8 vdo_ctrl_buf[] = {VDO_CTL_ADD, ctrl}; Nitpick: Add a space after { and before }. > int ret; > > - ret = i2c_smbus_write_i2c_block_data(client, PAGE3_SET_ADD, > -sizeof(vdo_ctrl_buf), > -vdo_ctrl_buf); > + ret = regmap_bulk_write(map, PAGE3_SET_ADD, > + vdo_ctrl_buf, sizeof(vdo_ctrl_buf)); > + > if (ret < 0) { > DRM_ERROR("failed to %sable VDO: %d\n", > ctrl == ENABLE ? "en" : "dis", ret);
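As a point of comparison, a sketch of the simpler shape hinted at above if 0xff really is the last register: drop the volatile table and bound the map with max_register instead. Purely illustrative; whether that still matches the device's caching needs is for the driver author to judge:

    static const struct regmap_config ps8640_regmap_config = {
        .reg_bits = 8,
        .val_bits = 8,
        .max_register = 0xff,
        .cache_type = REGCACHE_NONE,
    };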
Re: [PATCH] drm: mxsfb: Fix NULL pointer dereference crash on unload
On 9/8/21 8:24 PM, Daniel Vetter wrote: On Tue, Sep 07, 2021 at 04:49:00AM +0200, Marek Vasut wrote: The mxsfb->crtc.funcs may already be NULL when unloading the driver, in which case calling mxsfb_irq_disable() via drm_irq_uninstall() from mxsfb_unload() leads to NULL pointer dereference. Since all we care about is masking the IRQ and mxsfb->base is still valid, just use that to clear and mask the IRQ. Fixes: ae1ed00932819 ("drm: mxsfb: Stop using DRM simple display pipeline helper") Signed-off-by: Marek Vasut Cc: Daniel Abrecht Cc: Emil Velikov Cc: Laurent Pinchart Cc: Sam Ravnborg Cc: Stefan Agner You probably want a drm_atomic_helper_shutdown instead of trying to do all that manually. We've also added a bunch more devm and drmm_ functions to automate the cleanup a lot more here, e.g. your drm_mode_config_cleanup is in the wrong place. Also I'm confused because I'm not even seeing this function anywhere in upstream. It is still here: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpu/drm/mxsfb/mxsfb_drv.c#n171 as of: 999569d59a0aa ("Add linux-next specific files for 20210908") Is there some other tree I should be looking at ?
Re: [PATCH] drm/bridge: ti-sn65dsi83: Check link status register after enabling the bridge
W dniu 08.09.2021 o 13:11, Dave Stevenson pisze: > Hi Marek and Andrzej > > On Tue, 7 Sept 2021 at 22:24, Marek Vasut wrote: >> On 9/7/21 7:29 PM, Andrzej Hajda wrote: >>> W dniu 07.09.2021 o 16:25, Marek Vasut pisze: On 9/7/21 9:31 AM, Andrzej Hajda wrote: > On 07.09.2021 04:39, Marek Vasut wrote: >> In rare cases, the bridge may not start up correctly, which usually >> leads to no display output. In case this happens, warn about it in >> the kernel log. >> >> Signed-off-by: Marek Vasut >> Cc: Jagan Teki >> Cc: Laurent Pinchart >> Cc: Linus Walleij >> Cc: Robert Foss >> Cc: Sam Ravnborg >> Cc: dri-devel@lists.freedesktop.org >> --- >> NOTE: See the following: >> https://e2e.ti.com/support/interface-group/interface/f/interface-forum/942005/sn65dsi83-dsi83-lvds-bridge---sporadic-behavior---no-video >> >> https://community.nxp.com/t5/i-MX-Processors/i-MX8M-MIPI-DSI-Interface-LVDS-Bridge-Initialization/td-p/1156533 >> >> --- >> drivers/gpu/drm/bridge/ti-sn65dsi83.c | 5 + >> 1 file changed, 5 insertions(+) >> >> diff --git a/drivers/gpu/drm/bridge/ti-sn65dsi83.c >> b/drivers/gpu/drm/bridge/ti-sn65dsi83.c >> index a32f70bc68ea4..4ea71d7f0bfbc 100644 >> --- a/drivers/gpu/drm/bridge/ti-sn65dsi83.c >> +++ b/drivers/gpu/drm/bridge/ti-sn65dsi83.c >> @@ -520,6 +520,11 @@ static void sn65dsi83_atomic_enable(struct >> drm_bridge *bridge, >> /* Clear all errors that got asserted during initialization. */ >> regmap_read(ctx->regmap, REG_IRQ_STAT, &pval); >> regmap_write(ctx->regmap, REG_IRQ_STAT, pval); > > It does not look as correct error handling, maybe it would be good to > analyze and optionally report 'unexpected' errors here as well. The above is correct -- it clears the status register because the setup might've set random bits in that register. Then we wait a bit, let the link run, and read them again to get the real link status in this new piece of code below, hence the usleep_range there. And then if the link indicates a problem, we know it is a problem. >>> >>> Usually such registers are cleared on very beginning of the >>> initialization, and tested (via irq handler, or via reading), during >>> initalization, if initialization phase goes well. If it is not the case >>> forgive me. >> The init just flips the bit at random in the IRQ_STAT register, so no, >> that's not really viable here. That's why we clear them at the end, and >> then wait a bit, and then check whether something new appeared in them. >> >> If not, all is great. >> >> Sure, we could generate an IRQ, but then IRQ line is not always >> connected to this chip on all hardware I have available. So this gives >> the user at least some indication that something is wrong with their HW. >> >> + >> +usleep_range(1, 12000); >> +regmap_read(ctx->regmap, REG_IRQ_STAT, &pval); >> +if (pval) >> +dev_err(ctx->dev, "Unexpected link status 0x%02x\n", pval); > > I am not sure what is the case here but it looks like 'we do not know > what is going on, so let's add some diagnostic messages to gather info > and figure it out later'. That's pretty much the case, see the two links above in the NOTE section. If something goes wrong, we print the value for the user (usually developer) so they can fix their problems. We cannot do much better in the attach callback. The issue I ran into (and where this would be helpful information to me during debugging, since the issue happened real seldom, see also the NOTE links above) is that the DSI controller driver started streaming video on the data lanes before the DSI83 had a chance to initialize. 
This worked most of the time, except for a few exceptions here and there, where the video didn't start. This does set link status bits consistently. In the meantime, I fixed the controller driver (so far downstream, due to ongoing discussion). >>> >>> Maybe drm_connector_set_link_status_property(conn, >>> DRM_MODE_LINK_STATUS_BAD) would be usefule here. >> Hmm, this works on connector, the dsi83 is a bridge and it can be stuck >> between two other bridges. That doesn't seem like the right tool, no ? >> > Whole driver lacks IRQ handler which IMO could perform better diagnosis, > and I guess it could also help in recovery, but this is just my guess. > So if this patch is enough for now you can add: No, IRQ won't help you here, because by the time you get the IRQ, the DSI host already started streaming video on data lanes and you won't be able to correctly reinit the DSI83 unless you communicate to the DSI host that it should switch the data lanes back to LP11. And for that, there is a bigger chunk missing really. What needs to be added is a way for th
Re: [PATCH v3 10/16] drm/panel-simple: Non-eDP panels don't need "HPD" handling
Hi, On Sun, Sep 5, 2021 at 11:46 AM Sam Ravnborg wrote: > > On Wed, Sep 01, 2021 at 01:19:28PM -0700, Douglas Anderson wrote: > > All of the "HPD" handling added to panel-simple recently was for eDP > > panels. Remove it from panel-simple now that panel-simple-edp handles > > eDP panels. The "prepare_to_enable" delay only makes sense in the > > context of HPD, so remove it too. No non-eDP panels used it anyway. > > > > Signed-off-by: Douglas Anderson > > Maybe merge this with the patch that moved all the functionality > from panel-simple to panel-edp? Unless you feel strongly about it, I'm going to keep it separate still in the next version. To try to make diffing easier, I tried hard to make the minimal changes in the "split the driver in two" patch. -Doug
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Wed, Sep 8, 2021 at 9:36 PM Rob Clark wrote: > On Wed, Sep 8, 2021 at 11:49 AM Daniel Vetter wrote: > > On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote: > > > On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter wrote: > > > > > > > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > > > > > From: Rob Clark > > > > > > > > > > The initial purpose is for igt tests, but this would also be useful > > > > > for > > > > > compositors that wait until close to vblank deadline to make decisions > > > > > about which frame to show. > > > > > > > > > > Signed-off-by: Rob Clark > > > > > > > > Needs userspace and I think ideally also some igts to make sure it works > > > > and doesn't go boom. > > > > > > See cover-letter.. there are igt tests, although currently that is the > > > only user. > > > > Ah sorry missed that. It would be good to record that in the commit too > > that adds the uapi. git blame doesn't find cover letters at all, unlike on > > gitlab where you get the MR request with everything. > > > > Ok there is the Link: thing, but since that only points at the last > > version all the interesting discussion is still usually lost, so I tend to > > not bother looking there. > > > > > I'd be ok to otherwise initially restrict this and the sw_sync UABI > > > (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are > > > both needed by the igt tests > > > > Hm really awkward, uapi for igts in cross vendor stuff like this isn't > > great. I think hiding it in vgem is semi-ok (we have fences there > > already). But it's all a bit silly ... > > > > For the tests, should we instead have a selftest/Kunit thing to exercise > > this stuff? igt probably not quite the right thing. Or combine with a page > > flip if you want to test msm. > > Hmm, IIRC we have used CONFIG_BROKEN or something along those lines > for UABI in other places where we weren't willing to commit to yet? > > I suppose if we had to I could make this a sw_sync ioctl instead. But > OTOH there are kind of a limited # of ways this ioctl could look. And > we already know that at least some wayland compositors are going to > want this. Hm I was trying to think up a few ways this could work, but didn't come up with anything reasonable. Forcing the compositor to boost the entire chain (for gl composited primary plane fallback) is something the kernel can easily do too. Also only makes sense for priority boost, not so much for clock boosting, since clock boosting only really needs the final element to be boosted. > I guess I can look at non-igt options. But the igt test is already a > pretty convenient way to contrive situations (like loops, which is a > thing I need to add) Yeah it's definitely very useful for testing ... One option could be a hacky debugfs interface, where you write a fd number and deadline and the debugfs read function does the deadline setting. Horribly, but since it's debugfs no one ever cares. That's at least where we're hiding all the i915 hacks that igts need. 
-Daniel > BR, > -R > > > > -Daniel > > > > > > > > BR, > > > -R > > > > > > > -Daniel > > > > > > > > > --- > > > > > drivers/dma-buf/sync_file.c| 19 +++ > > > > > include/uapi/linux/sync_file.h | 20 > > > > > 2 files changed, 39 insertions(+) > > > > > > > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > > > > > index 394e6e1e9686..f295772d5169 100644 > > > > > --- a/drivers/dma-buf/sync_file.c > > > > > +++ b/drivers/dma-buf/sync_file.c > > > > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct > > > > > sync_file *sync_file, > > > > > return ret; > > > > > } > > > > > > > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > > > > > + unsigned long arg) > > > > > +{ > > > > > + struct sync_set_deadline ts; > > > > > + > > > > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > > > > > + return -EFAULT; > > > > > + > > > > > + if (ts.pad) > > > > > + return -EINVAL; > > > > > + > > > > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > > > > > ts.tv_nsec)); > > > > > + > > > > > + return 0; > > > > > +} > > > > > + > > > > > static long sync_file_ioctl(struct file *file, unsigned int cmd, > > > > > unsigned long arg) > > > > > { > > > > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, > > > > > unsigned int cmd, > > > > > case SYNC_IOC_FILE_INFO: > > > > > return sync_file_ioctl_fence_info(sync_file, arg); > > > > > > > > > > + case SYNC_IOC_SET_DEADLINE: > > > > > + return sync_file_ioctl_set_deadline(sync_file, arg); > > > > > + > > > > > default: > > > > > return -ENOTTY; > > > > > } > > > > > diff --git a/include/uapi/linux/sync_file.h > > > > > b/include/uapi/linux/sync_file.h > > > > > inde
Re: [PATCH 2/8] drm/i915/xehp: CCS shares the render reset domain
On Wed, Sep 08, 2021 at 11:07:07AM +0100, Tvrtko Ursulin wrote: > > On 07/09/2021 18:19, Matt Roper wrote: > > The reset domain is shared between render and all compute engines, > > so resetting one will affect the others. > > > > Note: Before performing a reset on an RCS or CCS engine, the GuC will > > attempt to preempt-to-idle the other non-hung RCS/CCS engines to avoid > > impacting other clients (since some shared modules will be reset). If > > other engines are executing non-preemptable workloads, the impact is > > unavoidable and some work may be lost. > > Since here it talks about engine reset, should this patch add warning if > same is attempted by i915 on a GuC platform - to document it is not Did you mean "on a *non* GuC platform" here? We aren't going to have compute engine support on any platforms where GuC submission isn't the default operating model, so the only way to get compute engines + execlist submission is to force an override via module parameters (e.g., enable_guc=0). Doing so will taint the kernel, so I think the current consensus from offline discussion is that the user has already put themselves into a configuration where it's easier than usual to shoot themselves in the foot; it's not too much different than the kind of trouble a user could get themselves into if they loaded the driver with hangcheck disabled or something. Matt > implemented/supported? Or perhaps later in the series, or future series > works better. > > Reviewed-by: Tvrtko Ursulin > > Regards, > > Tvrtko > > > Bspec: 52549 > > Original-patch-by: Michel Thierry > > Cc: Tvrtko Ursulin > > Cc: Vinay Belgaumkar > > Signed-off-by: Daniele Ceraolo Spurio > > Signed-off-by: Aravind Iddamsetty > > Signed-off-by: Matt Roper > > --- > > drivers/gpu/drm/i915/gt/intel_reset.c | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c > > b/drivers/gpu/drm/i915/gt/intel_reset.c > > index 91200c43951f..30598c1d070c 100644 > > --- a/drivers/gpu/drm/i915/gt/intel_reset.c > > +++ b/drivers/gpu/drm/i915/gt/intel_reset.c > > @@ -507,6 +507,10 @@ static int gen11_reset_engines(struct intel_gt *gt, > > [VECS1] = GEN11_GRDOM_VECS2, > > [VECS2] = GEN11_GRDOM_VECS3, > > [VECS3] = GEN11_GRDOM_VECS4, > > + [CCS0] = GEN11_GRDOM_RENDER, > > + [CCS1] = GEN11_GRDOM_RENDER, > > + [CCS2] = GEN11_GRDOM_RENDER, > > + [CCS3] = GEN11_GRDOM_RENDER, > > }; > > struct intel_engine_cs *engine; > > intel_engine_mask_t tmp; > > -- Matt Roper Graphics Software Engineer VTT-OSGC Platform Enablement Intel Corporation (916) 356-2795
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Wed, Sep 8, 2021 at 11:49 AM Daniel Vetter wrote: > > On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote: > > On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter wrote: > > > > > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > > > > From: Rob Clark > > > > > > > > The initial purpose is for igt tests, but this would also be useful for > > > > compositors that wait until close to vblank deadline to make decisions > > > > about which frame to show. > > > > > > > > Signed-off-by: Rob Clark > > > > > > Needs userspace and I think ideally also some igts to make sure it works > > > and doesn't go boom. > > > > See cover-letter.. there are igt tests, although currently that is the > > only user. > > Ah sorry missed that. It would be good to record that in the commit too > that adds the uapi. git blame doesn't find cover letters at all, unlike on > gitlab where you get the MR request with everything. > > Ok there is the Link: thing, but since that only points at the last > version all the interesting discussion is still usually lost, so I tend to > not bother looking there. > > > I'd be ok to otherwise initially restrict this and the sw_sync UABI > > (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are > > both needed by the igt tests > > Hm really awkward, uapi for igts in cross vendor stuff like this isn't > great. I think hiding it in vgem is semi-ok (we have fences there > already). But it's all a bit silly ... > > For the tests, should we instead have a selftest/Kunit thing to exercise > this stuff? igt probably not quite the right thing. Or combine with a page > flip if you want to test msm. Hmm, IIRC we have used CONFIG_BROKEN or something along those lines for UABI in other places where we weren't willing to commit to yet? I suppose if we had to I could make this a sw_sync ioctl instead. But OTOH there are kind of a limited # of ways this ioctl could look. And we already know that at least some wayland compositors are going to want this. I guess I can look at non-igt options. 
But the igt test is already a pretty convenient way to contrive situations (like loops, which is a thing I need to add) BR, -R > -Daniel > > > > > BR, > > -R > > > > > -Daniel > > > > > > > --- > > > > drivers/dma-buf/sync_file.c| 19 +++ > > > > include/uapi/linux/sync_file.h | 20 > > > > 2 files changed, 39 insertions(+) > > > > > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > > > > index 394e6e1e9686..f295772d5169 100644 > > > > --- a/drivers/dma-buf/sync_file.c > > > > +++ b/drivers/dma-buf/sync_file.c > > > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct > > > > sync_file *sync_file, > > > > return ret; > > > > } > > > > > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > > > > + unsigned long arg) > > > > +{ > > > > + struct sync_set_deadline ts; > > > > + > > > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > > > > + return -EFAULT; > > > > + > > > > + if (ts.pad) > > > > + return -EINVAL; > > > > + > > > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > > > > ts.tv_nsec)); > > > > + > > > > + return 0; > > > > +} > > > > + > > > > static long sync_file_ioctl(struct file *file, unsigned int cmd, > > > > unsigned long arg) > > > > { > > > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, > > > > unsigned int cmd, > > > > case SYNC_IOC_FILE_INFO: > > > > return sync_file_ioctl_fence_info(sync_file, arg); > > > > > > > > + case SYNC_IOC_SET_DEADLINE: > > > > + return sync_file_ioctl_set_deadline(sync_file, arg); > > > > + > > > > default: > > > > return -ENOTTY; > > > > } > > > > diff --git a/include/uapi/linux/sync_file.h > > > > b/include/uapi/linux/sync_file.h > > > > index ee2dcfb3d660..f67d4ffe7566 100644 > > > > --- a/include/uapi/linux/sync_file.h > > > > +++ b/include/uapi/linux/sync_file.h > > > > @@ -67,6 +67,18 @@ struct sync_file_info { > > > > __u64 sync_fence_info; > > > > }; > > > > > > > > +/** > > > > + * struct sync_set_deadline - set a deadline on a fence > > > > + * @tv_sec: seconds elapsed since epoch > > > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec > > > > + * @pad: must be zero > > > > + */ > > > > +struct sync_set_deadline { > > > > + __s64 tv_sec; > > > > + __s32 tv_nsec; > > > > + __u32 pad; > > > > +}; > > > > + > > > > #define SYNC_IOC_MAGIC '>' > > > > > > > > /** > > > > @@ -95,4 +107,12 @@ struct sync_file_info { > > > > */ > > > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct > > > > sync_file_info) > > > > > > > > + > > > > +/** > > > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence > > > > + * > >
[drm:i915-uncore-vfunc 30/31] drivers/gpu/drm/i915/selftests/mock_uncore.c:47:2: error: implicit declaration of function 'ASSIGN_RAW_WRITE_MMIO_VFUNCS'; did you mean 'MMIO_RAW_WRITE_VFUNCS'?
tree: git://people.freedesktop.org/~airlied/linux.git i915-uncore-vfunc head: b42168f90718a90b11f2d52306d9aeaa9468 commit: 99aebd17891290abfca80c48eca01f4e02413fb3 [30/31] drm/i915/uncore: constify the register vtables. config: i386-allyesconfig (attached as .config) compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 reproduce (this is a W=1 build): git remote add drm git://people.freedesktop.org/~airlied/linux.git git fetch --no-tags drm i915-uncore-vfunc git checkout 99aebd17891290abfca80c48eca01f4e02413fb3 # save the attached .config to linux build tree make W=1 ARCH=i386 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): In file included from drivers/gpu/drm/i915/intel_uncore.c:2630: drivers/gpu/drm/i915/selftests/mock_uncore.c: In function 'mock_uncore_init': >> drivers/gpu/drm/i915/selftests/mock_uncore.c:47:2: error: implicit >> declaration of function 'ASSIGN_RAW_WRITE_MMIO_VFUNCS'; did you mean >> 'MMIO_RAW_WRITE_VFUNCS'? [-Werror=implicit-function-declaration] 47 | ASSIGN_RAW_WRITE_MMIO_VFUNCS(uncore, nop); | ^~~~ | MMIO_RAW_WRITE_VFUNCS >> drivers/gpu/drm/i915/selftests/mock_uncore.c:47:39: error: 'nop' undeclared >> (first use in this function); did you mean 'nopv'? 47 | ASSIGN_RAW_WRITE_MMIO_VFUNCS(uncore, nop); | ^~~ | nopv drivers/gpu/drm/i915/selftests/mock_uncore.c:47:39: note: each undeclared identifier is reported only once for each function it appears in >> drivers/gpu/drm/i915/selftests/mock_uncore.c:48:2: error: implicit >> declaration of function 'ASSIGN_RAW_READ_MMIO_VFUNCS' >> [-Werror=implicit-function-declaration] 48 | ASSIGN_RAW_READ_MMIO_VFUNCS(uncore, nop); | ^~~ At top level: >> drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: error: 'nop_read64' >> defined but not used [-Werror=unused-function] 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: note: in definition of macro '__nop_read' 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: error: 'nop_read32' >> defined but not used [-Werror=unused-function] 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: note: in definition of macro '__nop_read' 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: error: 'nop_read16' >> defined but not used [-Werror=unused-function] 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: note: in definition of macro '__nop_read' 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: error: 'nop_read8' >> defined but not used [-Werror=unused-function] 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ drivers/gpu/drm/i915/selftests/mock_uncore.c:36:1: note: in definition of macro '__nop_read' 36 | nop_read##x(struct intel_uncore *uncore, i915_reg_t reg, bool trace) { return 0; } | ^~~~ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: error: 'nop_write32' >> defined but not used [-Werror=unused-function] 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ 
drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: note: in definition of macro '__nop_write' 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: error: 'nop_write16' >> defined but not used [-Werror=unused-function] 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: note: in definition of macro '__nop_write' 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ >> drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: error: 'nop_write8' >> defined but not used [-Werror=unused-function] 29 | nop_write##x(struct intel_uncore *uncore, i915_reg_t reg, u##x val, bool trace) { } | ^ drivers/gpu/drm/i915/selftests/mock_uncore.c:29:1: note: in def
Re: [PATCH 13/14] drm/kmb: Enable alpha blended second plane
Hi Thomas, On Wed, Sep 08, 2021 at 07:50:42PM +0200, Thomas Zimmermann wrote: > Hi > > Am 03.08.21 um 07:10 schrieb Sam Ravnborg: > > Hi Anitha, > > > > On Mon, Aug 02, 2021 at 08:44:26PM +, Chrisanthus, Anitha wrote: > > > Hi Sam, > > > Thanks. Where should this go, drm-misc-fixes or drm-misc-next? > > > > Looks like a drm-misc-next candidate to me. > > It may improve something for existing users, but it does not look like it > > fixes an existing bug. > > I found this patch in drm-misc-fixes, although it doesn't look like a > bugfix. It should have gone into drm-misc-next. See [1]. If it indeed > belongs into drm-misc-fixes, it certainly should have contained a Fixes tag. The patch fixes some warnings that have become errors during the last week. Anitha pinged me about it, but I failed to follow up. So in the end it was applied to shut up the warnings => errors. Sam
[PATCH] drm/nouveau/nvkm: Replace -ENOSYS with -ENODEV
nvkm test builds fail with the following error. drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c: In function 'nvkm_control_mthd_pstate_info': drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c:60:35: error: overflow in conversion from 'int' to '__s8' {aka 'signed char'} changes value from '-251' to '5' The code builds on most architectures, but fails on parisc where ENOSYS is defined as 251. Replace the error code with -ENODEV (-19). The actual error code does not really matter and is not passed to userspace - it just has to be negative. Fixes: 7238eca4cf18 ("drm/nouveau: expose pstate selection per-power source in sysfs") Signed-off-by: Guenter Roeck --- drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c index b0ece71aefde..ce774579c89d 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/ctrl.c @@ -57,7 +57,7 @@ nvkm_control_mthd_pstate_info(struct nvkm_control *ctrl, void *data, u32 size) args->v0.count = 0; args->v0.ustate_ac = NVIF_CONTROL_PSTATE_INFO_V0_USTATE_DISABLE; args->v0.ustate_dc = NVIF_CONTROL_PSTATE_INFO_V0_USTATE_DISABLE; - args->v0.pwrsrc = -ENOSYS; + args->v0.pwrsrc = -ENODEV; args->v0.pstate = NVIF_CONTROL_PSTATE_INFO_V0_PSTATE_UNKNOWN; } -- 2.33.0
[PATCH 1/2] drm/i915/gt: Queue and wait for the irq_work item.
Disabling interrupts and invoking the irq_work function directly breaks on PREEMPT_RT. PREEMPT_RT does not invoke all irq_work from hardirq context because some of the user have spinlock_t locking in the callback function. These locks are then turned into a sleeping locks which can not be acquired with disabled interrupts. Using irq_work_queue() has the benefit that the irqwork will be invoked in the regular context. In general there is "no" delay between enqueuing the callback and its invocation because the interrupt is raised right away on architectures which support it (which includes x86). Use irq_work_queue() + irq_work_sync() instead invoking the callback directly. Reported-by: Clark Williams Signed-off-by: Sebastian Andrzej Siewior --- drivers/gpu/drm/i915/gt/intel_breadcrumbs.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c index 38cc42783dfb2..594dec2f76954 100644 --- a/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c +++ b/drivers/gpu/drm/i915/gt/intel_breadcrumbs.c @@ -318,10 +318,9 @@ void __intel_breadcrumbs_park(struct intel_breadcrumbs *b) /* Kick the work once more to drain the signalers, and disarm the irq */ irq_work_sync(&b->irq_work); while (READ_ONCE(b->irq_armed) && !atomic_read(&b->active)) { - local_irq_disable(); - signal_irq_work(&b->irq_work); - local_irq_enable(); + irq_work_queue(&b->irq_work); cond_resched(); + irq_work_sync(&b->irq_work); } } -- 2.33.0
[RFC PATCH 2/2] drm/i915/gt: Use spin_lock_irq() instead of local_irq_disable() + spin_lock()
execlists_dequeue() is invoked from a function which uses local_irq_disable() to disable interrupts so the spin_lock() behaves like spin_lock_irq(). This breaks PREEMPT_RT because local_irq_disable() + spin_lock() is not the same as spin_lock_irq(). execlists_dequeue_irq() and execlists_dequeue() has each one caller only. If intel_engine_cs::active::lock is acquired and released with the _irq suffix then it behaves almost as if execlists_dequeue() would be invoked with disabled interrupts. The difference is the last part of the function which is then invoked with enabled interrupts. I can't tell if this makes a difference. From looking at it, it might work to move the last unlock at the end of the function as I didn't find anything that would acquire the lock again. Reported-by: Clark Williams Signed-off-by: Sebastian Andrzej Siewior --- .../drm/i915/gt/intel_execlists_submission.c| 17 + 1 file changed, 5 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index fc77592d88a96..2ec1dd352960b 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -1265,7 +1265,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * and context switches) submission. */ - spin_lock(&engine->active.lock); + spin_lock_irq(&engine->active.lock); /* * If the queue is higher priority than the last @@ -1365,7 +1365,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * Even if ELSP[1] is occupied and not worthy * of timeslices, our queue might be. */ - spin_unlock(&engine->active.lock); + spin_unlock_irq(&engine->active.lock); return; } } @@ -1391,7 +1391,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) if (last && !can_merge_rq(last, rq)) { spin_unlock(&ve->base.active.lock); - spin_unlock(&engine->active.lock); + spin_unlock_irq(&engine->active.lock); return; /* leave this for another sibling */ } @@ -1552,7 +1552,7 @@ static void execlists_dequeue(struct intel_engine_cs *engine) * interrupt for secondary ports). */ execlists->queue_priority_hint = queue_prio(execlists); - spin_unlock(&engine->active.lock); + spin_unlock_irq(&engine->active.lock); /* * We can skip poking the HW if we ended up with exactly the same set @@ -1578,13 +1578,6 @@ static void execlists_dequeue(struct intel_engine_cs *engine) } } -static void execlists_dequeue_irq(struct intel_engine_cs *engine) -{ - local_irq_disable(); /* Suspend interrupts across request submission */ - execlists_dequeue(engine); - local_irq_enable(); /* flush irq_work (e.g. breadcrumb enabling) */ -} - static void clear_ports(struct i915_request **ports, int count) { memset_p((void **)ports, NULL, count); @@ -2377,7 +2370,7 @@ static void execlists_submission_tasklet(struct tasklet_struct *t) } if (!engine->execlists.pending[0]) { - execlists_dequeue_irq(engine); + execlists_dequeue(engine); start_timeslice(engine); } -- 2.33.0
[PATCH 0/2] drm/i915/gt: Locking splats PREEMPT_RT
Clark Williams reported two issues with the i915 driver running on PREEMPT_RT. While #1 looks simple, I have no idea about #2, hence the RFC. Sebastian
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Wed, Sep 08, 2021 at 11:23:42AM -0700, Rob Clark wrote: > On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter wrote: > > > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > > > From: Rob Clark > > > > > > The initial purpose is for igt tests, but this would also be useful for > > > compositors that wait until close to vblank deadline to make decisions > > > about which frame to show. > > > > > > Signed-off-by: Rob Clark > > > > Needs userspace and I think ideally also some igts to make sure it works > > and doesn't go boom. > > See cover-letter.. there are igt tests, although currently that is the > only user. Ah sorry missed that. It would be good to record that in the commit too that adds the uapi. git blame doesn't find cover letters at all, unlike on gitlab where you get the MR request with everything. Ok there is the Link: thing, but since that only points at the last version all the interesting discussion is still usually lost, so I tend to not bother looking there. > I'd be ok to otherwise initially restrict this and the sw_sync UABI > (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are > both needed by the igt tests Hm really awkward, uapi for igts in cross vendor stuff like this isn't great. I think hiding it in vgem is semi-ok (we have fences there already). But it's all a bit silly ... For the tests, should we instead have a selftest/Kunit thing to exercise this stuff? igt probably not quite the right thing. Or combine with a page flip if you want to test msm. -Daniel > > BR, > -R > > > -Daniel > > > > > --- > > > drivers/dma-buf/sync_file.c| 19 +++ > > > include/uapi/linux/sync_file.h | 20 > > > 2 files changed, 39 insertions(+) > > > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > > > index 394e6e1e9686..f295772d5169 100644 > > > --- a/drivers/dma-buf/sync_file.c > > > +++ b/drivers/dma-buf/sync_file.c > > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct > > > sync_file *sync_file, > > > return ret; > > > } > > > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > > > + unsigned long arg) > > > +{ > > > + struct sync_set_deadline ts; > > > + > > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > > > + return -EFAULT; > > > + > > > + if (ts.pad) > > > + return -EINVAL; > > > + > > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > > > ts.tv_nsec)); > > > + > > > + return 0; > > > +} > > > + > > > static long sync_file_ioctl(struct file *file, unsigned int cmd, > > > unsigned long arg) > > > { > > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, > > > unsigned int cmd, > > > case SYNC_IOC_FILE_INFO: > > > return sync_file_ioctl_fence_info(sync_file, arg); > > > > > > + case SYNC_IOC_SET_DEADLINE: > > > + return sync_file_ioctl_set_deadline(sync_file, arg); > > > + > > > default: > > > return -ENOTTY; > > > } > > > diff --git a/include/uapi/linux/sync_file.h > > > b/include/uapi/linux/sync_file.h > > > index ee2dcfb3d660..f67d4ffe7566 100644 > > > --- a/include/uapi/linux/sync_file.h > > > +++ b/include/uapi/linux/sync_file.h > > > @@ -67,6 +67,18 @@ struct sync_file_info { > > > __u64 sync_fence_info; > > > }; > > > > > > +/** > > > + * struct sync_set_deadline - set a deadline on a fence > > > + * @tv_sec: seconds elapsed since epoch > > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec > > > + * @pad: must be zero > > > + */ > > > +struct sync_set_deadline { > > > + __s64 tv_sec; > > > + __s32 tv_nsec; > > > 
+ __u32 pad; > > > +}; > > > + > > > #define SYNC_IOC_MAGIC '>' > > > > > > /** > > > @@ -95,4 +107,12 @@ struct sync_file_info { > > > */ > > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct > > > sync_file_info) > > > > > > + > > > +/** > > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence > > > + * > > > + * Allows userspace to set a deadline on a fence, see > > > dma_fence_set_deadline() > > > + */ > > > +#define SYNC_IOC_SET_DEADLINE_IOW(SYNC_IOC_MAGIC, 5, struct > > > sync_set_deadline) > > > + > > > #endif /* _UAPI_LINUX_SYNC_H */ > > > -- > > > 2.31.1 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
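For illustration, a hedged userspace sketch of how a compositor or an igt-style test might drive the proposed ioctl on a sync_file fd. The struct layout and ioctl number below simply mirror the uapi in the patch, and fence_fd is assumed to come from elsewhere (for example an out-fence from an earlier submission):

    #include <stdint.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/ioctl.h>
    #include <time.h>

    /* Mirrors the proposed uapi; use <linux/sync_file.h> once it lands. */
    struct sync_set_deadline {
        int64_t  tv_sec;
        int32_t  tv_nsec;
        uint32_t pad;
    };

    #define SYNC_IOC_MAGIC        '>'
    #define SYNC_IOC_SET_DEADLINE _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)

    /* Ask the kernel to treat @deadline as the latest useful signal time. */
    static int set_fence_deadline(int fence_fd, const struct timespec *deadline)
    {
        struct sync_set_deadline ts;

        memset(&ts, 0, sizeof(ts));   /* keeps .pad zero, as required */
        ts.tv_sec  = deadline->tv_sec;
        ts.tv_nsec = deadline->tv_nsec;

        return ioctl(fence_fd, SYNC_IOC_SET_DEADLINE, &ts);
    }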
Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support
On Wed, Sep 08, 2021 at 11:19:15AM -0700, Rob Clark wrote: > On Wed, Sep 8, 2021 at 10:54 AM Daniel Vetter wrote: > > > > On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote: > > > From: Rob Clark > > > > > > Signed-off-by: Rob Clark > > > --- > > > drivers/dma-buf/dma-fence-chain.c | 13 + > > > 1 file changed, 13 insertions(+) > > > > > > diff --git a/drivers/dma-buf/dma-fence-chain.c > > > b/drivers/dma-buf/dma-fence-chain.c > > > index 1b4cb3e5cec9..736a9ad3ea6d 100644 > > > --- a/drivers/dma-buf/dma-fence-chain.c > > > +++ b/drivers/dma-buf/dma-fence-chain.c > > > @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence > > > *fence) > > > dma_fence_free(fence); > > > } > > > > > > + > > > +static void dma_fence_chain_set_deadline(struct dma_fence *fence, > > > + ktime_t deadline) > > > +{ > > > + dma_fence_chain_for_each(fence, fence) { > > > + struct dma_fence_chain *chain = to_dma_fence_chain(fence); > > > + struct dma_fence *f = chain ? chain->fence : fence; > > > > Doesn't this just end up calling set_deadline on a chain, potenetially > > resulting in recursion? Also I don't think this should ever happen, why > > did you add that? > > Tbh the fence-chain was the part I was a bit fuzzy about, and the main > reason I added igt tests. The iteration is similar to how, for ex, > dma_fence_chain_signaled() work, and according to the igt test it does > what was intended Huh indeed. Maybe something we should fix, like why does the dma_fence_chain_for_each not give you the upcast chain pointer ... I guess this also needs more Christian and less me. -Daniel > > BR, > -R > > > -Daniel > > > > > + > > > + dma_fence_set_deadline(f, deadline); > > > + } > > > +} > > > + > > > const struct dma_fence_ops dma_fence_chain_ops = { > > > .use_64bit_seqno = true, > > > .get_driver_name = dma_fence_chain_get_driver_name, > > > @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = { > > > .enable_signaling = dma_fence_chain_enable_signaling, > > > .signaled = dma_fence_chain_signaled, > > > .release = dma_fence_chain_release, > > > + .set_deadline = dma_fence_chain_set_deadline, > > > }; > > > EXPORT_SYMBOL(dma_fence_chain_ops); > > > > > > -- > > > 2.31.1 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm/plane-helper: fix uninitialized variable reference
On Tue, Sep 07, 2021 at 10:08:36AM -0400, Alex Xu (Hello71) wrote: > drivers/gpu/drm/drm_plane_helper.c: In function 'drm_primary_helper_update': > drivers/gpu/drm/drm_plane_helper.c:113:32: error: 'visible' is used > uninitialized [-Werror=uninitialized] > 113 | struct drm_plane_state plane_state = { > |^~~ > drivers/gpu/drm/drm_plane_helper.c:178:14: note: 'visible' was declared here > 178 | bool visible; > | ^~~ > cc1: all warnings being treated as errors > > visible is an output, not an input. in practice this use might turn out > OK but it's still UB. > > Fixes: df86af9133 ("drm/plane-helper: Add drm_plane_helper_check_state()") I need a signed-off-by from you before I can merge this. See https://dri.freedesktop.org/docs/drm/process/submitting-patches.html#sign-your-work-the-developer-s-certificate-of-origin Patch lgtm otherwise. -Daniel > --- > drivers/gpu/drm/drm_plane_helper.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/drivers/gpu/drm/drm_plane_helper.c > b/drivers/gpu/drm/drm_plane_helper.c > index 5b2d0ca03705..838b32b70bce 100644 > --- a/drivers/gpu/drm/drm_plane_helper.c > +++ b/drivers/gpu/drm/drm_plane_helper.c > @@ -123,7 +123,6 @@ static int drm_plane_helper_check_update(struct drm_plane > *plane, > .crtc_w = drm_rect_width(dst), > .crtc_h = drm_rect_height(dst), > .rotation = rotation, > - .visible = *visible, > }; > struct drm_crtc_state crtc_state = { > .crtc = crtc, > -- > 2.33.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] kernel/locking: Add context to ww_mutex_trylock.
On Wed, Sep 08, 2021 at 12:14:23PM +0200, Peter Zijlstra wrote: > On Tue, Sep 07, 2021 at 03:20:44PM +0200, Maarten Lankhorst wrote: > > i915 will soon gain an eviction path that trylock a whole lot of locks > > for eviction, getting dmesg failures like below: > > > > BUG: MAX_LOCK_DEPTH too low! > > turning off the locking correctness validator. > > depth: 48 max: 48! > > 48 locks held by i915_selftest/5776: > > #0: 888101a79240 (&dev->mutex){}-{3:3}, at: > > __driver_attach+0x88/0x160 > > #1: c99778c0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: > > i915_vma_pin.constprop.63+0x39/0x1b0 [i915] > > #2: 88800cf74de8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_vma_pin.constprop.63+0x5f/0x1b0 [i915] > > #3: 88810c7f9e38 (&vm->mutex/1){+.+.}-{3:3}, at: > > i915_vma_pin_ww+0x1c4/0x9d0 [i915] > > #4: 88810bad5768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_gem_evict_something+0x110/0x860 [i915] > > #5: 88810bad60e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_gem_evict_something+0x110/0x860 [i915] > > ... > > #46: 88811964d768 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_gem_evict_something+0x110/0x860 [i915] > > #47: 88811964e0e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: > > i915_gem_evict_something+0x110/0x860 [i915] > > INFO: lockdep is turned off. > > > As an intermediate solution, add an acquire context to ww_mutex_trylock, > > which allows us to do proper nesting annotations on the trylocks, making > > the above lockdep splat disappear. > > Fair enough I suppose. What's maybe missing from the commit message - we'll probably use this for ttm too eventually - even when we add full ww_mutex locking we'll still have the trylock fastpath. This is because we have a lock inversion against list locks in these eviction paths, and the slow path unroll to drop that list lock is a bit nasty (and defintely expensive). iow even long term this here is needed in some form I think. -Daniel > > > +/** > > + * ww_mutex_trylock - tries to acquire the w/w mutex with optional acquire > > context > > + * @lock: mutex to lock > > + * @ctx: optional w/w acquire context > > + * > > + * Trylocks a mutex with the optional acquire context; no deadlock > > detection is > > + * possible. Returns 1 if the mutex has been acquired successfully, 0 > > otherwise. > > + * > > + * Unlike ww_mutex_lock, no deadlock handling is performed. However, if a > > @ctx is > > + * specified, -EALREADY and -EDEADLK handling may happen in calls to > > ww_mutex_lock. > > + * > > + * A mutex acquired with this function must be released with > > ww_mutex_unlock. > > + */ > > +int __sched > > +ww_mutex_trylock(struct ww_mutex *ww, struct ww_acquire_ctx *ctx) > > +{ > > + bool locked; > > + > > + if (!ctx) > > + return mutex_trylock(&ww->base); > > + > > +#ifdef CONFIG_DEBUG_MUTEXES > > + DEBUG_LOCKS_WARN_ON(ww->base.magic != &ww->base); > > +#endif > > + > > + preempt_disable(); > > + locked = __mutex_trylock(&ww->base); > > + > > + if (locked) { > > + ww_mutex_set_context_fastpath(ww, ctx); > > + mutex_acquire_nest(&ww->base.dep_map, 0, 1, &ctx->dep_map, > > _RET_IP_); > > + } > > + preempt_enable(); > > + > > + return locked; > > +} > > +EXPORT_SYMBOL(ww_mutex_trylock); > > You'll need a similar hunk in ww_rt_mutex.c -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
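To make the intended use a bit more concrete, here is a hedged sketch of the sort of eviction loop this trylock is aimed at. Only ww_mutex_trylock(), ww_mutex_unlock() and the ww_acquire_ctx are real API; struct evict_obj and evict_one() are made up for illustration:

    struct evict_obj {
        struct ww_mutex lock;
        struct list_head link;
    };

    /*
     * Trylock each candidate under the caller's acquire context so the
     * locks carry proper ww/ctx nesting annotations for lockdep, as the
     * commit message describes; contended objects are simply skipped.
     */
    static void evict_candidates(struct list_head *candidates,
                                 struct ww_acquire_ctx *ctx)
    {
        struct evict_obj *obj, *tmp;

        list_for_each_entry_safe(obj, tmp, candidates, link) {
            if (!ww_mutex_trylock(&obj->lock, ctx))
                continue;           /* busy, leave it for someone else */

            evict_one(obj);         /* hypothetical helper, may unlink obj */
            ww_mutex_unlock(&obj->lock);
        }
    }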
Re: [resend PATCH] drm/ttm: Fix a deadlock if the target BO is not idle during swap
On Tue, Sep 07, 2021 at 11:28:23AM +0200, Christian König wrote: > Am 07.09.21 um 11:05 schrieb Daniel Vetter: > > On Tue, Sep 07, 2021 at 08:22:20AM +0200, Christian König wrote: > > > Added a Fixes tag and pushed this to drm-misc-fixes. > > We're in the merge window, this should have been drm-misc-next-fixes. I'll > > poke misc maintainers so it's not lost. > > Hui? It's a fix for a problem in stable and not in drm-misc-next. Ah the flow chart is confusing. There is no current -rc, so it's always -next-fixes. Or you're running the risk that it's lost until after -rc1. Maybe we should clarify that "is the bug in current -rc?" only applies if there is a current -rc. Anyway Thomas sent out a pr, so it's all good. -Daniel > > Christian. > > > -Daniel > > > > > It will take a while until it cycles back into the development branches, > > > so > > > feel free to push some version to amd-staging-drm-next as well. Just ping > > > Alex when you do this. > > > > > > Thanks, > > > Christian. > > > > > > Am 07.09.21 um 06:08 schrieb xinhui pan: > > > > The ret value might be -EBUSY, caller will think lru lock is still > > > > locked but actually NOT. So return -ENOSPC instead. Otherwise we hit > > > > list corruption. > > > > > > > > ttm_bo_cleanup_refs might fail too if BO is not idle. If we return 0, > > > > caller(ttm_tt_populate -> ttm_global_swapout ->ttm_device_swapout) will > > > > be stuck as we actually did not free any BO memory. This usually happens > > > > when the fence is not signaled for a long time. > > > > > > > > Signed-off-by: xinhui pan > > > > Reviewed-by: Christian König > > > > --- > > > >drivers/gpu/drm/ttm/ttm_bo.c | 6 +++--- > > > >1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c > > > > index 8d7fd65ccced..23f906941ac9 100644 > > > > --- a/drivers/gpu/drm/ttm/ttm_bo.c > > > > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > > > > @@ -1152,9 +1152,9 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, > > > > struct ttm_operation_ctx *ctx, > > > > } > > > > if (bo->deleted) { > > > > - ttm_bo_cleanup_refs(bo, false, false, locked); > > > > + ret = ttm_bo_cleanup_refs(bo, false, false, locked); > > > > ttm_bo_put(bo); > > > > - return 0; > > > > + return ret == -EBUSY ? -ENOSPC : ret; > > > > } > > > > ttm_bo_del_from_lru(bo); > > > > @@ -1208,7 +1208,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, > > > > struct ttm_operation_ctx *ctx, > > > > if (locked) > > > > dma_resv_unlock(bo->base.resv); > > > > ttm_bo_put(bo); > > > > - return ret; > > > > + return ret == -EBUSY ? -ENOSPC : ret; > > > >} > > > >void ttm_bo_tt_destroy(struct ttm_buffer_object *bo) > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm: mxsfb: Fix NULL pointer dereference crash on unload
On Tue, Sep 07, 2021 at 04:49:00AM +0200, Marek Vasut wrote: > The mxsfb->crtc.funcs may already be NULL when unloading the driver, > in which case calling mxsfb_irq_disable() via drm_irq_uninstall() from > mxsfb_unload() leads to NULL pointer dereference. > > Since all we care about is masking the IRQ and mxsfb->base is still > valid, just use that to clear and mask the IRQ. > > Fixes: ae1ed00932819 ("drm: mxsfb: Stop using DRM simple display pipeline > helper") > Signed-off-by: Marek Vasut > Cc: Daniel Abrecht > Cc: Emil Velikov > Cc: Laurent Pinchart > Cc: Sam Ravnborg > Cc: Stefan Agner You probably want a drm_atomic_helper_shutdown instead of trying to do all that manually. We've also added a bunch more devm and drmm_ functions to automate the cleanup a lot more here, e.g. your drm_mode_config_cleanup is in the wrong place. Also I'm confused because I'm not even seeing this function anywhere in upstream. -Daniel > --- > drivers/gpu/drm/mxsfb/mxsfb_drv.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/mxsfb/mxsfb_drv.c > b/drivers/gpu/drm/mxsfb/mxsfb_drv.c > index ec0432fe1bdf8..86d78634a9799 100644 > --- a/drivers/gpu/drm/mxsfb/mxsfb_drv.c > +++ b/drivers/gpu/drm/mxsfb/mxsfb_drv.c > @@ -173,7 +173,11 @@ static void mxsfb_irq_disable(struct drm_device *drm) > struct mxsfb_drm_private *mxsfb = drm->dev_private; > > mxsfb_enable_axi_clk(mxsfb); > - mxsfb->crtc.funcs->disable_vblank(&mxsfb->crtc); > + > + /* Disable and clear VBLANK IRQ */ > + writel(CTRL1_CUR_FRAME_DONE_IRQ_EN, mxsfb->base + LCDC_CTRL1 + REG_CLR); > + writel(CTRL1_CUR_FRAME_DONE_IRQ, mxsfb->base + LCDC_CTRL1 + REG_CLR); > + > mxsfb_disable_axi_clk(mxsfb); > } > > -- > 2.33.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
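For what it's worth, a minimal sketch of the drm_atomic_helper_shutdown() approach suggested above, assuming the unload path still has the drm_device at hand; this is only the shape of it, not a drop-in replacement for the mxsfb teardown:

    static void mxsfb_unload(struct drm_device *drm)
    {
        /*
         * Quiesce the pipeline through the atomic helpers instead of
         * poking CTRL1 by hand: all planes and CRTCs are disabled via a
         * regular atomic commit, so the vblank IRQ ends up masked the
         * same way it would be at runtime.
         */
        drm_atomic_helper_shutdown(drm);

        /* ... remaining teardown: irq, clocks, mode config ... */
    }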
Re: [PATCH 1/2] drm/nouveau/ga102-: support ttm buffer moves via copy engine
On Mon, Sep 06, 2021 at 10:56:27AM +1000, Ben Skeggs wrote: > From: Ben Skeggs > > We don't currently have any kind of real acceleration on Ampere GPUs, > but the TTM memcpy() fallback paths aren't really designed to handle > copies between different devices, such as on Optimus systems, and > result in a kernel OOPS. Is this just for moving a buffer from vram to system memory when you pin it for dma-buf? I'm kinda lost what you even use ttm bo moves for if there's no one using the gpu. Also I guess memcpy goes boom if you can't mmap it because it's outside the gart? Or just that it's very slow. We're trying to use ttm memcyp as fallback, so want to know how this can all go wrong :-) -Daniel > > A few options were investigated to try and fix this, but didn't work > out, and likely would have resulted in a very unpleasant experience > for users anyway. > > This commit adds just enough support for setting up a single channel > connected to a copy engine, which the kernel can use to accelerate > the buffer copies between devices. Userspace has no access to this > incomplete channel support, but it's suitable for TTM's needs. > > A more complete implementation of host(fifo) for Ampere GPUs is in > the works, but the required changes are far too invasive that they > would be unsuitable to backport to fix this issue on current kernels. > > Signed-off-by: Ben Skeggs > Cc: Lyude Paul > Cc: Karol Herbst > Cc: # v5.12+ > --- > drivers/gpu/drm/nouveau/include/nvif/class.h | 2 + > .../drm/nouveau/include/nvkm/engine/fifo.h| 1 + > drivers/gpu/drm/nouveau/nouveau_bo.c | 1 + > drivers/gpu/drm/nouveau/nouveau_chan.c| 6 +- > drivers/gpu/drm/nouveau/nouveau_drm.c | 4 + > drivers/gpu/drm/nouveau/nv84_fence.c | 2 +- > .../gpu/drm/nouveau/nvkm/engine/device/base.c | 3 + > .../gpu/drm/nouveau/nvkm/engine/fifo/Kbuild | 1 + > .../gpu/drm/nouveau/nvkm/engine/fifo/ga102.c | 308 ++ > .../gpu/drm/nouveau/nvkm/subdev/top/ga100.c | 7 +- > 10 files changed, 329 insertions(+), 6 deletions(-) > create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/fifo/ga102.c > > diff --git a/drivers/gpu/drm/nouveau/include/nvif/class.h > b/drivers/gpu/drm/nouveau/include/nvif/class.h > index c68cc957248e..a582c0cb0cb0 100644 > --- a/drivers/gpu/drm/nouveau/include/nvif/class.h > +++ b/drivers/gpu/drm/nouveau/include/nvif/class.h > @@ -71,6 +71,7 @@ > #define PASCAL_CHANNEL_GPFIFO_A /* cla06f.h */ > 0xc06f > #define VOLTA_CHANNEL_GPFIFO_A/* clc36f.h */ > 0xc36f > #define TURING_CHANNEL_GPFIFO_A /* clc36f.h */ > 0xc46f > +#define AMPERE_CHANNEL_GPFIFO_B /* clc36f.h */ > 0xc76f > > #define NV50_DISP /* cl5070.h */ > 0x5070 > #define G82_DISP /* cl5070.h */ > 0x8270 > @@ -200,6 +201,7 @@ > #define PASCAL_DMA_COPY_B > 0xc1b5 > #define VOLTA_DMA_COPY_A > 0xc3b5 > #define TURING_DMA_COPY_A > 0xc5b5 > +#define AMPERE_DMA_COPY_B > 0xc7b5 > > #define FERMI_DECOMPRESS > 0x90b8 > > diff --git a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > index 54fab7cc36c1..64ee82c7c1be 100644 > --- a/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > +++ b/drivers/gpu/drm/nouveau/include/nvkm/engine/fifo.h > @@ -77,4 +77,5 @@ int gp100_fifo_new(struct nvkm_device *, enum > nvkm_subdev_type, int inst, struct > int gp10b_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > struct nvkm_fifo **); > int gv100_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > struct nvkm_fifo **); > int tu102_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > struct nvkm_fifo 
**); > +int ga102_fifo_new(struct nvkm_device *, enum nvkm_subdev_type, int inst, > struct nvkm_fifo **); > #endif > diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c > b/drivers/gpu/drm/nouveau/nouveau_bo.c > index 4a7cebac8060..b3e4f555fa05 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_bo.c > +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c > @@ -844,6 +844,7 @@ nouveau_bo_move_init(struct nouveau_drm *drm) > struct ttm_resource *, struct ttm_resource *); > int (*init)(struct nouveau_channel *, u32 handle); > } _methods[] = { > + { "COPY", 4, 0xc7b5, nve0_bo_move_copy, nve0_bo_move_init }, > { "COPY", 4, 0xc5b5, nve0_bo_move_copy, nve0_bo_move_init }, > { "GRCE", 0, 0xc5b5, nve0_bo_move_copy, nvc0_bo_move_init }, > { "COPY", 4, 0xc3b5, nve0_bo_move_copy, nve0_bo_move_i
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Wed, Sep 8, 2021 at 10:50 AM Daniel Vetter wrote: > > On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > > From: Rob Clark > > > > The initial purpose is for igt tests, but this would also be useful for > > compositors that wait until close to vblank deadline to make decisions > > about which frame to show. > > > > Signed-off-by: Rob Clark > > Needs userspace and I think ideally also some igts to make sure it works > and doesn't go boom. See cover-letter.. there are igt tests, although currently that is the only user. I'd be ok to otherwise initially restrict this and the sw_sync UABI (CAP_SYS_ADMIN? Or??) until there is a non-igt user, but they are both needed by the igt tests BR, -R > -Daniel > > > --- > > drivers/dma-buf/sync_file.c| 19 +++ > > include/uapi/linux/sync_file.h | 20 > > 2 files changed, 39 insertions(+) > > > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > > index 394e6e1e9686..f295772d5169 100644 > > --- a/drivers/dma-buf/sync_file.c > > +++ b/drivers/dma-buf/sync_file.c > > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct > > sync_file *sync_file, > > return ret; > > } > > > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > > + unsigned long arg) > > +{ > > + struct sync_set_deadline ts; > > + > > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > > + return -EFAULT; > > + > > + if (ts.pad) > > + return -EINVAL; > > + > > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > > ts.tv_nsec)); > > + > > + return 0; > > +} > > + > > static long sync_file_ioctl(struct file *file, unsigned int cmd, > > unsigned long arg) > > { > > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned > > int cmd, > > case SYNC_IOC_FILE_INFO: > > return sync_file_ioctl_fence_info(sync_file, arg); > > > > + case SYNC_IOC_SET_DEADLINE: > > + return sync_file_ioctl_set_deadline(sync_file, arg); > > + > > default: > > return -ENOTTY; > > } > > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h > > index ee2dcfb3d660..f67d4ffe7566 100644 > > --- a/include/uapi/linux/sync_file.h > > +++ b/include/uapi/linux/sync_file.h > > @@ -67,6 +67,18 @@ struct sync_file_info { > > __u64 sync_fence_info; > > }; > > > > +/** > > + * struct sync_set_deadline - set a deadline on a fence > > + * @tv_sec: seconds elapsed since epoch > > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec > > + * @pad: must be zero > > + */ > > +struct sync_set_deadline { > > + __s64 tv_sec; > > + __s32 tv_nsec; > > + __u32 pad; > > +}; > > + > > #define SYNC_IOC_MAGIC '>' > > > > /** > > @@ -95,4 +107,12 @@ struct sync_file_info { > > */ > > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct > > sync_file_info) > > > > + > > +/** > > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence > > + * > > + * Allows userspace to set a deadline on a fence, see > > dma_fence_set_deadline() > > + */ > > +#define SYNC_IOC_SET_DEADLINE_IOW(SYNC_IOC_MAGIC, 5, struct > > sync_set_deadline) > > + > > #endif /* _UAPI_LINUX_SYNC_H */ > > -- > > 2.31.1 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
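Since userspace for the new ioctl keeps coming up in this thread, here is a minimal sketch of what a caller (an igt test or a compositor) could look like. The structure layout and ioctl number are copied from the patch quoted above (a real build would get them from the updated <linux/sync_file.h>); the helper name is made up, and CLOCK_MONOTONIC is an assumption based on the fence side comparing deadlines against ktime_get() elsewhere in this series.

/*
 * Minimal userspace sketch: set a deadline ns_from_now nanoseconds out on a
 * sync_file fd.  Error handling reduced to the bare minimum.
 */
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>

struct sync_set_deadline {
        int64_t  tv_sec;
        int32_t  tv_nsec;
        uint32_t pad;           /* must be zero */
};

#define SYNC_IOC_MAGIC          '>'
#define SYNC_IOC_SET_DEADLINE   _IOW(SYNC_IOC_MAGIC, 5, struct sync_set_deadline)

/* Ask the signaler to try to finish within ns_from_now nanoseconds. */
static int sync_file_set_deadline(int sync_fd, int64_t ns_from_now)
{
        struct sync_set_deadline arg;
        struct timespec now;
        int64_t ns;

        clock_gettime(CLOCK_MONOTONIC, &now);
        ns = now.tv_nsec + ns_from_now;

        memset(&arg, 0, sizeof(arg));
        arg.tv_sec = now.tv_sec + ns / 1000000000LL;
        arg.tv_nsec = ns % 1000000000LL;

        return ioctl(sync_fd, SYNC_IOC_SET_DEADLINE, &arg);
}

A compositor would call something like this on the fence of the frame it is about to pick, roughly one refresh period before its flip deadline.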
[PATCH 2/2] drm/bridge: parade-ps8640: Add support for AUX channel
Implement the first version of AUX support, which will be useful as we expand the driver to support varied use cases. Signed-off-by: Philip Chen --- drivers/gpu/drm/bridge/parade-ps8640.c | 123 + 1 file changed, 123 insertions(+) diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c b/drivers/gpu/drm/bridge/parade-ps8640.c index a16725dbf912..3f0241a60357 100644 --- a/drivers/gpu/drm/bridge/parade-ps8640.c +++ b/drivers/gpu/drm/bridge/parade-ps8640.c @@ -9,15 +9,36 @@ #include #include #include +#include #include #include #include +#include #include #include #include #include +#define PAGE0_AUXCH_CFG3 0x76 +#define AUXCH_CFG3_RESET 0xff +#define PAGE0_AUX_ADDR_7_0 0x7d +#define PAGE0_AUX_ADDR_15_80x7e +#define PAGE0_AUX_ADDR_23_16 0x7f +#define AUX_ADDR_19_16_MASK GENMASK(3, 0) +#define AUX_CMD_MASK GENMASK(7, 4) +#define PAGE0_AUX_LENGTH 0x80 +#define AUX_LENGTH_MASK GENMASK(3, 0) +#define PAGE0_AUX_WDATA0x81 +#define PAGE0_AUX_RDATA0x82 +#define PAGE0_AUX_CTRL 0x83 +#define AUX_START 0x01 +#define PAGE0_AUX_STATUS 0x84 +#define AUX_STATUS_MASK GENMASK(7, 5) +#define AUX_STATUS_TIMEOUT(0x7 << 5) +#define AUX_STATUS_DEFER (0x2 << 5) +#define AUX_STATUS_NACK (0x1 << 5) + #define PAGE2_GPIO_H 0xa7 #define PS_GPIO9 BIT(1) #define PAGE2_I2C_BYPASS 0xea @@ -63,6 +84,7 @@ enum ps8640_vdo_control { struct ps8640 { struct drm_bridge bridge; struct drm_bridge *panel_bridge; + struct drm_dp_aux aux; struct mipi_dsi_device *dsi; struct i2c_client *page[MAX_DEVS]; struct regmap *regmap[MAX_DEVS]; @@ -93,6 +115,102 @@ static inline struct ps8640 *bridge_to_ps8640(struct drm_bridge *e) return container_of(e, struct ps8640, bridge); } +static inline struct ps8640 *aux_to_ps8640(struct drm_dp_aux *aux) +{ + return container_of(aux, struct ps8640, aux); +} + +static ssize_t ps8640_aux_transfer(struct drm_dp_aux *aux, + struct drm_dp_aux_msg *msg) +{ + struct ps8640 *ps_bridge = aux_to_ps8640(aux); + struct i2c_client *client = ps_bridge->page[PAGE0_DP_CNTL]; + struct regmap *map = ps_bridge->regmap[PAGE0_DP_CNTL]; + unsigned int len = msg->size; + unsigned int data; + int ret; + u8 request = msg->request & +~(DP_AUX_I2C_MOT | DP_AUX_I2C_WRITE_STATUS_UPDATE); + u8 *buf = msg->buffer; + bool is_native_aux = false; + + if (len > DP_AUX_MAX_PAYLOAD_BYTES) + return -EINVAL; + + pm_runtime_get_sync(&client->dev); + + switch (request) { + case DP_AUX_NATIVE_WRITE: + case DP_AUX_NATIVE_READ: + is_native_aux = true; + case DP_AUX_I2C_WRITE: + case DP_AUX_I2C_READ: + regmap_write(map, PAGE0_AUXCH_CFG3, AUXCH_CFG3_RESET); + break; + default: + ret = -EINVAL; + goto exit; + } + + /* Assume it's good */ + msg->reply = 0; + + data = ((request << 4) & AUX_CMD_MASK) | + ((msg->address >> 16) & AUX_ADDR_19_16_MASK); + regmap_write(map, PAGE0_AUX_ADDR_23_16, data); + data = (msg->address >> 8) & 0xff; + regmap_write(map, PAGE0_AUX_ADDR_15_8, data); + data = msg->address & 0xff; + regmap_write(map, PAGE0_AUX_ADDR_7_0, msg->address & 0xff); + + data = (len - 1) & AUX_LENGTH_MASK; + regmap_write(map, PAGE0_AUX_LENGTH, data); + + if (request == DP_AUX_NATIVE_WRITE || request == DP_AUX_I2C_WRITE) { + ret = regmap_noinc_write(map, PAGE0_AUX_WDATA, buf, len); + if (ret < 0) { + DRM_ERROR("failed to write PAGE0_AUX_WDATA"); + goto exit; + } + } + + regmap_write(map, PAGE0_AUX_CTRL, AUX_START); + + regmap_read(map, PAGE0_AUX_STATUS, &data); + switch (data & AUX_STATUS_MASK) { + case AUX_STATUS_DEFER: + if (is_native_aux) + msg->reply |= DP_AUX_NATIVE_REPLY_DEFER; + else + msg->reply |= DP_AUX_I2C_REPLY_DEFER; + goto exit; + case 
AUX_STATUS_NACK: + if (is_native_aux) + msg->reply |= DP_AUX_NATIVE_REPLY_NACK; + else + msg->reply |= DP_AUX_I2C_REPLY_NACK; + goto exit; + case AUX_STATUS_TIMEOUT: + ret = -ETIMEDOUT; + goto exit; + } + + if (request == DP_AUX_NATIVE_READ || request == DP_AUX_I2C_READ) { + ret = regmap_noinc_read(map, PAGE0_AUX_RDATA, buf, len); + if (ret < 0) + DRM_ERROR("failed to read PAGE0_AUX_RDATA"); + } + +exit: + pm_runtime_mark_last_busy
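For context, this is roughly how the transfer hook above gets wired into the DRM DP helpers and then used. The registration step sits in the part of the patch that is truncated above, so the code below is a sketch of the usual drm_dp_aux pattern rather than a quote of the actual patch; the function names are illustrative.

#include <drm/drm_dp_helper.h>

static void ps8640_aux_setup(struct ps8640 *ps_bridge, struct device *dev)
{
        ps_bridge->aux.name = "parade-ps8640-aux";
        ps_bridge->aux.dev = dev;
        ps_bridge->aux.transfer = ps8640_aux_transfer;
        drm_dp_aux_init(&ps_bridge->aux);
        /* drm_dp_aux_register() follows later, from bridge attach, once
         * aux.drm_dev can point at the DRM device. */
}

/* Consumers then go through the generic DPCD helpers, which land in
 * ps8640_aux_transfer() above: */
static int ps8640_read_dpcd_rev(struct ps8640 *ps_bridge, u8 *rev)
{
        ssize_t ret = drm_dp_dpcd_readb(&ps_bridge->aux, DP_DPCD_REV, rev);

        return ret < 0 ? ret : 0;
}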
[PATCH 1/2] drm/bridge: parade-ps8640: Use regmap APIs
Replace the direct i2c access (i2c_smbus_* functions) with regmap APIs, which will simplify the future update on ps8640 driver. Signed-off-by: Philip Chen --- drivers/gpu/drm/bridge/parade-ps8640.c | 66 +++--- 1 file changed, 39 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/bridge/parade-ps8640.c b/drivers/gpu/drm/bridge/parade-ps8640.c index 685e9c38b2db..a16725dbf912 100644 --- a/drivers/gpu/drm/bridge/parade-ps8640.c +++ b/drivers/gpu/drm/bridge/parade-ps8640.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include @@ -64,12 +65,29 @@ struct ps8640 { struct drm_bridge *panel_bridge; struct mipi_dsi_device *dsi; struct i2c_client *page[MAX_DEVS]; + struct regmap *regmap[MAX_DEVS]; struct regulator_bulk_data supplies[2]; struct gpio_desc *gpio_reset; struct gpio_desc *gpio_powerdown; bool powered; }; +static const struct regmap_range ps8640_volatile_ranges[] = { + { .range_min = 0, .range_max = 0xff }, +}; + +static const struct regmap_access_table ps8640_volatile_table = { + .yes_ranges = ps8640_volatile_ranges, + .n_yes_ranges = ARRAY_SIZE(ps8640_volatile_ranges), +}; + +static const struct regmap_config ps8640_regmap_config = { + .reg_bits = 8, + .val_bits = 8, + .volatile_table = &ps8640_volatile_table, + .cache_type = REGCACHE_NONE, +}; + static inline struct ps8640 *bridge_to_ps8640(struct drm_bridge *e) { return container_of(e, struct ps8640, bridge); @@ -78,13 +96,13 @@ static inline struct ps8640 *bridge_to_ps8640(struct drm_bridge *e) static int ps8640_bridge_vdo_control(struct ps8640 *ps_bridge, const enum ps8640_vdo_control ctrl) { - struct i2c_client *client = ps_bridge->page[PAGE3_DSI_CNTL1]; - u8 vdo_ctrl_buf[] = { VDO_CTL_ADD, ctrl }; + struct regmap *map = ps_bridge->regmap[PAGE3_DSI_CNTL1]; + u8 vdo_ctrl_buf[] = {VDO_CTL_ADD, ctrl}; int ret; - ret = i2c_smbus_write_i2c_block_data(client, PAGE3_SET_ADD, -sizeof(vdo_ctrl_buf), -vdo_ctrl_buf); + ret = regmap_bulk_write(map, PAGE3_SET_ADD, + vdo_ctrl_buf, sizeof(vdo_ctrl_buf)); + if (ret < 0) { DRM_ERROR("failed to %sable VDO: %d\n", ctrl == ENABLE ? "en" : "dis", ret); @@ -96,8 +114,7 @@ static int ps8640_bridge_vdo_control(struct ps8640 *ps_bridge, static void ps8640_bridge_poweron(struct ps8640 *ps_bridge) { - struct i2c_client *client = ps_bridge->page[PAGE2_TOP_CNTL]; - unsigned long timeout; + struct regmap *map = ps_bridge->regmap[PAGE2_TOP_CNTL]; int ret, status; if (ps_bridge->powered) @@ -121,18 +138,12 @@ static void ps8640_bridge_poweron(struct ps8640 *ps_bridge) */ msleep(200); - timeout = jiffies + msecs_to_jiffies(200) + 1; + ret = regmap_read_poll_timeout(map, PAGE2_GPIO_H, status, + status & PS_GPIO9, 20 * 1000, 200 * 1000); - while (time_is_after_jiffies(timeout)) { - status = i2c_smbus_read_byte_data(client, PAGE2_GPIO_H); - if (status < 0) { - DRM_ERROR("failed read PAGE2_GPIO_H: %d\n", status); - goto err_regulators_disable; - } - if ((status & PS_GPIO9) == PS_GPIO9) - break; - - msleep(20); + if (ret < 0) { + DRM_ERROR("failed read PAGE2_GPIO_H: %d\n", ret); + goto err_regulators_disable; } msleep(50); @@ -144,22 +155,15 @@ static void ps8640_bridge_poweron(struct ps8640 *ps_bridge) * disabled by the manufacturer. Once disabled, all MCS commands are * ignored by the display interface. 
*/ - status = i2c_smbus_read_byte_data(client, PAGE2_MCS_EN); - if (status < 0) { - DRM_ERROR("failed read PAGE2_MCS_EN: %d\n", status); - goto err_regulators_disable; - } - ret = i2c_smbus_write_byte_data(client, PAGE2_MCS_EN, - status & ~MCS_EN); + ret = regmap_update_bits(map, PAGE2_MCS_EN, MCS_EN, 0); if (ret < 0) { DRM_ERROR("failed write PAGE2_MCS_EN: %d\n", ret); goto err_regulators_disable; } /* Switch access edp panel's edid through i2c */ - ret = i2c_smbus_write_byte_data(client, PAGE2_I2C_BYPASS, - I2C_BYPASS_EN); + ret = regmap_write(map, PAGE2_I2C_BYPASS, I2C_BYPASS_EN); if (ret < 0) { DRM_ERROR("failed write PAGE2_I2C_BYPASS: %d\n", ret); goto err_regulators_disable; @@ -361,6 +365,10 @@ static int ps8640_probe(struct i2c_client *client)
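The visible half of the conversion is regmap_read_poll_timeout() replacing the hand-rolled jiffies polling loop and regmap_update_bits() folding the PAGE2_MCS_EN read-modify-write into a single call. The probe-side half, creating the regmaps, is cut off above; the sketch below shows what it presumably looks like, one regmap per page client using the config from the patch, though the real hunk may be shaped differently.

#include <linux/regmap.h>

static int ps8640_init_regmaps(struct ps8640 *ps_bridge)
{
        unsigned int i;

        for (i = 0; i < MAX_DEVS; i++) {
                ps_bridge->regmap[i] =
                        devm_regmap_init_i2c(ps_bridge->page[i],
                                             &ps8640_regmap_config);
                if (IS_ERR(ps_bridge->regmap[i]))
                        return PTR_ERR(ps_bridge->regmap[i]);
        }

        return 0;
}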
Re: [PATCH] doc: gpu: Add document describing buffer exchange
On Sun, Sep 05, 2021 at 01:27:42PM +0100, Daniel Stone wrote: > Since there's a lot of confusion around this, document both the rules > and the best practice around negotiating, allocating, importing, and > using buffers when crossing context/process/device/subsystem boundaries. > > This ties up all of dmabuf, formats and modifiers, and their usage. > > Signed-off-by: Daniel Stone > --- > > This is just a quick first draft, inspired by: > https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3197#note_1048637 > > It's not complete or perfect, but I'm off to eat a roast then have a > nice walk in the sun, so figured it'd be better to dash it off rather > than let it rot on my hard drive. > > > .../gpu/exchanging-pixel-buffers.rst | 285 ++ I think we should stuff this into the dma-buf.rst page instead of hiding it in gpu? Maybe then link to it from everywhere, so from the prime stuff in gpu, and from whatever doc there is for the v4l import/export ioctls. > Documentation/gpu/index.rst | 1 + > 2 files changed, 286 insertions(+) > create mode 100644 Documentation/gpu/exchanging-pixel-buffers.rst > > diff --git a/Documentation/gpu/exchanging-pixel-buffers.rst > b/Documentation/gpu/exchanging-pixel-buffers.rst > new file mode 100644 > index 000000000000..75c4de13d5c8 > --- /dev/null > +++ b/Documentation/gpu/exchanging-pixel-buffers.rst > @@ -0,0 +1,285 @@ > +.. Copyright 2021 Collabora Ltd. > + > +======================== > +Exchanging pixel buffers > +======================== > + > +As originally designed, the Linux graphics subsystem had extremely limited > +support for sharing pixel-buffer allocations between processes, devices, and > +subsystems. Modern systems require extensive integration between all three > +classes; this document details how applications and kernel subsystems should > +approach this sharing for two-dimensional image data. > + > +It is written with reference to the DRM subsystem for GPU and display > devices, > +V4L2 for media devices, and also to Vulkan, EGL and Wayland, for userspace > +support, however any other subsystems should also follow this design and > advice. > + > + > +Formats and modifiers > +===================== > + > +Each buffer must have an underlying format. This format describes the data > which > +can be stored and loaded for each pixel. Although each subsystem has its own > +format descriptions (e.g. V4L2 and fbdev), the `DRM_FORMAT_*` tokens should > be > +reused wherever possible, as they are the standard descriptions used for > +interchange. > + > +Each `DRM_FORMAT_*` token describes the per-pixel data available, in terms of > +the translation between one or more pixels in memory, and the color data > +contained within that memory. The number and type of color channels are > +described: whether they are RGB or YUV, integer or floating-point, the size > +of each channel and their locations within the pixel memory, and the > +relationship between color planes. > + > +For example, `DRM_FORMAT_ARGB8888` describes a format in which each pixel > has a > +single 32-bit value in memory. Alpha, red, green, and blue, color channels > are > +available at 8-bit precision per channel, ordered respectively from most to > +least significant bits in little-endian storage. As a more complex example, > +`DRM_FORMAT_NV12` describes a format in which luma and chroma YUV samples are > +stored in separate memory planes, where the chroma plane is stored at half > the > +resolution in both dimensions (i.e. one U/V chroma sample is stored for each > 2x2 > +pixel grouping).
> + > +Format modifiers describe a translation mechanism between these per-pixel > memory > +samples, and the actual memory storage for the buffer. The most > straightforward > +modifier is `DRM_FORMAT_MOD_LINEAR`, describing a scheme in which each pixel > has > +contiguous storage beginning at (0,0); each pixel's location in memory will > be > +`base + (y * stride) + (x * bpp)`. This is considered the baseline > interchange > +format, and most convenient for CPU access. > + > +Modern hardware employs much more sophisticated access mechanisms, typically > +making use of tiled access and possibly also compression. For example, the > +`DRM_FORMAT_MOD_VIVANTE_TILED` modifier describes memory storage where pixels > +are stored in 4x4 blocks arranged in row-major ordering, i.e. the first tile > in > +memory stores pixels (0,0) to (3,3) inclusive, and the second tile in memory > +stores pixels (4,0) to (7,3) inclusive. > + > +Some modifiers may modify the number of memory buffers required to store the > +data; for example, the `I915_FORMAT_MOD_Y_TILED_CCS` modifier adds a second > +memory buffer to RGB formats in which it stores data about the status of > every > +tile, notably including whether the tile is fully populated with pixel data, > or > +can be expanded from a single solid color. > + > +These extended layouts are highly vendor-spe
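A small worked example of the addressing rules described in the quoted document: the linear case follows the base + (y * stride) + (x * bpp) formula for DRM_FORMAT_ARGB8888 (4 bytes per pixel), and the tiled case follows the 4x4 row-major tile ordering used in the Vivante illustration. The within-tile pixel order is assumed row-major purely for the sake of the example.

#include <stdint.h>

/* Linear: base + (y * stride) + (x * bpp), bpp = 4 for ARGB8888 */
static inline uint32_t *argb8888_linear_pixel(void *base, uint32_t stride,
                                              uint32_t x, uint32_t y)
{
        return (uint32_t *)((uint8_t *)base + (uint64_t)y * stride + x * 4);
}

/* 4x4 tiles laid out row-major: the first tile holds (0,0)..(3,3), the
 * second (4,0)..(7,3), and so on; width assumed to be a multiple of 4. */
static inline uint32_t *argb8888_tile4x4_pixel(void *base, uint32_t width,
                                               uint32_t x, uint32_t y)
{
        uint32_t tiles_per_row = width / 4;
        uint32_t tile_index = (y / 4) * tiles_per_row + (x / 4);
        uint32_t offset_in_tile = (y % 4) * 4 + (x % 4);   /* assumed order */

        return (uint32_t *)base + tile_index * 16 + offset_in_tile;
}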
Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support
On Wed, Sep 8, 2021 at 10:54 AM Daniel Vetter wrote: > > On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote: > > From: Rob Clark > > > > Signed-off-by: Rob Clark > > --- > > drivers/dma-buf/dma-fence-chain.c | 13 + > > 1 file changed, 13 insertions(+) > > > > diff --git a/drivers/dma-buf/dma-fence-chain.c > > b/drivers/dma-buf/dma-fence-chain.c > > index 1b4cb3e5cec9..736a9ad3ea6d 100644 > > --- a/drivers/dma-buf/dma-fence-chain.c > > +++ b/drivers/dma-buf/dma-fence-chain.c > > @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence > > *fence) > > dma_fence_free(fence); > > } > > > > + > > +static void dma_fence_chain_set_deadline(struct dma_fence *fence, > > + ktime_t deadline) > > +{ > > + dma_fence_chain_for_each(fence, fence) { > > + struct dma_fence_chain *chain = to_dma_fence_chain(fence); > > + struct dma_fence *f = chain ? chain->fence : fence; > > Doesn't this just end up calling set_deadline on a chain, potenetially > resulting in recursion? Also I don't think this should ever happen, why > did you add that? Tbh the fence-chain was the part I was a bit fuzzy about, and the main reason I added igt tests. The iteration is similar to how, for ex, dma_fence_chain_signaled() work, and according to the igt test it does what was intended BR, -R > -Daniel > > > + > > + dma_fence_set_deadline(f, deadline); > > + } > > +} > > + > > const struct dma_fence_ops dma_fence_chain_ops = { > > .use_64bit_seqno = true, > > .get_driver_name = dma_fence_chain_get_driver_name, > > @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = { > > .enable_signaling = dma_fence_chain_enable_signaling, > > .signaled = dma_fence_chain_signaled, > > .release = dma_fence_chain_release, > > + .set_deadline = dma_fence_chain_set_deadline, > > }; > > EXPORT_SYMBOL(dma_fence_chain_ops); > > > > -- > > 2.31.1 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
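To spell out Rob's answer, the callback from the patch is repeated below with comments on what each step operates on. The comments assume the usual dma-fence-chain convention that the fences contained in the links are not themselves chain fences.

#include <linux/dma-fence-chain.h>

static void dma_fence_chain_set_deadline(struct dma_fence *fence,
                                         ktime_t deadline)
{
        /* dma_fence_chain_for_each() walks the chain links themselves,
         * following ->prev, rather than calling back into the chain's
         * dma_fence_ops. */
        dma_fence_chain_for_each(fence, fence) {
                struct dma_fence_chain *chain = to_dma_fence_chain(fence);
                /* A chain link upcasts, and we forward the deadline to the
                 * fence it contains; the final ->prev element may be a
                 * plain fence, which is forwarded as-is. */
                struct dma_fence *f = chain ? chain->fence : fence;

                dma_fence_set_deadline(f, deadline);
        }
}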
Re: [PATCH v2 7/7] drm/gud: Add module parameter to control emulation: xrgb8888
Hi Am 07.09.21 um 13:57 schrieb Noralf Trønnes: For devices that don't support XRGB give the user the ability to choose what's most important: Color depth or frames per second. Add an 'xrgb' module parameter to override the emulation format. Assume the user wants full control if xrgb is set and don't set DRM_CAP_DUMB_PREFERRED_DEPTH if RGB565 is supported (AFAIK only X.org supports this). More of a general statement: wouldn't it make more sense to auto-detect this entirely? The GUD protocol could order the list of supported formats by preference (maybe it does already). Or you could take the type of USB connection into account. Additionally, xrgb is really a fall-back for lazy userspace programs, but userspace should do better IMHO. Best regards Thomas Signed-off-by: Noralf Trønnes --- drivers/gpu/drm/gud/gud_drv.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/gud/gud_drv.c b/drivers/gpu/drm/gud/gud_drv.c index 3f9d4b9a1e3d..60d27ee5ddbd 100644 --- a/drivers/gpu/drm/gud/gud_drv.c +++ b/drivers/gpu/drm/gud/gud_drv.c @@ -30,6 +30,10 @@ #include "gud_internal.h" +static int gud_xrgb; +module_param_named(xrgb, gud_xrgb, int, 0644); +MODULE_PARM_DESC(xrgb, "XRGB emulation format: GUD_PIXEL_FORMAT_* value, 0=auto, -1=disable [default=auto]"); + /* Only used internally */ static const struct drm_format_info gud_drm_format_r1 = { .format = GUD_DRM_FORMAT_R1, @@ -530,12 +534,12 @@ static int gud_probe(struct usb_interface *intf, const struct usb_device_id *id) case DRM_FORMAT_RGB332: fallthrough; case DRM_FORMAT_RGB888: - if (!xrgb_emulation_format) + if (!gud_xrgb && !xrgb_emulation_format) xrgb_emulation_format = info; break; case DRM_FORMAT_RGB565: rgb565_supported = true; - if (!xrgb_emulation_format) + if (!gud_xrgb && !xrgb_emulation_format) xrgb_emulation_format = info; break; case DRM_FORMAT_XRGB: @@ -543,6 +547,9 @@ static int gud_probe(struct usb_interface *intf, const struct usb_device_id *id) break; } + if (gud_xrgb == formats_dev[i]) + xrgb_emulation_format = info; + fmt_buf_size = drm_format_info_min_pitch(info, 0, drm->mode_config.max_width) * drm->mode_config.max_height; max_buffer_size = max(max_buffer_size, fmt_buf_size); @@ -559,7 +566,7 @@ static int gud_probe(struct usb_interface *intf, const struct usb_device_id *id) } /* Prefer speed over color depth */ - if (rgb565_supported) + if (!gud_xrgb && rgb565_supported) drm->mode_config.preferred_depth = 16; if (!xrgb_supported && xrgb_emulation_format) { -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH] drm/rockchip: Update crtc fixup to account for fractional clk change
On Wed, Sep 08, 2021 at 08:53:56AM -0500, Chris Morgan wrote: > From: Chris Morgan > > After commit 928f9e268611 ("clk: fractional-divider: Hide > clk_fractional_divider_ops from wide audience") was merged it appears > that the DSI panel on my Odroid Go Advance stopped working. Upon closer > examination of the problem, it looks like the fixup in the > rockchip_drm_vop.c file was causing the issue. The changes made to the > clk driver appear to change some assumptions made in the fixup. > > After debugging the working 5.14 kernel and the no-longer working > 5.15 kernel, it looks like this was broken all along but still > worked, whereas after the fractional clock change it stopped > working despite the issue (it went from sort-of broken to very broken). > > In the 5.14 kernel the dclk_vopb_frac was being requested to be set to > 17000999 on my board. The clock driver was taking the value of the > parent clock and attempting to divide the requested value from it > (17000000/17000999 = 0), then subtracting 1 from it (making it -1), > and running it through fls_long to get 64. It would then subtract > the value of fd->mwidth from it to get 48, and then bit shift > 17000999 to the left by 48, coming up with a very large number of > 7649082492112076800. This resulted in a numerator of 65535 and a > denominator of 1 from the clk driver. The driver seemingly would > try again and get a correct 1:1 value later, and then move on. > > Output from my 5.14 kernel (with some printfs for good measure): > [2.830066] rockchip-drm display-subsystem: bound ff460000.vop (ops > vop_component_ops) > [2.839431] rockchip-drm display-subsystem: bound ff450000.dsi (ops > dw_mipi_dsi_rockchip_ops) > [2.855980] Clock is dclk_vopb_frac > [2.856004] Scale 64, Rate 7649082492112076800, Oldrate 17000999, Parent > Rate 17000000, Best Numerator 65535, Best Denominator 1, fd->mwidth 16 > [2.903529] Clock is dclk_vopb_frac > [2.903556] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate > 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > [2.903579] Clock is dclk_vopb_frac > [2.903583] Scale 0, Rate 17000000, Oldrate 17000000, Parent Rate > 17000000, Best Numerator 1, Best Denominator 1, fd->mwidth 16 > > Contrast this with 5.15 after the clk change where the rate of 17000999 > was getting passed and resulted in numerators/denominators of 17001/ > 17000. > > Output from my 5.15 kernel (with some printfs added for good measure): > [2.817571] rockchip-drm display-subsystem: bound ff460000.vop (ops > vop_component_ops) > [2.826975] rockchip-drm display-subsystem: bound ff450000.dsi (ops > dw_mipi_dsi_rockchip_ops) > [2.843430] Rate 17000999, Parent Rate 17000000, Best Numerator 17018, > Best Denominator 17017 > [2.891073] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, > Best Denominator 17000 > [2.891269] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, > Best Denominator 17000 > [2.891281] Rate 17001000, Parent Rate 17000000, Best Numerator 17001, > Best Denominator 17000 > > After tracing through the code it appeared that this function here was > adding a 999 to the requested frequency because of how the clk driver > was rounding/accepting those frequencies. I believe after the changes > made in the commit listed above the assumptions listed in this driver > are no longer true. When I remove the + 999 from the driver the DSI > panel begins to work again.
> > Output from my 5.15 kernel with 999 removed (printfs added): > [2.852054] rockchip-drm display-subsystem: bound ff460000.vop (ops > vop_component_ops) > [2.864483] rockchip-drm display-subsystem: bound ff450000.dsi (ops > dw_mipi_dsi_rockchip_ops) > [2.880869] Clock is dclk_vopb_frac > [2.880892] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best > Denominator 1 > [2.928521] Clock is dclk_vopb_frac > [2.928551] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best > Denominator 1 > [2.928570] Clock is dclk_vopb_frac > [2.928574] Rate 17000000, Parent Rate 17000000, Best Numerator 1, Best > Denominator 1 > > I have tested the change extensively on my Odroid Go Advance (Rockchip > RK3326) and it appears to work well. However, this change will affect > all Rockchip SoCs that use this driver so I believe further testing > is warranted. Please note that without this change I can confirm > at least all PX30s with DSI panels will stop working with the 5.15 > kernel. To me it all makes a lot of sense, thank you for deep analysis of the issue! In any case I think we will need a Fixes tag to something (either one of clk-fractional-divider.c series or preexisted). Anyway, FWIW, Reviewed-by: Andy Shevchenko > Signed-off-by: Chris Morgan > --- > drivers/gpu/drm/rockchip/rockchip_drm_vop.c | 21 +++-- > 1 file changed, 3 insertions(+), 18 deletions(-) > > diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
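The arithmetic in the commit message can be reproduced with a few lines of userspace C, using the values from the 5.14 log above (17000000 Hz parent, 17000999 Hz request, fd->mwidth of 16). This mimics the described computation only; it is not the actual clk-fractional-divider code.

#include <stdio.h>

/* Userspace stand-in for the kernel's fls_long(); assumes 64-bit long. */
static int fls_long_emul(unsigned long x)
{
        return x ? 64 - __builtin_clzl(x) : 0;
}

int main(void)
{
        unsigned long parent = 17000000, rate = 17000999, mwidth = 16;

        /* 17000000 / 17000999 = 0; subtracting 1 wraps to ULONG_MAX */
        unsigned long div = parent / rate - 1;
        int scale = fls_long_emul(div);         /* 64 */
        int shift = scale - mwidth;             /* 48 */

        /* 17000999 << 48 truncates on 64-bit to 7649082492112076800 */
        unsigned long long scaled = (unsigned long long)rate << shift;

        printf("scale=%d shift=%d scaled_rate=%llu\n", scale, shift, scaled);
        return 0;
}

This prints scale=64 shift=48 scaled_rate=7649082492112076800, matching the first 5.14 log line; feeding that scaled rate and the 17000000 parent into a best-rational search bounded by 16-bit numerator/denominator is what produces the 65535/1 result in the same line.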
Re: [PATCH v3 6/9] dma-buf/fence-array: Add fence deadline support
On Fri, Sep 03, 2021 at 11:47:57AM -0700, Rob Clark wrote: > From: Rob Clark > > Signed-off-by: Rob Clark > --- > drivers/dma-buf/dma-fence-array.c | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/dma-buf/dma-fence-array.c > b/drivers/dma-buf/dma-fence-array.c > index d3fbd950be94..8d194b09ee3d 100644 > --- a/drivers/dma-buf/dma-fence-array.c > +++ b/drivers/dma-buf/dma-fence-array.c > @@ -119,12 +119,23 @@ static void dma_fence_array_release(struct dma_fence > *fence) > dma_fence_free(fence); > } > > +static void dma_fence_array_set_deadline(struct dma_fence *fence, > + ktime_t deadline) > +{ > + struct dma_fence_array *array = to_dma_fence_array(fence); > + unsigned i; > + > + for (i = 0; i < array->num_fences; ++i) > + dma_fence_set_deadline(array->fences[i], deadline); Hm I wonder whether this can go wrong, and whether we need Christian's massive fence iterator that I've seen flying around. If you nest these things too much it could all go wrong I think. I looked at other users which inspect dma_fence_array and none of them have a risk for unbounded recursion. Maybe check with Christian. -Daniel > +} > + > const struct dma_fence_ops dma_fence_array_ops = { > .get_driver_name = dma_fence_array_get_driver_name, > .get_timeline_name = dma_fence_array_get_timeline_name, > .enable_signaling = dma_fence_array_enable_signaling, > .signaled = dma_fence_array_signaled, > .release = dma_fence_array_release, > + .set_deadline = dma_fence_array_set_deadline, > }; > EXPORT_SYMBOL(dma_fence_array_ops); > > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
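A sketch of the nesting case the question is about: nothing in the dma_fence_array API prevents an array from containing another array, in which case the new callback descends one C stack frame per level. The helper below is purely illustrative; real users build arrays from leaf fences.

#include <linux/dma-fence-array.h>
#include <linux/slab.h>

static struct dma_fence *wrap_in_array(struct dma_fence *f, u64 context,
                                       unsigned int seqno)
{
        struct dma_fence **fences;
        struct dma_fence_array *array;

        fences = kmalloc_array(1, sizeof(*fences), GFP_KERNEL);
        if (!fences)
                return NULL;

        fences[0] = dma_fence_get(f);
        /* dma_fence_array_create() takes ownership of @fences and the refs */
        array = dma_fence_array_create(1, fences, context, seqno, false);
        if (!array) {
                dma_fence_put(fences[0]);
                kfree(fences);
                return NULL;
        }

        return &array->base;
}

/* wrap_in_array(wrap_in_array(leaf, ctx, 1), ctx, 2) yields a two-level
 * container; dma_fence_set_deadline() on the outer fence then runs
 * dma_fence_array_set_deadline() once per level before reaching the leaf. */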
Re: [PATCH v3 1/9] dma-fence: Add deadline awareness
On Fri, Sep 03, 2021 at 11:47:52AM -0700, Rob Clark wrote: > From: Rob Clark > > Add a way to hint to the fence signaler of an upcoming deadline, such as > vblank, which the fence waiter would prefer not to miss. This is to aid > the fence signaler in making power management decisions, like boosting > frequency as the deadline approaches and awareness of missing deadlines > so that can be factored in to the frequency scaling. > > v2: Drop dma_fence::deadline and related logic to filter duplicate > deadlines, to avoid increasing dma_fence size. The fence-context > implementation will need similar logic to track deadlines of all > the fences on the same timeline. [ckoenig] > > Signed-off-by: Rob Clark > Reviewed-by: Christian König > Signed-off-by: Rob Clark > --- > drivers/dma-buf/dma-fence.c | 20 > include/linux/dma-fence.h | 16 > 2 files changed, 36 insertions(+) > > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c > index ce0f5eff575d..1f444863b94d 100644 > --- a/drivers/dma-buf/dma-fence.c > +++ b/drivers/dma-buf/dma-fence.c > @@ -910,6 +910,26 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, > uint32_t count, > } > EXPORT_SYMBOL(dma_fence_wait_any_timeout); > > + > +/** > + * dma_fence_set_deadline - set desired fence-wait deadline > + * @fence:the fence that is to be waited on > + * @deadline: the time by which the waiter hopes for the fence to be > + *signaled > + * > + * Inform the fence signaler of an upcoming deadline, such as vblank, by > + * which point the waiter would prefer the fence to be signaled by. This > + * is intended to give feedback to the fence signaler to aid in power > + * management decisions, such as boosting GPU frequency if a periodic > + * vblank deadline is approaching. > + */ > +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) > +{ > + if (fence->ops->set_deadline && !dma_fence_is_signaled(fence)) > + fence->ops->set_deadline(fence, deadline); > +} > +EXPORT_SYMBOL(dma_fence_set_deadline); > + > /** > * dma_fence_init - Initialize a custom fence. > * @fence: the fence to initialize > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h > index 6ffb4b2c6371..9c809f0d5d0a 100644 > --- a/include/linux/dma-fence.h > +++ b/include/linux/dma-fence.h > @@ -99,6 +99,7 @@ enum dma_fence_flag_bits { > DMA_FENCE_FLAG_SIGNALED_BIT, > DMA_FENCE_FLAG_TIMESTAMP_BIT, > DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, > + DMA_FENCE_FLAG_HAS_DEADLINE_BIT, > DMA_FENCE_FLAG_USER_BITS, /* must always be last member */ > }; > > @@ -261,6 +262,19 @@ struct dma_fence_ops { >*/ > void (*timeline_value_str)(struct dma_fence *fence, > char *str, int size); > + > + /** > + * @set_deadline: > + * > + * Callback to allow a fence waiter to inform the fence signaler of an > + * upcoming deadline, such as vblank, by which point the waiter would > + * prefer the fence to be signaled by. This is intended to give > feedback > + * to the fence signaler to aid in power management decisions, such as > + * boosting GPU frequency. Please add here that this callback is called without &dma_fence.lock held, and that locking is up to callers if they have some state to manage. I realized that while scratching some heads over your later patches. -Daniel > + * > + * This callback is optional. 
> + */ > + void (*set_deadline)(struct dma_fence *fence, ktime_t deadline); > }; > > void dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops, > @@ -586,6 +600,8 @@ static inline signed long dma_fence_wait(struct dma_fence > *fence, bool intr) > return ret < 0 ? ret : 0; > } > > +void dma_fence_set_deadline(struct dma_fence *fence, ktime_t deadline); > + > struct dma_fence *dma_fence_get_stub(void); > struct dma_fence *dma_fence_allocate_private_stub(void); > u64 dma_fence_context_alloc(unsigned num); > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
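To illustrate the locking note requested above: because the core calls ->set_deadline() without the fence lock held, a driver that tracks an earliest deadline needs its own serialisation. The "foo" structures below are invented for the sketch and are not from any real driver.

#include <linux/dma-fence.h>
#include <linux/ktime.h>
#include <linux/spinlock.h>

struct foo_timeline {
        spinlock_t lock;                /* protects the two fields below */
        ktime_t earliest_deadline;
        bool has_deadline;
};

struct foo_fence {
        struct dma_fence base;
        struct foo_timeline *tl;
};

/* Plugged into the driver's dma_fence_ops as .set_deadline */
static void foo_fence_set_deadline(struct dma_fence *f, ktime_t deadline)
{
        struct foo_fence *ff = container_of(f, struct foo_fence, base);
        struct foo_timeline *tl = ff->tl;
        unsigned long flags;

        /* The core does not hold f->lock here, so take our own lock. */
        spin_lock_irqsave(&tl->lock, flags);
        if (!tl->has_deadline ||
            ktime_before(deadline, tl->earliest_deadline)) {
                tl->has_deadline = true;
                tl->earliest_deadline = deadline;
                /* e.g. kick a worker that bumps clocks, as in the later
                 * patches of this series */
        }
        spin_unlock_irqrestore(&tl->lock, flags);
}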
Re: [PATCH v3 7/9] dma-buf/fence-chain: Add fence deadline support
On Fri, Sep 03, 2021 at 11:47:58AM -0700, Rob Clark wrote: > From: Rob Clark > > Signed-off-by: Rob Clark > --- > drivers/dma-buf/dma-fence-chain.c | 13 + > 1 file changed, 13 insertions(+) > > diff --git a/drivers/dma-buf/dma-fence-chain.c > b/drivers/dma-buf/dma-fence-chain.c > index 1b4cb3e5cec9..736a9ad3ea6d 100644 > --- a/drivers/dma-buf/dma-fence-chain.c > +++ b/drivers/dma-buf/dma-fence-chain.c > @@ -208,6 +208,18 @@ static void dma_fence_chain_release(struct dma_fence > *fence) > dma_fence_free(fence); > } > > + > +static void dma_fence_chain_set_deadline(struct dma_fence *fence, > + ktime_t deadline) > +{ > + dma_fence_chain_for_each(fence, fence) { > + struct dma_fence_chain *chain = to_dma_fence_chain(fence); > + struct dma_fence *f = chain ? chain->fence : fence; Doesn't this just end up calling set_deadline on a chain, potentially resulting in recursion? Also I don't think this should ever happen, why did you add that? -Daniel > + > + dma_fence_set_deadline(f, deadline); > + } > +} > + > const struct dma_fence_ops dma_fence_chain_ops = { > .use_64bit_seqno = true, > .get_driver_name = dma_fence_chain_get_driver_name, > @@ -215,6 +227,7 @@ const struct dma_fence_ops dma_fence_chain_ops = { > .enable_signaling = dma_fence_chain_enable_signaling, > .signaled = dma_fence_chain_signaled, > .release = dma_fence_chain_release, > + .set_deadline = dma_fence_chain_set_deadline, > }; > EXPORT_SYMBOL(dma_fence_chain_ops); > > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v3 5/9] drm/msm: Add deadline based boost support
On Wed, Sep 8, 2021 at 10:48 AM Daniel Vetter wrote: > > On Fri, Sep 03, 2021 at 11:47:56AM -0700, Rob Clark wrote: > > From: Rob Clark > > > > Signed-off-by: Rob Clark > > Why do you need a kthread_work here? Is this just to make sure you're > running at realtime prio? Maybe a comment to that effect would be good. Mostly because we are already using a kthread_worker for things the GPU needs to kick off to a different context.. but I think this is something we'd want at a realtime prio BR, -R > -Daniel > > > --- > > drivers/gpu/drm/msm/msm_fence.c | 76 +++ > > drivers/gpu/drm/msm/msm_fence.h | 20 +++ > > drivers/gpu/drm/msm/msm_gpu.h | 1 + > > drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++ > > 4 files changed, 117 insertions(+) > > > > diff --git a/drivers/gpu/drm/msm/msm_fence.c > > b/drivers/gpu/drm/msm/msm_fence.c > > index f2cece542c3f..67c2a96e1c85 100644 > > --- a/drivers/gpu/drm/msm/msm_fence.c > > +++ b/drivers/gpu/drm/msm/msm_fence.c > > @@ -8,6 +8,37 @@ > > > > #include "msm_drv.h" > > #include "msm_fence.h" > > +#include "msm_gpu.h" > > + > > +static inline bool fence_completed(struct msm_fence_context *fctx, > > uint32_t fence); > > + > > +static struct msm_gpu *fctx2gpu(struct msm_fence_context *fctx) > > +{ > > + struct msm_drm_private *priv = fctx->dev->dev_private; > > + return priv->gpu; > > +} > > + > > +static enum hrtimer_restart deadline_timer(struct hrtimer *t) > > +{ > > + struct msm_fence_context *fctx = container_of(t, > > + struct msm_fence_context, deadline_timer); > > + > > + kthread_queue_work(fctx2gpu(fctx)->worker, &fctx->deadline_work); > > + > > + return HRTIMER_NORESTART; > > +} > > + > > +static void deadline_work(struct kthread_work *work) > > +{ > > + struct msm_fence_context *fctx = container_of(work, > > + struct msm_fence_context, deadline_work); > > + > > + /* If deadline fence has already passed, nothing to do: */ > > + if (fence_completed(fctx, fctx->next_deadline_fence)) > > + return; > > + > > + msm_devfreq_boost(fctx2gpu(fctx), 2); > > +} > > > > > > struct msm_fence_context * > > @@ -26,6 +57,13 @@ msm_fence_context_alloc(struct drm_device *dev, volatile > > uint32_t *fenceptr, > > fctx->fenceptr = fenceptr; > > spin_lock_init(&fctx->spinlock); > > > > + hrtimer_init(&fctx->deadline_timer, CLOCK_MONOTONIC, > > HRTIMER_MODE_ABS); > > + fctx->deadline_timer.function = deadline_timer; > > + > > + kthread_init_work(&fctx->deadline_work, deadline_work); > > + > > + fctx->next_deadline = ktime_get(); > > + > > return fctx; > > } > > > > @@ -49,6 +87,8 @@ void msm_update_fence(struct msm_fence_context *fctx, > > uint32_t fence) > > { > > spin_lock(&fctx->spinlock); > > fctx->completed_fence = max(fence, fctx->completed_fence); > > + if (fence_completed(fctx, fctx->next_deadline_fence)) > > + hrtimer_cancel(&fctx->deadline_timer); > > spin_unlock(&fctx->spinlock); > > } > > > > @@ -79,10 +119,46 @@ static bool msm_fence_signaled(struct dma_fence *fence) > > return fence_completed(f->fctx, f->base.seqno); > > } > > > > +static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t > > deadline) > > +{ > > + struct msm_fence *f = to_msm_fence(fence); > > + struct msm_fence_context *fctx = f->fctx; > > + unsigned long flags; > > + ktime_t now; > > + > > + spin_lock_irqsave(&fctx->spinlock, flags); > > + now = ktime_get(); > > + > > + if (ktime_after(now, fctx->next_deadline) || > > + ktime_before(deadline, fctx->next_deadline)) { > > + fctx->next_deadline = deadline; > > + fctx->next_deadline_fence = > > + max(fctx->next_deadline_fence, > > 
(uint32_t)fence->seqno); > > + > > + /* > > + * Set timer to trigger boost 3ms before deadline, or > > + * if we are already less than 3ms before the deadline > > + * schedule boost work immediately. > > + */ > > + deadline = ktime_sub(deadline, ms_to_ktime(3)); > > + > > + if (ktime_after(now, deadline)) { > > + kthread_queue_work(fctx2gpu(fctx)->worker, > > + &fctx->deadline_work); > > + } else { > > + hrtimer_start(&fctx->deadline_timer, deadline, > > + HRTIMER_MODE_ABS); > > + } > > + } > > + > > + spin_unlock_irqrestore(&fctx->spinlock, flags); > > +} > > + > > static const struct dma_fence_ops msm_fence_ops = { > > .get_driver_name = msm_fence_get_driver_name, > > .get_timeline_name = msm_fence_get_timeline_name, > > .signaled = msm_fence_signaled, > > + .set_deadline = msm_fence_set_deadline, > > }; > > > > struct dma_fence * > > diff --git a/d
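For reference, the pattern Rob describes, a dedicated kthread_worker whose task is moved into the realtime class so boost work is not starved by SCHED_OTHER load, looks roughly like the sketch below. The names are illustrative and this is not the actual msm code.

#include <linux/device.h>
#include <linux/err.h>
#include <linux/kthread.h>
#include <linux/sched.h>

static struct kthread_worker *foo_create_boost_worker(struct device *dev)
{
        struct kthread_worker *worker;

        worker = kthread_create_worker(0, "%s/boost", dev_name(dev));
        if (IS_ERR(worker))
                return worker;

        /* SCHED_FIFO at the lowest RT priority: gets ahead of normal tasks
         * without competing with hard realtime users. */
        sched_set_fifo_low(worker->task);

        return worker;
}

Work items queued with kthread_queue_work() on that worker, such as the deadline boost work in the quoted patch, then run from the realtime thread.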
Re: [PATCH 13/14] drm/kmb: Enable alpha blended second plane
Hi Am 03.08.21 um 07:10 schrieb Sam Ravnborg: Hi Anitha, On Mon, Aug 02, 2021 at 08:44:26PM +0000, Chrisanthus, Anitha wrote: Hi Sam, Thanks. Where should this go, drm-misc-fixes or drm-misc-next? Looks like a drm-misc-next candidate to me. It may improve something for existing users, but it does not look like it fixes an existing bug. I found this patch in drm-misc-fixes, although it doesn't look like a bugfix. It should have gone into drm-misc-next. See [1]. If it indeed belongs into drm-misc-fixes, it certainly should have contained a Fixes tag. Best regards Thomas [1] https://drm.pages.freedesktop.org/maintainer-tools/committer-drm-misc.html#where-do-i-apply-my-patch Sam -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer OpenPGP_signature Description: OpenPGP digital signature
Re: [PATCH v3 8/9] dma-buf/sync_file: Add SET_DEADLINE ioctl
On Fri, Sep 03, 2021 at 11:47:59AM -0700, Rob Clark wrote: > From: Rob Clark > > The initial purpose is for igt tests, but this would also be useful for > compositors that wait until close to vblank deadline to make decisions > about which frame to show. > > Signed-off-by: Rob Clark Needs userspace and I think ideally also some igts to make sure it works and doesn't go boom. -Daniel > --- > drivers/dma-buf/sync_file.c| 19 +++ > include/uapi/linux/sync_file.h | 20 > 2 files changed, 39 insertions(+) > > diff --git a/drivers/dma-buf/sync_file.c b/drivers/dma-buf/sync_file.c > index 394e6e1e9686..f295772d5169 100644 > --- a/drivers/dma-buf/sync_file.c > +++ b/drivers/dma-buf/sync_file.c > @@ -459,6 +459,22 @@ static long sync_file_ioctl_fence_info(struct sync_file > *sync_file, > return ret; > } > > +static int sync_file_ioctl_set_deadline(struct sync_file *sync_file, > + unsigned long arg) > +{ > + struct sync_set_deadline ts; > + > + if (copy_from_user(&ts, (void __user *)arg, sizeof(ts))) > + return -EFAULT; > + > + if (ts.pad) > + return -EINVAL; > + > + dma_fence_set_deadline(sync_file->fence, ktime_set(ts.tv_sec, > ts.tv_nsec)); > + > + return 0; > +} > + > static long sync_file_ioctl(struct file *file, unsigned int cmd, > unsigned long arg) > { > @@ -471,6 +487,9 @@ static long sync_file_ioctl(struct file *file, unsigned > int cmd, > case SYNC_IOC_FILE_INFO: > return sync_file_ioctl_fence_info(sync_file, arg); > > + case SYNC_IOC_SET_DEADLINE: > + return sync_file_ioctl_set_deadline(sync_file, arg); > + > default: > return -ENOTTY; > } > diff --git a/include/uapi/linux/sync_file.h b/include/uapi/linux/sync_file.h > index ee2dcfb3d660..f67d4ffe7566 100644 > --- a/include/uapi/linux/sync_file.h > +++ b/include/uapi/linux/sync_file.h > @@ -67,6 +67,18 @@ struct sync_file_info { > __u64 sync_fence_info; > }; > > +/** > + * struct sync_set_deadline - set a deadline on a fence > + * @tv_sec: seconds elapsed since epoch > + * @tv_nsec: nanoseconds elapsed since the time given by the tv_sec > + * @pad: must be zero > + */ > +struct sync_set_deadline { > + __s64 tv_sec; > + __s32 tv_nsec; > + __u32 pad; > +}; > + > #define SYNC_IOC_MAGIC '>' > > /** > @@ -95,4 +107,12 @@ struct sync_file_info { > */ > #define SYNC_IOC_FILE_INFO _IOWR(SYNC_IOC_MAGIC, 4, struct sync_file_info) > > + > +/** > + * DOC: SYNC_IOC_SET_DEADLINE - set a deadline on a fence > + * > + * Allows userspace to set a deadline on a fence, see > dma_fence_set_deadline() > + */ > +#define SYNC_IOC_SET_DEADLINE_IOW(SYNC_IOC_MAGIC, 5, struct > sync_set_deadline) > + > #endif /* _UAPI_LINUX_SYNC_H */ > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH v3 5/9] drm/msm: Add deadline based boost support
On Fri, Sep 03, 2021 at 11:47:56AM -0700, Rob Clark wrote: > From: Rob Clark > > Signed-off-by: Rob Clark Why do you need a kthread_work here? Is this just to make sure you're running at realtime prio? Maybe a comment to that effect would be good. -Daniel > --- > drivers/gpu/drm/msm/msm_fence.c | 76 +++ > drivers/gpu/drm/msm/msm_fence.h | 20 +++ > drivers/gpu/drm/msm/msm_gpu.h | 1 + > drivers/gpu/drm/msm/msm_gpu_devfreq.c | 20 +++ > 4 files changed, 117 insertions(+) > > diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c > index f2cece542c3f..67c2a96e1c85 100644 > --- a/drivers/gpu/drm/msm/msm_fence.c > +++ b/drivers/gpu/drm/msm/msm_fence.c > @@ -8,6 +8,37 @@ > > #include "msm_drv.h" > #include "msm_fence.h" > +#include "msm_gpu.h" > + > +static inline bool fence_completed(struct msm_fence_context *fctx, uint32_t > fence); > + > +static struct msm_gpu *fctx2gpu(struct msm_fence_context *fctx) > +{ > + struct msm_drm_private *priv = fctx->dev->dev_private; > + return priv->gpu; > +} > + > +static enum hrtimer_restart deadline_timer(struct hrtimer *t) > +{ > + struct msm_fence_context *fctx = container_of(t, > + struct msm_fence_context, deadline_timer); > + > + kthread_queue_work(fctx2gpu(fctx)->worker, &fctx->deadline_work); > + > + return HRTIMER_NORESTART; > +} > + > +static void deadline_work(struct kthread_work *work) > +{ > + struct msm_fence_context *fctx = container_of(work, > + struct msm_fence_context, deadline_work); > + > + /* If deadline fence has already passed, nothing to do: */ > + if (fence_completed(fctx, fctx->next_deadline_fence)) > + return; > + > + msm_devfreq_boost(fctx2gpu(fctx), 2); > +} > > > struct msm_fence_context * > @@ -26,6 +57,13 @@ msm_fence_context_alloc(struct drm_device *dev, volatile > uint32_t *fenceptr, > fctx->fenceptr = fenceptr; > spin_lock_init(&fctx->spinlock); > > + hrtimer_init(&fctx->deadline_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS); > + fctx->deadline_timer.function = deadline_timer; > + > + kthread_init_work(&fctx->deadline_work, deadline_work); > + > + fctx->next_deadline = ktime_get(); > + > return fctx; > } > > @@ -49,6 +87,8 @@ void msm_update_fence(struct msm_fence_context *fctx, > uint32_t fence) > { > spin_lock(&fctx->spinlock); > fctx->completed_fence = max(fence, fctx->completed_fence); > + if (fence_completed(fctx, fctx->next_deadline_fence)) > + hrtimer_cancel(&fctx->deadline_timer); > spin_unlock(&fctx->spinlock); > } > > @@ -79,10 +119,46 @@ static bool msm_fence_signaled(struct dma_fence *fence) > return fence_completed(f->fctx, f->base.seqno); > } > > +static void msm_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) > +{ > + struct msm_fence *f = to_msm_fence(fence); > + struct msm_fence_context *fctx = f->fctx; > + unsigned long flags; > + ktime_t now; > + > + spin_lock_irqsave(&fctx->spinlock, flags); > + now = ktime_get(); > + > + if (ktime_after(now, fctx->next_deadline) || > + ktime_before(deadline, fctx->next_deadline)) { > + fctx->next_deadline = deadline; > + fctx->next_deadline_fence = > + max(fctx->next_deadline_fence, (uint32_t)fence->seqno); > + > + /* > + * Set timer to trigger boost 3ms before deadline, or > + * if we are already less than 3ms before the deadline > + * schedule boost work immediately. 
> + */ > + deadline = ktime_sub(deadline, ms_to_ktime(3)); > + > + if (ktime_after(now, deadline)) { > + kthread_queue_work(fctx2gpu(fctx)->worker, > + &fctx->deadline_work); > + } else { > + hrtimer_start(&fctx->deadline_timer, deadline, > + HRTIMER_MODE_ABS); > + } > + } > + > + spin_unlock_irqrestore(&fctx->spinlock, flags); > +} > + > static const struct dma_fence_ops msm_fence_ops = { > .get_driver_name = msm_fence_get_driver_name, > .get_timeline_name = msm_fence_get_timeline_name, > .signaled = msm_fence_signaled, > + .set_deadline = msm_fence_set_deadline, > }; > > struct dma_fence * > diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h > index 4783db528bcc..d34e853c555a 100644 > --- a/drivers/gpu/drm/msm/msm_fence.h > +++ b/drivers/gpu/drm/msm/msm_fence.h > @@ -50,6 +50,26 @@ struct msm_fence_context { > volatile uint32_t *fenceptr; > > spinlock_t spinlock; > + > + /* > + * TODO this doesn't really deal with multiple deadlines, like > + * if userspace got multiple frames ahead.. OTOH atomic updates > + * don't queue, so maybe that is
Re: [PATCH v3 4/9] drm/scheduler: Add fence deadline support
On Fri, Sep 03, 2021 at 11:47:55AM -0700, Rob Clark wrote: > From: Rob Clark > > As the finished fence is the one that is exposed to userspace, and > therefore the one that other operations, like atomic update, would > block on, we need to propagate the deadline from from the finished > fence to the actual hw fence. > > v2: Split into drm_sched_fence_set_parent() (ckoenig) > > Signed-off-by: Rob Clark > --- > drivers/gpu/drm/scheduler/sched_fence.c | 34 + > drivers/gpu/drm/scheduler/sched_main.c | 2 +- > include/drm/gpu_scheduler.h | 8 ++ > 3 files changed, 43 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c > b/drivers/gpu/drm/scheduler/sched_fence.c > index bcea035cf4c6..4fc41a71d1c7 100644 > --- a/drivers/gpu/drm/scheduler/sched_fence.c > +++ b/drivers/gpu/drm/scheduler/sched_fence.c > @@ -128,6 +128,30 @@ static void drm_sched_fence_release_finished(struct > dma_fence *f) > dma_fence_put(&fence->scheduled); > } > > +static void drm_sched_fence_set_deadline_finished(struct dma_fence *f, > + ktime_t deadline) > +{ > + struct drm_sched_fence *fence = to_drm_sched_fence(f); > + unsigned long flags; > + > + spin_lock_irqsave(&fence->lock, flags); > + > + /* If we already have an earlier deadline, keep it: */ > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags) && > + ktime_before(fence->deadline, deadline)) { > + spin_unlock_irqrestore(&fence->lock, flags); > + return; > + } > + > + fence->deadline = deadline; > + set_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, &f->flags); > + > + spin_unlock_irqrestore(&fence->lock, flags); > + > + if (fence->parent) > + dma_fence_set_deadline(fence->parent, deadline); > +} > + > static const struct dma_fence_ops drm_sched_fence_ops_scheduled = { > .get_driver_name = drm_sched_fence_get_driver_name, > .get_timeline_name = drm_sched_fence_get_timeline_name, > @@ -138,6 +162,7 @@ static const struct dma_fence_ops > drm_sched_fence_ops_finished = { > .get_driver_name = drm_sched_fence_get_driver_name, > .get_timeline_name = drm_sched_fence_get_timeline_name, > .release = drm_sched_fence_release_finished, > + .set_deadline = drm_sched_fence_set_deadline_finished, > }; > > struct drm_sched_fence *to_drm_sched_fence(struct dma_fence *f) > @@ -152,6 +177,15 @@ struct drm_sched_fence *to_drm_sched_fence(struct > dma_fence *f) > } > EXPORT_SYMBOL(to_drm_sched_fence); > > +void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence, > + struct dma_fence *fence) > +{ > + s_fence->parent = dma_fence_get(fence); > + if (test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT, > + &s_fence->finished.flags)) Don't you need the spinlock here too to avoid races? test_bit is very unordered, so guarantees nothing. Spinlock would need to be both around ->parent = and the test_bit. Entirely aside, but there's discussions going on to preallocate the hw fence somehow. If we do that we could make the deadline forwarding lockless here. Having a spinlock just to set the parent is a bit annoying ... Alternative is that you do this locklessly with barriers and a _lot_ of comments. Would be good to benchmark whether the overhead matters though first. 
-Daniel > + dma_fence_set_deadline(fence, s_fence->deadline); > +} > + > struct drm_sched_fence *drm_sched_fence_alloc(struct drm_sched_entity > *entity, > void *owner) > { > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index 595e47ff7d06..27bf0ac0625f 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -978,7 +978,7 @@ static int drm_sched_main(void *param) > drm_sched_fence_scheduled(s_fence); > > if (!IS_ERR_OR_NULL(fence)) { > - s_fence->parent = dma_fence_get(fence); > + drm_sched_fence_set_parent(s_fence, fence); > r = dma_fence_add_callback(fence, &sched_job->cb, > drm_sched_job_done_cb); > if (r == -ENOENT) > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h > index 7f77a455722c..158ddd662469 100644 > --- a/include/drm/gpu_scheduler.h > +++ b/include/drm/gpu_scheduler.h > @@ -238,6 +238,12 @@ struct drm_sched_fence { > */ > struct dma_fencefinished; > > + /** > + * @deadline: deadline set on &drm_sched_fence.finished which > + * potentially needs to be propagated to &drm_sched_fence.parent > + */ > + ktime_t deadline; > + > /** > * @parent: the fence returned by &drm_sched_backend_ops.run_
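One possible shape for the locked variant suggested above, as it would sit in sched_fence.c: publish the parent and sample the deadline state under the same fence->lock that drm_sched_fence_set_deadline_finished() takes in the quoted patch. This is a sketch against that patch, not a tested change, and it deliberately leaves open whether the parent lookup in the deadline path needs the same treatment.

#include <linux/dma-fence.h>
#include <linux/spinlock.h>
#include <drm/gpu_scheduler.h>

void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
                                struct dma_fence *fence)
{
        ktime_t deadline;
        bool has_deadline;
        unsigned long flags;

        spin_lock_irqsave(&s_fence->lock, flags);
        s_fence->parent = dma_fence_get(fence);
        has_deadline = test_bit(DMA_FENCE_FLAG_HAS_DEADLINE_BIT,
                                &s_fence->finished.flags);
        deadline = s_fence->deadline;
        spin_unlock_irqrestore(&s_fence->lock, flags);

        if (has_deadline)
                dma_fence_set_deadline(fence, deadline);
}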
[PULL] drm-misc-fixes
Hi Dave and Daniel, here's this week's PR for drm-misc-fixes. One patch fixes a potential deadlock in TTM, the other enables an additional plane in kmb. I'm slightly unhappy that the latter one ended up in -fixes as it's not a bugfix AFAICT. Best regards Thomas drm-misc-fixes-2021-09-08: Short summary of fixes pull: * kmb: Enable second plane * ttm: Fix potential deadlock during swap The following changes since commit fa0b1ef5f7a694f48e00804a391245f3471aa155: drm: Copy drm_wait_vblank to user before returning (2021-08-17 13:56:03 -0400) are available in the Git repository at: git://anongit.freedesktop.org/drm/drm-misc tags/drm-misc-fixes-2021-09-08 for you to fetch changes up to c8704b7ec182f9293e6a994310c7d4203428cdfb: drm/kmb: Enable alpha blended second plane (2021-09-07 10:10:30 -0700) Short summary of fixes pull: * kmb: Enable second plane * ttm: Fix potential deadlock during swap Edmund Dea (1): drm/kmb: Enable alpha blended second plane xinhui pan (1): drm/ttm: Fix a deadlock if the target BO is not idle during swap drivers/gpu/drm/kmb/kmb_drv.c | 8 ++-- drivers/gpu/drm/kmb/kmb_drv.h | 5 +++ drivers/gpu/drm/kmb/kmb_plane.c | 81 - drivers/gpu/drm/kmb/kmb_plane.h | 5 ++- drivers/gpu/drm/kmb/kmb_regs.h | 3 ++ drivers/gpu/drm/ttm/ttm_bo.c| 6 +-- 6 files changed, 90 insertions(+), 18 deletions(-) -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer
Re: [PATCH v2 1/2] drm: document drm_mode_create_lease object requirements
On Fri, Sep 03, 2021 at 01:00:16PM +, Simon Ser wrote: > validate_lease expects one CRTC, one connector and one plane. > > Signed-off-by: Simon Ser > Cc: Daniel Vetter > Cc: Pekka Paalanen > Cc: Keith Packard Reviewed-by: Daniel Vetter > --- > include/uapi/drm/drm_mode.h | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h > index 90c55383f1ee..e4a2570a6058 100644 > --- a/include/uapi/drm/drm_mode.h > +++ b/include/uapi/drm/drm_mode.h > @@ -1110,6 +1110,9 @@ struct drm_mode_destroy_blob { > * struct drm_mode_create_lease - Create lease > * > * Lease mode resources, creating another drm_master. > + * > + * The @object_ids array must reference at least one CRTC, one connector and > + * one plane if &DRM_CLIENT_CAP_UNIVERSAL_PLANES is enabled. > */ > struct drm_mode_create_lease { > /** @object_ids: Pointer to array of object ids (__u32) */ > -- > 2.33.0 > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
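From the userspace side, the documented requirement translates into passing all three object IDs to the lease ioctl. The snippet below uses the libdrm wrapper; the helper name, the placeholder object IDs and the O_CLOEXEC flag choice are illustrative.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static int create_minimal_lease(int fd, uint32_t crtc_id,
                                uint32_t connector_id, uint32_t plane_id)
{
        uint32_t objects[] = { crtc_id, connector_id, plane_id };
        uint32_t lessee_id;
        int lease_fd;

        /* With universal planes enabled, the plane must be part of the
         * lease as well, per the documentation above. */
        drmSetClientCap(fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);

        lease_fd = drmModeCreateLease(fd, objects, 3, O_CLOEXEC, &lessee_id);
        if (lease_fd < 0)
                fprintf(stderr, "lease creation failed: %d\n", lease_fd);
        else
                printf("lessee %u, lease fd %d\n", lessee_id, lease_fd);

        return lease_fd;
}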
Re: [PATCH] drm/i915/request: fix early tracepoints
On Fri, Sep 03, 2021 at 12:24:05PM +0100, Matthew Auld wrote: > Currently we blow up in trace_dma_fence_init, when calling into > get_driver_name or get_timeline_name, since both the engine and context > might be NULL(or contain some garbage address) in the case of newly > allocated slab objects via the request ctor. Note that we also use > SLAB_TYPESAFE_BY_RCU here, which allows requests to be immediately > freed, but delay freeing the underlying page by an RCU grace period. > With this scheme requests can be re-allocated, at the same time as they > are also being read by some lockless RCU lookup mechanism. > > One possible fix, since we don't yet have a fully initialised request > when in the ctor, is just setting the context/engine as NULL and adding > some extra handling in get_driver_name etc. And since the ctor is only > called for new slab objects(i.e allocate new page and call the ctor for > each object) it's safe to reset the context/engine prior to calling into > dma_fence_init, since we can be certain that no one is doing an RCU > lookup which might depend on peeking at the engine/context, like in > active_engine(), since the object can't yet be externally visible. > > In the recycled case(which might also be externally visible) the request > refcount always transitions from 0->1 after we set the context/engine > etc, which should ensure it's valid to dereference the engine for > example, when doing an RCU list-walk, so long as we can also increment > the refcount first. If the refcount is already zero, then the request is > considered complete/released. If it's non-zero, then the request might > be in the process of being re-allocated, or potentially still in flight, > however after successfully incrementing the refcount, it's possible to > carefully inspect the request state, to determine if the request is > still what we were looking for. Note that all externally visible > requests returned to the cache must have zero refcount. The commit message here is a bit confusing, since you start out with describing a solution that you're not actually implementing it. I usually do this by putting alternate solutions at the bottom, starting with "An alternate solution would be ..." or so. And then closing with why we don't do that, here it would be that we do no longer have a need for these partially set up i915_requests, and therefore just reverting that complication is the simplest solution. > An alternative fix then is to instead move the dma_fence_init out from > the request ctor. Originally this was how it was done, but it was moved > in: > > commit 855e39e65cfc33a73724f1cc644ffc5754864a20 > Author: Chris Wilson > Date: Mon Feb 3 09:41:48 2020 + > > drm/i915: Initialise basic fence before acquiring seqno > > where it looks like intel_timeline_get_seqno() relied on some of the > rq->fence state, but that is no longer the case since: > > commit 12ca695d2c1ed26b2dcbb528b42813bd0f216cfc > Author: Maarten Lankhorst > Date: Tue Mar 23 16:49:50 2021 +0100 > > drm/i915: Do not share hwsp across contexts any more, v8. > > intel_timeline_get_seqno() could also be cleaned up slightly by dropping > the request argument. > > Moving dma_fence_init back out of the ctor, should ensure we have enough > of the request initialised in case of trace_dma_fence_init. 
> Functionally this should be the same, and is effectively what we were > already open coding before, except now we also assign the fence->lock > and fence->ops, but since these are invariant for recycled > requests(which might be externally visible), and will therefore already > hold the same value, it shouldn't matter. We still leave the > spin_lock_init() in the ctor, since we can't re-init the rq->lock in > case it is already held. Holding rq->lock without having a full reference to it sounds like really bad taste. I think it would be good to have a (kerneldoc) comment next to i915_request.lock about this, with a FIXME. But separate patch. > Fixes: 855e39e65cfc ("drm/i915: Initialise basic fence before acquiring > seqno") > Signed-off-by: Matthew Auld > Cc: Michael Mason > Cc: Daniel Vetter With the commit message restructured a bit, and assuming this one actually works: Reviewed-by: Daniel Vetter But I'm really not confident :-( -Daniel > --- > drivers/gpu/drm/i915/i915_request.c | 11 ++- > 1 file changed, 2 insertions(+), 9 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_request.c > b/drivers/gpu/drm/i915/i915_request.c > index ce446716d092..79da5eca60af 100644 > --- a/drivers/gpu/drm/i915/i915_request.c > +++ b/drivers/gpu/drm/i915/i915_request.c > @@ -829,8 +829,6 @@ static void __i915_request_ctor(void *arg) > i915_sw_fence_init(&rq->submit, submit_notify); > i915_sw_fence_init(&rq->semaphore, semaphore_notify); > > - dma_fence_init(&rq->fence, &i915_fence_ops, &rq->lock, 0, 0); > - > rq->capture_list = NULL; > > init_llist_head(&rq->execute_cb); > @@ -905,
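As an aside for readers less familiar with the SLAB_TYPESAFE_BY_RCU scheme the commit message leans on, the lockless lookup it has to survive follows a get-then-recheck pattern roughly like the sketch below; the wrapper function and the slot pointer are hypothetical, only the pattern itself is taken from the description above:

/* Hypothetical lockless lookup under SLAB_TYPESAFE_BY_RCU: the request may
 * be freed and recycled at any point, so it may only be inspected after a
 * reference has been taken, and must then be re-checked. */
static struct i915_request *
get_active_request(struct i915_request __rcu **slot)
{
        struct i915_request *rq;

        rcu_read_lock();
        rq = rcu_dereference(*slot);

        /* A refcount of zero means the request is complete/released;
         * dma_fence_get_rcu() only succeeds while it is still non-zero. */
        if (rq && !dma_fence_get_rcu(&rq->fence))
                rq = NULL;

        /* The slab object may have been recycled between the load and the
         * get, so verify it is still the request published in the slot. */
        if (rq && rq != rcu_access_pointer(*slot)) {
                i915_request_put(rq);
                rq = NULL;
        }
        rcu_read_unlock();

        return rq;
}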
Re: [PATCH] drm/sched: Fix drm_sched_fence_free() so it can be passed an uninitialized fence
On Fri, Sep 03, 2021 at 02:05:54PM +0200, Boris Brezillon wrote: > drm_sched_job_cleanup() will pass an uninitialized fence to > drm_sched_fence_free(), which will cause to_drm_sched_fence() to return > a NULL fence object, causing a NULL pointer deref when this NULL object > is passed to kmem_cache_free(). > > Let's create a new drm_sched_fence_free() function that takes a > drm_sched_fence pointer and suffix the old function with _rcu. While at > it, complain if drm_sched_fence_free() is passed an initialized fence > or if drm_sched_fence_free_rcu() is passed an uninitialized fence. > > Fixes: dbe48d030b28 ("drm/sched: Split drm_sched_job_init") > Signed-off-by: Boris Brezillon > --- > Found while debugging another issue in panfrost causing a failure in > the submit ioctl and exercising the error path (path that calls > drm_sched_job_cleanup() on an unarmed job). Reviewed-by: Daniel Vetter I already provided an irc r-b, just here for the record too. -Daniel > --- > drivers/gpu/drm/scheduler/sched_fence.c | 29 - > drivers/gpu/drm/scheduler/sched_main.c | 2 +- > include/drm/gpu_scheduler.h | 2 +- > 3 files changed, 21 insertions(+), 12 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_fence.c > b/drivers/gpu/drm/scheduler/sched_fence.c > index db3fd1303fc4..7fd869520ef2 100644 > --- a/drivers/gpu/drm/scheduler/sched_fence.c > +++ b/drivers/gpu/drm/scheduler/sched_fence.c > @@ -69,19 +69,28 @@ static const char > *drm_sched_fence_get_timeline_name(struct dma_fence *f) > return (const char *)fence->sched->name; > } > > -/** > - * drm_sched_fence_free - free up the fence memory > - * > - * @rcu: RCU callback head > - * > - * Free up the fence memory after the RCU grace period. > - */ > -void drm_sched_fence_free(struct rcu_head *rcu) > +static void drm_sched_fence_free_rcu(struct rcu_head *rcu) > { > struct dma_fence *f = container_of(rcu, struct dma_fence, rcu); > struct drm_sched_fence *fence = to_drm_sched_fence(f); > > - kmem_cache_free(sched_fence_slab, fence); > + if (!WARN_ON_ONCE(!fence)) > + kmem_cache_free(sched_fence_slab, fence); > +} > + > +/** > + * drm_sched_fence_free - free up an uninitialized fence > + * > + * @fence: fence to free > + * > + * Free up the fence memory. Should only be used if drm_sched_fence_init() > + * has not been called yet. > + */ > +void drm_sched_fence_free(struct drm_sched_fence *fence) > +{ > + /* This function should not be called if the fence has been > initialized. 
*/ > + if (!WARN_ON_ONCE(fence->sched)) > + kmem_cache_free(sched_fence_slab, fence); > } > > /** > @@ -97,7 +106,7 @@ static void drm_sched_fence_release_scheduled(struct > dma_fence *f) > struct drm_sched_fence *fence = to_drm_sched_fence(f); > > dma_fence_put(fence->parent); > - call_rcu(&fence->finished.rcu, drm_sched_fence_free); > + call_rcu(&fence->finished.rcu, drm_sched_fence_free_rcu); > } > > /** > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index fbbd3b03902f..6987d412a946 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -750,7 +750,7 @@ void drm_sched_job_cleanup(struct drm_sched_job *job) > dma_fence_put(&job->s_fence->finished); > } else { > /* aborted job before committing to run it */ > - drm_sched_fence_free(&job->s_fence->finished.rcu); > + drm_sched_fence_free(job->s_fence); > } > > job->s_fence = NULL; > diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h > index 7f77a455722c..f011e4c407f2 100644 > --- a/include/drm/gpu_scheduler.h > +++ b/include/drm/gpu_scheduler.h > @@ -509,7 +509,7 @@ struct drm_sched_fence *drm_sched_fence_alloc( > struct drm_sched_entity *s_entity, void *owner); > void drm_sched_fence_init(struct drm_sched_fence *fence, > struct drm_sched_entity *entity); > -void drm_sched_fence_free(struct rcu_head *rcu); > +void drm_sched_fence_free(struct drm_sched_fence *fence); > > void drm_sched_fence_scheduled(struct drm_sched_fence *fence); > void drm_sched_fence_finished(struct drm_sched_fence *fence); > -- > 2.31.1 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
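For context, the "aborted job before committing to run it" branch touched above corresponds to the usual submit-path shape sketched below; the driver-side helpers are made up, and the sketch only shows where an unarmed job ends up in drm_sched_job_cleanup():

/* Sketch of a driver submit path: anything failing after
 * drm_sched_job_init() but before drm_sched_job_arm() leaves the
 * scheduler fence allocated but uninitialized, which is exactly the
 * case drm_sched_fence_free() now handles from drm_sched_job_cleanup(). */
static int driver_submit(struct driver_job *job, struct drm_sched_entity *entity)
{
        int ret;

        ret = drm_sched_job_init(&job->base, entity, NULL);
        if (ret)
                return ret;

        ret = driver_lookup_and_pin_bos(job);   /* hypothetical setup step */
        if (ret) {
                drm_sched_job_cleanup(&job->base);      /* unarmed job */
                return ret;
        }

        drm_sched_job_arm(&job->base);  /* scheduler fence initialized here */
        /* ... followed by pushing the armed job to the entity's queue ... */
        return 0;
}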
Re: [PATCH][next] drm/i915: clean up inconsistent indenting
On Thu, Sep 02, 2021 at 10:57:37PM +0100, Colin King wrote: > From: Colin Ian King > > There is a statement that is indented one character too deeply, > clean this up. > > Signed-off-by: Colin Ian King Queued to drm-intel-gt-next, thanks for patch. -Daniel > --- > drivers/gpu/drm/i915/gt/intel_execlists_submission.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > index de5f9c86b9a4..aeb324b701ec 100644 > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c > @@ -2565,7 +2565,7 @@ __execlists_context_pre_pin(struct intel_context *ce, > if (!__test_and_set_bit(CONTEXT_INIT_BIT, &ce->flags)) { > lrc_init_state(ce, engine, *vaddr); > > - __i915_gem_object_flush_map(ce->state->obj, 0, > engine->context_size); > + __i915_gem_object_flush_map(ce->state->obj, 0, > engine->context_size); > } > > return 0; > -- > 2.32.0 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH] drm/ttm: provide default page protection for UML
On Sat, Sep 04, 2021 at 11:50:37AM +0800, David Gow wrote: > On Thu, Sep 2, 2021 at 10:46 PM Daniel Vetter wrote: > > > > On Thu, Sep 02, 2021 at 07:19:01AM +0100, Anton Ivanov wrote: > > > On 02/09/2021 06:52, Randy Dunlap wrote: > > > > On 9/1/21 10:48 PM, Anton Ivanov wrote: > > > > > On 02/09/2021 03:01, Randy Dunlap wrote: > > > > > > boot_cpu_data [struct cpuinfo_um (on UML)] does not have a struct > > > > > > member named 'x86', so provide a default page protection mode > > > > > > for CONFIG_UML. > > > > > > > > > > > > Mends this build error: > > > > > > ../drivers/gpu/drm/ttm/ttm_module.c: In function > > > > > > ‘ttm_prot_from_caching’: > > > > > > ../drivers/gpu/drm/ttm/ttm_module.c:59:24: error: ‘struct > > > > > > cpuinfo_um’ has no member named ‘x86’ > > > > > >else if (boot_cpu_data.x86 > 3) > > > > > > ^ > > > > > > > > > > > > Fixes: 3bf3710e3718 ("drm/ttm: Add a generic TTM memcpy move for > > > > > > page-based iomem") > > > > > > Signed-off-by: Randy Dunlap > > > > > > Cc: Thomas Hellström > > > > > > Cc: Christian König > > > > > > Cc: Huang Rui > > > > > > Cc: dri-devel@lists.freedesktop.org > > > > > > Cc: Jeff Dike > > > > > > Cc: Richard Weinberger > > > > > > Cc: Anton Ivanov > > > > > > Cc: linux...@lists.infradead.org > > > > > > Cc: David Airlie > > > > > > Cc: Daniel Vetter > > > > > > --- > > > > > > drivers/gpu/drm/ttm/ttm_module.c |4 > > > > > > 1 file changed, 4 insertions(+) > > > > > > > > > > > > --- linux-next-20210901.orig/drivers/gpu/drm/ttm/ttm_module.c > > > > > > +++ linux-next-20210901/drivers/gpu/drm/ttm/ttm_module.c > > > > > > @@ -53,6 +53,9 @@ pgprot_t ttm_prot_from_caching(enum ttm_ > > > > > > if (caching == ttm_cached) > > > > > > return tmp; > > > > > > +#ifdef CONFIG_UML > > > > > > +tmp = pgprot_noncached(tmp); > > > > > > +#else > > > > > > #if defined(__i386__) || defined(__x86_64__) > > > > > > if (caching == ttm_write_combined) > > > > > > tmp = pgprot_writecombine(tmp); > > > > > > @@ -69,6 +72,7 @@ pgprot_t ttm_prot_from_caching(enum ttm_ > > > > > > #if defined(__sparc__) > > > > > > tmp = pgprot_noncached(tmp); > > > > > > #endif > > > > > > +#endif > > > > > > return tmp; > > > > > > } > > > > > > > > > > Patch looks OK. > > > > > > > > > > I have a question though - why all of DRM is not !UML in config. Not > > > > > like we can use them. > > > > > > > > I have no idea about that. > > > > Hopefully one of the (other) UML maintainers can answer you. > > > > > > Touche. > > > > > > We will discuss that and possibly push a patch to !UML that part of the > > > tree. IMHO it is not applicable. > > > > I thought kunit is based on top of uml, and we do want to eventually adopt > > that. Especially for helper libraries like ttm. > > UML is not actually a dependency for KUnit, so it's definitely > possible to test things which aren't compatible with UML. (In fact, > there's even now some tooling support to use qemu instead on a number > of architectures.) > > That being said, the KUnit tooling does use UML by default, so if it's > not too difficult to keep some level of UML support, it'll make it a > little easier (and faster) for people to run any KUnit tests. Yeah my understanding is that uml is the quickest way to spawn a new kernel, which kunit needs to run. And I really do like that idea, because having virtualization support in cloud CI systems (which use containers themselves) is a bit a fun exercise. The less we rely on virtual machines in containers for that, the better. Hence why I really like the uml approach for kunit. 
-Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
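For anyone wanting to poke at this, the KUnit wrapper being discussed defaults to building and booting a UML kernel, and the QEMU mode mentioned above is selected via an architecture flag; the invocations below reflect the tooling of roughly this era and may differ in other trees:

# Default: configure, build and run the tests in a UML kernel
./tools/testing/kunit/kunit.py run

# Same tests, but booted under QEMU instead of UML (architecture illustrative)
./tools/testing/kunit/kunit.py run --arch=x86_64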
Re: [Intel-gfx] [PATCH v2] drm/i915: Handle Intel igfx + Intel dgfx hybrid graphics setup
On Thu, Sep 02, 2021 at 04:01:40PM +0100, Tvrtko Ursulin wrote: > > On 02/09/2021 15:33, Daniel Vetter wrote: > > On Tue, Aug 31, 2021 at 02:18:15PM +0100, Tvrtko Ursulin wrote: > > > > > > On 31/08/2021 13:43, Daniel Vetter wrote: > > > > On Tue, Aug 31, 2021 at 10:15:03AM +0100, Tvrtko Ursulin wrote: > > > > > > > > > > On 30/08/2021 09:26, Daniel Vetter wrote: > > > > > > On Fri, Aug 27, 2021 at 03:44:42PM +0100, Tvrtko Ursulin wrote: > > > > > > > > > > > > > > On 27/08/2021 15:39, Tvrtko Ursulin wrote: > > > > > > > > From: Tvrtko Ursulin > > > > > > > > > > > > > > > > In short this makes i915 work for hybrid setups (DRI_PRIME=1 > > > > > > > > with Mesa) > > > > > > > > when rendering is done on Intel dgfx and scanout/composition on > > > > > > > > Intel > > > > > > > > igfx. > > > > > > > > > > > > > > > > Before this patch the driver was not quite ready for that > > > > > > > > setup, mainly > > > > > > > > because it was able to emit a semaphore wait between the two > > > > > > > > GPUs, which > > > > > > > > results in deadlocks because semaphore target location in HWSP > > > > > > > > is neither > > > > > > > > shared between the two, nor mapped in both GGTT spaces. > > > > > > > > > > > > > > > > To fix it the patch adds an additional check to a couple of > > > > > > > > relevant code > > > > > > > > paths in order to prevent using semaphores for inter-engine > > > > > > > > synchronisation between different driver instances. > > > > > > > > > > > > > > > > Patch also moves singly used i915_gem_object_last_write_engine > > > > > > > > to be > > > > > > > > private in its only calling unit (debugfs), while modifying it > > > > > > > > to only > > > > > > > > show activity belonging to the respective driver instance. > > > > > > > > > > > > > > > > What remains in this problem space is the question of the GEM > > > > > > > > busy ioctl. > > > > > > > > We have a somewhat ambigous comment there saying only status of > > > > > > > > native > > > > > > > > fences will be reported, which could be interpreted as either > > > > > > > > i915, or > > > > > > > > native to the drm fd. For now I have decided to leave that as > > > > > > > > is, meaning > > > > > > > > any i915 instance activity continues to be reported. > > > > > > > > > > > > > > > > v2: > > > > > > > > * Avoid adding rq->i915. (Chris) > > > > > > > > > > > > > > > > Signed-off-by: Tvrtko Ursulin > > > > > > > > > > > > Can't we just delete semaphore code and done? > > > > > > - GuC won't have it > > > > > > - media team benchmarked on top of softpin media driver, found no > > > > > > difference > > > > > > > > > > You have S-curve for saturated workloads or something else? How > > > > > thorough and > > > > > which media team I guess. > > > > > > > > > > From memory it was a nice win for some benchmarks (non-saturated > > > > > ones), but > > > > > as I have told you previously, we haven't been putting numbers in > > > > > commit > > > > > messages since it wasn't allowed. I may be able to dig out some more > > > > > details > > > > > if I went trawling through GEM channel IRC logs, although probably > > > > > not the > > > > > actual numbers since those were usually on pastebin. Or you go an > > > > > talk with > > > > > Chris since he probably remembers more details. Or you just decide > > > > > you don't > > > > > care and remove it. I wouldn't do that without putting the complete > > > > > story in > > > > > writing, but it's your call after all. 
> > > > > > > > Media has also changed, they're not using relocations anymore. > > > > > > Meaning you think it changes the benchmarking story? When coupled with > > > removal of GPU relocations then possibly yes. > > > > > > > Unless there's solid data performance tuning of any kind that gets in > > > > the > > > > way simply needs to be removed. Yes this is radical, but the codebase is > > > > in a state to require this. > > > > > > > > So either way we'd need to rebenchmark this if it's really required. > > > > Also > > > > > > Therefore can you share what benchmarks have been done or is it secret? > > > As > > > said, I think the non-saturated case was the more interesting one here. > > > > > > > if we really need this code still someone needs to fix the design, the > > > > current code is making layering violations an art form. > > > > > > > > > Anyway, without the debugfs churn it is more or less two line patch > > > > > to fix > > > > > igfx + dgfx hybrid setup. So while mulling it over this could go in. > > > > > I'd > > > > > just refine it to use a GGTT check instead of GT. And unless DG1 ends > > > > > up > > > > > being GuC only. > > > > > > > > The minimal robust fix here is imo to stop us from upcasting dma_fence > > > > to > > > > i915_request if it's not for our device. Not sprinkle code here into the > > > > semaphore code. We shouldn't even get this far with foreign fences. > > > > > > Device check does not w