Re: [PATCH 0/8] drm/panel: Some very minor err handling fixes + more _multi

2024-05-19 Thread Linus Walleij
On Fri, May 17, 2024 at 11:37 PM Douglas Anderson  wrote:

> This series is pretty much just addressing a few minor error handling
> bugs that I noticed recently while reviewing some panel patches. For
> the most part these are all old issues.

All patches look good to me:
Reviewed-by: Linus Walleij 

Yours,
Linus Walleij


Re: [PATCH] drm/nouveau/nvif: Avoid build error due to potential integer overflows

2024-05-18 Thread Joe Perches
On Sat, 2024-05-18 at 11:23 -0700, Guenter Roeck wrote:
> On 5/18/24 10:32, Kees Cook wrote:
> 
[]
> > I think the INT_MAX test is actually better in this case because
> > nvif_object_ioctl()'s size argument is u32:
> > 
> > ret = nvif_object_ioctl(object, args, sizeof(*args) + size, NULL);
> >
> > 
> > So that could wrap around, even though the allocation may not.
> > 
> > Better yet, since "sizeof(*args) + size" is repeated 3 times in the
> > function, I'd recommend:
> > 
> > ...
> > u32 args_size;
> > 
> > if (check_add_overflow(sizeof(*args), size, _size))
> > return -ENOMEM;
> > if (args_size > sizeof(stack)) {
> > if (!(args = kmalloc(args_size, GFP_KERNEL)))

trivia:

More typical kernel style would use separate alloc and test

args = kmalloc(args_size, GFP_KERNEL);
if (!args)

> > return -ENOMEM;
> >  } else {
> >  args = (void *)stack;
> >  }
> > ...
> >  ret = nvif_object_ioctl(object, args, args_size, NULL);
> > 
> > This will catch the u32 overflow to nvif_object_ioctl(), catch an
> > allocation underflow on 32-bits systems, and make the code more
> > readable. :)
> > 
> 
> Makes sense. I'll change that and send v2.
> 
> Thanks,
> Guenter
> 
> 



Re: [PATCH v2] drm/nouveau/nvif: Avoid build error due to potential integer overflows

2024-05-18 Thread Kees Cook
On Sat, May 18, 2024 at 11:29:23AM -0700, Guenter Roeck wrote:
> Trying to build parisc:allmodconfig with gcc 12.x or later results
> in the following build error.
> 
> drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd':
> drivers/gpu/drm/nouveau/nvif/object.c:161:9: error:
>   'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 
> overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict]
>   161 | memcpy(data, args->mthd.data, size);
>   | ^~~
> drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor':
> drivers/gpu/drm/nouveau/nvif/object.c:298:17: error:
>   'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 
> overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict]
>   298 | memcpy(data, args->new.data, size);
> 
> gcc assumes that 'sizeof(*args) + size' can overflow, which would result
> in the problem.
> 
> The problem is not new, only it is now no longer a warning but an error
> since W=1 has been enabled for the drm subsystem and since Werror is
> enabled for test builds.
> 
> Rearrange arithmetic and use check_add_overflow() for validating the
> allocation size to avoid the overflow.
> 
> Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the 
> subsystem")
> Cc: Javier Martinez Canillas 
> Cc: Jani Nikula 
> Cc: Thomas Zimmermann 
> Cc: Danilo Krummrich 
> Cc: Maxime Ripard 
> Cc: Kees Cook 
> Cc: Christophe JAILLET 
> Signed-off-by: Guenter Roeck 

Yeah, looks good to me. Thanks!

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH net-next v9 04/14] netdev: support binding dma-buf to netdevice

2024-05-18 Thread David Wei
On 2024-05-10 16:21, Mina Almasry wrote:
> -/* Stub */
>  int netdev_nl_bind_rx_doit(struct sk_buff *skb, struct genl_info *info)
>  {
> - return 0;
> + struct nlattr *tb[ARRAY_SIZE(netdev_queue_dmabuf_nl_policy)];
> + struct net_devmem_dmabuf_binding *out_binding;
> + struct list_head *sock_binding_list;
> + u32 ifindex, dmabuf_fd, rxq_idx;
> + struct net_device *netdev;
> + struct sk_buff *rsp;
> + struct nlattr *attr;
> + int rem, err = 0;
> + void *hdr;
> +
> + if (GENL_REQ_ATTR_CHECK(info, NETDEV_A_DEV_IFINDEX) ||
> + GENL_REQ_ATTR_CHECK(info, NETDEV_A_BIND_DMABUF_DMABUF_FD) ||
> + GENL_REQ_ATTR_CHECK(info, NETDEV_A_BIND_DMABUF_QUEUES))
> + return -EINVAL;
> +
> + ifindex = nla_get_u32(info->attrs[NETDEV_A_DEV_IFINDEX]);
> + dmabuf_fd = nla_get_u32(info->attrs[NETDEV_A_BIND_DMABUF_DMABUF_FD]);
> +
> + rtnl_lock();
> +
> + netdev = __dev_get_by_index(genl_info_net(info), ifindex);
> + if (!netdev) {
> + err = -ENODEV;
> + goto err_unlock;
> + }
> +
> + err = net_devmem_bind_dmabuf(netdev, dmabuf_fd, _binding);
> + if (err)
> + goto err_unlock;
> +
> + nla_for_each_attr(attr, genlmsg_data(info->genlhdr),
> +   genlmsg_len(info->genlhdr), rem) {
> + if (nla_type(attr) != NETDEV_A_BIND_DMABUF_QUEUES)
> + continue;
> +
> + err = nla_parse_nested(
> + tb, ARRAY_SIZE(netdev_queue_dmabuf_nl_policy) - 1, attr,
> + netdev_queue_dmabuf_nl_policy, info->extack);
> + if (err < 0)
> + goto err_unbind;
> +
> + rxq_idx = nla_get_u32(tb[NETDEV_A_QUEUE_DMABUF_IDX]);
> + if (rxq_idx >= netdev->num_rx_queues) {
> + err = -ERANGE;
> + goto err_unbind;
> + }

net_devmem_bind_dmabuf_to_queue() checks for rxq_idx >=
netdev->num_rx_queues as well. I'd say remove the one in
netdev_nl_bind_rx_doit().

Also we may want a generic netdev function e.g. netdev_rx_queue_set_mp()
since I need the same functionality.

> +
> + err = net_devmem_bind_dmabuf_to_queue(netdev, rxq_idx,
> +   out_binding);
> + if (err)
> + goto err_unbind;
> + }
> +
> + sock_binding_list = genl_sk_priv_get(_nl_family,
> +  NETLINK_CB(skb).sk);
> + if (IS_ERR(sock_binding_list)) {
> + err = PTR_ERR(sock_binding_list);
> + goto err_unbind;
> + }
> +
> + list_add(_binding->list, sock_binding_list);
> +
> + rsp = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
> + if (!rsp) {
> + err = -ENOMEM;
> + goto err_unbind;
> + }
> +
> + hdr = genlmsg_iput(rsp, info);
> + if (!hdr) {
> + err = -EMSGSIZE;
> + goto err_genlmsg_free;
> + }
> +
> + nla_put_u32(rsp, NETDEV_A_BIND_DMABUF_DMABUF_ID, out_binding->id);
> + genlmsg_end(rsp, hdr);
> +
> + rtnl_unlock();
> +
> + return genlmsg_reply(rsp, info);
> +
> +err_genlmsg_free:
> + nlmsg_free(rsp);
> +err_unbind:
> + net_devmem_unbind_dmabuf(out_binding);
> +err_unlock:
> + rtnl_unlock();
> + return err;
>  }


Re: [PATCH net-next v9 04/14] netdev: support binding dma-buf to netdevice

2024-05-18 Thread David Wei
On 2024-05-10 16:21, Mina Almasry wrote:
> +void net_devmem_unbind_dmabuf(struct net_devmem_dmabuf_binding *binding)
> +{
> + struct netdev_rx_queue *rxq;
> + unsigned long xa_idx;
> + unsigned int rxq_idx;
> +
> + if (!binding)
> + return;
> +
> + if (binding->list.next)
> + list_del(>list);
> +
> + xa_for_each(>bound_rxq_list, xa_idx, rxq) {
> + if (rxq->mp_params.mp_priv == binding) {
> + /* We hold the rtnl_lock while binding/unbinding
> +  * dma-buf, so we can't race with another thread that
> +  * is also modifying this value. However, the page_pool
> +  * may read this config while it's creating its
> +  * rx-queues. WRITE_ONCE() here to match the
> +  * READ_ONCE() in the page_pool.
> +  */
> + WRITE_ONCE(rxq->mp_params.mp_ops, NULL);
> + WRITE_ONCE(rxq->mp_params.mp_priv, NULL);
> +
> + rxq_idx = get_netdev_rx_queue_index(rxq);
> +
> + netdev_rx_queue_restart(binding->dev, rxq_idx);

What if netdev_rx_queue_restart() fails? Depending on where it failed, a
queue might still be filled from struct net_devmem_dmabuf_binding. This
is one downside of the current situation with netdev_rx_queue_restart()
needing to do allocations each time.

Perhaps a full reset if individual queue restart fails?



[PATCH v2] drm/nouveau/nvif: Avoid build error due to potential integer overflows

2024-05-18 Thread Guenter Roeck
Trying to build parisc:allmodconfig with gcc 12.x or later results
in the following build error.

drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd':
drivers/gpu/drm/nouveau/nvif/object.c:161:9: error:
'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 
overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict]
  161 | memcpy(data, args->mthd.data, size);
  | ^~~
drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor':
drivers/gpu/drm/nouveau/nvif/object.c:298:17: error:
'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 
overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict]
  298 | memcpy(data, args->new.data, size);

gcc assumes that 'sizeof(*args) + size' can overflow, which would result
in the problem.

The problem is not new, only it is now no longer a warning but an error
since W=1 has been enabled for the drm subsystem and since Werror is
enabled for test builds.

Rearrange arithmetic and use check_add_overflow() for validating the
allocation size to avoid the overflow.

Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the 
subsystem")
Cc: Javier Martinez Canillas 
Cc: Jani Nikula 
Cc: Thomas Zimmermann 
Cc: Danilo Krummrich 
Cc: Maxime Ripard 
Cc: Kees Cook 
Cc: Christophe JAILLET 
Signed-off-by: Guenter Roeck 
---
v2: Use check_add_overflow() to calculate the allocation size and to check
for overflows.

 drivers/gpu/drm/nouveau/nvif/object.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvif/object.c 
b/drivers/gpu/drm/nouveau/nvif/object.c
index 4d1aaee8fe15..89a812f812af 100644
--- a/drivers/gpu/drm/nouveau/nvif/object.c
+++ b/drivers/gpu/drm/nouveau/nvif/object.c
@@ -142,11 +142,15 @@ nvif_object_mthd(struct nvif_object *object, u32 mthd, 
void *data, u32 size)
struct nvif_ioctl_v0 ioctl;
struct nvif_ioctl_mthd_v0 mthd;
} *args;
+   u32 args_size;
u8 stack[128];
int ret;
 
-   if (sizeof(*args) + size > sizeof(stack)) {
-   if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))
+   if (check_add_overflow(sizeof(*args), size, _size))
+   return -ENOMEM;
+
+   if (args_size > sizeof(stack)) {
+   if (!(args = kmalloc(args_size, GFP_KERNEL)))
return -ENOMEM;
} else {
args = (void *)stack;
@@ -157,7 +161,7 @@ nvif_object_mthd(struct nvif_object *object, u32 mthd, void 
*data, u32 size)
args->mthd.method = mthd;
 
memcpy(args->mthd.data, data, size);
-   ret = nvif_object_ioctl(object, args, sizeof(*args) + size, NULL);
+   ret = nvif_object_ioctl(object, args, args_size, NULL);
memcpy(data, args->mthd.data, size);
if (args != (void *)stack)
kfree(args);
@@ -276,7 +280,14 @@ nvif_object_ctor(struct nvif_object *parent, const char 
*name, u32 handle,
object->map.size = 0;
 
if (parent) {
-   if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL))) {
+   u32 args_size;
+
+   if (check_add_overflow(sizeof(*args), size, _size)) {
+   nvif_object_dtor(object);
+   return -ENOMEM;
+   }
+
+   if (!(args = kmalloc(args_size, GFP_KERNEL))) {
nvif_object_dtor(object);
return -ENOMEM;
}
@@ -293,8 +304,7 @@ nvif_object_ctor(struct nvif_object *parent, const char 
*name, u32 handle,
args->new.oclass = oclass;
 
memcpy(args->new.data, data, size);
-   ret = nvif_object_ioctl(parent, args, sizeof(*args) + size,
-   >priv);
+   ret = nvif_object_ioctl(parent, args, args_size, >priv);
memcpy(data, args->new.data, size);
kfree(args);
if (ret == 0)
-- 
2.39.2



Re: [PATCH] drm/nouveau/nvif: Avoid build error due to potential integer overflows

2024-05-18 Thread Guenter Roeck

On 5/18/24 10:32, Kees Cook wrote:

On Sat, May 18, 2024 at 06:54:36PM +0200, Christophe JAILLET wrote:

(adding linux-harden...@vger.kernel.org)


Le 18/05/2024 à 16:37, Guenter Roeck a écrit :

Trying to build parisc:allmodconfig with gcc 12.x or later results
in the following build error.

drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd':
drivers/gpu/drm/nouveau/nvif/object.c:161:9: error:
'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 
overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict]
161 | memcpy(data, args->mthd.data, size);
| ^~~
drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor':
drivers/gpu/drm/nouveau/nvif/object.c:298:17: error:
'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 
overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict]
298 | memcpy(data, args->new.data, size);

gcc assumes that 'sizeof(*args) + size' can overflow, which would result
in the problem.

The problem is not new, only it is now no longer a warning but an error since 
W=1
has been enabled for the drm subsystem and since Werror is enabled for test 
builds.

Rearrange arithmetic and add extra size checks to avoid the overflow.

Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the 
subsystem")
Cc: Javier Martinez Canillas 
Cc: Jani Nikula 
Cc: Thomas Zimmermann 
Cc: Danilo Krummrich 
Cc: Maxime Ripard 
Signed-off-by: Guenter Roeck 
---
checkpatch complains about the line length in the description and the 
(pre-existing)
assignlemts in if conditions, but I did not want to split lines in the 
description
or rearrange the code further.

I don't know why I only see the problem with parisc builds (at least so far).

   drivers/gpu/drm/nouveau/nvif/object.c | 8 +---
   1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvif/object.c 
b/drivers/gpu/drm/nouveau/nvif/object.c
index 4d1aaee8fe15..baf623a48874 100644
--- a/drivers/gpu/drm/nouveau/nvif/object.c
+++ b/drivers/gpu/drm/nouveau/nvif/object.c
@@ -145,8 +145,9 @@ nvif_object_mthd(struct nvif_object *object, u32 mthd, void 
*data, u32 size)
u8 stack[128];
int ret;
-   if (sizeof(*args) + size > sizeof(stack)) {
-   if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))
+   if (size > sizeof(stack) - sizeof(*args)) {
+   if (size > INT_MAX ||
+   !(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))


Hi,

Would it be cleaner or better to use size_add(sizeof(*args), size)?


I think the INT_MAX test is actually better in this case because
nvif_object_ioctl()'s size argument is u32:

ret = nvif_object_ioctl(object, args, sizeof(*args) + size, NULL);
   

So that could wrap around, even though the allocation may not.

Better yet, since "sizeof(*args) + size" is repeated 3 times in the
function, I'd recommend:

...
u32 args_size;

if (check_add_overflow(sizeof(*args), size, _size))
return -ENOMEM;
if (args_size > sizeof(stack)) {
if (!(args = kmalloc(args_size, GFP_KERNEL)))
return -ENOMEM;
 } else {
 args = (void *)stack;
 }
...
 ret = nvif_object_ioctl(object, args, args_size, NULL);

This will catch the u32 overflow to nvif_object_ioctl(), catch an
allocation underflow on 32-bits systems, and make the code more
readable. :)



Makes sense. I'll change that and send v2.

Thanks,
Guenter




Re: [PATCH] dma-buf/fence-array: Add flex array to struct dma_fence_array

2024-05-18 Thread Kees Cook
On Sat, May 18, 2024 at 07:47:02PM +0200, Christophe JAILLET wrote:
> This is an effort to get rid of all multiplications from allocation
> functions in order to prevent integer overflows [1][2].
> 
> The "struct dma_fence_array" can be refactored to add a flex array in order
> to have the "callback structures allocated behind the array" be more
> explicit.
> 
> Do so:
>- makes the code more readable and safer.
>- allows using __counted_by() for additional checks
>- avoids some pointer arithmetic in dma_fence_array_enable_signaling()
> 
> Link: 
> https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
>  [1]
> Link: https://github.com/KSPP/linux/issues/160 [2]
> Signed-off-by: Christophe JAILLET 

Yes please! :)

Reviewed-by: Kees Cook 

-- 
Kees Cook


[PATCH] dma-buf/fence-array: Add flex array to struct dma_fence_array

2024-05-18 Thread Christophe JAILLET
This is an effort to get rid of all multiplications from allocation
functions in order to prevent integer overflows [1][2].

The "struct dma_fence_array" can be refactored to add a flex array in order
to have the "callback structures allocated behind the array" be more
explicit.

Do so:
   - makes the code more readable and safer.
   - allows using __counted_by() for additional checks
   - avoids some pointer arithmetic in dma_fence_array_enable_signaling()

Link: 
https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments
 [1]
Link: https://github.com/KSPP/linux/issues/160 [2]
Signed-off-by: Christophe JAILLET 
---
Compile tested only.

Also, I don't think that 'cb' is a great name and the associated kernel-doc
description could certainly be improved.
Any proposal welcomed :)
---
 drivers/dma-buf/dma-fence-array.c | 10 --
 include/linux/dma-fence-array.h   |  3 +++
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/dma-buf/dma-fence-array.c 
b/drivers/dma-buf/dma-fence-array.c
index 9b3ce8948351..9c55afaca607 100644
--- a/drivers/dma-buf/dma-fence-array.c
+++ b/drivers/dma-buf/dma-fence-array.c
@@ -70,7 +70,7 @@ static void dma_fence_array_cb_func(struct dma_fence *f,
 static bool dma_fence_array_enable_signaling(struct dma_fence *fence)
 {
struct dma_fence_array *array = to_dma_fence_array(fence);
-   struct dma_fence_array_cb *cb = (void *)([1]);
+   struct dma_fence_array_cb *cb = array->cb;
unsigned i;
 
for (i = 0; i < array->num_fences; ++i) {
@@ -168,22 +168,20 @@ struct dma_fence_array *dma_fence_array_create(int 
num_fences,
   bool signal_on_any)
 {
struct dma_fence_array *array;
-   size_t size = sizeof(*array);
 
WARN_ON(!num_fences || !fences);
 
-   /* Allocate the callback structures behind the array. */
-   size += num_fences * sizeof(struct dma_fence_array_cb);
-   array = kzalloc(size, GFP_KERNEL);
+   array = kzalloc(struct_size(array, cb, num_fences), GFP_KERNEL);
if (!array)
return NULL;
 
+   array->num_fences = num_fences;
+
spin_lock_init(>lock);
dma_fence_init(>base, _fence_array_ops, >lock,
   context, seqno);
init_irq_work(>work, irq_dma_fence_array_work);
 
-   array->num_fences = num_fences;
atomic_set(>num_pending, signal_on_any ? 1 : num_fences);
array->fences = fences;
 
diff --git a/include/linux/dma-fence-array.h b/include/linux/dma-fence-array.h
index ec7f25def392..a793f9d5c73b 100644
--- a/include/linux/dma-fence-array.h
+++ b/include/linux/dma-fence-array.h
@@ -33,6 +33,7 @@ struct dma_fence_array_cb {
  * @num_pending: fences in the array still pending
  * @fences: array of the fences
  * @work: internal irq_work function
+ * @cb: array of callback helpers
  */
 struct dma_fence_array {
struct dma_fence base;
@@ -43,6 +44,8 @@ struct dma_fence_array {
struct dma_fence **fences;
 
struct irq_work work;
+
+   struct dma_fence_array_cb cb[] __counted_by(num_fences);
 };
 
 /**
-- 
2.45.1



Re: [PATCH] drm/nouveau/nvif: Avoid build error due to potential integer overflows

2024-05-18 Thread Kees Cook
On Sat, May 18, 2024 at 06:54:36PM +0200, Christophe JAILLET wrote:
> (adding linux-harden...@vger.kernel.org)
> 
> 
> Le 18/05/2024 à 16:37, Guenter Roeck a écrit :
> > Trying to build parisc:allmodconfig with gcc 12.x or later results
> > in the following build error.
> > 
> > drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd':
> > drivers/gpu/drm/nouveau/nvif/object.c:161:9: error:
> > 'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 
> > overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict]
> >161 | memcpy(data, args->mthd.data, size);
> >| ^~~
> > drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor':
> > drivers/gpu/drm/nouveau/nvif/object.c:298:17: error:
> > 'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 
> > overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict]
> >298 | memcpy(data, args->new.data, size);
> > 
> > gcc assumes that 'sizeof(*args) + size' can overflow, which would result
> > in the problem.
> > 
> > The problem is not new, only it is now no longer a warning but an error 
> > since W=1
> > has been enabled for the drm subsystem and since Werror is enabled for test 
> > builds.
> > 
> > Rearrange arithmetic and add extra size checks to avoid the overflow.
> > 
> > Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the 
> > subsystem")
> > Cc: Javier Martinez Canillas 
> > 
> > Cc: Jani Nikula 
> > Cc: Thomas Zimmermann 
> > Cc: Danilo Krummrich 
> > Cc: Maxime Ripard 
> > Signed-off-by: Guenter Roeck 
> > ---
> > checkpatch complains about the line length in the description and the 
> > (pre-existing)
> > assignlemts in if conditions, but I did not want to split lines in the 
> > description
> > or rearrange the code further.
> > 
> > I don't know why I only see the problem with parisc builds (at least so 
> > far).
> > 
> >   drivers/gpu/drm/nouveau/nvif/object.c | 8 +---
> >   1 file changed, 5 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/nouveau/nvif/object.c 
> > b/drivers/gpu/drm/nouveau/nvif/object.c
> > index 4d1aaee8fe15..baf623a48874 100644
> > --- a/drivers/gpu/drm/nouveau/nvif/object.c
> > +++ b/drivers/gpu/drm/nouveau/nvif/object.c
> > @@ -145,8 +145,9 @@ nvif_object_mthd(struct nvif_object *object, u32 mthd, 
> > void *data, u32 size)
> > u8 stack[128];
> > int ret;
> > -   if (sizeof(*args) + size > sizeof(stack)) {
> > -   if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))
> > +   if (size > sizeof(stack) - sizeof(*args)) {
> > +   if (size > INT_MAX ||
> > +   !(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))
> 
> Hi,
> 
> Would it be cleaner or better to use size_add(sizeof(*args), size)?

I think the INT_MAX test is actually better in this case because
nvif_object_ioctl()'s size argument is u32:

ret = nvif_object_ioctl(object, args, sizeof(*args) + size, NULL);
  

So that could wrap around, even though the allocation may not.

Better yet, since "sizeof(*args) + size" is repeated 3 times in the
function, I'd recommend:

...
u32 args_size;

if (check_add_overflow(sizeof(*args), size, _size))
return -ENOMEM;
if (args_size > sizeof(stack)) {
if (!(args = kmalloc(args_size, GFP_KERNEL)))
return -ENOMEM;
} else {
args = (void *)stack;
}
...
ret = nvif_object_ioctl(object, args, args_size, NULL);

This will catch the u32 overflow to nvif_object_ioctl(), catch an
allocation underflow on 32-bits systems, and make the code more
readable. :)

-Kees

-- 
Kees Cook


Re: [PATCH] drm/nouveau/nvif: Avoid build error due to potential integer overflows

2024-05-18 Thread Christophe JAILLET

(adding linux-harden...@vger.kernel.org)


Le 18/05/2024 à 16:37, Guenter Roeck a écrit :

Trying to build parisc:allmodconfig with gcc 12.x or later results
in the following build error.

drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd':
drivers/gpu/drm/nouveau/nvif/object.c:161:9: error:
'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 
overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict]
   161 | memcpy(data, args->mthd.data, size);
   | ^~~
drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor':
drivers/gpu/drm/nouveau/nvif/object.c:298:17: error:
'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 
overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict]
   298 | memcpy(data, args->new.data, size);

gcc assumes that 'sizeof(*args) + size' can overflow, which would result
in the problem.

The problem is not new, only it is now no longer a warning but an error since 
W=1
has been enabled for the drm subsystem and since Werror is enabled for test 
builds.

Rearrange arithmetic and add extra size checks to avoid the overflow.

Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the 
subsystem")
Cc: Javier Martinez Canillas 
Cc: Jani Nikula 
Cc: Thomas Zimmermann 
Cc: Danilo Krummrich 
Cc: Maxime Ripard 
Signed-off-by: Guenter Roeck 
---
checkpatch complains about the line length in the description and the 
(pre-existing)
assignlemts in if conditions, but I did not want to split lines in the 
description
or rearrange the code further.

I don't know why I only see the problem with parisc builds (at least so far).

  drivers/gpu/drm/nouveau/nvif/object.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvif/object.c 
b/drivers/gpu/drm/nouveau/nvif/object.c
index 4d1aaee8fe15..baf623a48874 100644
--- a/drivers/gpu/drm/nouveau/nvif/object.c
+++ b/drivers/gpu/drm/nouveau/nvif/object.c
@@ -145,8 +145,9 @@ nvif_object_mthd(struct nvif_object *object, u32 mthd, void 
*data, u32 size)
u8 stack[128];
int ret;
  
-	if (sizeof(*args) + size > sizeof(stack)) {

-   if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))
+   if (size > sizeof(stack) - sizeof(*args)) {
+   if (size > INT_MAX ||
+   !(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))


Hi,

Would it be cleaner or better to use size_add(sizeof(*args), size)?


return -ENOMEM;
} else {
args = (void *)stack;
@@ -276,7 +277,8 @@ nvif_object_ctor(struct nvif_object *parent, const char 
*name, u32 handle,
object->map.size = 0;
  
  	if (parent) {

-   if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL))) {
+   if (size > INT_MAX ||
+   !(args = kmalloc(sizeof(*args) + size, GFP_KERNEL))) {


Same.

CJ


nvif_object_dtor(object);
return -ENOMEM;
}




[PATCH] drm/nouveau/nvif: Avoid build error due to potential integer overflows

2024-05-18 Thread Guenter Roeck
Trying to build parisc:allmodconfig with gcc 12.x or later results
in the following build error.

drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_mthd':
drivers/gpu/drm/nouveau/nvif/object.c:161:9: error:
'memcpy' accessing 4294967264 or more bytes at offsets 0 and 32 
overlaps 6442450881 bytes at offset -2147483617 [-Werror=restrict]
  161 | memcpy(data, args->mthd.data, size);
  | ^~~
drivers/gpu/drm/nouveau/nvif/object.c: In function 'nvif_object_ctor':
drivers/gpu/drm/nouveau/nvif/object.c:298:17: error:
'memcpy' accessing 4294967240 or more bytes at offsets 0 and 56 
overlaps 6442450833 bytes at offset -2147483593 [-Werror=restrict]
  298 | memcpy(data, args->new.data, size);

gcc assumes that 'sizeof(*args) + size' can overflow, which would result
in the problem.

The problem is not new, only it is now no longer a warning but an error since 
W=1
has been enabled for the drm subsystem and since Werror is enabled for test 
builds.

Rearrange arithmetic and add extra size checks to avoid the overflow.

Fixes: a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the 
subsystem")
Cc: Javier Martinez Canillas 
Cc: Jani Nikula 
Cc: Thomas Zimmermann 
Cc: Danilo Krummrich 
Cc: Maxime Ripard 
Signed-off-by: Guenter Roeck 
---
checkpatch complains about the line length in the description and the 
(pre-existing)
assignlemts in if conditions, but I did not want to split lines in the 
description
or rearrange the code further.

I don't know why I only see the problem with parisc builds (at least so far).

 drivers/gpu/drm/nouveau/nvif/object.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nvif/object.c 
b/drivers/gpu/drm/nouveau/nvif/object.c
index 4d1aaee8fe15..baf623a48874 100644
--- a/drivers/gpu/drm/nouveau/nvif/object.c
+++ b/drivers/gpu/drm/nouveau/nvif/object.c
@@ -145,8 +145,9 @@ nvif_object_mthd(struct nvif_object *object, u32 mthd, void 
*data, u32 size)
u8 stack[128];
int ret;
 
-   if (sizeof(*args) + size > sizeof(stack)) {
-   if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))
+   if (size > sizeof(stack) - sizeof(*args)) {
+   if (size > INT_MAX ||
+   !(args = kmalloc(sizeof(*args) + size, GFP_KERNEL)))
return -ENOMEM;
} else {
args = (void *)stack;
@@ -276,7 +277,8 @@ nvif_object_ctor(struct nvif_object *parent, const char 
*name, u32 handle,
object->map.size = 0;
 
if (parent) {
-   if (!(args = kmalloc(sizeof(*args) + size, GFP_KERNEL))) {
+   if (size > INT_MAX ||
+   !(args = kmalloc(sizeof(*args) + size, GFP_KERNEL))) {
nvif_object_dtor(object);
return -ENOMEM;
}
-- 
2.39.2



Re: [PATCH v5 5/8] drm/xe: Add helper to accumulate exec queue runtime

2024-05-18 Thread Lucas De Marchi

On Fri, May 17, 2024 at 03:40:22PM GMT, Matt Roper wrote:

On Fri, May 17, 2024 at 01:43:07PM -0700, Lucas De Marchi wrote:

From: Umesh Nerlige Ramappa 

Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
  - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
runtime when job is finished since xe_sched_job_completed() is not a
notification that job finished.
  - Stop trying to update runtime from xe_exec_queue_fini() - that is
redundant and may happen after xef is closed, leading to a
use-after-free
  - Do not special case the first timestamp read: the default LRC sets
CTX_TIMESTAMP to zero, so even the first sample should be a valid
one.
  - Handle the parallel submission case by multiplying the runtime by
width.
v3: Update comments

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_device_types.h |  3 +++
 drivers/gpu/drm/xe/xe_exec_queue.c   | 37 
 drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
 drivers/gpu/drm/xe/xe_execlist.c |  1 +
 drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
 5 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
b/drivers/gpu/drm/xe/xe_device_types.h
index 5c5e36de452a..bc97990fd032 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -555,6 +555,9 @@ struct xe_file {
struct mutex lock;
} exec_queue;

+   /** @runtime: hw engine class runtime in ticks for this drm client */
+   u64 runtime[XE_ENGINE_CLASS_MAX];
+
/** @client: drm client */
struct xe_drm_client *client;
 };
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c 
b/drivers/gpu/drm/xe/xe_exec_queue.c
index 395de93579fa..fa6dc996eca8 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
q->lrc[0].fence_ctx.next_seqno - 1;
 }

+/**
+ * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw
+ * @q: The exec queue
+ *
+ * Update the timestamp saved by HW for this exec queue and save runtime
+ * calculated by using the delta from last update. On multi-lrc case, only the
+ * first is considered.
+ */
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
+{
+   struct xe_file *xef;
+   struct xe_lrc *lrc;
+   u32 old_ts, new_ts;
+
+   /*
+* Jobs that are run during driver load may use an exec_queue, but are
+* not associated with a user xe file, so avoid accumulating busyness
+* for kernel specific work.
+*/
+   if (!q->vm || !q->vm->xef)
+   return;
+
+   xef = q->vm->xef;
+
+   /*
+* Only sample the first LRC. For parallel submission, all of them are
+* scheduled together and we compensate that below by multiplying by
+* width - this may introduce errors if that premise is not true and
+* they don't exit 100% aligned. On the other hand, looping through
+* the LRCs and reading them in different time could also introduce
+* errors.


At the time we're executing this function, those LRCs aren't executing
on the hardware anymore and their timestamps aren't continuing to move,


not necessarily. Besides calling this function when execution ends, see
last patch. This is called when userspace reads the fdinfo file, which
portentially reads this for running LRCs.


right?  I don't see where error could creep in from just looping over
each of them?

I guess parallel submission is mostly just used by media these days,
where the batches submitted in parallel are nearly identical and
expected to run the same amount of time, right?  Do we have any


what I got from Matt Brost is that they are always scheduled together
and will run together on the different instances of that engine class,
regardless if they are different. As you said, I'm not sure there's
even a use case for running different batches.  +Matt Brost to confirm
my reasoning below.

Looking at our uapi and some tests in igt. This is controlled by the width
arg when creating the exec queue. Later when executing, we can only pass 1
sync obj to wait for completion. So even if userspace passes different
batch addresses during exec (i.e. different batches), the whole lot will
not complete until everything finishes. I think it's reasonable to
consider everything is busy while it doesn't complete.


userspace (or potential future userspace) that might submit
heterogeneous batches in parallel, which would make this inaccurate?

I'm not very familiar with the use cases of parallel submission, so I'll
trust that you've got a better understanding of the userspace usage than
I do; everything else here looks fine to me.


I kind of learned this part when doing this code :). It'd be good 

Re: [PATCH v2 1/4] drm/vmwgfx: Filter modes which exceed graphics memory

2024-05-18 Thread kernel test robot
Hi Ian,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm/drm-next]
[also build test WARNING on drm-exynos/exynos-drm-next drm-intel/for-linux-next 
drm-intel/for-linux-next-fixes drm-misc/drm-misc-next drm-tip/drm-tip 
linus/master v6.9 next-20240517]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Ian-Forbes/drm-vmwgfx-Filter-modes-which-exceed-graphics-memory/20240518-024955
base:   git://anongit.freedesktop.org/drm/drm drm-next
patch link:
https://lore.kernel.org/r/20240517184811.25807-2-ian.forbes%40broadcom.com
patch subject: [PATCH v2 1/4] drm/vmwgfx: Filter modes which exceed graphics 
memory
config: i386-randconfig-141-20240518 
(https://download.01.org/0day-ci/archive/20240518/202405181844.pa39dlhd-...@intel.com/config)
compiler: clang version 18.1.5 (https://github.com/llvm/llvm-project 
617a15a9eac96088ae5e9134248d8236e34b91b1)
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20240518/202405181844.pa39dlhd-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202405181844.pa39dlhd-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c:47: warning: This comment starts with 
>> '/**', but isn't a kernel-doc comment. Refer 
>> Documentation/doc-guide/kernel-doc.rst
* llvmpipe will align the width and height of its buffers to match its


vim +47 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c

38  
39  #define vmw_crtc_to_stdu(x) \
40  container_of(x, struct vmw_screen_target_display_unit, 
base.crtc)
41  #define vmw_encoder_to_stdu(x) \
42  container_of(x, struct vmw_screen_target_display_unit, 
base.encoder)
43  #define vmw_connector_to_stdu(x) \
44  container_of(x, struct vmw_screen_target_display_unit, 
base.connector)
45  
46  /**
  > 47   * llvmpipe will align the width and height of its buffers to match its
48   * tile size. We need to keep this in mind when exposing modes to 
userspace
49   * so that this possible over-allocation will not exceed graphics 
memory.
50   */
51  #define LLVM_PIPE_TILE_SIZE 64
52  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-05-17 Thread Guenter Roeck

On 5/17/24 11:00, Guenter Roeck wrote:

On 5/17/24 10:48, Steven Rostedt wrote:

On Fri, 17 May 2024 10:36:37 -0700
Guenter Roeck  wrote:


Building csky:allmodconfig (and others) ... failed
--
Error log:
In file included from include/trace/trace_events.h:419,
  from include/trace/define_trace.h:102,
  from drivers/cxl/core/trace.h:737,
  from drivers/cxl/core/trace.c:8:
drivers/cxl/core/./trace.h:383:1: error: macro "__assign_str" passed 2 
arguments, but takes just 1

This is with the patch applied on top of v6.9-8410-gff2632d7d08e.
So far that seems to be the only build failure.
Introduced with commit 6aec00139d3a8 ("cxl/core: Add region info to
cxl_general_media and cxl_dram events"). Guess we'll see more of those
towards the end of the commit window.


Looks like I made this patch just before this commit was pulled into
Linus's tree.

Which is why I'll apply and rerun the above again probably on Tuesday of
next week against Linus's latest.

This patch made it through both an allyesconfig and an allmodconfig, but on
the commit I had applied it to, which was:

   1b294a1f3561 ("Merge tag 'net-next-6.10' of 
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")

I'll be compiling those two builds after I update it then.



I am currently repeating my test builds with the above errors fixed.
That should take a couple of hours. I'll let you know how it goes.



There are no more build failures caused by this patch after fixing the above
errors.

Tested-by: Guenter Roeck 

Guenter



Re: [PATCH net-next v9 05/14] netdev: netdevice devmem allocator

2024-05-17 Thread David Wei
On 2024-05-10 16:21, Mina Almasry wrote:
> +/* This returns the absolute dma_addr_t calculated from
> + * net_iov_owner(niov)->owner->base_dma_addr, not the page_pool-owned
> + * niov->dma_addr.
> + *
> + * The absolute dma_addr_t is a dma_addr_t that is always uncompressed.
> + *
> + * The page_pool-owner niov->dma_addr is the absolute dma_addr compressed 
> into
> + * an unsigned long. Special handling is done when the unsigned long is 
> 32-bit
> + * but the dma_addr_t is 64-bit.
> + *
> + * In general code looking for the dma_addr_t should use net_iov_dma_addr(),
> + * while page_pool code looking for the unsigned long dma_addr which mirrors
> + * the field in struct page should use niov->dma_addr.
> + */
> +static inline dma_addr_t net_iov_dma_addr(const struct net_iov *niov)
> +{
> + struct dmabuf_genpool_chunk_owner *owner = net_iov_owner(niov);
> +
> + return owner->base_dma_addr +
> +((dma_addr_t)net_iov_idx(niov) << PAGE_SHIFT);
> +}

This part feels like devmem TCP specific, yet the function is in
netmem.h. Please consider moving it into devmem.{h,c} which makes it
less likely that people not reading your comment will try using it.

> +
> +static inline struct net_devmem_dmabuf_binding *
> +net_iov_binding(const struct net_iov *niov)
> +{
> + return net_iov_owner(niov)->binding;
> +}
> +
>  /* netmem */
>  
>  /**
> diff --git a/net/core/devmem.c b/net/core/devmem.c
> index d82f92d7cf9ce..1f90e23a81441 100644
> --- a/net/core/devmem.c
> +++ b/net/core/devmem.c
> @@ -54,6 +54,42 @@ void __net_devmem_dmabuf_binding_free(struct 
> net_devmem_dmabuf_binding *binding)
>   kfree(binding);
>  }
>  
> +struct net_iov *
> +net_devmem_alloc_dmabuf(struct net_devmem_dmabuf_binding *binding)
> +{
> + struct dmabuf_genpool_chunk_owner *owner;
> + unsigned long dma_addr;
> + struct net_iov *niov;
> + ssize_t offset;
> + ssize_t index;
> +
> + dma_addr = gen_pool_alloc_owner(binding->chunk_pool, PAGE_SIZE,
> + (void **));
> + if (!dma_addr)
> + return NULL;
> +
> + offset = dma_addr - owner->base_dma_addr;
> + index = offset / PAGE_SIZE;
> + niov = >niovs[index];
> +
> + niov->dma_addr = 0;
> +
> + net_devmem_dmabuf_binding_get(binding);
> +
> + return niov;
> +}
> +
> +void net_devmem_free_dmabuf(struct net_iov *niov)
> +{
> + struct net_devmem_dmabuf_binding *binding = net_iov_binding(niov);
> + unsigned long dma_addr = net_iov_dma_addr(niov);
> +
> + if (gen_pool_has_addr(binding->chunk_pool, dma_addr, PAGE_SIZE))
> + gen_pool_free(binding->chunk_pool, dma_addr, PAGE_SIZE);
> +
> + net_devmem_dmabuf_binding_put(binding);
> +}
> +
>  /* Protected by rtnl_lock() */
>  static DEFINE_XARRAY_FLAGS(net_devmem_dmabuf_bindings, XA_FLAGS_ALLOC1);
>  


Re: [PATCH 1/6] drm/bridge: analogix: remove unused struct 'bridge_init'

2024-05-17 Thread Dr. David Alan Gilbert
(oops the patch numbering in these 3 are wrong, they're all independent
patches)

Dave

* li...@treblig.org (li...@treblig.org) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> 'bridge_init' is unused, I think following:
> commit 6a1688ae8794 ("drm/bridge: ptn3460: Convert to I2C driver model")
> (which is where a git --follow finds it)
> Remove it.
> 
> Build tested.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 5 -
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c 
> b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> index df9370e0ff23..1e03f3525a92 100644
> --- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> +++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
> @@ -36,11 +36,6 @@
>  
>  static const bool verify_fast_training;
>  
> -struct bridge_init {
> - struct i2c_client *client;
> - struct device_node *node;
> -};
> -
>  static int analogix_dp_init_dp(struct analogix_dp_device *dp)
>  {
>   int ret;
> -- 
> 2.45.1
> 
-- 
 -Open up your eyes, open up your mind, open up your code ---   
/ Dr. David Alan Gilbert|   Running GNU/Linux   | Happy  \ 
\dave @ treblig.org |   | In Hex /
 \ _|_ http://www.treblig.org   |___/


[RFC PATCH 1/4] drm/msm: register a fault handler for display mmu faults

2024-05-17 Thread Abhinav Kumar
In preparation to register a iommu fault handler for display
related modules, register a fault handler for the backing
mmu object of msm_kms.

Currently, the fault handler only captures the display snapshot
but we can expand this later if more information needs to be
added to debug display mmu faults.

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/msm_kms.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c
index af6a6fcb1173..62c8e6163e81 100644
--- a/drivers/gpu/drm/msm/msm_kms.c
+++ b/drivers/gpu/drm/msm/msm_kms.c
@@ -200,6 +200,28 @@ struct msm_gem_address_space *msm_kms_init_aspace(struct 
drm_device *dev)
return aspace;
 }
 
+static int msm_kms_fault_handler(void *arg, unsigned long iova, int flags, 
void *data)
+{
+   struct msm_kms *kms = arg;
+   struct msm_disp_state *state;
+   int ret;
+
+   ret = mutex_lock_interruptible(>dump_mutex);
+   if (ret)
+   return ret;
+
+   state = msm_disp_snapshot_state_sync(kms);
+
+   mutex_unlock(>dump_mutex);
+
+   if (IS_ERR(state)) {
+   DRM_DEV_ERROR(kms->dev->dev, "failed to capture snapshot\n");
+   return PTR_ERR(state);
+   }
+
+   return 0;
+}
+
 void msm_drm_kms_uninit(struct device *dev)
 {
struct platform_device *pdev = to_platform_device(dev);
@@ -261,6 +283,9 @@ int msm_drm_kms_init(struct device *dev, const struct 
drm_driver *drv)
goto err_msm_uninit;
}
 
+   if (kms->aspace)
+   msm_mmu_set_fault_handler(kms->aspace->mmu, kms, 
msm_kms_fault_handler);
+
drm_helper_move_panel_connectors_to_head(ddev);
 
drm_for_each_crtc(crtc, ddev) {
-- 
2.44.0



[RFC PATCH 2/4] drm/msm/iommu: rename msm_fault_handler to msm_gpu_fault_handler

2024-05-17 Thread Abhinav Kumar
In preparation of registering a separate fault handler for
display, lets rename the existing msm_fault_handler to
msm_gpu_fault_handler.

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/msm_iommu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index d5512037c38b..a79cd18bc4c9 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -243,7 +243,7 @@ static const struct iommu_flush_ops tlb_ops = {
.tlb_add_page = msm_iommu_tlb_add_page,
 };
 
-static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
+static int msm_gpu_fault_handler(struct iommu_domain *domain, struct device 
*dev,
unsigned long iova, int flags, void *arg);
 
 struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu *parent)
@@ -319,7 +319,7 @@ struct msm_mmu *msm_iommu_pagetable_create(struct msm_mmu 
*parent)
return >base;
 }
 
-static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
+static int msm_gpu_fault_handler(struct iommu_domain *domain, struct device 
*dev,
unsigned long iova, int flags, void *arg)
 {
struct msm_iommu *iommu = arg;
@@ -445,7 +445,7 @@ struct msm_mmu *msm_iommu_gpu_new(struct device *dev, 
struct msm_gpu *gpu, unsig
return mmu;
 
iommu = to_msm_iommu(mmu);
-   iommu_set_fault_handler(iommu->domain, msm_fault_handler, iommu);
+   iommu_set_fault_handler(iommu->domain, msm_gpu_fault_handler, iommu);
 
/* Enable stall on iommu fault: */
if (adreno_smmu->set_stall)
-- 
2.44.0



[RFC PATCH 3/4] drm/msm/iommu: introduce msm_iommu_disp_new() for msm_kms

2024-05-17 Thread Abhinav Kumar
Introduce a new API msm_iommu_disp_new() for display use-cases.

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/msm_iommu.c | 28 
 drivers/gpu/drm/msm/msm_mmu.h   |  1 +
 2 files changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index a79cd18bc4c9..3d5c1bb4c013 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -343,6 +343,19 @@ static int msm_gpu_fault_handler(struct iommu_domain 
*domain, struct device *dev
return 0;
 }
 
+static int msm_disp_fault_handler(struct iommu_domain *domain, struct device 
*dev,
+ unsigned long iova, int flags, void *arg)
+{
+   struct msm_iommu *iommu = arg;
+
+   if (iommu->base.handler)
+   return iommu->base.handler(iommu->base.arg, iova, flags, NULL);
+
+   pr_warn_ratelimited("*** fault: iova=%16lx, flags=%d\n", iova, flags);
+
+   return 0;
+}
+
 static void msm_iommu_resume_translation(struct msm_mmu *mmu)
 {
struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(mmu->dev);
@@ -434,6 +447,21 @@ struct msm_mmu *msm_iommu_new(struct device *dev, unsigned 
long quirks)
return >base;
 }
 
+struct msm_mmu *msm_iommu_disp_new(struct device *dev, unsigned long quirks)
+{
+   struct msm_iommu *iommu;
+   struct msm_mmu *mmu;
+
+   mmu = msm_iommu_new(dev, quirks);
+   if (IS_ERR_OR_NULL(mmu))
+   return mmu;
+
+   iommu = to_msm_iommu(mmu);
+   iommu_set_fault_handler(iommu->domain, msm_disp_fault_handler, iommu);
+
+   return mmu;
+}
+
 struct msm_mmu *msm_iommu_gpu_new(struct device *dev, struct msm_gpu *gpu, 
unsigned long quirks)
 {
struct adreno_smmu_priv *adreno_smmu = dev_get_drvdata(dev);
diff --git a/drivers/gpu/drm/msm/msm_mmu.h b/drivers/gpu/drm/msm/msm_mmu.h
index 88af4f490881..730458d08d6b 100644
--- a/drivers/gpu/drm/msm/msm_mmu.h
+++ b/drivers/gpu/drm/msm/msm_mmu.h
@@ -42,6 +42,7 @@ static inline void msm_mmu_init(struct msm_mmu *mmu, struct 
device *dev,
 
 struct msm_mmu *msm_iommu_new(struct device *dev, unsigned long quirks);
 struct msm_mmu *msm_iommu_gpu_new(struct device *dev, struct msm_gpu *gpu, 
unsigned long quirks);
+struct msm_mmu *msm_iommu_disp_new(struct device *dev, unsigned long quirks);
 
 static inline void msm_mmu_set_fault_handler(struct msm_mmu *mmu, void *arg,
int (*handler)(void *arg, unsigned long iova, int flags, void 
*data))
-- 
2.44.0



[RFC PATCH 4/4] drm/msm: switch msm_kms to use msm_iommu_disp_new()

2024-05-17 Thread Abhinav Kumar
Switch msm_kms to use msm_iommu_disp_new() so that the newly
registered fault handler will kick-in during any mmu faults.

Signed-off-by: Abhinav Kumar 
---
 drivers/gpu/drm/msm/msm_kms.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c
index 62c8e6163e81..1859efbbff1d 100644
--- a/drivers/gpu/drm/msm/msm_kms.c
+++ b/drivers/gpu/drm/msm/msm_kms.c
@@ -181,7 +181,7 @@ struct msm_gem_address_space *msm_kms_init_aspace(struct 
drm_device *dev)
else
iommu_dev = mdss_dev;
 
-   mmu = msm_iommu_new(iommu_dev, 0);
+   mmu = msm_iommu_disp_new(iommu_dev, 0);
if (IS_ERR(mmu))
return ERR_CAST(mmu);
 
-- 
2.44.0



[RFC PATCH 0/4] drm/msm: add a display mmu fault handler

2024-05-17 Thread Abhinav Kumar
To debug display mmu faults, this series introduces a display fault
handler similar to the gpu one.

This is only compile tested at the moment, till a suitable method
to trigger the fault is found and see if this handler does the needful
on the device.

Abhinav Kumar (4):
  drm/msm: register a fault handler for display mmu faults
  drm/msm/iommu: rename msm_fault_handler to msm_gpu_fault_handler
  drm/msm/iommu: introduce msm_iommu_disp_new() for msm_kms
  drm/msm: switch msm_kms to use msm_iommu_disp_new()

 drivers/gpu/drm/msm/msm_iommu.c | 34 ++---
 drivers/gpu/drm/msm/msm_kms.c   | 27 +-
 drivers/gpu/drm/msm/msm_mmu.h   |  1 +
 3 files changed, 58 insertions(+), 4 deletions(-)

-- 
2.44.0



[PATCH 2/3] drm/amd/display: remove unused struct 'aux_payloads'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'aux_payloads' is unused since
commit eae5ffa9bd7b ("drm/amd/display: Switch ddc to new aux interface")
Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c 
b/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c
index c2d40979203e..d6d5bbf2108c 100644
--- a/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c
+++ b/drivers/gpu/drm/amd/display/dc/link/protocols/link_ddc.c
@@ -51,10 +51,6 @@ struct i2c_payloads {
struct vector payloads;
 };
 
-struct aux_payloads {
-   struct vector payloads;
-};
-
 static bool i2c_payloads_create(
struct dc_context *ctx,
struct i2c_payloads *payloads,
-- 
2.45.1



[PATCH 3/3] drm/amd/display: remove unused struct 'dc_reg_sequence'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'dc_reg_sequence' was added in
commit 44788bbc309b ("drm/amd/display: refactor reg_update")

but isn't actually used.

Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/amd/display/dc/dc_helper.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dc_helper.c 
b/drivers/gpu/drm/amd/display/dc/dc_helper.c
index 8f9a67825615..b81419c95222 100644
--- a/drivers/gpu/drm/amd/display/dc/dc_helper.c
+++ b/drivers/gpu/drm/amd/display/dc/dc_helper.c
@@ -91,11 +91,6 @@ struct dc_reg_value_masks {
uint32_t mask;
 };
 
-struct dc_reg_sequence {
-   uint32_t addr;
-   struct dc_reg_value_masks value_masks;
-};
-
 static inline void set_reg_field_value_masks(
struct dc_reg_value_masks *field_value_mask,
uint32_t value,
-- 
2.45.1



[PATCH 1/3] drm/amdgpu: remove unused struct 'hqd_registers'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'hqd_registers' used to be used in a member of the 'bonaire_mqd'
struct. 'bonaire_mqd' was removed by
commit 486d807cd9a9 ("drm/amdgpu: remove duplicate definition of
cik_mqd")
It's now unused.

Remove 'hqd_registers' as well.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 38 ---
 1 file changed, 38 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c 
b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 541dbd70d8c7..f3544f02ffb9 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -2757,44 +2757,6 @@ static int gfx_v7_0_mec_init(struct amdgpu_device *adev)
return 0;
 }
 
-struct hqd_registers {
-   u32 cp_mqd_base_addr;
-   u32 cp_mqd_base_addr_hi;
-   u32 cp_hqd_active;
-   u32 cp_hqd_vmid;
-   u32 cp_hqd_persistent_state;
-   u32 cp_hqd_pipe_priority;
-   u32 cp_hqd_queue_priority;
-   u32 cp_hqd_quantum;
-   u32 cp_hqd_pq_base;
-   u32 cp_hqd_pq_base_hi;
-   u32 cp_hqd_pq_rptr;
-   u32 cp_hqd_pq_rptr_report_addr;
-   u32 cp_hqd_pq_rptr_report_addr_hi;
-   u32 cp_hqd_pq_wptr_poll_addr;
-   u32 cp_hqd_pq_wptr_poll_addr_hi;
-   u32 cp_hqd_pq_doorbell_control;
-   u32 cp_hqd_pq_wptr;
-   u32 cp_hqd_pq_control;
-   u32 cp_hqd_ib_base_addr;
-   u32 cp_hqd_ib_base_addr_hi;
-   u32 cp_hqd_ib_rptr;
-   u32 cp_hqd_ib_control;
-   u32 cp_hqd_iq_timer;
-   u32 cp_hqd_iq_rptr;
-   u32 cp_hqd_dequeue_request;
-   u32 cp_hqd_dma_offload;
-   u32 cp_hqd_sema_cmd;
-   u32 cp_hqd_msg_type;
-   u32 cp_hqd_atomic0_preop_lo;
-   u32 cp_hqd_atomic0_preop_hi;
-   u32 cp_hqd_atomic1_preop_lo;
-   u32 cp_hqd_atomic1_preop_hi;
-   u32 cp_hqd_hq_scheduler0;
-   u32 cp_hqd_hq_scheduler1;
-   u32 cp_mqd_control;
-};
-
 static void gfx_v7_0_compute_pipe_init(struct amdgpu_device *adev,
   int mec, int pipe)
 {
-- 
2.45.1



[PATCH 0/3] A bunch of struct removals

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

A bunch of deadcode/struct removals in drm/amd

Signed-off-by: Dr. David Alan Gilbert 


Dr. David Alan Gilbert (3):
  drm/amdgpu: remove unused struct 'hqd_registers'
  drm/amd/display: remove unused struct 'aux_payloads'
  drm/amd/display: remove unused struct 'dc_reg_sequence'

 drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c | 38 ---
 drivers/gpu/drm/amd/display/dc/dc_helper.c|  5 ---
 .../amd/display/dc/link/protocols/link_ddc.c  |  4 --
 3 files changed, 47 deletions(-)

-- 
2.45.1



[PATCH 3/6] drm/vmwgfx: remove unused struct 'vmw_stdu_dma'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'vmw_stdu_dma' is unused since
commit 39985eea5a6d ("drm/vmwgfx: Abstract placement selection")
Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index 2041c4d48daa..50022e9e3519 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -85,11 +85,6 @@ struct vmw_stdu_update {
SVGA3dCmdUpdateGBScreenTarget body;
 };
 
-struct vmw_stdu_dma {
-   SVGA3dCmdHeader header;
-   SVGA3dCmdSurfaceDMA body;
-};
-
 struct vmw_stdu_surface_copy {
SVGA3dCmdHeader  header;
SVGA3dCmdSurfaceCopy body;
-- 
2.45.1



[PATCH 2/6] drm/nouveau: remove unused struct 'init_exec'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'init_exec' is unused since
commit cb75d97e9c77 ("drm/nouveau: implement devinit subdev, and new
init table parser")
Remove it.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/nouveau/nouveau_bios.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bios.c 
b/drivers/gpu/drm/nouveau/nouveau_bios.c
index 79cfab53f80e..8c3c1f1e01c5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bios.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bios.c
@@ -43,11 +43,6 @@
 #define BIOSLOG(sip, fmt, arg...) NV_DEBUG(sip->dev, fmt, ##arg)
 #define LOG_OLD_VALUE(x)
 
-struct init_exec {
-   bool execute;
-   bool repeat;
-};
-
 static bool nv_cksum(const uint8_t *data, unsigned int length)
 {
/*
-- 
2.45.1



[PATCH 1/6] drm/bridge: analogix: remove unused struct 'bridge_init'

2024-05-17 Thread linux
From: "Dr. David Alan Gilbert" 

'bridge_init' is unused, I think following:
commit 6a1688ae8794 ("drm/bridge: ptn3460: Convert to I2C driver model")
(which is where a git --follow finds it)
Remove it.

Build tested.

Signed-off-by: Dr. David Alan Gilbert 
---
 drivers/gpu/drm/bridge/analogix/analogix_dp_core.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c 
b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index df9370e0ff23..1e03f3525a92 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -36,11 +36,6 @@
 
 static const bool verify_fast_training;
 
-struct bridge_init {
-   struct i2c_client *client;
-   struct device_node *node;
-};
-
 static int analogix_dp_init_dp(struct analogix_dp_device *dp)
 {
int ret;
-- 
2.45.1



Re: [PATCH v5 5/8] drm/xe: Add helper to accumulate exec queue runtime

2024-05-17 Thread Matt Roper
On Fri, May 17, 2024 at 01:43:07PM -0700, Lucas De Marchi wrote:
> From: Umesh Nerlige Ramappa 
> 
> Add a helper to accumulate per-client runtime of all its
> exec queues. This is called every time a sched job is finished.
> 
> v2:
>   - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
> runtime when job is finished since xe_sched_job_completed() is not a
> notification that job finished.
>   - Stop trying to update runtime from xe_exec_queue_fini() - that is
> redundant and may happen after xef is closed, leading to a
> use-after-free
>   - Do not special case the first timestamp read: the default LRC sets
> CTX_TIMESTAMP to zero, so even the first sample should be a valid
> one.
>   - Handle the parallel submission case by multiplying the runtime by
> width.
> v3: Update comments
> 
> Signed-off-by: Umesh Nerlige Ramappa 
> Signed-off-by: Lucas De Marchi 
> ---
>  drivers/gpu/drm/xe/xe_device_types.h |  3 +++
>  drivers/gpu/drm/xe/xe_exec_queue.c   | 37 
>  drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
>  drivers/gpu/drm/xe/xe_execlist.c |  1 +
>  drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
>  5 files changed, 44 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
> b/drivers/gpu/drm/xe/xe_device_types.h
> index 5c5e36de452a..bc97990fd032 100644
> --- a/drivers/gpu/drm/xe/xe_device_types.h
> +++ b/drivers/gpu/drm/xe/xe_device_types.h
> @@ -555,6 +555,9 @@ struct xe_file {
>   struct mutex lock;
>   } exec_queue;
>  
> + /** @runtime: hw engine class runtime in ticks for this drm client */
> + u64 runtime[XE_ENGINE_CLASS_MAX];
> +
>   /** @client: drm client */
>   struct xe_drm_client *client;
>  };
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c 
> b/drivers/gpu/drm/xe/xe_exec_queue.c
> index 395de93579fa..fa6dc996eca8 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.c
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.c
> @@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
>   q->lrc[0].fence_ctx.next_seqno - 1;
>  }
>  
> +/**
> + * xe_exec_queue_update_runtime() - Update runtime for this exec queue from 
> hw
> + * @q: The exec queue
> + *
> + * Update the timestamp saved by HW for this exec queue and save runtime
> + * calculated by using the delta from last update. On multi-lrc case, only 
> the
> + * first is considered.
> + */
> +void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
> +{
> + struct xe_file *xef;
> + struct xe_lrc *lrc;
> + u32 old_ts, new_ts;
> +
> + /*
> +  * Jobs that are run during driver load may use an exec_queue, but are
> +  * not associated with a user xe file, so avoid accumulating busyness
> +  * for kernel specific work.
> +  */
> + if (!q->vm || !q->vm->xef)
> + return;
> +
> + xef = q->vm->xef;
> +
> + /*
> +  * Only sample the first LRC. For parallel submission, all of them are
> +  * scheduled together and we compensate that below by multiplying by
> +  * width - this may introduce errors if that premise is not true and
> +  * they don't exit 100% aligned. On the other hand, looping through
> +  * the LRCs and reading them in different time could also introduce
> +  * errors.

At the time we're executing this function, those LRCs aren't executing
on the hardware anymore and their timestamps aren't continuing to move,
right?  I don't see where error could creep in from just looping over
each of them?

I guess parallel submission is mostly just used by media these days,
where the batches submitted in parallel are nearly identical and
expected to run the same amount of time, right?  Do we have any
userspace (or potential future userspace) that might submit
heterogeneous batches in parallel, which would make this inaccurate?

I'm not very familiar with the use cases of parallel submission, so I'll
trust that you've got a better understanding of the userspace usage than
I do; everything else here looks fine to me.

Reviewed-by: Matt Roper 


Matt

> +  */
> + lrc = >lrc[0];
> + new_ts = xe_lrc_update_timestamp(lrc, _ts);
> + xef->runtime[q->class] += (new_ts - old_ts) * q->width;
> +}
> +
>  void xe_exec_queue_kill(struct xe_exec_queue *q)
>  {
>   struct xe_exec_queue *eq = q, *next;
> diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h 
> b/drivers/gpu/drm/xe/xe_exec_queue.h
> index 48f6da53a292..e0f07d28ee1a 100644
> --- a/drivers/gpu/drm/xe/xe_exec_queue.h
> +++ b/drivers/gpu/drm/xe/xe_exec_queue.h
> @@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct 
> xe_exec_queue *e,
>  struct xe_vm *vm);
>  void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm,
> struct dma_fence *fence);
> +void xe_exec_queue_update_runtime(struct xe_exec_queue *q);
>  
>  #endif
> diff --git a/drivers/gpu/drm/xe/xe_execlist.c 

[PATCH 6/8] drm/panel: himax-hx83102: If prepare fails, disable GPIO before regulators

2024-05-17 Thread Douglas Anderson
The enable GPIO should clearly be set low before turning off
regulators. That matches both the inverse order that things were
enabled and also the order in unprepare().

Fixes: 0ef94554dc40 ("drm/panel: himax-hx83102: Break out as separate driver")
Signed-off-by: Douglas Anderson 
---

 drivers/gpu/drm/panel/panel-himax-hx83102.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panel/panel-himax-hx83102.c 
b/drivers/gpu/drm/panel/panel-himax-hx83102.c
index 1a6975937f30..4ac7f9d8b232 100644
--- a/drivers/gpu/drm/panel/panel-himax-hx83102.c
+++ b/drivers/gpu/drm/panel/panel-himax-hx83102.c
@@ -578,13 +578,13 @@ static int hx83102_prepare(struct drm_panel *panel)
return 0;
 
 poweroff:
+   gpiod_set_value(ctx->enable_gpio, 0);
regulator_disable(ctx->avee);
 poweroffavdd:
regulator_disable(ctx->avdd);
 poweroff1v8:
usleep_range(5000, 7000);
regulator_disable(ctx->pp1800);
-   gpiod_set_value(ctx->enable_gpio, 0);
 
return ret;
 }
-- 
2.45.0.rc1.225.g2a3ae87e7f-goog



[PATCH 7/8] drm/panel: himax-hx83102: Check for errors on the NOP in prepare()

2024-05-17 Thread Douglas Anderson
The mipi_dsi_dcs_nop() function returns an error but we weren't
checking it in hx83102_prepare(). Add a check. This is highly unlikely
to matter in practice. If the NOP failed then likely later MIPI
commands would fail too.

Found by code inspection.

Fixes: 0ef94554dc40 ("drm/panel: himax-hx83102: Break out as separate driver")
Signed-off-by: Douglas Anderson 
---

 drivers/gpu/drm/panel/panel-himax-hx83102.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panel/panel-himax-hx83102.c 
b/drivers/gpu/drm/panel/panel-himax-hx83102.c
index 4ac7f9d8b232..1ba623e41924 100644
--- a/drivers/gpu/drm/panel/panel-himax-hx83102.c
+++ b/drivers/gpu/drm/panel/panel-himax-hx83102.c
@@ -547,7 +547,11 @@ static int hx83102_prepare(struct drm_panel *panel)
 
usleep_range(1, 11000);
 
-   mipi_dsi_dcs_nop(ctx->dsi);
+   ret = mipi_dsi_dcs_nop(ctx->dsi);
+   if (ret < 0) {
+   dev_err(dev, "Failed to send NOP: %d\n", ret);
+   goto poweroff;
+   }
usleep_range(1000, 2000);
 
gpiod_set_value(ctx->enable_gpio, 1);
-- 
2.45.0.rc1.225.g2a3ae87e7f-goog



[PATCH 8/8] drm/panel: himax-hx83102: use wrapped MIPI DCS functions

2024-05-17 Thread Douglas Anderson
Take advantage of some of the new wrapped routines introduced by
commit f79d6d28d8fe ("drm/mipi-dsi: wrap more functions for streamline
handling") to simplify the himax-hx83102 driver a bit more. This gets
rid of some extra error prints (since the _multi functions all print
errors for you) and simplifies the code a bit.

One thing here that isn't just refactoring is that in a few places we
now check with errors with "if (err)" instead of "if (err < 0)". All
errors are expected to be negative so this is not expected to have any
impact. The _multi code internally considers anything non-zero to be
an error so this just makes things consistent.

It can also be noted that hx83102_prepare() has a mix of things that
can take advantage of _multi calls and things that can't. The cleanest
seemed to be to use the multi_ctx still but consistently use the
"accum_err" variable for error returns, though that's definitely a
style decision with pros and cons.

Signed-off-by: Douglas Anderson 
---

 drivers/gpu/drm/panel/panel-himax-hx83102.c | 92 +++--
 1 file changed, 28 insertions(+), 64 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-himax-hx83102.c 
b/drivers/gpu/drm/panel/panel-himax-hx83102.c
index 1ba623e41924..6009a3fe1b8f 100644
--- a/drivers/gpu/drm/panel/panel-himax-hx83102.c
+++ b/drivers/gpu/drm/panel/panel-himax-hx83102.c
@@ -285,12 +285,10 @@ static int boe_nv110wum_init(struct hx83102 *ctx)
mipi_dsi_dcs_write_seq_multi(_ctx, HX83102_SETSPCCMD, 0x3f);
mipi_dsi_dcs_write_seq_multi(_ctx, HX83102_SETBANK, 0x00);
hx83102_enable_extended_cmds(_ctx, false);
-   if (dsi_ctx.accum_err)
-   return dsi_ctx.accum_err;
 
-   msleep(50);
+   mipi_dsi_msleep(dsi_ctx, 50);
 
-   return 0;
+   return dsi_ctx.accum_err;
 };
 
 static int ivo_t109nw41_init(struct hx83102 *ctx)
@@ -392,12 +390,10 @@ static int ivo_t109nw41_init(struct hx83102 *ctx)
mipi_dsi_dcs_write_seq_multi(_ctx, HX83102_SETSPCCMD, 0x3f);
mipi_dsi_dcs_write_seq_multi(_ctx, HX83102_SETBANK, 0x00);
hx83102_enable_extended_cmds(_ctx, false);
-   if (dsi_ctx.accum_err)
-   return dsi_ctx.accum_err;
 
-   msleep(60);
+   mipi_dsi_msleep(dsi_ctx, 60);
 
-   return 0;
+   return dsi_ctx.accum_err;
 };
 
 static const struct drm_display_mode starry_mode = {
@@ -472,40 +468,20 @@ static int hx83102_enable(struct drm_panel *panel)
return 0;
 }
 
-static int hx83102_panel_enter_sleep_mode(struct hx83102 *ctx)
-{
-   struct mipi_dsi_device *dsi = ctx->dsi;
-   int ret;
-
-   dsi->mode_flags &= ~MIPI_DSI_MODE_LPM;
-
-   ret = mipi_dsi_dcs_set_display_off(dsi);
-   if (ret < 0)
-   return ret;
-
-   ret = mipi_dsi_dcs_enter_sleep_mode(dsi);
-   if (ret < 0)
-   return ret;
-
-   return 0;
-}
-
 static int hx83102_disable(struct drm_panel *panel)
 {
struct hx83102 *ctx = panel_to_hx83102(panel);
struct mipi_dsi_device *dsi = ctx->dsi;
-   struct device *dev = >dev;
-   int ret;
+   struct mipi_dsi_multi_context dsi_ctx = { .dsi = dsi };
 
-   ret = hx83102_panel_enter_sleep_mode(ctx);
-   if (ret < 0) {
-   dev_err(dev, "failed to set panel off: %d\n", ret);
-   return ret;
-   }
+   dsi->mode_flags &= ~MIPI_DSI_MODE_LPM;
 
-   msleep(150);
+   mipi_dsi_dcs_set_display_off_multi(_ctx);
+   mipi_dsi_dcs_enter_sleep_mode_multi(_ctx);
 
-   return 0;
+   mipi_dsi_msleep(_ctx, 150);
+
+   return dsi_ctx.accum_err;
 }
 
 static int hx83102_unprepare(struct drm_panel *panel)
@@ -526,32 +502,30 @@ static int hx83102_prepare(struct drm_panel *panel)
 {
struct hx83102 *ctx = panel_to_hx83102(panel);
struct mipi_dsi_device *dsi = ctx->dsi;
-   struct device *dev = >dev;
-   int ret;
+   struct mipi_dsi_multi_context dsi_ctx = { .dsi = dsi };
 
gpiod_set_value(ctx->enable_gpio, 0);
usleep_range(1000, 1500);
 
-   ret = regulator_enable(ctx->pp1800);
-   if (ret < 0)
-   return ret;
+   dsi_ctx.accum_err = regulator_enable(ctx->pp1800);
+   if (dsi_ctx.accum_err)
+   return dsi_ctx.accum_err;
 
usleep_range(3000, 5000);
 
-   ret = regulator_enable(ctx->avdd);
-   if (ret < 0)
+   dsi_ctx.accum_err = regulator_enable(ctx->avdd);
+   if (dsi_ctx.accum_err)
goto poweroff1v8;
-   ret = regulator_enable(ctx->avee);
-   if (ret < 0)
+   dsi_ctx.accum_err = regulator_enable(ctx->avee);
+   if (dsi_ctx.accum_err)
goto poweroffavdd;
 
usleep_range(1, 11000);
 
-   ret = mipi_dsi_dcs_nop(ctx->dsi);
-   if (ret < 0) {
-   dev_err(dev, "Failed to send NOP: %d\n", ret);
+   mipi_dsi_dcs_nop_multi(_ctx);
+   if (dsi_ctx.accum_err)
goto poweroff;
-   }
+
usleep_range(1000, 2000);
 

[PATCH 3/8] drm/panel: boe-tv101wum-nl6: Check for errors on the NOP in prepare()

2024-05-17 Thread Douglas Anderson
The mipi_dsi_dcs_nop() function returns an error but we weren't
checking it in boe_panel_prepare(). Add a check. This is highly
unlikely to matter in practice. If the NOP failed then likely later
MIPI commands would fail too.

Found by code inspection.

Fixes: 812562b8d881 ("drm/panel: boe-tv101wum-nl6: Fine tune the panel power 
sequence")
Signed-off-by: Douglas Anderson 
---

 drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c 
b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
index 028625d2d37d..f7beace455c3 100644
--- a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
+++ b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
@@ -1456,7 +1456,11 @@ static int boe_panel_prepare(struct drm_panel *panel)
usleep_range(1, 11000);
 
if (boe->desc->lp11_before_reset) {
-   mipi_dsi_dcs_nop(boe->dsi);
+   ret = mipi_dsi_dcs_nop(boe->dsi);
+   if (ret < 0) {
+   dev_err(>dsi->dev, "Failed to send NOP: %d\n", 
ret);
+   goto poweroff;
+   }
usleep_range(1000, 2000);
}
gpiod_set_value(boe->enable_gpio, 1);
-- 
2.45.0.rc1.225.g2a3ae87e7f-goog



[PATCH 5/8] drm/panel: ilitek-ili9882t: Check for errors on the NOP in prepare()

2024-05-17 Thread Douglas Anderson
The mipi_dsi_dcs_nop() function returns an error but we weren't
checking it in ili9882t_prepare(). Add a check. This is highly
unlikely to matter in practice. If the NOP failed then likely later
MIPI commands would fail too.

Found by code inspection.

Fixes: e2450d32e5fb ("drm/panel: ili9882t: Break out as separate driver")
Signed-off-by: Douglas Anderson 
---

 drivers/gpu/drm/panel/panel-ilitek-ili9882t.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panel/panel-ilitek-ili9882t.c 
b/drivers/gpu/drm/panel/panel-ilitek-ili9882t.c
index a2ea25bb6624..266a087fe14c 100644
--- a/drivers/gpu/drm/panel/panel-ilitek-ili9882t.c
+++ b/drivers/gpu/drm/panel/panel-ilitek-ili9882t.c
@@ -478,7 +478,11 @@ static int ili9882t_prepare(struct drm_panel *panel)
usleep_range(1, 11000);
 
// MIPI needs to keep the LP11 state before the lcm_reset pin is pulled 
high
-   mipi_dsi_dcs_nop(ili->dsi);
+   ret = mipi_dsi_dcs_nop(ili->dsi);
+   if (ret < 0) {
+   dev_err(>dsi->dev, "Failed to send NOP: %d\n", ret);
+   goto poweroff;
+   }
usleep_range(1000, 2000);
 
gpiod_set_value(ili->enable_gpio, 1);
-- 
2.45.0.rc1.225.g2a3ae87e7f-goog



[PATCH 4/8] drm/panel: ilitek-ili9882t: If prepare fails, disable GPIO before regulators

2024-05-17 Thread Douglas Anderson
The enable GPIO should clearly be set low before turning off
regulators. That matches both the inverse order that things were
enabled and also the order in unprepare().

Fixes: e2450d32e5fb ("drm/panel: ili9882t: Break out as separate driver")
Signed-off-by: Douglas Anderson 
---

 drivers/gpu/drm/panel/panel-ilitek-ili9882t.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panel/panel-ilitek-ili9882t.c 
b/drivers/gpu/drm/panel/panel-ilitek-ili9882t.c
index 830d7cfbe857..a2ea25bb6624 100644
--- a/drivers/gpu/drm/panel/panel-ilitek-ili9882t.c
+++ b/drivers/gpu/drm/panel/panel-ilitek-ili9882t.c
@@ -495,13 +495,13 @@ static int ili9882t_prepare(struct drm_panel *panel)
return 0;
 
 poweroff:
+   gpiod_set_value(ili->enable_gpio, 0);
regulator_disable(ili->avee);
 poweroffavdd:
regulator_disable(ili->avdd);
 poweroff1v8:
usleep_range(5000, 7000);
regulator_disable(ili->pp1800);
-   gpiod_set_value(ili->enable_gpio, 0);
 
return ret;
 }
-- 
2.45.0.rc1.225.g2a3ae87e7f-goog



[PATCH 2/8] drm/panel: boe-tv101wum-nl6: If prepare fails, disable GPIO before regulators

2024-05-17 Thread Douglas Anderson
The enable GPIO should clearly be set low before turning off
regulators. That matches both the inverse order that things were
enabled and also the order in unprepare().

Fixes: a869b9db7adf ("drm/panel: support for boe tv101wum-nl6 wuxga dsi video 
mode panel")
Signed-off-by: Douglas Anderson 
---

 drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c 
b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
index 8e839a1749e4..028625d2d37d 100644
--- a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
+++ b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
@@ -1475,13 +1475,13 @@ static int boe_panel_prepare(struct drm_panel *panel)
return 0;
 
 poweroff:
+   gpiod_set_value(boe->enable_gpio, 0);
regulator_disable(boe->avee);
 poweroffavdd:
regulator_disable(boe->avdd);
 poweroff1v8:
usleep_range(5000, 7000);
regulator_disable(boe->pp1800);
-   gpiod_set_value(boe->enable_gpio, 0);
 
return ret;
 }
-- 
2.45.0.rc1.225.g2a3ae87e7f-goog



[PATCH 1/8] drm/panel: himax-hx8394: Handle errors from mipi_dsi_dcs_set_display_on() better

2024-05-17 Thread Douglas Anderson
If mipi_dsi_dcs_set_display_on() returned an error then we'd store
that in the "ret" variable and jump to error handling. We'd then
attempt an orderly poweroff. Unfortunately we then blew away the value
stored in "ret". That means that if the orderly poweroff actually
worked then we're return 0 (no error) from hx8394_enable() even though
the panel wasn't enabled.

Fix this by not blowing away "ret".

Found by code inspection.

Fixes: 65dc9360f741 ("drm: panel: Add Himax HX8394 panel controller driver")
Signed-off-by: Douglas Anderson 
---

 drivers/gpu/drm/panel/panel-himax-hx8394.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-himax-hx8394.c 
b/drivers/gpu/drm/panel/panel-himax-hx8394.c
index ff0dc08b9829..cb9f46e853de 100644
--- a/drivers/gpu/drm/panel/panel-himax-hx8394.c
+++ b/drivers/gpu/drm/panel/panel-himax-hx8394.c
@@ -370,8 +370,7 @@ static int hx8394_enable(struct drm_panel *panel)
 
 sleep_in:
/* This will probably fail, but let's try orderly power off anyway. */
-   ret = mipi_dsi_dcs_enter_sleep_mode(dsi);
-   if (!ret)
+   if (!mipi_dsi_dcs_enter_sleep_mode(dsi))
msleep(50);
 
return ret;
-- 
2.45.0.rc1.225.g2a3ae87e7f-goog



[PATCH 0/8] drm/panel: Some very minor err handling fixes + more _multi

2024-05-17 Thread Douglas Anderson


This series is pretty much just addressing a few minor error handling
bugs that I noticed recently while reviewing some panel patches. For
the most part these are all old issues.

This also adjusts the new himax-hx83102 in a similar way that Dmitry
did in his recent series that included commit f79d6d28d8fe
("drm/mipi-dsi: wrap more functions for streamline handling"). His
series handled the panel driver that himax-hx83102 forked from but not
himax-hx83102.


Douglas Anderson (8):
  drm/panel: himax-hx8394: Handle errors from
mipi_dsi_dcs_set_display_on() better
  drm/panel: boe-tv101wum-nl6: If prepare fails, disable GPIO before
regulators
  drm/panel: boe-tv101wum-nl6: Check for errors on the NOP in prepare()
  drm/panel: ilitek-ili9882t: If prepare fails, disable GPIO before
regulators
  drm/panel: ilitek-ili9882t: Check for errors on the NOP in prepare()
  drm/panel: himax-hx83102: If prepare fails, disable GPIO before
regulators
  drm/panel: himax-hx83102: Check for errors on the NOP in prepare()
  drm/panel: himax-hx83102: use wrapped MIPI DCS functions

 .../gpu/drm/panel/panel-boe-tv101wum-nl6.c|  8 +-
 drivers/gpu/drm/panel/panel-himax-hx83102.c   | 92 ++-
 drivers/gpu/drm/panel/panel-himax-hx8394.c|  3 +-
 drivers/gpu/drm/panel/panel-ilitek-ili9882t.c |  8 +-
 4 files changed, 43 insertions(+), 68 deletions(-)

-- 
2.45.0.rc1.225.g2a3ae87e7f-goog



Re: [PATCH v8 07/10] lib: add basic KUnit test for lib/math

2024-05-17 Thread Daniel Latypov
On Fri, May 17, 2024 at 1:14 PM Andy Shevchenko
 wrote:
> > [devarsht: Rebase to 6.9 and change license to GPL]
>
> I'm not sure that you may change license. It needs the author's confirmation.

Checking, this is referring to the MODULE_LICENSE, which used to be
MODULE_LICENSE("GPL v2");

and is now
MODULE_LICENSE("GPL");

If checkpatch is suggesting that now, then changing it sounds good to me.

>
> > ---
> > Changes since v6:
> > * Rebase to linux-next, change license to GPL as suggested by checkpatch.
>
> Note, checkpatch.pl is not false positives free. Be careful
> with what it suggests.
>
> > +#include 
> > +#include 
>
> > +#include 
>
> Do you know why this header is included?

I think I had put it in the original before a lot of the work you did
to split things out of kernel.h.
I haven't had time to look apply this patch series locally yet, but
I'd be pretty sure we can remove it without anything breaking.

Thanks,
Daniel


[PATCH v5 6/8] drm/xe: Cache data about user-visible engines

2024-05-17 Thread Lucas De Marchi
gt->info.engine_mask used to indicate the available engines, but that
is not always true anymore: some engines are reserved to kernel and some
may be exposed as a single engine (e.g. with ccs_mode).

Runtime changes only happen when no clients exist, so it's safe to cache
the list of engines in the gt and update that when it's needed. This
will help implementing per client engine utilization so this (mostly
constant) information doesn't need to be re-calculated on every query.

Reviewed-by: Jonathan Cavitt 
Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_gt.c  | 23 +++
 drivers/gpu/drm/xe/xe_gt.h  | 13 +
 drivers/gpu/drm/xe/xe_gt_ccs_mode.c |  1 +
 drivers/gpu/drm/xe/xe_gt_types.h| 21 -
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index e69a03ddd255..5194a3d38e76 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -560,9 +560,32 @@ int xe_gt_init(struct xe_gt *gt)
if (err)
return err;
 
+   xe_gt_record_user_engines(gt);
+
return drmm_add_action_or_reset(_to_xe(gt)->drm, gt_fini, gt);
 }
 
+void xe_gt_record_user_engines(struct xe_gt *gt)
+{
+   struct xe_hw_engine *hwe;
+   enum xe_hw_engine_id id;
+
+   gt->user_engines.mask = 0;
+   memset(gt->user_engines.instances_per_class, 0,
+  sizeof(gt->user_engines.instances_per_class));
+
+   for_each_hw_engine(hwe, gt, id) {
+   if (xe_hw_engine_is_reserved(hwe))
+   continue;
+
+   gt->user_engines.mask |= BIT_ULL(id);
+   gt->user_engines.instances_per_class[hwe->class]++;
+   }
+
+   xe_gt_assert(gt, (gt->user_engines.mask | gt->info.engine_mask)
+== gt->info.engine_mask);
+}
+
 static int do_gt_reset(struct xe_gt *gt)
 {
int err;
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index 8474c50b1b30..1d010bf4a756 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -38,6 +38,19 @@ int xe_gt_init_hwconfig(struct xe_gt *gt);
 int xe_gt_init_early(struct xe_gt *gt);
 int xe_gt_init(struct xe_gt *gt);
 int xe_gt_record_default_lrcs(struct xe_gt *gt);
+
+/**
+ * xe_gt_record_user_engines - save data related to engines available to
+ * usersapce
+ * @gt: GT structure
+ *
+ * Walk the available HW engines from gt->info.engine_mask and calculate data
+ * related to those engines that may be used by userspace. To be used whenever
+ * available engines change in runtime (e.g. with ccs_mode) or during
+ * initialization
+ */
+void xe_gt_record_user_engines(struct xe_gt *gt);
+
 void xe_gt_suspend_prepare(struct xe_gt *gt);
 int xe_gt_suspend(struct xe_gt *gt);
 int xe_gt_resume(struct xe_gt *gt);
diff --git a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c 
b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
index a34c9a24dafc..c36218f4f6c8 100644
--- a/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
+++ b/drivers/gpu/drm/xe/xe_gt_ccs_mode.c
@@ -134,6 +134,7 @@ ccs_mode_store(struct device *kdev, struct device_attribute 
*attr,
if (gt->ccs_mode != num_engines) {
xe_gt_info(gt, "Setting compute mode to %d\n", num_engines);
gt->ccs_mode = num_engines;
+   xe_gt_record_user_engines(gt);
xe_gt_reset_async(gt);
}
 
diff --git a/drivers/gpu/drm/xe/xe_gt_types.h b/drivers/gpu/drm/xe/xe_gt_types.h
index 475fb58882f1..10a9a9529377 100644
--- a/drivers/gpu/drm/xe/xe_gt_types.h
+++ b/drivers/gpu/drm/xe/xe_gt_types.h
@@ -113,7 +113,11 @@ struct xe_gt {
enum xe_gt_type type;
/** @info.reference_clock: clock frequency */
u32 reference_clock;
-   /** @info.engine_mask: mask of engines present on GT */
+   /**
+* @info.engine_mask: mask of engines present on GT. Some of
+* them may be reserved in runtime and not available for user.
+* See @user_engines.mask
+*/
u64 engine_mask;
/** @info.gmdid: raw GMD_ID value from hardware */
u32 gmdid;
@@ -368,6 +372,21 @@ struct xe_gt {
/** @wa_active.oob: bitmap with active OOB workaroudns */
unsigned long *oob;
} wa_active;
+
+   /** @user_engines: engines present in GT and available to userspace */
+   struct {
+   /**
+* @user_engines.mask: like @info->engine_mask, but take in
+* consideration only engines available to userspace
+*/
+   u64 mask;
+
+   /**
+* @user_engines.instances_per_class: aggregate per class the
+* number of engines available to userspace
+*/
+   u8 instances_per_class[XE_ENGINE_CLASS_MAX];
+   } 

[PATCH v5 4/8] drm/xe: Add helper to capture engine timestamp

2024-05-17 Thread Lucas De Marchi
Just like CTX_TIMESTAMP is used to calculate runtime, add a helper to
get the timestamp for the engine so it can be used to calculate the
"engine time" with the same unit as the runtime is recorded.

Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 5 +
 drivers/gpu/drm/xe/xe_hw_engine.h | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index 942fca8f1eb9..de1aefaa2335 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1121,3 +1121,8 @@ const char *xe_hw_engine_class_to_str(enum 
xe_engine_class class)
 
return NULL;
 }
+
+u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe)
+{
+   return xe_mmio_read64_2x32(hwe->gt, RING_TIMESTAMP(hwe->mmio_base));
+}
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h 
b/drivers/gpu/drm/xe/xe_hw_engine.h
index 843de159e47c..7f2d27c0ba1a 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine.h
@@ -68,5 +68,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine 
*hwe)
 }
 
 const char *xe_hw_engine_class_to_str(enum xe_engine_class class);
+u64 xe_hw_engine_read_timestamp(struct xe_hw_engine *hwe);
 
 #endif
-- 
2.43.0



[PATCH v5 7/8] drm/xe: Add helper to return any available hw engine

2024-05-17 Thread Lucas De Marchi
Get the first available engine from a gt, which helps in the case any
engine serves as a context, like when reading RING_TIMESTAMP.

Reviewed-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_gt.c | 11 +++
 drivers/gpu/drm/xe/xe_gt.h |  7 +++
 2 files changed, 18 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 5194a3d38e76..3432fef56486 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -833,3 +833,14 @@ struct xe_hw_engine 
*xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt,
 
return NULL;
 }
+
+struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt)
+{
+   struct xe_hw_engine *hwe;
+   enum xe_hw_engine_id id;
+
+   for_each_hw_engine(hwe, gt, id)
+   return hwe;
+
+   return NULL;
+}
diff --git a/drivers/gpu/drm/xe/xe_gt.h b/drivers/gpu/drm/xe/xe_gt.h
index 1d010bf4a756..9073ac68a777 100644
--- a/drivers/gpu/drm/xe/xe_gt.h
+++ b/drivers/gpu/drm/xe/xe_gt.h
@@ -67,6 +67,13 @@ void xe_gt_remove(struct xe_gt *gt);
 struct xe_hw_engine *
 xe_gt_any_hw_engine_by_reset_domain(struct xe_gt *gt, enum xe_engine_class 
class);
 
+/**
+ * xe_gt_any_hw_engine - scan the list of engines and return the
+ * first available
+ * @gt: GT structure
+ */
+struct xe_hw_engine *xe_gt_any_hw_engine(struct xe_gt *gt);
+
 struct xe_hw_engine *xe_gt_hw_engine(struct xe_gt *gt,
 enum xe_engine_class class,
 u16 instance,
-- 
2.43.0



[PATCH v5 5/8] drm/xe: Add helper to accumulate exec queue runtime

2024-05-17 Thread Lucas De Marchi
From: Umesh Nerlige Ramappa 

Add a helper to accumulate per-client runtime of all its
exec queues. This is called every time a sched job is finished.

v2:
  - Use guc_exec_queue_free_job() and execlist_job_free() to accumulate
runtime when job is finished since xe_sched_job_completed() is not a
notification that job finished.
  - Stop trying to update runtime from xe_exec_queue_fini() - that is
redundant and may happen after xef is closed, leading to a
use-after-free
  - Do not special case the first timestamp read: the default LRC sets
CTX_TIMESTAMP to zero, so even the first sample should be a valid
one.
  - Handle the parallel submission case by multiplying the runtime by
width.
v3: Update comments

Signed-off-by: Umesh Nerlige Ramappa 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_device_types.h |  3 +++
 drivers/gpu/drm/xe/xe_exec_queue.c   | 37 
 drivers/gpu/drm/xe/xe_exec_queue.h   |  1 +
 drivers/gpu/drm/xe/xe_execlist.c |  1 +
 drivers/gpu/drm/xe/xe_guc_submit.c   |  2 ++
 5 files changed, 44 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
b/drivers/gpu/drm/xe/xe_device_types.h
index 5c5e36de452a..bc97990fd032 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -555,6 +555,9 @@ struct xe_file {
struct mutex lock;
} exec_queue;
 
+   /** @runtime: hw engine class runtime in ticks for this drm client */
+   u64 runtime[XE_ENGINE_CLASS_MAX];
+
/** @client: drm client */
struct xe_drm_client *client;
 };
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c 
b/drivers/gpu/drm/xe/xe_exec_queue.c
index 395de93579fa..fa6dc996eca8 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -769,6 +769,43 @@ bool xe_exec_queue_is_idle(struct xe_exec_queue *q)
q->lrc[0].fence_ctx.next_seqno - 1;
 }
 
+/**
+ * xe_exec_queue_update_runtime() - Update runtime for this exec queue from hw
+ * @q: The exec queue
+ *
+ * Update the timestamp saved by HW for this exec queue and save runtime
+ * calculated by using the delta from last update. On multi-lrc case, only the
+ * first is considered.
+ */
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q)
+{
+   struct xe_file *xef;
+   struct xe_lrc *lrc;
+   u32 old_ts, new_ts;
+
+   /*
+* Jobs that are run during driver load may use an exec_queue, but are
+* not associated with a user xe file, so avoid accumulating busyness
+* for kernel specific work.
+*/
+   if (!q->vm || !q->vm->xef)
+   return;
+
+   xef = q->vm->xef;
+
+   /*
+* Only sample the first LRC. For parallel submission, all of them are
+* scheduled together and we compensate that below by multiplying by
+* width - this may introduce errors if that premise is not true and
+* they don't exit 100% aligned. On the other hand, looping through
+* the LRCs and reading them in different time could also introduce
+* errors.
+*/
+   lrc = >lrc[0];
+   new_ts = xe_lrc_update_timestamp(lrc, _ts);
+   xef->runtime[q->class] += (new_ts - old_ts) * q->width;
+}
+
 void xe_exec_queue_kill(struct xe_exec_queue *q)
 {
struct xe_exec_queue *eq = q, *next;
diff --git a/drivers/gpu/drm/xe/xe_exec_queue.h 
b/drivers/gpu/drm/xe/xe_exec_queue.h
index 48f6da53a292..e0f07d28ee1a 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.h
+++ b/drivers/gpu/drm/xe/xe_exec_queue.h
@@ -75,5 +75,6 @@ struct dma_fence *xe_exec_queue_last_fence_get(struct 
xe_exec_queue *e,
   struct xe_vm *vm);
 void xe_exec_queue_last_fence_set(struct xe_exec_queue *e, struct xe_vm *vm,
  struct dma_fence *fence);
+void xe_exec_queue_update_runtime(struct xe_exec_queue *q);
 
 #endif
diff --git a/drivers/gpu/drm/xe/xe_execlist.c b/drivers/gpu/drm/xe/xe_execlist.c
index e9dee1e14fef..bd7f27efe0e0 100644
--- a/drivers/gpu/drm/xe/xe_execlist.c
+++ b/drivers/gpu/drm/xe/xe_execlist.c
@@ -306,6 +306,7 @@ static void execlist_job_free(struct drm_sched_job *drm_job)
 {
struct xe_sched_job *job = to_xe_sched_job(drm_job);
 
+   xe_exec_queue_update_runtime(job->q);
xe_sched_job_put(job);
 }
 
diff --git a/drivers/gpu/drm/xe/xe_guc_submit.c 
b/drivers/gpu/drm/xe/xe_guc_submit.c
index 4efb88e3e056..ad2b8067d071 100644
--- a/drivers/gpu/drm/xe/xe_guc_submit.c
+++ b/drivers/gpu/drm/xe/xe_guc_submit.c
@@ -749,6 +749,8 @@ static void guc_exec_queue_free_job(struct drm_sched_job 
*drm_job)
 {
struct xe_sched_job *job = to_xe_sched_job(drm_job);
 
+   xe_exec_queue_update_runtime(job->q);
+
trace_xe_sched_job_free(job);
xe_sched_job_put(job);
 }
-- 
2.43.0



[PATCH v5 3/8] drm/xe/lrc: Add helper to capture context timestamp

2024-05-17 Thread Lucas De Marchi
From: Umesh Nerlige Ramappa 

Add a helper to capture CTX_TIMESTAMP from the context image so it can
be used to calculate the runtime.

v2: Add kernel-doc to clarify expectation from caller

Signed-off-by: Umesh Nerlige Ramappa 
Reviewed-by: Francois Dugast 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h |  1 +
 drivers/gpu/drm/xe/xe_lrc.c | 12 
 drivers/gpu/drm/xe/xe_lrc.h | 14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h   |  3 +++
 4 files changed, 30 insertions(+)

diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h 
b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
index e6ca8bbda8f4..045dfd09db99 100644
--- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
+++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
@@ -11,6 +11,7 @@
 #define CTX_RING_TAIL  (0x06 + 1)
 #define CTX_RING_START (0x08 + 1)
 #define CTX_RING_CTL   (0x0a + 1)
+#define CTX_TIMESTAMP  (0x22 + 1)
 #define CTX_INDIRECT_RING_STATE(0x26 + 1)
 #define CTX_PDP0_UDW   (0x30 + 1)
 #define CTX_PDP0_LDW   (0x32 + 1)
diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
index 9b0a4078add3..f679cb9aaea7 100644
--- a/drivers/gpu/drm/xe/xe_lrc.c
+++ b/drivers/gpu/drm/xe/xe_lrc.c
@@ -844,6 +844,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine 
*hwe,
lrc->tile = gt_to_tile(hwe->gt);
lrc->ring.size = ring_size;
lrc->ring.tail = 0;
+   lrc->ctx_timestamp = 0;
 
xe_hw_fence_ctx_init(>fence_ctx, hwe->gt,
 hwe->fence_irq, hwe->name);
@@ -898,6 +899,8 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine 
*hwe,
 RING_CTL_SIZE(lrc->ring.size) | 
RING_VALID);
}
 
+   xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP, 0);
+
if (xe->info.has_asid && vm)
xe_lrc_write_ctx_reg(lrc, PVC_CTX_ASID, vm->usm.asid);
 
@@ -1576,3 +1579,12 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot 
*snapshot)
xe_bo_put(snapshot->lrc_bo);
kfree(snapshot);
 }
+
+u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts)
+{
+   *old_ts = lrc->ctx_timestamp;
+
+   lrc->ctx_timestamp = xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP);
+
+   return lrc->ctx_timestamp;
+}
diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
index e0e841963c23..b9da1031083b 100644
--- a/drivers/gpu/drm/xe/xe_lrc.h
+++ b/drivers/gpu/drm/xe/xe_lrc.h
@@ -66,4 +66,18 @@ void xe_lrc_snapshot_capture_delayed(struct xe_lrc_snapshot 
*snapshot);
 void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct 
drm_printer *p);
 void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot);
 
+/**
+ * xe_lrc_update_timestamp - readout LRC timestamp and update cached value
+ * @lrc: logical ring context for this exec queue
+ * @old_ts: pointer where to save the previous timestamp
+ *
+ * Read the current timestamp for this LRC and update the cached value. The
+ * previous cached value is also returned in @old_ts so the caller can 
calculate
+ * the delta between 2 updates. Note that this is not intended to be called 
from
+ * any place, but just by the paths updating the drm client utilization.
+ *
+ * Returns the current LRC timestamp
+ */
+u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h 
b/drivers/gpu/drm/xe/xe_lrc_types.h
index cdbf03faef15..0fa055da6b27 100644
--- a/drivers/gpu/drm/xe/xe_lrc_types.h
+++ b/drivers/gpu/drm/xe/xe_lrc_types.h
@@ -45,6 +45,9 @@ struct xe_lrc {
 
/** @fence_ctx: context for hw fence */
struct xe_hw_fence_ctx fence_ctx;
+
+   /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
+   u32 ctx_timestamp;
 };
 
 struct xe_lrc_snapshot;
-- 
2.43.0



[PATCH v5 2/8] drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion

2024-05-17 Thread Lucas De Marchi
XE_ENGINE_CLASS_OTHER was missing from the str conversion. Add it and
remove the default handling so it's protected by -Wswitch.
Currently the only user is xe_hw_engine_class_sysfs_init(), which
already skips XE_ENGINE_CLASS_OTHER, so there's no change in behavior.

Reviewed-by: Nirmoy Das 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index b71e90c555fa..942fca8f1eb9 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -,9 +,13 @@ const char *xe_hw_engine_class_to_str(enum 
xe_engine_class class)
return "vecs";
case XE_ENGINE_CLASS_COPY:
return "bcs";
+   case XE_ENGINE_CLASS_OTHER:
+   return "other";
case XE_ENGINE_CLASS_COMPUTE:
return "ccs";
-   default:
-   return NULL;
+   case XE_ENGINE_CLASS_MAX:
+   break;
}
+
+   return NULL;
 }
-- 
2.43.0



[PATCH v5 1/8] drm/xe: Promote xe_hw_engine_class_to_str()

2024-05-17 Thread Lucas De Marchi
Move it out of the sysfs compilation unit so it can be re-used in other
places.

Reviewed-by: Nirmoy Das 
Reviewed-by: Oak Zeng 
Signed-off-by: Lucas De Marchi 
---
 drivers/gpu/drm/xe/xe_hw_engine.c | 18 ++
 drivers/gpu/drm/xe/xe_hw_engine.h |  2 ++
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c | 18 --
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_hw_engine.c 
b/drivers/gpu/drm/xe/xe_hw_engine.c
index e19af179af33..b71e90c555fa 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine.c
@@ -1099,3 +1099,21 @@ bool xe_hw_engine_is_reserved(struct xe_hw_engine *hwe)
return xe->info.has_usm && hwe->class == XE_ENGINE_CLASS_COPY &&
hwe->instance == gt->usm.reserved_bcs_instance;
 }
+
+const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
+{
+   switch (class) {
+   case XE_ENGINE_CLASS_RENDER:
+   return "rcs";
+   case XE_ENGINE_CLASS_VIDEO_DECODE:
+   return "vcs";
+   case XE_ENGINE_CLASS_VIDEO_ENHANCE:
+   return "vecs";
+   case XE_ENGINE_CLASS_COPY:
+   return "bcs";
+   case XE_ENGINE_CLASS_COMPUTE:
+   return "ccs";
+   default:
+   return NULL;
+   }
+}
diff --git a/drivers/gpu/drm/xe/xe_hw_engine.h 
b/drivers/gpu/drm/xe/xe_hw_engine.h
index 71968ee2f600..843de159e47c 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine.h
+++ b/drivers/gpu/drm/xe/xe_hw_engine.h
@@ -67,4 +67,6 @@ static inline bool xe_hw_engine_is_valid(struct xe_hw_engine 
*hwe)
return hwe->name;
 }
 
+const char *xe_hw_engine_class_to_str(enum xe_engine_class class);
+
 #endif
diff --git a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c 
b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
index 844ec68cbbb8..efce6c7dd2a2 100644
--- a/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c
@@ -618,24 +618,6 @@ static void hw_engine_class_sysfs_fini(struct drm_device 
*drm, void *arg)
kobject_put(kobj);
 }
 
-static const char *xe_hw_engine_class_to_str(enum xe_engine_class class)
-{
-   switch (class) {
-   case XE_ENGINE_CLASS_RENDER:
-   return "rcs";
-   case XE_ENGINE_CLASS_VIDEO_DECODE:
-   return "vcs";
-   case XE_ENGINE_CLASS_VIDEO_ENHANCE:
-   return "vecs";
-   case XE_ENGINE_CLASS_COPY:
-   return "bcs";
-   case XE_ENGINE_CLASS_COMPUTE:
-   return "ccs";
-   default:
-   return NULL;
-   }
-}
-
 /**
  * xe_hw_engine_class_sysfs_init - Init HW engine classes on GT.
  * @gt: Xe GT.
-- 
2.43.0



[PATCH v5 0/8] drm/xe: Per client usage

2024-05-17 Thread Lucas De Marchi
v5 of of 
https://lore.kernel.org/all/20240515214258.59209-1-lucas.demar...@intel.com

Add per-client usage statistics to xe. This ports xe to use the common
method in drm to export the usage to userspace per client (where 1
client == 1 drm fd open).

However instead of using the current format measured in nsec, this
creates a new one. The intention here is not to mix the GPU clock domain
with the CPU clock. It allows to cover a few more use cases without
extra complications.

This version is tested on an ADL-P and also checked gputop with i915 to
make sure not regressed. Last patch also contains the documentation for
the new key and sample output as requested in v1.

The pre-existent drm-cycles- is used as is, which allows gputop
to work with xe.

This last patch still has some open discussion from v2, so we may need
to hold it a little more.

v2:
  - Create a new drm-total-cycles instead of re-using drm-engine with a
different unit
  - Add documentation for the new interface and clarify usage of
xe_lrc_update_timestamp()

v3:
  - Fix bugs in "drm/xe: Add helper to accumulate exec queue runtime" -
see commit message
  - Reorder commits so the ones that are useful in other patch series
come first

v4:
  - Fix some comments and documentation
  - Add 2 patches so we cache on the gt the mask of engines visible to
userspace and the per-class capacity. Previously we were doing this
during the query, but besides not being very efficient as pointed
by Tvrtko, we were also not handling correclty reserved engines and
engines "hidden" by e.g. ccs_mode.  So move that part to be executed
on init and when changing the available engines.
  - Simplify the fdinfo output loop since now we have the information
cached on gt. This also moves the read of the gpu timestamp out
of the loop as suggested by Tvrtko and using the helpers implemented
in the new patches.

v5:
  - Fix kernel-doc
  - Move pm_runtime_put() earlier in the function as it's not needed
anymore after interacting with the HW.

Lucas De Marchi (6):
  drm/xe: Promote xe_hw_engine_class_to_str()
  drm/xe: Add XE_ENGINE_CLASS_OTHER to str conversion
  drm/xe: Add helper to capture engine timestamp
  drm/xe: Cache data about user-visible engines
  drm/xe: Add helper to return any available hw engine
  drm/xe/client: Print runtime to fdinfo

Umesh Nerlige Ramappa (2):
  drm/xe/lrc: Add helper to capture context timestamp
  drm/xe: Add helper to accumulate exec queue runtime

 Documentation/gpu/drm-usage-stats.rst |  21 ++-
 Documentation/gpu/xe/index.rst|   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst   |  10 ++
 drivers/gpu/drm/xe/regs/xe_lrc_layout.h   |   1 +
 drivers/gpu/drm/xe/xe_device_types.h  |   3 +
 drivers/gpu/drm/xe/xe_drm_client.c| 121 +-
 drivers/gpu/drm/xe/xe_exec_queue.c|  37 ++
 drivers/gpu/drm/xe/xe_exec_queue.h|   1 +
 drivers/gpu/drm/xe/xe_execlist.c  |   1 +
 drivers/gpu/drm/xe/xe_gt.c|  34 +
 drivers/gpu/drm/xe/xe_gt.h|  20 +++
 drivers/gpu/drm/xe/xe_gt_ccs_mode.c   |   1 +
 drivers/gpu/drm/xe/xe_gt_types.h  |  21 ++-
 drivers/gpu/drm/xe/xe_guc_submit.c|   2 +
 drivers/gpu/drm/xe/xe_hw_engine.c |  27 
 drivers/gpu/drm/xe/xe_hw_engine.h |   3 +
 drivers/gpu/drm/xe/xe_hw_engine_class_sysfs.c |  18 ---
 drivers/gpu/drm/xe/xe_lrc.c   |  12 ++
 drivers/gpu/drm/xe/xe_lrc.h   |  14 ++
 drivers/gpu/drm/xe/xe_lrc_types.h |   3 +
 20 files changed, 329 insertions(+), 22 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

-- 
2.43.0



[PATCH v5 8/8] drm/xe/client: Print runtime to fdinfo

2024-05-17 Thread Lucas De Marchi
Print the accumulated runtime for client when printing fdinfo.
Each time a query is done it first does 2 things:

1) loop through all the exec queues for the current client and
   accumulate the runtime, per engine class. CTX_TIMESTAMP is used for
   that, being read from the context image.

2) Read a "GPU timestamp" that can be used for considering "how much GPU
   time has passed" and that has the same unit/refclock as the one
   recording the runtime. RING_TIMESTAMP is used for that via MMIO.

Since for all current platforms RING_TIMESTAMP follows the same
refclock, just read it once, using any first engine available.

This is exported to userspace as 2 numbers in fdinfo:

drm-cycles-: 
drm-total-cycles-: 

Userspace is expected to collect at least 2 samples, which allows to
know the client engine busyness as per:

RUNTIME1 - RUNTIME0
busyness = -
  T1 - T0

Since drm-cycles- always starts at 0, it's also possible to know
if and engine was ever used by a client.

It's expected that userspace will read any 2 samples every few seconds.
Given the update frequency of the counters involved and that
CTX_TIMESTAMP is 32-bits, the counter for each exec_queue can wrap
around (assuming 100% utilization) after ~200s. The wraparound is not
perceived by userspace since it's just accumulated for all the
exec_queues in a 64-bit counter) but the measurement will not be
accurate if the samples are too far apart.

This could be mitigated by adding a workqueue to accumulate the counters
every so often, but it's additional complexity for something that is
done already by userspace every few seconds in tools like gputop (from
igt), htop, nvtop, etc, with none of them really defaulting to 1 sample
per minute or more.

Reviewed-by: Umesh Nerlige Ramappa 
Acked-by: Tvrtko Ursulin 
Signed-off-by: Lucas De Marchi 
---
 Documentation/gpu/drm-usage-stats.rst   |  21 +++-
 Documentation/gpu/xe/index.rst  |   1 +
 Documentation/gpu/xe/xe-drm-usage-stats.rst |  10 ++
 drivers/gpu/drm/xe/xe_drm_client.c  | 121 +++-
 4 files changed, 150 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/gpu/xe/xe-drm-usage-stats.rst

diff --git a/Documentation/gpu/drm-usage-stats.rst 
b/Documentation/gpu/drm-usage-stats.rst
index 6dc299343b48..a80f95ca1b2f 100644
--- a/Documentation/gpu/drm-usage-stats.rst
+++ b/Documentation/gpu/drm-usage-stats.rst
@@ -112,6 +112,19 @@ larger value within a reasonable period. Upon observing a 
value lower than what
 was previously read, userspace is expected to stay with that larger previous
 value until a monotonic update is seen.
 
+- drm-total-cycles-: 
+
+Engine identifier string must be the same as the one specified in the
+drm-cycles- tag and shall contain the total number cycles for the given
+engine.
+
+This is a timestamp in GPU unspecified unit that matches the update rate
+of drm-cycles-. For drivers that implement this interface, the engine
+utilization can be calculated entirely on the GPU clock domain, without
+considering the CPU sleep time between 2 samples.
+
+A driver may implement either this key or drm-maxfreq-, but not both.
+
 - drm-maxfreq-:  [Hz|MHz|KHz]
 
 Engine identifier string must be the same as the one specified in the
@@ -121,6 +134,9 @@ percentage utilization of the engine, whereas 
drm-engine- only reflects
 time active without considering what frequency the engine is operating as a
 percentage of its maximum frequency.
 
+A driver may implement either this key or drm-total-cycles-, but not
+both.
+
 Memory
 ^^
 
@@ -168,5 +184,6 @@ be documented above and where possible, aligned with other 
drivers.
 Driver specific implementations
 ---
 
-:ref:`i915-usage-stats`
-:ref:`panfrost-usage-stats`
+* :ref:`i915-usage-stats`
+* :ref:`panfrost-usage-stats`
+* :ref:`xe-usage-stats`
diff --git a/Documentation/gpu/xe/index.rst b/Documentation/gpu/xe/index.rst
index c224ecaee81e..3f07aa3b5432 100644
--- a/Documentation/gpu/xe/index.rst
+++ b/Documentation/gpu/xe/index.rst
@@ -23,3 +23,4 @@ DG2, etc is provided to prototype the driver.
xe_firmware
xe_tile
xe_debugging
+   xe-drm-usage-stats.rst
diff --git a/Documentation/gpu/xe/xe-drm-usage-stats.rst 
b/Documentation/gpu/xe/xe-drm-usage-stats.rst
new file mode 100644
index ..482d503ae68a
--- /dev/null
+++ b/Documentation/gpu/xe/xe-drm-usage-stats.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+.. _xe-usage-stats:
+
+
+Xe DRM client usage stats implementation
+
+
+.. kernel-doc:: drivers/gpu/drm/xe/xe_drm_client.c
+   :doc: DRM Client usage stats
diff --git a/drivers/gpu/drm/xe/xe_drm_client.c 
b/drivers/gpu/drm/xe/xe_drm_client.c
index 08f0b7c95901..af404c9e5cc0 100644
--- a/drivers/gpu/drm/xe/xe_drm_client.c
+++ b/drivers/gpu/drm/xe/xe_drm_client.c
@@ -2,6 

Re: [PATCH v2] dt-bindings: display: synopsys,dw-hdmi: Document ddc-i2c-bus in core

2024-05-17 Thread Rob Herring
On Wed, May 15, 2024 at 04:47:05PM +0300, Laurent Pinchart wrote:
> Hi Marek,
> 
> Thank you for the patch.
> 
> On Wed, May 15, 2024 at 08:27:44AM +0200, Marek Vasut wrote:
> > The DW HDMI driver core is responsible for parsing the 'ddc-i2c-bus' 
> > property,
> > move the property description into the DW HDMI common DT schema too, so this
> > property can be used on all devices integrating the DW HDMI core.
> 
> De-duplicating documentation is good :-)

Generally, yes.

> I see no reason why this property should be disallowed on any of the
> platforms that integrate a DW HDMI (unless that platform has no other
> I2C controller, but I think we can ignore that in the bindings).

The main reason is Because this property should be in a connector node 
as the I2C bus is connected to the connector not the HDMI block.

I would suggest this gets marked 'deprecated'. Can be a separate patch.

Rob


Re: [PATCH] drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1

2024-05-17 Thread Alex Deucher
On Fri, May 17, 2024 at 1:27 PM Tim Van Patten  wrote:
>
> > Fair enough, but this is also the only gfx9 APU which defaults to
> > noretry=1, all of the rest are dGPUs.  I'd argue it should align with
> > the other GFX9 APUs or they should all enable noretry=1.
>
> Do you mean we should remove all IP_VERSION(9, X, X) entries from
> amdgpu_gmc_noretry_set(), leaving just >= IP_VERSION(10, 3, 0)?

No, just take your patch as is.  All of the other 9.x IP versions in
that check are dGPUs.  9.3.0 was the only APU in that list.

Alex


Re: [PATCH v8 07/10] lib: add basic KUnit test for lib/math

2024-05-17 Thread Andy Shevchenko
On Fri, May 17, 2024 at 11:06:07PM +0530, Devarsh Thakkar wrote:
> From: Daniel Latypov 
> 
> Add basic test coverage for files that don't require any config options:
> * part of math.h (what seem to be the most commonly used macros)
> * gcd.c
> * lcm.c
> * int_sqrt.c
> * reciprocal_div.c
> (Ignored int_pow.c since it's a simple textbook algorithm.)
> 
> These tests aren't particularly interesting, but they
> * provide short and simple examples of parameterized tests
> * provide a place to add tests for any new files in this dir
> * are written so adding new test cases to cover edge cases should be
>   easy
>   * looking at code coverage, we hit all the branches in the .c files

...

> [devarsht: Rebase to 6.9 and change license to GPL]

I'm not sure that you may change license. It needs the author's confirmation.

> ---
> Changes since v6:
> * Rebase to linux-next, change license to GPL as suggested by checkpatch.

Note, checkpatch.pl is not false positives free. Be careful
with what it suggests.

> +#include 
> +#include 

> +#include 

Do you know why this header is included?

> +#include 

+ math.h // obviously
+ module.h

> +#include 

+ types.h

...

Other than above, LGTM.

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 2/2] drm/i915: Don't treat FLR resets as errors

2024-05-17 Thread Nirmoy Das

Hi Andi,

On 5/17/2024 9:34 PM, Andi Shyti wrote:

Hi Nirmoy,

On Fri, May 17, 2024 at 04:00:02PM +0200, Nirmoy Das wrote:

On 5/17/2024 1:25 PM, Andi Shyti wrote:

If we timeout while waiting for an FLR reset, there is nothing we
can do and i915 doesn't have any control on it. In any case the
system is still perfectly usable

If a FLR reset fails then we will have a dead GPU, I don't think the GPU is
usable without a cold reboot.

fact is that the GPU keeps going and even though the timeout has
expired, the system moves to the next phase.
The current test might look like it is has passed, but if you look into 
the subsequent tests you can see a dead GPU:


<7>[  369.168121] pci :00:02.0: [drm:intel_uncore_fini_mmio [i915]] 
Triggering Driver-FLR
*<3>[ 372.170189] pci :00:02.0: [drm] *ERROR* Driver-FLR-teardown 
wait completion failed! -110*

*<7>[ 372.437630] [IGT] i915_selftest: finished subtest requests, SUCCESS*
<7>[  372.438356] [IGT] i915_selftest: starting dynamic subtest migrate
<5>[  373.110580] Setting dangerous option live_selftests - tainting kernel
<3>[  373.183499] i915 :00:02.0: Unable to change power state from D0 to 
D0, device inaccessible
<3>[  373.246921] i915 :00:02.0: [drm] *ERROR* Unrecognized display IP 
version 1023.255; disabling display.
<7>[  373.247130] i915 :00:02.0: [drm:intel_step_init [i915]] Using future 
steppings
<7>[  373.247716] i915 :00:02.0: [drm:intel_step_init [i915]] Using future 
steppings
<7>[  373.248263] i915 :00:02.0: [drm:intel_step_init [i915]] Using future 
display steppings
<7>[  373.251843] i915 :00:02.0: [drm:intel_gt_common_init_early [i915]] 
WOPCM: 2048K
<7>[  373.252505] i915 :00:02.0: [drm:intel_uc_init_early [i915]] GT0: 
enable_guc=3 (guc:yes submission:yes huc:no slpc:yes)
<7>[  373.253140] i915 :00:02.0: [drm:intel_gt_probe_all [i915]] GT0: 
Setting up Primary GT
<7>[  373.253556] i915 :00:02.0: [drm:intel_gt_probe_all [i915]] GT1: 
Setting up Standalone Media GT
<7>[  373.253941] i915 :00:02.0: [drm:intel_gt_common_init_early [i915]] 
WOPCM: 2048K
<7>[  373.254365] i915 :00:02.0: [drm:intel_uc_init_early [i915]] GT1: 
enable_guc=3 (guc:yes submission:yes huc:yes slpc:yes)
*<3>[ 375.256235] i915 :00:02.0: [drm] *ERROR* Device is 
non-operational; MMIO access returns 0x!*

<3>[  375.259089] i915 :00:02.0: Device initialization failed (-5)
<3>[  375.260521] i915 :00:02.0: probe with driver i915 failed with error -5
<7>[  375.392209] [IGT] i915_selftest: finished subtest migrate, FAIL

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_14724/bat-arls-3/dmesg0.txt




This is a serious issue and should be report as an error.  I think we need
to create a HW ticket to understand

why is FLR reset fails.

Maybe it takes longer and longer to reset. We've been sending
several patches in the latest years to fix the timings.


HW spec says 3 sec but we can try increasing it bit higher to try it out.


Regards,

Nirmoy



Andi

Re: [PATCH v3 0/2] Fix Kernel CI issues

2024-05-17 Thread Jakub Kicinski
On Fri, 3 May 2024 20:25:49 +0300 Tomi Valkeinen wrote:
> > I think the second patch also needs to go to drm-misc-next-fixes? The
> > clang warning fixed by it has returned in next-20240503 because it
> > appears that for-linux-next was switch from drm-misc-next to
> > drm-misc-next-fixes, as I see for-linux-next was pointing to commit
> > 235e60653f8d ("drm/debugfs: Drop conditionals around of_node pointers")
> > on drm-misc-next in next-20240502 but it is now pointing to commit
> > be3f3042391d ("drm: zynqmp_dpsub: Always register bridge") on
> > drm-misc-next-fixes in next-20240503.  
> 
> Oh. Hmm, did I just hit the feature-freeze point with the fixes...
> 
> Now I'm unsure where I should push these (if anywhere), as they already 
> are in drm-misc-next.
> 
> DRM Misc maintainers, can you give me a hint? =)

This is now breaking allmodconfig build of Linus's tree.
Could you please get it fixed ASAP?


[PATCH] drm/msm/adreno: Check for zap node availability

2024-05-17 Thread Rob Clark
From: Rob Clark 

This should allow disabling the zap node via an overlay, for slbounce.

Suggested-by: Nikita Travkin 
Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index d9ea15994ae9..a00241e3373b 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -46,7 +46,7 @@ static int zap_shader_load_mdt(struct msm_gpu *gpu, const 
char *fwname,
}
 
np = of_get_child_by_name(dev->of_node, "zap-shader");
-   if (!np) {
+   if (!np || !of_device_is_available(np)) {
zap_available = false;
return -ENODEV;
}
-- 
2.45.1



Re: [PATCH v2 0/7] drm/mipi-dsi: simplify MIPI DSI init/cleanup even more

2024-05-17 Thread Neil Armstrong
Hi,

On Sun, 12 May 2024 02:00:17 +0300, Dmitry Baryshkov wrote:
> Follow the example of mipi_dsi_generic_write_multi(),
> mipi_dsi_dcs_write_buffer_multi(), mipi_dsi_generic_write_seq_multi()
> and mipi_dsi_dcs_write_seq_multi(). Define _multi variants for several
> other common MIPI DSI functions and use these functions in the panel
> code.
> 
> This series also includes a fix for the LG SW43408. If the proposed
> approach is declined, the fix will be submitted separately.
> 
> [...]

Thanks, Applied to https://gitlab.freedesktop.org/drm/misc/kernel.git 
(drm-misc-next)

[1/7] drm/panel: lg-sw43408: add missing error handling
  
https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/51f9183e4af8c7f00e81180cbb9ee4a98a0f0aa1
[2/7] drm/mipi-dsi: wrap more functions for streamline handling
  
https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/f79d6d28d8fe77b14beeaebe5393d9f294f8d09d
[3/7] drm/panel: boe-tv101wum-nl6: use wrapped MIPI DCS functions
  
https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/91329f921283b995ac125a0c6e61be0c1399f66f
[4/7] drm/panel: ilitek-ili9882t: use wrapped MIPI DCS functions
  
https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/510ba36e86eeb3ca89326dd51da32806e1ede693
[5/7] drm/panel: innolux-p079zca: use mipi_dsi_dcs_nop_multi()
  
https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/0f43988fb9c1c0a0c2f5ccf2d1bdb914f6e4e79b
[6/7] drm/panel: novatek-nt36672e: use wrapped MIPI DCS functions
  
https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/67ba7a82d99a8a8b4bcc1b8124b5640c63dd51bf
[7/7] drm/panel: lg-sw43408: use new streamlined MIPI DSI API
  
https://gitlab.freedesktop.org/drm/misc/kernel/-/commit/85cb9d603953d77de5cb311d229a79c439ff6bfb

-- 
Neil



Re: [PATCH 2/2] drm/i915: Don't treat FLR resets as errors

2024-05-17 Thread Andi Shyti
Hi Nirmoy,

On Fri, May 17, 2024 at 04:00:02PM +0200, Nirmoy Das wrote:
> On 5/17/2024 1:25 PM, Andi Shyti wrote:
> > If we timeout while waiting for an FLR reset, there is nothing we
> > can do and i915 doesn't have any control on it. In any case the
> > system is still perfectly usable
> 
> If a FLR reset fails then we will have a dead GPU, I don't think the GPU is
> usable without a cold reboot.

fact is that the GPU keeps going and even though the timeout has
expired, the system moves to the next phase.

> This is a serious issue and should be report as an error.  I think we need
> to create a HW ticket to understand
> 
> why is FLR reset fails.

Maybe it takes longer and longer to reset. We've been sending
several patches in the latest years to fix the timings.

Andi


Re: [PATCH v2 2/7] drm/mipi-dsi: wrap more functions for streamline handling

2024-05-17 Thread Neil Armstrong

On 12/05/2024 01:00, Dmitry Baryshkov wrote:

Follow the pattern of mipi_dsi_dcs_*_multi() and wrap several existing
MIPI DSI functions to use the context for processing. This simplifies
and streamlines driver code to use simpler code pattern.

Note, msleep function is also wrapped in this way as it is frequently
called inbetween other mipi_dsi_dcs_*() functions.

Signed-off-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/drm_mipi_dsi.c | 210 +
  include/drm/drm_mipi_dsi.h |  21 +
  2 files changed, 231 insertions(+)

diff --git a/drivers/gpu/drm/drm_mipi_dsi.c b/drivers/gpu/drm/drm_mipi_dsi.c
index d2957cb692d3..8721edd06c06 100644
--- a/drivers/gpu/drm/drm_mipi_dsi.c
+++ b/drivers/gpu/drm/drm_mipi_dsi.c
@@ -1429,6 +1429,216 @@ int mipi_dsi_dcs_get_display_brightness_large(struct 
mipi_dsi_device *dsi,
  }
  EXPORT_SYMBOL(mipi_dsi_dcs_get_display_brightness_large);
  
+/**

+ * mipi_dsi_picture_parameter_set_multi() - transmit the DSC PPS to the 
peripheral
+ * @ctx: Context for multiple DSI transactions
+ * @pps: VESA DSC 1.1 Picture Parameter Set
+ *
+ * Like mipi_dsi_picture_parameter_set() but deals with errors in a way that
+ * makes it convenient to make several calls in a row.
+ */
+void mipi_dsi_picture_parameter_set_multi(struct mipi_dsi_multi_context *ctx,
+  const struct drm_dsc_picture_parameter_set 
*pps)
+{
+   struct mipi_dsi_device *dsi = ctx->dsi;
+   struct device *dev = >dev;
+   ssize_t ret;
+
+   if (ctx->accum_err)
+   return;
+
+   ret = mipi_dsi_picture_parameter_set(dsi, pps);
+   if (ret < 0) {
+   ctx->accum_err = ret;
+   dev_err(dev, "sending PPS failed: %d\n",
+   ctx->accum_err);
+   }
+}
+EXPORT_SYMBOL(mipi_dsi_picture_parameter_set_multi);
+
+/**
+ * mipi_dsi_compression_mode_ext_multi() - enable/disable DSC on the peripheral
+ * @ctx: Context for multiple DSI transactions
+ * @enable: Whether to enable or disable the DSC
+ * @algo: Selected compression algorithm
+ * @pps_selector: Select PPS from the table of pre-stored or uploaded PPS 
entries
+ *
+ * Like mipi_dsi_compression_mode_ext() but deals with errors in a way that
+ * makes it convenient to make several calls in a row.
+ */
+void mipi_dsi_compression_mode_ext_multi(struct mipi_dsi_multi_context *ctx,
+bool enable,
+enum mipi_dsi_compression_algo algo,
+unsigned int pps_selector)
+{
+   struct mipi_dsi_device *dsi = ctx->dsi;
+   struct device *dev = >dev;
+   ssize_t ret;
+
+   if (ctx->accum_err)
+   return;
+
+   ret = mipi_dsi_compression_mode_ext(dsi, enable, algo, pps_selector);
+   if (ret < 0) {
+   ctx->accum_err = ret;
+   dev_err(dev, "sending COMPRESSION_MODE failed: %d\n",
+   ctx->accum_err);
+   }
+}
+EXPORT_SYMBOL(mipi_dsi_compression_mode_ext_multi);
+
+/**
+ * mipi_dsi_dcs_nop_multi() - send DCS NOP packet
+ * @ctx: Context for multiple DSI transactions
+ *
+ * Like mipi_dsi_dcs_nop() but deals with errors in a way that
+ * makes it convenient to make several calls in a row.
+ */
+void mipi_dsi_dcs_nop_multi(struct mipi_dsi_multi_context *ctx)
+{
+   struct mipi_dsi_device *dsi = ctx->dsi;
+   struct device *dev = >dev;
+   ssize_t ret;
+
+   if (ctx->accum_err)
+   return;
+
+   ret = mipi_dsi_dcs_nop(dsi);
+   if (ret < 0) {
+   ctx->accum_err = ret;
+   dev_err(dev, "sending DCS NOP failed: %d\n",
+   ctx->accum_err);
+   }
+}
+EXPORT_SYMBOL(mipi_dsi_dcs_nop_multi);
+
+/**
+ * mipi_dsi_dcs_enter_sleep_mode_multi() - send DCS ENTER_SLEEP_MODE  packet
+ * @ctx: Context for multiple DSI transactions
+ *
+ * Like mipi_dsi_dcs_enter_sleep_mode() but deals with errors in a way that
+ * makes it convenient to make several calls in a row.
+ */
+void mipi_dsi_dcs_enter_sleep_mode_multi(struct mipi_dsi_multi_context *ctx)
+{
+   struct mipi_dsi_device *dsi = ctx->dsi;
+   struct device *dev = >dev;
+   ssize_t ret;
+
+   if (ctx->accum_err)
+   return;
+
+   ret = mipi_dsi_dcs_enter_sleep_mode(dsi);
+   if (ret < 0) {
+   ctx->accum_err = ret;
+   dev_err(dev, "sending DCS ENTER_SLEEP_MODE failed: %d\n",
+   ctx->accum_err);
+   }
+}
+EXPORT_SYMBOL(mipi_dsi_dcs_enter_sleep_mode_multi);
+
+/**
+ * mipi_dsi_dcs_exit_sleep_mode_multi() - send DCS EXIT_SLEEP_MODE packet
+ * @ctx: Context for multiple DSI transactions
+ *
+ * Like mipi_dsi_dcs_exit_sleep_mode() but deals with errors in a way that
+ * makes it convenient to make several calls in a row.
+ */
+void mipi_dsi_dcs_exit_sleep_mode_multi(struct mipi_dsi_multi_context *ctx)
+{
+   struct mipi_dsi_device *dsi = ctx->dsi;
+ 

[PATCH v2 4/4] drm/vmwgfx: Standardize use of kibibytes when logging

2024-05-17 Thread Ian Forbes
Use the same standard abbreviation KiB instead of incorrect variants.

Signed-off-by: Ian Forbes 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c   | 12 ++--
 drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c |  4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 2e1fb46bcaa3..e03b2e682507 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -745,7 +745,7 @@ static int vmw_setup_pci_resources(struct vmw_private *dev,
dev->vram_size = pci_resource_len(pdev, 2);
 
drm_info(>drm,
-   "Register MMIO at 0x%pa size is %llu kiB\n",
+   "Register MMIO at 0x%pa size is %llu KiB\n",
 _start, (uint64_t)rmmio_size / 1024);
dev->rmmio = devm_ioremap(dev->drm.dev,
  rmmio_start,
@@ -764,7 +764,7 @@ static int vmw_setup_pci_resources(struct vmw_private *dev,
fifo_size = pci_resource_len(pdev, 2);
 
drm_info(>drm,
-"FIFO at %pa size is %llu kiB\n",
+"FIFO at %pa size is %llu KiB\n",
 _start, (uint64_t)fifo_size / 1024);
dev->fifo_mem = devm_memremap(dev->drm.dev,
  fifo_start,
@@ -789,7 +789,7 @@ static int vmw_setup_pci_resources(struct vmw_private *dev,
 * SVGA_REG_VRAM_SIZE.
 */
drm_info(>drm,
-"VRAM at %pa size is %llu kiB\n",
+"VRAM at %pa size is %llu KiB\n",
 >vram_start, (uint64_t)dev->vram_size / 1024);
 
return 0;
@@ -983,13 +983,13 @@ static int vmw_driver_load(struct vmw_private *dev_priv, 
u32 pci_id)
dev_priv->max_primary_mem = dev_priv->vram_size;
}
drm_info(_priv->drm,
-"Legacy memory limits: VRAM = %llu kB, FIFO = %llu kB, surface 
= %u kB\n",
+"Legacy memory limits: VRAM = %llu KiB, FIFO = %llu KiB, 
surface = %u KiB\n",
 (u64)dev_priv->vram_size / 1024,
 (u64)dev_priv->fifo_mem_size / 1024,
 dev_priv->memory_size / 1024);
 
drm_info(_priv->drm,
-"MOB limits: max mob size = %u kB, max mob pages = %u\n",
+"MOB limits: max mob size = %u KiB, max mob pages = %u\n",
 dev_priv->max_mob_size / 1024, dev_priv->max_mob_pages);
 
ret = vmw_dma_masks(dev_priv);
@@ -1007,7 +1007,7 @@ static int vmw_driver_load(struct vmw_private *dev_priv, 
u32 pci_id)
 (unsigned)dev_priv->max_gmr_pages);
}
drm_info(_priv->drm,
-"Maximum display memory size is %llu kiB\n",
+"Maximum display memory size is %llu KiB\n",
 (uint64_t)dev_priv->max_primary_mem / 1024);
 
/* Need mmio memory to check for fifo pitchlock cap. */
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
index a0b47c9b33f5..5bd967fbcf55 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
@@ -94,14 +94,14 @@ static int vmw_gmrid_man_get_node(struct 
ttm_resource_manager *man,
} else
new_max_pages = gman->max_gmr_pages * 2;
if (new_max_pages > gman->max_gmr_pages && 
new_max_pages >= gman->used_gmr_pages) {
-   DRM_WARN("vmwgfx: increasing guest mob limits 
to %u kB.\n",
+   DRM_WARN("vmwgfx: increasing guest mob limits 
to %u KiB.\n",
 ((new_max_pages) << (PAGE_SHIFT - 
10)));
 
gman->max_gmr_pages = new_max_pages;
} else {
char buf[256];
snprintf(buf, sizeof(buf),
-"vmwgfx, error: guest graphics is out 
of memory (mob limit at: %ukB).\n",
+"vmwgfx, error: guest graphics is out 
of memory (mob limit at: %u KiB).\n",
 ((gman->max_gmr_pages) << (PAGE_SHIFT 
- 10)));
vmw_host_printf(buf);
DRM_WARN("%s", buf);
-- 
2.34.1



[PATCH v2 3/4] drm/vmwgfx: Remove STDU logic from generic mode_valid function

2024-05-17 Thread Ian Forbes
STDU has its own mode_valid function now so this logic can be removed from
the generic version.

Fixes: 935f795045a6 ("drm/vmwgfx: Refactor drm connector probing for display 
modes")

Signed-off-by: Ian Forbes 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h |  3 ---
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 26 +-
 2 files changed, 9 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index 4ecaea0026fc..a1ce41e1c468 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -1043,9 +1043,6 @@ void vmw_kms_cursor_snoop(struct vmw_surface *srf,
 int vmw_kms_write_svga(struct vmw_private *vmw_priv,
   unsigned width, unsigned height, unsigned pitch,
   unsigned bpp, unsigned depth);
-bool vmw_kms_validate_mode_vram(struct vmw_private *dev_priv,
-   uint32_t pitch,
-   uint32_t height);
 int vmw_kms_present(struct vmw_private *dev_priv,
struct drm_file *file_priv,
struct vmw_framebuffer *vfb,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index 13b2820cae51..9532258a0848 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -2171,13 +2171,12 @@ int vmw_kms_write_svga(struct vmw_private *vmw_priv,
return 0;
 }
 
+static
 bool vmw_kms_validate_mode_vram(struct vmw_private *dev_priv,
-   uint32_t pitch,
-   uint32_t height)
+   u64 pitch,
+   u64 height)
 {
-   return ((u64) pitch * (u64) height) < (u64)
-   ((dev_priv->active_display_unit == vmw_du_screen_target) ?
-dev_priv->max_primary_mem : dev_priv->vram_size);
+   return (pitch * height) < (u64)dev_priv->vram_size;
 }
 
 /**
@@ -2873,25 +2872,18 @@ int vmw_du_helper_plane_update(struct 
vmw_du_update_plane *update)
 enum drm_mode_status vmw_connector_mode_valid(struct drm_connector *connector,
  struct drm_display_mode *mode)
 {
+   enum drm_mode_status ret;
struct drm_device *dev = connector->dev;
struct vmw_private *dev_priv = vmw_priv(dev);
-   u32 max_width = dev_priv->texture_max_width;
-   u32 max_height = dev_priv->texture_max_height;
u32 assumed_cpp = 4;
 
if (dev_priv->assume_16bpp)
assumed_cpp = 2;
 
-   if (dev_priv->active_display_unit == vmw_du_screen_target) {
-   max_width  = min(dev_priv->stdu_max_width,  max_width);
-   max_height = min(dev_priv->stdu_max_height, max_height);
-   }
-
-   if (max_width < mode->hdisplay)
-   return MODE_BAD_HVALUE;
-
-   if (max_height < mode->vdisplay)
-   return MODE_BAD_VVALUE;
+   ret = drm_mode_validate_size(mode, dev_priv->texture_max_width,
+dev_priv->texture_max_height);
+   if (ret != MODE_OK)
+   return ret;
 
if (!vmw_kms_validate_mode_vram(dev_priv,
mode->hdisplay * assumed_cpp,
-- 
2.34.1



[PATCH v2 2/4] drm/vmwgfx: 3D disabled should not effect STDU memory limits

2024-05-17 Thread Ian Forbes
This limit became a hard cap starting with the change referenced below.
Surface creation on the device will fail if the requested size is larger
than this limit so altering the value arbitrarily will expose modes that
are too large for the device's hard limits.

Fixes: 7ebb47c9f9ab ("drm/vmwgfx: Read new register for GB memory when 
available")

Signed-off-by: Ian Forbes 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 4bf6da2b15fe..2e1fb46bcaa3 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -959,13 +959,6 @@ static int vmw_driver_load(struct vmw_private *dev_priv, 
u32 pci_id)
vmw_read(dev_priv,
 
SVGA_REG_SUGGESTED_GBOBJECT_MEM_SIZE_KB);
 
-   /*
-* Workaround for low memory 2D VMs to compensate for the
-* allocation taken by fbdev
-*/
-   if (!(dev_priv->capabilities & SVGA_CAP_3D))
-   mem_size *= 3;
-
dev_priv->max_mob_pages = mem_size * 1024 / PAGE_SIZE;
dev_priv->max_primary_mem =
vmw_read(dev_priv, SVGA_REG_MAX_PRIMARY_MEM);
-- 
2.34.1



[PATCH v2 1/4] drm/vmwgfx: Filter modes which exceed graphics memory

2024-05-17 Thread Ian Forbes
SVGA requires individual surfaces to fit within graphics memory
(max_mob_pages) which means that modes with a final buffer size that would
exceed graphics memory must be pruned otherwise creation will fail.

Additionally llvmpipe requires its buffer height and width to be a multiple
of its tile size which is 64. As a result we have to anticipate that
llvmpipe will round up the mode size passed to it by the compositor when
it creates buffers and filter modes where this rounding exceeds graphics
memory.

This fixes an issue where VMs with low graphics memory (< 64MiB) configured
with high resolution mode boot to a black screen because surface creation
fails.

Fixes: d947d1b71deb ("drm/vmwgfx: Add and connect connector helper function")
Signed-off-by: Ian Forbes 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c | 43 ++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index 2041c4d48daa..a2b5527a249d 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -43,7 +43,12 @@
 #define vmw_connector_to_stdu(x) \
container_of(x, struct vmw_screen_target_display_unit, base.connector)
 
-
+/**
+ * llvmpipe will align the width and height of its buffers to match its
+ * tile size. We need to keep this in mind when exposing modes to userspace
+ * so that this possible over-allocation will not exceed graphics memory.
+ */
+#define LLVM_PIPE_TILE_SIZE 64
 
 enum stdu_content_type {
SAME_AS_DISPLAY = 0,
@@ -829,7 +834,41 @@ static void vmw_stdu_connector_destroy(struct 
drm_connector *connector)
vmw_stdu_destroy(vmw_connector_to_stdu(connector));
 }
 
+static enum drm_mode_status
+vmw_stdu_connector_mode_valid(struct drm_connector *connector,
+ struct drm_display_mode *mode)
+{
+   enum drm_mode_status ret;
+   struct drm_device *dev = connector->dev;
+   struct vmw_private *dev_priv = vmw_priv(dev);
+   u64 assumed_cpp = dev_priv->assume_16bpp ? 2 : 4;
+   /* Align width and height to account for llvmpipe tile over-alignment */
+   u64 required_mem = ALIGN(mode->hdisplay, LLVM_PIPE_TILE_SIZE) *
+  ALIGN(mode->vdisplay, LLVM_PIPE_TILE_SIZE) *
+  assumed_cpp;
+   required_mem = ALIGN(required_mem, PAGE_SIZE);
+
+   ret = drm_mode_validate_size(mode, dev_priv->stdu_max_width,
+dev_priv->stdu_max_height);
+   if (ret != MODE_OK)
+   return ret;
 
+   ret = drm_mode_validate_size(mode, dev_priv->texture_max_width,
+dev_priv->texture_max_height);
+   if (ret != MODE_OK)
+   return ret;
+
+   if (required_mem > dev_priv->max_primary_mem)
+   return MODE_MEM;
+
+   if (required_mem > dev_priv->max_mob_pages * PAGE_SIZE)
+   return MODE_MEM;
+
+   if (required_mem > dev_priv->max_mob_size)
+   return MODE_MEM;
+
+   return MODE_OK;
+}
 
 static const struct drm_connector_funcs vmw_stdu_connector_funcs = {
.dpms = vmw_du_connector_dpms,
@@ -845,7 +884,7 @@ static const struct drm_connector_funcs 
vmw_stdu_connector_funcs = {
 static const struct
 drm_connector_helper_funcs vmw_stdu_connector_helper_funcs = {
.get_modes = vmw_connector_get_modes,
-   .mode_valid = vmw_connector_mode_valid
+   .mode_valid = vmw_stdu_connector_mode_valid
 };
 
 
-- 
2.34.1



[PATCH v2 0/4] Fix memory limits for STDU

2024-05-17 Thread Ian Forbes
Fixes a bug where modes that are too large for the device are exposed
and set causing a black screen on boot.

v2: Fixed llvmpipe over-alignment bug.

Ian Forbes (4):
  drm/vmwgfx: Filter modes which exceed graphics memory
  drm/vmwgfx: 3D disabled should not effect STDU memory limits
  drm/vmwgfx: Remove STDU logic from generic mode_valid function
  drm/vmwgfx: Standardize use of kibibytes when logging

 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c   | 19 +++-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.h   |  3 --
 drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c |  4 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c   | 26 ---
 drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c  | 43 ++-
 5 files changed, 58 insertions(+), 37 deletions(-)

-- 
2.34.1



Re: [PATCH v9] drm/bridge: it6505: This patch fixes hibernate to resume no display issue

2024-05-17 Thread Markus Elfring
Please omit the word combination “This patch” from the summary phrase.


…
> But the input FIFO reset will also trigger error interrupts of output
> module rising.Thus, it6505 have to wait a period can clear those
> expected error interrupts caused by manual hardware reset in one
> interrupt handler calling to avoid interrupt looping.

Please improve this change description another bit.


…
> ---
>  drivers/gpu/drm/bridge/ite-it6505.c | 73 +++--
…

You may present version descriptions behind the marker line.

Regards,
Markus


Re: [PATCH v3 0/2] drm: Fix dma_resv deadlock at drm object pin time

2024-05-17 Thread Adrián Larumbe
Hi Boris and Thomas,

On 02.05.2024 14:18, Thomas Zimmermann wrote:
> Hi
> 
> Am 02.05.24 um 14:00 schrieb Boris Brezillon:
> > On Thu, 2 May 2024 13:59:41 +0200
> > Boris Brezillon  wrote:
> > 
> > > Hi Thomas,
> > > 
> > > On Thu, 2 May 2024 13:51:16 +0200
> > > Thomas Zimmermann  wrote:
> > > 
> > > > Hi,
> > > > 
> > > > ignoring my r-b on patch 1, I'd like to rethink the current patches in
> > > > general.
> > > > 
> > > > I think drm_gem_shmem_pin() should become the locked version of _pin(),
> > > > so that drm_gem_shmem_object_pin() can call it directly. The existing
> > > > _pin_unlocked() would not be needed any longer. Same for the _unpin()
> > > > functions. This change would also fix the consistency with the semantics
> > > > of the shmem _vmap() functions, which never take reservation locks.
> > > > 
> > > > There are only two external callers of drm_gem_shmem_pin(): the test
> > > > case and panthor. These assume that drm_gem_shmem_pin() acquires the
> > > > reservation lock. The test case should likely call drm_gem_pin()
> > > > instead. That would acquire the reservation lock and the test would
> > > > validate that shmem's pin helper integrates well into the overall GEM
> > > > framework. The way panthor uses drm_gem_shmem_pin() looks wrong to me.
> > > > For now, it could receive a wrapper that takes the lock and that's it.
> > > I do agree that the current inconsistencies in the naming is
> > > troublesome (sometimes _unlocked, sometimes _locked, with the version
> > > without any suffix meaning either _locked or _unlocked depending on
> > > what the suffixed version does), and that's the very reason I asked
> > > Dmitry to address that in his shrinker series [1]. So, ideally I'd
> > > prefer if patches from Dmitry's series were applied instead of
> > > trying to fix that here (IIRC, we had an ack from Maxime).
> > With the link this time :-).
> > 
> > [1]https://lore.kernel.org/lkml/20240105184624.508603-1-dmitry.osipe...@collabora.com/T/
> 
> Thanks. I remember these patches. Somehow I thought they would have been
> merged already. I wasn't super happy about the naming changes in patch 5,
> because the names of the GEM object callbacks do no longer correspond with
> their implementations. But anyway.
> 
> If we go that direction, we should here simply push drm_gem_shmem_pin() and
> drm_gem_shmem_unpin() into panthor and update the shmem tests with
> drm_gem_pin(). Panfrost and lima would call drm_gem_shmem_pin_locked(). IMHO
> we should not promote the use of drm_gem_shmem_object_*() functions, as they
> are meant to be callbacks for struct drm_gem_object_funcs. (Auto-generating
> them would be nice.)

I'll be doing this in the next patch series iteration, casting the pin 
function's
drm object parameter to an shmem object.

Also for the sake of leaving things in a consistent state, and against Boris' 
advice,
I think I'll leave the drm WARN statement inside drm_gem_shmem_pin_locked. I 
guess
even though Dmitry's working on it, rebasing his work on top of this minor 
change
shouldn't be an issue.

Cheers,
Adrian Larumbe

> Best regards
> Thomas
> 
> 
> > 
> > > Regards,
> > > 
> > > Boris


Re: [PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-05-17 Thread Guenter Roeck

On 5/17/24 10:48, Steven Rostedt wrote:

On Fri, 17 May 2024 10:36:37 -0700
Guenter Roeck  wrote:


Building csky:allmodconfig (and others) ... failed
--
Error log:
In file included from include/trace/trace_events.h:419,
  from include/trace/define_trace.h:102,
  from drivers/cxl/core/trace.h:737,
  from drivers/cxl/core/trace.c:8:
drivers/cxl/core/./trace.h:383:1: error: macro "__assign_str" passed 2 
arguments, but takes just 1

This is with the patch applied on top of v6.9-8410-gff2632d7d08e.
So far that seems to be the only build failure.
Introduced with commit 6aec00139d3a8 ("cxl/core: Add region info to
cxl_general_media and cxl_dram events"). Guess we'll see more of those
towards the end of the commit window.


Looks like I made this patch just before this commit was pulled into
Linus's tree.

Which is why I'll apply and rerun the above again probably on Tuesday of
next week against Linus's latest.

This patch made it through both an allyesconfig and an allmodconfig, but on
the commit I had applied it to, which was:

   1b294a1f3561 ("Merge tag 'net-next-6.10' of 
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")

I'll be compiling those two builds after I update it then.



I am currently repeating my test builds with the above errors fixed.
That should take a couple of hours. I'll let you know how it goes.

Guenter



Re: [PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-05-17 Thread Steven Rostedt
On Fri, 17 May 2024 10:36:37 -0700
Guenter Roeck  wrote:

> Building csky:allmodconfig (and others) ... failed
> --
> Error log:
> In file included from include/trace/trace_events.h:419,
>  from include/trace/define_trace.h:102,
>  from drivers/cxl/core/trace.h:737,
>  from drivers/cxl/core/trace.c:8:
> drivers/cxl/core/./trace.h:383:1: error: macro "__assign_str" passed 2 
> arguments, but takes just 1
> 
> This is with the patch applied on top of v6.9-8410-gff2632d7d08e.
> So far that seems to be the only build failure.
> Introduced with commit 6aec00139d3a8 ("cxl/core: Add region info to
> cxl_general_media and cxl_dram events"). Guess we'll see more of those
> towards the end of the commit window.

Looks like I made this patch just before this commit was pulled into
Linus's tree.

Which is why I'll apply and rerun the above again probably on Tuesday of
next week against Linus's latest.

This patch made it through both an allyesconfig and an allmodconfig, but on
the commit I had applied it to, which was:

  1b294a1f3561 ("Merge tag 'net-next-6.10' of 
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next")

I'll be compiling those two builds after I update it then.

-- Steve


[PATCH v8 10/10] gpu: ipu-v3: Use generic macro for rounding closest to specified value

2024-05-17 Thread Devarsh Thakkar
Use generic macro round_closest_up() for rounding closest to specified
value instead of using local macro round_closest().

There is no change from functionality point of view as round_closest_up()
is functionally same as the previously used local macro round_closest().

Signed-off-by: Devarsh Thakkar 
---
V8: Update commit message
V1->V7 : (No change, patch introduced in V7)
---
 drivers/gpu/ipu-v3/ipu-image-convert.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/gpu/ipu-v3/ipu-image-convert.c 
b/drivers/gpu/ipu-v3/ipu-image-convert.c
index 841316582ea9..5192a8b5c02c 100644
--- a/drivers/gpu/ipu-v3/ipu-image-convert.c
+++ b/drivers/gpu/ipu-v3/ipu-image-convert.c
@@ -477,8 +477,6 @@ static int calc_image_resize_coefficients(struct 
ipu_image_convert_ctx *ctx,
return 0;
 }
 
-#define round_closest(x, y) round_down((x) + (y)/2, (y))
-
 /*
  * Find the best aligned seam position for the given column / row index.
  * Rotation and image offsets are out of scope.
@@ -565,7 +563,7 @@ static void find_best_seam(struct ipu_image_convert_ctx 
*ctx,
 * The closest input sample position that we could actually
 * start the input tile at, 19.13 fixed point.
 */
-   in_pos_aligned = round_closest(in_pos, 8192U * in_align);
+   in_pos_aligned = round_closest_up(in_pos, 8192U * in_align);
/* Convert 19.13 fixed point to integer */
in_pos_rounded = in_pos_aligned / 8192U;
 
-- 
2.39.1



[PATCH v8 09/10] media: imagination: Round to closest multiple for cropping region

2024-05-17 Thread Devarsh Thakkar
If neither of the flags to round down (V4L2_SEL_FLAG_LE) or round up
(V4L2_SEL_FLAG_GE) are specified by the user, then round to nearest
multiple of requested value while updating the crop rectangle coordinates.

Use the rounding macro which gives preference to rounding down in case two
nearest values (high and low) are possible to raise the probability of
cropping rectangle falling inside the bound region.

This complies with the VIDIOC_G_SELECTION, VIDIOC_S_SELECTION ioctl
description as documented in v4l uapi [1] which specifies that driver
should choose crop rectangle as close as possible if no flags are passed by
user-space, as quoted below :

"``0`` - The driver can adjust the rectangle size freely and shall choose a
crop/compose rectangle as close as possible to the requested
 one."

[1] :
https://www.kernel.org/doc/Documentation/userspace-api/media/v4l/vidioc-g-selection.rst

Signed-off-by: Devarsh Thakkar 
---
V8: Update commit message with specification reference
V1->V7 (No change, patch introduced in V7)
---
 drivers/media/platform/imagination/e5010-jpeg-enc.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/media/platform/imagination/e5010-jpeg-enc.c 
b/drivers/media/platform/imagination/e5010-jpeg-enc.c
index 189e2a99c43d..abd66bc9b96c 100644
--- a/drivers/media/platform/imagination/e5010-jpeg-enc.c
+++ b/drivers/media/platform/imagination/e5010-jpeg-enc.c
@@ -517,10 +517,10 @@ static int e5010_s_selection(struct file *file, void *fh, 
struct v4l2_selection
 
switch (s->flags) {
case 0:
-   s->r.width = round_down(s->r.width, 
queue->fmt->frmsize.step_width);
-   s->r.height = round_down(s->r.height, 
queue->fmt->frmsize.step_height);
-   s->r.left = round_down(s->r.left, 
queue->fmt->frmsize.step_width);
-   s->r.top = round_down(s->r.top, 2);
+   s->r.width = round_closest_down(s->r.width, 
queue->fmt->frmsize.step_width);
+   s->r.height = round_closest_down(s->r.height, 
queue->fmt->frmsize.step_height);
+   s->r.left = round_closest_down(s->r.left, 
queue->fmt->frmsize.step_width);
+   s->r.top = round_closest_down(s->r.top, 2);
 
if (s->r.left + s->r.width > queue->width)
s->r.width = round_down(s->r.width + s->r.left - 
queue->width,
-- 
2.39.1



[PATCH v8 08/10] lib: math_kunit: Add tests for new rounding related macros

2024-05-17 Thread Devarsh Thakkar
Add tests for round_closest_up/down and roundclosest macros which round
to nearest multiple of specified argument. These are tested with kunit
tool as shared here [1].

[1]: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876

Signed-off-by: Devarsh Thakkar 
---
V1->V8 (No change, patch introduced in V8)
---
 lib/math/math_kunit.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
index 1b00e4195a1a..1955dcd5655d 100644
--- a/lib/math/math_kunit.c
+++ b/lib/math/math_kunit.c
@@ -68,6 +68,26 @@ static void round_down_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, round_down((1 << 30) - 1, 1 << 29), 1 << 29);
 }
 
+static void round_closest_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_up(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up(14, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_up((1 << 30) - 1, 2), 1 << 30);
+}
+
+static void round_closest_down_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_closest_down(17, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(15, 4), 16);
+   KUNIT_EXPECT_EQ(test, round_closest_down(14, 4), 12);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) + 1, 1 << 30), 1 << 
30);
+   KUNIT_EXPECT_EQ(test, round_closest_down((1 << 30) - 1, 2), (1 << 30) - 
2);
+}
+
 /* These versions can round to numbers that aren't a power of two */
 static void roundup_test(struct kunit *test)
 {
@@ -93,6 +113,18 @@ static void rounddown_test(struct kunit *test)
KUNIT_EXPECT_EQ(test, rounddown(4, 3), 3);
 }
 
+static void roundclosest_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, roundclosest(21, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(19, 5), 20);
+   KUNIT_EXPECT_EQ(test, roundclosest(17, 5), 15);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30), 3), (1 << 30) - 1);
+   KUNIT_EXPECT_EQ(test, roundclosest((1 << 30) - 1, 1 << 29), 1 << 30);
+
+   KUNIT_EXPECT_EQ(test, roundclosest(4, 3), 3);
+   KUNIT_EXPECT_EQ(test, roundclosest(5, 3), 6);
+}
+
 static void div_round_up_test(struct kunit *test)
 {
KUNIT_EXPECT_EQ(test, DIV_ROUND_UP(0, 1), 0);
@@ -270,8 +302,11 @@ static struct kunit_case math_test_cases[] = {
KUNIT_CASE(int_sqrt_test),
KUNIT_CASE(round_up_test),
KUNIT_CASE(round_down_test),
+   KUNIT_CASE(round_closest_up_test),
+   KUNIT_CASE(round_closest_down_test),
KUNIT_CASE(roundup_test),
KUNIT_CASE(rounddown_test),
+   KUNIT_CASE(roundclosest_test),
KUNIT_CASE(div_round_up_test),
KUNIT_CASE(div_round_closest_test),
KUNIT_CASE_PARAM(gcd_test, gcd_gen_params),
-- 
2.39.1



Re: [PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-05-17 Thread Guenter Roeck
On Thu, May 16, 2024 at 01:34:54PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> [
>This is a treewide change. I will likely re-create this patch again in
>the second week of the merge window of v6.10 and submit it then. Hoping
>to keep the conflicts that it will cause to a minimum.
> ]
> 
> With the rework of how the __string() handles dynamic strings where it
> saves off the source string in field in the helper structure[1], the
> assignment of that value to the trace event field is stored in the helper
> value and does not need to be passed in again.
> 
> This means that with:
> 
>   __string(field, mystring)
> 
> Which use to be assigned with __assign_str(field, mystring), no longer
> needs the second parameter and it is unused. With this, __assign_str()
> will now only get a single parameter.
> 
> There's over 700 users of __assign_str() and because coccinelle does not
> handle the TRACE_EVENT() macro I ended up using the following sed script:
> 
>   git grep -l __assign_str | while read a ; do
>   sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
>   mv /tmp/test-file $a;
>   done
> 
> I then searched for __assign_str() that did not end with ';' as those
> were multi line assignments that the sed script above would fail to catch.
> 

Building csky:allmodconfig (and others) ... failed
--
Error log:
In file included from include/trace/trace_events.h:419,
 from include/trace/define_trace.h:102,
 from drivers/cxl/core/trace.h:737,
 from drivers/cxl/core/trace.c:8:
drivers/cxl/core/./trace.h:383:1: error: macro "__assign_str" passed 2 
arguments, but takes just 1

This is with the patch applied on top of v6.9-8410-gff2632d7d08e.
So far that seems to be the only build failure.
Introduced with commit 6aec00139d3a8 ("cxl/core: Add region info to
cxl_general_media and cxl_dram events"). Guess we'll see more of those
towards the end of the commit window.

Guenter


[PATCH v8 07/10] lib: add basic KUnit test for lib/math

2024-05-17 Thread Devarsh Thakkar
From: Daniel Latypov 

Add basic test coverage for files that don't require any config options:
* part of math.h (what seem to be the most commonly used macros)
* gcd.c
* lcm.c
* int_sqrt.c
* reciprocal_div.c
(Ignored int_pow.c since it's a simple textbook algorithm.)

These tests aren't particularly interesting, but they
* provide short and simple examples of parameterized tests
* provide a place to add tests for any new files in this dir
* are written so adding new test cases to cover edge cases should be
  easy
  * looking at code coverage, we hit all the branches in the .c files

Signed-off-by: Daniel Latypov 
Reviewed-by: David Gow 
[devarsht: Rebase to 6.9 and change license to GPL]
Signed-off-by: Devarsh Thakkar 
---
Changes since v6:
* Rebase to linux-next, change license to GPL as suggested by checkpatch.

Changes since v5:
* add in test cases for roundup/rounddown
* address misc comments from David

Changes since v4:
* add in test cases for some math.h macros (abs, round_up/round_down,
  div_round_down/closest)
* use parameterized testing less to keep things terser

Changes since v3:
* fix `checkpatch.pl --strict` warnings
* add test cases for gcd(0,0) and lcm(0,0)
* minor: don't test both gcd(a,b) and gcd(b,a) when a == b

Changes since v2: mv math_test.c => math_kunit.c

Changes since v1:
* Rebase and rewrite to use the new parameterized testing support.
* misc: fix overflow in literal and inline int_sqrt format string.
* related: commit 1f0e943df68a ("Documentation: kunit: provide guidance
for testing many inputs") was merged explaining the patterns shown here.
  * there's an in-flight patch to update it for parameterized testing.
---
 lib/math/Kconfig  |  11 ++
 lib/math/Makefile |   1 +
 lib/math/math_kunit.c | 291 ++
 3 files changed, 303 insertions(+)
 create mode 100644 lib/math/math_kunit.c

diff --git a/lib/math/Kconfig b/lib/math/Kconfig
index 0634b428d0cb..832c6989ca13 100644
--- a/lib/math/Kconfig
+++ b/lib/math/Kconfig
@@ -15,3 +15,14 @@ config PRIME_NUMBERS
 
 config RATIONAL
tristate
+
+config MATH_KUNIT_TEST
+   tristate "KUnit test for math helper functions"
+   help
+ This builds unit test for math helper functions as defined in lib/math
+ and math.h.
+
+ For more information on KUNIT and unit tests in general, please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
diff --git a/lib/math/Makefile b/lib/math/Makefile
index 91fcdb0c9efe..dcf65d10dab2 100644
--- a/lib/math/Makefile
+++ b/lib/math/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_RATIONAL)  += rational.o
 
 obj-$(CONFIG_TEST_DIV64)   += test_div64.o
 obj-$(CONFIG_RATIONAL_KUNIT_TEST) += rational-test.o
+obj-$(CONFIG_MATH_KUNIT_TEST) += math_kunit.o
diff --git a/lib/math/math_kunit.c b/lib/math/math_kunit.c
new file mode 100644
index ..1b00e4195a1a
--- /dev/null
+++ b/lib/math/math_kunit.c
@@ -0,0 +1,291 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Simple KUnit suite for math helper funcs that are always enabled.
+ *
+ * Copyright (C) 2020, Google LLC.
+ * Author: Daniel Latypov 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static void abs_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, abs((char)0), (char)0);
+   KUNIT_EXPECT_EQ(test, abs((char)42), (char)42);
+   KUNIT_EXPECT_EQ(test, abs((char)-42), (char)42);
+
+   /* The expression in the macro is actually promoted to an int. */
+   KUNIT_EXPECT_EQ(test, abs((short)0),  0);
+   KUNIT_EXPECT_EQ(test, abs((short)42),  42);
+   KUNIT_EXPECT_EQ(test, abs((short)-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0),  0);
+   KUNIT_EXPECT_EQ(test, abs(42),  42);
+   KUNIT_EXPECT_EQ(test, abs(-42),  42);
+
+   KUNIT_EXPECT_EQ(test, abs(0L), 0L);
+   KUNIT_EXPECT_EQ(test, abs(42L), 42L);
+   KUNIT_EXPECT_EQ(test, abs(-42L), 42L);
+
+   KUNIT_EXPECT_EQ(test, abs(0LL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42LL), 42LL);
+   KUNIT_EXPECT_EQ(test, abs(-42LL), 42LL);
+
+   /* Unsigned types get casted to signed. */
+   KUNIT_EXPECT_EQ(test, abs(0ULL), 0LL);
+   KUNIT_EXPECT_EQ(test, abs(42ULL), 42LL);
+}
+
+static void int_sqrt_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, int_sqrt(0UL), 0UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL), 1UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(4UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(5UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(8UL), 2UL);
+   KUNIT_EXPECT_EQ(test, int_sqrt(1UL << 30), 1UL << 15);
+}
+
+static void round_up_test(struct kunit *test)
+{
+   KUNIT_EXPECT_EQ(test, round_up(0, 1), 0);
+   KUNIT_EXPECT_EQ(test, round_up(1, 2), 2);
+   KUNIT_EXPECT_EQ(test, round_up(3, 2), 4);
+   KUNIT_EXPECT_EQ(test, round_up((1 << 30) - 1, 2), 1 << 30);
+   KUNIT_EXPECT_EQ(test, round_up(

[PATCH v8 06/10] math.h: Add macros for rounding to closest value

2024-05-17 Thread Devarsh Thakkar
Add below rounding related macros:

round_closest_up(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round up in case two nearest values are
possible.

round_closest_down(x, y) : Rounds x to closest multiple of y where y is a
power of 2, with a preference to round down in case two nearest values are
possible.

roundclosest(x, y) : Rounds x to closest multiple of y, this macro should
generally be used only when y is not multiple of 2 as otherwise
round_closest* macros should be used which are much faster.

Examples:
 * round_closest_up(17, 4) = 16
 * round_closest_up(15, 4) = 16
 * round_closest_up(14, 4) = 16
 * round_closest_down(17, 4) = 16
 * round_closest_down(15, 4) = 16
 * round_closest_down(14, 4) = 12
 * roundclosest(21, 5) = 20
 * roundclosest(19, 5) = 20
 * roundclosest(17, 5) = 15

Signed-off-by: Devarsh Thakkar 
---
NOTE: This patch is inspired from the Mentor Graphics IPU driver [1]
which uses similar macro locally and which is updated in further patch
in the series to use this generic macro instead along with other drivers
having similar requirements.

[1]:
https://elixir.bootlin.com/linux/v6.8.9/source/drivers/gpu/ipu-v3/ipu-image-convert.c#L480

V8:
- Add new macro to round to nearest value for non-multiple of 2
- Update commit message as suggested

V1->V6 (No change, patch introduced in V7)
---
 include/linux/math.h | 65 
 1 file changed, 65 insertions(+)

diff --git a/include/linux/math.h b/include/linux/math.h
index dd4152711de7..e2cc3769ed0e 100644
--- a/include/linux/math.h
+++ b/include/linux/math.h
@@ -34,6 +34,54 @@
  */
 #define round_down(x, y) ((x) & ~__round_mask(x, y))
 
+/**
+ * round_closest_up - round closest to be multiple of specified value (which is
+ *power of 2) with preference to rounding up
+
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples :
+ * round_closest_up(17, 4) = 16
+ * round_closest_up(15, 4) = 16
+ * round_closest_up(14, 4) = 16
+ */
+#define round_closest_up(x, y) round_down((x) + (y) / 2, (y))
+
+/**
+ * round_closest_down - round closest to be multiple of specified value (which
+ * is power of 2) with preference to rounding down
+ *
+ * @x: the value to round
+ * @y: multiple to round closest to (must be a power of 2)
+ *
+ * Rounds @x to closest multiple of @y (which must be a power of 2).
+ * The value can be either rounded up or rounded down depending upon rounded
+ * value's closeness to the specified value. If there are two closest possible
+ * values, i.e. the difference between the specified value and it's rounded up
+ * and rounded down values is same then preference is given to rounded up
+ * value.
+ *
+ * To perform arbitrary rounding to closest value (not multiple of 2), use
+ * roundclosest().
+ *
+ * Examples :
+ * round_closest_down(17, 4) = 16
+ * round_closest_down(15, 4) = 16
+ * round_closest_down(14, 4) = 12
+ */
+#define round_closest_down(x, y) round_up((x) - (y) / 2, (y))
+
 #define DIV_ROUND_UP __KERNEL_DIV_ROUND_UP
 
 #define DIV_ROUND_DOWN_ULL(ll, d) \
@@ -77,6 +125,23 @@
 }  \
 )
 
+/**
+ * roundclosest - round to nearest multiple
+ * @x: the value to round
+ * @y: multiple to round nearest to
+ *
+ * Rounds @x to nearest multiple of @y.
+ * The rounded value can be greater than or less than @x depending
+ * upon it's nearness to @x. If @y will always be a power of 2, consider
+ * using the faster round_closest_up() or round_closest_down().
+ *
+ * Examples :
+ * roundclosest(21, 5) = 20
+ * roundclosest(19, 5) = 20
+ * roundclosest(17, 5) = 15
+ */
+#define roundclosest(x, y) rounddown((x) + (y) / 2, (y))
+
 /*
  * Divide positive or negative dividend by positive or negative divisor
  * and round to closest integer. Result is undefined for negative
-- 
2.39.1



Re: [PATCH] drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1

2024-05-17 Thread Tim Van Patten
> Fair enough, but this is also the only gfx9 APU which defaults to
> noretry=1, all of the rest are dGPUs.  I'd argue it should align with
> the other GFX9 APUs or they should all enable noretry=1.

Do you mean we should remove all IP_VERSION(9, X, X) entries from
amdgpu_gmc_noretry_set(), leaving just >= IP_VERSION(10, 3, 0)?


[PATCH v8 00/10] Add V4L2 M2M Driver for E5010 JPEG Encoder

2024-05-17 Thread Devarsh Thakkar
This adds support for V4L2 M2M based driver for E5010 JPEG Encoder
which is a stateful JPEG encoder from Imagination technologies
and is present in TI AM62A SoC.

While adding support for it, following additional framework changes were
made:
 - Moved reference quantization and huffman tables provided in
   ITU-T-REC-T.81 to v4l2-jpeg.c as suggested in mailing list [1].
 - Add macros to round to closest integer (either higher or lower) while
   rounding in order of 2.
 - Add KUnit tests for math functions.

v4l2-compliance test :
Link: https://gist.github.com/devarsht/1f039c631ca953a57f405cfce1b69e49

E5010 JPEG Encoder Manual tests :

Performance:
Link: https://gist.github.com/devarsht/c40672944fd71c9a53ab55adbfd9e28b

Functionality:
Link: https://gist.github.com/devarsht/8e88fcaabff016bb2bac83d89c9d23ce

Compression Quality:
Link: https://gist.github.com/devarsht/cbcc7cd97e8c48ba1486caa2b7884655

Multi Instance:
Link: https://gist.github.com/devarsht/22c2fca08cd3441fb40f2c7a4cebc95a

Crop support:
Link: https://gist.github.com/devarsht/de6f5142f678bb1a5338abfd9f814abd

Runtime PM:
Link: https://gist.github.com/devarsht/70cd95d4440ddc678489d93885ddd4dd

Math lib KUnit tests:
Link: https://gist.github.com/devarsht/3f9042825be3da4e133b8f4eda067876

[1]: 
https://lore.kernel.org/all/de46aefe-36da-4e1a-b4fa-b375b2749...@xs4all.nl/

Changelog:
V7->V8:
 - Add KUnit tests for math functions
 - Add roundclosest() for supporting rounding for non-multiple of 2
 - Update commit message as suggested
 - Add Reviewed-by and Acked-by tags to patches as received

V6->V7:
 - Fix cropping support
 - Move reference huffman and quantization tables to v4l2-jpeg.c
 - Fix suspend/resume use-case
 - Add Reviewed-by

V5->V6:
 - Fix sparse warnings

V4->V5:
 - Sort the #includes in driver file alphabetically
 - Rename huffman and quantization tables to not use '_'
 - Add Reviewed-by tag

V3->V4:
- Use ti-specific compatible ti,am62a-jpeg-enc as secondary one in
  dt-binding
- Remove clock-names as only single clock in dt-binding
- Fix issue with default params setting
- Correct v4l2 error prints
- Simplify register write functions with single statement return values
- Remove unrequired error checks from get_queue()
- Drop explicit device_caps setting as it is already taken care by v4l2
  core
- Remove unrequired multiplanar checks and memset from s_fmt, g_fmt
  callback functions
- Fix try_fmt callback to not update the queues
- Remove unrequired contiguous format attribute from queue_init
- Use dynamic allocation for video_device and remove unrequired
  assignments in probe()
- Remove unrequired checks from queue_setup function
- Return queued buffers back if start_streaming fails
- Use ARRAY_SIZE in place of hard-coding
- Use huffman and quantization tables from reference header file

V2->V3:
- Add DONOTMERGE patches for dts and defconfig
- Update driver with below changes :
  - Correct license headers
  - Use more generic name core instead of jasper for base registers
  - Add Comment for forward declarations
  - Simplify quantization table calculations
  - Use v4l2_apply_frmsize_constraints for updating framesize and remove
unrequired functions
  - Place TODO at top of file and in commit message too
  - Use dev_err_probe helper in probe function
  - Fix return value checking for failure scenarios in probe function
  - Use v4l2_err/info/warn helpers instead of dev_err/info/warn helpers
  - Fix unexpected indentation
  - Correct commit message
- Update dt-bindings with below changes :
  - Add vendor specific compatible 
  - Fix commit title and message
  - Update reg names
  - Update clocks to 1
  - Fix dts example with proper naming

V1->V2:
 - Send dt-bindings and driver together

Patch-Diff between the series :
V7->V8 Range diff :
https://gist.github.com/devarsht/3fd6c4e8031ab114248f93d01c8dfc74

V6->V7 Range diff :
https://gist.github.com/devarsht/1db185b1e187eaf397e9e4c37066777e

V5->V6 Range diff :
https://gist.github.com/devarsht/c89180ac2b0d2814614f2b59d0705c19

V4->V5 Range diff :
https://gist.github.com/devarsht/298790af819f299a0a05fec89371097b

V3->V4 Range diff :
https://gist.github.com/devarsht/22a744d999080de6e813bcfb5a596272

Previous patch series:
V7: https://lore.kernel.org/all/20240510082603.1263256-1-devar...@ti.com/
V6: https://lore.kernel.org/all/20240228141140.3530612-1-devar...@ti.com/
V5: https://lore.kernel.org/all/20240215134641.3381478-1-devar...@ti.com/
V4: https://lore.kernel.org/all/20240205114239.924697-1-devar...@ti.com/
V3: https://lore.kernel.org/all/20230816152210.4080779-1-devar...@ti.com/
V2: https://lore.kernel.org/all/20230727112546.2201995-1-devar...@ti.com/

Daniel Latypov (1):
  lib: add basic KUnit test for lib/math

Devarsh Thakkar (9):
  media: dt-bindings: Add Imagination E5010 JPEG Encoder
  media: imagination: Add E5010 JPEG Encoder driver
  media: v4l2-jpeg: Export reference quantization and huffman tables
  media: imagination: Use exporte

Re: [PATCH v2] drm/buddy: Fix the warn on's during force merge

2024-05-17 Thread Paneer Selvam, Arunpravin

Hi Dave,
Please help pull this patch into drm-next. This will fix any unnecessary
warnings in memory pressure situation.

Thanks for the help.

Regards,
Arun.

On 5/17/2024 8:03 PM, Arunpravin Paneer Selvam wrote:

Move the fallback and block incompatible checks
above, so that we dont unnecessarily split the blocks
and leaving the unmerged. This resolves the unnecessary
warn on's thrown during force_merge call.

v2:(Matthew)
   - Move the fallback and block incompatible checks above
 the contains check.

Signed-off-by: Arunpravin Paneer Selvam 
Reviewed-by: Matthew Auld 
Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")
Link: 
https://patchwork.kernel.org/project/dri-devel/patch/20240517135015.17565-1-arunpravin.paneersel...@amd.com/
---
  drivers/gpu/drm/drm_buddy.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 1daf778cf6fa..94f8c34fc293 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -524,11 +524,11 @@ __alloc_range_bias(struct drm_buddy *mm,
continue;
}
  
+		if (!fallback && block_incompatible(block, flags))

+   continue;
+
if (contains(start, end, block_start, block_end) &&
order == drm_buddy_block_order(block)) {
-   if (!fallback && block_incompatible(block, flags))
-   continue;
-
/*
 * Find the free block within the range.
 */




Re: [PATCH v4 3/8] drm/xe/lrc: Add helper to capture context timestamp

2024-05-17 Thread Francois Dugast
On Wed, May 15, 2024 at 02:42:53PM -0700, Lucas De Marchi wrote:
> From: Umesh Nerlige Ramappa 
> 
> Add a helper to capture CTX_TIMESTAMP from the context image so it can
> be used to calculate the runtime.
> 
> v2: Add kernel-doc to clarify expectation from caller
> 
> Signed-off-by: Umesh Nerlige Ramappa 
> Signed-off-by: Lucas De Marchi 

LGTM:
Reviewed-by: Francois Dugast 

> ---
>  drivers/gpu/drm/xe/regs/xe_lrc_layout.h |  1 +
>  drivers/gpu/drm/xe/xe_lrc.c | 12 
>  drivers/gpu/drm/xe/xe_lrc.h | 14 ++
>  drivers/gpu/drm/xe/xe_lrc_types.h   |  3 +++
>  4 files changed, 30 insertions(+)
> 
> diff --git a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h 
> b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
> index e6ca8bbda8f4..045dfd09db99 100644
> --- a/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
> +++ b/drivers/gpu/drm/xe/regs/xe_lrc_layout.h
> @@ -11,6 +11,7 @@
>  #define CTX_RING_TAIL(0x06 + 1)
>  #define CTX_RING_START   (0x08 + 1)
>  #define CTX_RING_CTL (0x0a + 1)
> +#define CTX_TIMESTAMP(0x22 + 1)
>  #define CTX_INDIRECT_RING_STATE  (0x26 + 1)
>  #define CTX_PDP0_UDW (0x30 + 1)
>  #define CTX_PDP0_LDW (0x32 + 1)
> diff --git a/drivers/gpu/drm/xe/xe_lrc.c b/drivers/gpu/drm/xe/xe_lrc.c
> index 9b0a4078add3..f679cb9aaea7 100644
> --- a/drivers/gpu/drm/xe/xe_lrc.c
> +++ b/drivers/gpu/drm/xe/xe_lrc.c
> @@ -844,6 +844,7 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine 
> *hwe,
>   lrc->tile = gt_to_tile(hwe->gt);
>   lrc->ring.size = ring_size;
>   lrc->ring.tail = 0;
> + lrc->ctx_timestamp = 0;
>  
>   xe_hw_fence_ctx_init(>fence_ctx, hwe->gt,
>hwe->fence_irq, hwe->name);
> @@ -898,6 +899,8 @@ int xe_lrc_init(struct xe_lrc *lrc, struct xe_hw_engine 
> *hwe,
>RING_CTL_SIZE(lrc->ring.size) | 
> RING_VALID);
>   }
>  
> + xe_lrc_write_ctx_reg(lrc, CTX_TIMESTAMP, 0);
> +
>   if (xe->info.has_asid && vm)
>   xe_lrc_write_ctx_reg(lrc, PVC_CTX_ASID, vm->usm.asid);
>  
> @@ -1576,3 +1579,12 @@ void xe_lrc_snapshot_free(struct xe_lrc_snapshot 
> *snapshot)
>   xe_bo_put(snapshot->lrc_bo);
>   kfree(snapshot);
>  }
> +
> +u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts)
> +{
> + *old_ts = lrc->ctx_timestamp;
> +
> + lrc->ctx_timestamp = xe_lrc_read_ctx_reg(lrc, CTX_TIMESTAMP);
> +
> + return lrc->ctx_timestamp;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_lrc.h b/drivers/gpu/drm/xe/xe_lrc.h
> index e0e841963c23..b9da1031083b 100644
> --- a/drivers/gpu/drm/xe/xe_lrc.h
> +++ b/drivers/gpu/drm/xe/xe_lrc.h
> @@ -66,4 +66,18 @@ void xe_lrc_snapshot_capture_delayed(struct 
> xe_lrc_snapshot *snapshot);
>  void xe_lrc_snapshot_print(struct xe_lrc_snapshot *snapshot, struct 
> drm_printer *p);
>  void xe_lrc_snapshot_free(struct xe_lrc_snapshot *snapshot);
>  
> +/**
> + * xe_lrc_update_timestamp - readout LRC timestamp and update cached value
> + * @lrc: logical ring context for this exec queue
> + * @old_ts: pointer where to save the previous timestamp
> + *
> + * Read the current timestamp for this LRC and update the cached value. The
> + * previous cached value is also returned in @old_ts so the caller can 
> calculate
> + * the delta between 2 updates. Note that this is not intended to be called 
> from
> + * any place, but just by the paths updating the drm client utilization.
> + *
> + * Returns the current LRC timestamp
> + */
> +u32 xe_lrc_update_timestamp(struct xe_lrc *lrc, u32 *old_ts);
> +
>  #endif
> diff --git a/drivers/gpu/drm/xe/xe_lrc_types.h 
> b/drivers/gpu/drm/xe/xe_lrc_types.h
> index cdbf03faef15..0fa055da6b27 100644
> --- a/drivers/gpu/drm/xe/xe_lrc_types.h
> +++ b/drivers/gpu/drm/xe/xe_lrc_types.h
> @@ -45,6 +45,9 @@ struct xe_lrc {
>  
>   /** @fence_ctx: context for hw fence */
>   struct xe_hw_fence_ctx fence_ctx;
> +
> + /** @ctx_timestamp: readout value of CTX_TIMESTAMP on last update */
> + u32 ctx_timestamp;
>  };
>  
>  struct xe_lrc_snapshot;
> -- 
> 2.43.0
> 


[PATCH v2 2/3] drm/print: Improve drm_dbg_printer

2024-05-17 Thread Michal Wajdeczko
With recent introduction of a generic drm dev printk function, we
can now store and use location where drm_dbg_printer was invoked
and output it's symbolic name like we do for all drm debug prints.

Cc: Jani Nikula 
Reviewed-by: Jani Nikula 
Signed-off-by: Michal Wajdeczko 
---
v2: use full cast to match member (Jani)
---
 drivers/gpu/drm/drm_print.c | 3 +--
 include/drm/drm_print.h | 2 ++
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index 41892491a12c..35d00f0c6d64 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -217,8 +217,7 @@ void __drm_printfn_dbg(struct drm_printer *p, struct 
va_format *vaf)
if (!__drm_debug_enabled(category))
return;
 
-   /* Note: __builtin_return_address(0) is useless here. */
-   __drm_dev_vprintk(dev, KERN_DEBUG, NULL, p->prefix, vaf);
+   __drm_dev_vprintk(dev, KERN_DEBUG, p->origin, p->prefix, vaf);
 }
 EXPORT_SYMBOL(__drm_printfn_dbg);
 
diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 089950ad8681..bfc5641c6025 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -175,6 +175,7 @@ struct drm_printer {
void (*printfn)(struct drm_printer *p, struct va_format *vaf);
void (*puts)(struct drm_printer *p, const char *str);
void *arg;
+   const void *origin;
const char *prefix;
enum drm_debug_category category;
 };
@@ -332,6 +333,7 @@ static inline struct drm_printer drm_dbg_printer(struct 
drm_device *drm,
struct drm_printer p = {
.printfn = __drm_printfn_dbg,
.arg = drm,
+   .origin = (const void *)_THIS_IP_, /* it's fine as we will be 
inlined */
.prefix = prefix,
.category = category,
};
-- 
2.43.0



[PATCH v2 3/3] drm/i915: Don't use __func__ as prefix for drm_dbg_printer

2024-05-17 Thread Michal Wajdeczko
Updated code of drm_dbg_printer() is already printing symbolic
name of the caller like drm_dbg() does.

Reviewed-by: Jani Nikula 
Signed-off-by: Michal Wajdeczko 
---
 drivers/gpu/drm/i915/gt/intel_reset.c  | 2 +-
 drivers/gpu/drm/i915/gt/selftest_context.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c 
b/drivers/gpu/drm/i915/gt/intel_reset.c
index 6161f7a3ff70..735cd23a43c6 100644
--- a/drivers/gpu/drm/i915/gt/intel_reset.c
+++ b/drivers/gpu/drm/i915/gt/intel_reset.c
@@ -1025,7 +1025,7 @@ void intel_gt_set_wedged(struct intel_gt *gt)
 
if (GEM_SHOW_DEBUG()) {
struct drm_printer p = drm_dbg_printer(>i915->drm,
-  DRM_UT_DRIVER, __func__);
+  DRM_UT_DRIVER, NULL);
struct intel_engine_cs *engine;
enum intel_engine_id id;
 
diff --git a/drivers/gpu/drm/i915/gt/selftest_context.c 
b/drivers/gpu/drm/i915/gt/selftest_context.c
index 12eca750f7d0..5eb46700dc4e 100644
--- a/drivers/gpu/drm/i915/gt/selftest_context.c
+++ b/drivers/gpu/drm/i915/gt/selftest_context.c
@@ -286,7 +286,7 @@ static int __live_active_context(struct intel_engine_cs 
*engine)
 
if (intel_engine_pm_is_awake(engine)) {
struct drm_printer p = drm_dbg_printer(>i915->drm,
-  DRM_UT_DRIVER, __func__);
+  DRM_UT_DRIVER, NULL);
 
intel_engine_dump(engine, ,
  "%s is still awake:%d after idle-barriers\n",
-- 
2.43.0



[PATCH v2 0/3] Improve drm printer code

2024-05-17 Thread Michal Wajdeczko
We already have some drm printk functions that need to duplicate
a code to get a similar format of the final result, for example:

  [ ] :00:00.0: [drm:foo] bar
  [ ] :00:00.0: [drm] foo bar
  [ ] :00:00.0: [drm] *ERROR* foo

Add a generic __drm_dev_vprintk() function that can format the
final message like all other existing function do and allows us
to keep the formatting code in one place.

Above also allows to improve drm_dbg_printer() that today lacks
of outputing symbolic name of the caller, like drm_dbg() does.

v1: https://patchwork.freedesktop.org/series/133749/
v2: make it static, keep it simple and use braces (Jani)

Michal Wajdeczko (3):
  drm/print: Add generic drm dev printk function
  drm/print: Improve drm_dbg_printer
  drm/i915: Don't use __func__ as prefix for drm_dbg_printer

 drivers/gpu/drm/drm_print.c| 53 --
 drivers/gpu/drm/i915/gt/intel_reset.c  |  2 +-
 drivers/gpu/drm/i915/gt/selftest_context.c |  2 +-
 include/drm/drm_print.h|  2 +
 4 files changed, 34 insertions(+), 25 deletions(-)

-- 
2.43.0



[PATCH v2 1/3] drm/print: Add generic drm dev printk function

2024-05-17 Thread Michal Wajdeczko
We already have some drm printk functions that need to duplicate
a code to get a similar format of the final result, for example:

  [ ] :00:00.0: [drm:foo] bar
  [ ] :00:00.0: [drm] foo bar
  [ ] :00:00.0: [drm] *ERROR* foo

Add a generic __drm_dev_vprintk() function that can format the
final message like all other existing function do and allows us
to keep the formatting code in one place.

Cc: Jani Nikula 
Signed-off-by: Michal Wajdeczko 
---
v2: make it static, keep it simple and use braces (Jani)
---
 drivers/gpu/drm/drm_print.c | 52 +
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index cf2efb44722c..41892491a12c 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -176,6 +176,32 @@ void __drm_printfn_seq_file(struct drm_printer *p, struct 
va_format *vaf)
 }
 EXPORT_SYMBOL(__drm_printfn_seq_file);
 
+static void __drm_dev_vprintk(const struct device *dev, const char *level,
+ const void *origin, const char *prefix,
+ struct va_format *vaf)
+{
+   const char *prefix_pad = prefix ? " " : "";
+
+   if (!prefix)
+   prefix = "";
+
+   if (dev) {
+   if (origin)
+   dev_printk(level, dev, "[" DRM_NAME ":%ps]%s%s %pV",
+  origin, prefix_pad, prefix, vaf);
+   else
+   dev_printk(level, dev, "[" DRM_NAME "]%s%s %pV",
+  prefix_pad, prefix, vaf);
+   } else {
+   if (origin)
+   printk("%s" "[" DRM_NAME ":%ps]%s%s %pV",
+  level, origin, prefix_pad, prefix, vaf);
+   else
+   printk("%s" "[" DRM_NAME "]%s%s %pV",
+  level, prefix_pad, prefix, vaf);
+   }
+}
+
 void __drm_printfn_info(struct drm_printer *p, struct va_format *vaf)
 {
dev_info(p->arg, "[" DRM_NAME "] %pV", vaf);
@@ -187,19 +213,12 @@ void __drm_printfn_dbg(struct drm_printer *p, struct 
va_format *vaf)
const struct drm_device *drm = p->arg;
const struct device *dev = drm ? drm->dev : NULL;
enum drm_debug_category category = p->category;
-   const char *prefix = p->prefix ?: "";
-   const char *prefix_pad = p->prefix ? " " : "";
 
if (!__drm_debug_enabled(category))
return;
 
/* Note: __builtin_return_address(0) is useless here. */
-   if (dev)
-   dev_printk(KERN_DEBUG, dev, "[" DRM_NAME "]%s%s %pV",
-  prefix_pad, prefix, vaf);
-   else
-   printk(KERN_DEBUG "[" DRM_NAME "]%s%s %pV",
-  prefix_pad, prefix, vaf);
+   __drm_dev_vprintk(dev, KERN_DEBUG, NULL, p->prefix, vaf);
 }
 EXPORT_SYMBOL(__drm_printfn_dbg);
 
@@ -287,12 +306,7 @@ void drm_dev_printk(const struct device *dev, const char 
*level,
vaf.fmt = format;
vaf.va = 
 
-   if (dev)
-   dev_printk(level, dev, "[" DRM_NAME ":%ps] %pV",
-  __builtin_return_address(0), );
-   else
-   printk("%s" "[" DRM_NAME ":%ps] %pV",
-  level, __builtin_return_address(0), );
+   __drm_dev_vprintk(dev, level, __builtin_return_address(0), NULL, );
 
va_end(args);
 }
@@ -312,12 +326,7 @@ void __drm_dev_dbg(struct _ddebug *desc, const struct 
device *dev,
vaf.fmt = format;
vaf.va = 
 
-   if (dev)
-   dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
-  __builtin_return_address(0), );
-   else
-   printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
-  __builtin_return_address(0), );
+   __drm_dev_vprintk(dev, KERN_DEBUG, __builtin_return_address(0), NULL, 
);
 
va_end(args);
 }
@@ -351,8 +360,7 @@ void __drm_err(const char *format, ...)
vaf.fmt = format;
vaf.va = 
 
-   printk(KERN_ERR "[" DRM_NAME ":%ps] *ERROR* %pV",
-  __builtin_return_address(0), );
+   __drm_dev_vprintk(NULL, KERN_ERR, __builtin_return_address(0), 
"*ERROR*", );
 
va_end(args);
 }
-- 
2.43.0



Re: [PATCH] tracing/treewide: Remove second parameter of __assign_str()

2024-05-17 Thread Darrick J. Wong
On Thu, May 16, 2024 at 01:34:54PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" 
> 
> [
>This is a treewide change. I will likely re-create this patch again in
>the second week of the merge window of v6.10 and submit it then. Hoping
>to keep the conflicts that it will cause to a minimum.
> ]
> 
> With the rework of how the __string() handles dynamic strings where it
> saves off the source string in field in the helper structure[1], the
> assignment of that value to the trace event field is stored in the helper
> value and does not need to be passed in again.
> 
> This means that with:
> 
>   __string(field, mystring)
> 
> Which use to be assigned with __assign_str(field, mystring), no longer
> needs the second parameter and it is unused. With this, __assign_str()
> will now only get a single parameter.
> 
> There's over 700 users of __assign_str() and because coccinelle does not
> handle the TRACE_EVENT() macro I ended up using the following sed script:
> 
>   git grep -l __assign_str | while read a ; do
>   sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
>   mv /tmp/test-file $a;
>   done
> 
> I then searched for __assign_str() that did not end with ';' as those
> were multi line assignments that the sed script above would fail to catch.
> 
> Note, the same updates will need to be done for:
> 
>   __assign_str_len()
>   __assign_rel_str()
>   __assign_rel_str_len()
> 
> I tested this with both an allmodconfig and an allyesconfig (build only for 
> both).
> 
> [1] 
> https://lore.kernel.org/linux-trace-kernel/2024011442.634192...@goodmis.org/
> 
> Cc: Masami Hiramatsu 
> Cc: Mathieu Desnoyers 
> Cc: Linus Torvalds 
> Cc: Julia Lawall 
> Signed-off-by: Steven Rostedt (Google) 

/me finds this pretty magical, but such is the way of macros.
Thanks for being much smarter about them than me. :)

Acked-by: Darrick J. Wong# xfs

--D


Re: [PATCH v6 0/7] Adds support for ConfigFS to VKMS!

2024-05-17 Thread Louis Chauvet
> > > Hi Louis,
> > > 
> > > If you could share a RFC/WIP series it would be awesome!

Hi all!

I just uploaded my WIP series to github here [1]. Most of the work is 
extracted from the current ConfigFS series, I just splitted and completed 
what was done. I also tried to take in account the comments from Sima.

All commits should compile and `modprobe/rmmod/kms_plane` should not 
crashing. The commits are not totaly clean, but it should be only cosmetic 
stuff (formatting in the wrong commit for example). The commit messages 
are not written yet, but the title should be sufficient to understand the 
content of each commit.

This is how I plan to split this work in series: (hash may change over 
time, I will force push to clean commits)

Some preparation stuff (no functionnal change):
256d7045ec70 drm/vkms: Formatting and typo fix
cc2de5004c42 drm/vkms: Rename index to possible_crtc
a74cefc87b9c drm/vkms: Add documentation

More preparation to split everything properly (no functionnal change):
ad2d0b07558f drm/vkms: Properly extract vkms_formats header
f9639cca2d43 drm/vkms: Extract vkms_writeback header
7edda8012b44 drm/vkms: Extract vkms_plane header
ced09ed9d0f7 drm/vkms: Rename macro to avoid confusion
9f00e4823529 drm/vkms: Extract vkms_crtc header
b510e480ed92 drm/vkms: Extract vkms_composer header

Switch all the vkms object to managed (this part need a careful review, 
I am new with DRM, so I probably did some error):
ddef3c09ead6 drm/vkms: Switch to managed for connector
8859cad0e192 drm/vkms: Switch to managed for encoder
d2b8d93fb684 drm/vkms: Switch to managed for crtc
d1ad316b0f0d drm/vkms: Rename all vkms_crtc instance to be consistent

Temporaly remove debugfs entry, I plan to remove this commit:
079d875c015e drm/vkms: remove debugfs entry about the current vkms 
configuration

Clean up vkms_device and unlink vkms_config from vkms_device.
c782dbe9edc3 drm/vkms: Remove vkms_config from vkms_device
8a27c13634a3 drm/vkms: Remove (useles?) group
8fb24e1cdf88 drm/vkms: Introduce directly the default device as 
global/Remove default vkms config

More cleanup:
2572d90723ac drm/vkms: Remove possible crtc from parameters

Switching to platform driver (same thing, it is my first time, I probably 
messed up things):
63be09e05760 drm/vkms: Use a real platform driver
5f4cf18b07d3 drm/vkms: Extract device driver in its own file

The configFS implementation itself. It only allows to create/enable/delete 
a device:
b34651685f2e drm/vkms: Introduce configfs

Those commits were a POC to confirm that it works. They need to be 
replaced by the "real" configuration (creation of crtc/connector/planes...)
dd55451ccef2 drm/vkms: Make overlay configurable with configfs
9dca357f1ee3 drm/vkms: Make cursor configurable with configfs
bd721f41fad9 drm/vkms: Make writeback configurable with configfs

Kind regards,
Louis Chauvet


[1]: https://github.com/Fomys/linux/tree/b4/new-configfs

> > > Since you are already working on the kernel patches (and I guess IGT?),
> > > I'll start working on a libdrm high level API to interact with VKMS from
> > > user-space on top of your patches. I'll share a link as soon as I have a
> > > draft PR.
> > 
> > Just out of curiosity what API would that be? These should fairly
> > simple that they can be configured from a shell script 
> > (mount/mkdir/rm/echo/umount). Believe should be easy enough to test stuff 
> > with 
> > bunch scripts like that.
> 
> My plan is to add a very thin C API around mkdir/rmdir/etc.
> 
> It is true that VKMS can be configure easily using a bash script; however,
> compositors with test suites written in C (or with bindings to libdrm) would
> have to write similar wrappers around the mkdir/rmdir/etc calls.
> I think that it could be beneficial for them to have a shared wrapper 
> available
> in libdrm.
>  
> > Perphas landing the I-G-T tests first (assuming we're settled 
> > on how exactly this would work) might be of greated help to get a green lit 
> > the kernel driver side? Skip if vkms/configfs/something else that tells
> > us VKMS doesn't have ConfigFS eneabled, and run it when that is on.
> > 
> > The lastest iteration was shared by Jim at 
> > https://lore.kernel.org/igt-dev/20230901092819.16924-1-marius.v...@collabora.com/
> > 
> > That way sub-sequent BAT CI would pick up issues, and can also used
> > independently by Louis. Should also divide the work-load evenly with
> > Louis focusing on the just the driver. Happy to review and test it.
> > 
> > > 
> > > > Maybe we can discuss a bit the comment from Daniel (split init between 
> > > > default/configfs, use or not a real platform device...)
> > > > 
> > > > For the split, I think the first solution (struct vkms_config) can be 
> > > > easier to understand and to implement, for two reasons:
> > > > - No need 

Re: [PATCH] drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1

2024-05-17 Thread Alex Deucher
On Fri, May 17, 2024 at 2:35 AM Christian König
 wrote:
>
> Am 16.05.24 um 19:57 schrieb Tim Van Patten:
> > From: Tim Van Patten 
> >
> > The following commit updated gmc->noretry from 0 to 1 for GC HW IP
> > 9.3.0:
> >
> >  commit 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
> >
> > This causes the device to hang when a page fault occurs, until the
> > device is rebooted. Instead, revert back to gmc->noretry=0 so the device
> > is still responsive.
>
> Wait a second. Why does the device hang on a page fault? That shouldn't
> happen independent of noretry.
>
> So that strongly sounds like this is just hiding a bug elsewhere.

Fair enough, but this is also the only gfx9 APU which defaults to
noretry=1, all of the rest are dGPUs.  I'd argue it should align with
the other GFX9 APUs or they should all enable noretry=1.

Alex

>
> Regards,
> Christian.
>
> >
> > Fixes: 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
> > Signed-off-by: Tim Van Patten 
> > ---
> >
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 1 -
> >   1 file changed, 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > index be4629cdac049..bff54a20835f1 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> > @@ -876,7 +876,6 @@ void amdgpu_gmc_noretry_set(struct amdgpu_device *adev)
> >   struct amdgpu_gmc *gmc = >gmc;
> >   uint32_t gc_ver = amdgpu_ip_version(adev, GC_HWIP, 0);
> >   bool noretry_default = (gc_ver == IP_VERSION(9, 0, 1) ||
> > - gc_ver == IP_VERSION(9, 3, 0) ||
> >   gc_ver == IP_VERSION(9, 4, 0) ||
> >   gc_ver == IP_VERSION(9, 4, 1) ||
> >   gc_ver == IP_VERSION(9, 4, 2) ||
>


Re: [PATCH 1/3] drm/print: Add generic drm dev printk function

2024-05-17 Thread Michal Wajdeczko



On 17.05.2024 15:33, Jani Nikula wrote:
> On Fri, 17 May 2024, Michal Wajdeczko  wrote:
>> We already have some drm printk functions that need to duplicate
>> a code to get a similar format of the final result, for example:
>>
>>   [ ] :00:00.0: [drm:foo] bar
>>   [ ] :00:00.0: [drm] foo bar
>>   [ ] :00:00.0: [drm] *ERROR* foo
>>
>> Add a generic __drm_dev_vprintk() function that can format the
>> final message like all other existing function do and allows us
>> to keep the formatting code in one place.
> 
> Nice idea!
> 
>> Signed-off-by: Michal Wajdeczko 
>> Cc: Jani Nikula 
>> ---
>>  drivers/gpu/drm/drm_print.c | 49 -
>>  include/drm/drm_print.h |  3 +++
>>  2 files changed, 30 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
>> index cf2efb44722c..a2b60c8245a1 100644
>> --- a/drivers/gpu/drm/drm_print.c
>> +++ b/drivers/gpu/drm/drm_print.c
>> @@ -187,19 +187,12 @@ void __drm_printfn_dbg(struct drm_printer *p, struct 
>> va_format *vaf)
>>  const struct drm_device *drm = p->arg;
>>  const struct device *dev = drm ? drm->dev : NULL;
>>  enum drm_debug_category category = p->category;
>> -const char *prefix = p->prefix ?: "";
>> -const char *prefix_pad = p->prefix ? " " : "";
>>  
>>  if (!__drm_debug_enabled(category))
>>  return;
>>  
>>  /* Note: __builtin_return_address(0) is useless here. */
>> -if (dev)
>> -dev_printk(KERN_DEBUG, dev, "[" DRM_NAME "]%s%s %pV",
>> -   prefix_pad, prefix, vaf);
>> -else
>> -printk(KERN_DEBUG "[" DRM_NAME "]%s%s %pV",
>> -   prefix_pad, prefix, vaf);
>> +__drm_dev_vprintk(dev, KERN_DEBUG, NULL, p->prefix, vaf);
>>  }
>>  EXPORT_SYMBOL(__drm_printfn_dbg);
>>  
>> @@ -277,6 +270,29 @@ void drm_print_bits(struct drm_printer *p, unsigned 
>> long value,
>>  }
>>  EXPORT_SYMBOL(drm_print_bits);
>>  
>> +void __drm_dev_vprintk(const struct device *dev, const char *level,
>> +   const void *origin, const char *prefix,
>> +   struct va_format *vaf)
>> +{
>> +const char *prefix_pad = prefix ? " " : (prefix = "");
> 
> Too clever, please just keep it simple:
> 
>   const char *prefix_pad = prefix ? " " : "";
> 
>   if (!prefix)
>   prefix = "";
> 
>> +
>> +if (dev)
>> +if (origin)
>> +dev_printk(level, dev, "[" DRM_NAME ":%ps]%s%s %pV",
>> +   origin, prefix_pad, prefix, vaf);
>> +else
>> +dev_printk(level, dev, "[" DRM_NAME "]%s%s %pV",
>> +   prefix_pad, prefix, vaf);
>> +else
>> +if (origin)
>> +printk("%s" "[" DRM_NAME ":%ps]%s%s %pV",
>> +   level, origin, prefix_pad, prefix, vaf);
>> +else
>> +printk("%s" "[" DRM_NAME "]%s%s %pV",
>> +   level, prefix_pad, prefix, vaf);
> 
> I'd sprinkle a few curly braces around the top level if-else blocks.
> 
> Side note, feels like using DRM_NAME makes things harder, not
> easier. But that's for another patch.
> 
>> +}
>> +EXPORT_SYMBOL(__drm_dev_vprintk);
> 
> AFAICT this could be a non-exported static function. And probably moved
> earlier in the file to not require a declaration.

true for now, but I was planning to add Xe GT specific printer that
would use this function like this:

+static inline void __xe_gt_printfn_dbg(struct drm_printer *p, struct
va_format *vaf)
+{
+   struct xe_gt *gt = p->arg;
+   const struct device *dev = gt_to_xe(gt)->drm.dev;
+   char prefix[8];
+
+   if (!drm_debug_enabled(DRM_UT_DRIVER))
+   return;
+
+   /* xe_gt_dbg() callsite decorations are unhelpful */
+   snprintf(prefix, sizeof(prefix), "GT%u:", gt->info.id);
+   __drm_dev_vprintk(dev, KERN_DEBUG, p->origin, prefix, vaf);
+}
+

but I can add this new custom printer to the series right now in v2 to
show the usage and avoid any confusion

> 
> BR,
> Jani.
> 
>> +
>>  void drm_dev_printk(const struct device *dev, const char *level,
>>

Re: [PATCH] drm/edid: add non-desktop quirk to Bigscreen Beyond HMD

2024-05-17 Thread Sefa Eyeoglu
On Fri, 2024-05-17 at 16:52 +0200, Philipp Zabel wrote:
> On Fr, 2024-05-17 at 16:09 +0200, Sefa Eyeoglu wrote:
> > The Bigscreen Beyond VR headset is a non-desktop output and should
> > be
> > marked as such using an EDID quirk.
> > 
> > Closes https://gitlab.freedesktop.org/drm/misc/kernel/-/issues/39
> 
> From the EDID posted there, it looks like the quirk should not be
> necessary?
> 
> The quoted DisplayID extension block correctly marks this as an HMD:
> 
>   "Display Product Primary Use Case: Head-mounted Virtual Reality
> (VR) display"
> 
> The update_displayid_info() function in drm_edid.c should use this
> information to set the non_desktop flag already. Doesn't this work as
> expected?
> 
> 
> regards
> Philipp

I see.

The only potential reason I can come up with is that the DisplayID
block is incomplete.

$ edid-decode --check
Failures:

Block 0, Base EDID:
  Standard Timings: Missing preferred timing.
EDID:
  DisplayID: Missing DisplayID Product Identification Data Block.
  DisplayID: Missing DisplayID Display Parameters Data Block.
  DisplayID: Missing DisplayID Display Interface Features Data Block.

EDID conformity: FAIL


signature.asc
Description: This is a digitally signed message part


Re: [PATCH] drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1

2024-05-17 Thread Tim Van Patten
On Fri, May 17, 2024 at 12:35 AM Christian König
 wrote:
>
> Am 16.05.24 um 19:57 schrieb Tim Van Patten:
> > From: Tim Van Patten 
> >
> > The following commit updated gmc->noretry from 0 to 1 for GC HW IP
> > 9.3.0:
> >
> >  commit 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1")
> >
> > This causes the device to hang when a page fault occurs, until the
> > device is rebooted. Instead, revert back to gmc->noretry=0 so the device
> > is still responsive.
>
> Wait a second. Why does the device hang on a page fault? That shouldn't
> happen independent of noretry.

No idea, but hopefully someone within AMD can help answer that.

I'm not an expert in this area, I was just able to bisect to the CL
causing the change in behavior. There are other reports of people
bisecting to the same CL, so this issue appears to extend beyond
ChromeOS:
https://gitlab.freedesktop.org/mesa/mesa/-/issues/9728#note_2063879

> So that strongly sounds like this is just hiding a bug elsewhere.

That's entirely possible, bringing the number of real issues up to (at
least) two:
1. Why the page faults are occurring to begin with.
  - Any video of size 66x2158 seems to trigger the issue.
2. Why the page faults result in the device hanging with gmc->noretry=1.

I've asked prathyushi.nangia@amd (chris.kuruts@amd may be helping as
well) to look into the page faults further, since they can't hang the
device if they don't exist. She should be able to provide any further
details if you're interested, but please feel free to reach out to me
as well if you have any other questions.


Re: [PATCH v2 2/2] drm/mgag200: Add an option to disable Write-Combine

2024-05-17 Thread Thomas Zimmermann

Hi

Am 17.05.24 um 17:09 schrieb Jocelyn Falempe:

Unfortunately, the G200 ioburst workaround doesn't work on some
servers like Dell poweredge XR11, XR5610, or HPE XL260. In this case
completely disabling WC is the only option to achieve low-latency.
So this adds a new Kconfig option to disable WC mapping of the G200.

Signed-off-by: Jocelyn Falempe 


Reviewed-by: Thomas Zimmermann 

Thanks a lot for the fix.

Best regards
Thomas


---
  drivers/gpu/drm/mgag200/Kconfig   | 10 ++
  drivers/gpu/drm/mgag200/mgag200_drv.c |  6 ++
  2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/mgag200/Kconfig b/drivers/gpu/drm/mgag200/Kconfig
index b28c5e4828f47..3096944a8f0ab 100644
--- a/drivers/gpu/drm/mgag200/Kconfig
+++ b/drivers/gpu/drm/mgag200/Kconfig
@@ -11,3 +11,13 @@ config DRM_MGAG200
 MGA G200 desktop chips and the server variants. It requires 0.3.0
 of the modesetting userspace driver, and a version of mga driver
 that will fail on KMS enabled devices.
+
+config DRM_MGAG200_DISABLE_WRITECOMBINE
+   bool "Disable Write Combine mapping of VRAM"
+   depends on DRM_MGAG200 && PREEMPT_RT
+   help
+ The VRAM of the G200 is mapped with Write-Combine to improve
+ performances. This can interfere with real-time tasks; even if they
+ are running on other CPU cores than the graphics output.
+ Enable this option only if you run realtime tasks on a server with a
+ Matrox G200.
\ No newline at end of file
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c 
b/drivers/gpu/drm/mgag200/mgag200_drv.c
index 3883f25ca4d8b..62080cf0f2da4 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.c
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.c
@@ -146,12 +146,18 @@ int mgag200_device_preinit(struct mga_device *mdev)
}
mdev->vram_res = res;
  
+#if defined(CONFIG_DRM_MGAG200_DISABLE_WRITECOMBINE)

+   mdev->vram = devm_ioremap(dev->dev, res->start, resource_size(res));
+   if (!mdev->vram)
+   return -ENOMEM;
+#else
mdev->vram = devm_ioremap_wc(dev->dev, res->start, resource_size(res));
if (!mdev->vram)
return -ENOMEM;
  
  	/* Don't fail on errors, but performance might be reduced. */

devm_arch_phys_wc_add(dev->dev, res->start, resource_size(res));
+#endif
  
  	return 0;

  }


--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)



[PATCH v2 2/2] drm/mgag200: Add an option to disable Write-Combine

2024-05-17 Thread Jocelyn Falempe
Unfortunately, the G200 ioburst workaround doesn't work on some
servers like Dell poweredge XR11, XR5610, or HPE XL260. In this case
completely disabling WC is the only option to achieve low-latency.
So this adds a new Kconfig option to disable WC mapping of the G200.

Signed-off-by: Jocelyn Falempe 
---
 drivers/gpu/drm/mgag200/Kconfig   | 10 ++
 drivers/gpu/drm/mgag200/mgag200_drv.c |  6 ++
 2 files changed, 16 insertions(+)

diff --git a/drivers/gpu/drm/mgag200/Kconfig b/drivers/gpu/drm/mgag200/Kconfig
index b28c5e4828f47..3096944a8f0ab 100644
--- a/drivers/gpu/drm/mgag200/Kconfig
+++ b/drivers/gpu/drm/mgag200/Kconfig
@@ -11,3 +11,13 @@ config DRM_MGAG200
 MGA G200 desktop chips and the server variants. It requires 0.3.0
 of the modesetting userspace driver, and a version of mga driver
 that will fail on KMS enabled devices.
+
+config DRM_MGAG200_DISABLE_WRITECOMBINE
+   bool "Disable Write Combine mapping of VRAM"
+   depends on DRM_MGAG200 && PREEMPT_RT
+   help
+ The VRAM of the G200 is mapped with Write-Combine to improve
+ performances. This can interfere with real-time tasks; even if they
+ are running on other CPU cores than the graphics output.
+ Enable this option only if you run realtime tasks on a server with a
+ Matrox G200.
\ No newline at end of file
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c 
b/drivers/gpu/drm/mgag200/mgag200_drv.c
index 3883f25ca4d8b..62080cf0f2da4 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.c
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.c
@@ -146,12 +146,18 @@ int mgag200_device_preinit(struct mga_device *mdev)
}
mdev->vram_res = res;
 
+#if defined(CONFIG_DRM_MGAG200_DISABLE_WRITECOMBINE)
+   mdev->vram = devm_ioremap(dev->dev, res->start, resource_size(res));
+   if (!mdev->vram)
+   return -ENOMEM;
+#else
mdev->vram = devm_ioremap_wc(dev->dev, res->start, resource_size(res));
if (!mdev->vram)
return -ENOMEM;
 
/* Don't fail on errors, but performance might be reduced. */
devm_arch_phys_wc_add(dev->dev, res->start, resource_size(res));
+#endif
 
return 0;
 }
-- 
2.45.0



[PATCH v2 1/2] Revert "drm/mgag200: Add a workaround for low-latency"

2024-05-17 Thread Jocelyn Falempe
This reverts commit bfa4437fd3938ae2e186e7664b2db65bb8775670.

This workaround doesn't work reliably on all servers.
I'll replace it with an option to disable Write-Combine,
which has more impact on performance, but fix the latency
issue on all hardware.

Signed-off-by: Jocelyn Falempe 
Reviewed-by: Thomas Zimmermann 
---
 drivers/gpu/drm/mgag200/Kconfig| 12 
 drivers/gpu/drm/mgag200/mgag200_drv.c  | 17 -
 drivers/gpu/drm/mgag200/mgag200_mode.c |  8 
 3 files changed, 37 deletions(-)

diff --git a/drivers/gpu/drm/mgag200/Kconfig b/drivers/gpu/drm/mgag200/Kconfig
index 5e4d48df4854c..b28c5e4828f47 100644
--- a/drivers/gpu/drm/mgag200/Kconfig
+++ b/drivers/gpu/drm/mgag200/Kconfig
@@ -11,15 +11,3 @@ config DRM_MGAG200
 MGA G200 desktop chips and the server variants. It requires 0.3.0
 of the modesetting userspace driver, and a version of mga driver
 that will fail on KMS enabled devices.
-
-config DRM_MGAG200_IOBURST_WORKAROUND
-   bool "Disable buffer caching"
-   depends on DRM_MGAG200 && PREEMPT_RT && X86
-   help
- Enable a workaround to avoid I/O bursts within the mgag200 driver at
- the expense of overall display performance.
- It restores the map_wc = true;
-   return >base;
-}
-#endif
-
 /*
  * DRM driver
  */
@@ -113,9 +99,6 @@ static const struct drm_driver mgag200_driver = {
.major = DRIVER_MAJOR,
.minor = DRIVER_MINOR,
.patchlevel = DRIVER_PATCHLEVEL,
-#if defined(CONFIG_DRM_MGAG200_IOBURST_WORKAROUND)
-   .gem_create_object = mgag200_create_object,
-#endif
DRM_GEM_SHMEM_DRIVER_OPS,
 };
 
diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c 
b/drivers/gpu/drm/mgag200/mgag200_mode.c
index fc54851d3384d..d3d820f7a77d7 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -13,7 +13,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -438,13 +437,6 @@ static void mgag200_handle_damage(struct mga_device *mdev, 
const struct iosys_ma
 
iosys_map_incr(, drm_fb_clip_offset(fb->pitches[0], fb->format, 
clip));
drm_fb_memcpy(, fb->pitches, vmap, fb, clip);
-
-   /* Flushing the cache greatly improves latency on x86_64 */
-#if defined(CONFIG_DRM_MGAG200_IOBURST_WORKAROUND)
-   if (!vmap->is_iomem)
-   drm_clflush_virt_range(vmap->vaddr + clip->y1 * fb->pitches[0],
-  drm_rect_height(clip) * fb->pitches[0]);
-#endif
 }
 
 /*
-- 
2.45.0



[PATCH v2 0/2] drm/mgag200: Add an option to disable Write-Combine

2024-05-17 Thread Jocelyn Falempe
Unfortunately, the G200 ioburst workaround doesn't work on some servers
like Dell poweredge XR11, XR5610, or HPE XL260
In this case completely disabling WC is the only option to achieve
low-latency.
It's probably useless to maintain two workarounds for this,
so I reverted commit bfa4437fd3938 drm/mgag200: Add a workaround for low-latency
and added a new and simpler option, to just disable Write-Combine.

Jocelyn Falempe (2):
  Revert "drm/mgag200: Add a workaround for low-latency"
  drm/mgag200: Add an option to disable Write-Combine

 drivers/gpu/drm/mgag200/Kconfig| 18 --
 drivers/gpu/drm/mgag200/mgag200_drv.c  | 23 ++-
 drivers/gpu/drm/mgag200/mgag200_mode.c |  8 
 3 files changed, 14 insertions(+), 35 deletions(-)


base-commit: 3179338750d83877bbc491493032bdf192266ad9
-- 
2.45.0



Re: [PATCH 2/2] drm/mgag200: Add an option to disable Write-Combine

2024-05-17 Thread Jocelyn Falempe




On 17/05/2024 11:39, Thomas Zimmermann wrote:

Hi,

just nits below.

Am 16.05.24 um 18:17 schrieb Jocelyn Falempe:

Unfortunately, the G200 ioburst workaround doesn't work on some servers
like Dell poweredge XR11, XR5610, or HPE XL260
In this case completely disabling WC is the only option to achieve
low-latency.
So this adds a new Kconfig option, to disable WC mapping of the G200


The formatting look off. Maybe make this one single paragraph.


Ok,


No comma after 'option'.

Ok,




Signed-off-by: Jocelyn Falempe 
---
  drivers/gpu/drm/mgag200/Kconfig   | 10 ++
  drivers/gpu/drm/mgag200/mgag200_drv.c |  7 +++
  2 files changed, 17 insertions(+)

diff --git a/drivers/gpu/drm/mgag200/Kconfig 
b/drivers/gpu/drm/mgag200/Kconfig

index b28c5e4828f47..73ab5730b74d9 100644
--- a/drivers/gpu/drm/mgag200/Kconfig
+++ b/drivers/gpu/drm/mgag200/Kconfig
@@ -11,3 +11,13 @@ config DRM_MGAG200
   MGA G200 desktop chips and the server variants. It requires 0.3.0
   of the modesetting userspace driver, and a version of mga driver
   that will fail on KMS enabled devices.
+
+config DRM_MGAG200_DISABLE_WRITECOMBINE
+    bool "Disable Write Combine mapping of VRAM"
+    depends on DRM_MGAG200 && PREEMPT_RT
+    help
+  The VRAM of the G200 is mapped with Write-Combine, to improve

No comma after Write-Combine


Ok
+  performances. However this increases the system latency a lot, 
even
Just say "This can interfere with real-time tasks; even if they are 
running on other CPU cores then the graphics output."


Ok



+  for realtime tasks running on other CPU cores. Typically 40us-80us
+  latencies are measured with hwlat when Write Combine is enabled.


Leave out the next sentence: "Typically ..." The measureed numbers 
depend on the hardware and everyone is encouraged to test on their own 
system. You could mention  the numbers in the commit description, as you 
already mention the affected systems there.


Ok


+  Recommended if you run realtime tasks on a server with a Matrox 
G200.


I still think that we should not encourage anyone to use this option. 
Maybe say "Enable this option only if you run..."


Agreed, I will change this.



\ No newline at end of file
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c 
b/drivers/gpu/drm/mgag200/mgag200_drv.c

index 3883f25ca4d8b..7461e3f984eff 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.c
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.c
@@ -146,12 +146,19 @@ int mgag200_device_preinit(struct mga_device *mdev)
  }
  mdev->vram_res = res;
+#if defined(CONFIG_DRM_MGAG200_DISABLE_WRITECOMBINE)
+    drm_info(dev, "Disable Write Combine\n");


I would not print this drm_info() here. The user has selected the config 
option, so they should know what happens. It's also listed in /proc/mtrr 
IIRC.


Sure, I can remove the drm_info().

Thanks for the review, I will send a v2 shortly.


Best regards
Thomas


+    mdev->vram = devm_ioremap(dev->dev, res->start, resource_size(res));
+    if (!mdev->vram)
+    return -ENOMEM;
+#else
  mdev->vram = devm_ioremap_wc(dev->dev, res->start, 
resource_size(res));

  if (!mdev->vram)
  return -ENOMEM;
  /* Don't fail on errors, but performance might be reduced. */
  devm_arch_phys_wc_add(dev->dev, res->start, resource_size(res));
+#endif
  return 0;
  }






Re: [PATCH] drm/edid: add non-desktop quirk to Bigscreen Beyond HMD

2024-05-17 Thread Philipp Zabel
On Fr, 2024-05-17 at 16:09 +0200, Sefa Eyeoglu wrote:
> The Bigscreen Beyond VR headset is a non-desktop output and should be
> marked as such using an EDID quirk.
> 
> Closes https://gitlab.freedesktop.org/drm/misc/kernel/-/issues/39

>From the EDID posted there, it looks like the quirk should not be
necessary?

The quoted DisplayID extension block correctly marks this as an HMD:

  "Display Product Primary Use Case: Head-mounted Virtual Reality (VR) display"

The update_displayid_info() function in drm_edid.c should use this
information to set the non_desktop flag already. Doesn't this work as
expected?


regards
Philipp


Re: [PATCH v2] drm/buddy: Fix the warn on's during force merge

2024-05-17 Thread Paneer Selvam, Arunpravin




On 5/17/2024 7:30 PM, Matthew Auld wrote:

On 17/05/2024 14:50, Arunpravin Paneer Selvam wrote:

Move the fallback and block incompatible checks
above, so that we dont unnecessarily split the blocks
and leaving the unmerged. This resolves the unnecessary
warn on's thrown during force_merge call.

v2:(Matthew)
   - Move the fallback and block incompatible checks above
 the contains check.

Signed-off-by: Arunpravin Paneer Selvam 


Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")

Reviewed-by: Matthew Auld 

A follow up unit test to catch this edge case would be lovely.

Yes, I will follow up on this.

Thanks,
Arun.



---
  drivers/gpu/drm/drm_buddy.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 1daf778cf6fa..94f8c34fc293 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -524,11 +524,11 @@ __alloc_range_bias(struct drm_buddy *mm,
  continue;
  }
  +    if (!fallback && block_incompatible(block, flags))
+    continue;
+
  if (contains(start, end, block_start, block_end) &&
  order == drm_buddy_block_order(block)) {
-    if (!fallback && block_incompatible(block, flags))
-    continue;
-
  /*
   * Find the free block within the range.
   */




[PATCH v2] drm/buddy: Fix the warn on's during force merge

2024-05-17 Thread Arunpravin Paneer Selvam
Move the fallback and block incompatible checks
above, so that we dont unnecessarily split the blocks
and leaving the unmerged. This resolves the unnecessary
warn on's thrown during force_merge call.

v2:(Matthew)
  - Move the fallback and block incompatible checks above
the contains check.

Signed-off-by: Arunpravin Paneer Selvam 
Reviewed-by: Matthew Auld 
Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")
Link: 
https://patchwork.kernel.org/project/dri-devel/patch/20240517135015.17565-1-arunpravin.paneersel...@amd.com/
---
 drivers/gpu/drm/drm_buddy.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 1daf778cf6fa..94f8c34fc293 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -524,11 +524,11 @@ __alloc_range_bias(struct drm_buddy *mm,
continue;
}
 
+   if (!fallback && block_incompatible(block, flags))
+   continue;
+
if (contains(start, end, block_start, block_end) &&
order == drm_buddy_block_order(block)) {
-   if (!fallback && block_incompatible(block, flags))
-   continue;
-
/*
 * Find the free block within the range.
 */
-- 
2.25.1



Re: [PATCH] drm/edid: add non-desktop quirk to Bigscreen Beyond HMD

2024-05-17 Thread Sefa Eyeoglu
Hi Jani,

I have just just posted a modified patch with a link to the issue.

Best,
Sefa

On Fri, 2024-05-17 at 15:54 +0300, Jani Nikula wrote:
> On Fri, 17 May 2024, Sefa Eyeoglu  wrote:
> > The Bigscreen Beyond VR headset is a non-desktop output and should
> > be
> > marked as such using an EDID quirk.
> 
> I'd appreciate a bug being filed at [1], attaching the EDID of the
> panel
> there, maybe dmesg with drm.debug=14 enabled too, and referencing the
> bug in the commit message. It gets terribly hard to figure anything
> out
> about the quirks afterwards when some time has passed.
> 
> Thanks,
> Jani.
> 
> 
> 
> [1] https://gitlab.freedesktop.org/drm/misc/kernel/-/issues
> 
> > 
> > Signed-off-by: Sefa Eyeoglu 
> > ---
> >  drivers/gpu/drm/drm_edid.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_edid.c
> > b/drivers/gpu/drm/drm_edid.c
> > index 4f54c91b31b2..d407efc0fb55 100644
> > --- a/drivers/gpu/drm/drm_edid.c
> > +++ b/drivers/gpu/drm/drm_edid.c
> > @@ -200,6 +200,9 @@ static const struct edid_quirk {
> >     /* Rotel RSX-1058 forwards sink's EDID but only does HDMI
> > 1.1*/
> >     EDID_QUIRK('E', 'T', 'R', 13896, EDID_QUIRK_FORCE_8BPC),
> >  
> > +   /* Bigscreen Beyond Headset */
> > +   EDID_QUIRK('B', 'I', 'G', 0x1234, EDID_QUIRK_NON_DESKTOP),
> > +
> >     /* Valve Index Headset */
> >     EDID_QUIRK('V', 'L', 'V', 0x91a8, EDID_QUIRK_NON_DESKTOP),
> >     EDID_QUIRK('V', 'L', 'V', 0x91b0, EDID_QUIRK_NON_DESKTOP),
> 



signature.asc
Description: This is a digitally signed message part


[PATCH] drm/edid: add non-desktop quirk to Bigscreen Beyond HMD

2024-05-17 Thread Sefa Eyeoglu
The Bigscreen Beyond VR headset is a non-desktop output and should be
marked as such using an EDID quirk.

Closes https://gitlab.freedesktop.org/drm/misc/kernel/-/issues/39

Signed-off-by: Sefa Eyeoglu 
---
 drivers/gpu/drm/drm_edid.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 4f54c91b31b2..d407efc0fb55 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -200,6 +200,9 @@ static const struct edid_quirk {
/* Rotel RSX-1058 forwards sink's EDID but only does HDMI 1.1*/
EDID_QUIRK('E', 'T', 'R', 13896, EDID_QUIRK_FORCE_8BPC),
 
+   /* Bigscreen Beyond Headset */
+   EDID_QUIRK('B', 'I', 'G', 0x1234, EDID_QUIRK_NON_DESKTOP),
+
/* Valve Index Headset */
EDID_QUIRK('V', 'L', 'V', 0x91a8, EDID_QUIRK_NON_DESKTOP),
EDID_QUIRK('V', 'L', 'V', 0x91b0, EDID_QUIRK_NON_DESKTOP),
-- 
2.44.0



Re: [PATCH v2] drm/buddy: Fix the warn on's during force merge

2024-05-17 Thread Matthew Auld

On 17/05/2024 14:50, Arunpravin Paneer Selvam wrote:

Move the fallback and block incompatible checks
above, so that we dont unnecessarily split the blocks
and leaving the unmerged. This resolves the unnecessary
warn on's thrown during force_merge call.

v2:(Matthew)
   - Move the fallback and block incompatible checks above
 the contains check.

Signed-off-by: Arunpravin Paneer Selvam 
Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")

Reviewed-by: Matthew Auld 

A follow up unit test to catch this edge case would be lovely.


---
  drivers/gpu/drm/drm_buddy.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 1daf778cf6fa..94f8c34fc293 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -524,11 +524,11 @@ __alloc_range_bias(struct drm_buddy *mm,
continue;
}
  
+		if (!fallback && block_incompatible(block, flags))

+   continue;
+
if (contains(start, end, block_start, block_end) &&
order == drm_buddy_block_order(block)) {
-   if (!fallback && block_incompatible(block, flags))
-   continue;
-
/*
 * Find the free block within the range.
 */


Re: [PATCH 2/2] drm/i915: Don't treat FLR resets as errors

2024-05-17 Thread Nirmoy Das

Hi Andi,

On 5/17/2024 1:25 PM, Andi Shyti wrote:

If we timeout while waiting for an FLR reset, there is nothing we
can do and i915 doesn't have any control on it. In any case the
system is still perfectly usable


If a FLR reset fails then we will have a dead GPU, I don't think the GPU 
is usable without a cold reboot.


This is a serious issue and should be report as an error.  I think we 
need to create a HW ticket to understand


why is FLR reset fails.


Regards,

Nirmoy




  and the function returns void.

We don't need to be alarmed, therefore, print the timeout
expiration as a debug message instead of an error.

Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10955
Signed-off-by: Andi Shyti 
---
  drivers/gpu/drm/i915/intel_uncore.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_uncore.c 
b/drivers/gpu/drm/i915/intel_uncore.c
index 2eba289d88ad..a3fa2ed91aae 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -2637,7 +2637,7 @@ static void driver_initiated_flr(struct intel_uncore 
*uncore)
 */
ret = intel_wait_for_register_fw(uncore, GU_CNTL, DRIVERFLR, 0, 
flr_timeout_ms);
if (ret) {
-   drm_err(>drm,
+   drm_dbg(>drm,
"Failed to wait for Driver-FLR bit to clear! %d\n",
ret);
return;
@@ -2652,7 +2652,7 @@ static void driver_initiated_flr(struct intel_uncore 
*uncore)
 DRIVERFLR, 0,
 flr_timeout_ms);
if (ret) {
-   drm_err(>drm, "Driver-FLR-teardown wait completion failed! 
%d\n", ret);
+   drm_dbg(>drm, "Driver-FLR-teardown wait completion failed! 
%d\n", ret);
return;
}
  
@@ -2661,7 +2661,7 @@ static void driver_initiated_flr(struct intel_uncore *uncore)

 DRIVERFLR_STATUS, DRIVERFLR_STATUS,
 flr_timeout_ms);
if (ret) {
-   drm_err(>drm, "Driver-FLR-reinit wait completion failed! 
%d\n", ret);
+   drm_dbg(>drm, "Driver-FLR-reinit wait completion failed! 
%d\n", ret);
return;
}
  


Re: [PATCH v2] drm/buddy: Fix the warn on's during force merge

2024-05-17 Thread Paneer Selvam, Arunpravin

Hi Matthew,
This fixes the problem.

Regards,
Arun.

On 5/17/2024 7:20 PM, Arunpravin Paneer Selvam wrote:

Move the fallback and block incompatible checks
above, so that we dont unnecessarily split the blocks
and leaving the unmerged. This resolves the unnecessary
warn on's thrown during force_merge call.

v2:(Matthew)
   - Move the fallback and block incompatible checks above
 the contains check.

Signed-off-by: Arunpravin Paneer Selvam 
Fixes: 96950929eb23 ("drm/buddy: Implement tracking clear page feature")
---
  drivers/gpu/drm/drm_buddy.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 1daf778cf6fa..94f8c34fc293 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -524,11 +524,11 @@ __alloc_range_bias(struct drm_buddy *mm,
continue;
}
  
+		if (!fallback && block_incompatible(block, flags))

+   continue;
+
if (contains(start, end, block_start, block_end) &&
order == drm_buddy_block_order(block)) {
-   if (!fallback && block_incompatible(block, flags))
-   continue;
-
/*
 * Find the free block within the range.
 */




  1   2   3   4   5   6   7   8   9   10   >