[syzbot] [dri?] linux-next boot error: WARNING: bad unlock balance in vkms_vblank_simulate

2023-09-16 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:e143016b56ec Add linux-next specific files for 20230913
git tree:   linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=164c5ac7a8
kernel config:  https://syzkaller.appspot.com/x/.config?x=76ee1517f109f01a
dashboard link: https://syzkaller.appspot.com/bug?extid=204dd7e9a83cb8855b63
compiler:   gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 
2.40

Downloadable assets:
disk image: 
https://storage.googleapis.com/syzbot-assets/845fe7fc2fee/disk-e143016b.raw.xz
vmlinux: 
https://storage.googleapis.com/syzbot-assets/d74646a84425/vmlinux-e143016b.xz
kernel image: 
https://storage.googleapis.com/syzbot-assets/bfbe2696ea96/bzImage-e143016b.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+204dd7e9a83cb8855...@syzkaller.appspotmail.com


=
WARNING: bad unlock balance detected!
6.6.0-rc1-next-20230913-syzkaller #0 Not tainted
-
swapper/0/0 is trying to release lock (&vkms_out->enabled_lock) at:
[] vkms_vblank_simulate+0x159/0x3d0 
drivers/gpu/drm/vkms/vkms_crtc.c:34
but there are no more locks to release!

other info that might help us debug this:
no locks held by swapper/0/0.

stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.6.0-rc1-next-20230913-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
08/04/2023
Call Trace:
 
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
 __lock_release kernel/locking/lockdep.c:5430 [inline]
 lock_release+0x4b5/0x680 kernel/locking/lockdep.c:5773
 __mutex_unlock_slowpath+0xa3/0x640 kernel/locking/mutex.c:907
 vkms_vblank_simulate+0x159/0x3d0 drivers/gpu/drm/vkms/vkms_crtc.c:34
 __run_hrtimer kernel/time/hrtimer.c:1688 [inline]
 __hrtimer_run_queues+0x203/0xc10 kernel/time/hrtimer.c:1752
 hrtimer_interrupt+0x31b/0x800 kernel/time/hrtimer.c:1814
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1063 [inline]
 __sysvec_apic_timer_interrupt+0x105/0x3f0 arch/x86/kernel/apic/apic.c:1080
 sysvec_apic_timer_interrupt+0x8e/0xc0 arch/x86/kernel/apic/apic.c:1074
 
 
 asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:645
RIP: 0010:native_irq_disable arch/x86/include/asm/irqflags.h:37 [inline]
RIP: 0010:arch_local_irq_disable arch/x86/include/asm/irqflags.h:72 [inline]
RIP: 0010:acpi_safe_halt+0x1b/0x20 drivers/acpi/processor_idle.c:113
Code: ed c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 65 48 8b 04 25 c0 bc 03 00 
48 8b 00 a8 08 75 0c 66 90 0f 00 2d 17 b9 b2 00 fb f4  c3 0f 1f 00 0f b6 47 
08 3c 01 74 0b 3c 02 74 05 8b 7f 04 eb 9f
RSP: :8c807d70 EFLAGS: 0246
RAX: 4000 RBX: 0001 RCX: 8a41858e
RDX: 0001 RSI: 88801368d800 RDI: 88801368d864
RBP: 88801368d864 R08: 0001 R09: ed1017306dbd
R10: 8880b9836deb R11:  R12: 88801730
R13: 8d661d60 R14:  R15: 
 acpi_idle_enter+0xc5/0x160 drivers/acpi/processor_idle.c:707
 cpuidle_enter_state+0x82/0x500 drivers/cpuidle/cpuidle.c:267
 cpuidle_enter+0x4e/0xa0 drivers/cpuidle/cpuidle.c:388
 cpuidle_idle_call kernel/sched/idle.c:215 [inline]
 do_idle+0x315/0x3f0 kernel/sched/idle.c:282
 cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:379
 rest_init+0x16f/0x2b0 init/main.c:726
 arch_call_rest_init+0x13/0x30 init/main.c:823
 start_kernel+0x39f/0x480 init/main.c:1068
 x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:556
 x86_64_start_kernel+0xb2/0xc0 arch/x86/kernel/head64.c:537
 secondary_startup_64_no_verify+0x166/0x16b
 

Code disassembly (best guess):
   0:   ed  in (%dx),%eax
   1:   c3  ret
   2:   66 66 2e 0f 1f 84 00data16 cs nopw 0x0(%rax,%rax,1)
   9:   00 00 00 00
   d:   66 90   xchg   %ax,%ax
   f:   65 48 8b 04 25 c0 bcmov%gs:0x3bcc0,%rax
  16:   03 00
  18:   48 8b 00mov(%rax),%rax
  1b:   a8 08   test   $0x8,%al
  1d:   75 0c   jne0x2b
  1f:   66 90   xchg   %ax,%ax
  21:   0f 00 2d 17 b9 b2 00verw   0xb2b917(%rip)# 0xb2b93f
  28:   fb  sti
  29:   f4  hlt
* 2a:   fa  cli <-- trapping instruction
  2b:   c3  ret
  2c:   0f 1f 00nopl   (%rax)
  2f:   0f b6 47 08 movzbl 0x8(%rdi),%eax
  33:   3c 01   cmp$0x1,%al
  35:   74 0b   je 0x42
  37:   3c 02   cmp$0x2,%al
  39:   74 05   je 0x40
  3b:   8b 7f 04mov0x4(%rdi),%edi
  3e:   eb 9f   jmp0xffdf


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more in

RE: [PATCH v1 0/3] udmabuf: Add support for page migration out of movable zone or CMA

2023-09-16 Thread Kasireddy, Vivek
Hi David,

> >> I think it makes sense to have a generic (non-GUP) version of
> >> check_and_migrate_movable_pages() available in migration.h that
> >> drivers can use to ensure that they don't break memory hotunplug
> >> accidentally.
> >
> > Definately not.
> >
> > Either use the VMA and pin_user_pages(), or implement
> > pin_user_pages_fd() in core code.
> >
> > Do not open code something wonky in drivers.
> 
> Agreed. pin_user_pages_fd() might become relevant in the context of
> vfio/mdev + KVM gmem -- don't mmap guest memory but instead provide it
> via a special memfd to the kernel.
> 
> So there might be value in having such a core infrastructure.
Ok, I'll work on adding pin_user_pages_fd() soon.

Thanks,
Vivek
> 
> --
> Cheers,
> 
> David / dhildenb



Re: [PATCH v3 12/13] drm/sched/doc: Add Entity teardown documentaion

2023-09-16 Thread Danilo Krummrich

On 9/12/23 04:16, Matthew Brost wrote:

Provide documentation to guide in ways to teardown an entity.

Signed-off-by: Matthew Brost 
---
  Documentation/gpu/drm-mm.rst |  6 ++
  drivers/gpu/drm/scheduler/sched_entity.c | 19 +++
  2 files changed, 25 insertions(+)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index c19b34b1c0ed..cb4d6097897e 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -552,6 +552,12 @@ Overview
  .. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
 :doc: Overview
  
+Entity teardown

+---


While I think it is good to document this as well, my concern was more about 
tearing
down the drm_gpu_scheduler. (See also my response to patch 11 of this series.)

How do we ensure that the pending_list is actually empty before calling
drm_sched_fini()? If we don't, we potentially leak memory.

For instance, we could let drm_sched_fini() (or a separate drm_sched_teardown())
cancel run work first and leave free work running until the pending_list is 
empty.

If we think drivers should take care themselves (e.g. through reference 
counting jobs
per scheduler), we should document this and explain why we can't have the 
scheduler do
this for us.


+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_entity.c
+   :doc: Entity teardown
+
  Scheduler Function References
  -
  
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c

index 37557fbb96d0..76f3e10218bb 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -21,6 +21,25 @@
   *
   */
  
+/**

+ * DOC: Entity teardown
+ *
+ * Drivers can teardown down an entity for several reasons. Reasons typically
+ * are a user closes the entity via an IOCTL, the FD associated with the entity
+ * is closed, or the entity encounters an error. The GPU scheduler provides the
+ * basic infrastructure to do this in a few different ways.
+ *
+ * 1. Let the entity run dry (both the pending list and job queue) and then 
call
+ * drm_sched_entity_fini. The backend can accelerate the process of running 
dry.
+ * For example set a flag so run_job is a NOP and set the TDR to a low value to
+ * signal all jobs in a timely manner (this example works for
+ * DRM_SCHED_POLICY_SINGLE_ENTITY).
+ *
+ * 2. Kill the entity directly via drm_sched_entity_flush /
+ * drm_sched_entity_fini ensuring all pending and queued jobs are off the
+ * hardware and signaled.
+ */
+
  #include 
  #include 
  #include 


Re: [PATCH v3 11/13] drm/sched: Waiting for pending jobs to complete in scheduler kill

2023-09-16 Thread Danilo Krummrich

On 9/12/23 16:47, Matthew Brost wrote:

On Tue, Sep 12, 2023 at 11:57:30AM +0200, Christian König wrote:

Am 12.09.23 um 04:16 schrieb Matthew Brost:

Wait for pending jobs to be complete before signaling queued jobs. This
ensures dma-fence signaling order correct and also ensures the entity is
not running on the hardware after drm_sched_entity_flush or
drm_sched_entity_fini returns.


Entities are *not* supposed to outlive the submissions they carry and we
absolutely *can't* wait for submissions to finish while killing the entity.

In other words it is perfectly expected that entities doesn't exists any
more while the submissions they carried are still running on the hardware.

I somehow better need to document how this working and especially why it is
working like that.

This approach came up like four or five times now and we already applied and
reverted patches doing this.

For now let's take a look at the source code of drm_sched_entity_kill():

    /* The entity is guaranteed to not be used by the scheduler */
     prev = rcu_dereference_check(entity->last_scheduled, true);
     dma_fence_get(prev);

     while ((job = to_drm_sched_job(spsc_queue_pop(&entity->job_queue
{
     struct drm_sched_fence *s_fence = job->s_fence;

     dma_fence_get(&s_fence->finished);
     if (!prev || dma_fence_add_callback(prev, &job->finish_cb,
drm_sched_entity_kill_jobs_cb))
     drm_sched_entity_kill_jobs_cb(NULL,
&job->finish_cb);

     prev = &s_fence->finished;
     }
     dma_fence_put(prev);

This ensures the dma-fence signaling order by delegating signaling of the
scheduler fences into callbacks.



Thanks for the explaination, this code makes more sense now. Agree this
patch is not correct.

This patch really is an RFC for something Nouveau needs, I can delete
this patch in the next rev and let Nouveau run with a slightly different
version if needed.


Maybe there was a misunderstanding, I do not see any need for this in Nouveau.

Instead, what I think we need is a way to wait for the pending_list being empty
(meaning all jobs on the pending_list are freed) before we call 
drm_sched_fini().

Currently, if we call drm_sched_fini() there might still be pending jobs on the
pending_list (unless the driver implements something driver specific).
drm_sched_fini() stops the scheduler work though, hence pending jobs will never 
be
freed up leaking their memory.

This might also be true for existing drivers, but since they call 
drm_sched_fini()
from their driver remove callback, this race is extremely unlikely. However, it
definitely is an issue for drivers using the single entitiy policy calling
drm_sched_fini() from a context where it is much more likely pending jobs still
exist.



Matt


Regards,
Christian.



Signed-off-by: Matthew Brost 
---
   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |  2 +-
   drivers/gpu/drm/scheduler/sched_entity.c|  7 ++-
   drivers/gpu/drm/scheduler/sched_main.c  | 50 ++---
   include/drm/gpu_scheduler.h | 18 
   4 files changed, 70 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index fb5dad687168..7835c0da65c5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -1873,7 +1873,7 @@ static void amdgpu_ib_preempt_mark_partial_job(struct 
amdgpu_ring *ring)
list_for_each_entry_safe(s_job, tmp, &sched->pending_list, list) {
if (dma_fence_is_signaled(&s_job->s_fence->finished)) {
/* remove job from ring_mirror_list */
-   list_del_init(&s_job->list);
+   drm_sched_remove_pending_job(s_job);
sched->ops->free_job(s_job);
continue;
}
diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
b/drivers/gpu/drm/scheduler/sched_entity.c
index 1dec97caaba3..37557fbb96d0 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -104,9 +104,11 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
}
init_completion(&entity->entity_idle);
+   init_completion(&entity->jobs_done);
-   /* We start in an idle state. */
+   /* We start in an idle and jobs done state. */
complete_all(&entity->entity_idle);
+   complete_all(&entity->jobs_done);
spin_lock_init(&entity->rq_lock);
spsc_queue_init(&entity->job_queue);
@@ -256,6 +258,9 @@ static void drm_sched_entity_kill(struct drm_sched_entity 
*entity)
/* Make sure this entity is not used by the scheduler at the moment */
wait_for_completion(&entity->entity_idle);
+   /* Make sure all pending jobs are done */
+   wait_for_completion(&entity->jobs_done);
+
/* The entity is guaranteed to not be 

Re: [PATCH v3 02/13] drm/sched: Convert drm scheduler to use a work queue rather than kthread

2023-09-16 Thread Danilo Krummrich

On 9/12/23 04:16, Matthew Brost wrote:

In XE, the new Intel GPU driver, a choice has made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.

1. In XE the submission order from multiple drm_sched_entity is not
guaranteed to be the same completion even if targeting the same hardware
engine. This is because in XE we have a firmware scheduler, the GuC,
which allowed to reorder, timeslice, and preempt submissions. If a using
shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.

2. In XE submissions are done via programming a ring buffer (circular
buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
control on the ring for free.

A problem with this design is currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.

v2:
   - (Rob Clark) Fix msm build
   - Pass in run work queue
v3:
   - (Boris) don't have loop in worker
v4:
   - (Tvrtko) break out submit ready, stop, start helpers into own patch

Signed-off-by: Matthew Brost 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |   2 +-
  drivers/gpu/drm/etnaviv/etnaviv_sched.c|   2 +-
  drivers/gpu/drm/lima/lima_sched.c  |   2 +-
  drivers/gpu/drm/msm/msm_ringbuffer.c   |   2 +-
  drivers/gpu/drm/nouveau/nouveau_sched.c|   2 +-
  drivers/gpu/drm/panfrost/panfrost_job.c|   2 +-
  drivers/gpu/drm/scheduler/sched_main.c | 106 +
  drivers/gpu/drm/v3d/v3d_sched.c|  10 +-
  include/drm/gpu_scheduler.h|  12 ++-
  9 files changed, 65 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 1f8a794704d0..c83a76bccc1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2305,7 +2305,7 @@ static int amdgpu_device_init_schedulers(struct 
amdgpu_device *adev)
break;
}
  
-		r = drm_sched_init(&ring->sched, &amdgpu_sched_ops,

+   r = drm_sched_init(&ring->sched, &amdgpu_sched_ops, NULL,
   ring->num_hw_submission, 0,
   timeout, adev->reset_domain->wq,
   ring->sched_score, ring->name,
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_sched.c 
b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
index 345fec6cb1a4..618a804ddc34 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_sched.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -134,7 +134,7 @@ int etnaviv_sched_init(struct etnaviv_gpu *gpu)
  {
int ret;
  
-	ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops,

+   ret = drm_sched_init(&gpu->sched, &etnaviv_sched_ops, NULL,
 etnaviv_hw_jobs_limit, etnaviv_job_hang_limit,
 msecs_to_jiffies(500), NULL, NULL,
 dev_name(gpu->dev), gpu->dev);
diff --git a/drivers/gpu/drm/lima/lima_sched.c 
b/drivers/gpu/drm/lima/lima_sched.c
index ffd91a5ee299..8d858aed0e56 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -488,7 +488,7 @@ int lima_sched_pipe_init(struct lima_sched_pipe *pipe, 
const char *name)
  
  	INIT_WORK(&pipe->recover_work, lima_sched_recover_work);
  
-	return drm_sched_init(&pipe->base, &lima_sched_ops, 1,

+   return drm_sched_init(&pipe->base, &lima_sched_ops, NULL, 1,
  lima_job_hang_limit,
  msecs_to_jiffies(timeout), NULL,
  NULL, name, pipe->ldev->dev);
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c 
b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 40c0bc35a44c..b8865e61b40f 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -94,7 +94,7 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu 
*gpu, int id,
 /* currently managing hangcheck ourselves: */
sched_timeout = MAX_SCHEDULE_TIMEOUT;
  
-	ret = drm_sched_init(&ring->sched, &msm_sched_ops,

+   ret = drm_sched_init(&ring->sched, &msm_sched_ops, NULL,
num_hw_submissions, 0, sched_timeout,
NULL, NULL, to_msm_bo(ring->bo)->name, gpu->dev->dev);
if (ret) {
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c 
b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 88217185e0f3..d458c2227d4f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -429,7 +429,7 @@ int nouveau_

[PATCH] drm/nouveau: sched: fix leaking memory of timedout job

2023-09-16 Thread Danilo Krummrich
Always stop and re-start the scheduler in order to let the scheduler
free up the timedout job in case it got signaled. In case of exec jobs
the job type specific callback will take care to signal all fences and
tear down the channel.

Fixes: b88baab82871 ("drm/nouveau: implement new VM_BIND uAPI")
Signed-off-by: Danilo Krummrich 
---
 drivers/gpu/drm/nouveau/nouveau_exec.c  |  2 +-
 drivers/gpu/drm/nouveau/nouveau_sched.c | 12 +---
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_exec.c 
b/drivers/gpu/drm/nouveau/nouveau_exec.c
index 9c031d15fe0b..49d83ac9e036 100644
--- a/drivers/gpu/drm/nouveau/nouveau_exec.c
+++ b/drivers/gpu/drm/nouveau/nouveau_exec.c
@@ -185,7 +185,7 @@ nouveau_exec_job_timeout(struct nouveau_job *job)
 
nouveau_sched_entity_fini(job->entity);
 
-   return DRM_GPU_SCHED_STAT_ENODEV;
+   return DRM_GPU_SCHED_STAT_NOMINAL;
 }
 
 static struct nouveau_job_ops nouveau_exec_job_ops = {
diff --git a/drivers/gpu/drm/nouveau/nouveau_sched.c 
b/drivers/gpu/drm/nouveau/nouveau_sched.c
index 88217185e0f3..3b7ea5221226 100644
--- a/drivers/gpu/drm/nouveau/nouveau_sched.c
+++ b/drivers/gpu/drm/nouveau/nouveau_sched.c
@@ -375,14 +375,20 @@ nouveau_sched_run_job(struct drm_sched_job *sched_job)
 static enum drm_gpu_sched_stat
 nouveau_sched_timedout_job(struct drm_sched_job *sched_job)
 {
+   struct drm_gpu_scheduler *sched = sched_job->sched;
struct nouveau_job *job = to_nouveau_job(sched_job);
+   enum drm_gpu_sched_stat stat = DRM_GPU_SCHED_STAT_NOMINAL;
 
-   NV_PRINTK(warn, job->cli, "Job timed out.\n");
+   drm_sched_stop(sched, sched_job);
 
if (job->ops->timeout)
-   return job->ops->timeout(job);
+   stat = job->ops->timeout(job);
+   else
+   NV_PRINTK(warn, job->cli, "Generic job timeout.\n");
+
+   drm_sched_start(sched, true);
 
-   return DRM_GPU_SCHED_STAT_ENODEV;
+   return stat;
 }
 
 static void
-- 
2.41.0



Re: [Nouveau] [PATCH] nouveau/u_memcpya: fix NULL vs error pointer bug

2023-09-16 Thread Danilo Krummrich

On 9/16/23 16:26, Dan Carpenter wrote:

On Sat, Sep 16, 2023 at 05:24:04PM +0300, Dan Carpenter wrote:

On Sat, Sep 16, 2023 at 01:41:43AM +0200, Danilo Krummrich wrote:

Hi Dan,

On 9/15/23 14:59, Dan Carpenter wrote:

The u_memcpya() function is supposed to return error pointers on
error.  Returning NULL will lead to an Oops.

Fixes: 68132cc6d1bc ("nouveau/u_memcpya: use vmemdup_user")
Signed-off-by: Dan Carpenter 
---
   drivers/gpu/drm/nouveau/nouveau_drv.h | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
b/drivers/gpu/drm/nouveau/nouveau_drv.h
index 3666a7403e47..52a708a98915 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.h
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
@@ -193,7 +193,7 @@ u_memcpya(uint64_t user, unsigned int nmemb, unsigned int 
size)
size_t bytes;
if (unlikely(check_mul_overflow(nmemb, size, &bytes)))
-   return NULL;
+   return ERR_PTR(-ENOMEM);


I plan to replace this function with an upcoming vmemdup_array_user() helper,
which returns -EOVERFLOW instead, hence mind using that?

Unless you disagree, no need to resubmit the patch, I can change it
before applying the patch.


Generally, I would say that ENOMEM is the correct error code.  I feel
like someone thinks EOVERFLOW means integer overflow and that's not
correct.  I means like if you pass a number higher than INT_MAX to
kstroint().


The most common error code for integer overflows is EINVAL because the
user passed invalid data.


I totally agree with that, and my choice would have been EINVAL as well. It's 
just
that it seems (v)memdup_array_user() [1] goes with that and hence I'd just go 
along.

[1] 
https://lore.kernel.org/lkml/93001a9f3f101be0f374080090f9c32df73ca773.1694202430.git.pstan...@redhat.com/



regards,
dan carpenter





Re: [PATCH v2 1/5] string.h: add array-wrappers for (v)memdup_user()

2023-09-16 Thread Dan Carpenter
On Fri, Sep 08, 2023 at 09:59:40PM +0200, Philipp Stanner wrote:
> +static inline void *memdup_array_user(const void __user *src, size_t n, 
> size_t size)
> +{
> + size_t nbytes;
> +
> + if (unlikely(check_mul_overflow(n, size, &nbytes)))
> + return ERR_PTR(-EOVERFLOW);

No need for an unlikely() because check_mul_overflow() already has one
inside.  I feel like -ENOMEM is more traditional but I doubt anyone/userspace
cares.

> +
> + return memdup_user(src, nbytes);
> +}

regards,
dan carpenter


Re: [Nouveau] [PATCH] nouveau/u_memcpya: fix NULL vs error pointer bug

2023-09-16 Thread Dan Carpenter
On Sat, Sep 16, 2023 at 05:24:04PM +0300, Dan Carpenter wrote:
> On Sat, Sep 16, 2023 at 01:41:43AM +0200, Danilo Krummrich wrote:
> > Hi Dan,
> > 
> > On 9/15/23 14:59, Dan Carpenter wrote:
> > > The u_memcpya() function is supposed to return error pointers on
> > > error.  Returning NULL will lead to an Oops.
> > > 
> > > Fixes: 68132cc6d1bc ("nouveau/u_memcpya: use vmemdup_user")
> > > Signed-off-by: Dan Carpenter 
> > > ---
> > >   drivers/gpu/drm/nouveau/nouveau_drv.h | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
> > > b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > > index 3666a7403e47..52a708a98915 100644
> > > --- a/drivers/gpu/drm/nouveau/nouveau_drv.h
> > > +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > > @@ -193,7 +193,7 @@ u_memcpya(uint64_t user, unsigned int nmemb, unsigned 
> > > int size)
> > >   size_t bytes;
> > >   if (unlikely(check_mul_overflow(nmemb, size, &bytes)))
> > > - return NULL;
> > > + return ERR_PTR(-ENOMEM);
> > 
> > I plan to replace this function with an upcoming vmemdup_array_user() 
> > helper,
> > which returns -EOVERFLOW instead, hence mind using that?
> > 
> > Unless you disagree, no need to resubmit the patch, I can change it
> > before applying the patch.
> 
> Generally, I would say that ENOMEM is the correct error code.  I feel
> like someone thinks EOVERFLOW means integer overflow and that's not
> correct.  I means like if you pass a number higher than INT_MAX to
> kstroint().

The most common error code for integer overflows is EINVAL because the
user passed invalid data.

regards,
dan carpenter



Re: [Nouveau] [PATCH] nouveau/u_memcpya: fix NULL vs error pointer bug

2023-09-16 Thread Dan Carpenter
On Sat, Sep 16, 2023 at 01:41:43AM +0200, Danilo Krummrich wrote:
> Hi Dan,
> 
> On 9/15/23 14:59, Dan Carpenter wrote:
> > The u_memcpya() function is supposed to return error pointers on
> > error.  Returning NULL will lead to an Oops.
> > 
> > Fixes: 68132cc6d1bc ("nouveau/u_memcpya: use vmemdup_user")
> > Signed-off-by: Dan Carpenter 
> > ---
> >   drivers/gpu/drm/nouveau/nouveau_drv.h | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.h 
> > b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > index 3666a7403e47..52a708a98915 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_drv.h
> > +++ b/drivers/gpu/drm/nouveau/nouveau_drv.h
> > @@ -193,7 +193,7 @@ u_memcpya(uint64_t user, unsigned int nmemb, unsigned 
> > int size)
> > size_t bytes;
> > if (unlikely(check_mul_overflow(nmemb, size, &bytes)))
> > -   return NULL;
> > +   return ERR_PTR(-ENOMEM);
> 
> I plan to replace this function with an upcoming vmemdup_array_user() helper,
> which returns -EOVERFLOW instead, hence mind using that?
> 
> Unless you disagree, no need to resubmit the patch, I can change it
> before applying the patch.

Generally, I would say that ENOMEM is the correct error code.  I feel
like someone thinks EOVERFLOW means integer overflow and that's not
correct.  I means like if you pass a number higher than INT_MAX to
kstroint().

But I don't care strongly about this.  You can change it if you want to.

regards,
dan carpenter



Re: [RFC][PATCH v2 0/2] drm/panic: Add a drm panic handler

2023-09-16 Thread nerdopolis
On Friday, September 15, 2023 4:28:20 AM EDT Jocelyn Falempe wrote:
> This introduces a new drm panic handler, which displays a message when a 
> panic occurs.
> So when fbcon is disabled, you can still see a kernel panic.
> 
> This is one of the missing feature, when disabling VT/fbcon in the kernel:
> https://www.reddit.com/r/linux/comments/10eccv9/config_vtn_in_2023/
> Fbcon can be replaced by a userspace kms console, but the panic screen must 
> be done in the kernel.
> 
> This is a proof of concept, and works only with simpledrm, using a new 
> get_scanout_buffer() api
> 
> To test it, make sure you're using the simpledrm driver, and trigger a panic:
> echo c > /proc/sysrq-trigger
> 
This seems to work pretty good! With this one, I don't need to have Weston (or 
another display server) running for it to work this time.
The panic reason works, which is pretty sweet.

FYI: I do get a hunk that fails to apply in simpledrm_remove in 
drivers/gpu/drm/tiny/simpledrm.c
Seems to be a change in a recentish commit
https://github.com/torvalds/linux/commit/84e6da7ad5537826343636b846530ec2167d4a19

Thanks!
> v2
>  * Use get_scanout_buffer() instead of the drm client API. (Thomas Zimmermann)
>  * Add the panic reason to the panic message (Nerdopolis)
>  * Add an exclamation mark (Nerdopolis)
>  
> I didn't reuse the fbdev functions yet, that would need some fbdev 
> refactoring, because they rely on struct fb_info, and struct vc_data (for 
> font/console). But I still plan to at least try it for v3.
> 
> A few more though:
>  1) what about gpu with multiple monitor connected ?
> maybe get_scanout_buffer() could return a list of scanout buffers ?
>  2) I think for some GPU drivers, there might need a flush_scanout_buffer() 
> function, that should be called after the scanout buffer has been filled ?
> 
> Best regards,
> 
> Jocelyn Falempe (2):
>   drm/panic: Add a drm panic handler
>   drm/simpledrm: Add drm_panic support
> 
>  drivers/gpu/drm/Kconfig  |  11 ++
>  drivers/gpu/drm/Makefile |   1 +
>  drivers/gpu/drm/drm_drv.c|   3 +
>  drivers/gpu/drm/drm_panic.c  | 270 +++
>  drivers/gpu/drm/tiny/simpledrm.c |  17 ++
>  include/drm/drm_drv.h|  14 ++
>  include/drm/drm_panic.h  |  41 +
>  7 files changed, 357 insertions(+)
>  create mode 100644 drivers/gpu/drm/drm_panic.c
>  create mode 100644 include/drm/drm_panic.h
> 
> 
> base-commit: 2dde18cd1d8fac735875f2e4987f11817cc0bc2c
> 






Re: [PATCH] drm/omap: dsi: Fix deferred probe warnings

2023-09-16 Thread H. Nikolaus Schaller
Hi Tomi and Tony,

> Am 13.09.2023 um 13:59 schrieb Tomi Valkeinen 
> :
> 
> On 12/04/2023 10:39, Tony Lindgren wrote:
>> We may not have dsi->dsidev initialized during probe, and that can
>> lead into various dsi related warnings as omap_dsi_host_detach() gets
>> called with dsi->dsidev set to NULL.
>> The warnings can be "Fixed dependency cycle(s)" followed by a
>> WARNING: CPU: 0 PID: 787 at drivers/gpu/drm/omapdrm/dss/dsi.c:4414.
>> Let's fix the warnings by checking for a valid dsi->dsidev.
>> Signed-off-by: Tony Lindgren 
>> ---
>>  drivers/gpu/drm/omapdrm/dss/dsi.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> diff --git a/drivers/gpu/drm/omapdrm/dss/dsi.c 
>> b/drivers/gpu/drm/omapdrm/dss/dsi.c
>> --- a/drivers/gpu/drm/omapdrm/dss/dsi.c
>> +++ b/drivers/gpu/drm/omapdrm/dss/dsi.c
>> @@ -4411,7 +4411,7 @@ static int omap_dsi_host_detach(struct mipi_dsi_host 
>> *host,
>>  {
>>  struct dsi_data *dsi = host_to_omap(host);
>>  -   if (WARN_ON(dsi->dsidev != client))
>> +if (dsi->dsidev && WARN_ON(dsi->dsidev != client))
>>  return -EINVAL;
>>  cancel_delayed_work_sync(&dsi->dsi_disable_work);
> 
> Shouldn't this rather be
> 
> if (!dsi->dsidev)
>   return 0;
> 
> before the if (WARN_ON(dsi->dsidev != client)) line?

Yes you are right. We have a different variant in our Pyra kernel:

What we currently have in our Pyra tree is: 
https://git.goldelico.com/?p=letux-kernel.git;a=commitdiff;h=5bf7bd64eec1eb924e794e8d6600919f0dae8c5a;hp=27a0cd6263194d1465e9c53293d35f8c8c988f9d

struct dsi_data *dsi = host_to_omap(host);
 
-   if (WARN_ON(dsi->dsidev != client))
+printk("%s\n", __func__);
+
+   if (!dsi->dsidev || WARN_ON(dsi->dsidev != client))
return -EINVAL;
 
cancel_delayed_work_sync(&dsi->dsi_disable_work);

> 
> If dsi->dsidev is NULL, then attach hasn't been called, and we shouldn't do 
> anything in the detach callback either.
> 
> With your change we'll end up doing all the work in the detach callback, 
> without ever doing their counterpart in the attach side.

If useful, I can post above mentioned patch (without printk).

BR and thanks,
Nikolaus



[PATCH 4/4] drm/i915/dsi: Add some debug logging to mipi_exec_i2c (v2)

2023-09-16 Thread Hans de Goede
Add some debug logging to mipi_exec_i2c, to make debugging various
issues seen with it easier.

Changes in v2:
- Drop unnecessary __func__ drm_dbg_kms() argument

Reviewed-by: Javier Martinez Canillas 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/intel_dsi_vbt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c 
b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
index e56ec3f2d84a..24b2cbcfc1ef 100644
--- a/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
+++ b/drivers/gpu/drm/i915/display/intel_dsi_vbt.c
@@ -565,6 +565,9 @@ static const u8 *mipi_exec_i2c(struct intel_dsi *intel_dsi, 
const u8 *data)
u8 payload_size = *(data + 6);
u8 *payload_data;
 
+   drm_dbg_kms(&i915->drm, "bus %d client-addr 0x%02x reg 0x%02x data 
%*ph\n",
+   vbt_i2c_bus_num, slave_addr, reg_offset, payload_size, data 
+ 7);
+
if (intel_dsi->i2c_bus_num < 0) {
intel_dsi->i2c_bus_num = vbt_i2c_bus_num;
i2c_acpi_find_adapter(intel_dsi, slave_addr);
-- 
2.41.0



[PATCH 2/4] drm/i915/vlv_dsi: Add DMI quirk for wrong I2C bus and panel size on Lenovo Yoga Tablet 2 series (v2)

2023-09-16 Thread Hans de Goede
On the Lenovo Yoga Tablet 2 830 / 1050 there are 2 problems:

1. The I2C MIPI sequence elements reference bus 3. ACPI has I2C1 - I2C7
   which under Linux become bus 0 - 6. And the MIPI sequence reference
   to bus 3 is indented for I2C3 which is bus 2 under Linux.

   This leads to errors like these:
   [  178.244049] i2c_designware 80860F41:03: controller timed out
   [  178.245703] i915 :00:02.0: [drm] *ERROR* Failed to xfer payload of 
size (1) to reg (169)
   There are 3 timeouts when the panel is on, delaying
   waking up the screen on a key press by 3 seconds.

   Note mipi_exec_i2c() cannot just subtract 1 from the bus
   given in the I2C MIPI sequence element. Since on other
   devices the I2C bus-numbers used in the MIPI sequences do
   actually start at 0.

2. width_/height_mm contain a bogus 192mm x 120mm size. This is
   especially a problem on the 8" 830 version which uses a 10:16
   portrait screen where as the bogus size is 16:10.

Add a DMI quirk to override the I2C bus and the panel size with
the correct values.

Note both the 10" 1050 models as well as the 8" 830 models use the same
mainboard and thus the same DMI strings. The 10" 1050 uses a 1920x1200
landscape screen, where as the 8" 830 uses a 1200x1920 portrait screen,
so the quirk handling uses the display resolution to detect the model.

Changes in v2:
- Also override i2c_bus_num to fix mipi_exec_i2c() timeouts

Reviewed-by: Javier Martinez Canillas 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/vlv_dsi.c | 50 ++
 1 file changed, 50 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/vlv_dsi.c 
b/drivers/gpu/drm/i915/display/vlv_dsi.c
index 51c4b1491fa2..e247e3413d90 100644
--- a/drivers/gpu/drm/i915/display/vlv_dsi.c
+++ b/drivers/gpu/drm/i915/display/vlv_dsi.c
@@ -1765,6 +1765,42 @@ static void vlv_dsi_asus_tf103c_mode_fixup(struct 
intel_dsi *intel_dsi)
fixed_mode->vtotal -= 4;
 }
 
+/*
+ * On the Lenovo Yoga Tablet 2 830 / 1050 there are 2 problems:
+ * 1. The I2C MIPI sequence elements reference bus 3. ACPI has I2C1 - I2C7
+ *which under Linux become bus 0 - 6. And the MIPI sequence reference
+ *to bus 3 is indented for I2C3 which is bus 2 under Linux.
+ *
+ *Note mipi_exec_i2c() cannot just subtract 1 from the bus
+ *given in the I2C MIPI sequence element. Since on other
+ *devices the I2C bus-numbers used in the MIPI sequences do
+ *actually start at 0.
+ *
+ * 2. width_/height_mm contain a bogus 192mm x 120mm size. This is
+ *especially a problem on the 8" 830 version which uses a 10:16
+ *portrait screen where as the bogus size is 16:10.
+ */
+static void vlv_dsi_lenovo_yoga_tab2_size_fixup(struct intel_dsi *intel_dsi)
+{
+   const struct drm_display_mode *fixed_mode =
+   intel_panel_preferred_fixed_mode(intel_dsi->attached_connector);
+   struct drm_display_info *info = 
&intel_dsi->attached_connector->base.display_info;
+
+   intel_dsi->i2c_bus_num = 2;
+
+   /*
+* The 10" 1050 uses a 1920x1200 landscape screen, where as the 8" 830
+* uses a 1200x1920 portrait screen.
+*/
+   if (fixed_mode->hdisplay == 1920) {
+   info->width_mm = 216;
+   info->height_mm = 135;
+   } else {
+   info->width_mm = 107;
+   info->height_mm = 171;
+   }
+}
+
 static const struct dmi_system_id vlv_dsi_dmi_quirk_table[] = {
{
/* Asus Transformer Pad TF103C */
@@ -1774,6 +1810,20 @@ static const struct dmi_system_id 
vlv_dsi_dmi_quirk_table[] = {
},
.driver_data = (void *)vlv_dsi_asus_tf103c_mode_fixup,
},
+   {
+   /*
+* Lenovo Yoga Tablet 2 830F/L or 1050F/L (The 8" and 10"
+* Lenovo Yoga Tablet 2 use the same mainboard)
+*/
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "Intel Corp."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "VALLEYVIEW C0 PLATFORM"),
+   DMI_MATCH(DMI_BOARD_NAME, "BYT-T FFD8"),
+   /* Partial match on beginning of BIOS version */
+   DMI_MATCH(DMI_BIOS_VERSION, "BLADE_21"),
+   },
+   .driver_data = (void *)vlv_dsi_lenovo_yoga_tab2_size_fixup,
+   },
{ }
 };
 
-- 
2.41.0



[PATCH 3/4] drm/i915/vlv_dsi: Add DMI quirk for backlight control issues on Lenovo Yoga Tab 3

2023-09-16 Thread Hans de Goede
On the Lenovo Yoga Tab 3 Pro YT3-X90F there are 2 issues with the backlight
on/off MIPI sequences:

1. The backlight on sequence has an I2C MIPI sequence element which uses
   bus 0, but there is a bogus I2cSerialBus resource under the GPU in
   the DSDT which causes i2c_acpi_find_adapter() to pick the wrong bus.

2. There is no backlight off sequence, causing the backlight to stay on.

Add a DMI quirk fixing both issues.

Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/vlv_dsi.c | 32 ++
 1 file changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/vlv_dsi.c 
b/drivers/gpu/drm/i915/display/vlv_dsi.c
index e247e3413d90..db519cfc243b 100644
--- a/drivers/gpu/drm/i915/display/vlv_dsi.c
+++ b/drivers/gpu/drm/i915/display/vlv_dsi.c
@@ -1801,6 +1801,29 @@ static void vlv_dsi_lenovo_yoga_tab2_size_fixup(struct 
intel_dsi *intel_dsi)
}
 }
 
+/*
+ * On the Lenovo Yoga Tab 3 Pro YT3-X90F there are 2 problems:
+ * 1. i2c_acpi_find_adapter() picks the wrong adapter causing mipi_exec_i2c()
+ *to not work. Fix this by setting i2c_bus_num.
+ * 2. There is no backlight off MIPI sequence, causing the backlight to stay 
on.
+ *Add a backlight off sequence mirroring the existing backlight on 
sequence.
+ */
+static void vlv_dsi_lenovo_yoga_tab3_backlight_fixup(struct intel_dsi 
*intel_dsi)
+{
+   static const u8 backlight_off_sequence[16] = {
+   /* Header Seq-id 7, length after header 11 bytes */
+   0x07, 0x0b, 0x00, 0x00, 0x00,
+   /* MIPI_SEQ_ELEM_I2C bus 0 addr 0x2c reg 0x00 data-len 1 data 
0x00 */
+   0x04, 0x08, 0x00, 0x00, 0x00, 0x2c, 0x00, 0x00, 0x01, 0x00,
+   /* MIPI_SEQ_ELEM_END */
+   0x00
+   };
+   struct intel_connector *connector = intel_dsi->attached_connector;
+
+   intel_dsi->i2c_bus_num = 0;
+   connector->panel.vbt.dsi.sequence[MIPI_SEQ_BACKLIGHT_OFF] = 
backlight_off_sequence;
+}
+
 static const struct dmi_system_id vlv_dsi_dmi_quirk_table[] = {
{
/* Asus Transformer Pad TF103C */
@@ -1824,6 +1847,15 @@ static const struct dmi_system_id 
vlv_dsi_dmi_quirk_table[] = {
},
.driver_data = (void *)vlv_dsi_lenovo_yoga_tab2_size_fixup,
},
+   {
+   /* Lenovo Yoga Tab 3 Pro YT3-X90F */
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "Intel Corporation"),
+   DMI_MATCH(DMI_PRODUCT_NAME, "CHERRYVIEW D1 PLATFORM"),
+   DMI_MATCH(DMI_PRODUCT_VERSION, "Blade3-10A-001"),
+   },
+   .driver_data = (void *)vlv_dsi_lenovo_yoga_tab3_backlight_fixup,
+   },
{ }
 };
 
-- 
2.41.0



[PATCH 0/4] drm/i915/vlv_dsi: Add quirks for x86 android tablets (v2)

2023-09-16 Thread Hans de Goede
Hi All,

Some vlv/chv tablets ship with Android as factory OS. The factory OS
BSP style kernel on these tablets does not use the normal x86 hw
autodetection instead it hardcodes a whole bunch of things including
using panel drivers instead of relying on VBT MIPI sequences to
turn the panel/backlight on/off.

The normal i915 driver (which does not use panel drivers) mostly works
since the VBT still needs to contain valid info for the GOP, but because
of the Android kernel relying on panel drivers with various things
hardcoded some DMI quirks are necessary to fix some issues on these
devices.

Some of these issues also are related to which I2C bus to use for
MIPI sequence elements which do I2C transfers. This series also
includes a patch adding some extra debugging to mipi_exec_i2c() to
help with debugging similar issues in the future.

These patches have been posted before but back then I did not get around
to follow up on the series:
https://lore.kernel.org/intel-gfx/20220225214934.383168-1-hdego...@redhat.com/

Changes compared to this old version:
- Drop the changes how the I2C bus number is found, instead just have
  the quirks set the right number directly where necessary. This should
  avoid any chances of causing regressions on devices where the quirks
  do not apply.

- New quirk for backlight control issues on Lenovo Yoga Tab 3

- Address Jani Nikula's remark about __func__ being redundant when using
  drm_dbg_kms()


Regards,

Hans



Hans de Goede (4):
  drm/i915/vlv_dsi: Add DMI quirk for wrong panel modeline in BIOS on
Asus TF103C (v2)
  drm/i915/vlv_dsi: Add DMI quirk for wrong I2C bus and panel size on
Lenovo Yoga Tablet 2 series (v2)
  drm/i915/vlv_dsi: Add DMI quirk for backlight control issues on Lenovo
Yoga Tab 3
  drm/i915/dsi: Add some debug logging to mipi_exec_i2c (v2)

 drivers/gpu/drm/i915/display/intel_dsi_vbt.c |   3 +
 drivers/gpu/drm/i915/display/vlv_dsi.c   | 124 +++
 2 files changed, 127 insertions(+)

-- 
2.41.0



[PATCH 1/4] drm/i915/vlv_dsi: Add DMI quirk for wrong panel modeline in BIOS on Asus TF103C (v2)

2023-09-16 Thread Hans de Goede
Vtotal is wrong in the BIOS supplied modeline for the DSI panel on
the Asus TF103C leading to the last line of the display being shown
as the first line.

Original: "1280x800": 60 67700 1280 1312 1328 1376 800 808 812 820 0x8 0xa
Fixed:"1280x800": 60 67700 1280 1312 1328 1376 800 808 812 816 0x8 0xa

The factory installed Android has a hardcoded modeline in its kernel,
causing it to not suffer from this BIOS bug;
and the Android boot-splash which uses the EFI FB which does have this bug
has the last line all black causing the bug to not be visible.

This commit introduces a generic DMI based quirk mechanism to vlv_dsi for
doing various fixups, and uses this to correct the modeline.

v2:
- s/mode_fixup/dmi_quirk/ to make the new DMI quirk mechanism more generic
- Add a comment with the old and new modelines to the patch and commit msg

Reviewed-by: Javier Martinez Canillas 
Signed-off-by: Hans de Goede 
---
 drivers/gpu/drm/i915/display/vlv_dsi.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/drivers/gpu/drm/i915/display/vlv_dsi.c 
b/drivers/gpu/drm/i915/display/vlv_dsi.c
index a96e7d028c5c..51c4b1491fa2 100644
--- a/drivers/gpu/drm/i915/display/vlv_dsi.c
+++ b/drivers/gpu/drm/i915/display/vlv_dsi.c
@@ -23,6 +23,7 @@
  * Author: Jani Nikula 
  */
 
+#include 
 #include 
 
 #include 
@@ -1744,6 +1745,38 @@ static void vlv_dphy_param_init(struct intel_dsi 
*intel_dsi)
intel_dsi_log_params(intel_dsi);
 }
 
+typedef void (*vlv_dsi_dmi_quirk_func)(struct intel_dsi *intel_dsi);
+
+/*
+ * Vtotal is wrong on the Asus TF103C leading to the last line of the display
+ * being shown as the first line. The factory installed Android has a hardcoded
+ * modeline, causing it to not suffer from this BIOS bug.
+ *
+ * Original mode: "1280x800": 60 67700 1280 1312 1328 1376 800 808 812 820 0x8 
0xa
+ * Fixedmode: "1280x800": 60 67700 1280 1312 1328 1376 800 808 812 816 0x8 
0xa
+ */
+static void vlv_dsi_asus_tf103c_mode_fixup(struct intel_dsi *intel_dsi)
+{
+   /* Cast away the const as we want to fixup the mode */
+   struct drm_display_mode *fixed_mode = (struct drm_display_mode *)
+   intel_panel_preferred_fixed_mode(intel_dsi->attached_connector);
+
+   if (fixed_mode->vtotal == 820)
+   fixed_mode->vtotal -= 4;
+}
+
+static const struct dmi_system_id vlv_dsi_dmi_quirk_table[] = {
+   {
+   /* Asus Transformer Pad TF103C */
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+   DMI_MATCH(DMI_PRODUCT_NAME, "TF103C"),
+   },
+   .driver_data = (void *)vlv_dsi_asus_tf103c_mode_fixup,
+   },
+   { }
+};
+
 void vlv_dsi_init(struct drm_i915_private *dev_priv)
 {
struct intel_dsi *intel_dsi;
@@ -1752,6 +1785,7 @@ void vlv_dsi_init(struct drm_i915_private *dev_priv)
struct intel_connector *intel_connector;
struct drm_connector *connector;
struct drm_display_mode *current_mode;
+   const struct dmi_system_id *dmi_id;
enum port port;
enum pipe pipe;
 
@@ -1883,6 +1917,14 @@ void vlv_dsi_init(struct drm_i915_private *dev_priv)
goto err_cleanup_connector;
}
 
+   dmi_id = dmi_first_match(vlv_dsi_dmi_quirk_table);
+   if (dmi_id) {
+   vlv_dsi_dmi_quirk_func quirk_func =
+   (vlv_dsi_dmi_quirk_func)dmi_id->driver_data;
+
+   quirk_func(intel_dsi);
+   }
+
intel_panel_init(intel_connector, NULL);
 
intel_backlight_setup(intel_connector, INVALID_PIPE);
-- 
2.41.0



[Bug 217916] amdgpu: ring gfx_low timeout (Google Maps zooming)

2023-09-16 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=217916

Artem S. Tashkinov (a...@gmx.com) changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |ANSWERED

--- Comment #1 from Artem S. Tashkinov (a...@gmx.com) ---
Please repost here:

https://gitlab.freedesktop.org/drm/amd/-/issues

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.