Re: [PATCH v2 1/3] drm: Support importing dmabufs into drivers without DMA

2021-02-22 Thread Ahmed S. Darwish
On Mon, Feb 22, 2021 at 01:43:26PM +0100, Thomas Zimmermann wrote:
...
>
> [   60.050199] [ cut here ]
> [   60.055092] WARNING: CPU: 0 PID: 1403 at kernel/dma/mapping.c:190 
> dma_map_sg_attrs+0x8f/0xc0
> [   60.064331] Modules linked in: af_packet(E) rfkill(E) dmi_sysfs(E) 
> intel_rapl_msr(E) intel_rapl_common(E) snd_hda_codec_realtek(E) 
> snd_hda_codec_generic(E) ledtrig_audio(E) snd_hda_codec_hdmi(E) 
> x86_pkg_temp_thermal(E) intel_powerclam)
> [   60.064801]  wmi(E) video(E) btrfs(E) blake2b_generic(E) libcrc32c(E) 
> crc32c_intel(E) xor(E) raid6_pq(E) sg(E) dm_multipath(E) dm_mod(E) 
> scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) msr(E) efivarfs(E)
> [   60.170778] CPU: 0 PID: 1403 Comm: Xorg.bin Tainted: GE
> 5.11.0-rc5-1-default+ #747
> [   60.179841] Hardware name: Dell Inc. OptiPlex 9020/0N4YC8, BIOS A24 
> 10/24/2018
> [   60.187145] RIP: 0010:dma_map_sg_attrs+0x8f/0xc0
> [   60.191822] Code: 4d 85 ff 75 2b 49 89 d8 44 89 e1 44 89 f2 4c 89 ee 48 89 
> ef e8 62 30 00 00 85 c0 78 36 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f 0b <0f> 0b 
> 31 c0 eb ed 49 8d 7f 50 e8 72 f5 2a 00 49 8b 47 50 49 89 d8
> [   60.210770] RSP: 0018:c90001d6fc18 EFLAGS: 00010246
> [   60.216062] RAX:  RBX: 0020 RCX: 
> b31e677b
> [   60.223274] RDX: dc00 RSI: 888212c10600 RDI: 
> 8881b08a9488
> [   60.230501] RBP: 8881b08a9030 R08: 0020 R09: 
> 888212c10600
> [   60.237710] R10: ed10425820df R11: 0001 R12: 
> 
> [   60.244939] R13: 888212c10600 R14: 0008 R15: 
> 
> [   60.252155] FS:  7f08ac2b2f00() GS:8887cbe0() 
> knlGS:
> [   60.260333] CS:  0010 DS:  ES:  CR0: 80050033
> [   60.266150] CR2: 55831c899be8 CR3: 00015372e006 CR4: 
> 001706f0
> [   60.273369] Call Trace:
> [   60.275845]  drm_gem_map_dma_buf+0xb4/0x120
> [   60.280089]  dma_buf_map_attachment+0x15d/0x1e0
> [   60.284688]  drm_gem_prime_import_dev.part.0+0x5d/0x140
> [   60.289984]  drm_gem_prime_fd_to_handle+0x213/0x280
> [   60.294931]  ? drm_prime_destroy_file_private+0x30/0x30
> [   60.300224]  drm_ioctl_kernel+0x131/0x180
> [   60.304291]  ? drm_setversion+0x230/0x230
> [   60.308366]  ? drm_prime_destroy_file_private+0x30/0x30
> [   60.313659]  drm_ioctl+0x2f1/0x500
> [   60.317118]  ? drm_version+0x150/0x150
> [   60.320903]  ? lock_downgrade+0xa0/0xa0
> [   60.324806]  ? do_vfs_ioctl+0x5f4/0x680
> [   60.328694]  ? __fget_files+0x13e/0x210
> [   60.332591]  ? ioctl_fiemap.isra.0+0x1a0/0x1a0
> [   60.337102]  ? __fget_files+0x15d/0x210
> [   60.340990]  __x64_sys_ioctl+0xb9/0xf0
> [   60.344795]  do_syscall_64+0x33/0x80
> [   60.348427]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [   60.353542] RIP: 0033:0x7f08ac7b53cb
> [   60.357168] Code: 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b5 e8 1c ff ff ff 
> 85 c0 78 ba 4c 89 e0 5b 5d 41 5c c3 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 
> 01 f0 ff ff 73 01 c3 48 8b 0d 75 ba 0c 00 f7 d8 64 89 01 48
> [   60.376108] RSP: 002b:7ffeabc89fc8 EFLAGS: 0246 ORIG_RAX: 
> 0010
> [   60.383758] RAX: ffda RBX: 7ffeabc8a00c RCX: 
> 7f08ac7b53cb
> [   60.390970] RDX: 7ffeabc8a00c RSI: c00c642e RDI: 
> 000d
> [   60.398221] RBP: c00c642e R08: 55831c691d00 R09: 
> 55831b2ec010
> [   60.405446] R10: 7f08acf6ada0 R11: 0246 R12: 
> 55831c691d00
> [   60.412667] R13: 000d R14: 007e9000 R15: 
> 
> [   60.419903] irq event stamp: 672893
> [   60.423441] hardirqs last  enabled at (672913): [] 
> console_unlock+0x44d/0x500
> [   60.432230] hardirqs last disabled at (672922): [] 
> console_unlock+0x443/0x500
> [   60.441021] softirqs last  enabled at (672940): [] 
> __do_softirq+0x3dd/0x554
> [   60.449634] softirqs last disabled at (672931): [] 
> asm_call_irq_on_stack+0x12/0x20
> [   60.458871] ---[ end trace f2f88696eb17806c ]---
>

Copy-pasting the etnire stack trace to the commit log really hurts
readability and provides no additional value to the reader, IMHO.

> This patch introduces struct drm_driver.gem_prime_create_object, which
  ^

Documentation/process/submitting-patches.rst:

  Describe your changes in imperative mood, e.g. "make xyzzy do frotz"
  instead of "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy
  to do frotz", as if you are giving orders to the codebase to change
  its behaviour.

> Cc: Christoph Hellwig 
> Cc: Greg Kroah-Hartman 
> Cc: Alan Stern 
> Cc: Johan Hovold 
> Cc: Andy Shevchenko 
&g

[PATCH v3 00/20] seqlock: Extend seqcount API with associated locks

2020-06-30 Thread Ahmed S. Darwish
Hi,

This is v3 of the seqlock patch series:

   [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks
   https://lore.kernel.org/lkml/20200519214547.352050-1-a.darw...@linutronix.de

   [PATCH v2 00/18]
   https://lore.kernel.org/lkml/20200608005729.1874024-1-a.darw...@linutronix.de

It's based over:

   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git locking/core

to get Peter's lockdep irqstate tracking series below, which untangles
mainline seqlock.h<=>sched.h 'current->' task_struct circular dependency:

   https://lkml.kernel.org/r/linuxppc-dev/20200623083645.277342...@infradead.org

Changelog-v3:

 - Re-add lockdep non-preemptibility checks on seqcount_t write paths.
   They were removed from v2 due to the circular dependencies mentioned.

 - Slight rebase over the new v5.8-rc1 KCSAN seqlock.h changes

 - Collect seqcount_t call-sites acked-by tags

Thanks,

8<------

Ahmed S. Darwish (20):
  Documentation: locking: Describe seqlock design and usage
  seqlock: Properly format kernel-doc code samples
  seqlock: Add missing kernel-doc annotations
  lockdep: Add preemption enabled/disabled assertion APIs
  seqlock: lockdep assert non-preemptibility on seqcount_t write
  seqlock: Extend seqcount API with associated locks
  dma-buf: Remove custom seqcount lockdep class key
  dma-buf: Use sequence counter with associated wound/wait mutex
  sched: tasks: Use sequence counter with associated spinlock
  netfilter: conntrack: Use sequence counter with associated spinlock
  netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
  xfrm: policy: Use sequence counters with associated lock
  timekeeping: Use sequence counter with associated raw spinlock
  vfs: Use sequence counter with associated spinlock
  raid5: Use sequence counter with associated spinlock
  iocost: Use sequence counter with associated spinlock
  NFSv4: Use sequence counter with associated spinlock
  userfaultfd: Use sequence counter with associated spinlock
  kvm/eventfd: Use sequence counter with associated spinlock
  hrtimer: Use sequence counter with associated raw spinlock

 Documentation/locking/index.rst   |   1 +
 Documentation/locking/seqlock.rst | 242 +
 MAINTAINERS   |   2 +-
 block/blk-iocost.c|   5 +-
 drivers/dma-buf/dma-resv.c|  15 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   2 -
 drivers/md/raid5.c|   2 +-
 drivers/md/raid5.h|   2 +-
 fs/dcache.c   |   2 +-
 fs/fs_struct.c|   4 +-
 fs/nfs/nfs4_fs.h  |   2 +-
 fs/nfs/nfs4state.c|   2 +-
 fs/userfaultfd.c  |   4 +-
 include/linux/dcache.h|   2 +-
 include/linux/dma-resv.h  |   4 +-
 include/linux/fs_struct.h |   2 +-
 include/linux/hrtimer.h   |   2 +-
 include/linux/kvm_irqfd.h |   2 +-
 include/linux/lockdep.h   |  18 +
 include/linux/sched.h |   2 +-
 include/linux/seqlock.h   | 872 ++
 include/linux/seqlock_types_internal.h| 186 
 include/net/netfilter/nf_conntrack.h  |   2 +-
 init/init_task.c  |   3 +-
 kernel/fork.c |   2 +-
 kernel/time/hrtimer.c |  13 +-
 kernel/time/timekeeping.c |  19 +-
 lib/Kconfig.debug |   1 +
 net/netfilter/nf_conntrack_core.c |   5 +-
 net/netfilter/nft_set_rbtree.c|   4 +-
 net/xfrm/xfrm_policy.c|  10 +-
 virt/kvm/eventfd.c|   2 +-
 32 files changed, 1211 insertions(+), 225 deletions(-)
 create mode 100644 Documentation/locking/seqlock.rst
 create mode 100644 include/linux/seqlock_types_internal.h

base-commit: 997e89fa345e9006f311cf9f9c8fd9f7d96c240f
--
2.20.1
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v3 07/20] dma-buf: Remove custom seqcount lockdep class key

2020-06-30 Thread Ahmed S. Darwish
Commit 3c3b177a9369 ("reservation: add support for read-only access
using rcu") introduced a sequence counter to manage updates to
reservations. Back then, the reservation object initializer
reservation_object_init() was always inlined.

Having the sequence counter initialization inlined meant that each of
the call sites would have a different lockdep class key, which would've
broken lockdep's deadlock detection. The aforementioned commit thus
introduced, and exported, a custom seqcount lockdep class key and name.

The commit 8735f16803f00 ("dma-buf: cleanup reservation_object_init...")
transformed the reservation object initializer to a normal non-inlined C
function. seqcount_init(), which automatically defines the seqcount
lockdep class key and must be called non-inlined, can now be safely used.

Remove the seqcount custom lockdep class key, name, and export. Use
seqcount_init() inside the dma reservation object initializer.

Signed-off-by: Ahmed S. Darwish 
Reviewed-by: Sebastian Andrzej Siewior 
Acked-by: Daniel Vetter 
---
 drivers/dma-buf/dma-resv.c | 9 +
 include/linux/dma-resv.h   | 2 --
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index b45f8514dc82..15efa0c2dacb 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -51,12 +51,6 @@
 DEFINE_WD_CLASS(reservation_ww_class);
 EXPORT_SYMBOL(reservation_ww_class);
 
-struct lock_class_key reservation_seqcount_class;
-EXPORT_SYMBOL(reservation_seqcount_class);
-
-const char reservation_seqcount_string[] = "reservation_seqcount";
-EXPORT_SYMBOL(reservation_seqcount_string);
-
 /**
  * dma_resv_list_alloc - allocate fence list
  * @shared_max: number of fences we need space for
@@ -135,9 +129,8 @@ subsys_initcall(dma_resv_lockdep);
 void dma_resv_init(struct dma_resv *obj)
 {
ww_mutex_init(>lock, _ww_class);
+   seqcount_init(>seq);
 
-   __seqcount_init(>seq, reservation_seqcount_string,
-   _seqcount_class);
RCU_INIT_POINTER(obj->fence, NULL);
RCU_INIT_POINTER(obj->fence_excl, NULL);
 }
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index ee50d10f052b..a6538ae7d93f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -46,8 +46,6 @@
 #include 
 
 extern struct ww_class reservation_ww_class;
-extern struct lock_class_key reservation_seqcount_class;
-extern const char reservation_seqcount_string[];
 
 /**
  * struct dma_resv_list - a list of shared fences
-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v3 08/20] dma-buf: Use sequence counter with associated wound/wait mutex

2020-06-30 Thread Ahmed S. Darwish
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the sequence counter write side critical
section.

The dma-buf reservation subsystem uses plain sequence counters to manage
updates to reservations. Writer serialization is accomplished through a
wound/wait mutex.

Acquiring a wound/wait mutex does not disable preemption, so this needs
to be done manually before and after the write side critical section.

Use the newly-added seqcount_ww_mutex_t instead:

  - It associates the ww_mutex with the sequence count, which enables
lockdep to validate that the write side critical section is properly
serialized.

  - It removes the need to explicitly add preempt_disable/enable()
around the write side critical section because the write_begin/end()
functions for this new data type automatically do this.

If lockdep is disabled this ww_mutex lock association is compiled out
and has neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish 
Acked-by: Daniel Vetter 
---
 drivers/dma-buf/dma-resv.c   | 8 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 --
 include/linux/dma-resv.h | 2 +-
 3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 15efa0c2dacb..a7631352a486 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -129,7 +129,7 @@ subsys_initcall(dma_resv_lockdep);
 void dma_resv_init(struct dma_resv *obj)
 {
ww_mutex_init(>lock, _ww_class);
-   seqcount_init(>seq);
+   seqcount_ww_mutex_init(>seq, >lock);
 
RCU_INIT_POINTER(obj->fence, NULL);
RCU_INIT_POINTER(obj->fence_excl, NULL);
@@ -260,7 +260,6 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct 
dma_fence *fence)
fobj = dma_resv_get_list(obj);
count = fobj->shared_count;
 
-   preempt_disable();
write_seqcount_begin(>seq);
 
for (i = 0; i < count; ++i) {
@@ -282,7 +281,6 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct 
dma_fence *fence)
smp_store_mb(fobj->shared_count, count);
 
write_seqcount_end(>seq);
-   preempt_enable();
dma_fence_put(old);
 }
 EXPORT_SYMBOL(dma_resv_add_shared_fence);
@@ -309,14 +307,12 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct 
dma_fence *fence)
if (fence)
dma_fence_get(fence);
 
-   preempt_disable();
write_seqcount_begin(>seq);
/* write_seqcount_begin provides the necessary memory barrier */
RCU_INIT_POINTER(obj->fence_excl, fence);
if (old)
old->shared_count = 0;
write_seqcount_end(>seq);
-   preempt_enable();
 
/* inplace update, no shared fences */
while (i--)
@@ -394,13 +390,11 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct 
dma_resv *src)
src_list = dma_resv_get_list(dst);
old = dma_resv_get_excl(dst);
 
-   preempt_disable();
write_seqcount_begin(>seq);
/* write_seqcount_begin provides the necessary memory barrier */
RCU_INIT_POINTER(dst->fence_excl, new);
RCU_INIT_POINTER(dst->fence, dst_list);
write_seqcount_end(>seq);
-   preempt_enable();
 
dma_resv_list_free(src_list);
dma_fence_put(old);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index b91b5171270f..ff4b583cb96a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -258,11 +258,9 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
new->shared_count = k;
 
/* Install the new fence list, seqcount provides the barriers */
-   preempt_disable();
write_seqcount_begin(>seq);
RCU_INIT_POINTER(resv->fence, new);
write_seqcount_end(>seq);
-   preempt_enable();
 
/* Drop the references to the removed fences or move them to ef_list */
for (i = j, k = 0; i < old->shared_count; ++i) {
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index a6538ae7d93f..d44a77e8a7e3 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -69,7 +69,7 @@ struct dma_resv_list {
  */
 struct dma_resv {
struct ww_mutex lock;
-   seqcount_t seq;
+   seqcount_ww_mutex_t seq;
 
struct dma_fence __rcu *fence_excl;
struct dma_resv_list __rcu *fence;
-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v2 06/18] dma-buf: Use sequence counter with associated wound/wait mutex

2020-06-08 Thread Ahmed S. Darwish
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the sequence counter write side critical
section.

The dma-buf reservation subsystem uses plain sequence counters to manage
updates to reservations. Writer serialization is accomplished through a
wound/wait mutex.

Acquiring a wound/wait mutex does not disable preemption, so this needs
to be done manually before and after the write side critical section.

Use the newly-added seqcount_ww_mutex_t instead:

  - It associates the ww_mutex with the sequence count, which enables
lockdep to validate that the write side critical section is properly
serialized.

  - It removes the need to explicitly add preempt_disable/enable()
around the write side critical section because the write_begin/end()
functions for this new data type automatically do this.

If lockdep is disabled this ww_mutex lock association is compiled out
and has neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish 
---
 drivers/dma-buf/dma-resv.c   | 8 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 --
 include/linux/dma-resv.h | 2 +-
 3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 590ce7ad60a0..3aba2b2bfc48 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -128,7 +128,7 @@ subsys_initcall(dma_resv_lockdep);
 void dma_resv_init(struct dma_resv *obj)
 {
ww_mutex_init(>lock, _ww_class);
-   seqcount_init(>seq);
+   seqcount_ww_mutex_init(>seq, >lock);
 
RCU_INIT_POINTER(obj->fence, NULL);
RCU_INIT_POINTER(obj->fence_excl, NULL);
@@ -259,7 +259,6 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct 
dma_fence *fence)
fobj = dma_resv_get_list(obj);
count = fobj->shared_count;
 
-   preempt_disable();
write_seqcount_begin(>seq);
 
for (i = 0; i < count; ++i) {
@@ -281,7 +280,6 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct 
dma_fence *fence)
smp_store_mb(fobj->shared_count, count);
 
write_seqcount_end(>seq);
-   preempt_enable();
dma_fence_put(old);
 }
 EXPORT_SYMBOL(dma_resv_add_shared_fence);
@@ -308,14 +306,12 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct 
dma_fence *fence)
if (fence)
dma_fence_get(fence);
 
-   preempt_disable();
write_seqcount_begin(>seq);
/* write_seqcount_begin provides the necessary memory barrier */
RCU_INIT_POINTER(obj->fence_excl, fence);
if (old)
old->shared_count = 0;
write_seqcount_end(>seq);
-   preempt_enable();
 
/* inplace update, no shared fences */
while (i--)
@@ -393,13 +389,11 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct 
dma_resv *src)
src_list = dma_resv_get_list(dst);
old = dma_resv_get_excl(dst);
 
-   preempt_disable();
write_seqcount_begin(>seq);
/* write_seqcount_begin provides the necessary memory barrier */
RCU_INIT_POINTER(dst->fence_excl, new);
RCU_INIT_POINTER(dst->fence, dst_list);
write_seqcount_end(>seq);
-   preempt_enable();
 
dma_resv_list_free(src_list);
dma_fence_put(old);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 6a5b91d23fd9..c71c0bb6ce26 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -258,11 +258,9 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
new->shared_count = k;
 
/* Install the new fence list, seqcount provides the barriers */
-   preempt_disable();
write_seqcount_begin(>seq);
RCU_INIT_POINTER(resv->fence, new);
write_seqcount_end(>seq);
-   preempt_enable();
 
/* Drop the references to the removed fences or move them to ef_list */
for (i = j, k = 0; i < old->shared_count; ++i) {
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index a6538ae7d93f..d44a77e8a7e3 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -69,7 +69,7 @@ struct dma_resv_list {
  */
 struct dma_resv {
struct ww_mutex lock;
-   seqcount_t seq;
+   seqcount_ww_mutex_t seq;
 
struct dma_fence __rcu *fence_excl;
struct dma_resv_list __rcu *fence;
-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v2 00/18] seqlock: Extend seqcount API with associated locks

2020-06-08 Thread Ahmed S. Darwish
Hi,

This is v2 of the seqlock patch series:

   [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks
   https://lore.kernel.org/lkml/20200519214547.352050-1-a.darw...@linutronix.de

Patches 1=>3 of this v2 series add documentation for the existing
seqlock.h datatypes and APIs. Hopefully they can hit v5.8 -rc2 or -rc3.

Changelog-v2


1. Drop, for now, the seqlock v1 patches #7 and #8. These patches added
lockdep non-preemptibility checks to seqcount_t write paths, but they
now depend on on-going work by Peter:

   [PATCH v3 0/5] lockdep: Change IRQ state tracking to use per-cpu variables
   https://lkml.kernel.org/r/20200529213550.683440...@infradead.org

   [PATCH 00/14] x86/entry: disallow #DB more and x86/entry lockdep/nmi
   https://lkml.kernel.org/r/20200529212728.795169...@infradead.org

Once Peter's work get merged, I'll send the non-preemptibility checks as
a separate series.

2. Drop the v1 seqcount_t call-sites bugfixes. I've already posted them
in an isolated series. They got merged into their respective trees, and
will hit v5.8-rc1 soon:

   [PATCH v2 0/6] seqlock: seqcount_t call sites bugfixes
   https://lore.kernel.org/lkml/20200603144949.1122421-1-a.darw...@linutronix.de

3. Patch #1: Add a small paragraph explaining that seqcount_t/seqlock_t
cannot be used if the protected data contains pointers. A similar
paragraph already existed in seqlock.h, but got mistakenly dropped.

4. Patch #2: Don't add RST directives inside kernel-doc comments. Peter
doesn't like them :) I've kept the indentation though, and found a
minimal way for Sphinx to properly render these code samples without too
much disruption.

5. Patch #3: Brush up the introduced kernel-doc comments. Make them more
consistent overall, and more concise.

Thanks,

8<--

Ahmed S. Darwish (18):
  Documentation: locking: Describe seqlock design and usage
  seqlock: Properly format kernel-doc code samples
  seqlock: Add missing kernel-doc annotations
  seqlock: Extend seqcount API with associated locks
  dma-buf: Remove custom seqcount lockdep class key
  dma-buf: Use sequence counter with associated wound/wait mutex
  sched: tasks: Use sequence counter with associated spinlock
  netfilter: conntrack: Use sequence counter with associated spinlock
  netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
  xfrm: policy: Use sequence counters with associated lock
  timekeeping: Use sequence counter with associated raw spinlock
  vfs: Use sequence counter with associated spinlock
  raid5: Use sequence counter with associated spinlock
  iocost: Use sequence counter with associated spinlock
  NFSv4: Use sequence counter with associated spinlock
  userfaultfd: Use sequence counter with associated spinlock
  kvm/eventfd: Use sequence counter with associated spinlock
  hrtimer: Use sequence counter with associated raw spinlock

 Documentation/locking/index.rst   |   1 +
 Documentation/locking/seqlock.rst | 242 +
 MAINTAINERS   |   2 +-
 block/blk-iocost.c|   5 +-
 drivers/dma-buf/dma-resv.c|  15 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   2 -
 drivers/md/raid5.c|   2 +-
 drivers/md/raid5.h|   2 +-
 fs/dcache.c   |   2 +-
 fs/fs_struct.c|   4 +-
 fs/nfs/nfs4_fs.h  |   2 +-
 fs/nfs/nfs4state.c|   2 +-
 fs/userfaultfd.c  |   4 +-
 include/linux/dcache.h|   2 +-
 include/linux/dma-resv.h  |   4 +-
 include/linux/fs_struct.h |   2 +-
 include/linux/hrtimer.h   |   2 +-
 include/linux/kvm_irqfd.h |   2 +-
 include/linux/sched.h |   2 +-
 include/linux/seqlock.h   | 855 ++
 include/linux/seqlock_types_internal.h| 187 
 include/net/netfilter/nf_conntrack.h  |   2 +-
 init/init_task.c  |   3 +-
 kernel/fork.c |   2 +-
 kernel/time/hrtimer.c |  13 +-
 kernel/time/timekeeping.c |  19 +-
 net/netfilter/nf_conntrack_core.c |   5 +-
 net/netfilter/nft_set_rbtree.c|   4 +-
 net/xfrm/xfrm_policy.c|  10 +-
 virt/kvm/eventfd.c|   2 +-
 30 files changed, 1175 insertions(+), 226 deletions(-)
 create mode 100644 Documentation/locking/seqlock.rst
 create mode 100644 include/linux/seqlock_types_internal.h

base-commit: 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
--
2.20.1
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v2 05/18] dma-buf: Remove custom seqcount lockdep class key

2020-06-08 Thread Ahmed S. Darwish
Commit 3c3b177a9369 ("reservation: add support for read-only access
using rcu") introduced a sequence counter to manage updates to
reservations. Back then, the reservation object initializer
reservation_object_init() was always inlined.

Having the sequence counter initialization inlined meant that each of
the call sites would have a different lockdep class key, which would've
broken lockdep's deadlock detection. The aforementioned commit thus
introduced, and exported, a custom seqcount lockdep class key and name.

The commit 8735f16803f00 ("dma-buf: cleanup reservation_object_init...")
transformed the reservation object initializer to a normal non-inlined C
function. seqcount_init(), which automatically defines the seqcount
lockdep class key and must be called non-inlined, can now be safely used.

Remove the seqcount custom lockdep class key, name, and export. Use
seqcount_init() inside the dma reservation object initializer.

Signed-off-by: Ahmed S. Darwish 
Reviewed-by: Sebastian Andrzej Siewior 
Acked-by: Daniel Vetter 
---
 drivers/dma-buf/dma-resv.c | 9 +
 include/linux/dma-resv.h   | 2 --
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 4264e64788c4..590ce7ad60a0 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -50,12 +50,6 @@
 DEFINE_WD_CLASS(reservation_ww_class);
 EXPORT_SYMBOL(reservation_ww_class);
 
-struct lock_class_key reservation_seqcount_class;
-EXPORT_SYMBOL(reservation_seqcount_class);
-
-const char reservation_seqcount_string[] = "reservation_seqcount";
-EXPORT_SYMBOL(reservation_seqcount_string);
-
 /**
  * dma_resv_list_alloc - allocate fence list
  * @shared_max: number of fences we need space for
@@ -134,9 +128,8 @@ subsys_initcall(dma_resv_lockdep);
 void dma_resv_init(struct dma_resv *obj)
 {
ww_mutex_init(>lock, _ww_class);
+   seqcount_init(>seq);
 
-   __seqcount_init(>seq, reservation_seqcount_string,
-   _seqcount_class);
RCU_INIT_POINTER(obj->fence, NULL);
RCU_INIT_POINTER(obj->fence_excl, NULL);
 }
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index ee50d10f052b..a6538ae7d93f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -46,8 +46,6 @@
 #include 
 
 extern struct ww_class reservation_ww_class;
-extern struct lock_class_key reservation_seqcount_class;
-extern const char reservation_seqcount_string[];
 
 /**
  * struct dma_resv_list - a list of shared fences
-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v2 0/6] seqlock: seqcount_t call sites bugfixes

2020-06-05 Thread Ahmed S. Darwish
On Thu, Jun 04, 2020 at 09:28:41AM +0200, Daniel Vetter wrote:
> On Wed, Jun 03, 2020 at 04:49:43PM +0200, Ahmed S. Darwish wrote:
> > Hi,
> >
> > Since patch #7 and #8 from the series:
> >
> >[PATCH v1 00/25] seqlock: Extend seqcount API with associated locks
> >
> > https://lore.kernel.org/lkml/20200519214547.352050-1-a.darw...@linutronix.de
> >
> > are now pending on the lockdep/x86 IRQ state tracking patch series:
> >
> >[PATCH 00/14] x86/entry: disallow #DB more and x86/entry lockdep/nmi
> >https://lkml.kernel.org/r/20200529212728.795169...@infradead.org
> >
> >[PATCH v3 0/5] lockdep: Change IRQ state tracking to use per-cpu 
> > variables
> >https://lkml.kernel.org/r/20200529213550.683440...@infradead.org
> >
> > This is a repost only of the seqcount_t call sites bugfixes that were on
> > top of the seqlock patch series.
> >
> > These fixes are independent, and can thus be merged on their own. I'm
> > reposting them now so they can at least hit -rc2 or -rc3.
>
> I'm confused on what I should do with patch 6 here for dma-buf. Looks like
> just a good cleanup/prep work, so I'd queue it for linux-next and 5.9, but
> sounds like you want this in earlier. Do you need this in 5.8-rc for some
> work meant for 5.9? Will this go in through some topic branch directly?
> Should I apply it?
>
> Patch itself lgtm, I'm just confused what I should do with it.
>

My apologies for the confusion. The cover letter is indeed misleading
w.r.t. the dma-buf patch.  It isn't a bugfix, so it shouldn't hit -rc.

Since without this patch compiling the seqcount series will fail, it
will be best to merge it through tip instead.

So all I need for now is a reviewed-by tag :) I will forwoard it to the
tip tree afterwards.

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v2 0/6] seqlock: seqcount_t call sites bugfixes

2020-06-04 Thread Ahmed S. Darwish
Hi,

Since patch #7 and #8 from the series:

   [PATCH v1 00/25] seqlock: Extend seqcount API with associated locks
   https://lore.kernel.org/lkml/20200519214547.352050-1-a.darw...@linutronix.de

are now pending on the lockdep/x86 IRQ state tracking patch series:

   [PATCH 00/14] x86/entry: disallow #DB more and x86/entry lockdep/nmi
   https://lkml.kernel.org/r/20200529212728.795169...@infradead.org

   [PATCH v3 0/5] lockdep: Change IRQ state tracking to use per-cpu variables
   https://lkml.kernel.org/r/20200529213550.683440...@infradead.org

This is a repost only of the seqcount_t call sites bugfixes that were on
top of the seqlock patch series.

These fixes are independent, and can thus be merged on their own. I'm
reposting them now so they can at least hit -rc2 or -rc3.

Changelog-v2:

  - patch #1: Add a missing up_read() on netdev_get_name() error path
  exit. Thanks to Dan/kbuild-bot report.

  - patch #4: new patch, invalid preemptible context found by the new
  lockdep checks added in the seqlock series + kbuild-bot.

Thanks,

8<--

Ahmed S. Darwish (6):
  net: core: device_rename: Use rwsem instead of a seqcount
  net: phy: fixed_phy: Remove unused seqcount
  u64_stats: Document writer non-preemptibility requirement
  net: mdiobus: Disable preemption upon u64_stats update
  block: nr_sects_write(): Disable preemption on seqcount write
  dma-buf: Remove custom seqcount lockdep class key

 block/blk.h|  2 ++
 drivers/dma-buf/dma-resv.c |  9 +--
 drivers/net/phy/fixed_phy.c| 26 
 drivers/net/phy/mdio_bus.c |  2 ++
 include/linux/dma-resv.h   |  2 --
 include/linux/u64_stats_sync.h | 43 ++
 net/core/dev.c | 40 ++-
 7 files changed, 56 insertions(+), 68 deletions(-)

base-commit: 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
--
2.20.1
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v2 6/6] dma-buf: Remove custom seqcount lockdep class key

2020-06-04 Thread Ahmed S. Darwish
Commit 3c3b177a9369 ("reservation: add support for read-only access
using rcu") introduced a sequence counter to manage updates to
reservations. Back then, the reservation object initializer
reservation_object_init() was always inlined.

Having the sequence counter initialization inlined meant that each of
the call sites would have a different lockdep class key, which would've
broken lockdep's deadlock detection. The aforementioned commit thus
introduced, and exported, a custom seqcount lockdep class key and name.

The commit 8735f16803f00 ("dma-buf: cleanup reservation_object_init...")
transformed the reservation object initializer to a normal non-inlined C
function. seqcount_init(), which automatically defines the seqcount
lockdep class key and must be called non-inlined, can now be safely used.

Remove the seqcount custom lockdep class key, name, and export. Use
seqcount_init() inside the dma reservation object initializer.

Signed-off-by: Ahmed S. Darwish 
Reviewed-by: Sebastian Andrzej Siewior 
---
 drivers/dma-buf/dma-resv.c | 9 +
 include/linux/dma-resv.h   | 2 --
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 4264e64788c4..590ce7ad60a0 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -50,12 +50,6 @@
 DEFINE_WD_CLASS(reservation_ww_class);
 EXPORT_SYMBOL(reservation_ww_class);
 
-struct lock_class_key reservation_seqcount_class;
-EXPORT_SYMBOL(reservation_seqcount_class);
-
-const char reservation_seqcount_string[] = "reservation_seqcount";
-EXPORT_SYMBOL(reservation_seqcount_string);
-
 /**
  * dma_resv_list_alloc - allocate fence list
  * @shared_max: number of fences we need space for
@@ -134,9 +128,8 @@ subsys_initcall(dma_resv_lockdep);
 void dma_resv_init(struct dma_resv *obj)
 {
ww_mutex_init(>lock, _ww_class);
+   seqcount_init(>seq);
 
-   __seqcount_init(>seq, reservation_seqcount_string,
-   _seqcount_class);
RCU_INIT_POINTER(obj->fence, NULL);
RCU_INIT_POINTER(obj->fence_excl, NULL);
 }
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index ee50d10f052b..a6538ae7d93f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -46,8 +46,6 @@
 #include 
 
 extern struct ww_class reservation_ww_class;
-extern struct lock_class_key reservation_seqcount_class;
-extern const char reservation_seqcount_string[];
 
 /**
  * struct dma_resv_list - a list of shared fences
-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v1 13/25] dma-buf: Use sequence counter with associated wound/wait mutex

2020-05-22 Thread Ahmed S. Darwish
On Wed, May 20, 2020, Christian König wrote:
> Am 19.05.20 um 23:45 schrieb Ahmed S. Darwish:
> > A sequence counter write side critical section must be protected by some
> > form of locking to serialize writers. If the serialization primitive is
> > not disabling preemption implicitly, preemption has to be explicitly
> > disabled before entering the sequence counter write side critical
> > section.
> >
> > The dma-buf reservation subsystem uses plain sequence counters to manage
> > updates to reservations. Writer serialization is accomplished through a
> > wound/wait mutex.
> >
> > Acquiring a wound/wait mutex does not disable preemption, so this needs
> > to be done manually before and after the write side critical section.
> >
> > Use the newly-added seqcount_ww_mutex_t instead:
> >
> >- It associates the ww_mutex with the sequence count, which enables
> >  lockdep to validate that the write side critical section is properly
> >  serialized.
> >
> >- It removes the need to explicitly add preempt_disable/enable()
> >  around the write side critical section because the write_begin/end()
> >  functions for this new data type automatically do this.
> >
> > If lockdep is disabled this ww_mutex lock association is compiled out
> > and has neither storage size nor runtime overhead.
>
> Mhm, is the dma_resv object the only user of this new seqcount_ww_mutex
> variant ?
>
> If yes we are trying to get rid of this sequence counter for quite some
> time, so I would rather invest the additional time to finish this.
>

In this patch series, each extra "seqcount with associated lock" data
type costs us, exactly:

  - 1 typedef definition, seqcount_ww_mutex_t
  - 1 static initializer, SEQCNT_WW_MUTEX_ZERO()
  - 1 runtime initializer, seqcount_ww_mutex_init()

Definitions for the typedef and the 2 initializers above are
template-code one liners.

The logic which automatically disables preemption upon entering a
seqcount_ww_mutex_t write side critical section is also already shared
with seqcount_mutex_t and any future, preemptible, associated lock.

So, yes, dma-resv is the only user of seqcount_ww_mutex.

But even in that case, given the one liner template code nature of
seqcount_ww_mutex_t logic, it does not make sense to block the dma_resv
and amdgpu change until at some point in the future the sequence counter
is completely removed.

**If and when** the sequence counter gets removed, please just remove
the seqcount_ww_mutex_t data type with it. It will be extremely simple.

> Regards,
> Christian.
>

Thanks,

--
Ahmed S. Darwish
Linutronix GmbH
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v1 00/25] seqlock: Extend seqcount API with associated locks

2020-05-20 Thread Ahmed S. Darwish
Hi,

A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the write side critical section.

There is no built-in debugging mechanism to verify that the lock used
for writer serialization is held and preemption is disabled. Some usage
sites like dma-buf have explicit lockdep checks for the writer-side
lock, but this covers only a small portion of the sequence counter usage
in the kernel.

Add new sequence counter types which allows to associate a lock to the
sequence counter at initialization time. The seqcount API functions are
extended to provide appropriate lockdep assertions depending on the
seqcount/lock type.

For sequence counters with associated locks that do not implicitly
disable preemption, preemption protection is enforced in the sequence
counter write side functions. This removes the need to explicitly add
preempt_disable/enable() around the write side critical sections: the
write_begin/end() functions for these new sequence counter types
automatically do this.

Extend the lockdep API with a macro asserting that preemption is
disabled.  Use it to verify that preemption is disabled for all sequence
counters write side critical sections.

If lockdep is disabled, these lock associations and non-preemptibility
checks are compiled out and have neither storage size nor runtime
overhead. If lockdep is enabled, a pointer to the lock is stored in the
seqcount and the write side API functions enable lockdep assertions.

The following seqcount types with associated locks are introduced:

 seqcount_spinlock_t
 seqcount_raw_spinlock_t
 seqcount_rwlock_t
 seqcount_mutex_t
 seqcount_ww_mutex_t

This lock association is not only useful for debugging purposes, it also
provides a mechanism for PREEMPT_RT to prevent writer starvation. On RT
kernels spinlocks and rwlocks are substituted with sleeping locks and
the code sections protected by these locks become preemptible, which has
the same problem as write side critical section with preemption enabled
on a non-RT kernel. RT utilizes this association by storing the provided
lock pointer and in case that a reader sees an active writer (seqcount
is odd), it does not spin, but blocks on the associated lock similar to
read_seqbegin_or_lock().

By using the lockdep debugging mechanisms added in this patch series, a
number of erroneous seqcount call-sites were discovered across the
kernel. The fixes are included at the beginning of the series.

Thanks,

8<--

Ahmed S. Darwish (25):
  net: core: device_rename: Use rwsem instead of a seqcount
  mm/swap: Don't abuse the seqcount latching API
  net: phy: fixed_phy: Remove unused seqcount
  block: nr_sects_write(): Disable preemption on seqcount write
  u64_stats: Document writer non-preemptibility requirement
  dma-buf: Remove custom seqcount lockdep class key
  lockdep: Add preemption disabled assertion API
  seqlock: lockdep assert non-preemptibility on seqcount_t write
  Documentation: locking: Describe seqlock design and usage
  seqlock: Add RST directives to kernel-doc code samples and notes
  seqlock: Add missing kernel-doc annotations
  seqlock: Extend seqcount API with associated locks
  dma-buf: Use sequence counter with associated wound/wait mutex
  sched: tasks: Use sequence counter with associated spinlock
  netfilter: conntrack: Use sequence counter with associated spinlock
  netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
  xfrm: policy: Use sequence counters with associated lock
  timekeeping: Use sequence counter with associated raw spinlock
  vfs: Use sequence counter with associated spinlock
  raid5: Use sequence counter with associated spinlock
  iocost: Use sequence counter with associated spinlock
  NFSv4: Use sequence counter with associated spinlock
  userfaultfd: Use sequence counter with associated spinlock
  kvm/eventfd: Use sequence counter with associated spinlock
  hrtimer: Use sequence counter with associated raw spinlock

 Documentation/locking/index.rst   |   1 +
 Documentation/locking/seqlock.rst | 239 +
 MAINTAINERS   |   2 +-
 block/blk-iocost.c|   5 +-
 block/blk.h   |   2 +
 drivers/dma-buf/dma-resv.c|  15 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |   2 -
 drivers/md/raid5.c|   2 +-
 drivers/md/raid5.h|   2 +-
 drivers/net/phy/fixed_phy.c   |  25 +-
 fs/dcache.c   |   2 +-
 fs/fs_struct.c|   4 +-
 fs/nfs/nfs4_fs.h  |   2 +-
 fs/nfs/nfs4state.c|   2 +-
 fs/userfaultfd.c  |   4 +-
 incl

[PATCH v1 06/25] dma-buf: Remove custom seqcount lockdep class key

2020-05-20 Thread Ahmed S. Darwish
Commit 3c3b177a9369 ("reservation: add support for read-only access
using rcu") introduced a sequence counter to manage updates to
reservations. Back then, the reservation object initializer
reservation_object_init() was always inlined.

Having the sequence counter initialization inlined meant that each of
the call sites would have a different lockdep class key, which would've
broken lockdep's deadlock detection. The aforementioned commit thus
introduced, and exported, a custom seqcount lockdep class key and name.

The commit 8735f16803f00 ("dma-buf: cleanup reservation_object_init...")
transformed the reservation object initializer to a normal non-inlined C
function. seqcount_init(), which automatically defines the seqcount
lockdep class key and must be called non-inlined, can now be safely used.

Remove the seqcount custom lockdep class key, name, and export. Use
seqcount_init() inside the dma reservation object initializer.

Signed-off-by: Ahmed S. Darwish 
Reviewed-by: Sebastian Andrzej Siewior 
---
 drivers/dma-buf/dma-resv.c | 9 +
 include/linux/dma-resv.h   | 2 --
 2 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 4264e64788c4..590ce7ad60a0 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -50,12 +50,6 @@
 DEFINE_WD_CLASS(reservation_ww_class);
 EXPORT_SYMBOL(reservation_ww_class);
 
-struct lock_class_key reservation_seqcount_class;
-EXPORT_SYMBOL(reservation_seqcount_class);
-
-const char reservation_seqcount_string[] = "reservation_seqcount";
-EXPORT_SYMBOL(reservation_seqcount_string);
-
 /**
  * dma_resv_list_alloc - allocate fence list
  * @shared_max: number of fences we need space for
@@ -134,9 +128,8 @@ subsys_initcall(dma_resv_lockdep);
 void dma_resv_init(struct dma_resv *obj)
 {
ww_mutex_init(>lock, _ww_class);
+   seqcount_init(>seq);
 
-   __seqcount_init(>seq, reservation_seqcount_string,
-   _seqcount_class);
RCU_INIT_POINTER(obj->fence, NULL);
RCU_INIT_POINTER(obj->fence_excl, NULL);
 }
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index ee50d10f052b..a6538ae7d93f 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -46,8 +46,6 @@
 #include 
 
 extern struct ww_class reservation_ww_class;
-extern struct lock_class_key reservation_seqcount_class;
-extern const char reservation_seqcount_string[];
 
 /**
  * struct dma_resv_list - a list of shared fences
-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH v1 13/25] dma-buf: Use sequence counter with associated wound/wait mutex

2020-05-20 Thread Ahmed S. Darwish
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. If the serialization primitive is
not disabling preemption implicitly, preemption has to be explicitly
disabled before entering the sequence counter write side critical
section.

The dma-buf reservation subsystem uses plain sequence counters to manage
updates to reservations. Writer serialization is accomplished through a
wound/wait mutex.

Acquiring a wound/wait mutex does not disable preemption, so this needs
to be done manually before and after the write side critical section.

Use the newly-added seqcount_ww_mutex_t instead:

  - It associates the ww_mutex with the sequence count, which enables
lockdep to validate that the write side critical section is properly
serialized.

  - It removes the need to explicitly add preempt_disable/enable()
around the write side critical section because the write_begin/end()
functions for this new data type automatically do this.

If lockdep is disabled this ww_mutex lock association is compiled out
and has neither storage size nor runtime overhead.

Signed-off-by: Ahmed S. Darwish 
---
 drivers/dma-buf/dma-resv.c   | 8 +---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 --
 include/linux/dma-resv.h | 2 +-
 3 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 590ce7ad60a0..3aba2b2bfc48 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -128,7 +128,7 @@ subsys_initcall(dma_resv_lockdep);
 void dma_resv_init(struct dma_resv *obj)
 {
ww_mutex_init(>lock, _ww_class);
-   seqcount_init(>seq);
+   seqcount_ww_mutex_init(>seq, >lock);
 
RCU_INIT_POINTER(obj->fence, NULL);
RCU_INIT_POINTER(obj->fence_excl, NULL);
@@ -259,7 +259,6 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct 
dma_fence *fence)
fobj = dma_resv_get_list(obj);
count = fobj->shared_count;
 
-   preempt_disable();
write_seqcount_begin(>seq);
 
for (i = 0; i < count; ++i) {
@@ -281,7 +280,6 @@ void dma_resv_add_shared_fence(struct dma_resv *obj, struct 
dma_fence *fence)
smp_store_mb(fobj->shared_count, count);
 
write_seqcount_end(>seq);
-   preempt_enable();
dma_fence_put(old);
 }
 EXPORT_SYMBOL(dma_resv_add_shared_fence);
@@ -308,14 +306,12 @@ void dma_resv_add_excl_fence(struct dma_resv *obj, struct 
dma_fence *fence)
if (fence)
dma_fence_get(fence);
 
-   preempt_disable();
write_seqcount_begin(>seq);
/* write_seqcount_begin provides the necessary memory barrier */
RCU_INIT_POINTER(obj->fence_excl, fence);
if (old)
old->shared_count = 0;
write_seqcount_end(>seq);
-   preempt_enable();
 
/* inplace update, no shared fences */
while (i--)
@@ -393,13 +389,11 @@ int dma_resv_copy_fences(struct dma_resv *dst, struct 
dma_resv *src)
src_list = dma_resv_get_list(dst);
old = dma_resv_get_excl(dst);
 
-   preempt_disable();
write_seqcount_begin(>seq);
/* write_seqcount_begin provides the necessary memory barrier */
RCU_INIT_POINTER(dst->fence_excl, new);
RCU_INIT_POINTER(dst->fence, dst_list);
write_seqcount_end(>seq);
-   preempt_enable();
 
dma_resv_list_free(src_list);
dma_fence_put(old);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 9dff792c9290..87fd32aae8f9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -258,11 +258,9 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
amdgpu_bo *bo,
new->shared_count = k;
 
/* Install the new fence list, seqcount provides the barriers */
-   preempt_disable();
write_seqcount_begin(>seq);
RCU_INIT_POINTER(resv->fence, new);
write_seqcount_end(>seq);
-   preempt_enable();
 
/* Drop the references to the removed fences or move them to ef_list */
for (i = j, k = 0; i < old->shared_count; ++i) {
diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
index a6538ae7d93f..d44a77e8a7e3 100644
--- a/include/linux/dma-resv.h
+++ b/include/linux/dma-resv.h
@@ -69,7 +69,7 @@ struct dma_resv_list {
  */
 struct dma_resv {
struct ww_mutex lock;
-   seqcount_t seq;
+   seqcount_ww_mutex_t seq;
 
struct dma_fence __rcu *fence_excl;
struct dma_resv_list __rcu *fence;
-- 
2.20.1

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm: headers: Add neccessary include files and guards

2019-03-27 Thread Ahmed S. Darwish
Hi,

On Wed, Mar 27, 2019 at 09:30:59AM +0100, Daniel Vetter wrote:
> On Wed, Mar 27, 2019 at 12:23:43AM +0100, Ahmed S. Darwish wrote:
> > Otherwise gcc will complain about unknown types, and declarations
> > inside parameter lists, if "drm_internal.h" is used in C files with
> > less headers than what's now typically done under drivers/gpu/drm/.
> >
> > Signed-off-by: Ahmed S. Darwish 
> > ---
> >
> > Notes:
> > This was triggered by the in-development drm-panic code.
> >
> >  drivers/gpu/drm/drm_crtc_helper_internal.h |  5 +
> >  drivers/gpu/drm/drm_crtc_internal.h|  5 +
> >  drivers/gpu/drm/drm_internal.h | 12 
> >  include/drm/drm_auth.h |  9 +
> >  4 files changed, 31 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/drm_crtc_helper_internal.h 
> > b/drivers/gpu/drm/drm_crtc_helper_internal.h
> > index b5ac1581e623..ee8d8682db09 100644
> > --- a/drivers/gpu/drm/drm_crtc_helper_internal.h
> > +++ b/drivers/gpu/drm/drm_crtc_helper_internal.h
> > @@ -20,6 +20,9 @@
> >   * OTHER DEALINGS IN THE SOFTWARE.
> >   */
> >
> > +#ifndef _DRM_CRTC_HELPER_INTERNAL_H
> > +#define _DRM_CRTC_HELPER_INTERNAL_H
>
> I'm kinda wondering how you managed to include these multiple times?
>

Oh, this was only added for completeness / consistency. The actual
errors triggered are the missing declarations.

> > +
> >  /*
> >   * This header file contains mode setting related functions and definitions
> >   * which are only used within the drm kms helper module as internal
> > @@ -75,3 +78,5 @@ enum drm_mode_status drm_encoder_mode_valid(struct 
> > drm_encoder *encoder,
> > const struct drm_display_mode 
> > *mode);
> >  enum drm_mode_status drm_connector_mode_valid(struct drm_connector 
> > *connector,
> >   struct drm_display_mode *mode);
> > +
> > +#endif /* _DRM_CRTC_HELPER_INTERNAL_H */
> > diff --git a/drivers/gpu/drm/drm_crtc_internal.h 
> > b/drivers/gpu/drm/drm_crtc_internal.h
> > index 216f2a9ee3d4..840c1cb2eb7b 100644
> > --- a/drivers/gpu/drm/drm_crtc_internal.h
> > +++ b/drivers/gpu/drm/drm_crtc_internal.h
> > @@ -25,6 +25,9 @@
> >   * OTHER DEALINGS IN THE SOFTWARE.
> >   */
> >
> > +#ifndef _DRM_CRTC_INTERNAL_H
> > +#define _DRM_CRTC_INTERNAL_H
> > +
> >  /*
> >   * This header file contains mode setting related functions and definitions
> >   * which are only used within the drm module as internal implementation 
> > details
> > @@ -252,3 +255,5 @@ int drm_mode_page_flip_ioctl(struct drm_device *dev,
> >  void drm_mode_fixup_1366x768(struct drm_display_mode *mode);
> >  void drm_reset_display_info(struct drm_connector *connector);
> >  u32 drm_add_display_info(struct drm_connector *connector, const struct 
> > edid *edid);
> > +
> > +#endif /* _DRM_CRTC_INTERNAL_H */
> > diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
> > index 251d67e04c2d..a1b68836b048 100644
> > --- a/drivers/gpu/drm/drm_internal.h
> > +++ b/drivers/gpu/drm/drm_internal.h
> > @@ -21,7 +21,17 @@
> >   * OTHER DEALINGS IN THE SOFTWARE.
> >   */
> >
> > +#ifndef _DRM_INTERNAL_H
> > +#define _DRM_INTERNAL_H
> > +
> > +#include 
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> >  #include 
> > +#include 
>
> If you only need these for the structures used in pointers, then please
> use forward declarations, don't pull in the entire thing.
>
> I'm also wondering whether we actually need drm_ioctl.h, or whether a few
> forward decls would be good enough.
>

Oh, I see, forward decls then .. let's drop this.

> >
> >  #define DRM_IF_MAJOR 1
> >  #define DRM_IF_MINOR 4
> > @@ -191,3 +201,5 @@ int drm_syncobj_signal_ioctl(struct drm_device *dev, 
> > void *data,
> >  void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent,
> > const struct drm_framebuffer *fb);
> >  int drm_framebuffer_debugfs_init(struct drm_minor *minor);
> > +
> > +#endif /* _DRM_INTERNAL_H */
> > diff --git a/include/drm/drm_auth.h b/include/drm/drm_auth.h
> > index 86bff9841b54..a1a59fbf26b1 100644
> > --- a/include/drm/drm_auth.h
> > +++ b/include/drm/drm_auth.h
> > @@ -28,6 +28,15 @@
> >  #ifndef _DRM_AUTH_H_
> >  #define _DRM_AUTH_H_
> >
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include 
> > +
> > +#include 
> > +#include 
>
> Same comment about forward decls. Plus it would be good to also clean up
> drm_auth.c, i.e. drop the drmP.h include and put the drm_auth.h include as
> the very first one. To make sure the header is now working correctly
> stand-alone.
>

will do.

> -Daniel
>

thanks,
--darwi
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

[PATCH] drm: headers: Add neccessary include files and guards

2019-03-26 Thread Ahmed S. Darwish
Otherwise gcc will complain about unknown types, and declarations
inside parameter lists, if "drm_internal.h" is used in C files with
less headers than what's now typically done under drivers/gpu/drm/.

Signed-off-by: Ahmed S. Darwish 
---

Notes:
This was triggered by the in-development drm-panic code.

 drivers/gpu/drm/drm_crtc_helper_internal.h |  5 +
 drivers/gpu/drm/drm_crtc_internal.h|  5 +
 drivers/gpu/drm/drm_internal.h | 12 
 include/drm/drm_auth.h |  9 +
 4 files changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/drm_crtc_helper_internal.h 
b/drivers/gpu/drm/drm_crtc_helper_internal.h
index b5ac1581e623..ee8d8682db09 100644
--- a/drivers/gpu/drm/drm_crtc_helper_internal.h
+++ b/drivers/gpu/drm/drm_crtc_helper_internal.h
@@ -20,6 +20,9 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */

+#ifndef _DRM_CRTC_HELPER_INTERNAL_H
+#define _DRM_CRTC_HELPER_INTERNAL_H
+
 /*
  * This header file contains mode setting related functions and definitions
  * which are only used within the drm kms helper module as internal
@@ -75,3 +78,5 @@ enum drm_mode_status drm_encoder_mode_valid(struct 
drm_encoder *encoder,
const struct drm_display_mode 
*mode);
 enum drm_mode_status drm_connector_mode_valid(struct drm_connector *connector,
  struct drm_display_mode *mode);
+
+#endif /* _DRM_CRTC_HELPER_INTERNAL_H */
diff --git a/drivers/gpu/drm/drm_crtc_internal.h 
b/drivers/gpu/drm/drm_crtc_internal.h
index 216f2a9ee3d4..840c1cb2eb7b 100644
--- a/drivers/gpu/drm/drm_crtc_internal.h
+++ b/drivers/gpu/drm/drm_crtc_internal.h
@@ -25,6 +25,9 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */

+#ifndef _DRM_CRTC_INTERNAL_H
+#define _DRM_CRTC_INTERNAL_H
+
 /*
  * This header file contains mode setting related functions and definitions
  * which are only used within the drm module as internal implementation details
@@ -252,3 +255,5 @@ int drm_mode_page_flip_ioctl(struct drm_device *dev,
 void drm_mode_fixup_1366x768(struct drm_display_mode *mode);
 void drm_reset_display_info(struct drm_connector *connector);
 u32 drm_add_display_info(struct drm_connector *connector, const struct edid 
*edid);
+
+#endif /* _DRM_CRTC_INTERNAL_H */
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 251d67e04c2d..a1b68836b048 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -21,7 +21,17 @@
  * OTHER DEALINGS IN THE SOFTWARE.
  */

+#ifndef _DRM_INTERNAL_H
+#define _DRM_INTERNAL_H
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
 #include 
+#include 

 #define DRM_IF_MAJOR 1
 #define DRM_IF_MINOR 4
@@ -191,3 +201,5 @@ int drm_syncobj_signal_ioctl(struct drm_device *dev, void 
*data,
 void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent,
const struct drm_framebuffer *fb);
 int drm_framebuffer_debugfs_init(struct drm_minor *minor);
+
+#endif /* _DRM_INTERNAL_H */
diff --git a/include/drm/drm_auth.h b/include/drm/drm_auth.h
index 86bff9841b54..a1a59fbf26b1 100644
--- a/include/drm/drm_auth.h
+++ b/include/drm/drm_auth.h
@@ -28,6 +28,15 @@
 #ifndef _DRM_AUTH_H_
 #define _DRM_AUTH_H_

+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+#include 
+
 /*
  * Legacy DRI1 locking data structure. Only here instead of in drm_legacy.h for
  * include ordering reasons.

--
darwi
http://darwish.chasingpointers.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 0/3] drm: Add panic handling

2019-03-17 Thread Ahmed S. Darwish

=> Now that the dust has settled, here's a summary of this huge
   50-email thread (thanks Daniel, Noralf, John, everyone!).

=> Parts of this document are a direct rewording of Daniel's replies,
   so I took the liberty of adding a Co-developed-by tag here..

=> This is only a summary, and _not_ an official patch submission.
   It's now Show-me-the-code time ;-)

Subject: [PATCH] Documentation: gpu: Add initial DRM panic design

Co-developed-by: Daniel Vetter 
Signed-off-by: Ahmed S. Darwish 
---
 Documentation/gpu/drm-panic-design.rst | 124 +
 1 file changed, 124 insertions(+)
 create mode 100644 Documentation/gpu/drm-panic-design.rst

diff --git a/Documentation/gpu/drm-panic-design.rst 
b/Documentation/gpu/drm-panic-design.rst
new file mode 100644
index ..ba306193f347
--- /dev/null
+++ b/Documentation/gpu/drm-panic-design.rst
@@ -0,0 +1,124 @@
+
+
+A DRM-based panic viewer
+
+
+The Linux Kernel currently contains the necessary plumbing for viewing
+a kernel panic log using DRM-based kmsg dumpers, even if the system is
+currently running a graphical session (e.g. wayland).
+
+.. _drm_panic_design:
+
+Implementation design
+=
+
+Code pathes running in a panic context face several constraints:
+
+1. Code must be fully synchronous: interrupts are disabled
+2. Dynamic memory allocations are not allowed
+3. Cannot acquire any mutexes: atomic context, due to 1.
+4. Cannot acquire any spin locks: potential spinning-forever risk
+
+For the *DRM* panic code, the extra conditions below apply:
+
+5. Code must only trylock relevant DRM subsystem locks
+6. If any trylock operation fails, the code *must not* go through
+7. All rendering is done on the GPU's "current display buffer" used
+   at the time of panic(): no HW programming is done at all.
+8. The code must be non-intrusive, and *must not* affect any other
+   panic handling mechanism (netconsole, ramoops, kexec, etc.)
+
+Rationale follows.
+
+
+Spin locks
+--
+
+Acquiring a spin lock in a panic() context is potentially lethal:
+the lock might've been already acquired, _permanently_, by another
+core that is now fully shut down through an IPI from the panic()-ing
+core.
+
+Moreover, at least on x86, the first target architecture for this
+work, the panic()-ing core wait by default for *a full second* until
+all other cores finish their non-preemptible regions and terminate.
+If that did not work out, it even tries more aggressively with NMIs.
+
+So if the other non panic()-ing cores was holding a DRM-related lock
+through spin_lock_irqsave() for more than one second, then it's a
+bug in the DRM layer code. Thus, the DRM panic viewer cannot do
+anything and should just exit.  [A]
+
+What if the non panic()-ing core was holding a DRM lock through
+barebone spin_lock()? Interrupts are enabled there, so the IPI will be
+handled, and thus that core will effectively die while the lock is
+*forever held*.  [B]
+
+
+Trylocking
+--
+
+The DRM panic code always *tries* to acquire the *minimum relevant
+set* of DRM related locks, through the basic :c:func:`spin_trylock()`
+mechanism.
+
+From case [A] and case [B] above, if the trylock operation fails,
+there's no point in retrying it multiple times: the relevant locks
+are in a broken and unrecoverable state anyway.
+
+Moreover, The panic code cannot also just ignore the DRM locks and
+force its way through: a broken non-preemptible DRM region implies
+either invalid SW state (e.g. broken linked list pointers), or a GPU
+in an unknown HW state.
+
+A GPU in an unknown HW state is quite dangerous: it has access to the
+full system memory, and if poked incorrectly, there's a really good
+chance it can kill the entire machine.
+
+
+GPU hardware access
+---
+
+In the current state, a full GPU reset, modesetting, or even disabling
+GPU planes, is not doable under a panic() context: it implies going
+through a potentially huge set of DRM call-chains that cannot be
+sanely verified against the :ref:`drm_panic_design` requirements
+(e.g. memory allocations, spinlocks, etc.).
+
+The current approach is simple: run the minimal amount of code
+necessary to draw pixels onto the current scanout buffers. Instead
+of disabling GPU planes, the biggest visible rectangle is just picked.
+
+*Usually* there should be a main window that is big enough to show the
+oops.
+
+
+CI testing
+--
+
+One of the things that killed earlier linux DRM panic handling efforts,
+beside getting into deep DRM call-chains that cannot be verified, was
+that it couldn't be tested except with real oopses.
+
+The first set of bug reports was whack-a-molde kind of bugs where the
+oops displayed was caused by the DRM panic handler itself instead of
+the real oops causing the panic.
+
+Thus, the :ref:`drm_panic_design` requirements was created. Moreover:
+
+  - Special hooks are added at the spin_lock(

Re: [PATCH v2 1/3] drm: Add support for panic message output

2019-03-13 Thread Ahmed S. Darwish
Hi,

On Wed, Mar 13, 2019 at 09:35:05AM +0100, Daniel Vetter wrote:
> On Tue, Mar 12, 2019 at 11:13:03PM +0100, Ahmed S. Darwish wrote:
> > On Mon, Mar 11, 2019 at 11:33:15PM +0100, Noralf Trønnes wrote:
> > > Den 11.03.2019 20.23, skrev Daniel Vetter:
[……]
> > > >
> > > > class_for_each_device uses klist, which only uses an irqsave spinlock. I
> > > > think that's good enough. Comment to that effect would be good e.g.
> > > >
> > > > /* based on klist, which uses only a spin_lock_irqsave, which we
> > > >  * assume still works */
> > > >
> > > > If we aim for perfect this should be a trylock still, maybe using our 
> > > > own
> > > > device list.
> > > >
> >
> > I definitely agree here.
> >
> > The lock may already be locked either by a stopped CPU, or by the
> > very same CPU we execute panic() on (e.g. NMI panic() on the
> > printing CPU).
> >
> > This is why it's very common for example in serial consoles, which
> > are usually careful about re-entrance and panic contexts, to do:
> >
> >   xx_console_write(...) {
> > if (oops_in_progress)
> > locked = spin_trylock_irqsave(>lock, flags);
> > else
> > spin_lock_irqsave(>lock, flags);
> >   }
> >
> > I'm quite positive we should do the same for panic drm drivers.
>
> Yeah Ideally all the locking in the drm path would be trylock only.
>
> I wonder whether lockdep could help us validate this, with some "don't
> allow anything except trylocks in this context". It's easy to audit the
> core code with review, but drivers are much tougher. And often end up with
> really deep callchains to get at the backing buffers.
[……]
> > > > Instead what I had in mind is to recreate the worst
> > > > possible panic context as much as feasible (disabling interrupts should 
> > > > be
> > > > a good start, maybe we can even do an nmi callback), and then call our
> > > > panic implementation. That way we can test the panic handler in a
> > > > non-destructive way (i.e. aside from last dmesg lines printed to the
> > > > screen nothing bad happens to the kernel: No real panic, no oops, no
> > > > tainting).
> > >
> > > The interrupt case I can do, nmi I have no idea.
> > >
> >
> > I agree too. Disabling interrupts + CONFIG_DEBUG_ATOMIC_SLEEP
> > would be a nice non-destructive test-case emulation.
>
> See above, if we can somehow emulate "all locks are held, only allow
> trylock" with lockdep that would be great too.
>

H, to prototype things faster, I'll do it as a hook:

   spin_lock(spinlock_t *lock)
   {
might_spin_forever_under_panic_path();
...
   }

Just like how mutexes and DEBUG_ATOMIC_SLEEP do it today:

   void mutex_lock(struct mutex *lock)
   {
might_sleep();
...
   }

Hopefully the locking maintainers can accept something like this.
If not, let's play with lockdep (which is a bit complex).

===> Thanks to Sebastian for suggesting this!

> Plus nmi context, once/if that somehow becomes relevant.
>
> The thing that killed the old drm panic handling code definitely was that
> we flat out couldnt' test it except with real oopses. And that's just
> whack-a-mole and bug reporter frustration if you first have a few patch
> iterations around "oh sry, it's still some oops in the panic handler, not
> the first one that we're seeing" :-/
>

Oh, that's a critical point:

   full data > some data > no data > invalid/useless data

First impressions for the feature will definitely count. Hopefully
the hooks above and some heavy DRM CI testing __beforehand__ can
improve things before publishing to a wider audience..

Cheers,

--
darwi
http://darwish.chasingpointers.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 1/3] drm: Add support for panic message output

2019-03-13 Thread Ahmed S. Darwish

[[ Adding Sebastian, who is quite experienced in intricate
   locking situations due to daily PREEMPT_RT work.. ]]

On Wed, Mar 13, 2019 at 09:37:10AM +0100, Daniel Vetter wrote:
> On Wed, Mar 13, 2019 at 08:49:17AM +0100, John Ogness wrote:
> > On 2019-03-12, Ahmed S. Darwish  wrote:
> > > On Wed, Mar 13, 2019 at 09:37:10AM +0100, Daniel Vetter wrote:
[……]
> > > >
> > > > If we aim for perfect this should be a trylock still, maybe using
> > > > our own device list.
> > > >
> > >
> > > I definitely agree here.
> > >
> > > The lock may already be locked either by a stopped CPU, or by the
> > > very same CPU we execute panic() on (e.g. NMI panic() on the
> > > printing CPU).
> > >
> > > This is why it's very common for example in serial consoles, which
> > > are usually careful about re-entrance and panic contexts, to do:
> > >
> > >   xx_console_write(...) {
> > >   if (oops_in_progress)
> > >   locked = spin_trylock_irqsave(>lock, flags);
> > >   else
> > >   spin_lock_irqsave(>lock, flags);
> > >   }
> > >
> > > I'm quite positive we should do the same for panic drm drivers.
> >
> > This construction will continue, even if the trylock fails. It only
> > makes sense to do this if the driver has a chance of being
> > successful. Ignoring locks is a serious issue. I personally am quite
> > unhappy that the serial drivers do this, which was part of my motivation
> > for the new printk design I'm working on.
> >
> > If the driver is not capable of doing something useful on a failed
> > trylock, then I recommend just skipping that device. Maybe trying it
> > again later after trying all the devices?
>
> Ah yes missed that. If the trylock fails anywhere, we must bail out.
>
> Not sure retrying is useful, my experience from at least drm is that
> either you're lucky, and drm wasn't doing anything right when the machine
> blew up, and then the trylocks will all go through. Or you're unlucky, and
> most likely that means drm itself blew up, and no amount of retrying is
> going to help. I wouldn't bother.
>

Ack, retrying most probably won't help.

I see that, at least in x86:

  kernel/panic.c::panic()
=> kernel/panic.c::smp_send_stop()
  => arch/x86/incluse/smp.h::smp_send_stop(0)
=> arch/x86/kernel/smp.c::native_stop_other_cpus(wait)
  /*
   * Don't wait longer than a second if the caller
   * didn't ask us to wait.
   */
   timeout = USEC_PER_SEC;
   while (num_online_cpus() > 1 && (wait || timeout--))
   udelay(1);

So the panic()-ing core wait by default for _a full second_ until
all other cores finish their non-preemptible regions. If that did
not work out, it even tries more aggressively with NMIs.

So if one of the other non panic()-ing cores was holding a DRM
related lock through spin_lock_irqsave() for more than one second,
well, it's a bug in the DRM layer code anyway..  [A]

Problem is, what will happen if the non panic()-ing core was
holding a DRM lock through barebone spin_lock()? Interrupts are
enabled there, so the IPI will be handled, and thus that core will
effectively die while the lock is forever held :-(   [B]

But, well, yeah, I don't think there's a solution for the [B] part
except by spin_trylock() the fewest amount of locks possible, and
bailing out if anyone of them is held.

For reference, I asked over IRC why not just resetting GPU state,
but it's not easy as it sounds to people outside of the gfx world:

  [abridged]
So it seems Windows is forcing driver writers to
   implement a gpu reset method [*]
since Windows vista
   yeah
   not going to do that from panic
   there's a non-zero chance that gpu reset takes down
   the system with at least lots of hw/driver combos
   I think all our drivers also have gpu reset support
   if something is still rendering while we panic, mea culpa
   only way around that is switching planes
   which we tried with the old approach, and that's just
   too hard
   you end up running a few 100kloc of code worst case
   no way to audit/validate that for panic context
   but the rendering shouldn't really matter
   yeah, actually modesetting in a panic seems hopeless.
   at least for modern compositors
   anholt, yeah I killed that idea
   so they only render to the backbuffer
   and we won't show that with a panic
   yep.  just draw on a plane.  for everything but
   uncomposited x11, it'll already be idle.
   so only really an issue on boot splash and frontbuffer x11
   and there a few cursors/characters drawn won't really
   matter in most cases
   anholt, yeah that's t

Re: [PATCH v2 1/3] drm: Add support for panic message output

2019-03-13 Thread Ahmed S. Darwish
Hi,

[[ CCing John for the trylock parts ]]

On Mon, Mar 11, 2019 at 11:33:15PM +0100, Noralf Trønnes wrote:
>
>
> Den 11.03.2019 20.23, skrev Daniel Vetter:
> > On Mon, Mar 11, 2019 at 06:42:16PM +0100, Noralf Trønnes wrote:
> >> This adds support for outputting kernel messages on panic().
> >> A kernel message dumper is used to dump the log. The dumper iterates
> >> over each DRM device and it's crtc's to find suitable framebuffers.
> >>
> >> All the other dumpers are run before this one except mtdoops.
> >> Only atomic drivers are supported.
> >>
> >> Signed-off-by: Noralf Trønnes 
> >
> > Bunch of comments/ideas for you or Darwish below, whoever picks this up.
>
> Actually it would ne nice if Darwish could pick it up since he will do
> it on i915 which will be useful to a much broader audience.
> If not I'll respin when I'm done with the drm_fb_helper refactoring.
>

Yup, I'll be more than happy to do this.. while preserving all of
Noralf's authorship and copyright notices of course.

I guess it can be:

  - Handle the comments posted by Daniel and others (I'll post
some questions too)

  - Add the necessary i915 specific bits

  - Test, post v3/v4/../vn. Rinse and repeat. Keep it local at
dri-devel until getting the necessary S-o-Bs.

  - Post to wider audience (some feedback from distribution folks
would also be nice, before posting to lkml)

More comments below..

[...]

> >> +
> >> +static void drm_panic_kmsg_dump(struct kmsg_dumper *dumper,
> >> +  enum kmsg_dump_reason reason)
> >> +{
> >> +  class_for_each_device(drm_class, NULL, dumper, drm_panic_dev_iter);
> >
> > class_for_each_device uses klist, which only uses an irqsave spinlock. I
> > think that's good enough. Comment to that effect would be good e.g.
> >
> > /* based on klist, which uses only a spin_lock_irqsave, which we
> >  * assume still works */
> >
> > If we aim for perfect this should be a trylock still, maybe using our own
> > device list.
> >

I definitely agree here.

The lock may already be locked either by a stopped CPU, or by the
very same CPU we execute panic() on (e.g. NMI panic() on the
printing CPU).

This is why it's very common for example in serial consoles, which
are usually careful about re-entrance and panic contexts, to do:

  xx_console_write(...) {
if (oops_in_progress)
locked = spin_trylock_irqsave(>lock, flags);
else
spin_lock_irqsave(>lock, flags);
  }

I'm quite positive we should do the same for panic drm drivers.
John?

> >> +}
> >> +
> >> +static struct kmsg_dumper drm_panic_kmsg_dumper = {
> >> +  .dump = drm_panic_kmsg_dump,
> >> +  .max_reason = KMSG_DUMP_PANIC,
> >> +};
> >> +
> >> +static ssize_t drm_panic_file_panic_write(struct file *file,
> >> +const char __user *user_buf,
> >> +size_t count, loff_t *ppos)
> >> +{
> >> +  unsigned long long val;
> >> +  char buf[24];
> >> +  size_t size;
> >> +  ssize_t ret;
> >> +
> >> +  size = min(sizeof(buf) - 1, count);
> >> +  if (copy_from_user(buf, user_buf, size))
> >> +  return -EFAULT;
> >> +
> >> +  buf[size] = '\0';
> >> +  ret = kstrtoull(buf, 0, );
> >> +  if (ret)
> >> +  return ret;
> >> +
> >> +  drm_panic_kmsg_dumper.max_reason = KMSG_DUMP_OOPS;
> >> +  wmb();
> >> +
> >> +  /* Do a real test with: echo c > /proc/sysrq-trigger */
> >> +
> >> +  if (val == 0) {
> >> +  pr_info("Test panic screen using kmsg_dump(OOPS)\n");
> >> +  kmsg_dump(KMSG_DUMP_OOPS);
> >> +  } else if (val == 1) {
> >> +  char *null_pointer = NULL;
> >> +
> >> +  pr_info("Test panic screen using NULL pointer dereference\n");
> >> +  *null_pointer = 1;
> >> +  } else {
> >> +  return -EINVAL;
> >> +  }
> >
> > This isn't quite what I had in mind, since it still kills the kernel (like
> > sysrq-trigger).
>
> If val == 0, it doesn't kill the kernel, it only dumps the kernel log.
> And it doesn't taint the kernel either.
>
> > Instead what I had in mind is to recreate the worst
> > possible panic context as much as feasible (disabling interrupts should be
> > a good start, maybe we can even do an nmi callback), and then call our
> > panic implementation. That way we can test the panic handler in a
> > non-destructive way (i.e. aside from last dmesg lines printed to the
> > screen nothing bad happens to the kernel: No real panic, no oops, no
> > tainting).
>
> The interrupt case I can do, nmi I have no idea.
>

I agree too. Disabling interrupts + CONFIG_DEBUG_ATOMIC_SLEEP
would be a nice non-destructive test-case emulation.

thanks!

--
darwi
http://darwish.chasingpointers.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 1/3] drm: Add support for panic message output

2019-03-12 Thread Ahmed S. Darwish
Hi,

[[ CCing John for the trylock parts ]]

On Mon, Mar 11, 2019 at 11:33:15PM +0100, Noralf Trønnes wrote:
>
> Den 11.03.2019 20.23, skrev Daniel Vetter:
> > On Mon, Mar 11, 2019 at 06:42:16PM +0100, Noralf Trønnes wrote:
> >> This adds support for outputting kernel messages on panic().
> >> A kernel message dumper is used to dump the log. The dumper iterates
> >> over each DRM device and it's crtc's to find suitable framebuffers.
> >>
> >> All the other dumpers are run before this one except mtdoops.
> >> Only atomic drivers are supported.
> >>
> >> Signed-off-by: Noralf Trønnes 
> >
> > Bunch of comments/ideas for you or Darwish below, whoever picks this up.
>
> Actually it would ne nice if Darwish could pick it up since he will do
> it on i915 which will be useful to a much broader audience.
> If not I'll respin when I'm done with the drm_fb_helper refactoring.
>

Yup, I'll be more than happy to do this.. while preserving all of
Noralf's authorship and copyright notices of course.

I guess it can be:

  - Handle the comments posted by Daniel and others (I'll post
some questions too)

  - Add the necessary i915 specific bits

  - Test, post v3/v4/../vn. Rinse and repeat. Keep it local at
dri-devel until getting the necessary S-o-Bs.

  - Post to wider audience (some feedback from distribution folks
would also be nice, before posting to lkml)

More comments below..

[...]

> >> +
> >> +static void drm_panic_kmsg_dump(struct kmsg_dumper *dumper,
> >> +  enum kmsg_dump_reason reason)
> >> +{
> >> +  class_for_each_device(drm_class, NULL, dumper, drm_panic_dev_iter);
> >
> > class_for_each_device uses klist, which only uses an irqsave spinlock. I
> > think that's good enough. Comment to that effect would be good e.g.
> >
> > /* based on klist, which uses only a spin_lock_irqsave, which we
> >  * assume still works */
> >
> > If we aim for perfect this should be a trylock still, maybe using our own
> > device list.
> >

I definitely agree here.

The lock may already be locked either by a stopped CPU, or by the
very same CPU we execute panic() on (e.g. NMI panic() on the
printing CPU).

This is why it's very common for example in serial consoles, which
are usually careful about re-entrance and panic contexts, to do:

  xx_console_write(...) {
if (oops_in_progress)
locked = spin_trylock_irqsave(>lock, flags);
else
spin_lock_irqsave(>lock, flags);
  }

I'm quite positive we should do the same for panic drm drivers.
John?

> >> +}
> >> +
> >> +static struct kmsg_dumper drm_panic_kmsg_dumper = {
> >> +  .dump = drm_panic_kmsg_dump,
> >> +  .max_reason = KMSG_DUMP_PANIC,
> >> +};
> >> +
> >> +static ssize_t drm_panic_file_panic_write(struct file *file,
> >> +const char __user *user_buf,
> >> +size_t count, loff_t *ppos)
> >> +{
> >> +  unsigned long long val;
> >> +  char buf[24];
> >> +  size_t size;
> >> +  ssize_t ret;
> >> +
> >> +  size = min(sizeof(buf) - 1, count);
> >> +  if (copy_from_user(buf, user_buf, size))
> >> +  return -EFAULT;
> >> +
> >> +  buf[size] = '\0';
> >> +  ret = kstrtoull(buf, 0, );
> >> +  if (ret)
> >> +  return ret;
> >> +
> >> +  drm_panic_kmsg_dumper.max_reason = KMSG_DUMP_OOPS;
> >> +  wmb();
> >> +
> >> +  /* Do a real test with: echo c > /proc/sysrq-trigger */
> >> +
> >> +  if (val == 0) {
> >> +  pr_info("Test panic screen using kmsg_dump(OOPS)\n");
> >> +  kmsg_dump(KMSG_DUMP_OOPS);
> >> +  } else if (val == 1) {
> >> +  char *null_pointer = NULL;
> >> +
> >> +  pr_info("Test panic screen using NULL pointer dereference\n");
> >> +  *null_pointer = 1;
> >> +  } else {
> >> +  return -EINVAL;
> >> +  }
> >
> > This isn't quite what I had in mind, since it still kills the kernel (like
> > sysrq-trigger).
>
> If val == 0, it doesn't kill the kernel, it only dumps the kernel log.
> And it doesn't taint the kernel either.
>
> > Instead what I had in mind is to recreate the worst
> > possible panic context as much as feasible (disabling interrupts should be
> > a good start, maybe we can even do an nmi callback), and then call our
> > panic implementation. That way we can test the panic handler in a
> > non-destructive way (i.e. aside from last dmesg lines printed to the
> > screen nothing bad happens to the kernel: No real panic, no oops, no
> > tainting).
>
> The interrupt case I can do, nmi I have no idea.
>

I agree too. Disabling interrupts + CONFIG_DEBUG_ATOMIC_SLEEP
would be a nice non-destructive test-case emulation.

thanks!

--
darwi
http://darwish.chasingpointers.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v2 1/3] drm: Add support for panic message output

2019-03-12 Thread Ahmed S. Darwish
Hi,

On Tue, Mar 12, 2019 at 11:58:10AM +0100, Daniel Vetter wrote:
> On Mon, Mar 11, 2019 at 11:33:15PM +0100, Noralf Trønnes wrote:
> >
> >
> > Den 11.03.2019 20.23, skrev Daniel Vetter:
> > > On Mon, Mar 11, 2019 at 06:42:16PM +0100, Noralf Trønnes wrote:
[...]
> > >> +}
> > >> +
> > >> +static struct kmsg_dumper drm_panic_kmsg_dumper = {
> > >> +.dump = drm_panic_kmsg_dump,
> > >> +.max_reason = KMSG_DUMP_PANIC,
> > >> +};
> > >> +
> > >> +static ssize_t drm_panic_file_panic_write(struct file *file,
> > >> +  const char __user *user_buf,
> > >> +  size_t count, loff_t *ppos)
> > >> +{
> > >> +unsigned long long val;
> > >> +char buf[24];
> > >> +size_t size;
> > >> +ssize_t ret;
> > >> +
> > >> +size = min(sizeof(buf) - 1, count);
> > >> +if (copy_from_user(buf, user_buf, size))
> > >> +return -EFAULT;
> > >> +
> > >> +buf[size] = '\0';
> > >> +ret = kstrtoull(buf, 0, );
> > >> +if (ret)
> > >> +return ret;
> > >> +
> > >> +drm_panic_kmsg_dumper.max_reason = KMSG_DUMP_OOPS;
> > >> +wmb();
> > >> +
> > >> +/* Do a real test with: echo c > /proc/sysrq-trigger */
> > >> +
> > >> +if (val == 0) {
> > >> +pr_info("Test panic screen using kmsg_dump(OOPS)\n");
> > >> +kmsg_dump(KMSG_DUMP_OOPS);
> > >> +} else if (val == 1) {
> > >> +char *null_pointer = NULL;
> > >> +
> > >> +pr_info("Test panic screen using NULL pointer 
> > >> dereference\n");
> > >> +*null_pointer = 1;
> > >> +} else {
> > >> +return -EINVAL;
> > >> +}
>> > >
> > > This isn't quite what I had in mind, since it still kills the kernel (like
> > > sysrq-trigger).
> >
> > If val == 0, it doesn't kill the kernel, it only dumps the kernel log.
> > And it doesn't taint the kernel either.
>
> Ah I didn't realize that. Sounds like a good option to keep.
>
> > > Instead what I had in mind is to recreate the worst
> > > possible panic context as much as feasible (disabling interrupts should be
> > > a good start, maybe we can even do an nmi callback), and then call our
> > > panic implementation. That way we can test the panic handler in a
> > > non-destructive way (i.e. aside from last dmesg lines printed to the
> > > screen nothing bad happens to the kernel: No real panic, no oops, no
> > > tainting).
> >
> > The interrupt case I can do, nmi I have no idea.
>
> I just read the printk nmi code again and it looks like there's now even
> more special handling for issues happening in nmi context, so we should
> never see an oops from nmi context. So local_irq_disable() wrapping should
> be good enough for testing.
> -Daniel
>

Hmmm, John is on a mission to change that soon:

  - 
https://lore.kernel.org/lkml/20190212143003.48446-1-john.ogn...@linutronix.de

  - https://lwn.net/Articles/780556

A v2 is on the way, and there's a lot of interest from different
groups, including RT, to get that in.

So I guess it would be nice to cover NMI contexts from the start,
not necessarily from a non-destructive CI testing perspective, but
at least while doing normal tests and code reviews.

(Is it possible to do non-destructive NMI tests? I don't know..)

thanks!

--
darwi
http://darwish.chasingpointers.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: DRM-based Oops viewer

2019-03-12 Thread Ahmed S. Darwish
Hello Jani,

On Mon, Mar 11, 2019 at 11:04:19AM +0200, Jani Nikula wrote:
> On Sun, 10 Mar 2019, "Ahmed S. Darwish"  wrote:
> > Hello DRM/UEFI maintainers,
> >
> > Several years ago, I wrote a set of patches to dump the kernel
> > log to disk upon panic -- through BIOS INT 0x13 services. [1]
> >
> > The overwhelming response was that it's unsafe to do this in a
> > generic manner. Linus proposed a video-based viewer instead: [2]
> >
> > If you want to do the BIOS services thing, do it for video: copy the
> > oops to low RAM, return to real mode, re-run the graphics card POST
> > routines to initialize text-mode, and use the BIOS to print out the
> > oops.  That is WAY less scary than writing to disk.
> >
> > Of course it's 2019 now though, and it's quite known that
> > Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
> >
> > Researching whether this can be done from UEFI, it was also clear
> > that UEFI "Runtime Services" do not provide any re-initialization
> > routines. [4]
> >
> > The maximum possible that UEFI can provide is a GOP-provided
> > framebuffer that's ready to use by the OS -- even after the UEFI
> > boot phase is marked as done through ExitBootServices(). [5]
> >
> > Of course, once native drivers like i915 or radeon take over,
> > such a framebuffer is toast... [6]
> >
> > Thus a possible remaining option, is to display the oops through
> > "minimal" DRM drivers provided for each HW variant... Since
> > these special drivers will run only and fully under a panic()
> > context though, several constraints exist:
> >
> >   - The code should be fully synchronous (irqs are disabled)
> >   - It should not allocate any dynamic memory
> >   - It should make minimal assumptions about HW state
> >   - It should not chain into any other kernel subsystem
> >   - It has ample freedom to use delay-based loops and the
> > like, the kernel is already dead.
> >
> > How feasible is it to have such a special "DRM viewoops"
> > framework + its minimal drivers in the kernel?
>
> Please first better define what you want to achieve.
>

Oh I thought this was clear..

What I want to achieve is:

  - for normal day-to-day x86 laptops users
  - properly inform the user when a kernel panic happens during
a running graphical session (e.g. wayland/gnome/kde/...).

The current situation is dismal: the screen _just freezes_, and
users are left wondering what the heck has really happened to
their system (?)

Some out-of-the-box notification mechanism, for everyday distros
like Fedora and Ubuntu that can be enabled by default, would
improve the situation considerably..

Yes, there are many _developer_ features that can mitigate the
issue somewhat, but they're not really useful for everyday "normal"
usage:

  - netconsole is definitely not an option. It implies a lab
setting where two machines are always connected through a
network.

  - ramoops is _completely irrelevant_ for normal users. It's
mostly for embedded systems and the like; requires intimate
knowledge of the hardware by the user translated into DT
bindings or special platform_data struct..

  - kexec/kdump partially solves the save-to-disk problem, but
doesn't solve the user notification part..

Maybe a new "kexec/kview" solution can be useful, but
distributions don't enable kexec/k* solutions _by default_.

Maybe they fear the extra burden of maintaining two kernels
at the same time? or the requirement of reserving memory
for the crashkernel beforehand?

Linux should not just "freeze the screen" upon panic, even
if a crashkernel is not present .. _some_ sane default
built-in user notification mechanism should be there.

  - efivars are neat, they partially solve the save-to-disk
problem, but does not solve the user notification problem.

Moreover, they always carry the risk of bricking laptops..

> Do you want to store the dmesg or oops (like your original series
> suggests) or do you want to display the oops?

The original save-to-disk series was only shown for context.
This is a pure display solution; no disk is invovled at all.

> Do you want the facility to be functioning at all times, or only
> when specifically requested in advance by the user?

At all times, as a basic "sane default" for laptop-oriented
distributions to enable (think ubuntu, fedora, mint, etc.)

> If you want to display the oops, do you want it to
> also work when the display is disabled at the time of the oops?

If the screen is disabled, then this is definitely out of scope.

This only deals with classical laptop usage scenarios, w

Re: DRM-based Oops viewer

2019-03-12 Thread Ahmed S. Darwish
On Mon, Mar 11, 2019 at 02:49:41PM +0100, Daniel Vetter wrote:
> On Mon, Mar 11, 2019 at 11:04:19AM +0200, Jani Nikula wrote:
> > On Sun, 10 Mar 2019, "Ahmed S. Darwish"  wrote:
> > > Hello DRM/UEFI maintainers,
> > >
> > > Several years ago, I wrote a set of patches to dump the kernel
> > > log to disk upon panic -- through BIOS INT 0x13 services. [1]
> > >
> > > The overwhelming response was that it's unsafe to do this in a
> > > generic manner. Linus proposed a video-based viewer instead: [2]
> > >
> > > If you want to do the BIOS services thing, do it for video: copy the
> > > oops to low RAM, return to real mode, re-run the graphics card POST
> > > routines to initialize text-mode, and use the BIOS to print out the
> > > oops.  That is WAY less scary than writing to disk.
> > >
> > > Of course it's 2019 now though, and it's quite known that
> > > Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]
> > >
> > > Researching whether this can be done from UEFI, it was also clear
> > > that UEFI "Runtime Services" do not provide any re-initialization
> > > routines. [4]
> > >
> > > The maximum possible that UEFI can provide is a GOP-provided
> > > framebuffer that's ready to use by the OS -- even after the UEFI
> > > boot phase is marked as done through ExitBootServices(). [5]
> > >
> > > Of course, once native drivers like i915 or radeon take over,
> > > such a framebuffer is toast... [6]
> > >
> > > Thus a possible remaining option, is to display the oops through
> > > "minimal" DRM drivers provided for each HW variant... Since
> > > these special drivers will run only and fully under a panic()
> > > context though, several constraints exist:
> > >
> > >   - The code should be fully synchronous (irqs are disabled)
> > >   - It should not allocate any dynamic memory
> > >   - It should make minimal assumptions about HW state
> > >   - It should not chain into any other kernel subsystem
> > >   - It has ample freedom to use delay-based loops and the
> > > like, the kernel is already dead.
> > >
> > > How feasible is it to have such a special "DRM viewoops"
> > > framework + its minimal drivers in the kernel?
> >
> > Please first better define what you want to achieve.
> >
> > Do you want to store the dmesg or oops (like your original series
> > suggests) or do you want to display the oops? Do you want the facility
> > to be functioning at all times, or only when specifically requested in
> > advance by the user? If you want to display the oops, do you want it to
> > also work when the display is disabled at the time of the oops? What if
> > the display is at attached to a port on a dock?
> >
> > There's at least kdump, ramoops, and netconsole that can be used to
> > achieve some of what you want. How do they fall short for you?
>
> Assuming the use-case is to get an oops to display on a kms driver, we do
> have a fairly comprehensive plan of what that's should look like:
>
> https://dri.freedesktop.org/docs/drm/gpu/todo.html#make-panic-handling-work
>
> This takes into account all the failed previous attempts at trying to get
> an oops to display. It's conceptually a match with your viewoops framework
> I think.

Thanks a lot Daniel for the reference! Yup, this is a conceptual
match indeed!

It's great to know that at the maintainer level there is some
agreement on, awareness of, and plans for, the general topic...

I'll jump to Noralf Trønnes's just-posted patches then and see how
to move from there :)

all the best,

--darwi
http://darwish.chasingpointers.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

DRM-based Oops viewer

2019-03-11 Thread Ahmed S. Darwish
Hello DRM/UEFI maintainers,

Several years ago, I wrote a set of patches to dump the kernel
log to disk upon panic -- through BIOS INT 0x13 services. [1]

The overwhelming response was that it's unsafe to do this in a
generic manner. Linus proposed a video-based viewer instead: [2]

If you want to do the BIOS services thing, do it for video: copy the
oops to low RAM, return to real mode, re-run the graphics card POST
routines to initialize text-mode, and use the BIOS to print out the
oops.  That is WAY less scary than writing to disk.

Of course it's 2019 now though, and it's quite known that
Intel is officially obsoleting the PC/AT BIOS by 2020.. [3]

Researching whether this can be done from UEFI, it was also clear
that UEFI "Runtime Services" do not provide any re-initialization
routines. [4]

The maximum possible that UEFI can provide is a GOP-provided
framebuffer that's ready to use by the OS -- even after the UEFI
boot phase is marked as done through ExitBootServices(). [5]

Of course, once native drivers like i915 or radeon take over,
such a framebuffer is toast... [6]

Thus a possible remaining option, is to display the oops through
"minimal" DRM drivers provided for each HW variant... Since
these special drivers will run only and fully under a panic()
context though, several constraints exist:

  - The code should be fully synchronous (irqs are disabled)
  - It should not allocate any dynamic memory
  - It should make minimal assumptions about HW state
  - It should not chain into any other kernel subsystem
  - It has ample freedom to use delay-based loops and the
like, the kernel is already dead.

How feasible is it to have such a special "DRM viewoops"
framework + its minimal drivers in the kernel?

The target is to start from i915, since that's what in my
laptop now, and work from there..

Some final notes:

  - The NT kernel has a similar concept, but for storage instead.
They're used to dump core under kernel panic() situations,
and are called "Minoport storage drivers". [7]

  - Since Windows 7+, a very fancy Blue Screen of Death is
displayed, with Unicode and whatnot, implying GPU drivers
involvement. [8]

  - Mac OS X also does something similar [9]

  - On Linux laptops, the current situation is _really_ bad.

In any graphical session, type "echo c > /proc/sysrq-trigger";
the screen will just completely freeze...

Desired first goal: just print the panic() log

Thanks a lot,

[1] https://lore.kernel.org/lkml/20110125134748.GA10051@laptop
[2] 
https://lore.kernel.org/lkml/AANLkTinU0KYiCd4p=z+=ojbkeeot2g+cayvdru02k...@mail.gmail.com

[3] 
https://uefi.org/sites/default/files/resources/Brian_Richardson_Intel_Final.pdf

[4] UEFI v2.7 spec, Chapter 8, "Services — Runtime Services"
[5] UEFI v2.7 spec, Section 12.9, "Graphics Output Protocol"
"The Graphics Output Protocol supports this capability by
 providing the EFI OS loader access to a hardware frame buffer
 and enough information to allow the OS to draw directly to
 the graphics output device."

[6] linux/drivers/gpu/drm/i915/i915_drv.c::i915_kick_out_firmware_fb()
linux/drivers/gpu/drm/radeon/radeon_drv.c::radeon_pci_probe()

[7] 
https://docs.microsoft.com/en-us/windows-hardware/drivers/storage/restrictions-on-miniport-drivers-that-manage-the-boot-drive

[8] 
https://upload.wikimedia.org/wikipedia/commons/archive/5/56/20181019151937%21Bsodwindows10.png
[9] 
https://upload.wikimedia.org/wikipedia/commons/4/4a/Mac_OS_X_10.2_Kernel_Panic.jpg

--darwi
http://darwish.chasingpointers.com
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel