Den 09.05.2022 10.32, skrev Thomas Zimmermann:
> Hi Noralf
>
> Am 06.05.22 um 16:11 schrieb Thomas Zimmermann:
>> Hi
>>
>> Am 06.05.22 um 16:01 schrieb Noralf Trønnes:
>>> Hi Thomas,
>>>
>>> I'm getting this on Ubuntu 22.04:
>>>
>>> [ 0.000000] Linux version 5.15.0-27-generic (buildd@ubuntu) (gcc
>>> (Ubuntu 11.2.0-19ubuntu1) 11.2.0, GNU ld (GNU Binutils for Ubuntu) 2.38)
>>> #28-Ubuntu SMP Thu Apr 14 04:55:28 UTC 2022 (Ubuntu 5.15.0-27.28-generic
>>> 5.15.30)
>>>
>>> [ 4.830866] usb 2-3.1: new high-speed USB device number 4 using
>>> xhci_hcd
>>> [ 4.935546] usb 2-3.1: New USB device found, idVendor=1d50,
>>> idProduct=614d, bcdDevice= 1.00
>>> [ 4.935553] usb 2-3.1: New USB device strings: Mfr=1, Product=2,
>>> SerialNumber=3
>>> [ 4.935556] usb 2-3.1: Product: Raspberry Pi 4 Display Gadget
>>> [ 4.935558] usb 2-3.1: Manufacturer: Raspberry Pi
>>> [ 4.935560] usb 2-3.1: SerialNumber: 100000003b40d6c6
>>>
>>> [ 7.497361] [drm] Initialized gud 1.0.0 20200422 for 2-3.1:1.0 on
>>> minor 0
>>>
>>> [ 7.573048] gud 2-3.1:1.0: [drm] fb1: guddrmfb frame buffer device
>>>
>>> [ 9.199402]
>>> ================================================================================
>>>
>>> [ 9.199411] UBSAN: invalid-load in
>>> /build/linux-HMZHpV/linux-5.15.0/include/linux/dma-buf-map.h:224:9
>>> [ 9.199416] load of value 226 is not a valid value for type '_Bool'
>>> [ 9.199420] CPU: 0 PID: 113 Comm: kworker/0:2 Not tainted
>>> 5.15.0-27-generic #28-Ubuntu
>>> [ 9.199424] Hardware name: Hewlett-Packard HP EliteBook 820 G1/1991,
>>> BIOS L71 Ver. 01.44 04/12/2018
>>> [ 9.199427] Workqueue: events_long gud_flush_work [gud]
>>> [ 9.199440] Call Trace:
>>> [ 9.199443] <TASK>
>>> [ 9.199447] show_stack+0x52/0x58
>>> [ 9.199456] dump_stack_lvl+0x4a/0x5f
>>> [ 9.199464] dump_stack+0x10/0x12
>>> [ 9.199468] ubsan_epilogue+0x9/0x45
>>> [ 9.199473] __ubsan_handle_load_invalid_value.cold+0x44/0x49
>>> [ 9.199478] drm_gem_fb_vmap.cold+0x10/0x3d [drm_kms_helper]
>>> [ 9.199519] gud_prep_flush+0xaa/0x410 [gud]
>>> [ 9.199525] ? check_preempt_curr+0x5d/0x70
>>> [ 9.199533] ? update_load_avg+0x82/0x620
>>> [ 9.199540] ? set_next_entity+0xb7/0x200
>>> [ 9.199545] gud_flush_work+0x1e0/0x430 [gud]
>>> [ 9.199551] ? psi_task_switch+0x1e7/0x220
>>> [ 9.199557] process_one_work+0x22b/0x3d0
>>> [ 9.199564] worker_thread+0x53/0x410
>>> [ 9.199570] ? process_one_work+0x3d0/0x3d0
>>> [ 9.199575] kthread+0x12a/0x150
>>> [ 9.199579] ? set_kthread_struct+0x50/0x50
>>> [ 9.199584] ret_from_fork+0x22/0x30
>>> [ 9.199593] </TASK>
>>> [ 9.199595]
>>> ================================================================================
>>>
>>>
>>> [ 9.199598]
>>> ================================================================================
>>>
>>> [ 9.199600] UBSAN: invalid-load in
>>> /build/linux-HMZHpV/linux-5.15.0/include/linux/dma-buf-map.h:194:9
>>> [ 9.199604] load of value 226 is not a valid value for type '_Bool'
>>> [ 9.199606] CPU: 0 PID: 113 Comm: kworker/0:2 Not tainted
>>> 5.15.0-27-generic #28-Ubuntu
>>> [ 9.199610] Hardware name: Hewlett-Packard HP EliteBook 820 G1/1991,
>>> BIOS L71 Ver. 01.44 04/12/2018
>>> [ 9.199612] Workqueue: events_long gud_flush_work [gud]
>>> [ 9.199618] Call Trace:
>>> [ 9.199619] <TASK>
>>> [ 9.199621] show_stack+0x52/0x58
>>> [ 9.199627] dump_stack_lvl+0x4a/0x5f
>>> [ 9.199633] dump_stack+0x10/0x12
>>> [ 9.199637] ubsan_epilogue+0x9/0x45
>>> [ 9.199641] __ubsan_handle_load_invalid_value.cold+0x44/0x49
>>> [ 9.199646] drm_gem_fb_vmap.cold+0x24/0x3d [drm_kms_helper]
>>> [ 9.199675] gud_prep_flush+0xaa/0x410 [gud]
>>> [ 9.199682] ? check_preempt_curr+0x5d/0x70
>>> [ 9.199688] ? update_load_avg+0x82/0x620
>>> [ 9.199693] ? update_load_avg+0x82/0x620
>>> [ 9.199697] gud_flush_work+0x1e0/0x430 [gud]
>>> [ 9.199702] ? psi_task_switch+0x1e7/0x220
>>> [ 9.199706] process_one_work+0x22b/0x3d0
>>> [ 9.199713] worker_thread+0x53/0x410
>>> [ 9.199718] ? process_one_work+0x3d0/0x3d0
>>> [ 9.199723] kthread+0x12a/0x150
>>> [ 9.199728] ? set_kthread_struct+0x50/0x50
>>> [ 9.199732] ret_from_fork+0x22/0x30
>>> [ 9.199741] </TASK>
>>> [ 9.199743]
>>> ================================================================================
>>>
>>>
>>> It's the "if (map->is_iomem)" statement in dma_buf_map_clear() and
>>> dma_buf_map_is_null() that triggers this.
>>>
>>> I tried 5.18.0-rc5 and the problem is still present.
>>>
>>> UBSAN entries in the config:
>>>
>>> CONFIG_ARCH_HAS_UBSAN_SANITIZE_ALL=y
>>> CONFIG_UBSAN=y
>>> # CONFIG_UBSAN_TRAP is not set
>>> CONFIG_CC_HAS_UBSAN_BOUNDS=y
>>> CONFIG_UBSAN_BOUNDS=y
>>> CONFIG_UBSAN_ONLY_BOUNDS=y
>>> CONFIG_UBSAN_SHIFT=y
>>> # CONFIG_UBSAN_DIV_ZERO is not set
>>> CONFIG_UBSAN_BOOL=y
>>> CONFIG_UBSAN_ENUM=y
>>> # CONFIG_UBSAN_ALIGNMENT is not set
>>> CONFIG_UBSAN_SANITIZE_ALL=y
>>> # CONFIG_TEST_UBSAN is not set
>>>
>>> Continuing further down.
>>>
>>>
>>> Den 30.07.2021 20.35, skrev Thomas Zimmermann:
>>>> Abstract the framebuffer details by mapping its BOs with a call
>>>> to drm_gem_fb_vmap(). Unmap with drm_gem_fb_vunmap().
>>>>
>>>> The call to drm_gem_fb_vmap() ensures that all BOs are mapped
>>>> correctly. Gud still only supports single-plane formats.
>>>>
>>>> No functional changes.
>>>>
>>>> Signed-off-by: Thomas Zimmermann <tzimmerm...@suse.de>
>>>> Acked-by: Noralf Trønnes <nor...@tronnes.org>
>>>> Acked-by: Sam Ravnborg <s...@ravnborg.org>
>>>> ---
>>>> drivers/gpu/drm/gud/gud_pipe.c | 10 +++++-----
>>>> 1 file changed, 5 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/gud/gud_pipe.c
>>>> b/drivers/gpu/drm/gud/gud_pipe.c
>>>> index 4d7a26b68a2e..7e009f562b30 100644
>>>> --- a/drivers/gpu/drm/gud/gud_pipe.c
>>>> +++ b/drivers/gpu/drm/gud/gud_pipe.c
>>>> @@ -14,8 +14,8 @@
>>>> #include <drm/drm_format_helper.h>
>>>> #include <drm/drm_fourcc.h>
>>>> #include <drm/drm_framebuffer.h>
>>>> +#include <drm/drm_gem.h>
>>>> #include <drm/drm_gem_framebuffer_helper.h>
>>>> -#include <drm/drm_gem_shmem_helper.h>
>>>> #include <drm/drm_print.h>
>>>> #include <drm/drm_rect.h>
>>>> #include <drm/drm_simple_kms_helper.h>
>>>> @@ -152,7 +152,7 @@ static int gud_prep_flush(struct gud_device
>>>> *gdrm, struct drm_framebuffer *fb,
>>>> {
>>>> struct dma_buf_attachment *import_attach =
>>>> fb->obj[0]->import_attach;
>>>> u8 compression = gdrm->compression;
>>>> - struct dma_buf_map map;
>>>> + struct dma_buf_map map[DRM_FORMAT_MAX_PLANES];
>>>
>>> Zeroing map solves the problem:
>>>
>>> struct iosys_map map[DRM_FORMAT_MAX_PLANES] = {};
>>>
>>> I don't understand the conditional clearing in
>>> dma_buf_map_clear/iosys_map_clear(), the doc says: Clears all fields to
>>> zero. If I zero the whole structure unconditionally this also keeps
>>> UBSAN happy.
>
> iomap_sys_clear() assumes that the instance is already initialized.
> Hence, calling it at [1] with un-zeroed, stack-allocated memory operates
> on undefined state. It doesn't matter for the result, though. I guess
> the semantics of iosys_sys_clear() are not stellar.
>
I did a quick look through the struct iosys_map users and found these
using a stack allocated variable that has not been initialized:
These call dma_buf_vmap() directly:
drm_gem_cma_prime_import_sg_table_vmap
igt_dmabuf_export_vmap
igt_dmabuf_import_ownership
igt_dmabuf_import
etnaviv_gem_prime_vmap_impl
Ends up calling dma_buf_vmap() if the bo is imported:
panfrost_perfcnt_enable_locked
lima_sched_build_error_task_list
tegra_bo_mmap
mipi_dbi_fb_dirty
mipi_dbi_buf_copy
gud_prep_flush
Ends up calling iosys_map_is_null() at least:
ast_cursor_plane_init
Ends up calling iosys_map_is_equal():
ast_cursor_plane_destroy
>>
>> Thanks for debugging this problem. It's uninitialized and some of the
>> internal helpers look at all planes, even if they are empty. I have a
>> patchset to fix that throughout the DRM modules. I'll post on Monday.
>
> I have posted that patchset at [2]. If you have the time, I'd appreciate
> if you could give it a test run.
>
I'll see if I can do that.
Noralf.
> Best regards
> Thomas
>
> [1]
> https://elixir.bootlin.com/linux/v5.17.5/source/drivers/gpu/drm/drm_gem_framebuffer_helper.c#L348
>
> [2]
> https://lore.kernel.org/dri-devel/20220509081602.474-1-tzimmerm...@suse.de/T/#t
>
>
>>
>> If we need a quick fix, we could do the zeroing everywhere.
>>
>> Best regards
>> Thomas
>>
>>>
>>> Noralf.
>>>
>>>> void *vaddr, *buf;
>>>> size_t pitch, len;
>>>> int ret = 0;
>>>> @@ -162,11 +162,11 @@ static int gud_prep_flush(struct gud_device
>>>> *gdrm, struct drm_framebuffer *fb,
>>>> if (len > gdrm->bulk_len)
>>>> return -E2BIG;
>>>> - ret = drm_gem_shmem_vmap(fb->obj[0], &map);
>>>> + ret = drm_gem_fb_vmap(fb, map);
>>>> if (ret)
>>>> return ret;
>>>> - vaddr = map.vaddr + fb->offsets[0];
>>>> + vaddr = map[0].vaddr + fb->offsets[0];
>>>> ret = drm_gem_fb_begin_cpu_access(fb, DMA_FROM_DEVICE);
>>>> if (ret)
>>>> @@ -225,7 +225,7 @@ static int gud_prep_flush(struct gud_device
>>>> *gdrm, struct drm_framebuffer *fb,
>>>> end_cpu_access:
>>>> drm_gem_fb_end_cpu_access(fb, DMA_FROM_DEVICE);
>>>> vunmap:
>>>> - drm_gem_shmem_vunmap(fb->obj[0], &map);
>>>> + drm_gem_fb_vunmap(fb, map);
>>>> return ret;
>>>> }
>>
>