Re: [Intel-gfx] Possible regression in drm/i915 driver: memleak

2022-12-21 Thread Mirsad Goran Todorovac

On 20.12.2022. 20:34, Mirsad Todorovac wrote:

On 12/20/22 16:52, Tvrtko Ursulin wrote:


On 20/12/2022 15:22, srinivas pandruvada wrote:

+Added DRM mailing list and maintainers

On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:

Hi all,

I have been unable to find email addresses for any particular Intel i915
maintainer, so my best bet is to post here, as you will most assuredly
already know them.


For future reference you can use 
${kernel_dir}/scripts/get_maintainer.pl -f ...
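As a hedged illustration of the hint above, the script can be wrapped in a small shell function. The function name `maintainers_for` and its two arguments are made up for this sketch; only `scripts/get_maintainer.pl` and its `-f` option come from the kernel tree. It assumes a full kernel source checkout.

```shell
#!/bin/sh
# Sketch: list maintainers and mailing lists for one source file.
# $1: path to a kernel source tree (assumption: a full checkout)
# $2: file path relative to the root of that tree
maintainers_for() {
    kernel_dir=$1
    src_file=$2
    script="$kernel_dir/scripts/get_maintainer.pl"

    # Fail early when the helper script is missing or not executable.
    if [ ! -x "$script" ]; then
        echo "get_maintainer.pl not found under $kernel_dir" >&2
        return 1
    fi

    # Run from the tree root so the relative file path resolves.
    (cd "$kernel_dir" && ./scripts/get_maintainer.pl -f "$src_file")
}
```

For the file touched later in this thread, a call might look like `maintainers_for ~/linux drivers/gpu/drm/i915/gem/i915_gem_mman.c`, where `~/linux` is only a placeholder for the local checkout.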



The problem is a kernel memory leak that occurs repeatedly, triggered while
running the Chrome browser under the latest 6.1.0+ kernel as of this
morning, on AlmaLinux 8.6 on a Lenovo desktop box with an
Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz.

The kernel was built with KMEMLEAK, KASAN and MGLRU enabled, from a
vanilla mainline tree (Mr. Torvalds' tree).

The leaks look like this one:

unreferenced object 0x888131754880 (size 64):
  comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
  backtrace:
    [] slab_post_alloc_hook+0xb2/0x340
    [] __kmem_cache_alloc_node+0x1bf/0x2c0
    [] kmalloc_trace+0x2a/0xb0
    [] drm_vma_node_allow+0x45/0x150 [drm]
    [] __assign_mmap_offset_handle+0x615/0x820 [i915]
    [] i915_gem_mmap_offset_ioctl+0x77/0x110 [i915]
    [] drm_ioctl_kernel+0x181/0x280 [drm]
    [] drm_ioctl+0x2dd/0x6a0 [drm]
    [] __x64_sys_ioctl+0xc4/0x100
    [] do_syscall_64+0x58/0x80
    [] entry_SYSCALL_64_after_hwframe+0x72/0xdc

The complete list of leaks is in the attachment; they all seem similar or
identical.

Please find attached the lshw output and the kernel build config file.

I will probably check the same parameters on my laptop at home, which is
also a Lenovo, but with a different hardware configuration and Ubuntu 22.10.


Could you try the below patch?

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index c3ea243d414d..0b07534c203a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
 insert:
 	mmo = insert_mmo(obj, mmo);
 	GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
-out:
+
 	if (file)
 		drm_vma_node_allow(&mmo->vma_node, file);
+out:
 	return mmo;

 err:

Maybe it is not the best fix but curious to know if it will make the 
leak go away.


Hi,

After 27 minutes of uptime with the patched kernel, it looks promising.
That is much longer than the buggy kernel needed to start leaking slab objects.

Here is the output:

[root@pc-mtodorov marvin]# echo scan > /sys/kernel/debug/kmemleak
[root@pc-mtodorov marvin]# cat !$
cat /sys/kernel/debug/kmemleak
unreferenced object 0x888105028d80 (size 16):
  comm "kworker/u12:5", pid 359, jiffies 4294902898 (age 1620.144s)
  hex dump (first 16 bytes):
    6d 65 6d 73 74 69 63 6b 30 00 00 00 00 00 00 00 memstick0...
  backtrace:
    [] slab_post_alloc_hook+0xb2/0x340
    [] __kmem_cache_alloc_node+0x1bf/0x2c0
    [] __kmalloc_node_track_caller+0x55/0x160
    [] kstrdup+0x36/0x60
    [] kstrdup_const+0x28/0x30
    [] kvasprintf_const+0x97/0xd0
    [] kobject_set_name_vargs+0x34/0xc0
    [] dev_set_name+0x9b/0xd0
    [] memstick_check+0x181/0x639 [memstick]
    [] process_one_work+0x4e6/0x7e0
    [] worker_thread+0x76/0x770
    [] kthread+0x168/0x1a0
    [] ret_from_fork+0x29/0x50
[root@pc-mtodorov marvin]# w
 20:27:35 up 27 min,  2 users,  load average: 0.83, 1.15, 1.19
USER     TTY      FROM     LOGIN@   IDLE   JCPU   PCPU WHAT
marvin   tty2     tty2     20:01   27:10  10:12   2.09s /opt/google/chrome/chrome --type=utility --utility-sub-type=audio.m
marvin   pts/1    -        20:01    0.00s  2:00   0.38s sudo bash
[root@pc-mtodorov marvin]# uname -rms
Linux 6.1.0-b6bb9676f216-mglru-kmemlk-kasan+ x86_64
[root@pc-mtodorov marvin]#

2. On Ubuntu 22.10 with the Debian build, I have not reproduced the error
so far.


This looks fixed to me, but if nothing leaks by Thursday morning, when I
will next see this desktop box, we will know with more certainty.


After an inspection this morning (local time), at 12:10 h of uptime, the
problem appears to be fixed: there are no Chrome-triggered
i915_gem_mmap_offset_ioctl leaks.


By the same uptime, the unpatched kernel had accumulated about 30 leak
instances.
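For comparing leak counts between the patched and unpatched kernels, a small sketch like the following could count the kmemleak reports whose text mentions a given symbol. The function name `count_leaks_with_symbol` is hypothetical, and the sketch assumes each report starts with `unreferenced object` in the first column, as in the output quoted in this thread.

```shell
#!/bin/sh
# Sketch: count kmemleak reports that mention a symbol.
# $1: file holding kmemleak output (a saved copy, or the debugfs file itself)
# $2: symbol to look for, e.g. i915_gem_mmap_offset_ioctl
count_leaks_with_symbol() {
    report=$1
    symbol=$2
    awk -v sym="$symbol" '
        # A new report begins: account for the previous one, then reset.
        /^unreferenced object/ { if (hit) n++; hit = 0 }
        # Plain substring match, so no regex escaping is needed.
        index($0, sym)         { hit = 1 }
        # Account for the last report; "+ 0" prints 0 when nothing matched.
        END { if (hit) n++; print n + 0 }
    ' "$report"
}

# Typical use, as root on a kernel built with KMEMLEAK:
#   echo scan > /sys/kernel/debug/kmemleak
#   count_leaks_with_symbol /sys/kernel/debug/kmemleak i915_gem_mmap_offset_ioctl
```

Each report is counted once, however many of its lines match the symbol, so the result is a number of leaked objects rather than a number of matching lines.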


Congratulations!

Kind regards,
Mirsad

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union
--
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu



Re: [Intel-gfx] Possible regression in drm/i915 driver: memleak

2022-12-21 Thread Mirsad Goran Todorovac

On 20. 12. 2022. 16:52, Tvrtko Ursulin wrote:


On 20/12/2022 15:22, srinivas pandruvada wrote:

+Added DRM mailing list and maintainers

On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:

Hi all,

I have been unable to find email addresses for any particular Intel i915
maintainer, so my best bet is to post here, as you will most assuredly
already know them.


For future reference you can use ${kernel_dir}/scripts/get_maintainer.pl -f ...


Thank you, this will help a great deal provided that I find any
more bugs ...


The problem is a kernel memory leak that occurs repeatedly, triggered while
running the Chrome browser under the latest 6.1.0+ kernel as of this
morning, on AlmaLinux 8.6 on a Lenovo desktop box with an
Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz.

The kernel was built with KMEMLEAK, KASAN and MGLRU enabled, from a
vanilla mainline tree (Mr. Torvalds' tree).

The leaks look like this one:

unreferenced object 0x888131754880 (size 64):
  comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
  backtrace:
    [] slab_post_alloc_hook+0xb2/0x340
    [] __kmem_cache_alloc_node+0x1bf/0x2c0
    [] kmalloc_trace+0x2a/0xb0
    [] drm_vma_node_allow+0x45/0x150 [drm]
    [] __assign_mmap_offset_handle+0x615/0x820 [i915]
    [] i915_gem_mmap_offset_ioctl+0x77/0x110 [i915]
    [] drm_ioctl_kernel+0x181/0x280 [drm]
    [] drm_ioctl+0x2dd/0x6a0 [drm]
    [] __x64_sys_ioctl+0xc4/0x100
    [] do_syscall_64+0x58/0x80
    [] entry_SYSCALL_64_after_hwframe+0x72/0xdc

The complete list of leaks is in the attachment; they all seem similar or
identical.

Please find attached the lshw output and the kernel build config file.

I will probably check the same parameters on my laptop at home, which is
also a Lenovo, but with a different hardware configuration and Ubuntu 22.10.


Could you try the below patch?

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index c3ea243d414d..0b07534c203a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
 insert:
 	mmo = insert_mmo(obj, mmo);
 	GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
-out:
+
 	if (file)
 		drm_vma_node_allow(&mmo->vma_node, file);
+out:
 	return mmo;

 err:

Maybe it is not the best fix but curious to know if it will make the leak go 
away.


The patch applied cleanly to the latest tree from Mr. Torvalds (commit
b6bb9676f216).

It is currently building, which can take up to 90 minutes on our system.

Whether I can test it now depends on whether I can set up the machine at
work remotely (there have been some firewall restrictions on port 22 recently).

I will keep you updated.

Thanks,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union



Re: [Intel-gfx] Possible regression in drm/i915 driver: memleak

2022-12-20 Thread Tvrtko Ursulin



Hi,

On 20/12/2022 15:22, srinivas pandruvada wrote:

+Added DRM mailing list and maintainers

On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:

Hi all,

I have been unable to find email addresses for any particular Intel i915
maintainer, so my best bet is to post here, as you will most assuredly
already know them.


For future reference you can use ${kernel_dir}/scripts/get_maintainer.pl -f ...


The problem is a kernel memory leak that occurs repeatedly, triggered while
running the Chrome browser under the latest 6.1.0+ kernel as of this
morning, on AlmaLinux 8.6 on a Lenovo desktop box with an
Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz.

The kernel was built with KMEMLEAK, KASAN and MGLRU enabled, from a
vanilla mainline tree (Mr. Torvalds' tree).

The leaks look like this one:

unreferenced object 0x888131754880 (size 64):
  comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
  backtrace:
    [] slab_post_alloc_hook+0xb2/0x340
    [] __kmem_cache_alloc_node+0x1bf/0x2c0
    [] kmalloc_trace+0x2a/0xb0
    [] drm_vma_node_allow+0x45/0x150 [drm]
    [] __assign_mmap_offset_handle+0x615/0x820 [i915]
    [] i915_gem_mmap_offset_ioctl+0x77/0x110 [i915]
    [] drm_ioctl_kernel+0x181/0x280 [drm]
    [] drm_ioctl+0x2dd/0x6a0 [drm]
    [] __x64_sys_ioctl+0xc4/0x100
    [] do_syscall_64+0x58/0x80
    [] entry_SYSCALL_64_after_hwframe+0x72/0xdc

The complete list of leaks is in the attachment; they all seem similar or
identical.

Please find attached the lshw output and the kernel build config file.

I will probably check the same parameters on my laptop at home, which is
also a Lenovo, but with a different hardware configuration and Ubuntu 22.10.


Could you try the below patch?

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
index c3ea243d414d..0b07534c203a 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c
@@ -679,9 +679,10 @@ mmap_offset_attach(struct drm_i915_gem_object *obj,
 insert:
 	mmo = insert_mmo(obj, mmo);
 	GEM_BUG_ON(lookup_mmo(obj, mmap_type) != mmo);
-out:
+
 	if (file)
 		drm_vma_node_allow(&mmo->vma_node, file);
+out:
 	return mmo;

 err:

Maybe it is not the best fix but curious to know if it will make the leak go 
away.

Regards,

Tvrtko


Re: [Intel-gfx] Possible regression in drm/i915 driver: memleak

2022-12-20 Thread srinivas pandruvada
+Added DRM mailing list and maintainers

On Tue, 2022-12-20 at 15:33 +0100, Mirsad Todorovac wrote:
> Hi all,
> 
> I have been unable to find email addresses for any particular Intel i915
> maintainer, so my best bet is to post here, as you will most assuredly
> already know them.
> 
> The problem is a kernel memory leak that occurs repeatedly, triggered while
> running the Chrome browser under the latest 6.1.0+ kernel as of this
> morning, on AlmaLinux 8.6 on a Lenovo desktop box with an
> Intel(R) Core(TM) i5-8400 CPU @ 2.80GHz.
>
> The kernel was built with KMEMLEAK, KASAN and MGLRU enabled, from a
> vanilla mainline tree (Mr. Torvalds' tree).
> 
> The leaks look like this one:
> 
> unreferenced object 0x888131754880 (size 64):
>   comm "chrome", pid 13058, jiffies 4298568878 (age 3708.084s)
>   hex dump (first 32 bytes):
>     01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>     00 00 00 00 00 00 00 00 00 80 1e 3e 83 88 ff ff
>   backtrace:
>     [] slab_post_alloc_hook+0xb2/0x340
>     [] __kmem_cache_alloc_node+0x1bf/0x2c0
>     [] kmalloc_trace+0x2a/0xb0
>     [] drm_vma_node_allow+0x45/0x150 [drm]
>     [] __assign_mmap_offset_handle+0x615/0x820 [i915]
>     [] i915_gem_mmap_offset_ioctl+0x77/0x110 [i915]
>     [] drm_ioctl_kernel+0x181/0x280 [drm]
>     [] drm_ioctl+0x2dd/0x6a0 [drm]
>     [] __x64_sys_ioctl+0xc4/0x100
>     [] do_syscall_64+0x58/0x80
>     [] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> 
> The complete list of leaks is in the attachment; they all seem similar or
> identical.
>
> Please find attached the lshw output and the kernel build config file.
>
> I will probably check the same parameters on my laptop at home, which is
> also a Lenovo, but with a different hardware configuration and Ubuntu 22.10.
> 
> Thanks,
> Mirsad
> 
> -- 
> Mirsad Goran Todorovac
> System engineer
> Faculty of Graphic Arts | Academy of Fine Arts
> University of Zagreb