Re: "irq 4: Affinity broken due to vector space exhaustion." warning on restart of ttyS0 console

2020-12-09 Thread Shung-Hsi Yu
On Wed, Dec 09, 2020 at 07:28:49PM +0100, Thomas Gleixner wrote:
> But the fix is not to tone down the warning. The proper fix is to do the
> search in the correct order.
Agree. Thank you for the speedy fix!

Tested-by: Shung-Hsi Yu 



Re: "irq 4: Affinity broken due to vector space exhaustion." warning on restart of ttyS0 console

2020-12-08 Thread Shung-Hsi Yu
On Wed, Dec 09, 2020 at 02:33:04PM +0800, Shung-Hsi Yu wrote:
> Hi Thomas,
> 
> On Tue, Nov 10, 2020 at 09:56:27PM +0100, Thomas Gleixner wrote:
> > The real problem is irqbalanced aggressively exhausting the vector space
> > of a _whole_ socket to the point that there is not a single vector left
> > for serial. That's the problem you want to fix.
> 
> I believe this warning also gets triggered even when there's _no_ vector
> exhaustion.
> 
> This seem to happen when the IRQ's affinity mask is set (wrongly) to CPUs on
> a different NUMA node (e.g. cpumask_of_node(1) when the irqd->irq == 0).
 typo ^^

Should be "affinity mask set to cpumask_of_node(1) when
irq_data_get_node(irqd) == 0".


Shung-Hsi

>   $ lscpu
>   ...
>   NUMA node0 CPU(s):   0-25,52-77
>   NUMA node1 CPU(s):   26-51,78-103
> 
>   $ cat /sys/kernel/debug/tracing/trace
>...
>   irqbalance-1994[017] d...74.912799: irq_matrix_alloc: bit=33 cpu=26 
> online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, 
> global_rsvd=341, total_alloc=217
>   irqbalance-1994[017] d...74.912802: vector_alloc: irq=4 vector=33 
> reserved=0 ret=0
>   irqbalance-1994[017] d...74.912804: vector_update: irq=4 vector=33 
> cpu=26 prev_vector=33 prev_cpu=7
>   irqbalance-1994[017] d...74.912805: vector_config: irq=4 vector=33 
> cpu=26 apicdest=0x0040
>   -0   [007] d.h.74.970733: vector_free_moved: irq=4 cpu=7 
> vector=33 is_managed=0
>   -0   [007] d.h.74.970738: irq_matrix_free: bit=33 cpu=7 
> online=1 avl=200 alloc=1 managed=1 online_maps=104 global_avl=20687, 
> global_rsvd=341, total_alloc=217
>...
> (agetty)-3004[047] d...81.731231: vector_deactivate: irq=4 
> is_managed=0 can_reserve=1 reserve=0
> (agetty)-3004[047] d...81.738035: vector_clear: irq=4 vector=33 
> cpu=26 prev_vector=0 prev_cpu=7
> (agetty)-3004[047] d...81.738040: irq_matrix_free: bit=33 cpu=26 
> online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20689, 
> global_rsvd=341, total_alloc=215
> (agetty)-3004[047] d...81.738046: irq_matrix_reserve: 
> online_maps=104 global_avl=20689, global_rsvd=342, total_alloc=215
> (agetty)-3004[047] d...81.766739: vector_reserve: irq=4 ret=0
> (agetty)-3004[047] d...81.766741: vector_config: irq=4 vector=239 
> cpu=0 apicdest=0x
> (agetty)-3004[047] d...81.777152: vector_activate: irq=4 
> is_managed=0 can_reserve=1 reserve=0
> (agetty)-3004[047] d...81.777157: vector_alloc: irq=4 vector=0 
> reserved=1 ret=-22
> > irq_matrix_alloc() failed with
>   EINVAL because the cpumask
>   passed in is empty, which is a
>   result of affmask being
>   (ff,c000,000f,fc00)
>   and cpumask_of_node(node)
>   being
>   
> (00,3fff,fff0,03ff). 
> 
> (agetty)-3004[047] d...81.789349: irq_matrix_alloc: bit=33 cpu=1 
> online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20688, 
> global_rsvd=341, total_alloc=216
> (agetty)-3004[047] d...81.789351: vector_alloc: irq=4 vector=33 
> reserved=1 ret=0
> (agetty)-3004[047] d...81.789353: vector_update: irq=4 vector=33 
> cpu=1 prev_vector=0 prev_cpu=26
> (agetty)-3004[047] d...81.789355: vector_config: irq=4 vector=33 
> cpu=1 apicdest=0x0002
> > "irq 4: Affinity broken due to
>   vector space exhaustion."
>   warning shows up
> 
> (agetty)-3004[047] d...81.900783: irq_matrix_alloc: bit=33 cpu=26 
> online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, 
> global_rsvd=341, total_alloc=217
> (agetty)-3004[047] d...82.053535: vector_alloc: irq=4 vector=33 
> reserved=0 ret=0
> (agetty)-3004[047] d...82.053536: vector_update: irq=4 vector=33 
> cpu=26 prev_vector=33 prev_cpu=1
> (agetty)-3004[047] d...82.053538: vector_config: irq=4 vector=33 
> cpu=26 apicdest=0x0040



Re: "irq 4: Affinity broken due to vector space exhaustion." warning on restart of ttyS0 console

2020-12-08 Thread Shung-Hsi Yu
Hi Thomas,

On Tue, Nov 10, 2020 at 09:56:27PM +0100, Thomas Gleixner wrote:
> The real problem is irqbalanced aggressively exhausting the vector space
> of a _whole_ socket to the point that there is not a single vector left
> for serial. That's the problem you want to fix.

I believe this warning also gets triggered even when there's _no_ vector
exhaustion.

This seem to happen when the IRQ's affinity mask is set (wrongly) to CPUs on
a different NUMA node (e.g. cpumask_of_node(1) when the irqd->irq == 0).

  $ lscpu
  ...
  NUMA node0 CPU(s):   0-25,52-77
  NUMA node1 CPU(s):   26-51,78-103

  $ cat /sys/kernel/debug/tracing/trace
   ...
  irqbalance-1994[017] d...74.912799: irq_matrix_alloc: bit=33 cpu=26 
online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, 
global_rsvd=341, total_alloc=217
  irqbalance-1994[017] d...74.912802: vector_alloc: irq=4 vector=33 
reserved=0 ret=0
  irqbalance-1994[017] d...74.912804: vector_update: irq=4 vector=33 
cpu=26 prev_vector=33 prev_cpu=7
  irqbalance-1994[017] d...74.912805: vector_config: irq=4 vector=33 
cpu=26 apicdest=0x0040
  -0   [007] d.h.74.970733: vector_free_moved: irq=4 cpu=7 
vector=33 is_managed=0
  -0   [007] d.h.74.970738: irq_matrix_free: bit=33 cpu=7 
online=1 avl=200 alloc=1 managed=1 online_maps=104 global_avl=20687, 
global_rsvd=341, total_alloc=217
   ...
(agetty)-3004[047] d...81.731231: vector_deactivate: irq=4 
is_managed=0 can_reserve=1 reserve=0
(agetty)-3004[047] d...81.738035: vector_clear: irq=4 vector=33 
cpu=26 prev_vector=0 prev_cpu=7
(agetty)-3004[047] d...81.738040: irq_matrix_free: bit=33 cpu=26 
online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20689, 
global_rsvd=341, total_alloc=215
(agetty)-3004[047] d...81.738046: irq_matrix_reserve: 
online_maps=104 global_avl=20689, global_rsvd=342, total_alloc=215
(agetty)-3004[047] d...81.766739: vector_reserve: irq=4 ret=0
(agetty)-3004[047] d...81.766741: vector_config: irq=4 vector=239 
cpu=0 apicdest=0x
(agetty)-3004[047] d...81.777152: vector_activate: irq=4 
is_managed=0 can_reserve=1 reserve=0
(agetty)-3004[047] d...81.777157: vector_alloc: irq=4 vector=0 
reserved=1 ret=-22
> irq_matrix_alloc() failed with
  EINVAL because the cpumask
  passed in is empty, which is a
  result of affmask being
  (ff,c000,000f,fc00)
  and cpumask_of_node(node)
  being
  (00,3fff,fff0,03ff). 

(agetty)-3004[047] d...81.789349: irq_matrix_alloc: bit=33 cpu=1 
online=1 avl=199 alloc=2 managed=1 online_maps=104 global_avl=20688, 
global_rsvd=341, total_alloc=216
(agetty)-3004[047] d...81.789351: vector_alloc: irq=4 vector=33 
reserved=1 ret=0
(agetty)-3004[047] d...81.789353: vector_update: irq=4 vector=33 
cpu=1 prev_vector=0 prev_cpu=26
(agetty)-3004[047] d...81.789355: vector_config: irq=4 vector=33 
cpu=1 apicdest=0x0002
> "irq 4: Affinity broken due to
  vector space exhaustion."
  warning shows up

(agetty)-3004[047] d...81.900783: irq_matrix_alloc: bit=33 cpu=26 
online=1 avl=198 alloc=3 managed=1 online_maps=104 global_avl=20687, 
global_rsvd=341, total_alloc=217
(agetty)-3004[047] d...82.053535: vector_alloc: irq=4 vector=33 
reserved=0 ret=0
(agetty)-3004[047] d...82.053536: vector_update: irq=4 vector=33 
cpu=26 prev_vector=33 prev_cpu=1
(agetty)-3004[047] d...82.053538: vector_config: irq=4 vector=33 
cpu=26 apicdest=0x0040


Shung-Hsi Yu



Re: [PATCH v3 2/8] staging: qlge: Initialize devlink health dump framework

2020-10-20 Thread Shung-Hsi Yu
On Tue, Oct 20, 2020 at 04:57:11PM +0800, Shung-Hsi Yu wrote:
> Hi,
> 
> This patch trigger the following KASAN error inside qlge_init_device().
> 
> [...] general protection fault, probably for non-canonical address 
> 0xdc4b:  [#1] SMP DEBUG_PAGEALLOC KASAN PTI
> [...] KASAN: null-ptr-deref in range [0x0258-0x025f]
> [...] CPU: 0 PID: 438 Comm: systemd-udevd Tainted: G C  E 
> 5.9.0-kvmsmall+ #7
> [...] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.13.0-48-
> g...ilt.opensuse.org 04/01/2014
> [...] RIP: 0010:qlge_get_8000_flash_params+0x377/0x6e0 [qlge]
> [...] Code: 03 80 3c 02 00 0f 85 57 03 00 00 49 8b af 68 08 00 00 48 b8 00 00 
> 00 00 00 fc ff df 48 8d bd 5f 02 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 
> 89 fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 c6 02 00 00
> [...] RSP: 0018:c9f97788 EFLAGS: 00010207
> [...] RAX: dc00 RBX:  RCX: 
> [...] RDX: 004b RSI: c08cb843 RDI: 025f
> [...] R10: fbfff5f718a0 R11: 0001 R12: dc00
> [...] R13: 888111085d40 R14: 888111085d40 R15: 888111080280
> [...] FS:  7f315f5db280() GS:8881f520() 
> knlGS:
> [...] CS:  0010 DS:  ES:  CR0: 80050033
> [...] CR2: 55bb25297170 CR3: 000110674000 CR4: 06f0
> [...] DR0:  DR1:  DR2: 
> [...] DR3:  DR6: fffe0ff0 DR7: 0400
> [...] Call Trace:
> [...]  ? qlge_get_8012_flash_params+0x600/0x600 [qlge]
> [...]  ? static_obj+0x8a/0xc0
> [...]  ? lockdep_init_map_waits+0x26a/0x700
> [...]  qlge_init_device+0x425/0x1000 [qlge]
> [...]  ? debug_mutex_init+0x31/0x60
> [...]  qlge_probe+0xfe/0x6c0 [qlge]
> 
> 
> With qlge_get_8000_flash_params+0x377/0x6e0 corresponding to the following:
> 
>   if (qdev->flash.flash_params_8000.data_type1 == 2)
>   memcpy(mac_addr,
>  qdev->flash.flash_params_8000.mac_addr1,
>  qdev->ndev->addr_len); // < Here

This is because qdev->ndev == 0.

The reason is that before qlge_get_8000_flash_params() get called qdev is 
memset-ed inside qlge_init_device().

static int qlge_init_device(struct pci_dev *pdev, struct qlge_adapter *qdev,
int cards_found)
{
struct net_device *ndev = qdev->ndev;
int err = 0;

memset((void *)qdev, 0, sizeof(*qdev));

// ...

// qlge_get_8000_flash_params() get's called
err = qdev->nic_ops->get_flash(qdev);

// ...
}



Re: [PATCH v3 2/8] staging: qlge: Initialize devlink health dump framework

2020-10-20 Thread Shung-Hsi Yu
Hi,

This patch trigger the following KASAN error inside qlge_init_device().

[...] general protection fault, probably for non-canonical address 
0xdc4b:  [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[...] KASAN: null-ptr-deref in range [0x0258-0x025f]
[...] CPU: 0 PID: 438 Comm: systemd-udevd Tainted: G C  E 
5.9.0-kvmsmall+ #7
[...] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-
g...ilt.opensuse.org 04/01/2014
[...] RIP: 0010:qlge_get_8000_flash_params+0x377/0x6e0 [qlge]
[...] Code: 03 80 3c 02 00 0f 85 57 03 00 00 49 8b af 68 08 00 00 48 b8 00 00 
00 00 00 fc ff df 48 8d bd 5f 02 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 
fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 c6 02 00 00
[...] RSP: 0018:c9f97788 EFLAGS: 00010207
[...] RAX: dc00 RBX:  RCX: 
[...] RDX: 004b RSI: c08cb843 RDI: 025f
[...] R10: fbfff5f718a0 R11: 0001 R12: dc00
[...] R13: 888111085d40 R14: 888111085d40 R15: 888111080280
[...] FS:  7f315f5db280() GS:8881f520() 
knlGS:
[...] CS:  0010 DS:  ES:  CR0: 80050033
[...] CR2: 55bb25297170 CR3: 000110674000 CR4: 06f0
[...] DR0:  DR1:  DR2: 
[...] DR3:  DR6: fffe0ff0 DR7: 0400
[...] Call Trace:
[...]  ? qlge_get_8012_flash_params+0x600/0x600 [qlge]
[...]  ? static_obj+0x8a/0xc0
[...]  ? lockdep_init_map_waits+0x26a/0x700
[...]  qlge_init_device+0x425/0x1000 [qlge]
[...]  ? debug_mutex_init+0x31/0x60
[...]  qlge_probe+0xfe/0x6c0 [qlge]
[...]  ? qlge_set_mac_addr+0x330/0x330 [qlge]
[...]  local_pci_probe+0xd8/0x170
[...]  pci_call_probe+0x156/0x3d0
[...]  ? pci_match_device+0x30c/0x620
[...]  ? pci_pm_suspend_noirq+0x980/0x980
[...]  ? pci_match_device+0x33c/0x620
[...]  ? kernfs_put+0x18/0x30
[...]  pci_device_probe+0x1e0/0x270
[...]  ? pci_dma_configure+0x57/0xd0
[...]  really_probe+0x218/0xd20
[...]  driver_probe_device+0x1e6/0x2c0
[...]  device_driver_attach+0x209/0x270
[...]  __driver_attach+0xf6/0x260
[...]  ? device_driver_attach+0x270/0x270
[...]  bus_for_each_dev+0x114/0x1a0
[...]  ? subsys_find_device_by_id+0x2d0/0x2d0
[...]  ? bus_add_driver+0x2d2/0x620
[...]  bus_add_driver+0x352/0x620
[...]  driver_register+0x1ee/0x4b0
[...]  ? 0xc08e9000
[...]  do_one_initcall+0xbb/0x400
[...]  ? trace_event_raw_event_initcall_finish+0x1c0/0x1c0
[...]  ? rcu_read_lock_sched_held+0x3f/0x70
[...]  ? trace_kmalloc+0xa2/0xd0
[...]  ? kasan_unpoison_shadow+0x33/0x40
[...]  ? kasan_unpoison_shadow+0x33/0x40
[...]  do_init_module+0x1ce/0x780
[...]  load_module+0x14b1/0x16d0
[...]  ? post_relocation+0x3a0/0x3a0
[...]  ? device_driver_attach+0x270/0x270
[...]  bus_for_each_dev+0x114/0x1a0
[...]  ? subsys_find_device_by_id+0x2d0/0x2d0
[...]  ? bus_add_driver+0x2d2/0x620
[...]  bus_add_driver+0x352/0x620
[...]  driver_register+0x1ee/0x4b0
[...]  ? 0xc08e9000
[...]  do_one_initcall+0xbb/0x400
[...]  ? trace_event_raw_event_initcall_finish+0x1c0/0x1c0
[...]  ? rcu_read_lock_sched_held+0x3f/0x70
[...]  ? trace_kmalloc+0xa2/0xd0
[...]  ? kasan_unpoison_shadow+0x33/0x40
[...]  ? kasan_unpoison_shadow+0x33/0x40
[...]  do_init_module+0x1ce/0x780
[...]  load_module+0x14b1/0x16d0
[...]  ? post_relocation+0x3a0/0x3a0
[...]  ? kernel_read_file_from_fd+0x4b/0x90
[...]  __do_sys_finit_module+0x110/0x1a0
[...]  ? __ia32_sys_init_module+0xa0/0xa0
[...]  do_syscall_64+0x33/0x80
[...]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

With qlge_get_8000_flash_params+0x377/0x6e0 corresponding to the following:

if (qdev->flash.flash_params_8000.data_type1 == 2)
memcpy(mac_addr,
   qdev->flash.flash_params_8000.mac_addr1,
   qdev->ndev->addr_len); // < Here

IIRC I didn't see this with v1. However I didn't test v2, so I'm not sure if
this issue is introduced during v2 or v3.

Best,
Shung-Hsi



Re: [PATCH v2 2/7] staging: qlge: Initialize devlink health dump framework

2020-10-20 Thread Shung-Hsi Yu
On Tue, Oct 20, 2020 at 04:36:09PM +0800, Shung-Hsi Yu wrote:
> This patch trigger the following KASAN error inside qlge_init_device().

Sorry, I meant to reply to the v3 series, please ignore this email.



Re: [PATCH v2 2/7] staging: qlge: Initialize devlink health dump framework

2020-10-20 Thread Shung-Hsi Yu
Hi,

This patch trigger the following KASAN error inside qlge_init_device().

[...] general protection fault, probably for non-canonical address 
0xdc4b:  [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[...] KASAN: null-ptr-deref in range [0x0258-0x025f]
[...] CPU: 0 PID: 438 Comm: systemd-udevd Tainted: G C  E 
5.9.0-kvmsmall+ #7
[...] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-
g...ilt.opensuse.org 04/01/2014
[...] RIP: 0010:qlge_get_8000_flash_params+0x377/0x6e0 [qlge]
[...] Code: 03 80 3c 02 00 0f 85 57 03 00 00 49 8b af 68 08 00 00 48 b8 00 00 
00 00 00 fc ff df 48 8d bd 5f 02 00 00 48 89 fa 48 c1 ea 03 <0f> b6 04 02 48 89 
fa 83 e2 07 38 d0 7f 08 84 c0 0f 85 c6 02 00 00
[...] RSP: 0018:c9f97788 EFLAGS: 00010207
[...] RAX: dc00 RBX:  RCX: 
[...] RDX: 004b RSI: c08cb843 RDI: 025f
[...] R10: fbfff5f718a0 R11: 0001 R12: dc00
[...] R13: 888111085d40 R14: 888111085d40 R15: 888111080280
[...] FS:  7f315f5db280() GS:8881f520() 
knlGS:
[...] CS:  0010 DS:  ES:  CR0: 80050033
[...] CR2: 55bb25297170 CR3: 000110674000 CR4: 06f0
[...] DR0:  DR1:  DR2: 
[...] DR3:  DR6: fffe0ff0 DR7: 0400
[...] Call Trace:
[...]  ? qlge_get_8012_flash_params+0x600/0x600 [qlge]
[...]  ? static_obj+0x8a/0xc0
[...]  ? lockdep_init_map_waits+0x26a/0x700
[...]  qlge_init_device+0x425/0x1000 [qlge]
[...]  ? debug_mutex_init+0x31/0x60
[...]  qlge_probe+0xfe/0x6c0 [qlge]
[...]  ? qlge_set_mac_addr+0x330/0x330 [qlge]
[...]  local_pci_probe+0xd8/0x170
[...]  pci_call_probe+0x156/0x3d0
[...]  ? pci_match_device+0x30c/0x620
[...]  ? pci_pm_suspend_noirq+0x980/0x980
[...]  ? pci_match_device+0x33c/0x620
[...]  ? kernfs_put+0x18/0x30
[...]  pci_device_probe+0x1e0/0x270
[...]  ? pci_dma_configure+0x57/0xd0
[...]  really_probe+0x218/0xd20
[...]  driver_probe_device+0x1e6/0x2c0
[...]  device_driver_attach+0x209/0x270
[...]  __driver_attach+0xf6/0x260
[...]  ? device_driver_attach+0x270/0x270
[...]  bus_for_each_dev+0x114/0x1a0
[...]  ? subsys_find_device_by_id+0x2d0/0x2d0
[...]  ? bus_add_driver+0x2d2/0x620
[...]  bus_add_driver+0x352/0x620
[...]  driver_register+0x1ee/0x4b0
[...]  ? 0xc08e9000
[...]  do_one_initcall+0xbb/0x400
[...]  ? trace_event_raw_event_initcall_finish+0x1c0/0x1c0
[...]  ? rcu_read_lock_sched_held+0x3f/0x70
[...]  ? trace_kmalloc+0xa2/0xd0
[...]  ? kasan_unpoison_shadow+0x33/0x40
[...]  ? kasan_unpoison_shadow+0x33/0x40
[...]  do_init_module+0x1ce/0x780
[...]  load_module+0x14b1/0x16d0
[...]  ? post_relocation+0x3a0/0x3a0
[...]  ? device_driver_attach+0x270/0x270
[...]  bus_for_each_dev+0x114/0x1a0
[...]  ? subsys_find_device_by_id+0x2d0/0x2d0
[...]  ? bus_add_driver+0x2d2/0x620
[...]  bus_add_driver+0x352/0x620
[...]  driver_register+0x1ee/0x4b0
[...]  ? 0xc08e9000
[...]  do_one_initcall+0xbb/0x400
[...]  ? trace_event_raw_event_initcall_finish+0x1c0/0x1c0
[...]  ? rcu_read_lock_sched_held+0x3f/0x70
[...]  ? trace_kmalloc+0xa2/0xd0
[...]  ? kasan_unpoison_shadow+0x33/0x40
[...]  ? kasan_unpoison_shadow+0x33/0x40
[...]  do_init_module+0x1ce/0x780
[...]  load_module+0x14b1/0x16d0
[...]  ? post_relocation+0x3a0/0x3a0
[...]  ? kernel_read_file_from_fd+0x4b/0x90
[...]  __do_sys_finit_module+0x110/0x1a0
[...]  ? __ia32_sys_init_module+0xa0/0xa0
[...]  do_syscall_64+0x33/0x80
[...]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

With qlge_get_8000_flash_params+0x377/0x6e0 corresponding to the following:

if (qdev->flash.flash_params_8000.data_type1 == 2)
memcpy(mac_addr,
   qdev->flash.flash_params_8000.mac_addr1,
   qdev->ndev->addr_len); // < Here

IIRC I didn't see this with v1. However I didn't test v2, so I'm not sure if
this issue is introduced during v2 or v3.

Best,
Shung-Hsi



Re: [PATCH 4.19 41/88] net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()

2020-09-08 Thread Shung-Hsi Yu
On Tue, Sep 08, 2020 at 09:53:11PM +0200, Pavel Machek wrote:
> Hi!
> 
> > On machines with much memory (> 2 TByte) and log_mtts_per_seg == 0, a
> > max_order of 31 will be passed to mlx_buddy_init(), which results in
> > s = BITS_TO_LONGS(1 << 31) becoming a negative value, leading to
> > kvmalloc_array() failure when it is converted to size_t.
> > 
> >   mlx4_core :b1:00.0: Failed to initialize memory region table, aborting
> >   mlx4_core: probe of :b1:00.0 failed with error -12
> > 
> > Fix this issue by changing the left shifting operand from a signed literal 
> > to
> > an unsigned one.
> 
> Will we still have problems with > 4 TByte machines?

AFAIK we're safe since max_buddy is calculated as such

/* In drivers/net/ethernet/mellanox/mlx4/mr.c */
err = mlx4_buddy_init(_table->mtt_buddy,
  ilog2((u32)dev->caps.num_mtts /
  (1 << log_mtts_per_seg)));

Also, num_mtts is capped at 2^31

/* In drivers/net/ethernet/mellanox/mlx4/profile.c */
/*
 * We want to scale the number of MTTs with the size of the
 * system memory, since it makes sense to register a lot of
 * memory on a system with a lot of memory.  As a heuristic,
 * make sure we have enough MTTs to cover twice the system
 * memory (with PAGE_SIZE entries).
 *
 * This number has to be a power of two and fit into 32 bits
 * due to device limitations, so cap this at 2^31 as well.
 * That limits us to 8TB of memory registration per HCA with
 * 4KB pages, which is probably OK for the next few months.
 */
si_meminfo();
request->num_mtt =
roundup_pow_of_two(max_t(unsigned, request->num_mtt,
 min(1UL << (31 - log_mtts_per_seg),
 (si.totalram << 1) >> 
log_mtts_per_seg)));

Best,
Shung-Hsi Yu

> Should the computation be done in u64?
>
> Best regards,
>           Pavel
> 
> > Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand 
> > adapters")
> > Signed-off-by: Shung-Hsi Yu 
> > Signed-off-by: David S. Miller 
> > Signed-off-by: Sasha Levin 
> 
> > +++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
> > @@ -114,7 +114,7 @@ static int mlx4_buddy_init(struct mlx4_buddy *buddy, 
> > int max_order)
> > goto err_out;
> >  
> > for (i = 0; i <= buddy->max_order; ++i) {
> > -   s = BITS_TO_LONGS(1 << (buddy->max_order - i));
> > +   s = BITS_TO_LONGS(1UL << (buddy->max_order - i));
> > buddy->bits[i] = kvmalloc_array(s, sizeof(long), GFP_KERNEL | 
> > __GFP_ZERO);
> > if (!buddy->bits[i])
> > goto err_out_free;
> 
> -- 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html




[PATCH net] net: ethernet: mlx4: Fix memory allocation in mlx4_buddy_init()

2020-08-31 Thread Shung-Hsi Yu
On machines with much memory (> 2 TByte) and log_mtts_per_seg == 0, a
max_order of 31 will be passed to mlx_buddy_init(), which results in
s = BITS_TO_LONGS(1 << 31) becoming a negative value, leading to
kvmalloc_array() failure when it is converted to size_t.

  mlx4_core :b1:00.0: Failed to initialize memory region table, aborting
  mlx4_core: probe of :b1:00.0 failed with error -12

Fix this issue by changing the left shifting operand from a signed literal to
an unsigned one.

Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand 
adapters")
Signed-off-by: Shung-Hsi Yu 
---
 drivers/net/ethernet/mellanox/mlx4/mr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c 
b/drivers/net/ethernet/mellanox/mlx4/mr.c
index d2986f1f2db0..d7444782bfdd 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -114,7 +114,7 @@ static int mlx4_buddy_init(struct mlx4_buddy *buddy, int 
max_order)
goto err_out;
 
for (i = 0; i <= buddy->max_order; ++i) {
-   s = BITS_TO_LONGS(1 << (buddy->max_order - i));
+   s = BITS_TO_LONGS(1UL << (buddy->max_order - i));
buddy->bits[i] = kvmalloc_array(s, sizeof(long), GFP_KERNEL | 
__GFP_ZERO);
if (!buddy->bits[i])
goto err_out_free;
-- 
2.28.0