While the memory allocation was stalled, this is what perf top showed:
Samples: 483K of event 'cycles:ppp', Event count (approx.): 52114089074
Overhead  Shared Object                             Symbol
  28.53%  [kernel]                                  [k] total_mapcount
  25.34%  [kernel]                                  [k] kvm_age_rmapp
  13.54%  [kernel]                                  [k] slot_rmap_walk_next
  11.24%  [kernel]                                  [k] kvm_handle_hva_range
   6.35%  [kernel]                                  [k] rmap_get_first
   3.69%  [kernel]                                  [k] __x86_indirect_thunk_r13
   1.33%  [kernel]                                  [k] __isolate_lru_page
   0.63%  [kernel]                                  [k] isolate_lru_pages.isra.58
   0.48%  [kernel]                                  [k] page_vma_mapped_walk
   0.40%  [kernel]                                  [k] __mod_node_page_state
   0.35%  [kernel]                                  [k] clear_page_erms
   0.31%  [kernel]                                  [k] shrink_page_list
   0.28%  [kernel]                                  [k] _find_next_bit
   0.27%  [kernel]                                  [k] putback_inactive_pages
   0.27%  [kernel]                                  [k] move_active_pages_to_lru
   0.27%  [kernel]                                  [k] inactive_list_is_low
   0.22%  [kernel]                                  [k] __mod_zone_page_state
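(For anyone trying to reproduce this: a profile like the one above can be captured roughly as follows. The exact invocation is my assumption — only the output was posted.)

```shell
# Live system-wide cycle profile; 'cycles:ppp' requests the most precise
# sampling mode the CPU supports. Requires root and the linux-tools
# package matching the running kernel.
sudo perf top -e cycles:ppp

# Non-interactive alternative: record ~30s system-wide, then print a report.
sudo perf record -e cycles:ppp -a -- sleep 30
sudo perf report --stdio | head -n 25
```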


numactl -H, sampled repeatedly while the memory allocation was stalled:
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 55983 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63810 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10 
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 366 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63782 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10 
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 368 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63757 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10 
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 368 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63744 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10 
root@gpu-compute028:~# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 366 MB
node 1 cpus: 1 3 5 7 9 11 13 15
node 1 size: 64489 MB
node 1 free: 63504 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10 
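Instead of eyeballing repeated dumps like the above, the "node 0 free" figure can be pulled out of the numactl -H output with a one-liner (the field positions match the output above; the watch loop at the end is a sketch):

```shell
# Extract the "node 0 free" value (in MB) from numactl -H style output.
# Here it is fed a captured sample; live use would pipe numactl -H itself.
sample='available: 2 nodes (0-1)
node 0 cpus: 0 2 4 6 8 10 12 14
node 0 size: 64288 MB
node 0 free: 366 MB
node 1 free: 63782 MB'
printf '%s\n' "$sample" | awk '/^node 0 free:/ {print $4}'   # -> 366

# Live monitoring sketch:
# watch -n 1 "numactl -H | grep 'node 0 free'"
```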

Then I killed the process.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1808412

Title:
  4.15.0 memory allocation issue

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  My server is:
  PowerEdge T630
  2x Intel(R) Xeon(R) CPU E5-2623 v4 @ 2.60GHz
  128G of RAM
  4x VGA compatible controller [0300]: NVIDIA Corporation GP102 [TITAN X] [10de:1b00] (rev a1)

  When starting a VM with 116G of RAM, 16 vCPUs, and 4 PCI passthrough
  devices, memory allocation stalls after about half of the memory is
  allocated.

  After upgrading from kernel 4.13.0 to 4.15.0, starting a VM takes a
  long time.

  I tested these kernels:
  linux-image-4.13.0-37 not affected
  linux-image-4.13.0-45 not affected
  linux-image-4.15.0-34 affected
  linux-image-4.15.0-42 affected

  After disabling transparent_hugepage on 4.15 everything seems to work
  correctly.

  cat /proc/cmdline
  BOOT_IMAGE=/boot/vmlinuz-4.15.0-42-generic root=UUID=<some uuid> ro intel_iommu=on transparent_hugepage=never splash quiet vt.handoff=7
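  (For reference, the same THP setting can also be checked and toggled at
  runtime via the standard sysfs knob, without rebooting — shown here as a
  sketch:)

```shell
# Current THP mode; the active value is shown in brackets,
# e.g. "always madvise [never]".
cat /sys/kernel/mm/transparent_hugepage/enabled

# Disable THP until the next boot (needs root):
# echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
```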

