wForget commented on issue #11542: URL: https://github.com/apache/incubator-gluten/issues/11542#issuecomment-3949611858
@marin-ma Thank you for your suggestion. I printed the jemalloc stats
when the executor exits. Resident memory (about 3.5 GB) is significantly
larger than allocated memory (about 1.9 GB). After adding the `narenas:10`
jemalloc option, the job ran successfully.
```
Allocated: 1906598072, active: 1933512704, metadata: 79745008 (n_thp 0),
resident: 3496644608, mapped: 4205060096, retained: 2011422720
```
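
To make the gap in the summary line concrete, here is a minimal sketch that parses the line and reports how far resident memory exceeds allocated memory. The helper names (`parse_summary`, `summarize`) are hypothetical and only for illustration; the numbers are copied from the output above.

```python
import re

# Summary line copied verbatim from the jemalloc output above.
SUMMARY = (
    "Allocated: 1906598072, active: 1933512704, metadata: 79745008 (n_thp 0), "
    "resident: 3496644608, mapped: 4205060096, retained: 2011422720"
)


def parse_summary(text):
    """Return {counter_name: bytes} for every 'name: number' pair in the line."""
    return {name.lower(): int(value) for name, value in re.findall(r"(\w+): (\d+)", text)}


def summarize(stats):
    """Express the resident-vs-allocated gap in bytes and as a ratio."""
    return {
        "overhead_bytes": stats["resident"] - stats["allocated"],
        "resident_to_allocated": stats["resident"] / stats["allocated"],
    }


if __name__ == "__main__":
    stats = parse_summary(SUMMARY)
    for key, value in summarize(stats).items():
        print(f"{key}: {value}")
```

With these numbers, resident memory is roughly 1.8 times the allocated bytes, which is the discrepancy described above.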
Detailed jemalloc statistics:
```
___ Begin jemalloc statistics ___
Version: "5.3.0-0-g54eaed1d8b56b1aa528be3bdd1877e59c56fa90c"
Build-time option settings
config.cache_oblivious: true
config.debug: false
config.fill: true
config.lazy_lock: false
config.malloc_conf: ""
config.opt_safety_checks: false
config.prof: true
config.prof_libgcc: true
config.prof_libunwind: false
config.stats: true
config.utrace: false
config.xmalloc: false
Run-time option settings
opt.abort: false
opt.abort_conf: false
opt.cache_oblivious: true
opt.confirm_conf: false
opt.retain: true
opt.dss: "secondary"
opt.narenas: 348
opt.percpu_arena: "disabled"
opt.oversize_threshold: 8388608
opt.hpa: false
opt.hpa_slab_max_alloc: 65536
opt.hpa_hugification_threshold: 1992294
opt.hpa_hugify_delay_ms: 10000
opt.hpa_min_purge_interval_ms: 5000
opt.hpa_dirty_mult: "0.25"
opt.hpa_sec_nshards: 4
opt.hpa_sec_max_alloc: 32768
opt.hpa_sec_max_bytes: 262144
opt.hpa_sec_bytes_after_flush: 131072
opt.hpa_sec_batch_fill_extra: 0
opt.metadata_thp: "disabled"
opt.mutex_max_spin: 600
opt.background_thread: false (background_thread: false)
opt.dirty_decay_ms: 10000 (arenas.dirty_decay_ms: 10000)
opt.muzzy_decay_ms: 0 (arenas.muzzy_decay_ms: 0)
opt.lg_extent_max_active_fit: 6
opt.junk: "false"
opt.zero: false
opt.experimental_infallible_new: false
opt.tcache: true
opt.tcache_max: 32768
opt.tcache_nslots_small_min: 20
opt.tcache_nslots_small_max: 200
opt.tcache_nslots_large: 20
opt.lg_tcache_nslots_mul: 1
opt.tcache_gc_incr_bytes: 65536
opt.tcache_gc_delay_bytes: 0
opt.lg_tcache_flush_small_div: 1
opt.lg_tcache_flush_large_div: 1
opt.thp: "default"
opt.prof: true
opt.prof_prefix: "/tmp/pilot_gluten_heap_perf"
opt.prof_active: true (prof.active: true)
opt.prof_thread_active_init: true (prof.thread_active_init: true)
opt.lg_prof_sample: 19 (prof.lg_sample: 19)
opt.prof_accum: false
opt.lg_prof_interval: -1
opt.prof_gdump: false
opt.prof_final: false
opt.prof_leak: false
opt.prof_leak_error: false
opt.stats_print: false
opt.stats_print_opts: ""
opt.stats_print: false
opt.stats_print_opts: ""
opt.stats_interval: -1
opt.stats_interval_opts: ""
opt.zero_realloc: "free"
Profiling settings
prof.thread_active_init: true
prof.active: true
prof.gdump: false
prof.interval: 0
prof.lg_sample: 19
Arenas: 349
Quantum size: 16
Page size: 4096
Maximum thread-cached size class: 32768
Number of bin size classes: 36
Number of thread-cache bin size classes: 41
Number of large size classes: 196
Allocated: 1906598072, active: 1933512704, metadata: 79745008 (n_thp 0),
resident: 3496644608, mapped: 4205060096, retained: 2011422720
Count of realloc(non-null-ptr, 0) calls: 0
Background threads: 0, num_runs: 0, run_interval: 0 ns
n_lock_ops (#/sec) n_waiting (#/sec)
n_spin_acq (#/sec) n_owner_switch (#/sec) total_wait_ns (#/sec)
max_wait_ns max_n_thds
background_thread 698 3 0 0
0 0 349 1 0 0
0 0
max_per_bg_thd 0 0 0 0
0 0 0 0 0 0
0 0
ctl 2 0 0 0
0 0 2 0 0 0
0 0
prof 765304 3845 2 0
31 0 9329 46 1000033 5025
1000033 1
prof_thds_data 304 1 0 0
1 0 301 1 0 0
0 0
prof_dump 3 0 0 0
0 0 3 0 0 0
0 0
prof_recent_alloc 2 0 0 0
0 0 2 0 0 0
0 0
prof_recent_dump 2 0 0 0
0 0 2 0 0 0
0 0
prof_stats 2 0 0 0
0 0 2 0 0 0
0 0
Merged arenas stats:
assigned threads: 486
uptime: 199395607872
dss allocation precedence: "N/A"
decaying: time npages sweeps madvises purged
dirty: N/A 365649 347 2524 5481511
muzzy: N/A 0 0 0 0
          allocated  nmalloc (#/sec)  ndalloc (#/sec)  nrequests (#/sec)    nfill (#/sec)  nflush (#/sec)
small:     94007480  1714566    8615  1186562    5962    7215628   36259   463336    2328  214867    1079
large:   1812590592  1164656    5852  1162134    5839    1222933    6145  1164656    5852   34057     171
total:   1906598072  2879222   14468  2348696   11802    8438561   42404  1627992    8180  248924    1250
active: 1933512704
mapped: 4205060096
retained: 2011422720
base: 64714224
internal: 15030784
metadata_thp: 0
tcache_bytes: 36307016
tcache_stashed_bytes: 0
resident: 3496644608
abandoned_vm: 0
extent_avail: 5045
n_lock_ops (#/sec) n_waiting (#/sec)
n_spin_acq (#/sec) n_owner_switch (#/sec) total_wait_ns (#/sec)
max_wait_ns max_n_thds
large 397 1 0 0
0 0 397 1 0 0
0 0
extent_avail 2001391 10057 0 0
0 0 2472 12 0 0
0 0
extents_dirty 2496985 12547 0 0
2 0 3396 17 0 0
0 0
extents_muzzy 397 1 0 0
0 0 397 1 0 0
0 0
extents_retained 22876 114 0 0
0 0 1618 8 0 0
0 0
decay_dirty 5655 28 0 0
0 0 558 2 0 0
0 0
decay_muzzy 397 1 0 0
0 0 397 1 0 0
0 0
base 22286 111 0 0
0 0 1476 7 0 0
0 0
tcache_list 965 4 0 0
0 0 915 4 0 0
0 0
hpa_shard 0 0 0 0
0 0 0 0 0 0
0 0
hpa_shard_grow 0 0 0 0
0 0 0 0 0 0
0 0
hpa_sec 0 0 0 0
0 0 0 0 0 0
0 0
bins: size ind allocated nmalloc (#/sec) ndalloc
(#/sec) nrequests (#/sec) nshards curregs curslabs
nonfull_slabs regs pgs util nfills (#/sec) nflushes (#/sec)
nslabs nreslabs (#/sec) n_lock_ops (#/sec) n_waiting (#/sec)
n_spin_acq (#/sec) n_owner_switch (#/sec) total_wait_ns (#/sec)
max_wait_ns max_n_thds
8 0 70600 41946 210 33121
166 198931 999 1 8825 56 3
512 1 0.307 21919 110 9311 46 64
39 0 31745 159 0 0 0
0 491 2 0 0 0 0
16 1 509440 60411 303 28571
143 745984 3748 1 31840 175 5
256 1 0.710 22085 110 9827 49 191
17 0 32544 163 0 0 0
0 523 2 0 0 0 0
32 2 5025216 224897 1130 67859
341 1625175 8166 1 157038 1344 116
128 1 0.912 45322 227 12163 61 1398
22883 114 60029 301 0 0 0
0 1479 7 0 0 0
0
48 3 2931984 83742 420 22659
113 1034011 5196 1 61083 359 48
256 3 0.664 13070 65 4302 21 625
2825 14 18514 93 0 0 0
0 730 3 0 0 0
0
64 4 2629824 92683 465 51592
259 524518 2635 1 41091 669 38
64 1 0.959 30355 152 12693 63 768
16530 83 44349 222 0 0 0
0 731 3 0 0 0
0
80 5 2739680 51898 260 17652
88 473599 2379 1 34246 195 19
256 5 0.686 12808 64 8134 40 204
8976 45 21746 109 0 0 0
0 711 3 0 0 0
0
96 6 3444000 51358 258 15483
77 554977 2788 1 35875 311 18
128 3 0.901 11839 59 8207 41 321
107 0 20862 104 0 0 0
0 557 2 0 0 0 0
112 7 2061920 41370 207 22960
115 138613 696 1 18410 119 3
256 7 0.604 18317 92 10301 51 122
6 0 29187 146 0 0 0
0 531 2 0 0 0 0
128 8 4391808 73475 369 39164
196 163208 820 1 34311 1108 70
32 1 0.967 26624 133 11767 59 1261
4622 23 40290 202 0 0 0
0 769 3 0 0 0
0
160 9 2637920 20840 104 4353
21 89683 450 1 16487 162 3
128 5 0.795 1058 5 605 3 177
331 1 2289 11 0 0 0
0 532 2 0 0 0 0
192 10 2500032 51855 260 38834
195 272053 1367 1 13021 228 10
64 3 0.892 23571 118 10338 51 251
57 0 53476 268 0 0 0
0 717 3 0 0 0 0
224 11 2851744 364277 1830 351546
1766 589639 2963 1 12731 148 9
128 7 0.672 45104 226 10591 53 942
31800 159 620451 3117 0 0 0
0 1991 10 0 0 0
0
256 12 1400064 190799 958 185330
931 199567 1002 1 5469 376 32
16 1 0.909 11837 59 7060 35 1041
17919 90 360486 1811 0 0 0
0 944 4 0 0 0
0
320 13 10816320 125568 630 91767
461 202172 1015 1 33801 574 22
64 5 0.920 25472 128 10515 52 676
1937 9 134665 676 1 0 0
0 6263 31 1000033 5025 1000033
1
384 14 693120 4485 22 2680
13 3847 19 1 1805 73 8
32 3 0.772 537 2 398 2 448
31 0 4217 21 0 0 0
0 658 3 0 0 0 0
448 15 716800 14497 72 12897
64 28883 145 1 1600 43 7
64 7 0.581 10869 54 9335 46 67
11 0 20831 104 0 0 0
0 572 2 0 0 0 0
512 16 1647616 11155 56 7937
39 25998 130 1 3218 445 15
8 1 0.903 5504 27 3614 18 781
1001 5 10467 52 0 0 0
0 635 3 0 0 0
0
640 17 1374080 3614 18 1467
7 7909 39 1 2147 85 11
32 5 0.789 452 2 342 1 130
26 0 1590 7 0 0 0
0 660 3 0 0 0 0
768 18 1238016 11858 59 10246
51 30537 153 1 1612 118 10
16 3 0.853 7523 37 3351 16 411
1102 5 11890 59 0 0 0
0 611 3 0 0 0
0
896 19 1239168 11296 56 9913
49 19056 95 1 1383 61 5
32 7 0.708 8564 43 8605 43 80
9 0 17816 89 0 0 0
0 585 2 0 0 0 0
1024 20 1276928 4259 21 3012
15 6012 30 1 1247 359 45
4 1 0.868 1992 10 1621 8 645
659 3 5222 26 0 0 0
0 932 4 0 0 0 0
1280 21 2064640 11436 57 9823
49 10825 54 1 1613 114 10
16 5 0.884 1344 6 1018 5 695
78 0 3846 19 0 0 2
0 874 4 0 0 0 0
1536 22 1096704 3916 19 3202
16 6061 30 1 714 122 9
8 3 0.731 2918 14 1629 8 200
73 0 5210 26 0 0 0
0 526 2 0 0 0 0
1792 23 2057216 10293 51 9145
45 19479 97 1 1148 84 7
16 7 0.854 8579 43 8505 42 202
107 0 17754 89 0 0 0
0 533 2 0 0 0 0
2048 24 2684928 11112 55 9801
49 21221 106 1 1311 739 8
2 1 0.887 7385 37 5144 25 5095
3881 19 18399 92 0 0 0
0 830 4 0 0 0
0
2560 25 1704960 3198 16 2532
12 5296 26 1 666 115 11
8 5 0.723 1550 7 1328 6 205
53 0 3552 17 0 0 0
0 543 2 0 0 0 0
3072 26 2691072 19242 96 18366
92 31402 157 1 876 256 8
4 3 0.855 14728 74 6516 32 2110
3120 15 23815 119 0 0 0
0 557 2 0 0 0
0
3584 27 3333120 5432 27 4502
22 7863 39 1 930 164 18
8 7 0.708 2565 12 2270 11 251
1465 7 5582 28 0 0 0
0 585 2 0 0 0
0
4096 28 1384448 31401 157 31063
156 53567 269 1 338 338 0
1 1 1 20804 104 9010 45 31401
0 0 61679 309 0 0 0
0 508 2 0 0 0 0
5120 29 2094080 6003 30 5594
28 9210 46 1 409 120 5
4 5 0.852 3283 16 2929 14 2139
2286 11 8809 44 0 0 0
0 508 2 0 0 0
0
6144 30 1646592 28787 144 28519
143 44661 224 1 268 142 5
2 3 0.943 22500 113 7919 39 12505
9412 47 43369 217 0 0 0
0 485 2 0 0 0
0
7168 31 6752256 2014 10 1072
5 6236 31 1 942 265 48
4 7 0.888 806 4 755 3 914
64 0 2927 14 0 0 0
0 489 2 0 0 0 0
8192 32 9076736 25408 127 24300
122 39710 199 1 1108 1108 0
1 2 1 17512 88 8467 42 25408
0 0 51832 260 0 0 0
0 486 2 0 0 0 0
10240 33 2027520 1164 5 966
4 1782 8 1 198 106 4
2 5 0.933 459 2 487 2 590
85 0 1991 10 0 0 0
0 501 2 0 0 0 0
12288 34 1720320 18528 93 18388
92 23508 118 1 140 140 0
1 3 1 13955 70 5662 28 18528
0 0 38586 193 0 0 0
0 468 2 0 0 0 0
14336 35 1476608 349 1 246
1 435 2 1 103 55 1
2 7 0.936 126 0 148 0 160
48 0 874 4 0 0 0
0 457 2 0 0 0 0
large: size ind allocated nmalloc (#/sec) ndalloc
(#/sec) nrequests (#/sec) curlextents
16384 36 3145728 15551 78 15359
77 16319 82 192
20480 37 901120 1789 8 1745
8 40362 202 44
24576 38 1548288 5379 27 5316
26 5536 27 63
28672 39 14307328 2717 13 2218
11 3231 16 499
32768 40 28213248 71532 359 70671
355 89797 451 861
40960 41 860160 17153 86 17132
86 17153 86 21
49152 42 2457600 458054 2301 458004
2301 458054 2301 50
57344 43 286720 2694 13 2689
13 2694 13 5
65536 44 1376256 175700 882 175679
882 175700 882 21
81920 45 737280 24224 121 24215
121 24224 121 9
98304 46 294912 14284 71 14281
71 14284 71 3
114688 47 78217216 2763 13 2081
10 2763 13 682
131072 48 131072 44451 223 44450
223 44451 223 1
163840 49 163840 3891 19 3890
19 3891 19 1
196608 50 589824 69748 350 69745
350 69748 350 3
229376 51 458752 2071 10 2069
10 2071 10 2
262144 52 262144 42102 211 42101
211 42102 211 1
327680 53 1310720 2449 12 2445
12 2449 12 4
393216 54 393216 558 2 557
2 558 2 1
458752 55 0 267 1 267
1 267 1 0
524288 56 1572864 32868 165 32865
165 32868 165 3
655360 57 0 195 0 195
0 195 0 0
786432 58 0 166 0 166
0 166 0 0
917504 59 0 74 0 74
0 74 0 0
1048576 60 3145728 115968 582 115965
582 115968 582 3
1310720 61 1310720 15348 77 15347
77 15348 77 1
1572864 62 0 35 0 35
0 35 0 0
1835008 63 0 8 0 8
0 8 0 0
2097152 64 0 5673 28 5673
28 5673 28 0
2621440 65 2621440 19 0 18
0 19 0 1
3145728 66 3145728 35020 175 35019
175 35020 175 1
3670016 67 0 17 0 17
0 17 0 0
4194304 68 109051904 1473 7 1447
7 1473 7 26
5242880 69 0 33 0 33
0 33 0 0
6291456 70 0 19 0 19
0 19 0 0
---
12582912 74 12582912 16 0 15
0 16 0 1
---
16777216 76 0 1 0 1
0 1 0 0
---
25165824 78 0 12 0 12
0 12 0 0
---
41943040 81 0 1 0 1
0 1 0 0
50331648 82 0 2 0 2
0 2 0 0
---
67108864 84 1543503872 327 1 304
1 327 1 23
---
100663296 86 0 2 0 2
0 2 0 0
---
201326592 90 0 2 0 2
0 2 0 0
---
extents: size ind ndirty dirty nmuzzy muzzy
nretained retained ntotal total
4096 0 97 397312 0 0
103 421888 200 819200
8192 1 73 598016 0 0
98 802816 171 1400832
12288 2 54 663552 0 0
53 651264 107 1314816
16384 3 8 131072 0 0
46 753664 54 884736
20480 4 8 163840 0 0
39 798720 47 962560
24576 5 6 147456 0 0
26 638976 32 786432
28672 6 115 3297280 0 0
27 774144 142 4071424
32768 7 11 360448 0 0
21 688128 32 1048576
40960 8 114 4214784 0 0
33 1245184 147 5459968
49152 9 15 684032 0 0
15 704512 30 1388544
57344 10 6 319488 0 0
11 618496 17 937984
65536 11 1 65536 0 0
2 126976 3 192512
81920 12 37 2772992 0 0
14 1036288 51 3809280
98304 13 11 983040 0 0
12 1060864 23 2043904
114688 14 17 1884160 0 0
1 106496 18 1990656
131072 15 6 741376 0 0
9 1138688 15 1880064
163840 16 15 2224128 0 0
5 749568 20 2973696
196608 17 15 2752512 0 0
6 1105920 21 3858432
229376 18 12 2580480 0 0
5 1101824 17 3682304
262144 19 8 2007040 0 0
2 507904 10 2514944
327680 20 11 3117056 0 0
2 573440 13 3690496
393216 21 20 6959104 0 0
1 368640 21 7327744
458752 22 11 4628480 0 0
0 0 11 4628480
524288 23 5 2453504 0 0
1 483328 6 2936832
655360 24 15 8835072 0 0
4 2396160 19 11231232
786432 25 8 5787648 0 0
5 3522560 13 9310208
917504 26 8 6922240 0 0
3 2449408 11 9371648
1048576 27 5 4911104 0 0
224 233881600 229 238792704
1310720 28 7 8294400 0 0
8 9256960 15 17551360
1572864 29 5 7172096 0 0
14 20525056 19 27697152
1835008 30 4 7020544 0 0
25 42569728 29 49590272
2097152 31 4 8007680 0 0
332 675139584 336 683147264
2621440 32 5 11796480 0 0
6 13918208 11 25714688
3145728 33 8 23060480 0 0
46 143540224 54 166600704
3670016 34 9 30277632 0 0
8 28930048 17 59207680
4194304 35 5 19058688 0 0
27 112807936 32 131866624
5242880 36 277 1164173312 0 0
5 25681920 282 1189855232
6291456 37 27 141938688 0 0
7 43548672 34 185487360
7340032 38 1 6295552 0 0
4 29360128 5 35655680
8388608 39 0 0 0 0
2 16773120 2 16773120
---
12582912 41 0 0 0 0
2 22429696 2 22429696
14680064 42 0 0 0 0
1 14680064 1 14680064
16777216 43 0 0 0 0
1 16773120 1 16773120
---
33554432 47 0 0 0 0
3 100638720 3 100638720
---
50331648 49 0 0 0 0
2 100659200 2 100659200
---
67108864 51 0 0 0 0
5 335482880 5 335482880
---
Bytes in small extent cache: 0
HPA shard stats:
Purge passes: 0 (0 / sec)
Purges: 0 (0 / sec)
Hugeifies: 0 (0 / sec)
Dehugifies: 0 (0 / sec)
In full slabs:
npageslabs: 0 huge, 0 nonhuge
nactive: 0 huge, 0 nonhuge
ndirty: 0 huge, 0 nonhuge
nretained: 0 huge, 0 nonhuge
In empty slabs:
npageslabs: 0 huge, 0 nonhuge
nactive: 0 huge, 0 nonhuge
ndirty: 0 huge, 0 nonhuge
nretained: 0 huge, 0 nonhuge
```
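
One plausible reading of the dump (my interpretation, not something confirmed in this thread) is that most of the resident-vs-active gap is unused dirty pages that have not yet decayed. The jemalloc manual describes `stats.resident` as covering metadata, pages backing active allocations, and unused dirty pages. The sketch below compares the merged `dirty` page count with the part of resident memory not explained by active and metadata bytes. All constants are copied from the statistics above; the function names are hypothetical.

```python
# Values copied from the merged arena statistics above.
PAGE_SIZE = 4096        # "Page size: 4096"
DIRTY_NPAGES = 365649   # npages column of the "dirty:" decay row
RESIDENT = 3496644608
ACTIVE = 1933512704
METADATA = 79745008


def dirty_bytes(npages, page_size=PAGE_SIZE):
    """Bytes held in unused dirty pages across all arenas."""
    return npages * page_size


def unexplained_resident(resident, active, metadata):
    """Resident bytes not accounted for by active allocations or metadata."""
    return resident - active - metadata


if __name__ == "__main__":
    dirty = dirty_bytes(DIRTY_NPAGES)
    gap = unexplained_resident(RESIDENT, ACTIVE, METADATA)
    print(f"dirty pages:        {dirty} bytes")
    print(f"resident remainder: {gap} bytes")
    print(f"dirty / remainder:  {dirty / gap:.3f}")
```

Both figures come out near 1.5 GB. That would be consistent with dirty pages spread over the 349 arenas accounting for most of the extra resident memory, and with a smaller `narenas` reducing it, but the accounting is approximate (jemalloc documents `resident` as a maximum), so treat this as a rough check.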
