New submission from Dong-hee Na <>:

Compiling CPython with the PGO option is good for CPython performance, but 
compile time is very painful since PGO profiling is executed with a single 
worker.

When I tested with run -m test --pgo -j8, the build time was much shorter and 
the optimized result was not affected.

So I would like to provide an option for the number of workers for the PGO build. 
With this feature, we could also include more tests in the PGO run.

@vstinner, do you have any suggestions for this option?
- a: ./configure --enable-optimizations --pgo-workers=8
- b: ./configure --enable-optimizations --with-concurrent-pgo
- c: ./configure --enable-optimizations (By detecting system cpu count)
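
As a rough illustration of option c, the sketch below builds the regrtest arguments for the profiling run and falls back to the detected CPU count when no worker count is given. The helper name `pgo_profile_args` and its `workers` parameter are hypothetical and only for illustration; the `-m test --pgo -jN` arguments are the ones used in the test above.

```python
import os
import shlex


def pgo_profile_args(workers=None):
    """Build regrtest arguments for the PGO profiling run.

    `workers` is a hypothetical setting: None means "detect the CPU
    count" (option c), an integer means an explicit count (option a).
    """
    if workers is None:
        # os.cpu_count() may return None when the count cannot be determined.
        workers = os.cpu_count() or 1
    if workers < 1:
        raise ValueError("workers must be >= 1")
    return ["-m", "test", "--pgo", f"-j{workers}"]


print(shlex.join(pgo_profile_args(8)))  # -m test --pgo -j8
```

With automatic detection the build behaves differently from machine to machine, which is a point in favour of an explicit option when reproducible builds matter.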

The following metrics are a reference for decision making :)

## Build Time
AS-IS:
real    4m42.799s

TO-BE (this case -j8):
real    2m10.405s
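
To put the two `real` timings above in relative terms, this small sketch parses them and computes the speedup. It assumes the first timing is the single-worker build and the second is the -j8 build; `parse_real` is a hypothetical helper name.

```python
import re


def parse_real(text):
    """Convert a `time`-style value such as '4m42.799s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


single = parse_real("4m42.799s")    # assumed: single-worker PGO build
parallel = parse_real("2m10.405s")  # PGO build with -j8
speedup = single / parallel

print(f"{single:.3f}s -> {parallel:.3f}s, {speedup:.2f}x faster")
# 282.799s -> 130.405s, 2.17x faster
```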

## No performance regression
I didn't check how reliable the environment is, but there appears to be no regression.
| Benchmark              | base    | workers               |
|------------------------|---------|-----------------------|
| 2to3                   | 409 ms  | 412 ms: 1.01x slower  |
| chaos                  | 115 ms  | 114 ms: 1.01x faster  |
| deltablue              | 6.66 ms | 6.59 ms: 1.01x faster |
| fannkuch               | 605 ms  | 611 ms: 1.01x slower  |
| float                  | 138 ms  | 129 ms: 1.07x faster  |
| go                     | 220 ms  | 215 ms: 1.02x faster  |
| hexiom                 | 10.3 ms | 10.1 ms: 1.02x faster |
| json_dumps             | 19.6 ms | 19.2 ms: 1.02x faster |
| json_loads             | 40.6 us | 39.7 us: 1.02x faster |
| logging_silent         | 180 ns  | 173 ns: 1.04x faster  |
| logging_simple         | 8.89 us | 8.81 us: 1.01x faster |
| nqueens                | 134 ms  | 136 ms: 1.01x slower  |
| pathlib                | 24.6 ms | 24.2 ms: 1.01x faster |
| pickle                 | 16.1 us | 15.9 us: 1.01x faster |
| pickle_dict            | 41.4 us | 38.1 us: 1.09x faster |
| pickle_list            | 6.27 us | 5.09 us: 1.23x faster |
| pickle_pure_python     | 499 us  | 492 us: 1.01x faster  |
| pidigits               | 285 ms  | 290 ms: 1.02x slower  |
| python_startup         | 12.1 ms | 12.2 ms: 1.01x slower |
| python_startup_no_site | 8.91 ms | 8.89 ms: 1.00x faster |
| raytrace               | 510 ms  | 500 ms: 1.02x faster  |
| regex_compile          | 211 ms  | 210 ms: 1.00x faster  |
| regex_effbot           | 4.99 ms | 4.88 ms: 1.02x faster |
| regex_v8               | 37.3 ms | 36.3 ms: 1.03x faster |
| richards               | 73.6 ms | 72.2 ms: 1.02x faster |
| scimark_fft            | 542 ms  | 552 ms: 1.02x slower  |
| scimark_lu             | 189 ms  | 184 ms: 1.03x faster  |
| scimark_monte_carlo    | 106 ms  | 106 ms: 1.01x slower  |
| scimark_sor            | 199 ms  | 196 ms: 1.01x faster  |
| spectral_norm          | 177 ms  | 176 ms: 1.01x faster  |
| unpack_sequence        | 64.9 ns | 63.7 ns: 1.02x faster |
| unpickle               | 21.5 us | 21.6 us: 1.00x slower |
| unpickle_list          | 7.69 us | 7.55 us: 1.02x faster |
| unpickle_pure_python   | 402 us  | 394 us: 1.02x faster  |
| xml_etree_parse        | 218 ms  | 217 ms: 1.01x faster  |
| xml_etree_iterparse    | 156 ms  | 156 ms: 1.01x faster  |
| xml_etree_generate     | 132 ms  | 131 ms: 1.01x faster  |
| xml_etree_process      | 92.8 ms | 91.5 ms: 1.02x faster |
| Geometric mean         | (ref)   | 1.02x faster          |

Benchmark hidden because not significant (8): logging_format, meteor_contest, 
nbody, pyflate, regex_dna, scimark_sparse_mat_mult, sqlite_synth, telco
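
For readers checking the table, the sketch below shows how a "faster/slower" label can be derived from the two timing columns, and how a geometric mean over such ratios is formed. It uses only four rows copied from the table, so its mean is not expected to match the reported value, which covers the full benchmark set with unrounded timings. The helper `describe` is hypothetical and is not the benchmarking tool's own code.

```python
from statistics import geometric_mean

# (base, workers) timings copied from the table; units match within each row.
rows = {
    "2to3": (409, 412),
    "float": (138, 129),
    "pickle_dict": (41.4, 38.1),
    "pickle_list": (6.27, 5.09),
}


def describe(base, new):
    """Label a result relative to the base timing."""
    if new <= base:
        return f"{base / new:.2f}x faster"
    return f"{new / base:.2f}x slower"


for name, (base, new) in rows.items():
    print(f"{name}: {describe(base, new)}")

# A ratio above 1 means the workers build was faster on that benchmark.
subset_mean = geometric_mean(base / new for base, new in rows.values())
print(f"geometric mean over this subset: {subset_mean:.2f}x")
```

Because the table shows rounded timings, a label recomputed this way can differ in the last digit for rows where the two values are very close.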

assignee: corona10
components: Build
messages: 411888
nosy: corona10, gvanrossum, vstinner
priority: normal
severity: normal
status: open
title: Provide number of workers option for fast PGO build time
type: enhancement
versions: Python 3.11

Python tracker <>
Python-bugs-list mailing list
