[gcc(refs/vendors/microsoft/heads/main)] Merge commit 'ffb521f157fec823907913602ed5cb73dd1c63e8' into merge-master-ffb521f157fec823907913602e

2024-05-06 Thread Eugene Rozenfeld via Gcc-cvs
https://gcc.gnu.org/g:da1eddccf7ec1836d0a00a5cc09e945d90e3de95

commit da1eddccf7ec1836d0a00a5cc09e945d90e3de95
Merge: 012850a7800 ffb521f157f
Author: MS Automation 
Date:   Sun May 5 13:29:43 2024 +

Merge commit 'ffb521f157fec823907913602ed5cb73dd1c63e8' into 
merge-master-ffb521f157fec823907913602ed5cb73dd1c63e8-2024-05-05T13-29-40

Diff:

 contrib/ChangeLog  | 5 +
 contrib/gcc-changelog/git_update_version.py| 4 +-
 gcc/BASE-VER   | 2 +-
 gcc/ChangeLog  |  1124 ++
 gcc/DATESTAMP  | 2 +-
 gcc/ada/ChangeLog  | 4 +
 gcc/ada/gnatvsn.ads| 2 +-
 gcc/analyzer/ChangeLog |31 +
 gcc/analyzer/analyzer.cc   |15 +-
 gcc/analyzer/analyzer.h| 4 +-
 gcc/analyzer/engine.cc | 2 +-
 gcc/analyzer/kf.cc |22 +
 gcc/analyzer/known-function-manager.cc |38 +-
 gcc/analyzer/known-function-manager.h  | 5 +
 gcc/analyzer/sm-file.cc| 8 +
 gcc/analyzer/sm-malloc.cc  | 1 +
 gcc/analyzer/sm-signal.cc  |11 +-
 gcc/bitmap.cc  | 5 +-
 gcc/c-family/ChangeLog |22 +
 gcc/c-family/c-attribs.cc  | 2 +-
 gcc/c-family/c-cppbuiltin.cc   | 5 +-
 gcc/c-family/c.opt | 2 +-
 gcc/c/ChangeLog| 5 +
 gcc/c/gimple-parser.cc |35 +
 gcc/cfgexpand.cc   |57 +-
 gcc/cfgrtl.cc  | 3 +-
 gcc/common.opt | 6 +
 gcc/common/config/i386/i386-common.cc  | 2 +-
 gcc/common/config/riscv/riscv-common.cc| 8 +-
 gcc/config.gcc |42 +-
 gcc/config/aarch64/aarch64-ldp-fusion.cc   | 8 -
 gcc/config/aarch64/aarch64-ldpstp.md   | 8 +-
 gcc/config/aarch64/aarch64-tuning-flags.def| 3 -
 gcc/config/aarch64/aarch64.cc  |   199 +-
 gcc/config/aarch64/tuning_models/emag.h| 2 +-
 gcc/config/aarch64/tuning_models/xgene1.h  | 2 +-
 gcc/config/arm/arm.cc  |69 +
 gcc/config/bpf/bpf-c.cc|88 +
 gcc/config/bpf/bpf.cc  |24 +-
 gcc/config/bpf/bpf.h   | 9 +-
 gcc/config/bpf/bpf.opt | 2 +-
 gcc/config/bpf/t-bpf   | 4 +
 gcc/config/darwin.opt  | 2 +-
 gcc/config/epiphany/epiphany.opt   | 2 +-
 gcc/config/gcn/gcn-hsa.h   | 2 +-
 gcc/config/gcn/gcn-opts.h  | 2 +
 gcc/config/gcn/gcn.cc  | 8 +
 gcc/config/gcn/gcn.h   | 2 +
 gcc/config/gcn/gcn.opt | 3 +
 gcc/config/gcn/mkoffload.cc| 9 +
 gcc/config/gcn/t-omp-device| 2 +-
 gcc/config/i386/i386-options.cc| 1 +
 gcc/config/i386/i386.md|25 +-
 gcc/config/loongarch/genopts/gen-evolution.awk |   230 +
 gcc/config/loongarch/genopts/genstr.sh |82 +-
 gcc/config/loongarch/genopts/isa-evolution.in  |10 +-
 gcc/config/loongarch/genopts/loongarch-strings | 5 +-
 gcc/config/loongarch/genopts/loongarch.opt.in  |43 +-
 gcc/config/loongarch/loongarch-c.cc|60 +-
 gcc/config/loongarch/loongarch-cpu.cc  |37 +-
 gcc/config/loongarch/loongarch-def.cc  |83 +-
 gcc/config/loongarch/loongarch-def.h   |37 +-
 gcc/config/loongarch/loongarch-driver.cc   | 8 +-
 gcc/config/loongarch/loongarch-evolution.cc|60 +
 ...oongarch-cpucfg-map.h => loongarch-evolution.h} |46 +-
 gcc/config/loongarch/loongarch-opts.cc |66 +-
 gcc/config/loongarch/loongarch-opts.h  |15 +-
 gcc/config/loongarch/loongarch-str.h   | 5 +-
 gcc/config/loongarch/loongarch.cc  |11 +-
 gcc/config/loongarch/loongarch.h   | 1 +
 gcc/config/loongarch/loongarch.md  |16 +-
 gcc/config/loongarch/loongarch.opt |43 +-
 gcc/config/loongarch/t-loongarch   |26 +-
 gcc/config/mips/i6400.md   | 6 +
 

[gcc/microsoft/heads/main] (249 commits) Merge commit 'ffb521f157fec823907913602ed5cb73dd1c63e8' int

2024-05-06 Thread Eugene Rozenfeld via Gcc-cvs
The branch 'microsoft/heads/main' was updated to point to:

 da1eddccf7e... Merge commit 'ffb521f157fec823907913602ed5cb73dd1c63e8' int

It previously pointed to:

 012850a7800... Merge commit 'ef2392236ec629351496d7f299d6a0956080e4d9' int

Diff:

Summary of changes (added commits):
---

  da1eddc... Merge commit 'ffb521f157fec823907913602ed5cb73dd1c63e8' int
  ffb521f... Remove m_nloops field from loop_versioning (*)
  fda7419... Daily bump. (*)
  6cec31d... Add prange entries in gimple-range-op.cc. (*)
  ff306c7... Implement operator_ge for prange (*)
  76fae40... Implement operator_gt for prange. (*)
  3a4ee6e... Implement operator_le for prange. (*)
  bfa2323... Implement operator_lt for prange. (*)
  bcb2269... Implement operator_equal for prange. (*)
  d1be4c9... Implement operator_not_equal for prange. (*)
  6b9e640... Implement operator_bitwise_or for prange. (*)
  e58f149... Implement operator_bitwise_and for prange. (*)
  f803b93... Implement operator_pointer_diff for prange. (*)
  86ff3c4... Implement pointer_plus_operator for prange. (*)
  54d3fd6... Implement operator_addr_expr for prange. (*)
  1a4f5d4... Implement operator_min and operator_max for prange. (*)
  95fce0d... Implement operator_cast for prange. (*)
  a91fd7b... Implement operator_cst for prange. (*)
  e7b6e96... Implement operator_identity for prange. (*)
  31377ee... Implement range-op dispatch for prange. (*)
  ddf039a... Add prange implementation for get_legacy_range. (*)
  7c1b136... Add hashing support for prange. (*)
  f859996... Add storage support for prange. (*)
  a7f1285... Add streaming support for prange. (*)
  64993a8... Implement basic prange class. (*)
  f589196... Minimal prange class showing inlining degradation to VRP. (*)
  3e3d115... Daily bump. (*)
  b69dac5... libbacktrace: add DLLS as they are loaded (*)
  04f24e4... Fix printing COMPOUND_EXPR in .original [PR23872] (*)
  8f3afb8... c++: initializer_list and EH [PR114935] (*)
  c943d7b... PR modula2/114929 extra for loop iteration count regression (*)
  cb5d904... c++: Implement __is_pointer built-in trait (*)
  9b51b3e... c++: Implement __is_volatile built-in trait (*)
  7dd8c90... c++: Implement __is_const built-in trait (*)
  467ca4a... RISC-V: miscll comment fixes [NFC] (*)
  301b954... docs: rtl: document GET_MODE_INNER (*)
  5219414... testsuite: fix analyzer C++ failures on Solaris [PR111475] (*)
  a2e9032... Add default bitmap obstack allocation check (*)
  c5245c8... Silence two instances of -Wcalloc-transposed-args (*)
  212d93f... libstdc++: Update powerpc-linux-gnu baseline_symbols (*)
  7a212ac... Avoid changing type in the type_hash_canon hash (*)
  fe40d52... cfgrtl: Fix MEM_EXPR update in duplicate_insn_chain [PR1149 (*)
  7117e1f... tree-inline: Add __builtin_stack_{save,restore} pair about  (*)
  87e35da... tree-optimization/114921 - _Float16 -> __bf16 isn't noop (*)
  f04dc89... c++: remove lookup_template_class's entering_scope flag (*)
  a561dc0... PR modula2/114929 for loop fails to iterate down to zero wh (*)
  f2d0116... Daily bump. (*)
  8367c99... [committed][RISC-V] Fix nearbyint failure on rv32 and forma (*)
  affd24b... libgfortran: Fix up the autoreconf warnings (*)
  9b54bea... [RFA][RISC-V] Improve constant synthesis for constants with (*)
  bd010e9... Regenerate gcc.pot (*)
  ff4dc8b... RISC-V: Add testcase for pr114734 (*)
  1e29da0... [committed] [RISC-V] Don't run new rounding tests on newlib (*)
  7317d62... c++: Clear is_unbraced_* when parsing declaration_seq_opt [ (*)
  67e66c9... modula2: Regenerate libgm2 Makefile.ins using correct inclu (*)
  bf33fe9... Improve SLP dump and graph (*)
  9b5c0be... Objective-C, NeXT, v2: Correct a regression in code-gen. (*)
  43dc430... PR modula2/113836 gm2 does not dump gimple or quadruples to (*)
  4bb21b5... fix single argument static_assert (*)
  a704554... lto-wrapper: Truncate files using -truncate driver option [ (*)
  985b5a9... Driver: Add new -truncate option (*)
  5eb25d1... libgomp: Add gfx90c, 1036 and 1103 declare variant tests (*)
  2f15787... c++: Implement C++26 P2573R2 - = delete("should have a reas (*)
  c59708f... Make graph dumps use graphviz format (*)
  5176402... Remove live-info global bitmap (*)
  66f49cc... s390: testsuite: Fix risbg-ll-2.c (*)
  02917ac... c++: Don't emit unused GMF partial specializations [PR11463 (*)
  6c4a745... s390: testsuite: Fix zero_bits_compound-1.c (*)
  bbe8359... middle-end/114579 - speed up add_scope_conflicts (*)
  ad30265... c++: Implement modules ABI for vtable emissions (*)
  fd48e67... Daily bump. (*)
  ff98aab... doc: Describe limitations re Ada, D, and Go on FreeBSD (*)
  a12cae9... c++: drop in-charge for dtors without vbases (*)
  76ca6e1... [committed] [RISC-V] Trivial pattern cleanup (*)
  fad93e7... [committed] [RISC-V] Fix detection of store pair fusion cas (*)
  1fbe1a5... libstdc++: Guard uses of is_pointer_interconvertible_v [PR1 (*)
  c3bc278... c++: const void* memchr 

[gcc(refs/vendors/microsoft/heads/main)] Merge commit 'ef2392236ec629351496d7f299d6a0956080e4d9' into merge-master-ef2392236ec629351496d7f299

2024-04-22 Thread Eugene Rozenfeld via Gcc-cvs
https://gcc.gnu.org/g:012850a7800c90506ecefda08067edf7c0035826

commit 012850a7800c90506ecefda08067edf7c0035826
Merge: b707d632683 ef2392236ec
Author: MS Automation 
Date:   Sun Apr 21 13:17:55 2024 +

Merge commit 'ef2392236ec629351496d7f299d6a0956080e4d9' into 
merge-master-ef2392236ec629351496d7f299d6a0956080e4d9-2024-04-21T13-17-53

Diff:

 ChangeLog  |38 +
 MAINTAINERS| 6 +-
 Makefile.in|33 +
 Makefile.tpl   |26 +-
 config/ChangeLog   |14 +
 config/acx.m4  |10 +
 config/lcmessage.m4| 4 +-
 configure  |80 +
 configure.ac   |30 +
 contrib/ChangeLog  |18 +
 contrib/check-params-in-docs.py|19 +-
 contrib/config-list.mk | 3 +-
 contrib/gcc-changelog/git_update_version.py| 3 +-
 gcc/ChangeLog  |  1472 ++
 gcc/DATESTAMP  | 2 +-
 gcc/ada/ChangeLog  | 9 +
 .../doc/gnat_rm/implementation_defined_aspects.rst |12 +
 .../doc/gnat_rm/implementation_defined_pragmas.rst |18 +
 gcc/ada/gnat_rm.texi   |  1649 +-
 gcc/ada/gnat_ugn.texi  | 4 +-
 gcc/analyzer/ChangeLog |85 +
 gcc/analyzer/access-diagram.cc |12 +-
 gcc/analyzer/analyzer.opt  | 2 +-
 gcc/analyzer/call-details.cc   |33 +-
 gcc/analyzer/call-summary.cc   |12 +
 gcc/analyzer/infinite-loop.cc  |22 +
 gcc/analyzer/infinite-recursion.cc |13 +
 gcc/analyzer/program-state.cc  | 4 +-
 gcc/analyzer/ranges.cc |15 +
 gcc/analyzer/ranges.h  | 4 +
 gcc/analyzer/region-model-manager.cc   | 2 +
 gcc/analyzer/region-model.cc   | 9 +-
 gcc/analyzer/region.cc | 2 +-
 gcc/analyzer/sm-taint.cc   |15 +-
 gcc/analyzer/store.cc  |20 +-
 gcc/asan.cc|29 +-
 gcc/attribs.cc | 7 +-
 gcc/auto-profile.cc| 1 -
 gcc/btfout.cc  |   169 +-
 gcc/builtins.cc| 2 +-
 gcc/c-family/ChangeLog |34 +
 gcc/c-family/c-common.cc   |15 +-
 gcc/c-family/c-pretty-print.cc | 2 +
 gcc/c-family/c-warn.cc | 1 -
 gcc/c-family/c.opt | 4 +
 gcc/c-family/c.opt.urls| 3 +
 gcc/c/ChangeLog|37 +
 gcc/c/c-decl.cc|53 +-
 gcc/c/c-typeck.cc  | 4 +-
 gcc/cfganal.cc |15 +-
 gcc/cfgloopmanip.cc| 2 +-
 gcc/cgraph.h   | 6 +
 gcc/cgraphunit.cc  | 2 +
 gcc/collect2.cc| 7 +-
 gcc/combine.cc |16 +-
 gcc/common.opt | 9 +
 gcc/common.opt.urls| 6 +
 gcc/common/config/riscv/riscv-common.cc| 1 -
 gcc/config.gcc |46 +-
 gcc/config/aarch64/aarch64-arches.def  | 2 +-
 gcc/config/aarch64/aarch64-c.cc| 6 +-
 gcc/config/aarch64/aarch64-cores.def   | 4 +-
 gcc/config/aarch64/aarch64-feature-deps.h  | 6 +-
 gcc/config/aarch64/aarch64-gnu.h   |68 +
 gcc/config/aarch64/aarch64-ldp-fusion.cc   | 2 +-
 gcc/config/aarch64/aarch64-option-extensions.def   |93 +-
 gcc/config/aarch64/aarch64-protos.h| 6 +-
 gcc/config/aarch64/aarch64-sve-builtins-base.cc|52 +-
 gcc/config/aarch64/aarch64-sve-builtins.cc |   104 +-
 gcc/config/aarch64/aarch64-sve-builtins.h  |18 +-
 gcc/config/aarch64/aarch64-sve.md  |22 +-
 gcc/config/aarch64/aarch64.cc  |   138 +-
 gcc/config/aarch64/aarch64.h   | 3 +-
 gcc/config/aarch64/aarch64.md  |23 +-
 

[gcc/microsoft/heads/main] (408 commits) Merge commit 'ef2392236ec629351496d7f299d6a0956080e4d9' int

2024-04-22 Thread Eugene Rozenfeld via Gcc-cvs
The branch 'microsoft/heads/main' was updated to point to:

 012850a7800... Merge commit 'ef2392236ec629351496d7f299d6a0956080e4d9' int

It previously pointed to:

 b707d632683... Merge commit 'ecd2c373720af214f20671f9e7d760548b51ef4a' int

Diff:

Summary of changes (added commits):
---

  012850a... Merge commit 'ef2392236ec629351496d7f299d6a0956080e4d9' int
  ef23922... Revert "RISC-V: Support highpart register overlap for widen (*)
  d37b34f... RISC-V: Add xfail test case for incorrect overlap on v0 (*)
  a7d01a7... Daily bump. (*)
  3afcb04... Revert "RISC-V: Fix overlap group incorrect overlap on v0" (*)
  f9a48fe... PR modula2/112893 full type checking between proctype and p (*)
  1690e47... RISC-V: Add xfail test case for wv insn highest overlap (*)
  f5447ea... Revert "RISC-V: Support highest overlap for wv instructions (*)
  9f10005... RISC-V: Add xfail test case for wv insn register overlap (*)
  0cbeafe... Revert "RISC-V: Support one more overlap for wv instruction (*)
  90ded75... Daily bump. (*)
  c23db3e... i386: Fix up *avx2_eq3 constraints [PR114783] (*)
  2afdecc... c-family: Allow arguments with NULLPTR_TYPE as sentinels [P (*)
  a39983b... c: Fix ICE with -g and -std=c23 related to incomplete types (*)
  d86472a... libstdc++: Simplify constraints on <=> for std::reference_w (*)
  eed7fb1... libstdc++: Support link chains in std::chrono::tzdb::locate (*)
  e8f0540... Update gcc sv.po (*)
  33bf8e5... internal-fn: Fix up expand_arith_overflow [PR114753] (*)
  1216460... middle-end: refactory vect_recog_absolute_difference to sim (*)
  9451b6c... Enable 'gcc.dg/pr114768.c' for nvptx target [PR114768] (*)
  ede01df... bpf: remove huge memory waste with string allocation. (*)
  d7190d0... bpf: support more instructions to match CO-RE relocations (*)
  4d4929f... d: Fix ICE in build_deref, at d/d-codegen.cc:1650 [PR111650 (*)
  9f29584... rtlanal: Fix set_noop_p for volatile loads or stores [PR114 (*)
  36f4c8a... libgcc: Another __divmodbitint4 bug fix [PR114762] (*)
  694fa37... [vxworks] avoid mangling __STDC_VERSION_LIMITS_H__ (*)
  85c187b... Daily bump. (*)
  e498ba9... Add nios2*-*-* to the list of obsolete targets (*)
  e243d0f... Fortran: Fix ICE and clear incorrect error messages [PR1147 (*)
  7eecc08... [testsuite] [i386] add -msse2 to tests that require it (*)
  0ea96af... [testsuite] [i386] work around fails with --enable-frame-po (*)
  36d0038... [testsuite] [arm] accept empty init for bfloat16 (*)
  ce2dfc5... [c++] [testsuite] adjust contracts9.C for negative addresse (*)
  df92df0... [testsuite] [aarch64] Require fpic effective target. (*)
  514c6b1... [testsuite] [i386] require fpic for pr111497.C (*)
  cc02ebf... [testsuite] xfail pr103798-2 in C++ on vxworks too [PR11370 (*)
  e965162... [testsuite] [analyzer] include sys/select.h if available (*)
  8a11709... [testsuite] [analyzer] require fork where used (*)
  5be4f20... [testsuite] [analyzer] skip access-mode: O_ACCMODE on vxwor (*)
  76a1bcc... [testsuite] [analyzer] avoid vxworks libc mode_t (*)
  5dfbc05... [testsuite] introduce strndup effective target (*)
  dcf0bd1... [libstdc++] [testsuite] disable SRA for compare_exchange_pa (*)
  5b17817... [libstdc++] [testsuite] xfail double-prec from_chars for fl (*)
  da3504a... [libstdc++] define zoneinfo_dir_override on vxworks (*)
  a2f4be3... AArch64: remove reliance on register allocator for simd/gpr (*)
  82d6d38... libgcc: Fix up __divmodbitint4 [PR114755] (*)
  6c152c9... internal-fn: Temporarily disable flag_trapv during .{ADD,SU (*)
  6e62ede... testsuite, rs6000: Fix builtins-6-p9-runnable.c for BE [PR1 (*)
  58a0b19... rs6000: Fix bcd test case (*)
  69576bc... Daily bump. (*)
  7c2a9db... libstdc++: Implement "Printing blank lines with println" fo (*)
  5705614... DOCUMENTATION_ROOT_URL vs. release branches [PR114738] (*)
  a9fefbf... libcpp: Regenerate aclocal.m4 and configure [PR 114748] (*)
  bf2b523... tree-optimization/114749 - reset partial vector decision fo (*)
  420ece6... GCN: Enable effective-target 'vect_long_long' (*)
  909c6fa... AVR: target/114752 - Fix ICE on inline asm const 64-bit flo (*)
  3cfe94a... libstdc++: Add include guard to simd-internal header (*)
  0fc7f3c... libstdc++: Avoid ill-formed types on ARM (*)
  299d14a... asan: Don't instrument .ABNORMAL_DISPATCHER [PR114743] (*)
  9c7cf5d... Daily bump. (*)
  eadd05d... PR modula2/114745: const cast causes ICE (*)
  f438acf... testsuite: Fix data check loop on vect-early-break_124-pr11 (*)
  48024a9... Fortran: ALLOCATE of fixed-length CHARACTER with SOURCE/MOL (*)
  4437482... libstdc++: Fix "extact" typos in comments (*)
  8eddd87... Document that vector_size works with typedefs [PR92880] (*)
  f949481... tree-optimization/114736 - SLP DFS walk issue (*)
  45a41ac... tree-optimization/114733 - neg induction fails for 1 elemen (*)
  a7578a0... OpenACC 2.7: Adjust acc_map_data/acc_unmap_data interaction (*)
  274f6bb... Fix some comment nits (*)
  

[gcc r14-9780] Don't set full_profile in auto-profile [PR113765]

2024-04-03 Thread Eugene Rozenfeld via Gcc-cvs
https://gcc.gnu.org/g:fe385c219994f6d5c1ffe00bcaf5a62c3d18caaf

commit r14-9780-gfe385c219994f6d5c1ffe00bcaf5a62c3d18caaf
Author: Eugene Rozenfeld 
Date:   Tue Mar 26 16:28:08 2024 -0700

Don't set full_profile in auto-profile [PR113765]

auto-profile currently doesn't guarantee that it will set probabilities
on all edges because of zero basic block counts. Normally those edges
just have probabilities set by the preceding profile_estimate pass but
under -O0 profile_estimate pass doesn't run. The patch removes setting
of full_profile to true in auto-profile.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
PR gcov-profile/113765
* auto-profile.cc (afdo_annotate_cfg): Don't set full_profile to 
true

Diff:
---
 gcc/auto-profile.cc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index e5407d32fbb..de59b94bcb3 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -1580,7 +1580,6 @@ afdo_annotate_cfg (const stmt_set _stmts)
 }
   update_max_bb_count ();
   profile_status_for_fn (cfun) = PROFILE_READ;
-  cfun->cfg->full_profile = true;
   if (flag_value_profile_transformations)
 {
   gimple_value_profile_transformations ();


[gcc/microsoft/heads/main] (420 commits) Merge commit 'ecd2c373720af214f20671f9e7d760548b51ef4a' int

2024-03-26 Thread Eugene Rozenfeld via Gcc-cvs
The branch 'microsoft/heads/main' was updated to point to:

 b707d632683... Merge commit 'ecd2c373720af214f20671f9e7d760548b51ef4a' int

It previously pointed to:

 7c61ce995ef... Merge commit 'fd07a29e39f5347d6cef3e7042a32306f97a1719' int

Diff:

Summary of changes (added commits):
---

  b707d63... Merge commit 'ecd2c373720af214f20671f9e7d760548b51ef4a' int
  ecd2c37... Daily bump. (*)
  bb04a11... Daily bump. (*)
  a684581... PR modula2/11 trunc float malformed error cause ICE (*)
  c8a343f... PR modula2/114443 missing quote cause ICE (*)
  80a0cb3... analyzer: fix ICE and false positive with -Wanalyzer-deref- (*)
  2e4b337... hppa: Fix LO_SUM DLTIND14R address support in PRINT_OPERAND (*)
  5435850... libstdc++: Disable std::formatter specializations (LWG 3944 (*)
  3763fb8... libstdc++: Add __is_in_place_index_v helper and use it in < (*)
  f4605c5... libstdc++: Use std::type_identity_t in  as per (*)
  4a46a48... bitint: Fix bitfield loads in handle_cast [PR114433] (*)
  f92cf8c... bitint: Handle complex types in build_bitint_stmt_ssa_confl (*)
  8fc5593... predcom: Punt for steps which aren't multiples of access si (*)
  7a01cc7... xtensa: Add supplementary split pattern for "*addsubx" (*)
  e898586... Daily bump. (*)
  c2e28df... libstdc++: Destroy allocators in re-inserted container node (*)
  142cc4c... libstdc++: Constrain std::vector default constructor [PR113 (*)
  8539c56... libstdc++: Use feature test macros in   (*)
  90788dd... libstdc++: Regenerate  in maintainer mode (*)
  d1b8c22... libstdc++: Update docs on build process and generated files (*)
  ef2efc5... libstdc++: Fix Python scripts to output the correct filenam (*)
  f511854... libstdc++: Fix typos in MemoryChecker assertion messages in (*)
  16afbd9... libstdc++: Begin lifetime of storage in std::vector [ (*)
  c7a774e... analyzer: fixes to __atomic_{exchange,load,store} [PR114286 (*)
  4adb1a5... testsuite, Darwin: Use the IOKit framework in framework-1.c (*)
  71a44fa... libstdc++: Sync the atomic_link_flags implementation with G (*)
  e47330d... libstdc++, Darwin: Do not use dev/null as the file for exec (*)
  e0e9499... tree-optimization/114151 - revert PR114074 fix (*)
  1f25771... arc: Fix up arc_setup_incoming_varargs [PR114175] (*)
  c1fd458... LoongArch: Fix C23 (...) functions returning large aggregat (*)
  167ec6d... arm: [MVE intrinsics] Fix support for loads [PR target/1143 (*)
  a185d8a... alpha: Fix alpha_setup_incoming_varargs [PR114175] (*)
  8f85b46... rs6000: Fix up setup_incoming_varargs [PR114175] (*)
  8959ab6... openmp: Make c_omp_check_loop_binding_exprs diagnostics tra (*)
  94c3508... tree-optimization/114375 - disallow SLP discovery of permut (*)
  9eeca77... [PATCH v5 1/1] RISC-V: Add support for XCVbi extension in C (*)
  d91a0ce... [PATCH] RISC-V: Add XiangShan Nanhu microarchitecture. (*)
  c4845ed... c++: Fix handling of no-linkage decls for modules (*)
  2340973... Daily bump. (*)
  9ee19f9... c-c++-common/Wrestrict.c: fix some typos and enable for LLP (*)
  1579394... analyzer: fix ICEs due to sloppy types in bounds-checking [ (*)
  3c2827d... analyzer: support null operands in remove_ssa_names (*)
  dda96a9... libstdc++: Fix N3344 behavior on _Safe_iterator::_M_can_adv (*)
  5f6e085... libstdc++: Fix _Safe_local_iterator<>::_M_valid_range (*)
  b96c543... i386: Unify {general,timode}_scalar_chain::convert_op [PR11 (*)
  3be2b8f... Fortran: error recovery in frontend optimization [PR103715] (*)
  9316d02... testsuite: Turn errors back into warnings in arm/acle/cde-m (*)
  d2029bb... PR modula2/114380 Incorrect type specified in an error mess (*)
  3e3ad8e... Update gcc fr.po, sv.po (*)
  fd776d0... gcc_update: Add missing generated files (*)
  6cb5ef3... testsuite: Fix excess errors for new modules testcases on p (*)
  dfc9d1c... Add missing config/i386/zn4zn5.md file (*)
  d0aa0af... Add AMD znver5 processor enablement with scheduler model (*)
  9361f19... avr.md - Tweak xor insn constraints. (*)
  58753db... testsuite: Define _POSIX_C_SOURCE for test (*)
  942d470... Add missing hf/bf patterns. (*)
  f0fda1a... hppa: Improve handling of REG+D addresses when generating P (*)
  27d2e93... Daily bump. (*)
  3f3f0b7... Fortran: fix for absent array argument passed to optional d (*)
  8064107... hppa: Fix complaint about non-delegitimized UNSPEC UNSPEC_T (*)
  07fad7a... libstdc++: Implement N3644 on _Safe_iterator<> [PR114316] (*)
  f065c58... PR modula2/114296 ICE when attempting to create a constant  (*)
  2d454f9... d: Merge upstream dmd, druntime 855353a1d9 (*)
  b5490af... Daily bump. (*)
  218d174... i386: Fix setup of incoming varargs for (...) functions whi (*)
  3e8323e... bitint: Fix up stores to large/huge _BitInt bitfields [PR11 (*)
  ead3075... c++: Check module attachment instead of just purview when n (*)
  5213047... libcc1: fix  include (*)
  53fb2cf... Daily bump. (*)
  a6dab19... libgcc: Fix quotient and/or remainder negation in __divmodb (*)
 

AutoFDO tools for GCC

2024-03-25 Thread Eugene Rozenfeld via Gcc
Hello,

I've been the AutoFDO maintainer for the last 1.5 years. I've resurrected 
autoprofiledbootstrap build and made a number of other fixes/improvements 
(e.g., discriminator support).

The tools for AutoFDO (create_gcov, etc.) currently live in 
https://github.com/google/AutoFDO  repo and GCC AutoFDO documentation points 
users to that repo. That repo also has tools for LLVM AutoFDO.
https://github.com/google/AutoFDO  has several submodules: 
https://github.com/google/autofdo/blob/master/.gitmodules

I got a message from Snehasish (cc'd) that google intends to migrate the tools 
for LLVM to the LLVM repo and wants to archive 
https://github.com/google/AutoFDO. That will be a problem for AutoFDO in GCC. 
The idea to find a different home for GCC AutoFDO tools was discussed before on 
this alias but this becomes more urgent now. One idea was to build these tools 
from GCC repo and another was to produce gcov from perf tool directly. Andi 
(cc'd)  had some early unfinished prototype for latter.

Please let me know if you have thoughts on how we should proceed.

Thanks,

Eugene


RE: [EXTERNAL] Check that passes do not forget to define profile

2023-08-30 Thread Eugene Rozenfeld via Gcc-patches
Hi Jan,

These new checks are too strong for AutoFDO. For example, the edge 
probabilities are not guaranteed to be initialized (see 
afdo_calculate_branch_prob).
This currently breaks autoprofiledbootstrap build.

I suggest removing
cfun->cfg->full_profile = true;
from auto-profile.cc.

Eugene

-Original Message-
From: Gcc-patches  On 
Behalf Of Jan Hubicka via Gcc-patches
Sent: Thursday, August 24, 2023 6:15 AM
To: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Check that passes do not forget to define profile

Hi,
this patch extends verifier to check that all probabilities and counts are 
initialized if profile is supposed to be present.  This is a bit complicated by 
the posibility that we inline !flag_guess_branch_probability function into 
function with profile defined and in this case we need to stop verification.  
For this reason I added flag to cfg structure tracking this.

Bootstrapped/regtested x86_64-linux, comitted.

gcc/ChangeLog:

* cfg.h (struct control_flow_graph): New field full_profile.
* auto-profile.cc (afdo_annotate_cfg): Set full_profile to true.
* cfg.cc (init_flow): Set full_profile to false.
* graphite.cc (graphite_transform_loops): Set full_profile to false.
* lto-streamer-in.cc (input_cfg): Initialize full_profile flag.
* predict.cc (pass_profile::execute): Set full_profile to true.
* symtab-thunks.cc (expand_thunk): Set full_profile to true.
* tree-cfg.cc (gimple_verify_flow_info): Verify that profile is full
if full_profile is set.
* tree-inline.cc (initialize_cfun): Initialize full_profile.
(expand_call_inline): Combine full_profile.


diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc index 
e3af3555e75..ff3b763945c 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -1578,6 +1578,7 @@ afdo_annotate_cfg (const stmt_set _stmts)
 }
   update_max_bb_count ();
   profile_status_for_fn (cfun) = PROFILE_READ;
+  cfun->cfg->full_profile = true;
   if (flag_value_profile_transformations)
 {
   gimple_value_profile_transformations (); diff --git a/gcc/cfg.cc 
b/gcc/cfg.cc index 9eb9916f61a..b7865f14e7f 100644
--- a/gcc/cfg.cc
+++ b/gcc/cfg.cc
@@ -81,6 +81,7 @@ init_flow (struct function *the_fun)
 = ENTRY_BLOCK_PTR_FOR_FN (the_fun);
   the_fun->cfg->edge_flags_allocated = EDGE_ALL_FLAGS;
   the_fun->cfg->bb_flags_allocated = BB_ALL_FLAGS;
+  the_fun->cfg->full_profile = false;
 }
 

 /* Helper function for remove_edge and free_cffg.  Frees edge structure diff 
--git a/gcc/cfg.h b/gcc/cfg.h index a0e944979c8..53e2553012c 100644
--- a/gcc/cfg.h
+++ b/gcc/cfg.h
@@ -78,6 +78,9 @@ struct GTY(()) control_flow_graph {
   /* Dynamically allocated edge/bb flags.  */
   int edge_flags_allocated;
   int bb_flags_allocated;
+
+  /* Set if the profile is computed on every edge and basic block.  */  
+ bool full_profile;
 };
 
 
diff --git a/gcc/graphite.cc b/gcc/graphite.cc index 19f8975ffa2..2b387d5b016 
100644
--- a/gcc/graphite.cc
+++ b/gcc/graphite.cc
@@ -512,6 +512,8 @@ graphite_transform_loops (void)
 
   if (changed)
 {
+  /* FIXME: Graphite does not update profile meaningfully currently.  */
+  cfun->cfg->full_profile = false;
   cleanup_tree_cfg ();
   profile_status_for_fn (cfun) = PROFILE_ABSENT;
   release_recorded_exits (cfun);
diff --git a/gcc/lto-streamer-in.cc b/gcc/lto-streamer-in.cc index 
0cce14414ca..d3128fcebe4 100644
--- a/gcc/lto-streamer-in.cc
+++ b/gcc/lto-streamer-in.cc
@@ -1030,6 +1030,7 @@ input_cfg (class lto_input_block *ib, class data_in 
*data_in,
   basic_block p_bb;
   unsigned int i;
   int index;
+  bool full_profile = false;
 
   init_empty_tree_cfg_for_function (fn);
 
@@ -1071,6 +1072,8 @@ input_cfg (class lto_input_block *ib, class data_in 
*data_in,
  data_in->location_cache.input_location_and_block (>goto_locus,
, ib, data_in);
  e->probability = profile_probability::stream_in (ib);
+ if (!e->probability.initialized_p ())
+   full_profile = false;
 
}
 
@@ -1145,6 +1148,7 @@ input_cfg (class lto_input_block *ib, class data_in 
*data_in,
 
   /* Rebuild the loop tree.  */
   flow_loops_find (loops);
+  cfun->cfg->full_profile = full_profile;
 }
 
 
diff --git a/gcc/predict.cc b/gcc/predict.cc index 5a1a561cc24..396746cbfd1 
100644
--- a/gcc/predict.cc
+++ b/gcc/predict.cc
@@ -4131,6 +4131,7 @@ pass_profile::execute (function *fun)
 scev_initialize ();
 
   tree_estimate_probability (false);
+  cfun->cfg->full_profile = true;
 
   if (nb_loops > 1)
 scev_finalize ();
diff --git a/gcc/symtab-thunks.cc b/gcc/symtab-thunks.cc index 
4c04235c41b..23ead0d2138 100644
--- a/gcc/symtab-thunks.cc
+++ b/gcc/symtab-thunks.cc
@@ -648,6 +648,7 @@ expand_thunk (cgraph_node *node, bool output_asm_thunks,
  ? PROFILE_READ : PROFILE_GUESSED;
   /* FIXME: C++ FE should stop setting TREE_ASM_WRITTEN on 

RE: [EXTERNAL] Re: [PATCH] Collect both user and kernel events for autofdo tests and autoprofiledbootstrap

2023-07-05 Thread Eugene Rozenfeld via Gcc-patches
There is no warning and perf /uk succeeds when kptr_restrict is set to 1 and 
perf_event_paranoid set to 2. However, create_gcov may fail since it won't be 
able to understand kernel addresses and it requires at least 95% of events to 
be successfully mapped.

If I set both kptr_restrict and perf_event_paranoid to 1, then I do get 
warnings from perf (but it still succeeds and exits with a 0 code). And, of 
course create_gcov will also fail to map some events since it won't understand 
kernel addresses.

WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid.

Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.

Samples in kernel modules won't be resolved at all.

If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.

Couldn't record kernel reference relocation symbol
Symbol resolution may be skewed if relocation was used (e.g. kexec).
Check /proc/kallsyms permission or run as root.
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.037 MB 
/home/erozen/gcc1_objdir/gcc/testsuite/gcc/indir-call-prof.perf.data (86 
samples) ]

Eugene

-Original Message-
From: Richard Biener  
Sent: Monday, July 3, 2023 12:47 AM
To: Eugene Rozenfeld 
Cc: Sam James ; gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] Re: [PATCH] Collect both user and kernel events for 
autofdo tests and autoprofiledbootstrap

On Sat, Jul 1, 2023 at 12:05 AM Eugene Rozenfeld 
 wrote:
>
> I also set /proc/sys/kernel/perf_event_paranoid to 1 instead of the default 2.

Does the perf attempt fail when the privileges are not adjusted and you specify 
--all?  I see it adds /uk as flags, when I do

> perf record -e instructions//uk ./a.out

it doesn't complain in any way with

> cat /proc/sys/kernel/kptr_restrict
1
> cat /proc/sys/kernel/perf_event_paranoid
2

so in case the 'kernel' side is simply ignored when profiling there isn't 
permitted/possible then I guess the patch is OK?

Can you confirm?

Thanks,
Richard.

> -Original Message-
> From: Gcc-patches 
>  On Behalf Of 
> Eugene Rozenfeld via Gcc-patches
> Sent: Friday, June 30, 2023 2:44 PM
> To: Sam James ; Richard Biener 
> 
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [EXTERNAL] Re: [PATCH] Collect both user and kernel 
> events for autofdo tests and autoprofiledbootstrap
>
> I don't run this with elevated privileges but I set 
> /proc/sys/kernel/kptr_restrict to 0. Setting that does require elevated 
> privileges.
>
> If that's not acceptable, the only fix I can think of is to make that event 
> mapping threshold percentage a parameter to create_gcov and pass something 
> low enough. 80% instead of the current threshold of 95% should work, although 
> it's a bit fragile.
>
> Eugene
>
> -Original Message-
> From: Sam James 
> Sent: Friday, June 30, 2023 1:59 AM
> To: Richard Biener 
> Cc: Eugene Rozenfeld ; 
> gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: [PATCH] Collect both user and kernel events 
> for autofdo tests and autoprofiledbootstrap
>
> [You don't often get email from s...@gentoo.org. Learn why this is 
> important at https://aka.ms/LearnAboutSenderIdentification ]
>
> Richard Biener via Gcc-patches  writes:
>
> > On Fri, Jun 30, 2023 at 7:28 AM Eugene Rozenfeld via Gcc-patches 
> >  wrote:
> >>
> >> When we collect just user events for autofdo with lbr we get some 
> >> events where branch sources are kernel addresses and branch targets 
> >> are user addresses. Without kernel MMAP events create_gcov can't 
> >> make sense of kernel addresses. Currently create_gcov fails if it 
> >> can't map at least 95% of events. We sometimes get below this threshold 
> >> with just user events. The change is to collect both user events and 
> >> kernel events.
> >
> > Does this require elevated privileges?  Can we instead "fix" create_gcov 
> > here?
>
> Right, requiring privileges for this is going to be a no-go for a lot of 
> builders. In a distro context, for example, it means we can't consider 
> autofdo at all.


RE: [EXTERNAL] Re: [PATCH] Collect both user and kernel events for autofdo tests and autoprofiledbootstrap

2023-06-30 Thread Eugene Rozenfeld via Gcc-patches
I also set /proc/sys/kernel/perf_event_paranoid to 1 instead of the default 2.

-Original Message-
From: Gcc-patches  On 
Behalf Of Eugene Rozenfeld via Gcc-patches
Sent: Friday, June 30, 2023 2:44 PM
To: Sam James ; Richard Biener 
Cc: gcc-patches@gcc.gnu.org
Subject: RE: [EXTERNAL] Re: [PATCH] Collect both user and kernel events for 
autofdo tests and autoprofiledbootstrap

I don't run this with elevated privileges but I set 
/proc/sys/kernel/kptr_restrict to 0. Setting that does require elevated 
privileges.

If that's not acceptable, the only fix I can think of is to make that event 
mapping threshold percentage a parameter to create_gcov and pass something low 
enough. 80% instead of the current threshold of 95% should work, although it's 
a bit fragile.

Eugene

-Original Message-
From: Sam James 
Sent: Friday, June 30, 2023 1:59 AM
To: Richard Biener 
Cc: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] Collect both user and kernel events for autofdo 
tests and autoprofiledbootstrap

[You don't often get email from s...@gentoo.org. Learn why this is important at 
https://aka.ms/LearnAboutSenderIdentification ]

Richard Biener via Gcc-patches  writes:

> On Fri, Jun 30, 2023 at 7:28 AM Eugene Rozenfeld via Gcc-patches 
>  wrote:
>>
>> When we collect just user events for autofdo with lbr we get some 
>> events where branch sources are kernel addresses and branch targets 
>> are user addresses. Without kernel MMAP events create_gcov can't make 
>> sense of kernel addresses. Currently create_gcov fails if it can't 
>> map at least 95% of events. We sometimes get below this threshold with just 
>> user events. The change is to collect both user events and kernel events.
>
> Does this require elevated privileges?  Can we instead "fix" create_gcov here?

Right, requiring privileges for this is going to be a no-go for a lot of 
builders. In a distro context, for example, it means we can't consider autofdo 
at all.


RE: [EXTERNAL] Re: [PATCH] Collect both user and kernel events for autofdo tests and autoprofiledbootstrap

2023-06-30 Thread Eugene Rozenfeld via Gcc-patches
I don't run this with elevated privileges but I set 
/proc/sys/kernel/kptr_restrict to 0. Setting that does require elevated 
privileges.

If that's not acceptable, the only fix I can think of is to make that event 
mapping threshold percentage a parameter to create_gcov and pass something low 
enough. 80% instead of the current threshold of 95% should work, although it's 
a bit fragile.

Eugene

-Original Message-
From: Sam James  
Sent: Friday, June 30, 2023 1:59 AM
To: Richard Biener 
Cc: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] Collect both user and kernel events for autofdo 
tests and autoprofiledbootstrap

[You don't often get email from s...@gentoo.org. Learn why this is important at 
https://aka.ms/LearnAboutSenderIdentification ]

Richard Biener via Gcc-patches  writes:

> On Fri, Jun 30, 2023 at 7:28 AM Eugene Rozenfeld via Gcc-patches 
>  wrote:
>>
>> When we collect just user events for autofdo with lbr we get some 
>> events where branch sources are kernel addresses and branch targets 
>> are user addresses. Without kernel MMAP events create_gcov can't make 
>> sense of kernel addresses. Currently create_gcov fails if it can't 
>> map at least 95% of events. We sometimes get below this threshold with just 
>> user events. The change is to collect both user events and kernel events.
>
> Does this require elevated privileges?  Can we instead "fix" create_gcov here?

Right, requiring privileges for this is going to be a no-go for a lot of 
builders. In a distro context, for example, it means we can't consider autofdo 
at all.


[PATCH] Collect both user and kernel events for autofdo tests and autoprofiledbootstrap

2023-06-29 Thread Eugene Rozenfeld via Gcc-patches
When we collect just user events for autofdo with lbr we get some events where 
branch
sources are kernel addresses and branch targets are user addresses. Without 
kernel MMAP
events create_gcov can't make sense of kernel addresses. Currently create_gcov 
fails if
it can't map at least 95% of events. We sometimes get below this threshold with 
just
user events. The change is to collect both user events and kernel events.

Tested on x86_64-pc-linux-gnu.

ChangeLog:

* Makefile.in: Collect both kernel and user events for autofdo
* Makefile.tpl: Collect both kernel and user events for autofdo

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Collect both kernel and user events for 
autofdo
---
 Makefile.in   | 2 +-
 Makefile.tpl  | 2 +-
 gcc/testsuite/lib/target-supports.exp | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index f19a9db621e..04307ca561b 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -404,7 +404,7 @@ MAKEINFO = @MAKEINFO@
 EXPECT = @EXPECT@
 RUNTEST = @RUNTEST@
 
-AUTO_PROFILE = gcc-auto-profile -c 1000
+AUTO_PROFILE = gcc-auto-profile --all -c 1000
 
 # This just becomes part of the MAKEINFO definition passed down to
 # sub-makes.  It lets flags be given on the command line while still
diff --git a/Makefile.tpl b/Makefile.tpl
index 3a5b7ed3c92..d0fe7e2fb77 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -407,7 +407,7 @@ MAKEINFO = @MAKEINFO@
 EXPECT = @EXPECT@
 RUNTEST = @RUNTEST@
 
-AUTO_PROFILE = gcc-auto-profile -c 1000
+AUTO_PROFILE = gcc-auto-profile --all -c 1000
 
 # This just becomes part of the MAKEINFO definition passed down to
 # sub-makes.  It lets flags be given on the command line while still
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 4d04df2a709..b16853d76df 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -704,7 +704,7 @@ proc check_effective_target_keeps_null_pointer_checks { } {
 # this allows parallelism of 16 and higher of parallel gcc-auto-profile
 proc profopt-perf-wrapper { } {
 global srcdir
-return "$srcdir/../config/i386/gcc-auto-profile -m8 "
+return "$srcdir/../config/i386/gcc-auto-profile --all -m8 "
 }
 
 # Return true if profiling is supported on the target.
-- 
2.25.1



[PATCH] Fix collection and processing of autoprofile data for target libs

2023-06-27 Thread Eugene Rozenfeld via Gcc-patches
cc1, cc1plus, and lto built during STAGEautoprofile need to be built with
debug info since they are used to build target libs. -gtoggle was
turning off debug info for this stage.

create_gcov should be passed prev-gcc/cc1, prev-gcc/cc1plus, and prev-gcc/lto
instead of stage1-gcc/cc1, stage1-gcc/cc1plus, and stage1-gcc/lto when
processing profile data collected while building target libraries.

Tested on x86_64-pc-linux-gnu.

ChangeLog:

* Makefile.in: Remove -gtoggle for STAGEautoprofile
* Makefile.tpl: Remove -gtoggle for STAGEautoprofile

gcc/c/ChangeLog:

* c/Make-lang.in: Pass correct stage cc1 when processing
profile data collected while building target libraries

gcc/cp/ChangeLog:

* cp/Make-lang.in: Pass correct stage cc1plus when processing
profile data collected while building target libraries

gcc/lto/ChangeLog:

* lto/Make-lang.in: Pass correct stage lto when processing
profile data collected while building target libraries
---
 Makefile.in  | 2 +-
 Makefile.tpl | 2 +-
 gcc/c/Make-lang.in   | 4 ++--
 gcc/cp/Make-lang.in  | 4 ++--
 gcc/lto/Make-lang.in | 2 +-
 5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index b559454cc90..61e5faf550f 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -635,7 +635,7 @@ STAGEtrain_TFLAGS = $(filter-out 
-fchecking=1,$(STAGE3_TFLAGS))
 STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use 
-fprofile-reproducible=parallel-runs
 STAGEfeedback_TFLAGS = $(STAGE4_TFLAGS)
 
-STAGEautoprofile_CFLAGS = $(STAGE2_CFLAGS) -g
+STAGEautoprofile_CFLAGS = $(filter-out -gtoggle,$(STAGE2_CFLAGS)) -g
 STAGEautoprofile_TFLAGS = $(STAGE2_TFLAGS)
 
 STAGEautofeedback_CFLAGS = $(STAGE3_CFLAGS)
diff --git a/Makefile.tpl b/Makefile.tpl
index 6bcee3021c9..3a5b7ed3c92 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -558,7 +558,7 @@ STAGEtrain_TFLAGS = $(filter-out 
-fchecking=1,$(STAGE3_TFLAGS))
 STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use 
-fprofile-reproducible=parallel-runs
 STAGEfeedback_TFLAGS = $(STAGE4_TFLAGS)
 
-STAGEautoprofile_CFLAGS = $(STAGE2_CFLAGS) -g
+STAGEautoprofile_CFLAGS = $(filter-out -gtoggle,$(STAGE2_CFLAGS)) -g
 STAGEautoprofile_TFLAGS = $(STAGE2_TFLAGS)
 
 STAGEautofeedback_CFLAGS = $(STAGE3_CFLAGS)
diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in
index 20840aceab6..79bc0dfd1cf 100644
--- a/gcc/c/Make-lang.in
+++ b/gcc/c/Make-lang.in
@@ -113,10 +113,10 @@ create_fdas_for_cc1: ../stage1-gcc/cc1$(exeext) 
../prev-gcc/$(PERF_DATA)
  echo $$perf_path; \
  if [ -f $$perf_path ]; then \
profile_name=cc1_$$component_in_prev_target.fda; \
-   $(CREATE_GCOV) -binary ../stage1-gcc/cc1$(exeext) -gcov 
$$profile_name -profile $$perf_path -gcov_version 2; \
+   $(CREATE_GCOV) -binary ../prev-gcc/cc1$(exeext) -gcov 
$$profile_name -profile $$perf_path -gcov_version 2; \
  fi; \
done;
-#

+#
 # Build hooks:
 
 c.info:
diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index c08ee91447e..ba5e8766e99 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -211,10 +211,10 @@ create_fdas_for_cc1plus: ../stage1-gcc/cc1plus$(exeext) 
../prev-gcc/$(PERF_DATA)
  echo $$perf_path; \
  if [ -f $$perf_path ]; then \
profile_name=cc1plus_$$component_in_prev_target.fda; \
-   $(CREATE_GCOV) -binary ../stage1-gcc/cc1plus$(exeext) -gcov 
$$profile_name -profile $$perf_path -gcov_version 2; \
+   $(CREATE_GCOV) -binary ../prev-gcc/cc1plus$(exeext) -gcov 
$$profile_name -profile $$perf_path -gcov_version 2; \
  fi; \
done;
-#

+#
 # Build hooks:
 
 c++.all.cross: g++-cross$(exeext)
diff --git a/gcc/lto/Make-lang.in b/gcc/lto/Make-lang.in
index 4f6025100a3..98aa9f4cc39 100644
--- a/gcc/lto/Make-lang.in
+++ b/gcc/lto/Make-lang.in
@@ -130,7 +130,7 @@ create_fdas_for_lto1: ../stage1-gcc/lto1$(exeext) 
../prev-gcc/$(PERF_DATA)
  echo $$perf_path; \
  if [ -f $$perf_path ]; then \
profile_name=lto1_$$component_in_prev_target.fda; \
-   $(CREATE_GCOV) -binary ../stage1-gcc/lto1$(exeext) -gcov 
$$profile_name -profile $$perf_path -gcov_version 2; \
+   $(CREATE_GCOV) -binary ../prev-gcc/lto1$(exeext) -gcov 
$$profile_name -profile $$perf_path -gcov_version 2; \
  fi; \
done;
 
-- 
2.25.1



RE: [EXTERNAL] [PATCH] Update perf auto profile script

2023-06-05 Thread Eugene Rozenfeld via Gcc-patches
Ok for trunk. Thank you for updating this!

Eugene

-Original Message-
From: Gcc-patches  On 
Behalf Of Andi Kleen via Gcc-patches
Sent: Tuesday, May 30, 2023 4:08 AM
To: gcc-patches@gcc.gnu.org
Cc: Andi Kleen 
Subject: [EXTERNAL] [PATCH] Update perf auto profile script

- Fix gen_autofdo_event: The download URL for the Intel Perfmon Event
  list has changed, as well as the JSON format.
  Also it now uses pattern matching to match CPUs. Update the script to support 
all of this.
- Regenerate gcc-auto-profile with the latest published Intel model
  numbers, so it works with recent systems.
- So far it's still broken on hybrid systems
---
 contrib/gen_autofdo_event.py | 7 ---
 gcc/config/i386/gcc-auto-profile | 9 -
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/contrib/gen_autofdo_event.py b/contrib/gen_autofdo_event.py index 
ac23b83888db..533c706c090b 100755
--- a/contrib/gen_autofdo_event.py
+++ b/contrib/gen_autofdo_event.py
@@ -32,8 +32,9 @@ import json
 import argparse
 import collections
 import os
+import fnmatch

-baseurl = "https://download.01.org/perfmon;
+baseurl = "https://raw.githubusercontent.com/intel/perfmon/main;

 target_events = ('BR_INST_RETIRED.NEAR_TAKEN',
  'BR_INST_EXEC.TAKEN',
@@ -74,7 +75,7 @@ def get_cpustr():
 def find_event(eventurl, model):
 print("Downloading", eventurl, file = sys.stderr)
 u = urllib.request.urlopen(eventurl)
-events = json.loads(u.read())
+events = json.loads(u.read())["Events"]
 u.close()

 found = 0
@@ -102,7 +103,7 @@ found = 0
 cpufound = 0
 for j in u:
 n = j.rstrip().decode().split(',')
-if len(n) >= 4 and (args.all or n[0] == cpu) and n[3] == "core":
+if len(n) >= 4 and (args.all or fnmatch.fnmatch(cpu, n[0])) and n[3] == 
"core":
 components = n[0].split("-")
 model = components[2]
 model = int(model, 16)
diff --git a/gcc/config/i386/gcc-auto-profile b/gcc/config/i386/gcc-auto-profile
index 5ab224b041b9..04f7d35dcc51 100755
--- a/gcc/config/i386/gcc-auto-profile
+++ b/gcc/config/i386/gcc-auto-profile
@@ -43,8 +43,10 @@ model*:\ 47|\
 model*:\ 37|\
 model*:\ 44) E="cpu/event=0x88,umask=0x40/$FLAGS" ;;  model*:\ 55|\
+model*:\ 74|\
 model*:\ 77|\
 model*:\ 76|\
+model*:\ 90|\
 model*:\ 92|\
 model*:\ 95|\
 model*:\ 87|\
@@ -75,14 +77,19 @@ model*:\ 165|\
 model*:\ 166|\
 model*:\ 85|\
 model*:\ 85) E="cpu/event=0xC4,umask=0x20/p$FLAGS" ;;
+model*:\ 125|\
 model*:\ 126|\
+model*:\ 167|\
 model*:\ 140|\
 model*:\ 141|\
 model*:\ 143|\
+model*:\ 207|\
 model*:\ 106|\
 model*:\ 108) E="cpu/event=0xc4,umask=0x20/p$FLAGS" ;;  model*:\ 134|\ 
-model*:\ 150) E="cpu/event=0xc4,umask=0xfe/p$FLAGS" ;;
+model*:\ 150|\
+model*:\ 156|\
+model*:\ 190) E="cpu/event=0xc4,umask=0xfe/p$FLAGS" ;;
 *)
 echo >&2 "Unknown CPU. Run contrib/gen_autofdo_event.py --all --script to 
update script."
exit 1 ;;
--
2.40.1



RE: [EXTERNAL] Re: [PATCH] Fixes and workarounds for warnings during autoprofiledbootstrap build

2023-05-17 Thread Eugene Rozenfeld via Gcc-patches
rse profiledbootstrap and autoprofiledbootstrap are not build configs 
>> but make targets - that makes it more difficult (or impossible) to use the 
>> --disable-werror machinery here.
>>
>> There is
>>
>> STAGE_CONFIGURE_FLAGS=@stage2_werror_flag@
>>
>> so it might be possible to filter out --enable-werror-always from 
>> STAGEautofeedback_CONFIGURE_FLAGS?
>>
>> Richard.
>>
>> > Thanks,
>> >
>> > Eugene
>> >
>> > -Original Message-
>> > From: Richard Biener 
>> > Sent: Tuesday, May 9, 2023 11:40 PM
>> > To: Eugene Rozenfeld 
>> > Cc: gcc-patches@gcc.gnu.org
>> > Subject: [EXTERNAL] Re: [PATCH] Fixes and workarounds for warnings 
>> > during autoprofiledbootstrap build
>> >
>> > On Wed, May 10, 2023 at 3:38 AM Eugene Rozenfeld via Gcc-patches 
>> >  wrote:
>> > >
>> > > autoprofiledbootstrap build produces new warnings since inlining 
>> > > decisions are different from other builds. This patch contains 
>> > > fixes and workarounds for those warnings.
>> > >
>> > > Tested on x86_64-pc-linux-gnu.
>> >
>> > Rather than this would it make sense to add --disable-werror to 
>> > autoprofiledbootstrap configs like we do for others?  I also wonder how 
>> > "stable" the afdo bootstrap inlining decisions are, so applying these 
>> > workarounds may not be sustainable?
>> >
>> > > gcc/ChangeLog:
>> > >
>> > > * config/i386/i386-expand.cc (expand_vec_perm_interleave2): Work 
>> > > around
>> > > -Wstringop-overflow false positive during autoprofiledbootstrap
>> > > * ipa-devirt.cc (debug_tree_odr_name): Fix for -Wformat-overflow
>> > > warning during autoprofiledbootstrap
>> > > * lra-eliminations.cc (setup_can_eliminate): Work around
>> > > -Wmaybe-uninitialized false positive during autoprofiledbootstrap
>> > > * opts-common.cc (candidates_list_and_hint): Work around
>> > > -Wstringop-overflow false positive during autoprofiledbootstrap
>> > > * tree-ssa-ccp.cc (bit_value_unop): Work around 
>> > > -Wmaybe-uninitialized
>> > > false positive during autoprofiledbootstrap
>> > > * wide-int.h (wi::copy): Work around -Wmaybe-uninitialized false
>> > > positive during autoprofiledbootstrap
>> > > ---
>> > >  gcc/config/i386/i386-expand.cc | 11 +++
>> > >  gcc/ipa-devirt.cc  |  3 ++-
>> > >  gcc/lra-eliminations.cc| 11 +++
>> > >  gcc/opts-common.cc |  1 +
>> > >  gcc/tree-ssa-ccp.cc| 11 +++
>> > >  gcc/wide-int.h | 11 +++
>> > >  6 files changed, 47 insertions(+), 1 deletion(-)
>> > >
>> > > diff --git a/gcc/config/i386/i386-expand.cc 
>> > > b/gcc/config/i386/i386-expand.cc index 634fe61ba79..be9f912775b
>> > > 100644
>> > > --- a/gcc/config/i386/i386-expand.cc
>> > > +++ b/gcc/config/i386/i386-expand.cc
>> > > @@ -20419,6 +20419,13 @@ expand_vec_perm_pblendv (struct 
>> > > expand_vec_perm_d *d)
>> > >
>> > >  static bool expand_vec_perm_interleave3 (struct 
>> > > expand_vec_perm_d *d);
>> > >
>> > > +/* Work around -Wstringop-overflow false positive during 
>> > > +autoprofiledbootstrap.  */
>> > > +
>> > > +# if GCC_VERSION >= 7001
>> > > +#pragma GCC diagnostic push
>> > > +#pragma GCC diagnostic ignored "-Wstringop-overflow"
>> > > +#endif
>> > > +
>> > >  /* A subroutine of ix86_expand_vec_perm_const_1.  Try to simplify
>> > > a two vector permutation into a single vector permutation by using
>> > > an interleave operation to merge the vectors.  */ @@ -20737,6
>> > > +20744,10 @@ expand_vec_perm_interleave2 (struct 
>> > > +expand_vec_perm_d
>> > > +*d)
>> > >return true;
>> > >  }
>> > >
>> > > +# if GCC_VERSION >= 7001
>> > > +#pragma GCC diagnostic pop
>> > > +#endif
>> > > +
>> > >  /* A subroutine of ix86_expand_vec_perm_const_1.  Try to simplify
>> > > a single vector cross-lane permutation into vpermq followed
>> > > by any of the s

RE: [EXTERNAL] Re: [PATCH] Fixes and workarounds for warnings during autoprofiledbootstrap build

2023-05-12 Thread Eugene Rozenfeld via Gcc-patches
Thank you, Richard. I went with your suggestion. New patch:


[PATCH] Disable warnings as errors for STAGEautofeedback.

Compilation during STAGEautofeedback produces additional warnings
since inlining decisions with -fauto-profile are different from
other builds.

This patches disables warnings as errors for STAGEautofeedback.

Tested on x86_64-pc-linux-gnu.

ChangeLog:

* Makefile.in: Disable warnings as errors for STAGEautofeedback
---
 Makefile.in | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index 33f3c862557..4c14c73ea61 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -590,8 +590,7 @@ STAGEautofeedback_CXXFLAGS = $(CXXFLAGS)
 STAGEautofeedback_CXXFLAGS = $(STAGEautofeedback_CFLAGS)
 @endif target-libstdc++-v3-bootstrap
 STAGEautofeedback_TFLAGS = $(STAGE_TFLAGS)
-STAGEautofeedback_CONFIGURE_FLAGS = $(STAGE_CONFIGURE_FLAGS)
-
+STAGEautofeedback_CONFIGURE_FLAGS = $(filter-out 
--enable-werror-always,$(STAGE_CONFIGURE_FLAGS))
 
 # By default, C and C++ are the only stage1 languages, because they are the
 # only ones we require to build with the bootstrap compiler, and also the
-- 
2.25.1

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Thursday, May 11, 2023 1:58 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] Re: [PATCH] Fixes and workarounds for warnings during 
autoprofiledbootstrap build

On Thu, May 11, 2023 at 4:23 AM Eugene Rozenfeld 
 wrote:
>
> I'm ok with disabling warnings as errors for autoprofiledbootstrap. What's 
> the proper way to do that? Searching for "--disable-werror" I see matches in 
> lib configure files but not in gcc files.

We have --with-build-config selecting things like bootstrap-O3 and configure 
then disables werror by default if the build config is anything other than the 
default or bootstrap-debug.

Of course profiledbootstrap and autoprofiledbootstrap are not build configs but 
make targets - that makes it more difficult (or impossible) to use the 
--disable-werror machinery here.

There is

STAGE_CONFIGURE_FLAGS=@stage2_werror_flag@

so it might be possible to filter out --enable-werror-always from 
STAGEautofeedback_CONFIGURE_FLAGS?

Richard.

> Thanks,
>
> Eugene
>
> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, May 9, 2023 11:40 PM
> To: Eugene Rozenfeld 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: [PATCH] Fixes and workarounds for warnings 
> during autoprofiledbootstrap build
>
> On Wed, May 10, 2023 at 3:38 AM Eugene Rozenfeld via Gcc-patches 
>  wrote:
> >
> > autoprofiledbootstrap build produces new warnings since inlining 
> > decisions are different from other builds. This patch contains fixes 
> > and workarounds for those warnings.
> >
> > Tested on x86_64-pc-linux-gnu.
>
> Rather than this would it make sense to add --disable-werror to 
> autoprofiledbootstrap configs like we do for others?  I also wonder how 
> "stable" the afdo bootstrap inlining decisions are, so applying these 
> workarounds may not be sustainable?
>
> > gcc/ChangeLog:
> >
> > * config/i386/i386-expand.cc (expand_vec_perm_interleave2): Work 
> > around
> > -Wstringop-overflow false positive during autoprofiledbootstrap
> > * ipa-devirt.cc (debug_tree_odr_name): Fix for -Wformat-overflow
> > warning during autoprofiledbootstrap
> > * lra-eliminations.cc (setup_can_eliminate): Work around
> > -Wmaybe-uninitialized false positive during autoprofiledbootstrap
> > * opts-common.cc (candidates_list_and_hint): Work around
> > -Wstringop-overflow false positive during autoprofiledbootstrap
> > * tree-ssa-ccp.cc (bit_value_unop): Work around 
> > -Wmaybe-uninitialized
> > false positive during autoprofiledbootstrap
> > * wide-int.h (wi::copy): Work around -Wmaybe-uninitialized false
> > positive during autoprofiledbootstrap
> > ---
> >  gcc/config/i386/i386-expand.cc | 11 +++
> >  gcc/ipa-devirt.cc  |  3 ++-
> >  gcc/lra-eliminations.cc| 11 +++
> >  gcc/opts-common.cc |  1 +
> >  gcc/tree-ssa-ccp.cc| 11 +++
> >  gcc/wide-int.h | 11 +++
> >  6 files changed, 47 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/i386/i386-expand.cc 
> > b/gcc/config/i386/i386-expand.cc index 634fe61ba79..be9f912775b 
> > 100644
> > --- a/gcc/config/i386/i386-expand.cc
> > +++ b/gcc/config/i386/i386-expand.cc
> > @@ -20419,6 +20419,13 @@ expand_vec_perm_pblendv (struct 
> > expand_vec_perm_d *d)
> >
> >  static bool expand_vec_perm_interleave3 (struc

RE: [EXTERNAL] Re: [PATCH] Fixes and workarounds for warnings during autoprofiledbootstrap build

2023-05-10 Thread Eugene Rozenfeld via Gcc-patches
I'm ok with disabling warnings as errors for autoprofiledbootstrap. What's the 
proper way to do that? Searching for "--disable-werror" I see matches in lib 
configure files but not in gcc files.

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Tuesday, May 9, 2023 11:40 PM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] Fixes and workarounds for warnings during 
autoprofiledbootstrap build

On Wed, May 10, 2023 at 3:38 AM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> autoprofiledbootstrap build produces new warnings since inlining 
> decisions are different from other builds. This patch contains fixes 
> and workarounds for those warnings.
>
> Tested on x86_64-pc-linux-gnu.

Rather than this would it make sense to add --disable-werror to 
autoprofiledbootstrap configs like we do for others?  I also wonder how 
"stable" the afdo bootstrap inlining decisions are, so applying these 
workarounds may not be sustainable?

> gcc/ChangeLog:
>
> * config/i386/i386-expand.cc (expand_vec_perm_interleave2): Work 
> around
> -Wstringop-overflow false positive during autoprofiledbootstrap
> * ipa-devirt.cc (debug_tree_odr_name): Fix for -Wformat-overflow
> warning during autoprofiledbootstrap
> * lra-eliminations.cc (setup_can_eliminate): Work around
> -Wmaybe-uninitialized false positive during autoprofiledbootstrap
> * opts-common.cc (candidates_list_and_hint): Work around
> -Wstringop-overflow false positive during autoprofiledbootstrap
> * tree-ssa-ccp.cc (bit_value_unop): Work around -Wmaybe-uninitialized
> false positive during autoprofiledbootstrap
> * wide-int.h (wi::copy): Work around -Wmaybe-uninitialized false
> positive during autoprofiledbootstrap
> ---
>  gcc/config/i386/i386-expand.cc | 11 +++
>  gcc/ipa-devirt.cc  |  3 ++-
>  gcc/lra-eliminations.cc| 11 +++
>  gcc/opts-common.cc |  1 +
>  gcc/tree-ssa-ccp.cc| 11 +++
>  gcc/wide-int.h | 11 +++
>  6 files changed, 47 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/i386/i386-expand.cc 
> b/gcc/config/i386/i386-expand.cc index 634fe61ba79..be9f912775b 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -20419,6 +20419,13 @@ expand_vec_perm_pblendv (struct 
> expand_vec_perm_d *d)
>
>  static bool expand_vec_perm_interleave3 (struct expand_vec_perm_d 
> *d);
>
> +/* Work around -Wstringop-overflow false positive during 
> +autoprofiledbootstrap.  */
> +
> +# if GCC_VERSION >= 7001
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wstringop-overflow"
> +#endif
> +
>  /* A subroutine of ix86_expand_vec_perm_const_1.  Try to simplify
> a two vector permutation into a single vector permutation by using
> an interleave operation to merge the vectors.  */ @@ -20737,6 
> +20744,10 @@ expand_vec_perm_interleave2 (struct expand_vec_perm_d *d)
>return true;
>  }
>
> +# if GCC_VERSION >= 7001
> +#pragma GCC diagnostic pop
> +#endif
> +
>  /* A subroutine of ix86_expand_vec_perm_const_1.  Try to simplify
> a single vector cross-lane permutation into vpermq followed
> by any of the single insn permutations.  */ diff --git 
> a/gcc/ipa-devirt.cc b/gcc/ipa-devirt.cc index 819860258d1..36ea266e834 
> 100644
> --- a/gcc/ipa-devirt.cc
> +++ b/gcc/ipa-devirt.cc
> @@ -4033,7 +4033,8 @@ debug_tree_odr_name (tree type, bool demangle)
>odr = cplus_demangle (odr, opts);
>  }
>
> -  fprintf (stderr, "%s\n", odr);
> +  if (odr != NULL)
> +fprintf (stderr, "%s\n", odr);
>  }
>
>  /* Register ODR enum so we later stream record about its values.  */ 
> diff --git a/gcc/lra-eliminations.cc b/gcc/lra-eliminations.cc index 
> 4220639..05e2a7e0d68 100644
> --- a/gcc/lra-eliminations.cc
> +++ b/gcc/lra-eliminations.cc
> @@ -138,6 +138,13 @@ lra_debug_elim_table (void)
>print_elim_table (stderr);
>  }
>
> +/* Work around -Wmaybe-uninitialized false positive during 
> +autoprofiledbootstrap.  */
> +
> +# if GCC_VERSION >= 4007
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
> +#endif
> +
>  /* Setup possibility of elimination in elimination table element EP to
> VALUE.  Setup FRAME_POINTER_NEEDED if elimination from frame
> pointer to stack pointer is not possible anymore.  */ @@ -152,6 
> +159,10 @@ setup_can_eliminate (class lra_elim_table *ep, bool value)
>  REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = 0;  }
>
> +# if G

RE: [EXTERNAL] Re: [PATCH] Fixes and workarounds for warnings during autoprofiledbootstrap build

2023-05-10 Thread Eugene Rozenfeld via Gcc-patches


> I cannot find a call to this debug function on trunk.  How exactly did this 
> trigger a warning?

Here is the command during autoprofiledbootstrap build that resulted in a 
warning:

~/gcc1_objdir/gcc$ /home/erozen/gcc1_objdir/./prev-gcc/xg++ 
-B/home/erozen/gcc1_objdir/./prev-gcc/ 
-B/home/erozen/GCC1/x86_64-pc-linux-gnu/bin/ -nostdinc++ 
-B/home/erozen/gcc1_objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
-B/home/erozen/gcc1_objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
  
-I/home/erozen/gcc1_objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
  -I/home/erozen/gcc1_objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/include  
-I/home/erozen/gcc1/libstdc++-v3/libsupc++ 
-L/home/erozen/gcc1_objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs 
-L/home/erozen/gcc1_objdir/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
  -fno-PIE -c   -g -O2 -fchecking=1 -DIN_GCC -fno-exceptions -fno-rtti 
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported 
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros 
-Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H 
-fauto-profile=cc1plus.fda -I. -I. -I/home/erozen/gcc1_objdir/../gcc1/gcc 
-I/home/erozen/gcc1_objdir/../gcc1/gcc/. 
-I/home/erozen/gcc1_objdir/../gcc1/gcc/../include  
-I/home/erozen/gcc1_objdir/../gcc1/gcc/../libcpp/include 
-I/home/erozen/gcc1_objdir/../gcc1/gcc/../libcody 
-I/home/erozen/gcc1_objdir/./gmp -I/home/erozen/gcc1/gmp 
-I/home/erozen/gcc1_objdir/./mpfr/src -I/home/erozen/gcc1/mpfr/src 
-I/home/erozen/gcc1/mpc/src  
-I/home/erozen/gcc1_objdir/../gcc1/gcc/../libdecnumber 
-I/home/erozen/gcc1_objdir/../gcc1/gcc/../libdecnumber/bid -I../libdecnumber 
-I/home/erozen/gcc1_objdir/../gcc1/gcc/../libbacktrace 
-I/home/erozen/gcc1_objdir/./isl/include -I/home/erozen/gcc1/isl/include  -o 
ipa-devirt.o -MT ipa-devirt.o -MMD -MP -MF ./.deps/ipa-devirt.TPo 
/home/erozen/gcc1_objdir/../gcc1/gcc/ipa-devirt.cc
/home/erozen/gcc1_objdir/../gcc1/gcc/ipa-devirt.cc: In function 'void 
debug_tree_odr_name(tree, bool)':
/home/erozen/gcc1_objdir/../gcc1/gcc/ipa-devirt.cc:4037:23: error: '%s' 
directive argument is null [-Werror=format-overflow=]
 4037 | fprintf (stderr, "%s\n", odr);


> In any case, IMHO the function should rather print something that makes it 
> clear that an odr name could not be obtained rather than printing nothing.

> I also think that if we want to handle the case, we should do it before also 
> possibly passing odr to demangler.

I'll modify the fix unless we end up suppressing warnings as errors for 
autoprofiledbootstrap as Richard suggested.

Thanks,

Martin


[PATCH] Fixes and workarounds for warnings during autoprofiledbootstrap build

2023-05-09 Thread Eugene Rozenfeld via Gcc-patches
autoprofiledbootstrap build produces new warnings since inlining decisions
are different from other builds. This patch contains fixes and workarounds
for those warnings.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

* config/i386/i386-expand.cc (expand_vec_perm_interleave2): Work around
-Wstringop-overflow false positive during autoprofiledbootstrap
* ipa-devirt.cc (debug_tree_odr_name): Fix for -Wformat-overflow
warning during autoprofiledbootstrap
* lra-eliminations.cc (setup_can_eliminate): Work around
-Wmaybe-uninitialized false positive during autoprofiledbootstrap
* opts-common.cc (candidates_list_and_hint): Work around
-Wstringop-overflow false positive during autoprofiledbootstrap
* tree-ssa-ccp.cc (bit_value_unop): Work around -Wmaybe-uninitialized
false positive during autoprofiledbootstrap
* wide-int.h (wi::copy): Work around -Wmaybe-uninitialized false
positive during autoprofiledbootstrap
---
 gcc/config/i386/i386-expand.cc | 11 +++
 gcc/ipa-devirt.cc  |  3 ++-
 gcc/lra-eliminations.cc| 11 +++
 gcc/opts-common.cc |  1 +
 gcc/tree-ssa-ccp.cc| 11 +++
 gcc/wide-int.h | 11 +++
 6 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 634fe61ba79..be9f912775b 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -20419,6 +20419,13 @@ expand_vec_perm_pblendv (struct expand_vec_perm_d *d)
 
 static bool expand_vec_perm_interleave3 (struct expand_vec_perm_d *d);
 
+/* Work around -Wstringop-overflow false positive during 
autoprofiledbootstrap.  */
+
+# if GCC_VERSION >= 7001
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wstringop-overflow"
+#endif
+
 /* A subroutine of ix86_expand_vec_perm_const_1.  Try to simplify
a two vector permutation into a single vector permutation by using
an interleave operation to merge the vectors.  */
@@ -20737,6 +20744,10 @@ expand_vec_perm_interleave2 (struct expand_vec_perm_d 
*d)
   return true;
 }
 
+# if GCC_VERSION >= 7001
+#pragma GCC diagnostic pop
+#endif
+
 /* A subroutine of ix86_expand_vec_perm_const_1.  Try to simplify
a single vector cross-lane permutation into vpermq followed
by any of the single insn permutations.  */
diff --git a/gcc/ipa-devirt.cc b/gcc/ipa-devirt.cc
index 819860258d1..36ea266e834 100644
--- a/gcc/ipa-devirt.cc
+++ b/gcc/ipa-devirt.cc
@@ -4033,7 +4033,8 @@ debug_tree_odr_name (tree type, bool demangle)
   odr = cplus_demangle (odr, opts);
 }
 
-  fprintf (stderr, "%s\n", odr);
+  if (odr != NULL)
+fprintf (stderr, "%s\n", odr);
 }
 
 /* Register ODR enum so we later stream record about its values.  */
diff --git a/gcc/lra-eliminations.cc b/gcc/lra-eliminations.cc
index 4220639..05e2a7e0d68 100644
--- a/gcc/lra-eliminations.cc
+++ b/gcc/lra-eliminations.cc
@@ -138,6 +138,13 @@ lra_debug_elim_table (void)
   print_elim_table (stderr);
 }
 
+/* Work around -Wmaybe-uninitialized false positive during 
autoprofiledbootstrap.  */
+
+# if GCC_VERSION >= 4007
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+#endif
+
 /* Setup possibility of elimination in elimination table element EP to
VALUE.  Setup FRAME_POINTER_NEEDED if elimination from frame
pointer to stack pointer is not possible anymore.  */
@@ -152,6 +159,10 @@ setup_can_eliminate (class lra_elim_table *ep, bool value)
 REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = 0;
 }
 
+# if GCC_VERSION >= 4007
+#pragma GCC diagnostic pop
+#endif
+
 /* Map: eliminable "from" register -> its current elimination,
or NULL if none.  The elimination table may contain more than
one elimination for the same hard register, but this map specifies
diff --git a/gcc/opts-common.cc b/gcc/opts-common.cc
index 23ddcaa3b55..0bb8e34e2b0 100644
--- a/gcc/opts-common.cc
+++ b/gcc/opts-common.cc
@@ -1388,6 +1388,7 @@ candidates_list_and_hint (const char *arg, char *,
   p[len] = ' ';
   p += len + 1;
 }
+  gcc_assert(p > str);
   p[-1] = '\0';
   return find_closest_string (arg, );
 }
diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index 03a984f2adf..a54e5a90464 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -1976,6 +1976,13 @@ bit_value_binop (enum tree_code code, signop sgn, int 
width,
 }
 }
 
+/* Work around -Wmaybe-uninitialized false positive during 
autoprofiledbootstrap.  */
+
+# if GCC_VERSION >= 4007
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
+#endif
+
 /* Return the propagation value when applying the operation CODE to
the value RHS yielding type TYPE.  */
 
@@ -2011,6 +2018,10 @@ bit_value_unop (enum tree_code code, tree type, tree rhs)
   return val;
 }
 
+# if GCC_VERSION >= 4007
+#pragma GCC diagnostic pop
+#endif
+
 /* 

RE: [EXTERNAL] Re: [PATCH][PUSHED] Fix cfg maintenance after inlining in AutoFDO

2023-05-09 Thread Eugene Rozenfeld via Gcc-patches
>>  free_dominance_info (CDI_DOMINATORS); @@ -1674,7 +1677,7 @@ 
>> auto_profile (void)
>>  pop_cfun ();
>>}
>>
>> -  return TODO_rebuild_cgraph_edges;
>> +  return 0;

>This change isn't mentioned - was it accidential?

No, it wasn't accidental. There is no reason to return 
TODO_rebuild_cgraph_edges since we called cgraph_edge::rebuild_edges () after 
each early_inline ().
Here is more context before that diff:

todo |= early_inline ();
autofdo::afdo_annotate_cfg (promoted_stmts);
compute_function_frequency ();

/* Local pure-const may imply need to fixup the cfg.  */
todo |= execute_fixup_cfg ();
if (todo & TODO_cleanup_cfg)
  cleanup_tree_cfg ();

free_dominance_info (CDI_DOMINATORS);
free_dominance_info (CDI_POST_DOMINATORS);
cgraph_edge::rebuild_edges ();
compute_fn_summary (cgraph_node::get (current_function_decl), true);
pop_cfun ();
  }

  return 0;

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Monday, May 8, 2023 11:40 PM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH][PUSHED] Fix cfg maintenance after inlining in 
AutoFDO

On Tue, May 9, 2023 at 12:27 AM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> Todo from early_inliner needs to be propagated so that 
> cleanup_tree_cfg () is called if necessary.
>
> This bug was causing an assert in get_loop_body during ipa-sra in 
> autoprofiledbootstrap build since loops weren't fixed up and one of 
> the loops had num_nodes set to 0.
>
> Tested on x86_64-pc-linux-gnu.
>
> gcc/ChangeLog:
>
> * auto-profile.cc (auto_profile): Check todo from early_inline
> to see if cleanup_tree_vfg needs to be called.

_cfg

> (early_inline): Return todo from early_inliner.
> ---
>  gcc/auto-profile.cc | 21 -
>  1 file changed, 12 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc index 
> 360c42c4b89..e3af3555e75 100644
> --- a/gcc/auto-profile.cc
> +++ b/gcc/auto-profile.cc
> @@ -1589,13 +1589,14 @@ afdo_annotate_cfg (const stmt_set 
> _stmts)
>
>  /* Wrapper function to invoke early inliner.  */
>
> -static void
> +static unsigned int
>  early_inline ()
>  {
>compute_fn_summary (cgraph_node::get (current_function_decl), 
> true);
> -  unsigned todo = early_inliner (cfun);
> +  unsigned int todo = early_inliner (cfun);
>if (todo & TODO_update_ssa_any)
>  update_ssa (TODO_update_ssa);
> +  return todo;
>  }
>
>  /* Use AutoFDO profile to annoate the control flow graph.
> @@ -1651,20 +1652,22 @@ auto_profile (void)
> function before annotation, so the profile inside bar@loc_foo2
> will be useful.  */
>  autofdo::stmt_set promoted_stmts;
> +unsigned int todo = 0;
>  for (int i = 0; i < 10; i++)
>{
> -if (!flag_value_profile_transformations
> -|| !autofdo::afdo_vpt_for_early_inline (_stmts))
> -  break;
> -early_inline ();
> +   if (!flag_value_profile_transformations
> +   || !autofdo::afdo_vpt_for_early_inline (_stmts))
> + break;
> +   todo |= early_inline ();
>}
>
> -early_inline ();
> +todo |= early_inline ();
>  autofdo::afdo_annotate_cfg (promoted_stmts);
>  compute_function_frequency ();
>
>  /* Local pure-const may imply need to fixup the cfg.  */
> -if (execute_fixup_cfg () & TODO_cleanup_cfg)
> +todo |= execute_fixup_cfg ();
> +if (todo & TODO_cleanup_cfg)
>cleanup_tree_cfg ();
>
>  free_dominance_info (CDI_DOMINATORS); @@ -1674,7 +1677,7 @@ 
> auto_profile (void)
>  pop_cfun ();
>}
>
> -  return TODO_rebuild_cgraph_edges;
> +  return 0;

This change isn't mentioned - was it accidential?

Otherwise looks OK.

Thanks,
Richard.

>  }
>  } /* namespace autofdo.  */
>
> --
> 2.25.1
>


[PATCH][PUSHED] Fix cfg maintenance after inlining in AutoFDO

2023-05-08 Thread Eugene Rozenfeld via Gcc-patches
Todo from early_inliner needs to be propagated so that
cleanup_tree_cfg () is called if necessary.

This bug was causing an assert in get_loop_body during
ipa-sra in autoprofiledbootstrap build since loops weren't
fixed up and one of the loops had num_nodes set to 0.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

* auto-profile.cc (auto_profile): Check todo from early_inline
to see if cleanup_tree_vfg needs to be called.
(early_inline): Return todo from early_inliner.
---
 gcc/auto-profile.cc | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index 360c42c4b89..e3af3555e75 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -1589,13 +1589,14 @@ afdo_annotate_cfg (const stmt_set _stmts)
 
 /* Wrapper function to invoke early inliner.  */
 
-static void
+static unsigned int
 early_inline ()
 {
   compute_fn_summary (cgraph_node::get (current_function_decl), true);
-  unsigned todo = early_inliner (cfun);
+  unsigned int todo = early_inliner (cfun);
   if (todo & TODO_update_ssa_any)
 update_ssa (TODO_update_ssa);
+  return todo;
 }
 
 /* Use AutoFDO profile to annoate the control flow graph.
@@ -1651,20 +1652,22 @@ auto_profile (void)
function before annotation, so the profile inside bar@loc_foo2
will be useful.  */
 autofdo::stmt_set promoted_stmts;
+unsigned int todo = 0;
 for (int i = 0; i < 10; i++)
   {
-if (!flag_value_profile_transformations
-|| !autofdo::afdo_vpt_for_early_inline (_stmts))
-  break;
-early_inline ();
+   if (!flag_value_profile_transformations
+   || !autofdo::afdo_vpt_for_early_inline (_stmts))
+ break;
+   todo |= early_inline ();
   }
 
-early_inline ();
+todo |= early_inline ();
 autofdo::afdo_annotate_cfg (promoted_stmts);
 compute_function_frequency ();
 
 /* Local pure-const may imply need to fixup the cfg.  */
-if (execute_fixup_cfg () & TODO_cleanup_cfg)
+todo |= execute_fixup_cfg ();
+if (todo & TODO_cleanup_cfg)
   cleanup_tree_cfg ();
 
 free_dominance_info (CDI_DOMINATORS);
@@ -1674,7 +1677,7 @@ auto_profile (void)
 pop_cfun ();
   }
 
-  return TODO_rebuild_cgraph_edges;
+  return 0;
 }
 } /* namespace autofdo.  */
 
-- 
2.25.1



RE: [EXTERNAL] Re: [PATCH] Fix autoprofiledbootstrap build

2023-03-27 Thread Eugene Rozenfeld via Gcc-patches
Ping for https://gcc.gnu.org/pipermail/gcc-patches/2023-March/613974.html

Thanks,

Eugene

-Original Message-
From: Eugene Rozenfeld 
Sent: Tuesday, March 14, 2023 2:21 PM
To: Jeff Law ; gcc-patches@gcc.gnu.org; Andi Kleen 

Subject: RE: [EXTERNAL] Re: [PATCH] Fix autoprofiledbootstrap build

Hi Jeff,

I revived profile_merger tool in http://github.com/google/autofdo and re-worked 
the patch to merge profiles for compiling the libraries.

Please take a look at the attached patch.

Thanks,

Eugene

-Original Message-
From: Jeff Law  
Sent: Tuesday, November 22, 2022 10:16 PM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org; 
Andi Kleen 
Subject: Re: [EXTERNAL] Re: [PATCH] Fix autoprofiledbootstrap build

[You don't often get email from jeffreya...@gmail.com. Learn why this is 
important at https://aka.ms/LearnAboutSenderIdentification ]

On 11/22/22 14:20, Eugene Rozenfeld wrote:
> I took another look at this. We actually collect perf data when building the 
> libraries. So, we have ./prev-gcc/perf.data, ./prev-libcpp/perf.data, 
> ./prev-libiberty/perf.data, etc. But when creating gcov data for  
> -fauto-profile build of cc1plus or cc1 we only use ./prev-gcc/perf.data . So, 
> a better solution would be either having a single perf.data for all builds 
> (gcc and libraries) or merging perf.data files before attempting 
> autostagefeedback. What would you recommend?

ISTM that if neither approach loses data, then they're functionally equivalent 
-- meaning that we can select whichever is easier to wire into our build system.

A single perf.data might serialize the build.  So perhaps separate, then merge 
right before autostagefeedback.


But I'm willing to go with whatever you think is best.

Jeff




RE: [EXTERNAL] Re: [PATCH] Fix autoprofiledbootstrap build

2023-03-14 Thread Eugene Rozenfeld via Gcc-patches
Hi Jeff,

I revived profile_merger tool in http://github.com/google/autofdo and re-worked 
the patch to merge profiles for compiling the libraries.

Please take a look at the attached patch.

Thanks,

Eugene

-Original Message-
From: Jeff Law  
Sent: Tuesday, November 22, 2022 10:16 PM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org; 
Andi Kleen 
Subject: Re: [EXTERNAL] Re: [PATCH] Fix autoprofiledbootstrap build

[You don't often get email from jeffreya...@gmail.com. Learn why this is 
important at https://aka.ms/LearnAboutSenderIdentification ]

On 11/22/22 14:20, Eugene Rozenfeld wrote:
> I took another look at this. We actually collect perf data when building the 
> libraries. So, we have ./prev-gcc/perf.data, ./prev-libcpp/perf.data, 
> ./prev-libiberty/perf.data, etc. But when creating gcov data for  
> -fauto-profile build of cc1plus or cc1 we only use ./prev-gcc/perf.data . So, 
> a better solution would be either having a single perf.data for all builds 
> (gcc and libraries) or merging perf.data files before attempting 
> autostagefeedback. What would you recommend?

ISTM that if neither approach loses data, then they're functionally equivalent 
-- meaning that we can select whichever is easier to wire into our build system.

A single perf.data might serialize the build.  So perhaps separate, then merge 
right before autostagefeedback.


But I'm willing to go with whatever you think is best.

Jeff




0001-Fix-autoprofiledbootstrap-build.patch
Description: 0001-Fix-autoprofiledbootstrap-build.patch


RE: [EXTERNAL] Re: [PATCH] Fix count comparison in ipa-cp

2022-12-06 Thread Eugene Rozenfeld via Gcc-patches
I initially ran into this while reviving autoprofiledbootstrap build.

I was able to create a simple reliable test for this bug and captured it in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108000

I also included the test case in the updated patch below.

Eugene

=

The existing comparison was incorrect for non-PRECISE counts
(e.g., AFDO): we could end up with a 0 base_count, which could
lead to asserts, e.g., in good_cloning_opportunity_p.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
PR ipa/108000
* ipa-cp.cc (ipcp_propagate_stage): Fix profile count comparison

gcc/testsuite
* gcc.dg/tree-prof/pr108000.c: Regression test
---
 gcc/ipa-cp.cc |  2 +-
 gcc/testsuite/gcc.dg/tree-prof/pr108000.c | 93 +++
 2 files changed, 94 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/pr108000.c

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index d5230c7c5e6..cc031ebed0f 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -4225,7 +4225,7 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
for (cgraph_edge *cs = node->callees; cs; cs = cs->next_callee)
  {
profile_count count = cs->count.ipa ();
-   if (!(count > profile_count::zero ()))
+   if (!count.nonzero_p ())
  continue;
 
enum availability avail;
diff --git a/gcc/testsuite/gcc.dg/tree-prof/pr108000.c 
b/gcc/testsuite/gcc.dg/tree-prof/pr108000.c
new file mode 100644
index 000..c59ea799748
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/pr108000.c
@@ -0,0 +1,93 @@
+/* { dg-options "-O2" } */
+
+#include 
+
+volatile int flag;
+const int array_size = 10;
+int* array;
+int iterations = 1000;
+
+#define BAR(num) \
+int __attribute__((noinline)) \
+bar##num (int i, int j) \
+{ \
+  if (i == 0) \
+return 2*num - 1; \
+  else \
+return 2*num; \
+}
+
+BAR(1)
+BAR(2)
+BAR(3)
+BAR(4)
+BAR(5)
+BAR(6)
+BAR(7)
+BAR(8)
+BAR(9)
+BAR(10)
+BAR(11)
+BAR(12)
+BAR(13)
+BAR(14)
+BAR(15)
+BAR(16)
+BAR(17)
+BAR(18)
+BAR(19)
+
+int __attribute__((noinline))
+foo ()
+{
+  switch (flag)
+  {
+   case 1:
+   return bar1 (0, 0);
+   case 2:
+   return bar2 (0, 0);
+   case 3:
+   return bar3 (0, 0);
+   case 4:
+   return bar4 (0, 0);
+   case 5:
+   return bar5 (0, 0);
+   case 6:
+   return bar6 (0, 0);
+   case 7:
+   return bar7 (0, 0);
+   case 8:
+   return bar8 (0, 0);
+   case 9:
+   return bar9 (0, 0);
+   case 10:
+   return bar10 (0, 0);
+   case 11:
+   return bar11 (0, 0);
+   case 12:
+   return bar12 (0, 0);
+   case 13:
+   return bar13 (0, 0);
+   case 14:
+   return bar14 (0, 0);
+   case 15:
+   return bar15 (0, 0);
+   case 16:
+   return bar16 (0, 0);
+   case 17:
+   return bar17 (0, 0);
+   case 18:
+   return bar18 (0, 0);
+   default:
+   return bar19(0, 0);
+  }
+}
+
+int
+main ()
+{
+  flag = 0;
+  array = calloc(array_size, sizeof(int));
+  for (int i = 0, j = 0; i < iterations; ++i, j = (j + 1) % 10)
+array[j] = foo ();
+}
-- 
2.25.1

-Original Message-
From: Jeff Law  
Sent: Tuesday, November 22, 2022 12:03 PM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] Fix count comparison in ipa-cp

[You don't often get email from jeffreya...@gmail.com. Learn why this is 
important at https://aka.ms/LearnAboutSenderIdentification ]

On 11/21/22 14:26, Eugene Rozenfeld via Gcc-patches wrote:
> The existing comparison was incorrect for non-PRECISE counts (e.g., 
> AFDO): we could end up with a 0 base_count, which could lead to 
> asserts, e.g., in good_cloning_opportunity_p.
>
> gcc/ChangeLog:
>
>  * ipa-cp.cc (ipcp_propagate_stage): Fix profile count comparison.

OK.  Probably somewhat painful to pull together a reliable test for this, right?


Jeff




RE: [EXTERNAL] Re: [PATCH] Fix autoprofiledbootstrap build

2022-11-22 Thread Eugene Rozenfeld via Gcc-patches
I took another look at this. We actually collect perf data when building the 
libraries. So, we have ./prev-gcc/perf.data, ./prev-libcpp/perf.data, 
./prev-libiberty/perf.data, etc. But when creating gcov data for  
-fauto-profile build of cc1plus or cc1 we only use ./prev-gcc/perf.data . So, a 
better solution would be either having a single perf.data for all builds (gcc 
and libraries) or merging perf.data files before attempting autostagefeedback. 
What would you recommend?

Thanks,

Eugene

-Original Message-
From: Jeff Law  
Sent: Tuesday, November 22, 2022 12:01 PM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org; 
Andi Kleen 
Subject: [EXTERNAL] Re: [PATCH] Fix autoprofiledbootstrap build

[You don't often get email from jeffreya...@gmail.com. Learn why this is 
important at https://aka.ms/LearnAboutSenderIdentification ]

On 11/21/22 14:57, Eugene Rozenfeld via Gcc-patches wrote:
> 1. Fix gcov version
> 2. Don't attempt to create an autoprofile file for cc1 since cc1plus 
> (not cc1) is not invoked when building cc1 3. Fix documentation typo
>
> Tested on x86_64-pc-linux-gnu.
>
> gcc/ChangeLog:
>
>   * c/Make-lang.in: Don't attempt to create an autoprofile file for cc1
>   * cp/Make-lang.in: Fix gcov version
>   * lto/Make-lang.in: Fix gcov version
>   * doc/install.texi: Fix documentation typo

Just to be 100% sure.  While the compiler is built with cc1plus, various 
runtime libraries are still build with the C compiler and thus would use cc1.  
AFAICT it looks like we don't try to build the runtime libraries to get any 
data about the behavior of the C compiler.  Can you confirm?


Assuming that's correct, this is fine for the trunk.


Thanks,

Jeff



[PATCH] Fix autoprofiledbootstrap build

2022-11-21 Thread Eugene Rozenfeld via Gcc-patches
1. Fix gcov version
2. Don't attempt to create an autoprofile file for cc1 since cc1plus
(not cc1) is not invoked when building cc1
3. Fix documentation typo

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

* c/Make-lang.in: Don't attempt to create an autoprofile file for cc1
* cp/Make-lang.in: Fix gcov version
* lto/Make-lang.in: Fix gcov version
* doc/install.texi: Fix documentation typo
---
 gcc/c/Make-lang.in   | 15 +--
 gcc/cp/Make-lang.in  |  2 +-
 gcc/doc/install.texi |  2 +-
 gcc/lto/Make-lang.in |  2 +-
 4 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/gcc/c/Make-lang.in b/gcc/c/Make-lang.in
index 9bd9c0ea123..ba33ec03bf0 100644
--- a/gcc/c/Make-lang.in
+++ b/gcc/c/Make-lang.in
@@ -62,12 +62,6 @@ c_OBJS = $(C_OBJS) cc1-checksum.o c/gccspec.o
 # Use strict warnings for this front end.
 c-warn = $(STRICT_WARN)
 
-ifeq ($(if $(wildcard ../stage_current),$(shell cat \
-  ../stage_current)),stageautofeedback)
-$(C_OBJS): ALL_COMPILERFLAGS += -fauto-profile=cc1.fda
-$(C_OBJS): cc1.fda
-endif
-
 # compute checksum over all object files and the options
 # re-use the checksum from the prev-final stage so it passes
 # the bootstrap comparison and allows comparing of the cc1 binary
@@ -88,9 +82,6 @@ cc1$(exeext): $(C_OBJS) cc1-checksum.o $(BACKEND) $(LIBDEPS)
  cc1-checksum.o $(BACKEND) $(LIBS) $(BACKENDLIBS)
@$(call LINK_PROGRESS,$(INDEX.c),end)
 
-cc1.fda: ../stage1-gcc/cc1$(exeext) ../prev-gcc/$(PERF_DATA)
-   $(CREATE_GCOV) -binary ../stage1-gcc/cc1$(exeext) -gcov cc1.fda 
-profile ../prev-gcc/$(PERF_DATA) -gcov_version 1
-
 #

 # Build hooks:
 
@@ -180,7 +171,6 @@ c.mostlyclean:
-rm -f cc1$(exeext)
-rm -f c/*$(objext)
-rm -f c/*$(coverageexts)
-   -rm -f cc1.fda
 c.clean:
 c.distclean:
-rm -f c/config.status c/Makefile
@@ -201,7 +191,4 @@ c.stageprofile: stageprofile-start
-mv c/*$(objext) stageprofile/c
 c.stagefeedback: stagefeedback-start
-mv c/*$(objext) stagefeedback/c
-c.autostageprofile: autostageprofile-start
-   -mv c/*$(objext) autostageprofile/c
-c.autostagefeedback: autostagefeedback-start
-   -mv c/*$(objext) autostagefeedback/c
+
diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index 291835d326e..49e5cd66912 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -178,7 +178,7 @@ endif
 cp/name-lookup.o: $(srcdir)/cp/std-name-hint.h
 
 cc1plus.fda: ../stage1-gcc/cc1plus$(exeext) ../prev-gcc/$(PERF_DATA)
-   $(CREATE_GCOV) -binary ../stage1-gcc/cc1plus$(exeext) -gcov cc1plus.fda 
-profile ../prev-gcc/$(PERF_DATA) -gcov_version 1
+   $(CREATE_GCOV) -binary ../stage1-gcc/cc1plus$(exeext) -gcov cc1plus.fda 
-profile ../prev-gcc/$(PERF_DATA) -gcov_version 2
 
 #

 # Build hooks:
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index c1876f24a84..61a483bc410 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -3059,7 +3059,7 @@ It is recommended to only use GCC for this.
 
 On Linux/x86_64 hosts with some restrictions (no virtualization) it is
 also possible to do autofdo build with @samp{make
-autoprofiledback}. This uses Linux perf to sample branches in the
+autoprofiledbootstrap}. This uses Linux perf to sample branches in the
 binary and then rebuild it with feedback derived from the profile.
 Linux perf and the @code{autofdo} toolkit needs to be installed for
 this.
diff --git a/gcc/lto/Make-lang.in b/gcc/lto/Make-lang.in
index a2dcf0dfc12..3ee748489ac 100644
--- a/gcc/lto/Make-lang.in
+++ b/gcc/lto/Make-lang.in
@@ -106,7 +106,7 @@ $(LTO_DUMP_EXE): $(LTO_DUMP_OBJS) $(BACKEND) $(LIBDEPS) 
$(lto2.prev)
 lto/lto-dump.o: $(LTO_OBJS)
 
 lto1.fda: ../prev-gcc/lto1$(exeext) ../prev-gcc/$(PERF_DATA)
-   $(CREATE_GCOV) -binary ../prev-gcc/lto1$(exeext) -gcov lto1.fda 
-profile ../prev-gcc/$(PERF_DATA) -gcov_version 1
+   $(CREATE_GCOV) -binary ../prev-gcc/lto1$(exeext) -gcov lto1.fda 
-profile ../prev-gcc/$(PERF_DATA) -gcov_version 2
 
 # LTO testing is done as part of C/C++/Fortran etc. testing.
 check-lto:
-- 
2.25.1



[PATCH] Fix count comparison in ipa-cp

2022-11-21 Thread Eugene Rozenfeld via Gcc-patches
The existing comparison was incorrect for non-PRECISE counts
(e.g., AFDO): we could end up with a 0 base_count, which could
lead to asserts, e.g., in good_cloning_opportunity_p.

gcc/ChangeLog:

* ipa-cp.cc (ipcp_propagate_stage): Fix profile count comparison.
---
 gcc/ipa-cp.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index d2bcd5e5e69..9df8b456759 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -4225,7 +4225,7 @@ ipcp_propagate_stage (class ipa_topo_info *topo)
for (cgraph_edge *cs = node->callees; cs; cs = cs->next_callee)
  {
profile_count count = cs->count.ipa ();
-   if (!(count > profile_count::zero ()))
+   if (!count.nonzero_p ())
  continue;
 
enum availability avail;
-- 
2.25.1



[PATCH][PUSHED] Don't force DWARF4 for AutoFDO tests

2022-10-25 Thread Eugene Rozenfeld via Gcc-patches
Support for DWARF5 was added to create_gcov in
https://github.com/google/autofdo so we no longer need
to force DWARF4 for AutoFDO tests.

Tested on x86_64-pc-linux-gnu.

gcc/testsuite/ChangeLog:
* lib/profopt.exp: Don't force DWARF4 for AutoFDO tests
---
 gcc/testsuite/lib/profopt.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/profopt.exp b/gcc/testsuite/lib/profopt.exp
index ac7712a5b42..fe88d2fce37 100644
--- a/gcc/testsuite/lib/profopt.exp
+++ b/gcc/testsuite/lib/profopt.exp
@@ -289,7 +289,7 @@ proc auto-profopt-execute { src } {
 return
 }
 set profile_wrapper [profopt-perf-wrapper]
-set profile_option "-gdwarf-4 -DFOR_AUTOFDO_TESTING"
+set profile_option "-g -DFOR_AUTOFDO_TESTING"
 set feedback_option "-fauto-profile -DFOR_AUTOFDO_TESTING -fearly-inlining"
 set run_autofdo 1
 profopt-execute $src
-- 
2.25.1


[PATCH][PUSHED] Start using discriminators in AutoFDO

2022-10-25 Thread Eugene Rozenfeld via Gcc-patches
Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

* auto-profile.cc (get_combined_location): Include discriminator in the
returned combined location.
(read_function_instance): Read discriminators from profiles.
---
 gcc/auto-profile.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index ca48404eaf1..97307321cbf 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -363,7 +363,8 @@ get_combined_location (location_t loc, tree decl)
   /* TODO: allow more bits for line and less bits for discriminator.  */
   if (LOCATION_LINE (loc) - DECL_SOURCE_LINE (decl) >= (1<<16))
 warning_at (loc, OPT_Woverflow, "offset exceeds 16 bytes");
-  return ((LOCATION_LINE (loc) - DECL_SOURCE_LINE (decl)) << 16);
+  return ((LOCATION_LINE (loc) - DECL_SOURCE_LINE (decl)) << 16)
+| get_discriminator_from_loc (loc);
 }
 
 /* Return the function decl of a given lexical BLOCK.  */
@@ -652,7 +653,7 @@ function_instance::read_function_instance 
(function_instance_stack *stack,
 
   for (unsigned i = 0; i < num_pos_counts; i++)
 {
-  unsigned offset = gcov_read_unsigned () & 0x;
+  unsigned offset = gcov_read_unsigned ();
   unsigned num_targets = gcov_read_unsigned ();
   gcov_type count = gcov_read_counter ();
   s->pos_counts[offset].count = count;
-- 
2.25.1


RE: [EXTERNAL] RE: [r13-3172 Regression] FAIL:libgomp.oacc-c../../libgomp.oacc-c-c..-common/kernels-loop-g.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 (test for excess errors)

2022-10-17 Thread Eugene Rozenfeld via Gcc-patches
Yes, I received that one. The root cause is the -gstatement-frontiers issue 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100733 . I submitted a workaround 
patch for that ( 
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603673.html ) but it 
hasn't been approved yet. Another workaround is passing 
-gno-statement-frontiers to the test.

Eugene

-Original Message-
From: Thomas Schwinge  
Sent: Monday, October 17, 2022 3:46 AM
To: Eugene Rozenfeld 
Cc: haochen.ji...@intel.com; gcc-patches@gcc.gnu.org; gcc-regress...@gcc.gnu.org
Subject: [EXTERNAL] RE: [r13-3172 Regression] 
FAIL:libgomp.oacc-c../../libgomp.oacc-c-c..-common/kernels-loop-g.c 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 (test for 
excess errors) on Linux/x86_64

Hi!

On 2022-10-17T06:11:01+, "Jiang, Haochen via Gcc-patches" 
 wrote:
> I just checkout to your commit and the test still got failed.
>
> It is reporting like this:
> xgcc: error: 
> /export/users2/haochenj/src/gcc/master/./libgomp/testsuite/libgomp.oac
> c-c++/../libgomp.oacc-c-c++-common/kernels-loop-g.c: '-fcompare-debug' 
> failure (length)

Right.  I had filed 

 "[13 Regression]
c-c++-common/goacc/kernels-loop-g.c: '-fcompare-debug' failure (length)"
with you in CC: .  (Have you not received that
one?)


Grüße
 Thomas


> Also fix a typo in manually sending, should be this to reproduce
>
> To reproduce:
>
> $ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
> RUNTESTFLAGS="c++.exp=libgomp.oacc-c-c++-common/kernels-loop-g.c 
> --target_board='unix{-m32}'"
> $ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
> RUNTESTFLAGS="c++.exp=libgomp.oacc-c-c++-common/kernels-loop-g.c 
> --target_board='unix{-m32\ -march=cascadelake}'"
> $ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
> RUNTESTFLAGS="c++.exp=libgomp.oacc-c-c++-common/kernels-loop-g.c 
> --target_board='unix{-m64}'"
> $ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
> RUNTESTFLAGS="c++.exp=libgomp.oacc-c-c++-common/kernels-loop-g.c 
> --target_board='unix{-m64\ -march=cascadelake}'"
>
> BRs,
> Haochen
>
> From: Jiang, Haochen
> Sent: Monday, October 17, 2022 1:41 PM
> To: Eugene Rozenfeld ; 
> gcc-patches@gcc.gnu.org; gcc-regress...@gcc.gnu.org
> Subject: RE: [r13-3172 Regression] 
> FAIL:libgomp.oacc-c../../libgomp.oacc-c-c..-common/kernels-loop-g.c 
> -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 
> (test for excess errors) on Linux/x86_64
>
> If that has been fixed, just ignore that mail.
>
> It is run through by a script and got the result few days ago. 
> However, the sendmail service was down on that machine and I just 
> noticed that issue. So I sent that result manually today in case that is not 
> fixed.
>
> Sorry for the disturb!
>
> BRs,
> Haochen
>
> From: Eugene Rozenfeld 
> mailto:eugene.rozenf...@microsoft.com>
> >
> Sent: Monday, October 17, 2022 1:23 PM
> To: Jiang, Haochen 
> mailto:haochen.ji...@intel.com>>; 
> gcc-patches@gcc.gnu.org; 
> gcc-regress...@gcc.gnu.org
> Subject: RE: [r13-3172 Regression] 
> FAIL:libgomp.oacc-c../../libgomp.oacc-c-c..-common/kernels-loop-g.c 
> -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 
> (test for excess errors) on Linux/x86_64
>
> That commit had a bug that was fixed in 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
> gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D80f414e6d73f9f1683f9
> 3d83ce63a6a482e54beedata=05%7C01%7CEugene.Rozenfeld%40microsoft.c
> om%7C05c1cd8b5db4455fe3de08dab02ce66e%7C72f988bf86f141af91ab2d7cd011db
> 47%7C1%7C0%7C638016004172688130%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLj
> AwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C
> mp;sdata=50kAEt5C%2F8DO%2BvuyNvE7c1ydxriGeQwY5G%2Bqs%2FgSGE0%3Dre
> served=0
>
> Was that fix included in your GCC build?
>
> From: Jiang, Haochen 
> mailto:haochen.ji...@intel.com>>
> Sent: Sunday, October 16, 2022 8:09 PM
> To: gcc-patches@gcc.gnu.org; Eugene 
> Rozenfeld 
> mailto:eugene.rozenf...@microsoft.com>
> >; Jiang, Haochen 
> mailto:haochen.ji...@intel.com>>; 
> gcc-regress...@gcc.gnu.org
> Subject: [EXTERNAL] [r13-3172 Regression] 
> FAIL:libgomp.oacc-c../../libgomp.oacc-c-c..-common/kernels-loop-g.c 
> -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 
> (test for excess errors) on Linux/x86_64
>
> You don't often get email from 
> 

RE: [EXTERNAL] Re: [PATCH] Don't print discriminators for -fcompare-debug.

2022-10-17 Thread Eugene Rozenfeld via Gcc-patches
Yes, -gstatement-frontiers is the root cause here but the new approach to 
discriminators is especially prone to this. I added the workaround to pr85213.c 
in my original discriminator patch but now two more -fcompare-debug bugs were 
opened (PR107231 and PR107169). I suspect we'll keep getting more. So I'd like 
to disable printing discriminators in -fcompare-debug dums until 
-gstatement-frontier issue is fixed.

Eugene

-Original Message-
From: Richard Biener  
Sent: Monday, October 17, 2022 12:06 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org; Jason Merrill 
Subject: [EXTERNAL] Re: [PATCH] Don't print discriminators for -fcompare-debug.

On Sun, Oct 16, 2022 at 10:25 PM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> With -gstatement-frontiers we may end up with different IR coming from 
> the front end with and without debug information turned on.
> See 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D100733data=05%7C01%7CEugene.Rozenfeld%40microsoft.com%7C5d3df88ec7e14f5eec2708dab00e0440%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638015871510301049%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=2JjQHAgDi6%2Fet1vowA1IRcdInJMkkjuva9DbM5rHawc%3Dreserved=0
>  for details.
> That may result in differences in discriminator values and 
> -fcompare-debug failures.
>
> This patch disables printing of discriminators when the dump is 
> intended for -fcompare-debug comparison and reverses the workaround in a test.

I don't think this is the correct approach.  -gstatement-frontiers is known to 
be prone to these issues and is the one to blame here.  I think the bugs should 
be SUSPENDED until -gstatement-frontiers is fixed or at least disabled by 
default (IIRC Jakub tried that but failed last time)

> Tested on x86_64-pc-linux-gnu.
>
> gcc/ChangeLog:
> PR debug/107231
> PR debug/107169
> * print-rtl.cc (print_rtx_operand_code_i): Don't print discriminators
> for -fdebug-compare.
>
> gcc/testsuite/ChangeLog:
>
> * c-c++-common/ubsan/pr85213.c: Reverse the workaround for 
> discriminators.
> ---
>  gcc/print-rtl.cc   | 13 ++---
>  gcc/testsuite/c-c++-common/ubsan/pr85213.c |  7 +--
>  2 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/print-rtl.cc b/gcc/print-rtl.cc index 
> e115f987173..0476f3d7e79 100644
> --- a/gcc/print-rtl.cc
> +++ b/gcc/print-rtl.cc
> @@ -453,10 +453,17 @@ rtx_writer::print_rtx_operand_code_i (const_rtx in_rtx, 
> int idx)
>   expanded_location xloc = insn_location (in_insn);
>   fprintf (m_outfile, " \"%s\":%i:%i", xloc.file, xloc.line,
>xloc.column);
> - int discriminator = insn_discriminator (in_insn);
> -   if (discriminator)
> - fprintf (m_outfile, " discrim %d", discriminator);
>
> + /* Don't print discriminators for -fcompare-debug since the IR
> +coming from the front end may be different with and without
> +debug information turned on. That may result in different
> +discriminator values. */
> + if (!(dump_flags & TDF_COMPARE_DEBUG))
> +   {
> + int discriminator = insn_discriminator (in_insn);
> + if (discriminator)
> +   fprintf (m_outfile, " discrim %d", discriminator);
> +   }
> }
>  #endif
>  }
> diff --git a/gcc/testsuite/c-c++-common/ubsan/pr85213.c 
> b/gcc/testsuite/c-c++-common/ubsan/pr85213.c
> index e903e976f2c..8a6be81d20f 100644
> --- a/gcc/testsuite/c-c++-common/ubsan/pr85213.c
> +++ b/gcc/testsuite/c-c++-common/ubsan/pr85213.c
> @@ -1,11 +1,6 @@
>  /* PR sanitizer/85213 */
>  /* { dg-do compile } */
> -/* Pass -gno-statement-frontiers to work around
> -   
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D100733data=05%7C01%7CEugene.Rozenfeld%40microsoft.com%7C5d3df88ec7e14f5eec2708dab00e0440%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638015871510301049%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=2JjQHAgDi6%2Fet1vowA1IRcdInJMkkjuva9DbM5rHawc%3Dreserved=0
>  :
> -   without it the IR coming from the front end may be different with and 
> without
> -   debug information turned on. That may cause e.g., different discriminator 
> values
> -   and -fcompare-debug failures. */
> -/* { dg-options "-O1 -fsanitize=undefined -fcompare-debug 
> -gno-statement-frontiers" } */
> +/* { dg-options "-O1 -fsanitize=undefined -fcompare-debug" } */
>
>  int
>  foo (int x)
> --
> 2.25.1


RE: [r13-3172 Regression] FAIL:libgomp.oacc-c../../libgomp.oacc-c-c..-common/kernels-loop-g.c -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 (test for excess errors) on Linux/x86_64

2022-10-16 Thread Eugene Rozenfeld via Gcc-patches
That commit had a bug that was fixed in 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=80f414e6d73f9f1683f93d83ce63a6a482e54bee

Was that fix included in your GCC build?

From: Jiang, Haochen 
Sent: Sunday, October 16, 2022 8:09 PM
To: gcc-patches@gcc.gnu.org; Eugene Rozenfeld ; 
Jiang, Haochen ; gcc-regress...@gcc.gnu.org
Subject: [EXTERNAL] [r13-3172 Regression] 
FAIL:libgomp.oacc-c../../libgomp.oacc-c-c..-common/kernels-loop-g.c 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -O2 (test for 
excess errors) on Linux/x86_64

You don't often get email from 
haochen.ji...@intel.com. Learn why this is 
important
On Linux/x86_64,

f30e9fd33e56a5a721346ea6140722e1b193db42 is the first bad commit
commit f30e9fd33e56a5a721346ea6140722e1b193db42
Author: Eugene Rozenfeld mailto:ero...@microsoft.com>>
Date:   Thu Apr 21 16:43:24 2022 -0700

Set discriminators for call stmts on the same line within the same basic 
block.

caused

FAIL: libgomp.oacc-c../../libgomp.oacc-c-c..-common/kernels-loop-g.c 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2  (test for 
excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-2288/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="c++.exp=libgomp.oacc-c-c++-common/kernels-loop-g.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="c++.exp=libgomp.oacc-c-c++-common/kernels-loop-g.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="c++.exp=libgomp.oacc-c-c++-common/kernels-loop-g.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="c++.exp=libgomp.oacc-c-c++-common/kernels-loop-g.c 
--target_board='unix{-m64\ -march=cascadelake}'"



[PATCH] Don't print discriminators for -fcompare-debug.

2022-10-16 Thread Eugene Rozenfeld via Gcc-patches
With -gstatement-frontiers we may end up with different IR
coming from the front end with and without debug information turned on.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100733 for details.
That may result in differences in discriminator values and -fcompare-debug
failures.

This patch disables printing of discriminators when the dump is intended
for -fcompare-debug comparison and reverses the workaround in a test.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
PR debug/107231
PR debug/107169
* print-rtl.cc (print_rtx_operand_code_i): Don't print discriminators
for -fdebug-compare.

gcc/testsuite/ChangeLog:

* c-c++-common/ubsan/pr85213.c: Reverse the workaround for 
discriminators.
---
 gcc/print-rtl.cc   | 13 ++---
 gcc/testsuite/c-c++-common/ubsan/pr85213.c |  7 +--
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/gcc/print-rtl.cc b/gcc/print-rtl.cc
index e115f987173..0476f3d7e79 100644
--- a/gcc/print-rtl.cc
+++ b/gcc/print-rtl.cc
@@ -453,10 +453,17 @@ rtx_writer::print_rtx_operand_code_i (const_rtx in_rtx, 
int idx)
  expanded_location xloc = insn_location (in_insn);
  fprintf (m_outfile, " \"%s\":%i:%i", xloc.file, xloc.line,
   xloc.column);
- int discriminator = insn_discriminator (in_insn);
-   if (discriminator)
- fprintf (m_outfile, " discrim %d", discriminator);
 
+ /* Don't print discriminators for -fcompare-debug since the IR
+coming from the front end may be different with and without
+debug information turned on. That may result in different
+discriminator values. */
+ if (!(dump_flags & TDF_COMPARE_DEBUG))
+   {
+ int discriminator = insn_discriminator (in_insn);
+ if (discriminator)
+   fprintf (m_outfile, " discrim %d", discriminator);
+   }
}
 #endif
 }
diff --git a/gcc/testsuite/c-c++-common/ubsan/pr85213.c 
b/gcc/testsuite/c-c++-common/ubsan/pr85213.c
index e903e976f2c..8a6be81d20f 100644
--- a/gcc/testsuite/c-c++-common/ubsan/pr85213.c
+++ b/gcc/testsuite/c-c++-common/ubsan/pr85213.c
@@ -1,11 +1,6 @@
 /* PR sanitizer/85213 */
 /* { dg-do compile } */
-/* Pass -gno-statement-frontiers to work around
-   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100733 :
-   without it the IR coming from the front end may be different with and 
without
-   debug information turned on. That may cause e.g., different discriminator 
values
-   and -fcompare-debug failures. */
-/* { dg-options "-O1 -fsanitize=undefined -fcompare-debug 
-gno-statement-frontiers" } */
+/* { dg-options "-O1 -fsanitize=undefined -fcompare-debug" } */
 
 int
 foo (int x)
-- 
2.25.1


Re: [EXTERNAL] Re: [PATCH] Set discriminators for call stmts on the same line within the same basic block

2022-10-10 Thread Eugene Rozenfeld via Gcc-patches
I sent a patch that fixes a bug introduced by this patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603203.html

What you are seeing could have been caused by the same bug since it involves an 
uninitialized variable.

Eugene


On Oct 10, 2022, at 5:54 PM, David Edelsohn  wrote:


This patch causes a bootstrap comparison failure on AIX.  It apparently does 
not cause a failure on PPC64BE Linux with the same ABI, so I suspect that the 
failure may be related to the way that function aliases are implemented on AIX, 
which doesn't have ELF symbol alias semantics.

"This change will also simplify call site lookups since now location with 
discriminator will uniquely identify the call site (no callee function name is 
needed)."

I will open a PR with more information about the comparison difference now that 
I have a work-around to bring AIX back to a bootstrappable state.  Any thoughts 
about what could be going wrong?

Thanks, David




[PATCH][ICE] Fix for PR107193.

2022-10-10 Thread Eugene Rozenfeld via Gcc-patches
The bug was introduced in f30e9fd33e56a5a721346ea6140722e1b193db42.
A variable (cur_locus_e) was incorrectly declared inside a loop.
I also moved two other declarations (last and locus) down to make
the code more clear.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
PR debug/107193
* tree-cfg.cc (assign_discriminators): Move declaration of cur_locus_e
out of the loop.
---
 gcc/tree-cfg.cc | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 41f2925665f..ae781871a19 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -1204,9 +1204,8 @@ assign_discriminators (void)
   edge e;
   edge_iterator ei;
   gimple_stmt_iterator gsi;
-  gimple *last = last_stmt (bb);
-  location_t locus = last ? gimple_location (last) : UNKNOWN_LOCATION;
   location_t curr_locus = UNKNOWN_LOCATION;
+  expanded_location curr_locus_e = {};
   int curr_discr = 0;
 
   /* Traverse the basic block, if two function calls within a basic block
@@ -1215,7 +1214,7 @@ assign_discriminators (void)
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
{
  gimple *stmt = gsi_stmt (gsi);
- expanded_location curr_locus_e;
+
  if (curr_locus == UNKNOWN_LOCATION)
{
  curr_locus = gimple_location (stmt);
@@ -1238,6 +1237,8 @@ assign_discriminators (void)
curr_discr = next_discriminator_for_locus (curr_locus);
}
 
+  gimple *last = last_stmt (bb);
+  location_t locus = last ? gimple_location (last) : UNKNOWN_LOCATION;
   if (locus == UNKNOWN_LOCATION)
continue;
 
-- 
2.25.1



Re: [PATCH] Set discriminators for call stmts on the same line within the same basic block

2022-10-06 Thread Eugene Rozenfeld via Gcc-patches
Thank you for the review Jason.

I fixed formatting and updated the commit description:


Call statements are possible split points of a basic block so they may end up
in different basic blocks by the time pass_ipa_auto_profile executes.

This change will also simplify call site lookups since now location with 
discriminator
will uniquely identify the call site (no callee function name is needed).

This change is based on commit 1e6c4a7a8fb8e20545bb9f9032d3854f3f794c18
by Dehao Chen in vendors/google/heads/gcc-4_8.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
* tree-cfg.cc (assign_discriminators): Set discriminators for call stmts
on the same line within the same basic block.
---
 gcc/tree-cfg.cc | 32 
 1 file changed, 32 insertions(+)

diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index ade66c54499..f6a465f4c91 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -1203,8 +1203,40 @@ assign_discriminators (void)
 {
   edge e;
   edge_iterator ei;
+  gimple_stmt_iterator gsi;
   gimple *last = last_stmt (bb);
   location_t locus = last ? gimple_location (last) : UNKNOWN_LOCATION;
+  location_t curr_locus = UNKNOWN_LOCATION;
+  int curr_discr = 0;
+
+  /* Traverse the basic block, if two function calls within a basic block
+   are mapped to the same line, assign a new discriminator because a call
+   stmt could be a split point of a basic block.  */
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
+   {
+ gimple *stmt = gsi_stmt (gsi);
+ expanded_location curr_locus_e;
+ if (curr_locus == UNKNOWN_LOCATION)
+   {
+ curr_locus = gimple_location (stmt);
+ curr_locus_e = expand_location (curr_locus);
+   }
+ else if (!same_line_p (curr_locus, _locus_e, gimple_location 
(stmt)))
+   {
+ curr_locus = gimple_location (stmt);
+ curr_locus_e = expand_location (curr_locus);
+ curr_discr = 0;
+   }
+ else if (curr_discr != 0)
+   {
+ location_t loc = gimple_location (stmt);
+ location_t dloc = location_with_discriminator (loc, curr_discr);
+ gimple_set_location (stmt, dloc);
+   }
+ /* Allocate a new discriminator for CALL stmt.  */
+ if (gimple_code (stmt) == GIMPLE_CALL)
+   curr_discr = next_discriminator_for_locus (curr_locus);
+   }
 
   if (locus == UNKNOWN_LOCATION)
continue;
-- 
2.25.1

-Original Message-
From: Jason Merrill  
Sent: Tuesday, October 04, 2022 3:21 PM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] Set discriminators for call stmts on the same 
line within the same basic block

On 10/3/22 02:08, Eugene Rozenfeld wrote:
> This change is based on commit 
> 1e6c4a7a8fb8e20545bb9f9032d3854f3f794c18
> by Dehao Chen in vendors/google/heads/gcc-4_8.
> 
> Tested on x86_64-pc-linux-gnu.

Brief rationale for the change?

> gcc/ChangeLog:
>  * tree-cfg.cc (assign_discriminators): Set discriminators for call 
> stmts
>  on the same line within the same basic block.
> ---
>   gcc/tree-cfg.cc | 31 +++
>   1 file changed, 31 insertions(+)
> 
> diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc index 
> ade66c54499..8e2a3a5f6c6 100644
> --- a/gcc/tree-cfg.cc
> +++ b/gcc/tree-cfg.cc
> @@ -1203,8 +1203,39 @@ assign_discriminators (void)
>   {
> edge e;
> edge_iterator ei;
> +  gimple_stmt_iterator gsi;
> gimple *last = last_stmt (bb);
> location_t locus = last ? gimple_location (last) : 
> UNKNOWN_LOCATION;
> +  location_t curr_locus = UNKNOWN_LOCATION;
> +  int curr_discr = 0;
> +
> +  /* Traverse the basic block, if two function calls within a basic block
> +   are mapped to the same line, assign a new discriminator because a call
> +   stmt could be a split point of a basic block.  */
> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
> +   {
> + gimple *stmt = gsi_stmt (gsi);
> + expanded_location curr_locus_e;
> + if (curr_locus == UNKNOWN_LOCATION)
> +   {
> + curr_locus = gimple_location (stmt);
> + curr_locus_e = expand_location (curr_locus);
> +   }
> + else if (!same_line_p (curr_locus, _locus_e, gimple_location 
> (stmt)))
> +   {
> + curr_locus = gimple_location (stmt);
> + curr_locus_e = expand_location (curr_locus);
> + curr_discr = 0;
> +   }
> + else if (curr_discr != 0)
> +   {
> + gimple_set_location (stmt, location_with_discriminator (
> + gimple_location (stmt), curr_discr));

This indentation is wonky, with an open paren at the end of the line; I'd 
suggest reformatting to

>  location_t dloc 

[PATCH] Set discriminators for call stmts on the same line within the same basic block

2022-10-03 Thread Eugene Rozenfeld via Gcc-patches
This change is based on commit 1e6c4a7a8fb8e20545bb9f9032d3854f3f794c18
by Dehao Chen in vendors/google/heads/gcc-4_8.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
* tree-cfg.cc (assign_discriminators): Set discriminators for call stmts
on the same line within the same basic block.
---
 gcc/tree-cfg.cc | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index ade66c54499..8e2a3a5f6c6 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -1203,8 +1203,39 @@ assign_discriminators (void)
 {
   edge e;
   edge_iterator ei;
+  gimple_stmt_iterator gsi;
   gimple *last = last_stmt (bb);
   location_t locus = last ? gimple_location (last) : UNKNOWN_LOCATION;
+  location_t curr_locus = UNKNOWN_LOCATION;
+  int curr_discr = 0;
+
+  /* Traverse the basic block, if two function calls within a basic block
+   are mapped to the same line, assign a new discriminator because a call
+   stmt could be a split point of a basic block.  */
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next ())
+   {
+ gimple *stmt = gsi_stmt (gsi);
+ expanded_location curr_locus_e;
+ if (curr_locus == UNKNOWN_LOCATION)
+   {
+ curr_locus = gimple_location (stmt);
+ curr_locus_e = expand_location (curr_locus);
+   }
+ else if (!same_line_p (curr_locus, _locus_e, gimple_location 
(stmt)))
+   {
+ curr_locus = gimple_location (stmt);
+ curr_locus_e = expand_location (curr_locus);
+ curr_discr = 0;
+   }
+ else if (curr_discr != 0)
+   {
+ gimple_set_location (stmt, location_with_discriminator (
+ gimple_location (stmt), curr_discr));
+   }
+ /* Allocate a new discriminator for CALL stmt.  */
+ if (gimple_code (stmt) == GIMPLE_CALL)
+   curr_discr = next_discriminator_for_locus (curr_locus);
+   }

   if (locus == UNKNOWN_LOCATION)
continue;
--
2.25.1


[PATCH] Emit discriminators for inlined call sites.

2022-09-30 Thread Eugene Rozenfeld via Gcc-patches
This change is based on commit 9fa26998a63d4b22b637ed8702520819e408a694
by Dehao Chen in vendors/google/heads/gcc-4_8.

gcc/ChangeLog:

* dwarf2out.cc (add_call_src_coords_attributes): Emit discriminators 
for inlined call sites.
---
 gcc/dwarf2out.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 2df75904022..e81044b8c48 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -24783,7 +24783,8 @@ add_call_src_coords_attributes (tree stmt, dw_die_ref 
die)
   if (RESERVED_LOCATION_P (BLOCK_SOURCE_LOCATION (stmt)))
 return;

-  expanded_location s = expand_location (BLOCK_SOURCE_LOCATION (stmt));
+  location_t locus = BLOCK_SOURCE_LOCATION (stmt);
+  expanded_location s = expand_location (locus);

   if (dwarf_version >= 3 || !dwarf_strict)
 {
@@ -24791,6 +24792,9 @@ add_call_src_coords_attributes (tree stmt, dw_die_ref 
die)
   add_AT_unsigned (die, DW_AT_call_line, s.line);
   if (debug_column_info && s.column)
add_AT_unsigned (die, DW_AT_call_column, s.column);
+  unsigned discr = get_discriminator_from_loc (locus);
+   if (discr != 0)
+ add_AT_unsigned (die, DW_AT_GNU_discriminator, discr);
 }
 }

--
2.25.1


[PATCH][PUSHED] Fix AutoFDO tests to not look for hot/cold splitting.

2022-09-27 Thread Eugene Rozenfeld via Gcc-patches
AutoFDO counts are not reliable and we are currently not
performing hot/cold splitting based on them. This change adjusts
several tree-prof tests not to check for hot/cold splitting
when run with AutoFDO.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/cold_partition_label.c: Don't check for hot/cold 
splitting with AutoFDO.
* gcc.dg/tree-prof/section-attr-1.c: Don't check for hot/cold splitting 
with AutoFDO.
* gcc.dg/tree-prof/section-attr-2.c: Don't check for hot/cold splitting 
with AutoFDO.
* gcc.dg/tree-prof/section-attr-3.c: Don't check for hot/cold splitting 
with AutoFDO.
---
 gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c | 4 ++--
 gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c   | 4 ++--
 gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c   | 4 ++--
 gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c   | 4 ++--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c 
b/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
index 511b61067c0..b85e6c1f93d 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
@@ -43,6 +43,6 @@ main (int argc, char *argv[])
   return 0;
 }
 
-/* { dg-final-use { scan-assembler "foo\[._\]+cold" { target *-*-linux* 
*-*-gnu* } } } */
-/* { dg-final-use { scan-assembler "size\[ \ta-zA-Z0-0\]+foo\[._\]+cold" { 
target *-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "foo\[._\]+cold" { target 
*-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "size\[ 
\ta-zA-Z0-0\]+foo\[._\]+cold" { target *-*-linux* *-*-gnu* } } } */
 /* { dg-final-use { scan-tree-dump-not "Invalid sum" "optimized"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c 
b/gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c
index 2087d0d2059..5376de14a2f 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c
@@ -52,5 +52,5 @@ foo (int path)
 }
 }
 
-/* { dg-final-use { scan-assembler "\.section\[\t 
\]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target 
*-*-linux* *-*-gnu* } } } */
-/* { dg-final-use { scan-assembler {.section[\t 
]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "\.section\[\t 
\]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target 
*-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler {.section[\t 
]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c 
b/gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c
index b02526beaea..90de2c08ca4 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c
@@ -51,5 +51,5 @@ foo (int path)
 }
 }
 
-/* { dg-final-use { scan-assembler "\.section\[\t 
\]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target 
*-*-linux* *-*-gnu* } } } */
-/* { dg-final-use { scan-assembler {.section[\t 
]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "\.section\[\t 
\]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target 
*-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler {.section[\t 
]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
diff --git a/gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c 
b/gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c
index da064070653..29a48f05feb 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c
@@ -52,5 +52,5 @@ foo (int path)
 }
 }
 
-/* { dg-final-use { scan-assembler "\.section\[\t 
\]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target 
*-*-linux* *-*-gnu* } } } */
-/* { dg-final-use { scan-assembler {.section[\t 
]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler "\.section\[\t 
\]*\.text\.unlikely\[\\n\\r\]+\[\t \]*\.size\[\t \]*foo\.cold" { target 
*-*-linux* *-*-gnu* } } } */
+/* { dg-final-use-not-autofdo { scan-assembler {.section[\t 
]*__TEXT,__text_cold[^\n]*[\n\r]+_foo.cold:} { target *-*-darwin* } } } */
-- 
2.25.1



[PATCH] Fix profile count comparison.

2022-09-23 Thread Eugene Rozenfeld via Gcc-patches
The comparison was incorrect when the counts weren't PRECISE.
For example, crossmodule-indir-call-topn-1.c was failing
with AutoFDO: when count_sum is 0 with quality AFDO,
count_sum > profile_count::zero() evaluates to true. Taking that
branch then leads to an assert in the call to to_sreal().

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

* ipa-cp.cc (good_cloning_opportunity_p): Fix profile count comparison.
---
 gcc/ipa-cp.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 543a9334e2c..66bba71c068 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -3338,9 +3338,9 @@ good_cloning_opportunity_p (struct cgraph_node *node, 
sreal time_benefit,

   ipa_node_params *info = ipa_node_params_sum->get (node);
   int eval_threshold = opt_for_fn (node->decl, param_ipa_cp_eval_threshold);
-  if (count_sum > profile_count::zero ())
+  if (count_sum.nonzero_p ())
 {
-  gcc_assert (base_count > profile_count::zero ());
+  gcc_assert (base_count.nonzero_p ());
   sreal factor = count_sum.probability_in (base_count).to_sreal ();
   sreal evaluation = (time_benefit * factor) / size_cost;
   evaluation = incorporate_penalties (node, info, evaluation);
--
2.25.1


RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator support.

2022-09-19 Thread Eugene Rozenfeld via Gcc-patches
Hi Jason,

Do you have any more feedback for this patch?

Thanks,

Eugene

-Original Message-
From: Eugene Rozenfeld 
Sent: Thursday, September 08, 2022 5:46 PM
To: Jason Merrill ; gcc-patches@gcc.gnu.org
Cc: Andi Kleen ; Jan Hubicka 
Subject: RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
support.

Jason,

Thank for your suggestion. The patch is updated (attached).

Thanks,

Eugene

-Original Message-
From: Jason Merrill 
Sent: Thursday, September 08, 2022 10:26 AM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Cc: Andi Kleen ; Jan Hubicka 
Subject: Re: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
support.

On 9/1/22 18:22, Eugene Rozenfeld wrote:
> Jason,
> 
> I made another small change in addressing your feedback (attached).
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Gcc-patches
>  On Behalf Of 
> Eugene Rozenfeld via Gcc-patches
> Sent: Thursday, September 01, 2022 1:49 PM
> To: Jason Merrill ; gcc-patches@gcc.gnu.org
> Cc: Andi Kleen ; Jan Hubicka 
> Subject: RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
> support.
> 
> Jason,
> 
> Thank you for your review. You are correct, we only need to check 
> has_discriminator for the statement that's on the same line.
> I updated the patch (attached).
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Jason Merrill 
> Sent: Thursday, August 18, 2022 6:55 PM
> To: Eugene Rozenfeld ; 
> gcc-patches@gcc.gnu.org
> Cc: Andi Kleen ; Jan Hubicka 
> Subject: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
> support.
> 
> On 8/3/22 17:25, Eugene Rozenfeld wrote:
>> One more ping for this patch
>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
>> gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=
>> 0
>> 5%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da8
>> 1
>> 85dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63796470932569195
>> 1
>> %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6
>> I
>> k1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K%2BMx6jelnED3n%2Be2d
>> T
>> DYAPOqZZ8Zlsd2%2FyPJ0qib5%2FM%3Dreserved=0
>>
>> CC Jason since this changes discriminators emitted in dwarf.
>>
>> Thanks,
>>
>> Eugene
>>
>> -Original Message-
>> From: Eugene Rozenfeld
>> Sent: Monday, June 27, 2022 12:45 PM
>> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
>> Hubicka 
>> Subject: RE: [PING][PATCH] Add instruction level discriminator support.
>>
>> Another ping for 
>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=05%7C01%7Ceugene.rozenfeld%40microsoft.com%7Cc7c9fab519184eee0bb808da91bf2fea%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637982547579085499%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=Kj3YWJrDqjX%2B0Ml3At5P8NRWc1Xs6mbI%2F1vCk9%2BLaQs%3Dreserved=0
>>  .
>>
>> I got a review from Andi 
>> (https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596549.htmldata=05%7C01%7Ceugene.rozenfeld%40microsoft.com%7Cc7c9fab519184eee0bb808da91bf2fea%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637982547579085499%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=NBzFtD0mH7EYKxsVWfqZgSpQmUG18SIt01XRYUlwwW4%3Dreserved=0)
>>  but I also need a review from someone who can approve the changes.
>>
>> Thanks,
>>
>> Eugene
>>
>> -Original Message-
>> From: Eugene Rozenfeld
>> Sent: Friday, June 10, 2022 12:03 PM
>> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
>> Hubicka 
>> Subject: [PING][PATCH] Add instruction level discriminator support.
>>
>> Hello,
>>
>> I'd like to ping this patch:
>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
>> gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=
>> 0
>> 5%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da8
>> 1
>> 85dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63796470932569195
>> 1
>> %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6
>> I
>> k1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K%2BMx6jelnED3n%2Be2d
>> T
>> DYAPOqZZ8Zlsd2%2FyPJ0qib5%2FM%3Dreserved=0
>>
>> Thanks,
>>
>> Eugene
>>
>> -Original Message-
>> From: Gcc-patches
>>  O

[PATCH][PUSHED] Fix for an AutoFDO test.

2022-09-16 Thread Eugene Rozenfeld via Gcc-patches
After 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=c17975d81aaed49ff759c20c68b31304a6953d58
the expected inlining in indir-call-prof-2.c test happens during afdo phase 
instead of einline.
This patch adjusts the test accordingly.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/indir-call-prof-2.c: Fix dg-final-use-autofdo.
---
 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c 
b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c
index 594c3f34d57..1d64d9f3f62 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c
@@ -1,4 +1,4 @@
-/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-profile-optimized 
-fdump-tree-einline-optimized" } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-ipa-profile-optimized 
-fdump-ipa-afdo-optimized" } */
 volatile int one;
 static int
 add1 (int val)
@@ -31,5 +31,5 @@ main (void)
 }
 /* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct call.* 
add1 .will resolve by ipa-profile" "profile"} } */
 /* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct call.* 
sub1 .will resolve by ipa-profile" "profile"} } */
-/* { dg-final-use-autofdo { scan-tree-dump "Inlining add1/1 into main/4." 
"einline"} } */
-/* { dg-final-use-autofdo { scan-tree-dump "Inlining sub1/2 into main/4." 
"einline"} } */
+/* { dg-final-use-autofdo { scan-ipa-dump "Inlining add1/1 into main/4." 
"afdo"} } */
+/* { dg-final-use-autofdo { scan-ipa-dump "Inlining sub1/2 into main/4." 
"afdo"} } */
--
2.25.1


RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator support.

2022-09-08 Thread Eugene Rozenfeld via Gcc-patches
Jason,

Thank for your suggestion. The patch is updated (attached).

Thanks,

Eugene

-Original Message-
From: Jason Merrill  
Sent: Thursday, September 08, 2022 10:26 AM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Cc: Andi Kleen ; Jan Hubicka 
Subject: Re: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
support.

On 9/1/22 18:22, Eugene Rozenfeld wrote:
> Jason,
> 
> I made another small change in addressing your feedback (attached).
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Gcc-patches 
>  On Behalf Of 
> Eugene Rozenfeld via Gcc-patches
> Sent: Thursday, September 01, 2022 1:49 PM
> To: Jason Merrill ; gcc-patches@gcc.gnu.org
> Cc: Andi Kleen ; Jan Hubicka 
> Subject: RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
> support.
> 
> Jason,
> 
> Thank you for your review. You are correct, we only need to check 
> has_discriminator for the statement that's on the same line.
> I updated the patch (attached).
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Jason Merrill 
> Sent: Thursday, August 18, 2022 6:55 PM
> To: Eugene Rozenfeld ; 
> gcc-patches@gcc.gnu.org
> Cc: Andi Kleen ; Jan Hubicka 
> Subject: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
> support.
> 
> On 8/3/22 17:25, Eugene Rozenfeld wrote:
>> One more ping for this patch
>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
>> gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=
>> 0
>> 5%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da8
>> 1
>> 85dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63796470932569195
>> 1 
>> %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6
>> I 
>> k1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K%2BMx6jelnED3n%2Be2d
>> T
>> DYAPOqZZ8Zlsd2%2FyPJ0qib5%2FM%3Dreserved=0
>>
>> CC Jason since this changes discriminators emitted in dwarf.
>>
>> Thanks,
>>
>> Eugene
>>
>> -Original Message-
>> From: Eugene Rozenfeld
>> Sent: Monday, June 27, 2022 12:45 PM
>> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
>> Hubicka 
>> Subject: RE: [PING][PATCH] Add instruction level discriminator support.
>>
>> Another ping for 
>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=05%7C01%7Ceugene.rozenfeld%40microsoft.com%7Cc7c9fab519184eee0bb808da91bf2fea%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637982547579085499%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=Kj3YWJrDqjX%2B0Ml3At5P8NRWc1Xs6mbI%2F1vCk9%2BLaQs%3Dreserved=0
>>  .
>>
>> I got a review from Andi 
>> (https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596549.htmldata=05%7C01%7Ceugene.rozenfeld%40microsoft.com%7Cc7c9fab519184eee0bb808da91bf2fea%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637982547579085499%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=NBzFtD0mH7EYKxsVWfqZgSpQmUG18SIt01XRYUlwwW4%3Dreserved=0)
>>  but I also need a review from someone who can approve the changes.
>>
>> Thanks,
>>
>> Eugene
>>
>> -Original Message-
>> From: Eugene Rozenfeld
>> Sent: Friday, June 10, 2022 12:03 PM
>> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
>> Hubicka 
>> Subject: [PING][PATCH] Add instruction level discriminator support.
>>
>> Hello,
>>
>> I'd like to ping this patch:
>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
>> gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=
>> 0
>> 5%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da8
>> 1
>> 85dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63796470932569195
>> 1 
>> %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6
>> I 
>> k1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K%2BMx6jelnED3n%2Be2d
>> T
>> DYAPOqZZ8Zlsd2%2FyPJ0qib5%2FM%3Dreserved=0
>>
>> Thanks,
>>
>> Eugene
>>
>> -Original Message-
>> From: Gcc-patches
>>  On Behalf Of 
>> Eugene Rozenfeld via Gcc-patches
>> Sent: Thursday, June 02, 2022 12:22 AM
>> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
>> Hubicka 
>> Subject: [EXTERNAL] [PATCH] Add instruction level discriminator support.
>>
>> This is the first in a series of patches to enable di

RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator support.

2022-09-01 Thread Eugene Rozenfeld via Gcc-patches
Jason,

I made another small change in addressing your feedback (attached).

Thanks,

Eugene

-Original Message-
From: Gcc-patches  On 
Behalf Of Eugene Rozenfeld via Gcc-patches
Sent: Thursday, September 01, 2022 1:49 PM
To: Jason Merrill ; gcc-patches@gcc.gnu.org
Cc: Andi Kleen ; Jan Hubicka 
Subject: RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
support.

Jason,

Thank you for your review. You are correct, we only need to check 
has_discriminator for the statement that's on the same line.
I updated the patch (attached).

Thanks,

Eugene

-Original Message-
From: Jason Merrill 
Sent: Thursday, August 18, 2022 6:55 PM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Cc: Andi Kleen ; Jan Hubicka 
Subject: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
support.

On 8/3/22 17:25, Eugene Rozenfeld wrote:
> One more ping for this patch
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
> gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=0
> 5%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da81
> 85dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637964709325691951
> %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6I
> k1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K%2BMx6jelnED3n%2Be2dT
> DYAPOqZZ8Zlsd2%2FyPJ0qib5%2FM%3Dreserved=0
> 
> CC Jason since this changes discriminators emitted in dwarf.
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Eugene Rozenfeld
> Sent: Monday, June 27, 2022 12:45 PM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: RE: [PING][PATCH] Add instruction level discriminator support.
> 
> Another ping for 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=05%7C01%7CEugene.Rozenfeld%40microsoft.com%7Cf217ebc45428465857bd08da8c5b6fb2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637976621612503972%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=b0kTdzWRyiwdtcEFasyNlKv1vj%2FqwnipN3776C7xWcg%3Dreserved=0
>  .
> 
> I got a review from Andi 
> (https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596549.htmldata=05%7C01%7CEugene.Rozenfeld%40microsoft.com%7Cf217ebc45428465857bd08da8c5b6fb2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637976621612503972%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=qxjBUCcGiKXtR4%2BOJq%2FFQN11C2M6BBurTguOBOjWJDw%3Dreserved=0)
>  but I also need a review from someone who can approve the changes.
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Eugene Rozenfeld
> Sent: Friday, June 10, 2022 12:03 PM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: [PING][PATCH] Add instruction level discriminator support.
> 
> Hello,
> 
> I'd like to ping this patch: 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
> gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=0
> 5%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da81
> 85dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637964709325691951
> %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6I
> k1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K%2BMx6jelnED3n%2Be2dT
> DYAPOqZZ8Zlsd2%2FyPJ0qib5%2FM%3Dreserved=0
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Gcc-patches
>  On Behalf Of 
> Eugene Rozenfeld via Gcc-patches
> Sent: Thursday, June 02, 2022 12:22 AM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: [EXTERNAL] [PATCH] Add instruction level discriminator support.
> 
> This is the first in a series of patches to enable discriminator support in 
> AutoFDO.
> 
> This patch switches to tracking discriminators per statement/instruction 
> instead of per basic block. Tracking per basic block was problematic since 
> not all statements in a basic block needed a discriminator and, also, later 
> optimizations could move statements between basic blocks making correlation 
> during AutoFDO compilation unreliable. Tracking per statement also allows us 
> to assign different discriminators to multiple function calls in the same 
> basic block. A subsequent patch will add that support.
> 
> The idea of this patch is based on commit 
> 4c311d95cf6d9519c3c20f641cc77af7df491fdf
> by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different 
> approach. In Dehao's work special (normally unused) location ids and side 
> tables were used to keep track of locations with discriminators. Things have

RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator support.

2022-09-01 Thread Eugene Rozenfeld via Gcc-patches
Jason,

Thank you for your review. You are correct, we only need to check 
has_discriminator for the statement that's on the same line.
I updated the patch (attached).

Thanks,

Eugene

-Original Message-
From: Jason Merrill  
Sent: Thursday, August 18, 2022 6:55 PM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Cc: Andi Kleen ; Jan Hubicka 
Subject: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
support.

On 8/3/22 17:25, Eugene Rozenfeld wrote:
> One more ping for this patch 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
> gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=0
> 5%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da81
> 85dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637964709325691951
> %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6I
> k1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K%2BMx6jelnED3n%2Be2dT
> DYAPOqZZ8Zlsd2%2FyPJ0qib5%2FM%3Dreserved=0
> 
> CC Jason since this changes discriminators emitted in dwarf.
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Eugene Rozenfeld
> Sent: Monday, June 27, 2022 12:45 PM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: RE: [PING][PATCH] Add instruction level discriminator support.
> 
> Another ping for 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=05%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da8185dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637964709325691951%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K%2BMx6jelnED3n%2Be2dTDYAPOqZZ8Zlsd2%2FyPJ0qib5%2FM%3Dreserved=0
>  .
> 
> I got a review from Andi 
> (https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596549.htmldata=05%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da8185dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637964709325691951%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=se6x1LD0GQyFz%2B28gdVqsye3Aw8kPoMRhVQO1BSPg6I%3Dreserved=0)
>  but I also need a review from someone who can approve the changes.
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Eugene Rozenfeld
> Sent: Friday, June 10, 2022 12:03 PM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: [PING][PATCH] Add instruction level discriminator support.
> 
> Hello,
> 
> I'd like to ping this patch: 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
> gnu.org%2Fpipermail%2Fgcc-patches%2F2022-June%2F596065.htmldata=0
> 5%7C01%7Ceugene.rozenfeld%40microsoft.com%7C3e9ebe6dd5b14fe4471808da81
> 85dc68%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637964709325691951
> %7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6I
> k1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=K%2BMx6jelnED3n%2Be2dT
> DYAPOqZZ8Zlsd2%2FyPJ0qib5%2FM%3Dreserved=0
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Gcc-patches 
>  On Behalf Of 
> Eugene Rozenfeld via Gcc-patches
> Sent: Thursday, June 02, 2022 12:22 AM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: [EXTERNAL] [PATCH] Add instruction level discriminator support.
> 
> This is the first in a series of patches to enable discriminator support in 
> AutoFDO.
> 
> This patch switches to tracking discriminators per statement/instruction 
> instead of per basic block. Tracking per basic block was problematic since 
> not all statements in a basic block needed a discriminator and, also, later 
> optimizations could move statements between basic blocks making correlation 
> during AutoFDO compilation unreliable. Tracking per statement also allows us 
> to assign different discriminators to multiple function calls in the same 
> basic block. A subsequent patch will add that support.
> 
> The idea of this patch is based on commit 
> 4c311d95cf6d9519c3c20f641cc77af7df491fdf
> by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different 
> approach. In Dehao's work special (normally unused) location ids and side 
> tables were used to keep track of locations with discriminators. Things have 
> changed since then and I don't think we have unused location ids anymore. 
> Instead, I made discriminators a part of ad-hoc locations.
> 
> The difference from Dehao's work also includes support for discriminator 
> reading/writing in lto streaming and in modules.
> 
> Tested on x86_64-pc-linux-gnu.

> @@ -1190,12 +1217,12 

RE: [EXTERNAL] [PATCH] contrib: modernize gen_autofdo_event.py

2022-08-26 Thread Eugene Rozenfeld via Gcc-patches
The patch is approved.

Eugene

-Original Message-
From: Andi Kleen  
Sent: Friday, August 05, 2022 11:29 PM
To: Eugene Rozenfeld ; Xi Ruoyao 
; gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] [PATCH] contrib: modernize gen_autofdo_event.py


On 8/6/2022 1:07 AM, Eugene Rozenfeld wrote:
> The changes look good to me. Also adding Andi, the author of the script.


Looks all good to me too.


-Andi




RE: [EXTERNAL] [PATCH] contrib: modernize gen_autofdo_event.py

2022-08-05 Thread Eugene Rozenfeld via Gcc-patches
The changes look good to me. Also adding Andi, the author of the script.

Eugene

-Original Message-
From: Gcc-patches  On 
Behalf Of Xi Ruoyao via Gcc-patches
Sent: Sunday, June 26, 2022 11:15 PM
To: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] [PATCH] contrib: modernize gen_autofdo_event.py

Python 2 has been EOL'ed for two years.  egrep has been deprecated for many 
years and the next grep release will start to print warning if it is used.

-E option may be unsupported by some non-POSIX grep implementations, but 
gcc-auto-profile won't work on non-Linux systems anyway.

contrib/ChangeLog:

* gen_autofdo_event.py: Port to Python 3, and use grep -E
instead of egrep.

gcc/ChangeLog:

* config/i386/gcc-auto-profile: Regenerate.
---
 contrib/gen_autofdo_event.py | 80 
 gcc/config/i386/gcc-auto-profile | 31 +++--
 2 files changed, 57 insertions(+), 54 deletions(-)

diff --git a/contrib/gen_autofdo_event.py b/contrib/gen_autofdo_event.py index 
1eb6f1d6d85..7da2876530d 100755
--- a/contrib/gen_autofdo_event.py
+++ b/contrib/gen_autofdo_event.py
@@ -1,4 +1,4 @@
-#!/usr/bin/python
+#!/usr/bin/python3
 # Generate Intel taken branches Linux perf event script for autofdo profiling.
 
 # Copyright (C) 2016 Free Software Foundation, Inc.
@@ -26,18 +26,19 @@
 # Requires internet (https) access. This may require setting up a proxy  # 
with export https_proxy=...
 #
-import urllib2
+import urllib.request
 import sys
 import json
 import argparse
 import collections
+import os
 
 baseurl = "https://download.01.org/perfmon;
 
-target_events = (u'BR_INST_RETIRED.NEAR_TAKEN',
- u'BR_INST_EXEC.TAKEN',
- u'BR_INST_RETIRED.TAKEN_JCC',
- u'BR_INST_TYPE_RETIRED.COND_TAKEN')
+target_events = ('BR_INST_RETIRED.NEAR_TAKEN',
+ 'BR_INST_EXEC.TAKEN',
+ 'BR_INST_RETIRED.TAKEN_JCC',
+ 'BR_INST_TYPE_RETIRED.COND_TAKEN')
 
 ap = argparse.ArgumentParser()
 ap.add_argument('--all', '-a', help='Print for all CPUs', action='store_true') 
@@ -71,47 +72,46 @@ def get_cpustr():
 return "%s-%d-%X" % tuple(cpu)[:3]
 
 def find_event(eventurl, model):
-print >>sys.stderr, "Downloading", eventurl
-u = urllib2.urlopen(eventurl)
+print("Downloading", eventurl, file = sys.stderr)
+u = urllib.request.urlopen(eventurl)
 events = json.loads(u.read())
 u.close()
 
 found = 0
 for j in events:
-if j[u'EventName'] in target_events:
-event = "cpu/event=%s,umask=%s/" % (j[u'EventCode'], j[u'UMask'])
-if u'PEBS' in j and j[u'PEBS'] > 0:
+if j['EventName'] in target_events:
+event = "cpu/event=%s,umask=%s/" % (j['EventCode'], j['UMask'])
+if 'PEBS' in j and int(j['PEBS']) > 0:
 event += "p"
 if args.script:
 eventmap[event].append(model)
 else:
-print j[u'EventName'], "event for model", model, "is", event
+print(j['EventName'], "event for model", model, "is", 
+ event)
 found += 1
 return found
 
 if not args.all:
-cpu = get_cpu_str()
+cpu = get_cpustr()
 if not cpu:
 sys.exit("Unknown CPU type")
 
 url = baseurl + "/mapfile.csv"
-print >>sys.stderr, "Downloading", url
-u = urllib2.urlopen(url)
+print("Downloading", url, file = sys.stderr) u = 
+urllib.request.urlopen(url)
 found = 0
 cpufound = 0
 for j in u:
-n = j.rstrip().split(',')
+n = j.rstrip().decode().split(',')
 if len(n) >= 4 and (args.all or n[0] == cpu) and n[3] == "core":
-if args.all:
-components = n[0].split("-")
-model = components[2]
-model = int(model, 16)
+components = n[0].split("-")
+model = components[2]
+model = int(model, 16)
 cpufound += 1
 found += find_event(baseurl + n[2], model)
 u.close()
 
 if args.script:
-print '''#!/bin/sh
+print('''#!/bin/sh
 # Profile workload for gcc profile feedback (autofdo) using Linux perf.
 # Auto generated. To regenerate for new CPUs run  # 
contrib/gen_autofdo_event.py --script --all in gcc source @@ -146,27 +146,27 @@ 
if grep -q hypervisor /proc/cpuinfo ; then
   echo >&2 "Warning: branch profiling may not be functional in VMs"
 fi
 
-case `egrep -q "^cpu family\s*: 6" /proc/cpuinfo &&
-  egrep "^model\s*:" /proc/cpuinfo | head -n1` in'''
-for event, mod in eventmap.iteritems():
+case `grep -E -q "^cpu family\s*: 6" /proc/cpuinfo &&
+  grep -E "^model\s*:" /proc/cpuinfo | head -n1` in''')
+for event, mod in eventmap.items():
 for m in mod[:-1]:
-print "model*:\ %s|\\" % m
-print 'model*:\ %s) E="%s$FLAGS" ;;' % (mod[-1], event)
-print '''*)
+print("model*:\ %s|\\" % m)
+print('model*:\ %s) E="%s$FLAGS" ;;' % (mod[-1], event))
+print('''*)
 echo >&2 "Unknown CPU. Run contrib/gen_autofdo_event.py 

[committed] MAINTAINERS: Add myself as AutoFDO maintainer

2022-08-04 Thread Eugene Rozenfeld via Gcc-patches
ChangeLog:

* MAINTAINERS: Add myself as AutoFDO maintainer.
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1a37f4419b9..02ced0c43aa 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -239,6 +239,7 @@ tree-ssaAndrew MacLeod  

 tree browser/unparser  Sebastian Pop   
 scev, data dependence  Sebastian Pop   
 profile feedback   Jan Hubicka 
+AutoFDOEugene Rozenfeld
 reload Ulrich Weigand  
 RTL optimizers Eric Botcazou   
 instruction combiner   Segher Boessenkool  
@@ -603,7 +604,6 @@ Craig Rodrigues 

 Erven Rohou
 Ira Rosen  
 Yvan Roux  
-Eugene Rozenfeld   
 Silvius Rus
 Matthew Sachs  
 Ankur Saini
-- 
2.25.1


RE: [PING][PATCH] Add instruction level discriminator support.

2022-08-03 Thread Eugene Rozenfeld via Gcc-patches
One more ping for this patch 
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html

CC Jason since this changes discriminators emitted in dwarf.

Thanks,

Eugene

-Original Message-
From: Eugene Rozenfeld 
Sent: Monday, June 27, 2022 12:45 PM
To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan Hubicka 

Subject: RE: [PING][PATCH] Add instruction level discriminator support.

Another ping for 
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html .

I got a review from Andi 
(https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596549.html) but I also 
need a review from someone who can approve the changes.

Thanks,

Eugene

-Original Message-
From: Eugene Rozenfeld 
Sent: Friday, June 10, 2022 12:03 PM
To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan Hubicka 

Subject: [PING][PATCH] Add instruction level discriminator support.

Hello,

I'd like to ping this patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html

Thanks,

Eugene

-Original Message-
From: Gcc-patches  On 
Behalf Of Eugene Rozenfeld via Gcc-patches
Sent: Thursday, June 02, 2022 12:22 AM
To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan Hubicka 

Subject: [EXTERNAL] [PATCH] Add instruction level discriminator support.

This is the first in a series of patches to enable discriminator support in 
AutoFDO.

This patch switches to tracking discriminators per statement/instruction 
instead of per basic block. Tracking per basic block was problematic since not 
all statements in a basic block needed a discriminator and, also, later 
optimizations could move statements between basic blocks making correlation 
during AutoFDO compilation unreliable. Tracking per statement also allows us to 
assign different discriminators to multiple function calls in the same basic 
block. A subsequent patch will add that support.

The idea of this patch is based on commit 
4c311d95cf6d9519c3c20f641cc77af7df491fdf
by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different 
approach. In Dehao's work special (normally unused) location ids and side 
tables were used to keep track of locations with discriminators. Things have 
changed since then and I don't think we have unused location ids anymore. 
Instead, I made discriminators a part of ad-hoc locations.

The difference from Dehao's work also includes support for discriminator 
reading/writing in lto streaming and in modules.

Tested on x86_64-pc-linux-gnu.


0001-Add-instruction-level-discriminator-support.patch
Description: 0001-Add-instruction-level-discriminator-support.patch


RE: [EXTERNAL] Re: State of AutoFDO in GCC

2022-07-27 Thread Eugene Rozenfeld via Gcc
Hi Xionghu,

Yes, I'm pretty sure both of the fixes you mentioned are applicable to GCC 10. 
I'm not sure what the bar is for backporting fixes.
Jan, can you comment on that?
The fixes that Xionghu mentioned are:

Fix indirect call inlining with AutoFDO

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=285aa6895d479bed8e72ad363290846645b6faa0
AutoFDO: Don't try to promote indirect calls that result in recursive 
direct calls

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ba125745d9e9fe90a18a2af8701b3269c5fdd468

Thanks,

Eugene

-Original Message-
From: Xionghu Luo  
Sent: Tuesday, July 26, 2022 6:41 PM
To: Eugene Rozenfeld ; Andi Kleen 
; Joseph Myers ; Jan Hubicka 
; gcc 
Subject: [EXTERNAL] Re: State of AutoFDO in GCC

[You don't often get email from yinyuefen...@gmail.com. Learn why this is 
important at https://aka.ms/LearnAboutSenderIdentification ]

On 2022/7/27 09:31, Xionghu Luo wrote:
>
>
> On 2022/7/27 04:12, Eugene Rozenfeld via Gcc wrote:
>> Hello GCC community.
>>
>> I started this thread on the state of AutoFDO in GCC more than a year 
>> ago. Here is the first message in the thread:
>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc
>> .gnu.org%2Fpipermail%2Fgcc%2F2021-April%2F235860.htmldata=05%7C0
>> 1%7CEugene.Rozenfeld%40microsoft.com%7Ccda1f6374c584732950e08da6f7126
>> 83%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637944829019804556%7C
>> Unknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1
>> haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=t%2B94vOYKb8SDZ9i86Fnin
>> iYXt7LTGXqbrbftH1TVqEs%3Dreserved=0
>>
>> Since then I committed a number of patches to revive AutoFDO in GCC:
>>
>> Fix a typo in an AutoFDO error
>> string<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2
>> F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D23691ddd3aa
>> 3ffe55892b2bff54f9a15a89de2b4data=05%7C01%7CEugene.Rozenfeld%40m
>> icrosoft.com%7Ccda1f6374c584732950e08da6f712683%7C72f988bf86f141af91a
>> b2d7cd011db47%7C1%7C0%7C637944829019804556%7CUnknown%7CTWFpbGZsb3d8ey
>> JWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C30
>> 00%7C%7C%7Csdata=4WD%2FgKxeXEFz7V6pcqDsYxvAbSOP3ldxXHb%2FoMxV5kc
>> %3Dreserved=0>
>>
>> Update gen_autofdo_event.py and
>> gcc-auto-profile.<https://nam06.safelinks.protection.outlook.com/?url
>> =https%3A%2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D
>> 01d402c5e0ac1ddf5618bbe316b50067625fda46data=05%7C01%7CEugene.Ro
>> zenfeld%40microsoft.com%7Ccda1f6374c584732950e08da6f712683%7C72f988bf
>> 86f141af91ab2d7cd011db47%7C1%7C0%7C637944829019804556%7CUnknown%7CTWF
>> pbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6
>> Mn0%3D%7C3000%7C%7C%7Csdata=TA%2FhV%2FKsRF3MUZmUL%2FpTWbESdx03VK
>> PIudXrrBjAMvQ%3Dreserved=0>
>>
>> Fixes for AutoFDO
>> tests<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F
>> %2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3Df9ad3d5339fa
>> aaed6e15a7b27d90fbc66eb72f37data=05%7C01%7CEugene.Rozenfeld%40mi
>> crosoft.com%7Ccda1f6374c584732950e08da6f712683%7C72f988bf86f141af91ab
>> 2d7cd011db47%7C1%7C0%7C637944829019804556%7CUnknown%7CTWFpbGZsb3d8eyJ
>> WIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C300
>> 0%7C%7C%7Csdata=AvWCTytUxlcXq%2BpjkWJmhkmV0nH%2Fn0CzC4alU9XA9%2F
>> 4%3Dreserved=0>
>>
>> Fix indir-call-prof-2.c with
>> AutoFDO<https://nam06.safelinks.protection.outlook.com/?url=https%3A%
>> 2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D0ed093c7c3
>> f755bc1cd80e5186abeb2f5c50ee0cdata=05%7C01%7CEugene.Rozenfeld%40
>> microsoft.com%7Ccda1f6374c584732950e08da6f712683%7C72f988bf86f141af91
>> ab2d7cd011db47%7C1%7C0%7C637944829019804556%7CUnknown%7CTWFpbGZsb3d8e
>> yJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3
>> 000%7C%7C%7Csdata=3dlTtTfe4XOOm6BEy5YLWG0l3WlQdfbCyFiXs3Q7W1I%3D
>> reserved=0>
>>
>> Fixes for AutoFDO
>> testing<https://nam06.safelinks.protection.outlook.com/?url=https%3A%
>> 2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D9265b37853
>> 1391498ec1727f67a45da72a6c07e9data=05%7C01%7CEugene.Rozenfeld%40
>> microsoft.com%7Ccda1f6374c584732950e08da6f712683%7C72f988bf86f141af91
>> ab2d7cd011db47%7C1%7C0%7C637944829019804556%7CUnknown%7CTWFpbGZsb3d8e
>> yJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3
>> 000%7C%7C%7Csdata=ZWSIQ0jb6t2DZQpDb%2F7e5FqKM6KKskM%2FAYzLpxbUkp
>> 4%3Dreserved=0>
>>
>> Fix indirect call inlining with
>> AutoFDO<https://nam06.safelinks.protection.outlook.com/?ur

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2022-07-27 Thread Eugene Rozenfeld via Gcc
Hi Jan,

Thank you for your reply. I did have you on the TO line in my latest patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html

That's the patch I need a review on.

I'm looking forward to co-maintaining AutoFDO with you.

Thanks,

Eugene

-Original Message-
From: Jan Hubicka  
Sent: Wednesday, July 27, 2022 12:27 AM
To: David Edelsohn 
Cc: Eugene Rozenfeld ; Martin Liska 
; Xinliang David Li ; gcc 
; Andi Kleen ; Joseph Myers 

Subject: [EXTERNAL] Re: State of AutoFDO in GCC

> On Tue, Jul 26, 2022 at 4:13 PM Eugene Rozenfeld via Gcc 
>  wrote:
> >
> > Hello GCC community.
> >
> > I started this thread on the state of AutoFDO in GCC more than a 
> > year ago. Here is the first message in the thread: 
> > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgc
> > c.gnu.org%2Fpipermail%2Fgcc%2F2021-April%2F235860.htmldata=05%7
> > C01%7CEugene.Rozenfeld%40microsoft.com%7Cfe5184091a18487fd92d08da6fa
> > 1619e%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63794503618637077
> > 0%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTi
> > I6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=9%2Bnv2ShWxKh88K%
> > 2BsOeqPgQX3lOCJQ0lnF%2F7SUs4K4uI%3Dreserved=0
> >
> > Since then I committed a number of patches to revive AutoFDO in GCC:
> >
> > Fix a typo in an AutoFDO error 
> > string<https://nam06.safelinks.protection.outlook.com/?url=https%3A%
> > 2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D23691ddd3
> > aa3ffe55892b2bff54f9a15a89de2b4data=05%7C01%7CEugene.Rozenfeld%
> > 40microsoft.com%7Cfe5184091a18487fd92d08da6fa1619e%7C72f988bf86f141a
> > f91ab2d7cd011db47%7C1%7C0%7C637945036186370770%7CUnknown%7CTWFpbGZsb
> > 3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3
> > D%7C3000%7C%7C%7Csdata=qkecfE9uH5gy91vILQQlCk9RpExqPZxO4q02wiN1
> > EFw%3Dreserved=0> Update gen_autofdo_event.py and 
> > gcc-auto-profile.<https://nam06.safelinks.protection.outlook.com/?ur
> > l=https%3A%2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%
> > 3D01d402c5e0ac1ddf5618bbe316b50067625fda46data=05%7C01%7CEugene
> > .Rozenfeld%40microsoft.com%7Cfe5184091a18487fd92d08da6fa1619e%7C72f9
> > 88bf86f141af91ab2d7cd011db47%7C1%7C0%7C637945036186370770%7CUnknown%
> > 7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLC
> > JXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=E52qVFlfdfFGnW9yDsBNhh4k2ey8g
> > 3aJEGzH40MuSOc%3Dreserved=0> Fixes for AutoFDO 
> > tests<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2
> > F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3Df9ad3d5339
> > faaaed6e15a7b27d90fbc66eb72f37data=05%7C01%7CEugene.Rozenfeld%4
> > 0microsoft.com%7Cfe5184091a18487fd92d08da6fa1619e%7C72f988bf86f141af
> > 91ab2d7cd011db47%7C1%7C0%7C637945036186370770%7CUnknown%7CTWFpbGZsb3
> > d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D
> > %7C3000%7C%7C%7Csdata=XYlFoY3OTTXHp18O1v8BY47A17NyNPXvUWWYsVnbD
> > 0U%3Dreserved=0> Fix indir-call-prof-2.c with 
> > AutoFDO<https://nam06.safelinks.protection.outlook.com/?url=https%3A
> > %2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D0ed093c7
> > c3f755bc1cd80e5186abeb2f5c50ee0cdata=05%7C01%7CEugene.Rozenfeld
> > %40microsoft.com%7Cfe5184091a18487fd92d08da6fa1619e%7C72f988bf86f141
> > af91ab2d7cd011db47%7C1%7C0%7C637945036186370770%7CUnknown%7CTWFpbGZs
> > b3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%
> > 3D%7C3000%7C%7C%7Csdata=sLftY6hjvzSuE9ZgkGmXZLDpRMjlDo%2FEAyDyP
> > CviY5Q%3Dreserved=0> Fixes for AutoFDO 
> > testing<https://nam06.safelinks.protection.outlook.com/?url=https%3A
> > %2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D9265b378
> > 531391498ec1727f67a45da72a6c07e9data=05%7C01%7CEugene.Rozenfeld
> > %40microsoft.com%7Cfe5184091a18487fd92d08da6fa1619e%7C72f988bf86f141
> > af91ab2d7cd011db47%7C1%7C0%7C637945036186370770%7CUnknown%7CTWFpbGZs
> > b3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%
> > 3D%7C3000%7C%7C%7Csdata=lXZ%2F%2FbcfYD%2BQyIiXMAaCxOujEAfDXSY1p
> > 78kUb2md7w%3Dreserved=0> Fix indirect call inlining with 
> > AutoFDO<https://nam06.safelinks.protection.outlook.com/?url=https%3A
> > %2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D285aa689
> > 5d479bed8e72ad363290846645b6faa0data=05%7C01%7CEugene.Rozenfeld
> > %40microsoft.com%7Cfe5184091a18487fd92d08da6fa1619e%7C72f988bf86f141
> > af91ab2d7cd011db47%7C1%7C0%7C637945036186370770%7CUnknown%7CTWFpbGZs
> > b3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%
> > 3D%7C3000%7C%7C%7Csdata=ypoF

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2022-07-27 Thread Eugene Rozenfeld via Gcc
Hi David,

Thank you for your offer to take on the responsibility of maintaining the 
AutoFDO-specific portions of the code. I'll be happy to do that with Jan's and 
other GGC maintainers' help.
I hope more people will start using and contributing to AutoFDO.

Thanks,

Eugene

-Original Message-
From: David Edelsohn  
Sent: Tuesday, July 26, 2022 3:38 PM
To: Eugene Rozenfeld ; Jan Hubicka 
; Martin Liska ; Xinliang David Li 

Cc: Andi Kleen ; Joseph Myers ; 
gcc 
Subject: [EXTERNAL] Re: State of AutoFDO in GCC

[You don't often get email from dje@gmail.com. Learn why this is important 
at https://aka.ms/LearnAboutSenderIdentification ]

On Tue, Jul 26, 2022 at 4:13 PM Eugene Rozenfeld via Gcc  
wrote:
>
> Hello GCC community.
>
> I started this thread on the state of AutoFDO in GCC more than a year 
> ago. Here is the first message in the thread: 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
> gnu.org%2Fpipermail%2Fgcc%2F2021-April%2F235860.htmldata=05%7C01%
> 7CEugene.Rozenfeld%40microsoft.com%7Ccabaec504f234632a03a08da6f5780d2%
> 7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637944718882043585%7CUnkn
> own%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwi
> LCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=5RoiLypTlkNRIZW8yAufYR4qiO573
> 0PADO%2BFsS6InIU%3Dreserved=0
>
> Since then I committed a number of patches to revive AutoFDO in GCC:
>
> Fix a typo in an AutoFDO error 
> string<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F
> %2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D23691ddd3aa3f
> fe55892b2bff54f9a15a89de2b4data=05%7C01%7CEugene.Rozenfeld%40micr
> osoft.com%7Ccabaec504f234632a03a08da6f5780d2%7C72f988bf86f141af91ab2d7
> cd011db47%7C1%7C0%7C637944718882199821%7CUnknown%7CTWFpbGZsb3d8eyJWIjo
> iMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%
> 7C%7Csdata=LRgW4qyh%2FAdrlKluUXnaDTFCNW8tnS1WX8bCs1WAT7s%3Dr
> eserved=0> Update gen_autofdo_event.py and 
> gcc-auto-profile.<https://nam06.safelinks.protection.outlook.com/?url=
> https%3A%2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D01
> d402c5e0ac1ddf5618bbe316b50067625fda46data=05%7C01%7CEugene.Rozen
> feld%40microsoft.com%7Ccabaec504f234632a03a08da6f5780d2%7C72f988bf86f1
> 41af91ab2d7cd011db47%7C1%7C0%7C637944718882199821%7CUnknown%7CTWFpbGZs
> b3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D
> %7C3000%7C%7C%7Csdata=oBJvTW6RrkY6uURmr0UK5gaNiVS1b8vFfIyOoqq8AkM
> %3Dreserved=0> Fixes for AutoFDO 
> tests<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%
> 2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3Df9ad3d5339faaa
> ed6e15a7b27d90fbc66eb72f37data=05%7C01%7CEugene.Rozenfeld%40micro
> soft.com%7Ccabaec504f234632a03a08da6f5780d2%7C72f988bf86f141af91ab2d7c
> d011db47%7C1%7C0%7C637944718882199821%7CUnknown%7CTWFpbGZsb3d8eyJWIjoi
> MC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7
> C%7Csdata=6evlP10%2BQJyUuVZ3u%2Fv%2FP9nSIsASLjtWETyqeBfnQ2k%3D
> p;reserved=0> Fix indir-call-prof-2.c with 
> AutoFDO<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2
> F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D0ed093c7c3f7
> 55bc1cd80e5186abeb2f5c50ee0cdata=05%7C01%7CEugene.Rozenfeld%40mic
> rosoft.com%7Ccabaec504f234632a03a08da6f5780d2%7C72f988bf86f141af91ab2d
> 7cd011db47%7C1%7C0%7C637944718882199821%7CUnknown%7CTWFpbGZsb3d8eyJWIj
> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C
> %7C%7Csdata=mtMd21By6r8DiIIXzk5ePHP0iU%2FHfnDQG8%2FXJbAr5qE%3D
> p;reserved=0> Fixes for AutoFDO 
> testing<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2
> F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D9265b3785313
> 91498ec1727f67a45da72a6c07e9data=05%7C01%7CEugene.Rozenfeld%40mic
> rosoft.com%7Ccabaec504f234632a03a08da6f5780d2%7C72f988bf86f141af91ab2d
> 7cd011db47%7C1%7C0%7C637944718882199821%7CUnknown%7CTWFpbGZsb3d8eyJWIj
> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C
> %7C%7Csdata=QnKrxt7W9wfTYY3Drjm%2FSn7D4yHwxjInvI8dj9KEIs4%3D
> reserved=0> Fix indirect call inlining with 
> AutoFDO<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2
> F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D285aa6895d47
> 9bed8e72ad363290846645b6faa0data=05%7C01%7CEugene.Rozenfeld%40mic
> rosoft.com%7Ccabaec504f234632a03a08da6f5780d2%7C72f988bf86f141af91ab2d
> 7cd011db47%7C1%7C0%7C637944718882199821%7CUnknown%7CTWFpbGZsb3d8eyJWIj
> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C
> %7C%7Csdata=695ieVftlFRwICzxxMmmIb9%2F%2FDBlYBy9jJn%2ByfzkhP0%3D&
> amp;reserved=0> Improve AutoFDO count propagation 
> algorithm<https://nam06.safelinks.protection.outlook.com/?url

Re: State of AutoFDO in GCC

2022-07-26 Thread Eugene Rozenfeld via Gcc
Hello GCC community.

I started this thread on the state of AutoFDO in GCC more than a year ago. Here 
is the first message in the thread: 
https://gcc.gnu.org/pipermail/gcc/2021-April/235860.html

Since then I committed a number of patches to revive AutoFDO in GCC:

Fix a typo in an AutoFDO error 
string
Update gen_autofdo_event.py and 
gcc-auto-profile.
Fixes for AutoFDO 
tests
Fix indir-call-prof-2.c with 
AutoFDO
Fixes for AutoFDO 
testing
Fix indirect call inlining with 
AutoFDO
Improve AutoFDO count propagation 
algorithm
AutoFDO: don't set param_early_inliner_max_iterations to 
10.
AutoFDO: Don't try to promote indirect calls that result in recursive direct 
calls
Fix profile count maintenance in vectorizer 
peeling.

I also made a number of fixes and improvements to create_gcov tool in 
https://github.com/google/autofdo .

AutoFDO in GCC is in a much better shape now.

I have a further set of patches that improve DWARF discriminator support in GCC 
and enable AutoFDO to use discriminators. It's based on commits in an old 
Google vendor branch as described in Andi's mail below
but uses a different approach for keeping track of per-instruction 
discriminators.

I submitted the first (and the biggest) of these patches almost 2 months ago on 
June 2: 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=5af22024f62f1f596a35d3c138d41d47d5697ca0
but only got a review from Andi 
(https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596549.html) who is not 
allowed to approve patches for commit. I pinged gcc-patches twice with no 
success.

I would appreciate help in getting a review on this patch so that I can get it 
committed and submit patches that depend on it.

Thank you,

Eugene

-Original Message-
From: Andi Kleen 
Sent: Monday, May 10, 2021 10:21 AM
To: Joseph Myers 
Cc: Jan Hubicka ; gcc ; Eugene Rozenfeld 

Subject: [EXTERNAL] Re: State of AutoFDO in GCC

On Mon, May 10, 2021 at 04:55:50PM +, Joseph Myers wrote:
> On Mon, 10 May 2021, Andi Kleen via Gcc wrote:
>
> > It's difficult to find now because it was a branch in the old SVN
> > that wasn't converted. Sadly the great git conversion was quite lossy.
>
> All branches and tags, including deleted ones, were converted (under
> not-fetched-by-default refs in some cases); the git repository has
> everything that might plausibly be useful, omitting only a few things
> that would have been meaningless to convert, such as mistaken branch
> creations in the root of the repository and the SVN hooks directory.
> Use "git ls-remote git://gcc.gnu.org/git/gcc.git" to see the full list
> of over 5000 refs available in the repository (or do a clone with
> --mirror to fetch them all).

Okay thanks. I don't see them in any of the web interfaces, neither on
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fgit%2Fgitweb.cgi%3Fp%3Dgcc.gitdata=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C9d79b87018f24bcbf8cc08d913d80bd0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637562640903545786%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=ts53XULDtR3o7fevlntCJdtzRqTo9R85LrxJ0ZfOBnE%3Dreserved=0
nor on
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fgcc-mirror%2Fgccdata=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C9d79b87018f24bcbf8cc08d913d80bd0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637562640903545786%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=%2FPGF3vy3hD1OwiXmWzkUnOt9%2BR3YArZw0kCVueOKYpc%3Dreserved=0
but
git fetch origin vendors/google/heads/gcc-4_8 does the trick for fetching the 
commits, but not the symbolic branches.

Anyways with that it looks like the discriminator changes are:

commit fd9de90d750e3588b1e5a218b28102b6c8bb8434
Author: Dehao Chen mailto:de...@gcc.gnu.org>>
Date:   Thu Oct 10 14:39:31 2013 +

Use only lineno+discriminator (remove the callee function name) as the key 
to represent callsite. Because each callsite will have its discriminator if in 
the same line.

2013-10-10  Dehao Chen  

RE: [PING][PATCH] Add instruction level discriminator support.

2022-06-27 Thread Eugene Rozenfeld via Gcc-patches
Another ping for 
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html .

I got a review from Andi 
(https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596549.html) but I also 
need a review from someone who can approve the changes.

Thanks,

Eugene

-Original Message-
From: Eugene Rozenfeld 
Sent: Friday, June 10, 2022 12:03 PM
To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan Hubicka 

Subject: [PING][PATCH] Add instruction level discriminator support.

Hello,

I'd like to ping this patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html

Thanks,

Eugene

-Original Message-
From: Gcc-patches  On 
Behalf Of Eugene Rozenfeld via Gcc-patches
Sent: Thursday, June 02, 2022 12:22 AM
To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan Hubicka 

Subject: [EXTERNAL] [PATCH] Add instruction level discriminator support.

This is the first in a series of patches to enable discriminator support in 
AutoFDO.

This patch switches to tracking discriminators per statement/instruction 
instead of per basic block. Tracking per basic block was problematic since not 
all statements in a basic block needed a discriminator and, also, later 
optimizations could move statements between basic blocks making correlation 
during AutoFDO compilation unreliable. Tracking per statement also allows us to 
assign different discriminators to multiple function calls in the same basic 
block. A subsequent patch will add that support.

The idea of this patch is based on commit 
4c311d95cf6d9519c3c20f641cc77af7df491fdf
by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different 
approach. In Dehao's work special (normally unused) location ids and side 
tables were used to keep track of locations with discriminators. Things have 
changed since then and I don't think we have unused location ids anymore. 
Instead, I made discriminators a part of ad-hoc locations.

The difference from Dehao's work also includes support for discriminator 
reading/writing in lto streaming and in modules.

Tested on x86_64-pc-linux-gnu.


0001-Add-instruction-level-discriminator-support.patch
Description: 0001-Add-instruction-level-discriminator-support.patch


RE: [EXTERNAL] Re: [PATCH] Add instruction level discriminator support.

2022-06-12 Thread Eugene Rozenfeld via Gcc-patches
Thank you for the review Andi.

Yes, the corresponding reader change is

@@ -15671,8 +15673,9 @@ module_state::read_location (bytes_in ) const
if (range.m_start == UNKNOWN_LOCATION)
  range.m_start = locus;
range.m_finish = read_location (sec);
+   unsigned discriminator = sec.u ();
if (locus != loc && range.m_start != loc && range.m_finish != loc)
- locus = get_combined_adhoc_loc (line_table, locus, range, NULL);
+ locus = get_combined_adhoc_loc (line_table, locus, range, NULL, 
discriminator);
   }
   break;

-Original Message-
From: Andi Kleen  
Sent: Sunday, June 12, 2022 4:53 PM
To: Eugene Rozenfeld via Gcc-patches 
Cc: Jan Hubicka ; Eugene Rozenfeld 

Subject: [EXTERNAL] Re: [PATCH] Add instruction level discriminator support.

Eugene Rozenfeld via Gcc-patches  writes:
>  {
> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc index 
> d1dc73724d1..5ed6b7b0f94 100644
> --- a/gcc/cp/module.cc
> +++ b/gcc/cp/module.cc
> @@ -15587,6 +15587,8 @@ module_state::write_location (bytes_out , 
> location_t loc)
>   range.m_start = UNKNOWN_LOCATION;
>write_location (sec, range.m_start);
>write_location (sec, range.m_finish);
> +  unsigned discriminator = get_discriminator_from_adhoc_loc (line_table, 
> loc);
> +  sec.u (discriminator);

I hope this has a corresponding reader change, wasn't fully clear. Should it 
use some common function?

The patch looks good to me, but I cannot approve.

-Andi


[PING][PATCH] Add instruction level discriminator support.

2022-06-10 Thread Eugene Rozenfeld via Gcc-patches
Hello,

I'd like to ping this patch: 
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html

Thanks,

Eugene

-Original Message-
From: Gcc-patches  On 
Behalf Of Eugene Rozenfeld via Gcc-patches
Sent: Thursday, June 02, 2022 12:22 AM
To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan Hubicka 

Subject: [EXTERNAL] [PATCH] Add instruction level discriminator support.

This is the first in a series of patches to enable discriminator support in 
AutoFDO.

This patch switches to tracking discriminators per statement/instruction 
instead of per basic block. Tracking per basic block was problematic since not 
all statements in a basic block needed a discriminator and, also, later 
optimizations could move statements between basic blocks making correlation 
during AutoFDO compilation unreliable. Tracking per statement also allows us to 
assign different discriminators to multiple function calls in the same basic 
block. A subsequent patch will add that support.

The idea of this patch is based on commit 
4c311d95cf6d9519c3c20f641cc77af7df491fdf
by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different 
approach. In Dehao's work special (normally unused) location ids and side 
tables were used to keep track of locations with discriminators. Things have 
changed since then and I don't think we have unused location ids anymore. 
Instead, I made discriminators a part of ad-hoc locations.

The difference from Dehao's work also includes support for discriminator 
reading/writing in lto streaming and in modules.

Tested on x86_64-pc-linux-gnu.


0001-Add-instruction-level-discriminator-support.patch
Description: 0001-Add-instruction-level-discriminator-support.patch


[PATCH] Add instruction level discriminator support.

2022-06-02 Thread Eugene Rozenfeld via Gcc-patches
This is the first in a series of patches to enable discriminator support
in AutoFDO.

This patch switches to tracking discriminators per statement/instruction
instead of per basic block. Tracking per basic block was problematic since
not all statements in a basic block needed a discriminator and, also, later
optimizations could move statements between basic blocks making correlation
during AutoFDO compilation unreliable. Tracking per statement also allows
us to assign different discriminators to multiple function calls in the same
basic block. A subsequent patch will add that support.

The idea of this patch is based on commit 
4c311d95cf6d9519c3c20f641cc77af7df491fdf
by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different
approach. In Dehao's work special (normally unused) location ids and side tables
were used to keep track of locations with discriminators. Things have changed
since then and I don't think we have unused location ids anymore. Instead,
I made discriminators a part of ad-hoc locations.

The difference from Dehao's work also includes support for discriminator
reading/writing in lto streaming and in modules.

Tested on x86_64-pc-linux-gnu.


0001-Add-instruction-level-discriminator-support.patch
Description: 0001-Add-instruction-level-discriminator-support.patch


RE: [EXTERNAL] Re: [PATCH] Guard against applying scale with 0 denominator

2022-05-20 Thread Eugene Rozenfeld via Gcc-patches
Thank you for the feedback Richard. I attached a patch that saves/restores 
counts if the epilog doesn't use a scalar loop.

Eugene

-Original Message-
From: Richard Biener  
Sent: Thursday, May 12, 2022 12:34 AM
To: Eugene Rozenfeld 
Cc: Jan Hubicka ; gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] Re: [PATCH] Guard against applying scale with 0 
denominator

On Thu, May 12, 2022 at 3:37 AM Eugene Rozenfeld 
 wrote:
>
> In my case this is not exactly what the FIXME in the comment above 
> says. The count is 0 even before the initial scaling happens. I hit this case 
> with some changes I'm working on to enable per-instruction discriminators for 
> AutoFDO.
>
> I looked into saving/restoring the old counts but it would be awkward to do. 
> The initial scaling happens here:
>
> if (skip_vector)
> {
>   split_edge (loop_preheader_edge (loop));
>
>   /* Due to the order in which we peel prolog and epilog, we first
>  propagate probability to the whole loop.  The purpose is to
>  avoid adjusting probabilities of both prolog and vector loops
>  separately.  Note in this case, the probability of epilog loop
>  needs to be scaled back later.  */
>   basic_block bb_before_loop = loop_preheader_edge (loop)->src;
>   if (prob_vector.initialized_p ())
> {
>   scale_bbs_frequencies (_before_loop, 1, prob_vector);
>   scale_loop_profile (loop, prob_vector, 0);
> }
> }
>
> The scaling happens before we do the cloning for the epilog so to 
> save/restore the counts we would need to maintain a mapping between the 
> original basic blocks and the cloned basic blocks in the epilog.

There is one already - after the epilogue is copied you can use get_bb_original 
(epilouge_bb) to get at the block it was copied from.
It could also be that we can rely on the basic-block order to be preserved 
between the copies (I _think_ that would work ... but then assert () for this 
using get_bb_original ()).  That means the simplest fix would be to have an 
auto_vec and initialize that from the BB counts in loop body order (we 
also have exactly two BBs in all peeled loops ...

But note we only scaled the scalar if-converted loop but eventually used the 
not if-coverted copy for the epilogue (if not vectorizing the epilogue), so 
indiscriminate scaling back looks wrong in some cases (I'd have to double-check 
the details here).

> I'd like to get my simple fix in since it makes things better even if 
> it doesn't address the issue mentioned In the FIXME.

But don't you need to check that
bbs[i]->count.apply_probability (prob_vector) is not zero instead of checking 
that bbs[i].count is not zero?

Richard.

> -Original Message-
> From: Richard Biener 
> Sent: Monday, May 09, 2022 12:42 AM
> To: Eugene Rozenfeld ; Jan Hubicka 
> 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: [PATCH] Guard against applying scale with 0 
> denominator
>
> On Fri, May 6, 2022 at 10:32 PM Eugene Rozenfeld via Gcc-patches 
>  wrote:
> >
> > Calling count.apply_scale with a 0 denominator causes an assert.
> > This change guards against that.
> >
> > Tested on x86_64-pc-linux-gnu.
> >
> > gcc/ChangeLog:
> > * tree-loop-vect-manip.cc (vect_do_peeling): Guard against applying 
> > scale with 0 denominator.
> > ---
> >  gcc/tree-vect-loop-manip.cc | 9 +
> >  1 file changed, 5 insertions(+), 4 deletions(-)
> >
> > diff --git a/gcc/tree-vect-loop-manip.cc 
> > b/gcc/tree-vect-loop-manip.cc index 1d4337eb261..db54ae69e45 100644
> > --- a/gcc/tree-vect-loop-manip.cc
> > +++ b/gcc/tree-vect-loop-manip.cc
> > @@ -2989,10 +2989,11 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree 
> > niters, tree nitersm1,
> >  get lost if we scale down to 0.  */
> >   basic_block *bbs = get_loop_body (epilog);
> >   for (unsigned int i = 0; i < epilog->num_nodes; i++)
> > -   bbs[i]->count = bbs[i]->count.apply_scale
> > -(bbs[i]->count,
> > - bbs[i]->count.apply_probability
> > -   (prob_vector));
> > +   if (bbs[i]->count.nonzero_p ())
> > + bbs[i]->count = bbs[i]->count.apply_scale
> > +  (bbs[i]->count,
> > +   bbs[i]->count.apply_probability
> > + (prob_vector));
>
> So exactly what the FIXME in the comment above says happens.   It
> might be better
> to save/restore the old counts if the intent is to get them back.  I'm not 
> exactly sure where the other scaling happens though.
>
> Richard.
>
>
>
> >   free (bbs);
> > }
> >
> > --
> > 2.25.1


0001-Fix-profile-count-maintenance-in-vectorizer-peeling.patch
Description: 0001-Fix-profile-count-maintenance-in-vectorizer-peeling.patch


RE: [EXTERNAL] Re: [PATCH] Guard against applying scale with 0 denominator

2022-05-11 Thread Eugene Rozenfeld via Gcc-patches
In my case this is not exactly what the FIXME in the comment above says. The 
count is 0 even before
the initial scaling happens. I hit this case with some changes I'm working on 
to enable per-instruction discriminators for AutoFDO.

I looked into saving/restoring the old counts but it would be awkward to do. 
The initial scaling happens here:

if (skip_vector)
{
  split_edge (loop_preheader_edge (loop));

  /* Due to the order in which we peel prolog and epilog, we first
 propagate probability to the whole loop.  The purpose is to
 avoid adjusting probabilities of both prolog and vector loops
 separately.  Note in this case, the probability of epilog loop
 needs to be scaled back later.  */
  basic_block bb_before_loop = loop_preheader_edge (loop)->src;
  if (prob_vector.initialized_p ())
{
  scale_bbs_frequencies (_before_loop, 1, prob_vector);
  scale_loop_profile (loop, prob_vector, 0);
}
}

The scaling happens before we do the cloning for the epilog so to save/restore 
the counts
we would need to maintain a mapping between the original basic blocks and the 
cloned basic blocks in the epilog.

I'd like to get my simple fix in since it makes things better even if it 
doesn't address the issue mentioned
In the FIXME.

-Original Message-
From: Richard Biener  
Sent: Monday, May 09, 2022 12:42 AM
To: Eugene Rozenfeld ; Jan Hubicka 

Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] Guard against applying scale with 0 denominator

On Fri, May 6, 2022 at 10:32 PM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> Calling count.apply_scale with a 0 denominator causes an assert.
> This change guards against that.
>
> Tested on x86_64-pc-linux-gnu.
>
> gcc/ChangeLog:
> * tree-loop-vect-manip.cc (vect_do_peeling): Guard against applying 
> scale with 0 denominator.
> ---
>  gcc/tree-vect-loop-manip.cc | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc 
> index 1d4337eb261..db54ae69e45 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -2989,10 +2989,11 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree 
> niters, tree nitersm1,
>  get lost if we scale down to 0.  */
>   basic_block *bbs = get_loop_body (epilog);
>   for (unsigned int i = 0; i < epilog->num_nodes; i++)
> -   bbs[i]->count = bbs[i]->count.apply_scale
> -(bbs[i]->count,
> - bbs[i]->count.apply_probability
> -   (prob_vector));
> +   if (bbs[i]->count.nonzero_p ())
> + bbs[i]->count = bbs[i]->count.apply_scale
> +  (bbs[i]->count,
> +   bbs[i]->count.apply_probability
> + (prob_vector));

So exactly what the FIXME in the comment above says happens.   It
might be better
to save/restore the old counts if the intent is to get them back.  I'm not 
exactly sure where the other scaling happens though.

Richard.



>   free (bbs);
> }
>
> --
> 2.25.1


[PATCH] Guard against applying scale with 0 denominator

2022-05-06 Thread Eugene Rozenfeld via Gcc-patches
Calling count.apply_scale with a 0 denominator causes an assert.
This change guards against that.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
* tree-loop-vect-manip.cc (vect_do_peeling): Guard against applying 
scale with 0 denominator.
---
 gcc/tree-vect-loop-manip.cc | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 1d4337eb261..db54ae69e45 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -2989,10 +2989,11 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, 
tree nitersm1,
 get lost if we scale down to 0.  */
  basic_block *bbs = get_loop_body (epilog);
  for (unsigned int i = 0; i < epilog->num_nodes; i++)
-   bbs[i]->count = bbs[i]->count.apply_scale
-(bbs[i]->count,
- bbs[i]->count.apply_probability
-   (prob_vector));
+   if (bbs[i]->count.nonzero_p ())
+ bbs[i]->count = bbs[i]->count.apply_scale
+  (bbs[i]->count,
+   bbs[i]->count.apply_probability
+ (prob_vector));
  free (bbs);
}

--
2.25.1


[PATCH] AutoFDO: Don't try to promote indirect calls that result in recursive direct calls

2022-02-08 Thread Eugene Rozenfeld via Gcc-patches
AutoFDO tries to promote and inline all indirect calls that were promoted
and inlined in the original binary and that are still hot. In the included
test case, the promotion results in a direct call that is a recursive call.
inline_call and optimize_inline_calls can't handle recursive calls at this 
stage.
Currently, inline_call fails with a segmentation fault.

This change leaves the indirect call alone if promotion will result in a 
recursive call.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
* auto-profile.cc (afdo_indirect_call): Don't attempt to promote 
indirect calls
that will result in direct recursive calls.

gcc/testsuite/g++.dg/tree-prof/ChangeLog:
* indir-call-recursive-inlining.C : New test.
---
 gcc/auto-profile.cc   | 40 --
 .../tree-prof/indir-call-recursive-inlining.C | 54 +++
 2 files changed, 78 insertions(+), 16 deletions(-)
 create mode 100644 
gcc/testsuite/g++.dg/tree-prof/indir-call-recursive-inlining.C

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index c7cee639c85..2b34b80b82d 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -975,7 +975,7 @@ read_profile (void)
  * after annotation, we just need to mark, and let follow-up logic to
decide if it needs to promote and inline.  */

-static void
+static bool
 afdo_indirect_call (gimple_stmt_iterator *gsi, const icall_target_map ,
 bool transform)
 {
@@ -983,12 +983,12 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
icall_target_map ,
   tree callee;

   if (map.size () == 0)
-return;
+return false;
   gcall *stmt = dyn_cast  (gs);
   if (!stmt
   || gimple_call_internal_p (stmt)
   || gimple_call_fndecl (stmt) != NULL_TREE)
-return;
+return false;

   gcov_type total = 0;
   icall_target_map::const_iterator max_iter = map.end ();
@@ -1003,7 +1003,7 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
icall_target_map ,
   struct cgraph_node *direct_call = cgraph_node::get_for_asmname (
   get_identifier (afdo_string_table->get_name (max_iter->first)));
   if (direct_call == NULL || !direct_call->profile_id)
-return;
+return false;

   callee = gimple_call_fn (stmt);

@@ -1013,20 +1013,27 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
icall_target_map ,
   hist->hvalue.counters = XNEWVEC (gcov_type, hist->n_counters);
   gimple_add_histogram_value (cfun, stmt, hist);

-  // Total counter
+  /* Total counter */
   hist->hvalue.counters[0] = total;
-  // Number of value/counter pairs
+  /* Number of value/counter pairs */
   hist->hvalue.counters[1] = 1;
-  // Value
+  /* Value */
   hist->hvalue.counters[2] = direct_call->profile_id;
-  // Counter
+  /* Counter */
   hist->hvalue.counters[3] = max_iter->second;

   if (!transform)
-return;
+return false;
+
+  cgraph_node* current_function_node = cgraph_node::get 
(current_function_decl);
+
+  /* If the direct call is a recursive call, don't promote it since
+ we are not set up to inline recursive calls at this stage. */
+  if (direct_call == current_function_node)
+return false;

   struct cgraph_edge *indirect_edge
-  = cgraph_node::get (current_function_decl)->get_edge (stmt);
+  = current_function_node->get_edge (stmt);

   if (dump_file)
 {
@@ -1040,13 +1047,13 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
icall_target_map ,
 {
   if (dump_file)
 fprintf (dump_file, " not transforming\n");
-  return;
+  return false;
 }
   if (DECL_STRUCT_FUNCTION (direct_call->decl) == NULL)
 {
   if (dump_file)
 fprintf (dump_file, " no declaration\n");
-  return;
+  return false;
 }

   if (dump_file)
@@ -1063,16 +1070,17 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
icall_target_map ,
   cgraph_edge::redirect_call_stmt_to_callee (new_edge);
   gimple_remove_histogram_value (cfun, stmt, hist);
   inline_call (new_edge, true, NULL, NULL, false);
+  return true;
 }

 /* From AutoFDO profiles, find values inside STMT for that we want to measure
histograms and adds them to list VALUES.  */

-static void
+static bool
 afdo_vpt (gimple_stmt_iterator *gsi, const icall_target_map ,
   bool transform)
 {
-  afdo_indirect_call (gsi, map, transform);
+  return afdo_indirect_call (gsi, map, transform);
 }

 typedef std::set bb_set;
@@ -1498,8 +1506,8 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmts)
   {
 /* Promote the indirect call and update the promoted_stmts.  */
 promoted_stmts->insert (stmt);
-afdo_vpt (, info.targets, true);
-has_vpt = true;
+if (afdo_vpt (, info.targets, true))
+  has_vpt = true;
   }
   }
   }
diff --git a/gcc/testsuite/g++.dg/tree-prof/indir-call-recursive-inlining.C 
b/gcc/testsuite/g++.dg/tree-prof/indir-call-recursive-inlining.C
new file mode 100644
index 

RE: [EXTERNAL] Re: [PATCH] AutoFDO: don't set param_early_inliner_max_iterations to 10.

2022-01-31 Thread Eugene Rozenfeld via Gcc-patches
Thank you for the review Richard. Is it ok to commit this change to trunk now 
or should I wait till GCC 13 development starts?
This seems like a very low risk change.

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Monday, January 31, 2022 1:23 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] AutoFDO: don't set 
param_early_inliner_max_iterations to 10.

On Sat, Jan 29, 2022 at 12:24 AM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> param_early_inliner_max_iterations specifies the maximum number of 
> nested indirect inlining iterations performed by early inliner.
> Normally, the default value is 1.
>
> For AutoFDO this parameter was also used as the number of iteration 
> for its indirect call promotion loop and the default value was set to 10.
> While it makes sense to have 10 in the indirect call promotion loop 
> (we want to make the IR match the profiled binary before actual 
> annotation) there is no reason to have a special default value for the 
> regular early inliner.
>
> This change removes the special AutoFDO default value setting for 
> param_early_inliner_max_iterations while keeping 10 as the number of 
> iterations for the AutoFDO indirect call promotion loop.
>
> This change improves a simple fibonacci benchmark in AutoFDO mode by 
> 15% on x86_64-pc-linux-gnu.
>
> Tested on x86_64-pc-linux-gnu.

OK.

Richard.

> gcc/ChangeLog:
> * auto-profile.cc (auto_profile): Hard-code the number of iterations 
> (10).
>
> gcc/ChangeLog:
> * opt.cc (common_handle_option): Don't set 
> param_early_inliner_max_iterations to 10 for AutoFDO.
> ---
>  gcc/auto-profile.cc | 3 +--
>  gcc/opts.cc | 2 --
>  2 files changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc index 
> 0bfaae7b091..c7cee639c85 100644
> --- a/gcc/auto-profile.cc
> +++ b/gcc/auto-profile.cc
> @@ -1644,8 +1644,7 @@ auto_profile (void)
> function before annotation, so the profile inside bar@loc_foo2
> will be useful.  */
>  autofdo::stmt_set promoted_stmts;
> -for (int i = 0; i < opt_for_fn (node->decl,
> -   param_early_inliner_max_iterations); i++)
> +for (int i = 0; i < 10; i++)
>{
>  if (!flag_value_profile_transformations
>  || !autofdo::afdo_vpt_for_early_inline (_stmts)) 
> diff --git a/gcc/opts.cc b/gcc/opts.cc index 17e1884f0e4..f6f6a8e1709 
> 100644
> --- a/gcc/opts.cc
> +++ b/gcc/opts.cc
> @@ -2899,8 +2899,6 @@ common_handle_option (struct gcc_options *opts,
>  case OPT_fauto_profile:
>enable_fdo_optimizations (opts, opts_set, value);
>SET_OPTION_IF_UNSET (opts, opts_set, flag_profile_correction, value);
> -  SET_OPTION_IF_UNSET (opts, opts_set,
> -  param_early_inliner_max_iterations, 10);
>break;
>
>  case OPT_fprofile_generate_:
> --
> 2.25.1


[PATCH] AutoFDO: don't set param_early_inliner_max_iterations to 10.

2022-01-28 Thread Eugene Rozenfeld via Gcc-patches
param_early_inliner_max_iterations specifies the maximum number
of nested indirect inlining iterations performed by early inliner.
Normally, the default value is 1.

For AutoFDO this parameter was also used as the number of iteration for
its indirect call promotion loop and the default value was set to 10.
While it makes sense to have 10 in the indirect call promotion loop
(we want to make the IR match the profiled binary before actual annotation)
there is no reason to have a special default value for the
regular early inliner.

This change removes the special AutoFDO default value setting for
param_early_inliner_max_iterations while keeping 10 as the number of
iterations for the AutoFDO indirect call promotion loop.

This change improves a simple fibonacci benchmark in AutoFDO mode
by 15% on x86_64-pc-linux-gnu.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
* auto-profile.cc (auto_profile): Hard-code the number of iterations 
(10).

gcc/ChangeLog:
* opt.cc (common_handle_option): Don't set 
param_early_inliner_max_iterations to 10 for AutoFDO.
---
 gcc/auto-profile.cc | 3 +--
 gcc/opts.cc | 2 --
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index 0bfaae7b091..c7cee639c85 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -1644,8 +1644,7 @@ auto_profile (void)
function before annotation, so the profile inside bar@loc_foo2
will be useful.  */
 autofdo::stmt_set promoted_stmts;
-for (int i = 0; i < opt_for_fn (node->decl,
-   param_early_inliner_max_iterations); i++)
+for (int i = 0; i < 10; i++)
   {
 if (!flag_value_profile_transformations
 || !autofdo::afdo_vpt_for_early_inline (_stmts))
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 17e1884f0e4..f6f6a8e1709 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -2899,8 +2899,6 @@ common_handle_option (struct gcc_options *opts,
 case OPT_fauto_profile:
   enable_fdo_optimizations (opts, opts_set, value);
   SET_OPTION_IF_UNSET (opts, opts_set, flag_profile_correction, value);
-  SET_OPTION_IF_UNSET (opts, opts_set,
-  param_early_inliner_max_iterations, 10);
   break;

 case OPT_fprofile_generate_:
--
2.25.1


[PATCH] Improve AutoFDO count propagation algorithm

2021-12-02 Thread Eugene Rozenfeld via Gcc-patches
When a basic block A has been annotated with a count and it has only one
successor (or predecessor) B, we can propagate the A's count to B.
The algorithm without this change could leave B without an annotation if B had
other unannotated predecessors (or successors). For example, in the test case I 
added,
the loop header block was left unannotated, which prevented loop unrolling.

gcc/ChangeLog:
* auto-profile.c (afdo_propagate_edge): Improve count propagation 
algorithm.

gcc/testsuite/ChangeLog:
* gcc.dg/tree-prof/init-array.c: New test for unrolling inner loops.
---
 gcc/auto-profile.c  | 20 +-
 gcc/testsuite/gcc.dg/tree-prof/init-array.c | 43 +
 2 files changed, 61 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/init-array.c

diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 4c1fc6b536b..dfcd68113aa 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1192,7 +1192,8 @@ afdo_find_equiv_class (bb_set *annotated_bb)
 /* If a basic block's count is known, and only one of its in/out edges' count
is unknown, its count can be calculated. Meanwhile, if all of the in/out
edges' counts are known, then the basic block's unknown count can also be
-   calculated.
+   calculated. Also, if a block has a single predecessor or successor, the 
block's
+   count can be propagated to that predecessor or successor.
IS_SUCC is true if out edges of a basic blocks are examined.
Update ANNOTATED_BB accordingly.
Return TRUE if any basic block/edge count is changed.  */
@@ -1208,6 +1209,7 @@ afdo_propagate_edge (bool is_succ, bb_set *annotated_bb)
 edge e, unknown_edge = NULL;
 edge_iterator ei;
 int num_unknown_edge = 0;
+int num_edge = 0;
 profile_count total_known_count = profile_count::zero ().afdo ();

 FOR_EACH_EDGE (e, ei, is_succ ? bb->succs : bb->preds)
@@ -1217,6 +1219,7 @@ afdo_propagate_edge (bool is_succ, bb_set *annotated_bb)
  num_unknown_edge++, unknown_edge = e;
else
  total_known_count += AFDO_EINFO (e)->get_count ();
+   num_edge++;
   }

 /* Be careful not to annotate block with no successor in special cases.  */
@@ -1230,7 +1233,20 @@ afdo_propagate_edge (bool is_succ, bb_set *annotated_bb)
 else if (num_unknown_edge == 1 && is_bb_annotated (bb, *annotated_bb))
   {
if (bb->count > total_known_count)
- AFDO_EINFO (unknown_edge)->set_count (bb->count - total_known_count);
+ {
+ profile_count new_count = bb->count - total_known_count;
+ AFDO_EINFO(unknown_edge)->set_count(new_count);
+ if (num_edge == 1)
+   {
+ basic_block succ_or_pred_bb = is_succ ? unknown_edge->dest : 
unknown_edge->src;
+ if (new_count > succ_or_pred_bb->count)
+   {
+ succ_or_pred_bb->count = new_count;
+ if (!is_bb_annotated (succ_or_pred_bb, *annotated_bb))
+   set_bb_annotated (succ_or_pred_bb, annotated_bb);
+   }
+   }
+  }
else
  AFDO_EINFO (unknown_edge)->set_count (profile_count::zero().afdo ());
AFDO_EINFO (unknown_edge)->set_annotated ();
diff --git a/gcc/testsuite/gcc.dg/tree-prof/init-array.c 
b/gcc/testsuite/gcc.dg/tree-prof/init-array.c
new file mode 100644
index 000..0f7a5c84481
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/init-array.c
@@ -0,0 +1,43 @@
+/* { dg-options "-O3 -fdump-tree-cunrolli-details" } */
+
+static int s[10][10][10];
+static int d[10][10][10];
+
+__attribute__((noipa))
+int array()
+{
+   int i;
+   register int j, k;
+   for (i = 0; i < 10; i++)
+   for (j = 0; j < 10; j++)
+   for (k = 0; k < 10; k++)
+   d[i][j][k] = s[i][j][k];
+
+   return(0);
+}
+
+__attribute__((noipa))
+void TestBench()
+{
+   for (int i = 0; i < 15; ++i)
+   {
+  array();
+   }
+}
+
+int main(int argc, char *argv[])
+{
+
+   TestBench();
+
+   if (d[9][9][9] == 0 && s[9][9][9] == 0)
+   {
+   return 0;
+   }
+   else
+   {
+   return -1;
+   }
+}
+
+/* { dg-final-use { scan-tree-dump-times "loop with 10 iterations completely 
unrolled" 2 "cunrolli"} } */
--
2.25.1


RE: [EXTERNAL] Re: Reporting lto bugs

2021-09-24 Thread Eugene Rozenfeld via Gcc
My scenario is not cygwin. It's linux cross-compile x86_64 to aarch64 in an 
Ubuntu vm so I believe this should produce line info with -flto -g yet I see 
"No Line Number Statements" after Directory Table and File Name Table in the 
output of objdump -g.
DWARF Version is 4.
I verified that I get line numbers if I don't specify -flto.
I have all the lto1 commands and I'd like to try to debug this myself first. Do 
you have any tips on how to debug this?

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Friday, September 24, 2021 12:36 AM
To: Eugene Rozenfeld 
Cc: gcc@gcc.gnu.org
Subject: [EXTERNAL] Re: Reporting lto bugs

On Fri, Sep 24, 2021 at 3:45 AM Eugene Rozenfeld via Gcc  
wrote:
>
> I ran into a bug with lto: no line number info gets emitted when building my 
> project with -flto and -g (with gcc 8.2). I'd like to provide a repro in the 
> bug report but I don't know if there is an easy way to collect everything.
> The instructions at 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugs%2Fdata=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7Cb7b3ef43536943c30e3108d97f2df355%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637680657602091864%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=yGG5QtNDKjD0z4Q%2FcNmuwx5CZV8cElqGetmfmYzAwjA%3Dreserved=0
>  seem to cover non-lto compilations.
> For MSVC there is an easy way to collect linker repro files for ltcg: 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdocs
> .microsoft.com%2Fen-us%2Fcpp%2Foverview%2Fhow-to-report-a-problem-with
> -the-visual-cpp-toolset%3Fview%3Dmsvc-160%23link-reprosdata=04%7C
> 01%7CEugene.Rozenfeld%40microsoft.com%7Cb7b3ef43536943c30e3108d97f2df3
> 55%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637680657602101859%7CU
> nknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1ha
> WwiLCJXVCI6Mn0%3D%7C1000sdata=WhtAKrM1LoveH1Jbi3Ulhrp7hUdXJ46hgh1
> osT%2BB1nc%3Dreserved=0
> Is there something similar for GCC?

We generally prefer preprocessed sources here as with other bugs - for LTO that 
might mean including quite a bunch of files but there's instructions on the 
wiki on how to reduce LTO testcases:
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fwiki%2FA_guide_to_testcase_reduction%23Reducing_LTO_bugsdata=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7Cb7b3ef43536943c30e3108d97f2df355%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637680657602101859%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=GExhyUNJvSQgMPrSGccVj8woBTLahYVAjI%2FBvQ79QMo%3Dreserved=0

Note that for mingw and cygwin you likely run into the issue that LTO does not 
support -g there (sic), see a similar issue for darwin (PR82005), there must be 
a bug for mingw/cygwin as well but I can't find it right now.

Richard.

> Thanks,
>
> Eugene


Reporting lto bugs

2021-09-23 Thread Eugene Rozenfeld via Gcc
I ran into a bug with lto: no line number info gets emitted when building my 
project with -flto and -g (with gcc 8.2). I'd like to provide a repro in the 
bug report but I don't know if there is an easy way to collect everything.
The instructions at https://gcc.gnu.org/bugs/ seem to cover non-lto 
compilations.
For MSVC there is an easy way to collect linker repro files for ltcg: 
https://docs.microsoft.com/en-us/cpp/overview/how-to-report-a-problem-with-the-visual-cpp-toolset?view=msvc-160#link-repros
Is there something similar for GCC?

Thanks,

Eugene


RE: [EXTERNAL] Re: [PATCH] gcov-profile/71672 Fix indirect call inlining with AutoFDO

2021-08-03 Thread Eugene Rozenfeld via Gcc-patches
Thank you for the reviews, Andy and Richard.

I split up the patch into 4 commits and pushed to trunk.

Eugene

-Original Message-
From: Richard Biener  
Sent: Monday, August 2, 2021 2:57 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org; mli...@suse.cz; Andi Kleen 
Subject: [EXTERNAL] Re: [PATCH] gcov-profile/71672 Fix indirect call inlining 
with AutoFDO

On Fri, Jul 30, 2021 at 9:09 AM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> This patch has the following changes:
>
> 1. The main fix is in auto-profile.c: the histogram value for
>indirect calls was incorrectly set up. That is fixed now.
>
> 2. Several tests now have -fdump-ipa-afdo-optimized instead of -fdump-ipa-afdo
>in dg-options so that the expected output can be found.
>
> 3. I increased the number of iterations in several tests so that perf can have
>enough sampling events.
>
> 4. indir-call-prof-2.c has -fno-early-inlining but AutoFDO can't work without
>early inlining (it needs to match the inlining of the profiled binary).
>I changed profopt.exp to always pass -fearly-inlining for AutoFDO.
>With that the indirect call inlining in indir-call-prof-2.c happens in the 
> early inliner
>so I changed the dg-final-use-autofdo.
>
> 5. create_gcov tool doesn't currently support dwarf 5 so I made a change in 
> profopt.exp
>to pass -gdwarf-4 when compiling the binary to profile.
>
> 6. I updated the invocation of create_gcov in profopt.exp to pass 
> -gcov_version=2.
>I recently made a change to create_gcov to support version 2:
>
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith
> ub.com%2Fgoogle%2Fautofdo%2Fpull%2F117data=04%7C01%7CEugene.Rozen
> feld%40microsoft.com%7C92927d4029754d0d6b4708d9559be06d%7C72f988bf86f1
> 41af91ab2d7cd011db47%7C1%7C0%7C637634950245832767%7CUnknown%7CTWFpbGZs
> b3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D
> %7C1000sdata=Ex1OpS0gt9dpsBVIK71k7hvjJbfIkN%2BlRr%2BYD86%2FqEs%3D
> reserved=0
>
> 7. I removed useless -o perf.data from the invocation of gcc-auto-profile in
>target-supports.exp.
>
> With these changes the tests checking indirect call inlining in gcc.dg 
> and g++.dg are passing.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> PR gcov-profile/71672
> * auto-profile.c (afdo_indirect_call): Fix the setup of the 
> historgram value for indirect calls.
>
> gcc/testsuite/ChangeLog:
> PR gcov-profile/71672
> * g++.dg/tree-prof/indir-call-prof.C: Fix options, increase the 
> number of iterations.
> * g++.dg/tree-prof/morefunc.C: Fix options, increase the number of 
> iterations.
> * g++.dg/tree-prof/reorder.C: Fix options, increase the number of 
> iterations.
> * gcc.dg/tree-prof/indir-call-prof-2.c: Fix options, fix 
> dg-final-use-autofdo, increase the number of iterations.
> * gcc.dg/tree-prof/indir-call-prof.c: Fix options.
> * lib/profopt.exp: Pass gdwarf-4 when compiling binary to profile; 
> pass -fearly-inlining when compiling with AutoFDO; pass -gcov_version=2 to 
> create_gcov.
> * lib/target-supports.exp: Remove unnecessary -o perf.data passed to 
> gcc-auto-profile.
> ---
>  gcc/auto-profile.c | 13 +
>  gcc/testsuite/g++.dg/tree-prof/indir-call-prof.C   |  4 ++--
>  gcc/testsuite/g++.dg/tree-prof/morefunc.C  |  7 ---
>  gcc/testsuite/g++.dg/tree-prof/reorder.C   |  6 +++---
>  gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c |  8 
>  gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c   |  2 +-
>  gcc/testsuite/lib/profopt.exp  |  6 +++---
>  gcc/testsuite/lib/target-supports.exp  |  2 +-
>  8 files changed, 27 insertions(+), 21 deletions(-)
>
> diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c index 
> b23b82b2df4..4c1fc6b536b 100644
> --- a/gcc/auto-profile.c
> +++ b/gcc/auto-profile.c
> @@ -1009,13 +1009,18 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, 
> const icall_target_map ,
>
>histogram_value hist = gimple_alloc_histogram_value (
>cfun, HIST_TYPE_INDIR_CALL, stmt, callee);
> -  hist->n_counters = 3;
> +  hist->n_counters = 4;
>hist->hvalue.counters = XNEWVEC (gcov_type, hist->n_counters);
>gimple_add_histogram_value (cfun, stmt, hist);
>
> -  hist->hvalue.counters[0] = direct_call->profile_id;
> -  hist->hvalue.counters[1] = max_iter->second;
> -  hist->hvalue.counters[2] = total;
> +  // Total counter
> +  hist->hvalue.counters[0] = total;
> +  // Number of value/counter pairs
> +  hist->hvalue.counters[1] = 1;
> +  // Value
> +  hist->hvalue.counters[2] = direct_call->pro

[PATCH] gcov-profile/71672 Fix indirect call inlining with AutoFDO

2021-07-30 Thread Eugene Rozenfeld via Gcc-patches
This patch has the following changes:

1. The main fix is in auto-profile.c: the histogram value for
   indirect calls was incorrectly set up. That is fixed now.

2. Several tests now have -fdump-ipa-afdo-optimized instead of -fdump-ipa-afdo
   in dg-options so that the expected output can be found.

3. I increased the number of iterations in several tests so that perf can have
   enough sampling events.

4. indir-call-prof-2.c has -fno-early-inlining but AutoFDO can't work without
   early inlining (it needs to match the inlining of the profiled binary).
   I changed profopt.exp to always pass -fearly-inlining for AutoFDO.
   With that the indirect call inlining in indir-call-prof-2.c happens in the 
early inliner
   so I changed the dg-final-use-autofdo.

5. create_gcov tool doesn't currently support dwarf 5 so I made a change in 
profopt.exp
   to pass -gdwarf-4 when compiling the binary to profile.

6. I updated the invocation of create_gcov in profopt.exp to pass 
-gcov_version=2.
   I recently made a change to create_gcov to support version 2:
   https://github.com/google/autofdo/pull/117

7. I removed useless -o perf.data from the invocation of gcc-auto-profile in
   target-supports.exp.

With these changes the tests checking indirect call inlining in gcc.dg and 
g++.dg
are passing.

gcc/ChangeLog:
PR gcov-profile/71672
* auto-profile.c (afdo_indirect_call): Fix the setup of the historgram 
value for indirect calls.

gcc/testsuite/ChangeLog:
PR gcov-profile/71672
* g++.dg/tree-prof/indir-call-prof.C: Fix options, increase the number 
of iterations.
* g++.dg/tree-prof/morefunc.C: Fix options, increase the number of 
iterations.
* g++.dg/tree-prof/reorder.C: Fix options, increase the number of 
iterations.
* gcc.dg/tree-prof/indir-call-prof-2.c: Fix options, fix 
dg-final-use-autofdo, increase the number of iterations.
* gcc.dg/tree-prof/indir-call-prof.c: Fix options.
* lib/profopt.exp: Pass gdwarf-4 when compiling binary to profile; pass 
-fearly-inlining when compiling with AutoFDO; pass -gcov_version=2 to 
create_gcov.
* lib/target-supports.exp: Remove unnecessary -o perf.data passed to 
gcc-auto-profile.
---
 gcc/auto-profile.c | 13 +
 gcc/testsuite/g++.dg/tree-prof/indir-call-prof.C   |  4 ++--
 gcc/testsuite/g++.dg/tree-prof/morefunc.C  |  7 ---
 gcc/testsuite/g++.dg/tree-prof/reorder.C   |  6 +++---
 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-2.c |  8 
 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof.c   |  2 +-
 gcc/testsuite/lib/profopt.exp  |  6 +++---
 gcc/testsuite/lib/target-supports.exp  |  2 +-
 8 files changed, 27 insertions(+), 21 deletions(-)

diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index b23b82b2df4..4c1fc6b536b 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -1009,13 +1009,18 @@ afdo_indirect_call (gimple_stmt_iterator *gsi, const 
icall_target_map ,

   histogram_value hist = gimple_alloc_histogram_value (
   cfun, HIST_TYPE_INDIR_CALL, stmt, callee);
-  hist->n_counters = 3;
+  hist->n_counters = 4;
   hist->hvalue.counters = XNEWVEC (gcov_type, hist->n_counters);
   gimple_add_histogram_value (cfun, stmt, hist);

-  hist->hvalue.counters[0] = direct_call->profile_id;
-  hist->hvalue.counters[1] = max_iter->second;
-  hist->hvalue.counters[2] = total;
+  // Total counter
+  hist->hvalue.counters[0] = total;
+  // Number of value/counter pairs
+  hist->hvalue.counters[1] = 1;
+  // Value
+  hist->hvalue.counters[2] = direct_call->profile_id;
+  // Counter
+  hist->hvalue.counters[3] = max_iter->second;

   if (!transform)
 return;
diff --git a/gcc/testsuite/g++.dg/tree-prof/indir-call-prof.C 
b/gcc/testsuite/g++.dg/tree-prof/indir-call-prof.C
index 3374744613e..b45417106d0 100644
--- a/gcc/testsuite/g++.dg/tree-prof/indir-call-prof.C
+++ b/gcc/testsuite/g++.dg/tree-prof/indir-call-prof.C
@@ -1,4 +1,4 @@
-/* { dg-options "-O2 -fdump-tree-optimized -fdump-ipa-profile-optimized 
-fdump-ipa-afdo" } */
+/* { dg-options "-O2 -fdump-tree-optimized -fdump-ipa-profile-optimized 
-fdump-ipa-afdo-optimized" } */

 struct A {
   A () {}
@@ -26,7 +26,7 @@ main (void)

   int i;

-  for (i = 0; i < 100; i++)
+  for (i = 0; i < 1000; i++)
 {
   p = (A *)wrap ((void *));
   p->AA ();
diff --git a/gcc/testsuite/g++.dg/tree-prof/morefunc.C 
b/gcc/testsuite/g++.dg/tree-prof/morefunc.C
index 621d09aec5b..96e0073ca8f 100644
--- a/gcc/testsuite/g++.dg/tree-prof/morefunc.C
+++ b/gcc/testsuite/g++.dg/tree-prof/morefunc.C
@@ -1,4 +1,5 @@
-/* { dg-options "-O2 -fno-devirtualize --param=profile-func-internal-id=0 
-fdump-ipa-profile-optimized -fdump-ipa-afdo -Wno-attributes 
-Wno-coverage-mismatch -Wno-missing-profile" } */
+/* { dg-options "-O2 -fno-devirtualize --param=profile-func-internal-id=0 
-fdump-ipa-profile-optimized -fdump-ipa-afdo-optimized -Wno-attributes 

[PATCH] Update gen_autofdo_event.py and gcc-auto-profile

2021-07-01 Thread Eugene Rozenfeld via Gcc-patches
gen_autofdo_event.py was stumbling on models with stepping so
I updated the script to handle this case similar to the code in
https://github.com/andikleen/pmu-tools/blob/c6a5f63aede19def8886d6a8b74d7a55c38ca947/event_download.py

The second change was to tolerate cases when the CPU supports PEBS but the
perf command with /p fails. This can happen in, e.g., a virtual machine.

I regenerated gcc-auto-profile using the updated script.

contrib/ChangeLog:

* gen_autofdo_event.py: handle stepping, non-working PEBS

gcc/ChangeLog:

* config/i386/gcc-auto-profile: regenerate
---
 contrib/gen_autofdo_event.py | 54 ++--
 gcc/config/i386/gcc-auto-profile | 41 +++-
 2 files changed, 71 insertions(+), 24 deletions(-)

diff --git a/contrib/gen_autofdo_event.py b/contrib/gen_autofdo_event.py
index c97460c61c6..1eb6f1d6d85 100755
--- a/contrib/gen_autofdo_event.py
+++ b/contrib/gen_autofdo_event.py
@@ -46,20 +46,29 @@ args = ap.parse_args()
 
 eventmap = collections.defaultdict(list)
 
-def get_cpu_str():
-with open('/proc/cpuinfo', 'r') as c:
-vendor, fam, model = None, None, None
-for j in c:
-n = j.split()
-if n[0] == 'vendor_id':
-vendor = n[2]
-elif n[0] == 'model' and n[1] == ':':
-model = int(n[2])
-elif n[0] == 'cpu' and n[1] == 'family':
-fam = int(n[3])
-if vendor and fam and model:
-return "%s-%d-%X" % (vendor, fam, model), model
-return None, None
+def get_cpustr():
+cpuinfo = os.getenv("CPUINFO")
+if cpuinfo is None:
+cpuinfo = '/proc/cpuinfo'
+f = open(cpuinfo, 'r')
+cpu = [None, None, None, None]
+for j in f:
+n = j.split()
+if n[0] == 'vendor_id':
+cpu[0] = n[2]
+elif n[0] == 'model' and n[1] == ':':
+cpu[2] = int(n[2])
+elif n[0] == 'cpu' and n[1] == 'family':
+cpu[1] = int(n[3])
+elif n[0] == 'stepping' and n[1] == ':':
+cpu[3] = int(n[2])
+if all(v is not None for v in cpu):
+break
+# stepping for SKX only
+stepping = cpu[0] == "GenuineIntel" and cpu[1] == 6 and cpu[2] == 0x55
+if stepping:
+return "%s-%d-%X-%X" % tuple(cpu)
+return "%s-%d-%X" % tuple(cpu)[:3]
 
 def find_event(eventurl, model):
 print >>sys.stderr, "Downloading", eventurl
@@ -81,7 +90,7 @@ def find_event(eventurl, model):
 return found
 
 if not args.all:
-cpu, model = get_cpu_str()
+cpu = get_cpu_str()
 if not cpu:
 sys.exit("Unknown CPU type")
 
@@ -94,7 +103,8 @@ for j in u:
 n = j.rstrip().split(',')
 if len(n) >= 4 and (args.all or n[0] == cpu) and n[3] == "core":
 if args.all:
-vendor, fam, model = n[0].split("-")
+components = n[0].split("-")
+model = components[2]
 model = int(model, 16)
 cpufound += 1
 found += find_event(baseurl + n[2], model)
@@ -146,7 +156,17 @@ case `egrep -q "^cpu family\s*: 6" /proc/cpuinfo &&
 echo >&2 "Unknown CPU. Run contrib/gen_autofdo_event.py --all --script to 
update script."
exit 1 ;;'''
 print "esac"
-print 'exec perf record -e $E -b "$@"'
+print "set -x"
+print 'if ! perf record -e $E -b "$@" ; then'
+print '  # PEBS may not actually be working even if the processor supports 
it'
+print '  # (e.g., in a virtual machine). Trying to run without /p.'
+print '  set +x'
+print '  echo >&2 "Retrying without /p."'
+print '  E="$(echo "${E}" | sed -e \'s/\/p/\//\')"'
+print '  set -x'
+print '  exec perf record -e $E -b "$@"'
+print ' set +x'
+print 'fi'
 
 if cpufound == 0 and not args.all:
 sys.exit('CPU %s not found' % cpu)
diff --git a/gcc/config/i386/gcc-auto-profile b/gcc/config/i386/gcc-auto-profile
index 5da5c63cd84..56f64cbff1f 100755
--- a/gcc/config/i386/gcc-auto-profile
+++ b/gcc/config/i386/gcc-auto-profile
@@ -1,7 +1,7 @@
 #!/bin/sh
-# profile workload for gcc profile feedback (autofdo) using Linux perf
-# auto generated. to regenerate for new CPUs run
-# contrib/gen_autofdo_event.py --shell --all in gcc source
+# Profile workload for gcc profile feedback (autofdo) using Linux perf.
+# Auto generated. To regenerate for new CPUs run
+# contrib/gen_autofdo_event.py --script --all in gcc source
 
 # usages:
 # gcc-auto-profile program (profile program and children)
@@ -10,7 +10,7 @@
 # gcc-auto-profile --kernel -a sleep X (profile kernel)
 # gcc-auto-profile --all -a sleep X(profile kernel and user space)
 
-# identify branches taken event for CPU
+# Identify branches taken event for CPU.
 #
 
 FLAGS=u
@@ -37,7 +37,12 @@ case `egrep -q "^cpu family\s*: 6" /proc/cpuinfo &&
   egrep "^model\s*:" /proc/cpuinfo | head -n1` in
 model*:\ 55|\
 model*:\ 77|\
-model*:\ 76) E="cpu/event=0xC4,umask=0xFE/p$FLAGS" ;;
+model*:\ 76|\
+model*:\ 92|\

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-06-24 Thread Eugene Rozenfeld via Gcc
Hi Andy,

I'm trying to revive autofdo testing. One of the issues I'm running into with 
my setup is that PEBS doesn't work for with perf record even though PEBS is 
enabled.
I'm running Ubuntu 20.04 in a Hyper-V virtual machine; the processor is Icelake 
(GenuineIntel-6-7E).

I did the following:

1. Enabled pmu, lbr, and pebs in my Hyper-V virtual machine as described in 
https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/manage/performance-monitoring-hardware
2. Verified that pmu, lbr, and pebs are enabled in the vm by running 
erozen@erozen-Virtual-Machine:~/objdir/gcc$ dmesg | egrep -i 'pmu'
[0.266474] Performance Events: PEBS fmt4+, Icelake events, 32-deep 
LBR, full-width counters, Intel PMU driver.
3. Ran
erozen@erozen-Virtual-Machine:~/objdir/gcc$ perf record -e 
cpu/event=0xc4,umask=0x20/pu -b -m8 true -v
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) 
for event (cpu/event=0xc4,umask=0x20/pu).
/bin/dmesg | grep -i perf may provide additional information.

Omitting /p works fine:
erozen@erozen-Virtual-Machine:~/objdir/gcc$ perf record -e 
cpu/event=0xc4,umask=0x20/u -b -m8 true -v
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 0.007 MB perf.data (11 samples) ]

Is there a way to get PEBS working with perf record in a vm? I would appreciate 
any pointers on how to investigate this.

The version of perf I'm using is 5.8.18.

Thanks,

Eugene

-Original Message-
From: Andi Kleen  
Sent: Friday, April 30, 2021 2:46 PM
To: Eugene Rozenfeld via Gcc 
Cc: Xinliang David Li ; Richard Biener 
; Eugene Rozenfeld 
; Jan Hubicka 
Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC

Eugene Rozenfeld via Gcc  writes:

> Is the format produced by create_gcov and expected by GCC under 
> -fauto-rpofile documented somewhere? How is it different from .gcda 
> used in FDO, e.g., as described here:
> https://nam06.safelinks.protection.outlook.com/?url=http%3A%2F%2Fsrc.gnu-darwin.org%2Fsrc%2Fcontrib%2Fgcc%2Fgcov-io.h.htmldata=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C6c14ea3d93c44364845008d90c214cb6%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637554159427749575%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=q9ROiOTma41UeQ%2FQG%2BUktEOrHAWonTTpcPRPmx%2Fgw0g%3Dreserved=0?

I believe it's very similar.

> I would prefer that AutoFDO is not removed from GCC and it would be 
> helpful if create_gcov were restored in google/autofdo. I checked out 
> a revision before the recent merge and tried it on a simple example 
> and it seems to work.
> I'm also interested in contributing improvements for AutoFDO so will 
> try to investigate 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
> gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D71672data=04%7C01%7CEuge
> ne.Rozenfeld%40microsoft.com%7C6c14ea3d93c44364845008d90c214cb6%7C72f9
> 88bf86f141af91ab2d7cd011db47%7C1%7C0%7C637554159427749575%7CUnknown%7C
> TWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVC
> I6Mn0%3D%7C1000sdata=99Igueuxq7AoHU%2B20BZs4E4K5rgdPFCiR8eygKaJdK
> E%3Dreserved=0 and 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.
> gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D81379data=04%7C01%7CEuge
> ne.Rozenfeld%40microsoft.com%7C6c14ea3d93c44364845008d90c214cb6%7C72f9
> 88bf86f141af91ab2d7cd011db47%7C1%7C0%7C637554159427759566%7CUnknown%7C
> TWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVC
> I6Mn0%3D%7C1000sdata=Es91Dtt5Wt6%2BJtPWxHhkHqdWBVwzCiF5PcuXoHjY%2
> Bzs%3Dreserved=0

That would be great.

-Andi


RE: [PATCH][pushed] gcov: update documentation entry about string format

2021-06-21 Thread Eugene Rozenfeld via Gcc-patches
Thank you for updating the documentation Martin.

The following line can now be removed:

padding: | char:0 | char:0 char:0 | char:0 char:0 char:0

Eugene

-Original Message-
From: Gcc-patches  On 
Behalf Of Martin Liška
Sent: Thursday, June 17, 2021 2:40 AM
To: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] [PATCH][pushed] gcov: update documentation entry about 
string format

gcc/ChangeLog:

* gcov-io.h: Update documentation entry about string format.
---
  gcc/gcov-io.h | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h index f7584eb9679..ff92afe63df 100644
--- a/gcc/gcov-io.h
+++ b/gcc/gcov-io.h
@@ -42,15 +42,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
  
 Numbers are recorded in the 32 bit unsigned binary form of the
 endianness of the machine generating the file. 64 bit numbers are
-   stored as two 32 bit numbers, the low part first.  Strings are
-   padded with 1 to 4 NUL bytes, to bring the length up to a multiple
-   of 4. The number of 4 bytes is stored, followed by the padded
+   stored as two 32 bit numbers, the low part first.
+   The number of bytes is stored, followed by the
 string. Zero length and NULL strings are simply stored as a length
 of zero (they have no trailing NUL or padding).
  
int32:  byte3 byte2 byte1 byte0 | byte0 byte1 byte2 byte3
int64:  int32:low int32:high
-   string: int32:0 | int32:length char* char:0 padding
+   string: int32:0 | int32:length char* char:0
padding: | char:0 | char:0 char:0 | char:0 char:0 char:0
item: int32 | int64 | string
  
--
2.32.0



RE: [EXTERNAL] [PATCH] gcov: Use system IO buffering

2021-06-17 Thread Eugene Rozenfeld via Gcc-patches
Thank you for your reply Martin!

AUTO_PROFILE_VERSION should also be changed. Then create_gcov can be updated to 
support both the old format and the new format.

Eugene

-Original Message-
From: Martin Liška  
Sent: Thursday, June 17, 2021 2:38 AM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] [PATCH] gcov: Use system IO buffering

On 6/17/21 3:59 AM, Eugene Rozenfeld wrote:
> |The commit from this patch 
> (https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fgit%2F%3Fp%3Dgcc.git%3Ba%3Dcommit%3Bh%3D23eb66d1d46a34cb28c4acbdf8a1deb80a7c5a05data=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C508d63026ea84be211cc08d9317395bb%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637595194782996821%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=UG%2B41tXMZ94%2Ff80qCnmq%2BtZsFkLXc9NrdWF8KXwPjnk%3Dreserved=0)
>  changed the semantics of gcov_read_string and gcov_write_string. Before this 
> change the string storage was as described in gcov-io.h: "Strings are padded 
> with 1 to 4 NUL bytes, to bring the length up to a multiple of 4. The number 
> of 4 bytes is stored, followed by the padded string." After this change the 
> number before the string indicates the number of bytes (not words) and there 
> is no padding. Was this file format change intentional?

Hello.

Thanks for heads up!

Yes, the change was intentional and I'm going to update documentation entry in 
gcov-io.h.

> It breaks AutoFDO because create_gcov produces strings in the format 
> specified in gcov-io.h. Thanks, Eugene

Sure, that needs to be adjusted.

Martin



RE: [EXTERNAL] [PATCH] gcov: Use system IO buffering

2021-06-16 Thread Eugene Rozenfeld via Gcc-patches
The commit from this patch 
(https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=23eb66d1d46a34cb28c4acbdf8a1deb80a7c5a05)
 changed the semantics of gcov_read_string and gcov_write_string. Before this 
change the string storage was as described in gcov-io.h:

"Strings are
   padded with 1 to 4 NUL bytes, to bring the length up to a multiple
   of 4. The number of 4 bytes is stored, followed by the padded
   string."

After this change the number before the string indicates the number of bytes 
(not words) and there is no padding.

Was this file format change intentional? It breaks AutoFDO because create_gcov 
produces strings in the format specified in gcov-io.h.

Thanks,

Eugene

-Original Message-
From: Gcc-patches  On Behalf Of Martin Liška
Sent: Wednesday, April 21, 2021 12:52 AM
To: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] [PATCH] gcov: Use system IO buffering

Hey.

I/O buffering in gcov seems duplicite to what modern C library can provide.
The patch is a simplification and can provide easier interface for system that 
don't have a filesystem and would like using GCOV.

I'm going to install the patch after 11.1 if there are no objections.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin

gcc/ChangeLog:

* gcov-io.c (gcov_write_block): Remove.
(gcov_write_words): Likewise.
(gcov_read_words): Re-implement using gcov_read_bytes.
(gcov_allocate): Remove.
(GCOV_BLOCK_SIZE): Likewise.
(struct gcov_var): Remove most of the fields.
(gcov_position): Implement with ftell.
(gcov_rewrite): Remove setting of start and offset fields.
(from_file): Re-format.
(gcov_open): Remove setbuf call. It should not be needed.
(gcov_close): Remove internal buffer handling.
(gcov_magic): Use __builtin_bswap32.
(gcov_write_counter): Use directly gcov_write_unsigned.
(gcov_write_string): Use direct fwrite and do not round
to 4 bytes.
(gcov_seek): Use directly fseek.
(gcov_write_tag): Use gcov_write_unsigned directly.
(gcov_write_length): Likewise.
(gcov_write_tag_length): Likewise.
(gcov_read_bytes): Use directly fread.
(gcov_read_unsigned): Use gcov_read_words.
(gcov_read_counter): Likewise.
(gcov_read_string): Use gcov_read_bytes.
* gcov-io.h (GCOV_WORD_SIZE): Adjust to reflect
that size is not in bytes, but words (4B).
(GCOV_TAG_FUNCTION_LENGTH): Likewise.
(GCOV_TAG_ARCS_LENGTH): Likewise.
(GCOV_TAG_ARCS_NUM): Likewise.
(GCOV_TAG_COUNTER_LENGTH): Likewise.
(GCOV_TAG_COUNTER_NUM): Likewise.
(GCOV_TAG_SUMMARY_LENGTH): Likewise.

libgcc/ChangeLog:

* libgcov-driver.c: Fix GNU coding style.
---
 gcc/gcov-io.c   | 282 +---
 gcc/gcov-io.h   |  17 ++-
 libgcc/libgcov-driver.c |   6 +-
 3 files changed, 76 insertions(+), 229 deletions(-)

diff --git a/gcc/gcov-io.c b/gcc/gcov-io.c index 80c9082a649..bd2316dedab 100644
--- a/gcc/gcov-io.c
+++ b/gcc/gcov-io.c
@@ -27,40 +27,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 /* Routines declared in gcov-io.h.  This file should be #included by
another source file, after having #included gcov-io.h.  */
 
-#if !IN_GCOV
-static void gcov_write_block (unsigned); -static gcov_unsigned_t 
*gcov_write_words (unsigned); -#endif -static const gcov_unsigned_t 
*gcov_read_words (unsigned); -#if !IN_LIBGCOV -static void gcov_allocate 
(unsigned); -#endif
-
-/* Optimum number of gcov_unsigned_t's read from or written to disk.  */ 
-#define GCOV_BLOCK_SIZE (1 << 10)
+static gcov_unsigned_t *gcov_read_words (void *buffer, unsigned);
 
 struct gcov_var
 {
   FILE *file;
-  gcov_position_t start;   /* Position of first byte of block */
-  unsigned offset; /* Read/write position within the block.  */
-  unsigned length; /* Read limit in the block.  */
-  unsigned overread;   /* Number of words overread.  */
   int error;   /* < 0 overflow, > 0 disk error.  */
-  int mode;/* < 0 writing, > 0 reading */
+  int mode;/* < 0 writing, > 0 reading.  */
   int endian;  /* Swap endianness.  */
-#if IN_LIBGCOV
-  /* Holds one block plus 4 bytes, thus all coverage reads & writes
- fit within this buffer and we always can transfer GCOV_BLOCK_SIZE
- to and from the disk. libgcov never backtracks and only writes 4
- or 8 byte objects.  */
-  gcov_unsigned_t buffer[GCOV_BLOCK_SIZE + 1]; -#else
-  /* Holds a variable length block, as the compiler can write
- strings and needs to backtrack.  */
-  size_t alloc;
-  gcov_unsigned_t *buffer;
-#endif
 } gcov_var;
 
 /* Save the current position in the gcov file.  */ @@ -71,8 +45,7 @@ static 
inline  gcov_position_t  gcov_position (void)  {
-  gcov_nonruntime_assert (gcov_var.mode 

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-06-11 Thread Eugene Rozenfeld via Gcc
It appears that create_gcov doesn't support binaries with dwarf version 5 
(which is the current default). 
I tried a trivial example and got reasonable gcov files for binaries with dwarf 
v2, v3, and v4 but the same example with dwarf v5 produced 
"File './sort' has mangled .debug_info section."
and a gcov file with 0 functions.

Does create_llvm_prof has the same limitation?

Eugene

-Original Message-
From: Wei Mi  
Sent: Wednesday, May 26, 2021 7:52 PM
To: Eugene Rozenfeld 
Cc: Andi Kleen ; Hongtao Yu ; Xinliang David 
Li ; Jan Hubicka ; gcc@gcc.gnu.org; Wenlei 
He 
Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC

Thanks. Good to know the build works with newer protobuf.

Wei.

On Wed, May 26, 2021 at 4:40 PM Eugene Rozenfeld 
 wrote:
>
> 3.0.0 is the latest supported version on Ubuntu 18.04. I verified that the 
> build works on Ubuntu 20.04 with  protobuf-compiler and libprotobuf-dev 
> version 3.6.1.3.
>
> Eugene
>
> -Original Message-
> From: Wei Mi 
> Sent: Tuesday, May 25, 2021 8:07 PM
> To: Eugene Rozenfeld 
> Cc: Andi Kleen ; Hongtao Yu ; Xinliang 
> David Li ; Jan Hubicka ; 
> gcc@gcc.gnu.org; Wenlei He 
> Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC
>
> I checked the source of protobuf 3.0.0 and it didn't contain the operator[] 
> in RepeatedField. Need to install a newer version of protobuf.
>
> Thanks,
> Wei.
>
> On Tue, May 25, 2021 at 1:49 PM Eugene Rozenfeld 
>  wrote:
> >
> > Both are 3.0.0-9.1ubuntu1:
> >
> > eugene@eugene-Virtual-Machine:~/autofdo1/build$ apt list 
> > protobuf-compiler Listing... Done protobuf-compiler/bionic,now
> > 3.0.0-9.1ubuntu1 amd64 [installed]
> > eugene@eugene-Virtual-Machine:~/autofdo1/build$ apt list 
> > libprotobuf-dev Listing... Done libprotobuf-dev/bionic,now
> > 3.0.0-9.1ubuntu1 amd64 [installed]
> >
> > -Original Message-
> > From: Wei Mi 
> > Sent: Tuesday, May 25, 2021 9:17 AM
> > To: Eugene Rozenfeld 
> > Cc: Andi Kleen ; Hongtao Yu ; 
> > Xinliang David Li ; Jan Hubicka 
> > ; gcc@gcc.gnu.org; Wenlei He 
> > Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC
> >
> > It looks like some version problem about protobuf-compiler and 
> > libprotobuf-dev. Could you check what is the installed version on your end 
> > for those two packages and see if they are consistent?
> >
> > On my platform, they are both 3.12.4.
> >
> > On Tue, May 25, 2021 at 12:01 AM Eugene Rozenfeld 
> >  wrote:
> > >
> > > That eliminates the previous error but there is a new one:
> > >
> > > eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja [3/199] 
> > > Building CXX object 
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/qu
> > > ip
> > > pe
> > > r/perf_reader.cc.o
> > > FAILED: 
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
> > > /usr/bin/c++   -I../third_party/perf_data_converter/src 
> > > -I../third_party/perf_data_converter/src/quipper -I../ 
> > > -I../third_party/glog/src -I../third_party/abseil -I../util -I. 
> > > -Ithird_party/glog -std=gnu++1z -MD -MT 
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
> > >  -MF 
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o.d
> > >  -o 
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
> > >  -c ../third_party/perf_data_converter/src/quipper/perf_reader.cc
> > > ../third_party/perf_data_converter/src/quipper/perf_reader.cc: In member 
> > > function 'bool 
> > > quipper::PerfReader::ReadCPUTopologyMetadata(quipper::DataReader*, 
> > > size_t)':
> > > ../third_party/perf_data_converter/src/quipper/perf_reader.cc:1518:46: 
> > > error: no match for 'operator[]' (operand types are 'const 
> > > google::protobuf::RepeatedField' and 'int')
> > >  nrcpus = proto_uint32_metadata.data()[0];
> > >
> > > -Original Message-
> > > From: Wei Mi 
> > > Sent: Monday, May 24, 2021 8:54 PM
> > > To: Eugene Rozenfeld 
> > > Cc: Andi Kleen ; Hongtao Yu ; 
> > > Xinliang David Li ; Jan Hubicka 
> > > ; gcc@gcc.gnu.org; Wenlei He 
> > > Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC
> > >
> > > It isn't exposed on my platform either. Looks like a bug in 
> > > perf_data_converter (i.e., quipper). Could you try adding #include 
> > >  in 
> > > third_party/perf_data_converter/src/quipper/huge_page_deducer.cc and see 
> > > if it fixes the problem? If it works, I will need to file a bug against 
> > > perf_data_converter.
> > >
> > > Thanks,
> > > Wei.
> > >
> > > On Mon, May 24, 2021 at 8:33 PM Eugene Rozenfeld 
> > >  wrote:
> > > >
> > > > That fixed the error I saw before but the build still fails. The 
> > > > errors start with
> > > >
> > > >
> > > >
> > > > eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
> > > >
> > > > [2/217] Building CXX object
> > > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/
> > > > qu
> > > > ip
> > > > pe
> > > > 

[PATCH 1/1] Fix a typo in an AutoFDO error string

2021-06-11 Thread Eugene Rozenfeld via Gcc-patches
Trivial change, committed to trunk.

[PATCH 1/1] Fix a typo in an AutoFDO error string

gcc/ChangeLog:

* auto-profile.c (read_profile): fix a typo in an error string
---
 gcc/auto-profile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
index 2a6d9a1fc24..a4601243dc9 100644
--- a/gcc/auto-profile.c
+++ b/gcc/auto-profile.c
@@ -939,7 +939,7 @@ read_profile (void)
   unsigned version = gcov_read_unsigned ();
   if (version != AUTO_PROFILE_VERSION)
 {
-  error ("AutoFDO profile version %u does match %u",
+  error ("AutoFDO profile version %u does not match %u",
 version, AUTO_PROFILE_VERSION);
   return;
 }
-- 
2.25.1



RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-05-26 Thread Eugene Rozenfeld via Gcc
3.0.0 is the latest supported version on Ubuntu 18.04. I verified that the 
build works on Ubuntu 20.04 with  protobuf-compiler and libprotobuf-dev version 
3.6.1.3.

Eugene

-Original Message-
From: Wei Mi  
Sent: Tuesday, May 25, 2021 8:07 PM
To: Eugene Rozenfeld 
Cc: Andi Kleen ; Hongtao Yu ; Xinliang David 
Li ; Jan Hubicka ; gcc@gcc.gnu.org; Wenlei 
He 
Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC

I checked the source of protobuf 3.0.0 and it didn't contain the operator[] in 
RepeatedField. Need to install a newer version of protobuf.

Thanks,
Wei.

On Tue, May 25, 2021 at 1:49 PM Eugene Rozenfeld 
 wrote:
>
> Both are 3.0.0-9.1ubuntu1:
>
> eugene@eugene-Virtual-Machine:~/autofdo1/build$ apt list 
> protobuf-compiler Listing... Done protobuf-compiler/bionic,now 
> 3.0.0-9.1ubuntu1 amd64 [installed] 
> eugene@eugene-Virtual-Machine:~/autofdo1/build$ apt list 
> libprotobuf-dev Listing... Done libprotobuf-dev/bionic,now 
> 3.0.0-9.1ubuntu1 amd64 [installed]
>
> -Original Message-
> From: Wei Mi 
> Sent: Tuesday, May 25, 2021 9:17 AM
> To: Eugene Rozenfeld 
> Cc: Andi Kleen ; Hongtao Yu ; Xinliang 
> David Li ; Jan Hubicka ; 
> gcc@gcc.gnu.org; Wenlei He 
> Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC
>
> It looks like some version problem about protobuf-compiler and 
> libprotobuf-dev. Could you check what is the installed version on your end 
> for those two packages and see if they are consistent?
>
> On my platform, they are both 3.12.4.
>
> On Tue, May 25, 2021 at 12:01 AM Eugene Rozenfeld 
>  wrote:
> >
> > That eliminates the previous error but there is a new one:
> >
> > eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja [3/199] 
> > Building CXX object 
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quip
> > pe
> > r/perf_reader.cc.o
> > FAILED: 
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
> > /usr/bin/c++   -I../third_party/perf_data_converter/src 
> > -I../third_party/perf_data_converter/src/quipper -I../ 
> > -I../third_party/glog/src -I../third_party/abseil -I../util -I. 
> > -Ithird_party/glog -std=gnu++1z -MD -MT 
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
> >  -MF 
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o.d
> >  -o 
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
> >  -c ../third_party/perf_data_converter/src/quipper/perf_reader.cc
> > ../third_party/perf_data_converter/src/quipper/perf_reader.cc: In member 
> > function 'bool 
> > quipper::PerfReader::ReadCPUTopologyMetadata(quipper::DataReader*, size_t)':
> > ../third_party/perf_data_converter/src/quipper/perf_reader.cc:1518:46: 
> > error: no match for 'operator[]' (operand types are 'const 
> > google::protobuf::RepeatedField' and 'int')
> >  nrcpus = proto_uint32_metadata.data()[0];
> >
> > -Original Message-
> > From: Wei Mi 
> > Sent: Monday, May 24, 2021 8:54 PM
> > To: Eugene Rozenfeld 
> > Cc: Andi Kleen ; Hongtao Yu ; 
> > Xinliang David Li ; Jan Hubicka 
> > ; gcc@gcc.gnu.org; Wenlei He 
> > Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC
> >
> > It isn't exposed on my platform either. Looks like a bug in 
> > perf_data_converter (i.e., quipper). Could you try adding #include 
> >  in 
> > third_party/perf_data_converter/src/quipper/huge_page_deducer.cc and see if 
> > it fixes the problem? If it works, I will need to file a bug against 
> > perf_data_converter.
> >
> > Thanks,
> > Wei.
> >
> > On Mon, May 24, 2021 at 8:33 PM Eugene Rozenfeld 
> >  wrote:
> > >
> > > That fixed the error I saw before but the build still fails. The 
> > > errors start with
> > >
> > >
> > >
> > > eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
> > >
> > > [2/217] Building CXX object
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/qu
> > > ip
> > > pe
> > > r/huge_page_deducer.cc.o
> > >
> > > FAILED:
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/qu
> > > ip
> > > pe
> > > r/huge_page_deducer.cc.o
> > >
> > > /usr/bin/c++   -I../third_party/perf_data_converter/src 
> > > -I../third_party/perf_data_converter/src/quipper -I../ 
> > > -I../third_party/glog/src -I../third_party/abseil -I../util -I. 
> > > -Ithird_party/glog -std=gnu++1z -MD -MT 
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
> > >  -MF 
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o.d
> > >  -o 
> > > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
> > >  -c ../third_party/perf_data_converter/src/quipper/huge_page_deducer.cc
> > >
> > > ../third_party/perf_data_converter/src/quipper/huge_page_deducer.cc:
> > > 14
> > > 4:26: error: 'unordered_map' in namespace 'std' does not name a 
> > > template type
> > >
> > 

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-05-25 Thread Eugene Rozenfeld via Gcc
Both are 3.0.0-9.1ubuntu1:

eugene@eugene-Virtual-Machine:~/autofdo1/build$ apt list protobuf-compiler
Listing... Done
protobuf-compiler/bionic,now 3.0.0-9.1ubuntu1 amd64 [installed]
eugene@eugene-Virtual-Machine:~/autofdo1/build$ apt list libprotobuf-dev
Listing... Done
libprotobuf-dev/bionic,now 3.0.0-9.1ubuntu1 amd64 [installed]

-Original Message-
From: Wei Mi  
Sent: Tuesday, May 25, 2021 9:17 AM
To: Eugene Rozenfeld 
Cc: Andi Kleen ; Hongtao Yu ; Xinliang David 
Li ; Jan Hubicka ; gcc@gcc.gnu.org; Wenlei 
He 
Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC

It looks like some version problem about protobuf-compiler and libprotobuf-dev. 
Could you check what is the installed version on your end for those two 
packages and see if they are consistent?

On my platform, they are both 3.12.4.

On Tue, May 25, 2021 at 12:01 AM Eugene Rozenfeld 
 wrote:
>
> That eliminates the previous error but there is a new one:
>
> eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja [3/199] Building 
> CXX object 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quippe
> r/perf_reader.cc.o
> FAILED: 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
> /usr/bin/c++   -I../third_party/perf_data_converter/src 
> -I../third_party/perf_data_converter/src/quipper -I../ 
> -I../third_party/glog/src -I../third_party/abseil -I../util -I. 
> -Ithird_party/glog -std=gnu++1z -MD -MT 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
>  -MF 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o.d
>  -o 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
>  -c ../third_party/perf_data_converter/src/quipper/perf_reader.cc
> ../third_party/perf_data_converter/src/quipper/perf_reader.cc: In member 
> function 'bool 
> quipper::PerfReader::ReadCPUTopologyMetadata(quipper::DataReader*, size_t)':
> ../third_party/perf_data_converter/src/quipper/perf_reader.cc:1518:46: error: 
> no match for 'operator[]' (operand types are 'const 
> google::protobuf::RepeatedField' and 'int')
>  nrcpus = proto_uint32_metadata.data()[0];
>
> -Original Message-
> From: Wei Mi 
> Sent: Monday, May 24, 2021 8:54 PM
> To: Eugene Rozenfeld 
> Cc: Andi Kleen ; Hongtao Yu ; Xinliang 
> David Li ; Jan Hubicka ; 
> gcc@gcc.gnu.org; Wenlei He 
> Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC
>
> It isn't exposed on my platform either. Looks like a bug in 
> perf_data_converter (i.e., quipper). Could you try adding #include 
>  in 
> third_party/perf_data_converter/src/quipper/huge_page_deducer.cc and see if 
> it fixes the problem? If it works, I will need to file a bug against 
> perf_data_converter.
>
> Thanks,
> Wei.
>
> On Mon, May 24, 2021 at 8:33 PM Eugene Rozenfeld 
>  wrote:
> >
> > That fixed the error I saw before but the build still fails. The 
> > errors start with
> >
> >
> >
> > eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
> >
> > [2/217] Building CXX object
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quip
> > pe
> > r/huge_page_deducer.cc.o
> >
> > FAILED:
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quip
> > pe
> > r/huge_page_deducer.cc.o
> >
> > /usr/bin/c++   -I../third_party/perf_data_converter/src 
> > -I../third_party/perf_data_converter/src/quipper -I../ 
> > -I../third_party/glog/src -I../third_party/abseil -I../util -I. 
> > -Ithird_party/glog -std=gnu++1z -MD -MT 
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
> >  -MF 
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o.d
> >  -o 
> > CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
> >  -c ../third_party/perf_data_converter/src/quipper/huge_page_deducer.cc
> >
> > ../third_party/perf_data_converter/src/quipper/huge_page_deducer.cc:
> > 14
> > 4:26: error: 'unordered_map' in namespace 'std' does not name a 
> > template type
> >
> >using container = std::unordered_map;
> >
> >   ^
> >
> >
> >
> >
> >
> >
> >
> > From: Wei Mi 
> > Sent: Monday, May 24, 2021 8:12 PM
> > To: Eugene Rozenfeld 
> > Cc: Andi Kleen ; Hongtao Yu ; 
> > Xinliang David Li ; Jan Hubicka 
> > ; gcc@gcc.gnu.org; Wenlei He 
> > Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC
> >
> >
> >
> > Sorry, I added dependency for create_gcov but missed it for dump_gcov. 
> > Fixed it at 
> > 

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-05-25 Thread Eugene Rozenfeld via Gcc
That eliminates the previous error but there is a new one:

eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
[3/199] Building CXX object 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
FAILED: 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
 
/usr/bin/c++   -I../third_party/perf_data_converter/src 
-I../third_party/perf_data_converter/src/quipper -I../ 
-I../third_party/glog/src -I../third_party/abseil -I../util -I. 
-Ithird_party/glog -std=gnu++1z -MD -MT 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
 -MF 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o.d
 -o 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/perf_reader.cc.o
 -c ../third_party/perf_data_converter/src/quipper/perf_reader.cc
../third_party/perf_data_converter/src/quipper/perf_reader.cc: In member 
function 'bool 
quipper::PerfReader::ReadCPUTopologyMetadata(quipper::DataReader*, size_t)':
../third_party/perf_data_converter/src/quipper/perf_reader.cc:1518:46: error: 
no match for 'operator[]' (operand types are 'const 
google::protobuf::RepeatedField' and 'int')
 nrcpus = proto_uint32_metadata.data()[0];

-Original Message-
From: Wei Mi  
Sent: Monday, May 24, 2021 8:54 PM
To: Eugene Rozenfeld 
Cc: Andi Kleen ; Hongtao Yu ; Xinliang David 
Li ; Jan Hubicka ; gcc@gcc.gnu.org; Wenlei 
He 
Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC

It isn't exposed on my platform either. Looks like a bug in perf_data_converter 
(i.e., quipper). Could you try adding #include  in 
third_party/perf_data_converter/src/quipper/huge_page_deducer.cc and see if it 
fixes the problem? If it works, I will need to file a bug against 
perf_data_converter.

Thanks,
Wei.

On Mon, May 24, 2021 at 8:33 PM Eugene Rozenfeld 
 wrote:
>
> That fixed the error I saw before but the build still fails. The 
> errors start with
>
>
>
> eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
>
> [2/217] Building CXX object 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quippe
> r/huge_page_deducer.cc.o
>
> FAILED: 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quippe
> r/huge_page_deducer.cc.o
>
> /usr/bin/c++   -I../third_party/perf_data_converter/src 
> -I../third_party/perf_data_converter/src/quipper -I../ 
> -I../third_party/glog/src -I../third_party/abseil -I../util -I. 
> -Ithird_party/glog -std=gnu++1z -MD -MT 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
>  -MF 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o.d
>  -o 
> CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
>  -c ../third_party/perf_data_converter/src/quipper/huge_page_deducer.cc
>
> ../third_party/perf_data_converter/src/quipper/huge_page_deducer.cc:14
> 4:26: error: 'unordered_map' in namespace 'std' does not name a 
> template type
>
>using container = std::unordered_map;
>
>   ^
>
>
>
>
>
>
>
> From: Wei Mi 
> Sent: Monday, May 24, 2021 8:12 PM
> To: Eugene Rozenfeld 
> Cc: Andi Kleen ; Hongtao Yu ; Xinliang 
> David Li ; Jan Hubicka ; 
> gcc@gcc.gnu.org; Wenlei He 
> Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC
>
>
>
> Sorry, I added dependency for create_gcov but missed it for dump_gcov. Fixed 
> it at 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fgoogle%2Fautofdo%2Fcommit%2F6ca36cdc30986f13583a3aef3e27746ca4fc5bf6data=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C1bcb2fda4fce4f173c1808d91f30d1fc%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637575116816277204%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=hwJ%2BG64Yw%2BEGMAV7kzilOjAOkHoOv3TQpdqkzcHGO%2FM%3Dreserved=0.
>
>
>
> Thanks,
>
> Wei.
>
>
>
> On Mon, May 24, 2021 at 6:39 PM Eugene Rozenfeld 
>  wrote:
>
> Thank you Wei. Looks like something is still missing. This time 
> perf_data.pb.h is not found. I'm getting the error below (on Ubuntu 18.04 
> with cmake 3.12.1):
>
> eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja [1/241] Building 
> CXX object CMakeFiles/dump_gcov_lib.dir/profile.cc.o
> FAILED: CMakeFiles/dump_gcov_lib.dir/profile.cc.o
> /usr/bin/c++   -I../ -I../third_party/glog/src -I../third_party/abseil 
> -I../third_party/perf_data_converter/src 
> -I../third_party/perf_data_converter/src/quipper -I../util -I. 
> -Ithird_party/glog -std=gnu++1z -MD -MT 
> CMakeFiles/dump_gcov_lib.dir/profile.cc.o -MF 
> CMakeFiles/dump_gcov_lib.dir/profile.cc.o.d -o 
> CMakeFiles/dump_gcov_lib.dir/profile.cc.o -c ../profile.cc
> In file included from 
> ../third_party/perf_data_converter/src/quipper/perf_parser.h:18:0,
>  from ../sample_reader.h:18,
>  from ../profile.h:15,
>  

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-05-24 Thread Eugene Rozenfeld via Gcc
That fixed the error I saw before but the build still fails. The errors start 
with

eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
[2/217] Building CXX object 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
FAILED: 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
/usr/bin/c++   -I../third_party/perf_data_converter/src 
-I../third_party/perf_data_converter/src/quipper -I../ 
-I../third_party/glog/src -I../third_party/abseil -I../util -I. 
-Ithird_party/glog -std=gnu++1z -MD -MT 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
 -MF 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o.d
 -o 
CMakeFiles/quipper_perf.dir/third_party/perf_data_converter/src/quipper/huge_page_deducer.cc.o
 -c ../third_party/perf_data_converter/src/quipper/huge_page_deducer.cc
../third_party/perf_data_converter/src/quipper/huge_page_deducer.cc:144:26: 
error: 'unordered_map' in namespace 'std' does not name a template type
   using container = std::unordered_map;
  ^



From: Wei Mi 
Sent: Monday, May 24, 2021 8:12 PM
To: Eugene Rozenfeld 
Cc: Andi Kleen ; Hongtao Yu ; Xinliang David 
Li ; Jan Hubicka ; gcc@gcc.gnu.org; Wenlei 
He 
Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC

Sorry, I added dependency for create_gcov but missed it for dump_gcov. Fixed it 
at 
https://github.com/google/autofdo/commit/6ca36cdc30986f13583a3aef3e27746ca4fc5bf6.

Thanks,
Wei.

On Mon, May 24, 2021 at 6:39 PM Eugene Rozenfeld 
mailto:eugene.rozenf...@microsoft.com>> wrote:
Thank you Wei. Looks like something is still missing. This time perf_data.pb.h 
is not found. I'm getting the error below (on Ubuntu 18.04 with cmake 3.12.1):

eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
[1/241] Building CXX object CMakeFiles/dump_gcov_lib.dir/profile.cc.o
FAILED: CMakeFiles/dump_gcov_lib.dir/profile.cc.o
/usr/bin/c++   -I../ -I../third_party/glog/src -I../third_party/abseil 
-I../third_party/perf_data_converter/src 
-I../third_party/perf_data_converter/src/quipper -I../util -I. 
-Ithird_party/glog -std=gnu++1z -MD -MT 
CMakeFiles/dump_gcov_lib.dir/profile.cc.o -MF 
CMakeFiles/dump_gcov_lib.dir/profile.cc.o.d -o 
CMakeFiles/dump_gcov_lib.dir/profile.cc.o -c ../profile.cc
In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:18:0,
 from ../sample_reader.h:18,
 from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/base/macros.h:8:0: warning: 
"DISALLOW_COPY_AND_ASSIGN" redefined
 #define DISALLOW_COPY_AND_ASSIGN(TypeName) \

In file included from ../profile.h:14:0,
 from ../profile.cc:5:
../base/macros.h:114:0: note: this is the location of the previous definition
 #define DISALLOW_COPY_AND_ASSIGN(TypeName) \

In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:18:0,
 from ../sample_reader.h:18,
 from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/base/macros.h:12:0: warning: 
"arraysize" redefined
 #define arraysize(x) (sizeof(x) / sizeof(*x))

In file included from ../profile.h:14:0,
 from ../profile.cc:5:
../base/macros.h:162:0: note: this is the location of the previous definition
 #define arraysize(array) (sizeof(ArraySizeHelper(array)))

In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:21:0,
 from ../sample_reader.h:18,
 from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/compat/proto.h:14:10: fatal 
error: perf_data.pb.h: No such file or directory
 #include "perf_data.pb.h"
  ^~~~
compilation terminated.
[6/241] Building CXX object CMakeFiles/dump_gcov_lib.dir/symbol_map.cc.o
ninja: build stopped: subcommand failed.

Thanks,

Eugene

From: Wei Mi mailto:w...@google.com>>
Sent: Saturday, May 22, 2021 9:37 AM
To: Eugene Rozenfeld 
mailto:eugene.rozenf...@microsoft.com>>
Cc: Andi Kleen mailto:a...@linux.intel.com>>; Hongtao Yu 
mailto:h...@fb.com>>; Xinliang David Li 
mailto:davi...@google.com>>; Jan Hubicka 
mailto:hubi...@ucw.cz>>; 
gcc@gcc.gnu.org; Wenlei He 
mailto:wen...@fb.com>>
Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC

It is a proto library build 

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-05-24 Thread Eugene Rozenfeld via Gcc
Thank you Wei. Looks like something is still missing. This time perf_data.pb.h 
is not found. I'm getting the error below (on Ubuntu 18.04 with cmake 3.12.1):

eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
[1/241] Building CXX object CMakeFiles/dump_gcov_lib.dir/profile.cc.o
FAILED: CMakeFiles/dump_gcov_lib.dir/profile.cc.o 
/usr/bin/c++   -I../ -I../third_party/glog/src -I../third_party/abseil 
-I../third_party/perf_data_converter/src 
-I../third_party/perf_data_converter/src/quipper -I../util -I. 
-Ithird_party/glog -std=gnu++1z -MD -MT 
CMakeFiles/dump_gcov_lib.dir/profile.cc.o -MF 
CMakeFiles/dump_gcov_lib.dir/profile.cc.o.d -o 
CMakeFiles/dump_gcov_lib.dir/profile.cc.o -c ../profile.cc
In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:18:0,
 from ../sample_reader.h:18,
 from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/base/macros.h:8:0: warning: 
"DISALLOW_COPY_AND_ASSIGN" redefined
 #define DISALLOW_COPY_AND_ASSIGN(TypeName) \
 
In file included from ../profile.h:14:0,
 from ../profile.cc:5:
../base/macros.h:114:0: note: this is the location of the previous definition
 #define DISALLOW_COPY_AND_ASSIGN(TypeName) \
 
In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:18:0,
 from ../sample_reader.h:18,
 from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/base/macros.h:12:0: warning: 
"arraysize" redefined
 #define arraysize(x) (sizeof(x) / sizeof(*x))
 
In file included from ../profile.h:14:0,
 from ../profile.cc:5:
../base/macros.h:162:0: note: this is the location of the previous definition
 #define arraysize(array) (sizeof(ArraySizeHelper(array)))
 
In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:21:0,
 from ../sample_reader.h:18,
 from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/compat/proto.h:14:10: fatal 
error: perf_data.pb.h: No such file or directory
 #include "perf_data.pb.h"
  ^~~~
compilation terminated.
[6/241] Building CXX object CMakeFiles/dump_gcov_lib.dir/symbol_map.cc.o
ninja: build stopped: subcommand failed.

Thanks,

Eugene

From: Wei Mi  
Sent: Saturday, May 22, 2021 9:37 AM
To: Eugene Rozenfeld 
Cc: Andi Kleen ; Hongtao Yu ; Xinliang David 
Li ; Jan Hubicka ; gcc@gcc.gnu.org; Wenlei 
He 
Subject: Re: [EXTERNAL] Re: State of AutoFDO in GCC

It is a proto library build dependency issue which didn't expose on my 
platform. I fix it at 
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fgoogle%2Fautofdo%2Fcommit%2F98269aee9674cc885cc5eb1bd917eb2d12731710=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C414e79d9b075428ceffc08d91d3fd7ff%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637572982342270322%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000=TxMivTXj9w0W5OFLunWcOpqf9wefrrUd0m3RqV8JLaA%3D=0.
 Please try again.

Thanks,
Wei.

On Fri, May 21, 2021 at 6:28 PM Eugene Rozenfeld 
 wrote:
I tried following the instructions in "2.2 Build autofdo tool for gcc" in 
https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fgoogle%2Fautofdo%23readme=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C414e79d9b075428ceffc08d91d3fd7ff%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637572982342280318%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000=KBPIcf5k1v230tbLke9%2FdSt25nbPrHhpBvPb8PiebPY%3D=0and
 got build failures:
 
eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
[1/228] Building CXX object CMakeFiles/create_gcov_lib.dir/profile.cc.o
FAILED: CMakeFiles/create_gcov_lib.dir/profile.cc.o 
/usr/bin/c++  -I../ -I../third_party/glog/src -I../third_party/abseil 
-I../third_party/perf_data_converter/src 
-I../third_party/perf_data_converter/src/quipper -I../util -I. 
-Ithird_party/glog -std=gnu++1z -MD -MT 
CMakeFiles/create_gcov_lib.dir/profile.cc.o -MF 
CMakeFiles/create_gcov_lib.dir/profile.cc.o.d -o 
CMakeFiles/create_gcov_lib.dir/profile.cc.o -c ../profile.cc
In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:18:0,
 from ../sample_reader.h:18,
     from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/base/macros.h:8:0: warning: 
"DISALLOW_COPY_AND_ASSIGN" redefined
#define DISALLOW_COPY_AND_ASSIGN(TypeName) \
In file included from ../profile.h:14:0,
 from ../profile.cc:5:
../base/macros.h:114:0: note: this is the location of the previous definition
#define DISALLOW_COPY_AND_ASSIGN(TypeName) \
In file included from 

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-05-21 Thread Eugene Rozenfeld via Gcc
I tried following the instructions in "2.2 Build autofdo tool for gcc" in 
https://github.com/google/autofdo#readme
and got build failures:

eugene@eugene-Virtual-Machine:~/autofdo1/build$ ninja
[1/228] Building CXX object CMakeFiles/create_gcov_lib.dir/profile.cc.o
FAILED: CMakeFiles/create_gcov_lib.dir/profile.cc.o
/usr/bin/c++  -I../ -I../third_party/glog/src -I../third_party/abseil 
-I../third_party/perf_data_converter/src 
-I../third_party/perf_data_converter/src/quipper -I../util -I. 
-Ithird_party/glog -std=gnu++1z -MD -MT 
CMakeFiles/create_gcov_lib.dir/profile.cc.o -MF 
CMakeFiles/create_gcov_lib.dir/profile.cc.o.d -o 
CMakeFiles/create_gcov_lib.dir/profile.cc.o -c ../profile.cc
In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:18:0,
 from ../sample_reader.h:18,
 from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/base/macros.h:8:0: warning: 
"DISALLOW_COPY_AND_ASSIGN" redefined
#define DISALLOW_COPY_AND_ASSIGN(TypeName) \
In file included from ../profile.h:14:0,
 from ../profile.cc:5:
../base/macros.h:114:0: note: this is the location of the previous definition
#define DISALLOW_COPY_AND_ASSIGN(TypeName) \
In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:18:0,
 from ../sample_reader.h:18,
 from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/base/macros.h:12:0: warning: 
"arraysize" redefined
#define arraysize(x) (sizeof(x) / sizeof(*x))
In file included from ../profile.h:14:0,
 from ../profile.cc:5:
../base/macros.h:162:0: note: this is the location of the previous definition
#define arraysize(array) (sizeof(ArraySizeHelper(array)))
In file included from 
../third_party/perf_data_converter/src/quipper/perf_parser.h:21:0,
 from ../sample_reader.h:18,
 from ../profile.h:15,
 from ../profile.cc:5:
../third_party/perf_data_converter/src/quipper/compat/proto.h:16:10: fatal 
error: perf_stat.pb.h: No such file or directory
#include "perf_stat.pb.h"
  ^~~~
compilation terminated.

What is supposed to generate perf_stat.pb.h?

Thanks,

Eugene

From: Wei Mi 
Sent: Monday, May 10, 2021 4:47 PM
To: Andi Kleen 
Cc: Hongtao Yu ; Xinliang David Li ; Jan 
Hubicka ; gcc@gcc.gnu.org; Eugene Rozenfeld 
; Wenlei He 
Subject: [EXTERNAL] Re: State of AutoFDO in GCC

https://github.com/google/autofdo
 has been updated. Now create_gcov/dump_gcov are added back and can be built 
separately.

Please look at "2.2 Build autofdo tool for gcc" in 
https://github.com/google/autofdo#readme

On Wed, Apr 28, 2021 at 10:40 PM Andi Kleen 
mailto:a...@linux.intel.com>> wrote:
>
> On Mon, Apr 26, 2021 at 06:40:56PM +, Hongtao Yu wrote:
> >Andi, thanks for pointing out the perf script issues. Can you please
> >elaborate a bit on the exact issue you have seen? We've been using
> >specific output of perf script such as mmap, LBR and callstack events
> >filtered by process id. It works fine so far but may certainly hit issues
> >in the future with extended uses.
>
> Okay I took a look at the latest autofdo now. It seems to be basically
> a LLVM project now that depends on LLVM to even build with all kinds
> of dependency hell on some old LLVM version and other packages.
>
> I guess gcc will really need a replacement that doesn't pull in
> all of LLVM if it wants to continue supporting autofdo.
>
> I'm myself unable to build now.
>
> I'm using the old version I had a git fork of and that
> was before all of this. I added a patch to make it work
> with the latest perf by ignoring increased perf_attr
> and unknown perf events.
>
> Honza please use
>
> 

RE: [EXTERNAL] Re: State of AutoFDO in GCC

2021-04-30 Thread Eugene Rozenfeld via Gcc
Is the format produced by create_gcov and expected by GCC under -fauto-rpofile 
documented somewhere? How is it different from .gcda used in FDO, e.g., as 
described here: http://src.gnu-darwin.org/src/contrib/gcc/gcov-io.h.html?
My input data is different from perf.data and I'd like to write a tool that 
produces the format needed for AutoFDO.

I would prefer that AutoFDO is not removed from GCC and it would be helpful if 
create_gcov were restored in google/autofdo. I checked out a revision before 
the recent merge and tried it on a simple example and it seems to work.
I'm also interested in contributing improvements for AutoFDO so will try to 
investigate https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71672 and 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81379

Thanks,

Eugene

From: Xinliang David Li 
Sent: Friday, April 23, 2021 9:36 AM
To: Richard Biener 
Cc: Jan Hubicka ; gcc@gcc.gnu.org; Eugene Rozenfeld 

Subject: [EXTERNAL] Re: State of AutoFDO in GCC



On Fri, Apr 23, 2021 at 12:00 AM Richard Biener 
mailto:richard.guent...@gmail.com>> wrote:
On Fri, Apr 23, 2021 at 7:28 AM Xinliang David Li via Gcc
mailto:gcc@gcc.gnu.org>> wrote:
>
> Hi, the create_gcov tool was probably removed with the assumption that it
> was only used with Google GCC branch, but it is actually used with GCC
> trunk as well.
>
> Given that, the tool will be restored in the github repo. It seems to build
> and work fine with the regression test.
>
> The tool may ust work as it is right now, but there is no guarantee it
> won't break in the future unless someone in the GCC community tries to
> maintain it.

I think if we want to keep the feature it makes sense to provide create_gcov
functionality either directly from perf (input data producer) or from gcc
(data consumer).  Of course I have no idea about its complexity, license
or implementation language ...

Right. What it takes is a perf data (can be text format) parser to produce the 
format GCC needs, but someone in the community needs to take the lead. It 
should not involve too much effort.

David

Having the tool third-party makes keeping the whole chain working more
difficult.

Richard.

> Thanks,
>
> David
>
> On Thu, Apr 22, 2021 at 3:29 PM Jan Hubicka 
> mailto:hubi...@ucw.cz>> wrote:
>
> > > On 4/22/21 9:58 PM, Eugene Rozenfeld via Gcc wrote:
> > > > GCC documentation for AutoFDO points to create_gcov tool that converts
> > perf.data file into gcov format that can be consumed by gcc with
> > -fauto-profile 
> > (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fonlinedocs%2Fgcc%2FOptimize-Options.html=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C08a2bfb9135c41955a7708d90675e9ba%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637547925772845557%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000=%2FAYvyG0%2BxQ%2BbgLgEHUckUngTXDLoNJ4AASMeMVAhHWE%3D=0>,
> > https://gcc.gnu.org/wiki/AutoFDO/Tutorial<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fwiki%2FAutoFDO%2FTutorial=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C08a2bfb9135c41955a7708d90675e9ba%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63754792577285%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000=IqMDaxmMTM64eavwksTEXZKwlVhR5UZZiL4tfcyu3io%3D=0>).
> > > >
> > > > I noticed that the source code for create_gcov has been deleted from
> > https://github.com/google/autofdo<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fgoogle%2Fautofdo=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C08a2bfb9135c41955a7708d90675e9ba%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637547925772865552%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000=6m86ZH8vmrXFIJOGO8WPYBxap1R4uULFZi5mE04dFbc%3D=0>
> >  on April 7. I asked about that change
> > in that repo and got the following reply:
> > > >
> > > > https://github.com/google/autofdo/pull/107#issuecomment-819108738<https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fgoogle%2Fautofdo%2Fpull%2F107%23issuecomment-819108738=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7C08a2bfb9135c41955a7708d90675e9ba%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637547925772875543%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000=9k945Ma9HRo4RzafSMzr%2FiKx8ochLr77GyIO7pzBVJ8%3D=0>
> > > >
> > > > "Actually we didn't use create_gcov and havn't updated create_gcov for
> > years, and we also didn't have enough tests to guarantee it works (It was
> > gcc-4.8 when we used and verified create_gcov). If you 

State of AutoFDO in GCC

2021-04-22 Thread Eugene Rozenfeld via Gcc
GCC documentation for AutoFDO points to create_gcov tool that converts 
perf.data file into gcov format that can be consumed by gcc with -fauto-profile 
(https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html, 
https://gcc.gnu.org/wiki/AutoFDO/Tutorial).

I noticed that the source code for create_gcov has been deleted from 
https://github.com/google/autofdo on April 7. I asked about that change in that 
repo and got the following reply:

https://github.com/google/autofdo/pull/107#issuecomment-819108738

"Actually we didn't use create_gcov and havn't updated create_gcov for years, 
and we also didn't have enough tests to guarantee it works (It was gcc-4.8 when 
we used and verified create_gcov). If you need it, it is welcomed to update 
create_gcov and add it to the respository."

Does this mean that AutoFDO is currently dead in gcc?

Thanks,

Eugene


[COMMITTED] MAINTAINERS: Add myself for write after approval

2021-03-12 Thread Eugene Rozenfeld via Gcc-patches
ChangeLog:

2021-03-12  Eugene Rozenfeld  

* MAINTAINERS (Write After Approval): Add myself.


0001-MAINTAINERS-Add-myself-for-write-after-approval.patch
Description: 0001-MAINTAINERS-Add-myself-for-write-after-approval.patch


RE: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination of comparisons to dec+compare

2021-01-19 Thread Eugene Rozenfeld via Gcc-patches
Richard,

Can you please commit this patch for me? I don't have write access yet, I'm 
still working on getting copyright assignment/disclaimer signed by my employer.

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Friday, January 15, 2021 3:55 AM
To: Eugene Rozenfeld 
Cc: gabrav...@gmail.com; ja...@gcc.gnu.org; gcc-patches@gcc.gnu.org
Subject: Re: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination of 
comparisons to dec+compare

On Thu, Jan 14, 2021 at 10:04 PM Eugene Rozenfeld 
 wrote:
>
> I got more feedback for the patch from Gabriel Ravier and Jakub Jelinek in 
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D96674data=04%7C01%7CEugene.Rozenfeld%40microsoft.com%7Cf4b1e41de6b4469fd3bb08d8b94c5a20%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637463084866262315%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=%2Fc77uS%2BNlNGmXwOmK729BByEW0VDiq1HEe8BA7DpI30%3Dreserved=0
>  and re-worked it accordingly.
>
> The changes from the previous patch are:
> 1. Switched the tests to use __attribute__((noipa)) instead of 
> __attribute__((noinline)) .
> 2. Fixed a type in the pattern comment.
> 3. Added :c for top-level bit_ior expression.
> 4. Added :s for the subexpressions.
> 5. Added a pattern for the negated expression:
> x >= y && y != XXX_MIN --> x > y - 1
> and the corresponding tests.
>
> The new patch is attached.

OK.

Thanks,
Richard.

> Eugene
>
> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, January 5, 2021 4:21 AM
> To: Eugene Rozenfeld 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination 
> of comparisons to dec+compare
>
> On Mon, Jan 4, 2021 at 9:50 PM Eugene Rozenfeld 
>  wrote:
> >
> > Ping.
> >
> > -Original Message-
> > From: Eugene Rozenfeld
> > Sent: Tuesday, December 22, 2020 3:01 PM
> > To: Richard Biener ; 
> > gcc-patches@gcc.gnu.org
> > Subject: RE: Optimize combination of comparisons to dec+compare
> >
> > Re-sending my question and re-attaching the patch.
> >
> > Richard, can you please clarify your feedback?
>
> Hmm, OK.
>
> The patch is OK.
>
> Thanks,
> Richard.
>
>
> > Thanks,
> >
> > Eugene
> >
> > -Original Message-
> > From: Gcc-patches  On Behalf Of 
> > Eugene Rozenfeld via Gcc-patches
> > Sent: Tuesday, December 15, 2020 2:06 PM
> > To: Richard Biener 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: [EXTERNAL] Re: Optimize combination of comparisons to
> > dec+compare
> >
> > Richard,
> >
> > > Do we already handle x < y || x <= CST to x <= y - CST?
> >
> > That is an invalid transformation: e.g., consider x=3, y=4, CST=2.
> > Can you please clarify?
> >
> > Thanks,
> >
> > Eugene
> >
> > -Original Message-
> > From: Richard Biener 
> > Sent: Thursday, December 10, 2020 12:21 AM
> > To: Eugene Rozenfeld 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: Optimize combination of comparisons to dec+compare
> >
> > On Thu, Dec 10, 2020 at 1:52 AM Eugene Rozenfeld via Gcc-patches 
> >  wrote:
> > >
> > > This patch adds a pattern for optimizing x < y || x == XXX_MIN to 
> > > x <=
> > > y-1 if y is an integer with TYPE_OVERFLOW_WRAPS.
> >
> > Do we already handle x < y || x <= CST to x <= y - CST?
> > That is, the XXX_MIN case is just a special-case of generic anti-range 
> > testing?  For anti-range testing with signed types we pun to unsigned when 
> > possible.
> >
> > > This fixes pr96674.
> > >
> > > Tested on x86_64-pc-linux-gnu.
> > >
> > > For this function
> > >
> > > bool f(unsigned a, unsigned b)
> > > {
> > > return (b == 0) | (a < b);
> > > }
> > >
> > > the code without the patch is
> > >
> > > test   esi,esi
> > > sete   al
> > > cmpesi,edi
> > > seta   dl
> > > or eax,edx
> > > ret
> > >
> > > the code with the patch is
> > >
> > > subesi,0x1
> > > cmpesi,edi
> > > setae  al
> > > ret
> > >
> > > Eugene
> > >
> > > gcc/
> > > PR tree-optimization/96674
> > > * match.pd: New pattern x < y || x == XXX_MIN --> x <= y - 1
> > >
> > > gcc/testsuite
> > > * gcc.dg/pr96674.c: New test.
> > >


RE: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination of comparisons to dec+compare

2021-01-14 Thread Eugene Rozenfeld via Gcc-patches
I got more feedback for the patch from Gabriel Ravier and Jakub Jelinek in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96674 and re-worked it accordingly.

The changes from the previous patch are:
1. Switched the tests to use __attribute__((noipa)) instead of 
__attribute__((noinline)) .
2. Fixed a type in the pattern comment.
3. Added :c for top-level bit_ior expression.
4. Added :s for the subexpressions.
5. Added a pattern for the negated expression:
x >= y && y != XXX_MIN --> x > y - 1
and the corresponding tests.

The new patch is attached.

Eugene

-Original Message-
From: Richard Biener  
Sent: Tuesday, January 5, 2021 4:21 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH][tree-optimization]Optimize combination of 
comparisons to dec+compare

On Mon, Jan 4, 2021 at 9:50 PM Eugene Rozenfeld 
 wrote:
>
> Ping.
>
> -Original Message-
> From: Eugene Rozenfeld
> Sent: Tuesday, December 22, 2020 3:01 PM
> To: Richard Biener ; 
> gcc-patches@gcc.gnu.org
> Subject: RE: Optimize combination of comparisons to dec+compare
>
> Re-sending my question and re-attaching the patch.
>
> Richard, can you please clarify your feedback?

Hmm, OK.

The patch is OK.

Thanks,
Richard.


> Thanks,
>
> Eugene
>
> -Original Message-
> From: Gcc-patches  On Behalf Of 
> Eugene Rozenfeld via Gcc-patches
> Sent: Tuesday, December 15, 2020 2:06 PM
> To: Richard Biener 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [EXTERNAL] Re: Optimize combination of comparisons to 
> dec+compare
>
> Richard,
>
> > Do we already handle x < y || x <= CST to x <= y - CST?
>
> That is an invalid transformation: e.g., consider x=3, y=4, CST=2.
> Can you please clarify?
>
> Thanks,
>
> Eugene
>
> -Original Message-
> From: Richard Biener 
> Sent: Thursday, December 10, 2020 12:21 AM
> To: Eugene Rozenfeld 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: Optimize combination of comparisons to dec+compare
>
> On Thu, Dec 10, 2020 at 1:52 AM Eugene Rozenfeld via Gcc-patches 
>  wrote:
> >
> > This patch adds a pattern for optimizing x < y || x == XXX_MIN to x 
> > <=
> > y-1 if y is an integer with TYPE_OVERFLOW_WRAPS.
>
> Do we already handle x < y || x <= CST to x <= y - CST?
> That is, the XXX_MIN case is just a special-case of generic anti-range 
> testing?  For anti-range testing with signed types we pun to unsigned when 
> possible.
>
> > This fixes pr96674.
> >
> > Tested on x86_64-pc-linux-gnu.
> >
> > For this function
> >
> > bool f(unsigned a, unsigned b)
> > {
> > return (b == 0) | (a < b);
> > }
> >
> > the code without the patch is
> >
> > test   esi,esi
> > sete   al
> > cmpesi,edi
> > seta   dl
> > or eax,edx
> > ret
> >
> > the code with the patch is
> >
> > subesi,0x1
> > cmpesi,edi
> > setae  al
> > ret
> >
> > Eugene
> >
> > gcc/
> > PR tree-optimization/96674
> > * match.pd: New pattern x < y || x == XXX_MIN --> x <= y - 1
> >
> > gcc/testsuite
> > * gcc.dg/pr96674.c: New test.
> >


0002-Optimize-combination-of-comparisons-to-dec-compare.patch
Description: 0002-Optimize-combination-of-comparisons-to-dec-compare.patch


[PATCH][tree-optimization]Optimize combination of comparisons to dec+compare

2021-01-04 Thread Eugene Rozenfeld via Gcc-patches
Ping.

-Original Message-
From: Eugene Rozenfeld 
Sent: Tuesday, December 22, 2020 3:01 PM
To: Richard Biener ; gcc-patches@gcc.gnu.org
Subject: RE: Optimize combination of comparisons to dec+compare

Re-sending my question and re-attaching the patch.

Richard, can you please clarify your feedback?

Thanks,

Eugene

-Original Message-
From: Gcc-patches  On Behalf Of Eugene 
Rozenfeld via Gcc-patches
Sent: Tuesday, December 15, 2020 2:06 PM
To: Richard Biener 
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: Optimize combination of comparisons to dec+compare

Richard,

> Do we already handle x < y || x <= CST to x <= y - CST?

That is an invalid transformation: e.g., consider x=3, y=4, CST=2.
Can you please clarify?

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Thursday, December 10, 2020 12:21 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: Re: Optimize combination of comparisons to dec+compare

On Thu, Dec 10, 2020 at 1:52 AM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> This patch adds a pattern for optimizing x < y || x == XXX_MIN to x <= 
> y-1 if y is an integer with TYPE_OVERFLOW_WRAPS.

Do we already handle x < y || x <= CST to x <= y - CST?
That is, the XXX_MIN case is just a special-case of generic anti-range testing? 
 For anti-range testing with signed types we pun to unsigned when possible.

> This fixes pr96674.
>
> Tested on x86_64-pc-linux-gnu.
>
> For this function
>
> bool f(unsigned a, unsigned b)
> {
> return (b == 0) | (a < b);
> }
>
> the code without the patch is
>
> test   esi,esi
> sete   al
> cmpesi,edi
> seta   dl
> or eax,edx
> ret
>
> the code with the patch is
>
> subesi,0x1
> cmpesi,edi
> setae  al
> ret
>
> Eugene
>
> gcc/
> PR tree-optimization/96674
> * match.pd: New pattern x < y || x == XXX_MIN --> x <= y - 1
>
> gcc/testsuite
> * gcc.dg/pr96674.c: New test.
>


0001-Optimize-combination-of-comparisons-to-dec-compare.patch
Description: 0001-Optimize-combination-of-comparisons-to-dec-compare.patch


RE: Optimize combination of comparisons to dec+compare

2020-12-22 Thread Eugene Rozenfeld via Gcc-patches
Re-sending my question and re-attaching the patch.

Richard, can you please clarify your feedback?

Thanks,

Eugene

-Original Message-
From: Gcc-patches  On Behalf Of Eugene 
Rozenfeld via Gcc-patches
Sent: Tuesday, December 15, 2020 2:06 PM
To: Richard Biener 
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: Optimize combination of comparisons to dec+compare

Richard,

> Do we already handle x < y || x <= CST to x <= y - CST?

That is an invalid transformation: e.g., consider x=3, y=4, CST=2.
Can you please clarify?

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Thursday, December 10, 2020 12:21 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: Re: Optimize combination of comparisons to dec+compare

On Thu, Dec 10, 2020 at 1:52 AM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> This patch adds a pattern for optimizing x < y || x == XXX_MIN to x <= 
> y-1 if y is an integer with TYPE_OVERFLOW_WRAPS.

Do we already handle x < y || x <= CST to x <= y - CST?
That is, the XXX_MIN case is just a special-case of generic anti-range testing? 
 For anti-range testing with signed types we pun to unsigned when possible.

> This fixes pr96674.
>
> Tested on x86_64-pc-linux-gnu.
>
> For this function
>
> bool f(unsigned a, unsigned b)
> {
> return (b == 0) | (a < b);
> }
>
> the code without the patch is
>
> test   esi,esi
> sete   al
> cmpesi,edi
> seta   dl
> or eax,edx
> ret
>
> the code with the patch is
>
> subesi,0x1
> cmpesi,edi
> setae  al
> ret
>
> Eugene
>
> gcc/
> PR tree-optimization/96674
> * match.pd: New pattern x < y || x == XXX_MIN --> x <= y - 1
>
> gcc/testsuite
> * gcc.dg/pr96674.c: New test.
>


0001-Optimize-combination-of-comparisons-to-dec-compare.patch
Description: 0001-Optimize-combination-of-comparisons-to-dec-compare.patch


Re: Optimize combination of comparisons to dec+compare

2020-12-15 Thread Eugene Rozenfeld via Gcc-patches
Richard,

> Do we already handle x < y || x <= CST to x <= y - CST?

That is an invalid transformation: e.g., consider x=3, y=4, CST=2.
Can you please clarify?

Thanks,

Eugene

-Original Message-
From: Richard Biener  
Sent: Thursday, December 10, 2020 12:21 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: Re: Optimize combination of comparisons to dec+compare

On Thu, Dec 10, 2020 at 1:52 AM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> This patch adds a pattern for optimizing x < y || x == XXX_MIN to x <= 
> y-1 if y is an integer with TYPE_OVERFLOW_WRAPS.

Do we already handle x < y || x <= CST to x <= y - CST?
That is, the XXX_MIN case is just a special-case of generic anti-range testing? 
 For anti-range testing with signed types we pun to unsigned when possible.

> This fixes pr96674.
>
> Tested on x86_64-pc-linux-gnu.
>
> For this function
>
> bool f(unsigned a, unsigned b)
> {
> return (b == 0) | (a < b);
> }
>
> the code without the patch is
>
> test   esi,esi
> sete   al
> cmpesi,edi
> seta   dl
> or eax,edx
> ret
>
> the code with the patch is
>
> subesi,0x1
> cmpesi,edi
> setae  al
> ret
>
> Eugene
>
> gcc/
> PR tree-optimization/96674
> * match.pd: New pattern x < y || x == XXX_MIN --> x <= y - 1
>
> gcc/testsuite
> * gcc.dg/pr96674.c: New test.
>


Optimize combination of comparisons to dec+compare

2020-12-09 Thread Eugene Rozenfeld via Gcc-patches
This patch adds a pattern for optimizing 
x < y || x == XXX_MIN to x <= y-1
if y is an integer with TYPE_OVERFLOW_WRAPS.

This fixes pr96674.

Tested on x86_64-pc-linux-gnu.

For this function

bool f(unsigned a, unsigned b)
{
return (b == 0) | (a < b);
}

the code without the patch is

test   esi,esi
sete   al
cmpesi,edi
seta   dl
or eax,edx
ret

the code with the patch is

subesi,0x1
cmpesi,edi
setae  al
ret

Eugene

gcc/
PR tree-optimization/96674
* match.pd: New pattern x < y || x == XXX_MIN --> x <= y - 1

gcc/testsuite
* gcc.dg/pr96674.c: New test.



0001-Optimize-combination-of-comparisons-to-dec-compare.patch
Description: 0001-Optimize-combination-of-comparisons-to-dec-compare.patch


Re: [PATCH] [tree-optimization] Optimize max/min pattern with comparison

2020-11-30 Thread Eugene Rozenfeld via Gcc-patches
Thank you for the review Jeff.

I don't need to look at the opcode to know the result. The pattern will be 
matched only in these 4 cases:

X <= MAX(X, Y) -> true
X > MAX(X, Y) -> false
X >= MIN(X, Y) -> true
X < MIN(X, Y) -> false

So, the result will be true for GE_EXPR and LE_EXPR and false otherwise.

I added two test files: one for positive cases and one for negative cases. The 
updated patch is attached.

Thanks,

Eugene

-Original Message-
From: Jeff Law  
Sent: Monday, November 30, 2020 9:51 AM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] [tree-optimization] Optimize max/min pattern with 
comparison



On 11/25/20 3:04 PM, Eugene Rozenfeld via Gcc-patches wrote:
> Make the following simplifications:
> X <= MAX(X, Y) -> true
> X > MAX(X, Y) -> false
> X >= MIN(X, Y) -> true
> X < MIN(X, Y) -> false
> 
> This fixes PR96708.
> 
> Tested on x86_64-pc-linux-gnu.
>
> bool f(int a, int b)
> {
> int tmp = (a < b) ? b : a;
> return tmp >= a;
> }
>
> Code without the patch:
>
> vmovd  xmm0,edi
> vmovd  xmm1,esi
> vpmaxsd xmm0,xmm0,xmm1
> vmovd  eax,xmm0
> cmpeax,edi
> setge  al
> ret
>
> Code with the patch:
>
> moveax,0x1
> ret
>
> Eugene
>
> 0001-Optimize-max-pattern-with-comparison.patch
>
> From f6391c197b670b516238ac7707512c1358336520 Mon Sep 17 00:00:00 2001
> From: Eugene Rozenfeld 
> Date: Sat, 21 Nov 2020 01:08:50 -0800
> Subject: [PATCH] Optimize max pattern with comparison
>
> Make the following simplifications:
> X <= MAX(X, Y) -> true
> X > MAX(X, Y) -> false
> X >= MIN(X, Y) -> true
> X < MIN(X, Y) -> false
>
> This fixes PR96708.
>
> gcc/
> * match.pd : New patterns.
> ---
>  gcc/match.pd | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/gcc/match.pd b/gcc/match.pd index 
> cbb4bf0b32d..75237741946 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -2851,6 +2851,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(cmp (minmax @0 INTEGER_CST@1) INTEGER_CST@2)
>(comb (cmp @0 @2) (cmp @1 @2
>  
> +/* X <= MAX(X, Y) -> true
> +   X > MAX(X, Y) -> false 
> +   X >= MIN(X, Y) -> true
> +   X < MIN(X, Y) -> false */
> +(for minmax (min min max max )
> + cmp(ge  lt  le  gt  )
> + (simplify
> +  (cmp @0 (minmax:c @0 @1))
> +  { constant_boolean_node (cmp == GE_EXPR || cmp == LE_EXPR, type); } 
> +))
Don't you need to look at the opcode (MIN vs MAX) vs CMP to know the result?  
I'd really like to see some tests for the testsuite.    In particular I'd 
like to see positive tests where we should apply the optimization and negative 
tests when we should not apply the optimization.

I also wonder if there's value in handling this in Ranger and/or DOM. Though 
I'd probably wait to see if fixing in match.pd is sufficient to cover the cases 
I'm thinking of in Ranger & DOM.

Jeff




0001-Optimize-min-and-max-patterns-with-comparison.patch
Description: 0001-Optimize-min-and-max-patterns-with-comparison.patch


[PATCH] [tree-optimization] Optimize max/min pattern with comparison

2020-11-25 Thread Eugene Rozenfeld via Gcc-patches
Make the following simplifications:
X <= MAX(X, Y) -> true
X > MAX(X, Y) -> false
X >= MIN(X, Y) -> true
X < MIN(X, Y) -> false

This fixes PR96708.

Tested on x86_64-pc-linux-gnu.

bool f(int a, int b)
{
int tmp = (a < b) ? b : a;
return tmp >= a;
}

Code without the patch:

vmovd  xmm0,edi
vmovd  xmm1,esi
vpmaxsd xmm0,xmm0,xmm1
vmovd  eax,xmm0
cmpeax,edi
setge  al
ret

Code with the patch:

moveax,0x1
ret

Eugene


0001-Optimize-max-pattern-with-comparison.patch
Description: 0001-Optimize-max-pattern-with-comparison.patch


[PATCH] [tree-optimization] Optimize or+and+or pattern to and+or

2020-11-22 Thread Eugene Rozenfeld via Gcc-patches
Simplify
((b | c) & a) | b
to 
(a & c) | b.

This fixes PR96679.

Tested on x86_64-pc-linux-gnu.

int f(int a, int b, int c)
{
return ((b | c) & a) | b;
}

Code without the patch:
or edx,esi
andedx,edi
moveax,edx
or eax,esi
ret

Code with the patch:
andedi,edx
moveax,edi
or eax,esi
ret

Eugene


0001-Optimize-or-and-or-pattern-to-and-or.patch
Description: 0001-Optimize-or-and-or-pattern-to-and-or.patch


RE: [EXTERNAL] Re: [PATCH] [tree-optimization] Optimize two patterns with three xors.

2020-11-19 Thread Eugene Rozenfeld via Gcc-patches
Thank you for installing my patch Jeff!

Yes, I intend to contribute regularly. I'm working on getting copyright 
assignment/disclaimer paperwork approved by my employer. I'll apply for commit 
privs after that.

Eugene

-Original Message-
From: Jeff Law  
Sent: Wednesday, November 18, 2020 11:33 AM
To: Richard Biener ; Eugene Rozenfeld 

Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] [tree-optimization] Optimize two patterns with 
three xors.



On 11/17/20 12:57 AM, Richard Biener via Gcc-patches wrote:
> On Tue, Nov 17, 2020 at 3:19 AM Eugene Rozenfeld 
>  wrote:
>> Thank you for the review Richard!
>>
>> I re-worked the patch based on your suggestions (attached).
>> I made the change to reuse the first bit_xor in both patterns and I added :s 
>> to the last xor in the first pattern.
>> For the second pattern I didn't add :s because I think the simplification is 
>> beneficial even if the second or third bit_xor has more than one use since 
>> we are simplifying them to just a single operand (@2). If that is incorrect, 
>> please explain why.
> Ah, true, that's correct.
>
> The patch is OK.
I've installed this on the trunk.

Eugene, if you're going to contribute regularly you should probably go ahead 
and get commit privs so that you can commit ACK's patches yourself.   There 
should be a link to a form from this page:

https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fgitwrite.htmldata=04%7C01%7Ceugene.rozenfeld%40microsoft.com%7Ca31f5335968749cdfe1708d88bf8bfb2%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637413247775369684%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=BfQGAtU%2F8IJ8%2BRcvuNI8qKWgnC9oPFtWQhXo1RzlBTU%3Dreserved=0


Jeff



Re: [PATCH] [tree-optimization] Optimize two patterns with three xors.

2020-11-16 Thread Eugene Rozenfeld via Gcc-patches
Thank you for the review Richard!

I re-worked the patch based on your suggestions (attached).
I made the change to reuse the first bit_xor in both patterns and I added :s to 
the last xor in the first pattern.
For the second pattern I didn't add :s because I think the simplification is 
beneficial even if the second or third bit_xor has more than one use since we 
are simplifying them to just a single operand (@2). If that is incorrect, 
please explain why.

Eugene

-Original Message-
From: Richard Biener  
Sent: Monday, November 16, 2020 4:11 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] [tree-optimization] Optimize two patterns with three xors.

On Thu, Nov 12, 2020 at 2:53 AM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> Simplify (a ^ b) & ((b ^ c) ^ a) --> (a ^ b) & ~c.
>
> int f(int a, int b, int c)
> {
> return (a ^ b) & ((b ^ c) ^ a);
> }
>
> Code without the patch:
> moveax,edx
> xoreax,esi
> xoreax,edi
> xoredi,esi
> andeax,edi
> ret
>
> Code with the patch:
> xoredi,esi
> andn   eax,edx,edi
> ret
>
> Simplify (a ^ b) | ((b ^ c) ^ a) --> (a ^ b) | c.
> int g(int a, int b, int c)
> {
> return (a ^ b) | ((b ^ c) ^ a);
> }
>
> Code without the patch:
> moveax,edx
> xoreax,esi
> xoreax,edi
> xoredi,esi
> or eax,edi
> ret
>
> Code with the patch:
> xoredi,esi
> moveax,edi
> or eax,edx
> ret
>
> This fixes PR96671.

+/* (a ^ b) & ((b ^ c) ^ a) --> (a ^ b) & ~c */ (simplify  (bit_and:c 
+(bit_xor:c @0 @1) (bit_xor:cs (bit_xor:c @1 @2) @0))  (bit_and (bit_xor 
+@0 @1) (bit_not @2)))

you should be able to re-use the first bit_xor by doing

+/* (a ^ b) & ((b ^ c) ^ a) --> (a ^ b) & ~c */ (simplify  (bit_and:c 
+(bit_xor:c@3 @0 @1) (bit_xor:cs (bit_xor:c @1 @2) @0))  (bit_and @3 
+(bit_not @2)))

For completeness the last (bit_xor:c @1 @2) should also have a :s

+/* (a ^ b) | ((b ^ c) ^ a) --> (a ^ b) | c */ (simplify  (bit_ior:c 
+(bit_xor:c @0 @1) (bit_xor:c (bit_xor:c @1 @2) @0))  (bit_ior (bit_xor 
+@0 @1) @2))

completely misses :s here and the same comment about the re-use of the existing 
xor applies.

Otherwise looks OK to me, sorry for the delay.

Thanks,
Richard.

> Tested on x86_64-pc-linux-gnu.
>


0001-Optimize-two-patterns-with-three-xors.patch
Description: 0001-Optimize-two-patterns-with-three-xors.patch


[PATCH] [tree-optimization] Optimize two patterns with three xors.

2020-11-11 Thread Eugene Rozenfeld via Gcc-patches
Simplify (a ^ b) & ((b ^ c) ^ a) --> (a ^ b) & ~c.

int f(int a, int b, int c)
{
return (a ^ b) & ((b ^ c) ^ a);
}

Code without the patch:
moveax,edx
xoreax,esi
xoreax,edi
xoredi,esi
andeax,edi
ret  

Code with the patch:
xoredi,esi
andn   eax,edx,edi
ret  

Simplify (a ^ b) | ((b ^ c) ^ a) --> (a ^ b) | c.
int g(int a, int b, int c)
{
return (a ^ b) | ((b ^ c) ^ a);
}

Code without the patch:
moveax,edx
xoreax,esi
xoreax,edi
xoredi,esi
or eax,edi
ret

Code with the patch:
xoredi,esi
moveax,edi
or eax,edx
ret

This fixes PR96671.

Tested on x86_64-pc-linux-gnu.



0001-Optimize-two-patterns-with-three-xors.patch
Description: 0001-Optimize-two-patterns-with-three-xors.patch


[PATCH] [tree-optimization] Fix for PR96701

2020-10-29 Thread Eugene Rozenfeld via Gcc-patches
This patch adds a pattern for folding 
x >> x
to
  0
as described in PR96701.

Without this patch the x86_64-pc-linux-gnu code generated for this function

int
foo (int i)
{
  return i >> i;
}

is

mov    ecx,edi
sar    edi,cl
test   edi,edi
setne  al
ret

With the patch the code is 

xor    eax,eax
ret  

Tested on x86_64-pc-linux-gnu.

Eugene


0001-Optimize-self-right-shift-to-0.patch
Description: 0001-Optimize-self-right-shift-to-0.patch


[PATCH] [tree-optimization] Fix for PR96701

2020-10-29 Thread Eugene Rozenfeld via Gcc-patches
This patch adds a pattern for folding

x >> x

to

  0

as described in PR96701.


Without this patch the x86_64-pc-linux-gnu code generated for this function



int

foo (int i)

{

  return i >> i;

}



is



movecx,edi

saredi,cl

test   edi,edi

setne  al

ret



With the patch the code is


xoreax,eax
ret


Tested on x86_64-pc-linux-gnu.

Eugene


0001-Optimize-self-right-shift-to-0.patch
Description: 0001-Optimize-self-right-shift-to-0.patch


RE: [EXTERNAL] Re: [PATCH] [tree-optimization] Fix for PR97223

2020-10-29 Thread Eugene Rozenfeld via Gcc-patches
Thank you for the review Richard!

I re-worked the patch based on your suggestions. I combined the two patterns. 
Neither one requires a signedness check as long as the type of the 'add' has 
overflow wrap semantics.

I had to modify the regular expression in no-strict-overflow-4.c test. In that 
test the following function is compiled with -fno-strict-overflow :

int
foo (int i)
{
  return i + 1 > i;
}

We now optimize this function so that the tree-optimized dump has

;; Function foo (foo, funcdef_no=0, decl_uid=1931, cgraph_uid=1, symbol_order=0)

foo (int i)
{
  _Bool _1;
  int _3;

   [local count: 1073741824]:
  _1 = i_2(D) != 2147483647;
  _3 = (int) _1;
  return _3;
}

This is a correct optimization since -fno-strict-overflow implies -fwrapv.

Eugene

-Original Message-
From: Richard Biener  
Sent: Tuesday, October 27, 2020 2:23 AM
To: Eugene Rozenfeld 
Cc: gcc-patches@gcc.gnu.org
Subject: [EXTERNAL] Re: [PATCH] [tree-optimization] Fix for PR97223

On Sat, Oct 24, 2020 at 2:20 AM Eugene Rozenfeld via Gcc-patches 
 wrote:
>
> This patch adds a pattern for folding
> x < (short) ((unsigned short)x + const) to
>  x <= SHORT_MAX - const
> (and similarly for other integral types) if const is not 0.
> as described in PR97223.
>
> For example, without this patch the x86_64-pc-linux code generated for 
> this function
>
> bool f(char x)
> {
> return x < (char)(x + 12);
> }
>
> is
>
> leaeax,[rdi+0xc]
> cmpal,dil
> setg   al
> ret
>
> With the patch the code is
>
> cmpdil,0x73
> setle  al
> ret
>
> Tested on x86_64-pc-linux.

+/* Similar to the previous pattern but with additional casts. */ (for 
+cmp (lt le ge gt)
+ out (gt gt le le)
+ (simplify
+  (cmp:c (convert@3 (plus@2 (convert@4 @0) INTEGER_CST@1)) @0)
+  (if (!TYPE_UNSIGNED (TREE_TYPE (@0))
+   && types_match (TREE_TYPE (@0), TREE_TYPE (@3))
+   && types_match (TREE_TYPE (@4), unsigned_type_for (TREE_TYPE (@0)))
+   && TYPE_OVERFLOW_WRAPS (TREE_TYPE (@4))
+   && wi::to_wide (@1) != 0
+   && single_use (@2))
+   (with { unsigned int prec = TYPE_PRECISION (TREE_TYPE (@0)); }
+(out @0 { wide_int_to_tree (TREE_TYPE (@0),
+   wi::max_value (prec, SIGNED)
+   - wi::to_wide (@1)); })

I think it's reasonable but the comment can be made more precise.
In particular I wonder why we require a signed comparison here while the 
previous pattern requires an unsigned comparison.  It might be an artifact and 
the restriction instead only applies to the plus?

Note that

+   && types_match (TREE_TYPE (@4), unsigned_type_for (TREE_TYPE 
+ (@0)))

unsigned_type_for should be avoided since it's quite expensive.  May I suggest

  && TYPE_UNSIGNED (TREE_TYPE (@4))
  && tree_nop_conversion_p (TREE_TYPE (@4), TREE_TYPE (@0))

instead?

I originally wondered if "but with additional casts" could be done in a single 
pattern via (convert? ...) uses but then I noticed the strange difference in 
the comparison signedness requirement ...

Richard.

> Eugene
>


0001-Add-a-tree-optimization-described-in-PR97223.patch
Description: 0001-Add-a-tree-optimization-described-in-PR97223.patch


[PATCH] [tree-optimization] Fix for PR97223

2020-10-23 Thread Eugene Rozenfeld via Gcc-patches
This patch adds a pattern for folding 
x < (short) ((unsigned short)x + const)
to
 x <= SHORT_MAX - const
(and similarly for other integral types) if const is not 0.
as described in PR97223.

For example, without this patch the x86_64-pc-linux code generated for this 
function

bool f(char x)
{
return x < (char)(x + 12);
}

is

leaeax,[rdi+0xc]
cmpal,dil
setg   al
ret  

With the patch the code is 

cmpdil,0x73
setle  al
ret

Tested on x86_64-pc-linux.

Eugene



0001-Add-a-tree-optimization-described-in-PR97223.patch
Description: 0001-Add-a-tree-optimization-described-in-PR97223.patch


  1   2   >