[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
davidxl accepted this revision. davidxl added a comment. lgtm https://reviews.llvm.org/D37091 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
rsmith added a comment. Thanks, looks great. https://reviews.llvm.org/D37091
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
danielcdh updated this revision to Diff 112604.
danielcdh marked an inline comment as done.
danielcdh added a comment.

update

https://reviews.llvm.org/D37091

Files:
  include/clang/Driver/Options.td
  include/clang/Frontend/CodeGenOptions.def
  lib/CodeGen/CodeGenFunction.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Frontend/CompilerInvocation.cpp
  test/CodeGen/profile-sample-accurate.c
  test/Driver/clang_f_opts.c
  test/Integration/thinlto_profile_sample_accurate.c

Index: test/Integration/thinlto_profile_sample_accurate.c
===
--- /dev/null
+++ test/Integration/thinlto_profile_sample_accurate.c
@@ -0,0 +1,9 @@
+// Test to ensure -emit-llvm profile-sample-accurate is honored in ThinLTO.
+// RUN: %clang -O2 %s -flto=thin -fprofile-sample-accurate -c -o %t.o
+// RUN: llvm-lto -thinlto -o %t %t.o
+// RUN: %clang_cc1 -O2 -x ir %t.o -fthinlto-index=%t.thinlto.bc -emit-llvm -o - | FileCheck %s
+
+// CHECK: define void @foo()
+// CHECK: attributes {{.*}} "profile-sample-accurate"
+void foo() {
+}
Index: test/Driver/clang_f_opts.c
===
--- test/Driver/clang_f_opts.c
+++ test/Driver/clang_f_opts.c
@@ -53,6 +53,9 @@
 // CHECK-REROLL-LOOPS: "-freroll-loops"
 // CHECK-NO-REROLL-LOOPS-NOT: "-freroll-loops"
 
+// RUN: %clang -### -S -fprofile-sample-accurate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-SAMPLE-ACCURATE %s
+// CHECK-PROFILE-SAMPLE-ACCURATE: "-fprofile-sample-accurate"
+
 // RUN: %clang -### -S -fprofile-sample-use=%S/Inputs/file.prof %s 2>&1 | FileCheck -check-prefix=CHECK-SAMPLE-PROFILE %s
 // CHECK-SAMPLE-PROFILE: "-fprofile-sample-use={{.*}}/file.prof"
Index: test/CodeGen/profile-sample-accurate.c
===
--- /dev/null
+++ test/CodeGen/profile-sample-accurate.c
@@ -0,0 +1,7 @@
+// Test to ensure -emit-llvm profile-sample-accurate is honored by clang.
+// RUN: %clang -S -emit-llvm %s -fprofile-sample-accurate -o - | FileCheck %s
+
+// CHECK: define void @foo()
+// CHECK: attributes {{.*}} "profile-sample-accurate"
+void foo() {
+}
Index: lib/Frontend/CompilerInvocation.cpp
===
--- lib/Frontend/CompilerInvocation.cpp
+++ lib/Frontend/CompilerInvocation.cpp
@@ -652,6 +652,8 @@
   Opts.NoUseJumpTables = Args.hasArg(OPT_fno_jump_tables);
 
+  Opts.ProfileSampleAccurate = Args.hasArg(OPT_fprofile_sample_accurate);
+
   Opts.PrepareForLTO = Args.hasArg(OPT_flto, OPT_flto_EQ);
   Opts.EmitSummaryIndex = false;
   if (Arg *A = Args.getLastArg(OPT_flto_EQ)) {
Index: lib/Driver/ToolChains/Clang.cpp
===
--- lib/Driver/ToolChains/Clang.cpp
+++ lib/Driver/ToolChains/Clang.cpp
@@ -2340,6 +2340,10 @@
                    true))
     CmdArgs.push_back("-fno-jump-tables");
 
+  if (Args.hasFlag(options::OPT_fprofile_sample_accurate,
+                   options::OPT_fno_profile_sample_accurate, false))
+    CmdArgs.push_back("-fprofile-sample-accurate");
+
   if (!Args.hasFlag(options::OPT_fpreserve_as_comments,
                     options::OPT_fno_preserve_as_comments, true))
     CmdArgs.push_back("-fno-preserve-as-comments");
Index: lib/CodeGen/CodeGenFunction.cpp
===
--- lib/CodeGen/CodeGenFunction.cpp
+++ lib/CodeGen/CodeGenFunction.cpp
@@ -837,6 +837,10 @@
   Fn->addFnAttr("no-jump-tables",
                 llvm::toStringRef(CGM.getCodeGenOpts().NoUseJumpTables));
 
+  // Add profile-sample-accurate value.
+  if (CGM.getCodeGenOpts().ProfileSampleAccurate)
+    Fn->addFnAttr("profile-sample-accurate");
+
   if (getLangOpts().OpenCL) {
     // Add metadata for a kernel function.
     if (const FunctionDecl *FD = dyn_cast_or_null(D))
Index: include/clang/Frontend/CodeGenOptions.def
===
--- include/clang/Frontend/CodeGenOptions.def
+++ include/clang/Frontend/CodeGenOptions.def
@@ -183,6 +183,7 @@
 CODEGENOPT(UnwindTables , 1, 0) ///< Emit unwind tables.
 CODEGENOPT(VectorizeLoop , 1, 0) ///< Run loop vectorizer.
 CODEGENOPT(VectorizeSLP , 1, 0) ///< Run SLP vectorizer.
+CODEGENOPT(ProfileSampleAccurate, 1, 0) ///< Sample profile is accurate.
 
 /// Attempt to use register sized accesses to bit-fields in structures, when
 /// possible.
Index: include/clang/Driver/Options.td
===
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -637,12 +637,25 @@
 def fprofile_sample_use_EQ : Joined<["-"], "fprofile-sample-use=">,
     Group, Flags<[DriverOption, CC1Option]>,
     HelpText<"Enable sample-based profile guided optimizations">;
+def fprofile_sample_accurate : Flag<["-"], "fprofile-sample-accurate">,
+    Group, Flags<[DriverOption, CC1Option]>,
+    HelpText<"Specifies
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
rsmith added inline comments.

Comment at: test/CodeGen/thinlto-profile-sample-accurate.c:2-4
+// RUN: %clang -O2 %s -flto=thin -fprofile-sample-accurate -c -o %t.o
+// RUN: llvm-lto -thinlto -o %t %t.o
+// RUN: %clang_cc1 -O2 -x ir %t.o -fthinlto-index=%t.thinlto.bc -emit-llvm -o - | FileCheck %s

test/CodeGen tests should generally not run the optimizer; these tests are intended to check that the correct IR is produced by CodeGen prior to any optimization.

So: we should have a unit test that the IR produced by Clang contains the attribute in the right place (prior to any LLVM passes running), here in test/CodeGen. It seems fine to me to have an end-to-end integration test in addition to that (but not instead of it); we haven't yet established a systematic organization for those tests, but putting it in test/Integration for now will make it easier to find in the future when we work that out.

https://reviews.llvm.org/D37091
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
danielcdh updated this revision to Diff 112601.
danielcdh added a comment.
Herald added subscribers: eraman, mehdi_amini.

Add an end-to-end test.

https://reviews.llvm.org/D37091

Files:
  include/clang/Driver/Options.td
  include/clang/Frontend/CodeGenOptions.def
  lib/CodeGen/CodeGenFunction.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Frontend/CompilerInvocation.cpp
  test/CodeGen/thinlto-profile-sample-accurate.c
  test/Driver/clang_f_opts.c

Index: test/Driver/clang_f_opts.c
===
--- test/Driver/clang_f_opts.c
+++ test/Driver/clang_f_opts.c
@@ -53,6 +53,9 @@
 // CHECK-REROLL-LOOPS: "-freroll-loops"
 // CHECK-NO-REROLL-LOOPS-NOT: "-freroll-loops"
 
+// RUN: %clang -### -S -fprofile-sample-accurate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-SAMPLE-ACCURATE %s
+// CHECK-PROFILE-SAMPLE-ACCURATE: "-fprofile-sample-accurate"
+
 // RUN: %clang -### -S -fprofile-sample-use=%S/Inputs/file.prof %s 2>&1 | FileCheck -check-prefix=CHECK-SAMPLE-PROFILE %s
 // CHECK-SAMPLE-PROFILE: "-fprofile-sample-use={{.*}}/file.prof"
Index: test/CodeGen/thinlto-profile-sample-accurate.c
===
--- /dev/null
+++ test/CodeGen/thinlto-profile-sample-accurate.c
@@ -0,0 +1,9 @@
+// Test to ensure -emit-llvm profile-sample-accurate is honored in ThinLTO.
+// RUN: %clang -O2 %s -flto=thin -fprofile-sample-accurate -c -o %t.o
+// RUN: llvm-lto -thinlto -o %t %t.o
+// RUN: %clang_cc1 -O2 -x ir %t.o -fthinlto-index=%t.thinlto.bc -emit-llvm -o - | FileCheck %s
+
+// CHECK: define void @foo()
+// CHECK: attributes {{.*}} "profile-sample-accurate"
+void foo() {
+}
Index: lib/Frontend/CompilerInvocation.cpp
===
--- lib/Frontend/CompilerInvocation.cpp
+++ lib/Frontend/CompilerInvocation.cpp
@@ -652,6 +652,8 @@
   Opts.NoUseJumpTables = Args.hasArg(OPT_fno_jump_tables);
 
+  Opts.ProfileSampleAccurate = Args.hasArg(OPT_fprofile_sample_accurate);
+
   Opts.PrepareForLTO = Args.hasArg(OPT_flto, OPT_flto_EQ);
   Opts.EmitSummaryIndex = false;
   if (Arg *A = Args.getLastArg(OPT_flto_EQ)) {
Index: lib/Driver/ToolChains/Clang.cpp
===
--- lib/Driver/ToolChains/Clang.cpp
+++ lib/Driver/ToolChains/Clang.cpp
@@ -2340,6 +2340,10 @@
                    true))
     CmdArgs.push_back("-fno-jump-tables");
 
+  if (Args.hasFlag(options::OPT_fprofile_sample_accurate,
+                   options::OPT_fno_profile_sample_accurate, false))
+    CmdArgs.push_back("-fprofile-sample-accurate");
+
   if (!Args.hasFlag(options::OPT_fpreserve_as_comments,
                     options::OPT_fno_preserve_as_comments, true))
     CmdArgs.push_back("-fno-preserve-as-comments");
Index: lib/CodeGen/CodeGenFunction.cpp
===
--- lib/CodeGen/CodeGenFunction.cpp
+++ lib/CodeGen/CodeGenFunction.cpp
@@ -837,6 +837,10 @@
   Fn->addFnAttr("no-jump-tables",
                 llvm::toStringRef(CGM.getCodeGenOpts().NoUseJumpTables));
 
+  // Add profile-sample-accurate value.
+  if (CGM.getCodeGenOpts().ProfileSampleAccurate)
+    Fn->addFnAttr("profile-sample-accurate");
+
   if (getLangOpts().OpenCL) {
     // Add metadata for a kernel function.
     if (const FunctionDecl *FD = dyn_cast_or_null(D))
Index: include/clang/Frontend/CodeGenOptions.def
===
--- include/clang/Frontend/CodeGenOptions.def
+++ include/clang/Frontend/CodeGenOptions.def
@@ -183,6 +183,7 @@
 CODEGENOPT(UnwindTables , 1, 0) ///< Emit unwind tables.
 CODEGENOPT(VectorizeLoop , 1, 0) ///< Run loop vectorizer.
 CODEGENOPT(VectorizeSLP , 1, 0) ///< Run SLP vectorizer.
+CODEGENOPT(ProfileSampleAccurate, 1, 0) ///< Sample profile is accurate.
 
 /// Attempt to use register sized accesses to bit-fields in structures, when
 /// possible.
Index: include/clang/Driver/Options.td
===
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -637,12 +637,25 @@
 def fprofile_sample_use_EQ : Joined<["-"], "fprofile-sample-use=">,
     Group, Flags<[DriverOption, CC1Option]>,
     HelpText<"Enable sample-based profile guided optimizations">;
+def fprofile_sample_accurate : Flag<["-"], "fprofile-sample-accurate">,
+    Group, Flags<[DriverOption, CC1Option]>,
+    HelpText<"Specifies that the sample profile is accurate">,
+    DocBrief<[{Specifies that the sample profile is accurate. If the sample
+               profile is accurate, callsites without profile samples are marked
+               as cold. Otherwise, treat callsites without profile samples as if
+               we have no profile}]>;
+def fno_profile_sample_accurate : Flag<["-"], "fno-profile-sample-accurate">,
+  Group, Flags<[DriverOption]>;
 def fauto_profile : Flag<["-"],
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
rsmith accepted this revision.
rsmith added a comment.
This revision is now accepted and ready to land.

Please add a test that the attribute is emitted into IR. Other than that, this looks good to me.

Comment at: include/clang/Driver/Options.td:645
+ profile is accurate, callsites without profile samples are marked
+ as cold. Otherwise, treat callsites without profile samples as if
+ we have no profile}]>;

Consistently use passive voice here: "treat callsites without profile samples" -> "callsites without profile samples are treated"

Comment at: include/clang/Driver/Options.td:646
+ as cold. Otherwise, treat callsites without profile samples as if
+ we have no profile}]>;
+def fno_profile_sample_accurate : Flag<["-"], "fno-profile-sample-accurate">,

Add a trailing period.

https://reviews.llvm.org/D37091
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
danielcdh updated this revision to Diff 112597.
danielcdh marked 3 inline comments as done.
danielcdh added a comment.

update

https://reviews.llvm.org/D37091

Files:
  include/clang/Driver/Options.td
  include/clang/Frontend/CodeGenOptions.def
  lib/CodeGen/CodeGenFunction.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Frontend/CompilerInvocation.cpp
  test/Driver/clang_f_opts.c

Index: test/Driver/clang_f_opts.c
===
--- test/Driver/clang_f_opts.c
+++ test/Driver/clang_f_opts.c
@@ -53,6 +53,9 @@
 // CHECK-REROLL-LOOPS: "-freroll-loops"
 // CHECK-NO-REROLL-LOOPS-NOT: "-freroll-loops"
 
+// RUN: %clang -### -S -fprofile-sample-accurate %s 2>&1 | FileCheck -check-prefix=CHECK-PROFILE-SAMPLE-ACCURATE %s
+// CHECK-PROFILE-SAMPLE-ACCURATE: "-fprofile-sample-accurate"
+
 // RUN: %clang -### -S -fprofile-sample-use=%S/Inputs/file.prof %s 2>&1 | FileCheck -check-prefix=CHECK-SAMPLE-PROFILE %s
 // CHECK-SAMPLE-PROFILE: "-fprofile-sample-use={{.*}}/file.prof"
Index: lib/Frontend/CompilerInvocation.cpp
===
--- lib/Frontend/CompilerInvocation.cpp
+++ lib/Frontend/CompilerInvocation.cpp
@@ -652,6 +652,8 @@
   Opts.NoUseJumpTables = Args.hasArg(OPT_fno_jump_tables);
 
+  Opts.ProfileSampleAccurate = Args.hasArg(OPT_fprofile_sample_accurate);
+
   Opts.PrepareForLTO = Args.hasArg(OPT_flto, OPT_flto_EQ);
   Opts.EmitSummaryIndex = false;
   if (Arg *A = Args.getLastArg(OPT_flto_EQ)) {
Index: lib/Driver/ToolChains/Clang.cpp
===
--- lib/Driver/ToolChains/Clang.cpp
+++ lib/Driver/ToolChains/Clang.cpp
@@ -2340,6 +2340,10 @@
                    true))
     CmdArgs.push_back("-fno-jump-tables");
 
+  if (Args.hasFlag(options::OPT_fprofile_sample_accurate,
+                   options::OPT_fno_profile_sample_accurate, false))
+    CmdArgs.push_back("-fprofile-sample-accurate");
+
   if (!Args.hasFlag(options::OPT_fpreserve_as_comments,
                     options::OPT_fno_preserve_as_comments, true))
     CmdArgs.push_back("-fno-preserve-as-comments");
Index: lib/CodeGen/CodeGenFunction.cpp
===
--- lib/CodeGen/CodeGenFunction.cpp
+++ lib/CodeGen/CodeGenFunction.cpp
@@ -837,6 +837,10 @@
   Fn->addFnAttr("no-jump-tables",
                 llvm::toStringRef(CGM.getCodeGenOpts().NoUseJumpTables));
 
+  // Add profile-sample-accurate value.
+  if (CGM.getCodeGenOpts().ProfileSampleAccurate)
+    Fn->addFnAttr("profile-sample-accurate");
+
   if (getLangOpts().OpenCL) {
     // Add metadata for a kernel function.
     if (const FunctionDecl *FD = dyn_cast_or_null(D))
Index: include/clang/Frontend/CodeGenOptions.def
===
--- include/clang/Frontend/CodeGenOptions.def
+++ include/clang/Frontend/CodeGenOptions.def
@@ -183,6 +183,7 @@
 CODEGENOPT(UnwindTables , 1, 0) ///< Emit unwind tables.
 CODEGENOPT(VectorizeLoop , 1, 0) ///< Run loop vectorizer.
 CODEGENOPT(VectorizeSLP , 1, 0) ///< Run SLP vectorizer.
+CODEGENOPT(ProfileSampleAccurate, 1, 0) ///< Sample profile is accurate.
 
 /// Attempt to use register sized accesses to bit-fields in structures, when
 /// possible.
Index: include/clang/Driver/Options.td
===
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -637,12 +637,25 @@
 def fprofile_sample_use_EQ : Joined<["-"], "fprofile-sample-use=">,
     Group, Flags<[DriverOption, CC1Option]>,
     HelpText<"Enable sample-based profile guided optimizations">;
+def fprofile_sample_accurate : Flag<["-"], "fprofile-sample-accurate">,
+    Group, Flags<[DriverOption, CC1Option]>,
+    HelpText<"Specifies that the sample profile is accurate">,
+    DocBrief<[{Specifies that the sample profile is accurate. If the sample
+               profile is accurate, callsites without profile samples are marked
+               as cold. Otherwise, treat callsites without profile samples as if
+               we have no profile}]>;
+def fno_profile_sample_accurate : Flag<["-"], "fno-profile-sample-accurate">,
+  Group, Flags<[DriverOption]>;
 def fauto_profile : Flag<["-"], "fauto-profile">, Group,
     Alias;
 def fno_auto_profile : Flag<["-"], "fno-auto-profile">, Group,
     Alias;
 def fauto_profile_EQ : Joined<["-"], "fauto-profile=">,
     Alias;
+def fauto_profile_accurate : Flag<["-"], "fauto-profile-accurate">,
+    Group, Alias;
+def fno_auto_profile_accurate : Flag<["-"], "fno-auto-profile-accurate">,
+    Group, Alias;
 def fdebug_info_for_profiling : Flag<["-"], "fdebug-info-for-profiling">, Group,
     Flags<[CC1Option]>,
     HelpText<"Emit extra debug info to make sample profile more accurate.">;
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
rsmith added inline comments.

Comment at: docs/ClangCommandLineReference.rst:173-180
+.. option:: -faccurate-sample-profile, -fno-accurate-sample-profile
+.. program:: clang
+
+If the sample profile is accurate, callsites without profile samples are marked
+as cold. Otherwise, treat un-sampled callsites as if we have no profile. This
+option can be used to enable more aggressive size optimization based on
+profiles.

This is a generated file; please don't modify it by hand.

Comment at: include/clang/Driver/Options.td:590
 def faccess_control : Flag<["-"], "faccess-control">, Group;
+def faccurate_sample_profile : Flag<["-"], "faccurate-sample-profile">,
+  Group, Flags<[DriverOption, CC1Option]>,

We generally try to group similar options together under a common prefix. Would `-fprofile-sample-accurate` work here?

Comment at: include/clang/Driver/Options.td:592-594
+  HelpText<"If sample profile is accurate, we will mark all un-sampled "
+           "callsite as cold. Otherwise, treat callsites without profile "
+           "samples as if we have no profile">;

`HelpText` should be a very brief string that can be included in a one-line description of the flag for `--help`. Longer text for inclusion in the option reference should be in a `DocBrief<{blah blah blah.}>`.

Also, it seems to me that this help text doesn't actually say what the option does. Does this request that accurate sample profiles be generated, or specify that an accurate sample profile was provided, or what?

Suggestion:

```
HelpText<"Specifies that the sample profile is accurate">,
DocBrief<{Specifies that the sample profile is accurate. If the sample profile is accurate, callsites without profile samples are marked as cold. [...same as current reference documentation text...]}>
```

https://reviews.llvm.org/D37091
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
davidxl added a comment. Looks fine to me, but please wait for Richard's comment. https://reviews.llvm.org/D37091
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
danielcdh updated this revision to Diff 112574.
danielcdh marked 2 inline comments as done.
danielcdh added a comment.

updated the patch to put it into function attribute so that it works with ThinLTO

https://reviews.llvm.org/D37091

Files:
  docs/ClangCommandLineReference.rst
  include/clang/Driver/Options.td
  include/clang/Frontend/CodeGenOptions.def
  lib/CodeGen/CodeGenFunction.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Frontend/CompilerInvocation.cpp
  test/Driver/clang_f_opts.c

Index: test/Driver/clang_f_opts.c
===
--- test/Driver/clang_f_opts.c
+++ test/Driver/clang_f_opts.c
@@ -53,6 +53,9 @@
 // CHECK-REROLL-LOOPS: "-freroll-loops"
 // CHECK-NO-REROLL-LOOPS-NOT: "-freroll-loops"
 
+// RUN: %clang -### -S -faccurate-sample-profile %s 2>&1 | FileCheck -check-prefix=CHECK-ACCURATE-SAMPLE-PROFILE %s
+// CHECK-ACCURATE-SAMPLE-PROFILE: "-faccurate-sample-profile"
+
 // RUN: %clang -### -S -fprofile-sample-use=%S/Inputs/file.prof %s 2>&1 | FileCheck -check-prefix=CHECK-SAMPLE-PROFILE %s
 // CHECK-SAMPLE-PROFILE: "-fprofile-sample-use={{.*}}/file.prof"
Index: lib/Frontend/CompilerInvocation.cpp
===
--- lib/Frontend/CompilerInvocation.cpp
+++ lib/Frontend/CompilerInvocation.cpp
@@ -652,6 +652,8 @@
   Opts.NoUseJumpTables = Args.hasArg(OPT_fno_jump_tables);
 
+  Opts.AccurateSampleProfile = Args.hasArg(OPT_faccurate_sample_profile);
+
   Opts.PrepareForLTO = Args.hasArg(OPT_flto, OPT_flto_EQ);
   Opts.EmitSummaryIndex = false;
   if (Arg *A = Args.getLastArg(OPT_flto_EQ)) {
Index: lib/Driver/ToolChains/Clang.cpp
===
--- lib/Driver/ToolChains/Clang.cpp
+++ lib/Driver/ToolChains/Clang.cpp
@@ -2340,6 +2340,10 @@
                    true))
     CmdArgs.push_back("-fno-jump-tables");
 
+  if (Args.hasFlag(options::OPT_faccurate_sample_profile,
+                   options::OPT_fno_accurate_sample_profile, false))
+    CmdArgs.push_back("-faccurate-sample-profile");
+
   if (!Args.hasFlag(options::OPT_fpreserve_as_comments,
                     options::OPT_fno_preserve_as_comments, true))
     CmdArgs.push_back("-fno-preserve-as-comments");
Index: lib/CodeGen/CodeGenFunction.cpp
===
--- lib/CodeGen/CodeGenFunction.cpp
+++ lib/CodeGen/CodeGenFunction.cpp
@@ -837,6 +837,10 @@
   Fn->addFnAttr("no-jump-tables",
                 llvm::toStringRef(CGM.getCodeGenOpts().NoUseJumpTables));
 
+  // Add accurate-sample-profile value.
+  if (CGM.getCodeGenOpts().AccurateSampleProfile)
+    Fn->addFnAttr("accurate-sample-profile");
+
   if (getLangOpts().OpenCL) {
     // Add metadata for a kernel function.
     if (const FunctionDecl *FD = dyn_cast_or_null(D))
Index: include/clang/Frontend/CodeGenOptions.def
===
--- include/clang/Frontend/CodeGenOptions.def
+++ include/clang/Frontend/CodeGenOptions.def
@@ -183,6 +183,7 @@
 CODEGENOPT(UnwindTables , 1, 0) ///< Emit unwind tables.
 CODEGENOPT(VectorizeLoop , 1, 0) ///< Run loop vectorizer.
 CODEGENOPT(VectorizeSLP , 1, 0) ///< Run SLP vectorizer.
+CODEGENOPT(AccurateSampleProfile, 1, 0) ///< Sample profile is accurate.
 
 /// Attempt to use register sized accesses to bit-fields in structures, when
 /// possible.
Index: include/clang/Driver/Options.td
===
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -587,6 +587,14 @@
 def fPIE : Flag<["-"], "fPIE">, Group;
 def fno_PIE : Flag<["-"], "fno-PIE">, Group;
 def faccess_control : Flag<["-"], "faccess-control">, Group;
+def faccurate_sample_profile : Flag<["-"], "faccurate-sample-profile">,
+  Group, Flags<[DriverOption, CC1Option]>,
+  HelpText<"If sample profile is accurate, we will mark all un-sampled "
+           "callsite as cold. Otherwise, treat callsites without profile "
+           "samples as if we have no profile">;
+def fno_accurate_sample_profile : Flag<["-"], "fno-accurate-sample-profile">,
+  Group, Flags<[DriverOption]>;
+
 def fallow_unsupported : Flag<["-"], "fallow-unsupported">, Group;
 def fapple_kext : Flag<["-"], "fapple-kext">, Group, Flags<[CC1Option]>,
   HelpText<"Use Apple's kernel extensions ABI">;
@@ -643,6 +651,10 @@
     Alias;
 def fauto_profile_EQ : Joined<["-"], "fauto-profile=">,
     Alias;
+def fauto_profile_accurate : Flag<["-"], "fauto-profile-accurate">,
+    Group, Alias;
+def fno_auto_profile_accurate : Flag<["-"], "fno-auto-profile-accurate">,
+    Group, Alias;
 def fdebug_info_for_profiling : Flag<["-"], "fdebug-info-for-profiling">, Group,
     Flags<[CC1Option]>,
     HelpText<"Emit extra debug info to make sample profile more accurate.">;
Index: docs/ClangCommandLineReference.rst
===
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
davidxl added inline comments.

Comment at: docs/ClangCommandLineReference.rst:176
+
+If sample profile is accurate, we will mark all un-sampled callsite as cold. Otherwise, treat un-sampled callsites as if we have no profile
+

If the sample profile is accurate, callsites without profile samples are marked as cold. Otherwise, ..., This option can be used to enable more aggressive size optimization based on profiles.

Comment at: include/clang/Driver/Options.td:593
+  HelpText<"If sample profile is accurate, we will mark all un-sampled "
+           "callsite as cold. Otherwise, treat un-sampled callsites as if "
+           "we have no profile">;

un-sampled callsites --> callsites without profile samples

https://reviews.llvm.org/D37091
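
For readers following the wording discussion, the documented behaviour can be sketched as a small decision function. This is only an illustrative model of the sentence "callsites without profile samples are marked as cold; otherwise they are treated as if we have no profile". The names `CallsiteTemperature`, `CallsiteSamples`, and `classifyCallsite` are hypothetical and do not correspond to any LLVM API, and the real optimizer logic is not shown in this thread.

```cpp
#include <cassert>

// Hypothetical model of the documented option semantics; not LLVM code.
enum class CallsiteTemperature {
  Cold,        // No samples, and the profile is declared accurate.
  FromProfile, // Samples exist; the profile decides how hot the callsite is.
  NoProfile    // No samples, and the profile is not trusted to be complete.
};

// Hypothetical record of what the sample profile says about one callsite.
struct CallsiteSamples {
  bool hasSamples;          // True if the profile has an entry for the callsite.
  unsigned long long count; // Sample count; must be 0 when hasSamples is false.
};

inline CallsiteTemperature classifyCallsite(const CallsiteSamples &s,
                                            bool profileSampleAccurate) {
  assert((s.hasSamples || s.count == 0) &&
         "a callsite without samples cannot carry a count");
  if (s.hasSamples)
    return CallsiteTemperature::FromProfile;
  // Missing samples only mean "cold" when the user vouches for the profile.
  return profileSampleAccurate ? CallsiteTemperature::Cold
                               : CallsiteTemperature::NoProfile;
}
```

Under this model, the flag only changes the outcome for callsites that have no samples at all, which is why the help text needs to say that the option describes the provided profile rather than requesting a different kind of profile.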
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
danielcdh updated this revision to Diff 112496.
danielcdh added a comment.

add document and test

https://reviews.llvm.org/D37091

Files:
  docs/ClangCommandLineReference.rst
  include/clang/Driver/Options.td
  lib/Driver/ToolChains/Clang.cpp
  test/Driver/clang_f_opts.c

Index: test/Driver/clang_f_opts.c
===
--- test/Driver/clang_f_opts.c
+++ test/Driver/clang_f_opts.c
@@ -53,6 +53,9 @@
 // CHECK-REROLL-LOOPS: "-freroll-loops"
 // CHECK-NO-REROLL-LOOPS-NOT: "-freroll-loops"
 
+// RUN: %clang -### -S -faccurate-sample-profile %s 2>&1 | FileCheck -check-prefix=CHECK-ACCURATE-SAMPLE-PROFILE %s
+// CHECK-ACCURATE-SAMPLE-PROFILE: "-accurate-sample-profile"
+
 // RUN: %clang -### -S -fprofile-sample-use=%S/Inputs/file.prof %s 2>&1 | FileCheck -check-prefix=CHECK-SAMPLE-PROFILE %s
 // CHECK-SAMPLE-PROFILE: "-fprofile-sample-use={{.*}}/file.prof"
Index: lib/Driver/ToolChains/Clang.cpp
===
--- lib/Driver/ToolChains/Clang.cpp
+++ lib/Driver/ToolChains/Clang.cpp
@@ -2340,6 +2340,12 @@
                    true))
     CmdArgs.push_back("-fno-jump-tables");
 
+  if (Args.hasFlag(options::OPT_faccurate_sample_profile,
+                   options::OPT_fno_accurate_sample_profile, false)) {
+    CmdArgs.push_back("-mllvm");
+    CmdArgs.push_back("-accurate-sample-profile");
+  }
+
   if (!Args.hasFlag(options::OPT_fpreserve_as_comments,
                     options::OPT_fno_preserve_as_comments, true))
     CmdArgs.push_back("-fno-preserve-as-comments");
Index: include/clang/Driver/Options.td
===
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -587,6 +587,14 @@
 def fPIE : Flag<["-"], "fPIE">, Group;
 def fno_PIE : Flag<["-"], "fno-PIE">, Group;
 def faccess_control : Flag<["-"], "faccess-control">, Group;
+def faccurate_sample_profile : Flag<["-"], "faccurate-sample-profile">,
+  Group, Flags<[DriverOption]>,
+  HelpText<"If sample profile is accurate, we will mark all un-sampled "
+           "callsite as cold. Otherwise, treat un-sampled callsites as if "
+           "we have no profile">;
+def fno_accurate_sample_profile : Flag<["-"], "fno-accurate-sample-profile">,
+  Group, Flags<[DriverOption]>;
+
 def fallow_unsupported : Flag<["-"], "fallow-unsupported">, Group;
 def fapple_kext : Flag<["-"], "fapple-kext">, Group, Flags<[CC1Option]>,
   HelpText<"Use Apple's kernel extensions ABI">;
@@ -643,6 +651,10 @@
     Alias;
 def fauto_profile_EQ : Joined<["-"], "fauto-profile=">,
     Alias;
+def fauto_profile_accurate : Flag<["-"], "fauto-profile-accurate">,
+    Group, Alias;
+def fno_auto_profile_accurate : Flag<["-"], "fno-auto-profile-accurate">,
+    Group, Alias;
 def fdebug_info_for_profiling : Flag<["-"], "fdebug-info-for-profiling">, Group,
     Flags<[CC1Option]>,
     HelpText<"Emit extra debug info to make sample profile more accurate.">;
Index: docs/ClangCommandLineReference.rst
===
--- docs/ClangCommandLineReference.rst
+++ docs/ClangCommandLineReference.rst
@@ -170,6 +170,11 @@
 
 .. option:: -exported\_symbols\_list
 
+.. option:: -faccurate-sample-profile, -fno-accurate-sample-profile
+.. program:: clang
+
+If sample profile is accurate, we will mark all un-sampled callsite as cold. Otherwise, treat un-sampled callsites as if we have no profile
+
 .. option:: -faligned-new=
 
 .. option:: -fcuda-approx-transcendentals, -fno-cuda-approx-transcendentals
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
davidxl added a comment. Documentation needs to be added to clang/docs/ClangCommandLineReference.rst. There probably also needs to be some kind of test for the option processing; see clang_f_opts.c. https://reviews.llvm.org/D37091
[PATCH] D37091: Expose -mllvm -accurate-sample-profile to clang.
danielcdh created this revision.
Herald added a subscriber: sanjoy.

With an accurate sample profile, we can do more aggressive size optimization. For some size-critical applications, this can reduce the text size by 20%.

https://reviews.llvm.org/D37091

Files:
  include/clang/Driver/Options.td
  lib/Driver/ToolChains/Clang.cpp

Index: lib/Driver/ToolChains/Clang.cpp
===
--- lib/Driver/ToolChains/Clang.cpp
+++ lib/Driver/ToolChains/Clang.cpp
@@ -2340,6 +2340,12 @@
                    true))
     CmdArgs.push_back("-fno-jump-tables");
 
+  if (Args.hasFlag(options::OPT_faccurate_sample_profile,
+                   options::OPT_fno_accurate_sample_profile, false)) {
+    CmdArgs.push_back("-mllvm");
+    CmdArgs.push_back("-accurate-sample-profile");
+  }
+
   if (!Args.hasFlag(options::OPT_fpreserve_as_comments,
                     options::OPT_fno_preserve_as_comments, true))
     CmdArgs.push_back("-fno-preserve-as-comments");
Index: include/clang/Driver/Options.td
===
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -587,6 +587,14 @@
 def fPIE : Flag<["-"], "fPIE">, Group;
 def fno_PIE : Flag<["-"], "fno-PIE">, Group;
 def faccess_control : Flag<["-"], "faccess-control">, Group;
+def faccurate_sample_profile : Flag<["-"], "faccurate-sample-profile">,
+  Group, Flags<[DriverOption]>,
+  HelpText<"If sample profile is accurate, we will mark all un-sampled "
+           "callsite as cold. Otherwise, treat un-sampled callsites as if "
+           "we have no profile">;
+def fno_accurate_sample_profile : Flag<["-"], "fno-accurate-sample-profile">,
+  Group, Flags<[DriverOption]>;
+
 def fallow_unsupported : Flag<["-"], "fallow-unsupported">, Group;
 def fapple_kext : Flag<["-"], "fapple-kext">, Group, Flags<[CC1Option]>,
   HelpText<"Use Apple's kernel extensions ABI">;
@@ -643,6 +651,10 @@
     Alias;
 def fauto_profile_EQ : Joined<["-"], "fauto-profile=">,
     Alias;
+def fauto_profile_accurate : Flag<["-"], "fauto-profile-accurate">,
+    Group, Alias;
+def fno_auto_profile_accurate : Flag<["-"], "fno-auto-profile-accurate">,
+    Group, Alias;
 def fdebug_info_for_profiling : Flag<["-"], "fdebug-info-for-profiling">, Group,
     Flags<[CC1Option]>,
     HelpText<"Emit extra debug info to make sample profile more accurate.">;
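
The driver change in this first revision relies on the usual positive/negative flag resolution: the last of `-faccurate-sample-profile` and `-fno-accurate-sample-profile` on the command line wins, and the default (`false`) applies when neither is present. A minimal standalone sketch of that resolution follows. The helpers `hasFlag` and `buildCc1Args` here are simplified stand-ins written for illustration over plain strings. They are not the `llvm::opt::ArgList` API, and only the option spellings are taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the driver's flag lookup (an assumption for this
// sketch): the last occurrence of either spelling decides the result.
inline bool hasFlag(const std::vector<std::string> &args,
                    const std::string &pos, const std::string &neg,
                    bool defaultValue) {
  assert(pos != neg && "positive and negative spellings must differ");
  bool value = defaultValue;
  for (const std::string &arg : args) {
    if (arg == pos)
      value = true;  // Positive form seen; later arguments may still override.
    else if (arg == neg)
      value = false; // Negative form seen; later arguments may still override.
  }
  return value;
}

// Mirrors the shape of the hunk in lib/Driver/ToolChains/Clang.cpp from this
// revision: forward the backend option only when the flag resolves to true.
inline std::vector<std::string>
buildCc1Args(const std::vector<std::string> &driverArgs) {
  std::vector<std::string> cmdArgs;
  if (hasFlag(driverArgs, "-faccurate-sample-profile",
              "-fno-accurate-sample-profile", false)) {
    cmdArgs.push_back("-mllvm");
    cmdArgs.push_back("-accurate-sample-profile");
  }
  return cmdArgs;
}
```

For example, calling `buildCc1Args` with both spellings in the order positive then negative yields an empty vector, because the negative form comes last.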