> On 7 May 2025, at 10:15 AM, Andrew Pinski <pins...@gmail.com> wrote:
>
> On Tue, May 6, 2025 at 1:35 AM <soum...@nvidia.com> wrote:
>>
>> From: Soumya AR <soum...@nvidia.com>
>>
>> Hi,
>>
>> This RFC and subsequent patch series introduces support for printing and
>> parsing of aarch64 tuning parameters in the form of JSON.
>>
>> It is important to note that this mechanism is specifically intended for
>> power users to experiment with tuning parameters. This proposal does not
>> suggest the use of JSON tuning files in production. Additionally, the JSON
>> format should not be considered stable and may change as GCC evolves.
>>
>> [1] Introduction
>>
>> Currently, the aarch64 backend in GCC (15) stores the tuning parameters of
>> all backends under gcc/config/aarch64/tuning_models/. Since these parameters
>> are hardcoded for each CPU, this RFC proposes a technique to support the
>> adjustment of these parameters at runtime. This allows easier
>> experimentation with more aggressive parameters to find optimal numbers.
>>
>> The tuning data is fed to the compiler in JSON format, which was primarily
>> chosen for the following reasons:
>>
>> * JSON can represent hierarchical data. This is useful for incorporating the
>>   nested nature of the tuning structures.
>> * JSON supports integers, strings, booleans, and arrays.
>> * GCC already has support for parsing and printing JSON, removing the need
>>   for writing APIs to read and write the JSON files.
>>
>> Thus, if we take the following example of some tuning parameters:
>>
>> static struct cpu_addrcost_table generic_armv9_a_addrcost_table =
>> {
>>   {
>>     1, /* hi */
>>     0, /* si */
>>     0, /* di */
>>     1, /* ti */
>>   },
>>   0, /* pre_modify */
>>   0, /* post_modify */
>>   2, /* post_modify_ld3_st3 */
>>   2, /* post_modify_ld4_st4 */
>> };
>>
>> static cpu_prefetch_tune generic_armv9a_prefetch_tune =
>> {
>>   0, /* num_slots */
>>   -1, /* l1_cache_size */
>>   64, /* l1_cache_line_size */
>>   -1, /* l2_cache_size */
>>   true, /* prefetch_dynamic_strides */
>> };
>>
>> static struct tune_params neoversev3_tunings =
>> {
>>   &generic_armv9_a_addrcost_table,
>>   10, /* issue_rate */
>>   AARCH64_FUSE_NEOVERSE_BASE, /* fusible_ops */
>>   "32:16", /* function_align. */
>>   &generic_armv9a_prefetch_tune,
>>   AARCH64_LDP_STP_POLICY_ALWAYS, /* ldp_policy_model. */
>> };
>>
>> We can represent them in JSON as:
>>
>> {
>>   "tune_params": {
>>     "addr_cost": {
>>       "addr_scale_costs": { "hi": 1, "si": 0, "di": 0, "ti": 1 },
>>       "pre_modify": 0,
>>       "post_modify": 0,
>>       "post_modify_ld3_st3": 2,
>>       "post_modify_ld4_st4": 2
>>     },
>>     "issue_rate": 10,
>>     "fusible_ops": 1584,
>>     "function_align": "32:16",
>>     "prefetch": {
>>       "num_slots": 0,
>>       "l1_cache_size": -1,
>>       "l1_cache_line_size": 64,
>>       "l2_cache_size": -1,
>>       "prefetch_dynamic_strides": true
>>     },
>>     "ldp_policy_model": "AARCH64_LDP_STP_POLICY_ALWAYS"
>>   }
>> }
>>
>> ---
>>
>> [2] Methodology
>>
>> Before the internal tuning parameters are overridden with user provided
>> ones, we must ensure the validity of the provided data.
>>
>> This is done using a "base" JSON schema, which contains information about the
>> tune_params data structure used by the aarch64 backend.
>>
>> Example:
>>
>> {
>>   "tune_params": {
>>     "addr_cost": {
>>       "addr_scale_costs": {
>>         "hi": "int",
>>         "si": "int",
>>         "di": "int",
>>         "ti": "int"
>>       },
>>       "pre_modify": "int",
>>       "post_modify": "int",
>>       "post_modify_ld3_st3": "int",
>>       "post_modify_ld4_st4": "int"
>>     },
>>     "issue_rate": "int",
>>     "fusible_ops": "uint",
>>     "function_align": "string",
>>     "prefetch": {
>>       "num_slots": "int",
>>       "l1_cache_size": "int",
>>       "l1_cache_line_size": "int",
>>       "l2_cache_size": "int",
>>       "prefetch_dynamic_strides": "boolean"
>>     },
>>     "ldp_policy_model": "string"
>>   }
>> }
>>
>> Using this schema, we can:
>>   * Verify that the correct datatypes have been used.
>>   * Verify if the user provided "key" or tuning parameter exists.
>>   * Allow user to only specify the required fields (in nested fashion),
>>     eliminating the need to list down every single parameter if they only
>>     wish to experiment with some.
>>
>> The schema is currently stored as a raw JSON string in
>> config/aarch64/aarch64-json-schema.h.
>>
>> 1: Parsing User Input and Overriding aarch64_tune_params
>>
>> Once validated, the data can be extracted and stored into
>> aarch64_tune_params, overriding the default tunings.
>>
>> Thus, if
>> -muser-provided-CPU=<json_file> is specified, we can call the following
>> function in aarch64.cc, to override the default tuning parameters:
>>
>> void
>> aarch64_load_tuning_params_from_json (const char *data_filename,
>>                                       struct tune_params *tune);
>>
>> 2: Dumping Back the Tuning Data (in JSON)
>>
>> If needed, the user can choose to print back the tuning data used during
>> runtime. This is helpful for debugging and getting access to a "starter"
>> tuning file template, which can be then modified and re-fed to the compiler.
>>
>> Thus, if
>> -muser-provided-CPU=<json_file> is specified, we can call the following
>> function in aarch64.cc, after the final tuning structure has been populated:
>>
>> void
>> aarch64_print_tune_params (const tune_params &params, const char *filename);
>>
>> ---
>>
>> [3] Testing
>>
>> To test out the functionality for this change, we have to ensure the
>> following things are happening correctly:
>>
>> 1. The JSON tunings printer is able to print back the correct values,
>>    especially when it comes to trickier datatypes like enums.
>> 2. The error handling works as expected, especially in the case of incorrect
>>    JSON syntax, incorrect datatypes, and incorrect tuning data structure.
>> 3. During GCC invocation, the values from JSON are correctly loaded in
>>    aarch64_tune_params.
>>
>> To test these, we make use of a combination of regression tests (in
>> gcc.target/aarch64/aarch64-json-tunings/) as well as self-tests to check the
>> contents of aarch64_tune_params during the GCC build.
>>
>> ---
>>
>> [4] Limitations:
>>
>> Lack of comments in JSON:
>>   * JSON does not have the ability to store comments, which leads to the
>>     loss of useful information that is provided in the form of comments in
>>     the header files. A workaround is to have a dummy "comment" key and
>>     ignore it when parsing. (e.g., "comment": "parameter description")
>>
>> No enum support in JSON:
>>   * The current workaround for this is to use strings instead of enums,
>>     but we lose out on the ability to pass enum as values, as well as
>>     doing bitwise operations on the enums, something used quite frequently
>>     for some parameters.
>>
>> No type distinction in JSON:
>>   * JSON uses the "number" type which allows signed and unsigned integers
>>     as well as floats but provides no distinction between them.
>>
>> Storing the JSON schema:
>>   * The JSON schema is currently stored as a raw JSON string in
>>     aarch64-json-schema.h.
>>     This is helpful in exposing the file to the testing framework, but is
>>     not the cleanest solution.
>>
>>   * Theoretically, the schema could be stored in the installation
>>     directory, but this interferes with the idea of having self-tests for
>>     the JSON parser.
>>
>> Maintaining the printer/parser routines and JSON schema:
>>   * Any change in the aarch64 tuning format will result in the need for
>>     manual changes to be made to the routines for the JSON tunings printer,
>>     parser, and schema.
>>
>> ---
>>
>> [5] Follow-Up Ideas:
>>
>> JSON to C++ File Conversion:
>>   * Once the user has a JSON file with tuning values they are satisfied
>>     with, they have to manually translate the file back to CPP header files
>>     using the correct structure formats. This can be automated using a
>>     script that reads the JSON data and generates the appropriate header
>>     file.
>
>
> One suggestion is to document this at least in the internals manual so
> folks don't need to point back to sources and can look at a decent
> description of how to use it. Note I don't think this should be
> documented in the user manual though as I don't think users, even
> power ones should depend on it being the same across versions; just in
> a similar fashion `--param` options are treated (or rather should be).
> My only worry about having this ability is that folks will mine to
> find the best idea for their program and their specific core at the
> time and somehow those become the standard for all the future; (an
> example of this is
> https://www.eecis.udel.edu/~xli/publications/park2007dynamic.pdf).
>
Hi Andrew!

Thanks for this suggestion. I'm sorry for the time it took to get back to you.

I’ve updated these patches here:
https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689893.html

I checked how --param comes with an explicit warning for this kind of use.

For this feature as well, we would add similar warnings and make it clear
that the feature is version-dependent and unstable, and additionally that the
JSON format itself is subject to change, as I mentioned in the RFC.

Thanks for pointing this out.

Best,
Soumya

> Thanks,
> Andrew Pinski
>
>>
>> Soumya AR (5):
>>   aarch64 + arm: Remove const keyword from tune_params members and
>>     nested members
>>   aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON
>>   json: Add get_map() method to JSON object class
>>   aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters
>>   aarch64: Regression tests for parsing of user-provided AArch64 CPU
>>     tuning parameters
>>
>>  gcc/config.gcc | 2 +-
>>  gcc/config/aarch64/aarch64-cost-tables.h | 18 +-
>>  gcc/config/aarch64/aarch64-json-schema.h | 261 ++++++
>>  .../aarch64/aarch64-json-tunings-parser.cc | 837 ++++++++++++++++++
>>  .../aarch64/aarch64-json-tunings-parser.h | 29 +
>>  .../aarch64/aarch64-json-tunings-printer.cc | 517 +++++++++++
>>  .../aarch64/aarch64-json-tunings-printer.h | 28 +
>>  gcc/config/aarch64/aarch64-protos.h | 182 ++--
>>  gcc/config/aarch64/aarch64.cc | 45 +-
>>  gcc/config/aarch64/aarch64.opt | 8 +
>>  gcc/config/aarch64/t-aarch64 | 19 +
>>  gcc/config/aarch64/tuning_models/a64fx.h | 14 +-
>>  gcc/config/aarch64/tuning_models/ampere1.h | 8 +-
>>  gcc/config/aarch64/tuning_models/ampere1a.h | 2 +-
>>  gcc/config/aarch64/tuning_models/ampere1b.h | 8 +-
>>  gcc/config/aarch64/tuning_models/cortexa35.h | 2 +-
>>  gcc/config/aarch64/tuning_models/cortexa53.h | 4 +-
>>  gcc/config/aarch64/tuning_models/cortexa57.h | 8 +-
>>  gcc/config/aarch64/tuning_models/cortexa72.h | 2 +-
>>  gcc/config/aarch64/tuning_models/cortexa73.h | 2 +-
>>  gcc/config/aarch64/tuning_models/cortexx925.h | 18 +-
>>  gcc/config/aarch64/tuning_models/emag.h | 2 +-
>>  gcc/config/aarch64/tuning_models/exynosm1.h | 14 +-
>>  .../aarch64/tuning_models/fujitsu_monaka.h | 2 +-
>>  gcc/config/aarch64/tuning_models/generic.h | 18 +-
>>  .../aarch64/tuning_models/generic_armv8_a.h | 18 +-
>>  .../aarch64/tuning_models/generic_armv9_a.h | 22 +-
>>  .../aarch64/tuning_models/neoverse512tvb.h | 10 +-
>>  gcc/config/aarch64/tuning_models/neoversen1.h | 2 +-
>>  gcc/config/aarch64/tuning_models/neoversen2.h | 18 +-
>>  gcc/config/aarch64/tuning_models/neoversen3.h | 18 +-
>>  gcc/config/aarch64/tuning_models/neoversev1.h | 20 +-
>>  gcc/config/aarch64/tuning_models/neoversev2.h | 18 +-
>>  gcc/config/aarch64/tuning_models/neoversev3.h | 18 +-
>>  .../aarch64/tuning_models/neoversev3ae.h | 18 +-
>>  gcc/config/aarch64/tuning_models/qdf24xx.h | 12 +-
>>  gcc/config/aarch64/tuning_models/saphira.h | 2 +-
>>  gcc/config/aarch64/tuning_models/thunderx.h | 10 +-
>>  .../aarch64/tuning_models/thunderx2t99.h | 12 +-
>>  .../aarch64/tuning_models/thunderx3t110.h | 12 +-
>>  .../aarch64/tuning_models/thunderxt88.h | 4 +-
>>  gcc/config/aarch64/tuning_models/tsv110.h | 12 +-
>>  gcc/config/aarch64/tuning_models/xgene1.h | 14 +-
>>  gcc/config/arm/aarch-common-protos.h | 128 +--
>>  gcc/config/arm/aarch-cost-tables.h | 12 +-
>>  gcc/config/arm/arm-protos.h | 2 +-
>>  gcc/config/arm/arm.cc | 20 +-
>>  gcc/json.h | 21 +-
>>  gcc/selftest-run-tests.cc | 1 +
>>  gcc/selftest.h | 1 +
>>  .../aarch64-json-tunings.exp | 35 +
>>  .../aarch64/aarch64-json-tunings/boolean-1.c | 6 +
>>  .../aarch64-json-tunings/boolean-1.json | 9 +
>>  .../aarch64/aarch64-json-tunings/boolean-2.c | 7 +
>>  .../aarch64-json-tunings/boolean-2.json | 9 +
>>  .../aarch64-json-tunings/empty-brackets.c | 6 +
>>  .../aarch64-json-tunings/empty-brackets.json | 1 +
>>  .../aarch64/aarch64-json-tunings/empty.c | 6 +
>>  .../aarch64/aarch64-json-tunings/empty.json | 0
>>  .../aarch64/aarch64-json-tunings/enum-1.c | 8 +
>>  .../aarch64/aarch64-json-tunings/enum-1.json | 7 +
>>  .../aarch64/aarch64-json-tunings/enum-2.c | 7 +
>>  .../aarch64/aarch64-json-tunings/enum-2.json | 7 +
>>  .../aarch64/aarch64-json-tunings/integer-1.c | 7 +
>>  .../aarch64-json-tunings/integer-1.json | 6 +
>>  .../aarch64/aarch64-json-tunings/integer-2.c | 7 +
>>  .../aarch64-json-tunings/integer-2.json | 6 +
>>  .../aarch64/aarch64-json-tunings/integer-3.c | 7 +
>>  .../aarch64-json-tunings/integer-3.json | 5 +
>>  .../aarch64/aarch64-json-tunings/integer-4.c | 6 +
>>  .../aarch64-json-tunings/integer-4.json | 5 +
>>  .../aarch64/aarch64-json-tunings/string-1.c | 8 +
>>  .../aarch64-json-tunings/string-1.json | 7 +
>>  .../aarch64/aarch64-json-tunings/string-2.c | 7 +
>>  .../aarch64-json-tunings/string-2.json | 5 +
>>  .../aarch64-json-tunings/unidentified-key.c | 6 +
>>  .../unidentified-key.json | 5 +
>>  77 files changed, 2289 insertions(+), 381 deletions(-)
>>  create mode 100644 gcc/config/aarch64/aarch64-json-schema.h
>>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.cc
>>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.h
>>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.cc
>>  create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.h
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/aarch64-json-tunings.exp
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.json
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.json
>>
>> --
>> 2.44.0
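
For readers skimming the thread, the schema check described under [2] Methodology in the quoted RFC (type checking, rejecting unknown keys, and accepting partially specified nested objects) can be sketched in standalone C++ roughly as follows. This is only an illustration under stated assumptions: it is not the code from the patch series and does not use the classes in gcc/json.h. The names Value, Object, obj, leaf_matches, validate and example_schema are all made up for this sketch, and the parsed JSON tree is modelled with a plain std::variant.

```cpp
#include <initializer_list>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for a parsed JSON tree; this is NOT GCC's json.h API.
struct Object;
using ObjectPtr = std::shared_ptr<Object>;
using Value = std::variant<long, bool, std::string, ObjectPtr>;
struct Object { std::map<std::string, Value> fields; };

// Illustrative helper to build nested objects tersely.
ObjectPtr obj(std::initializer_list<std::pair<const std::string, Value>> init) {
  auto o = std::make_shared<Object>();
  o->fields = init;
  return o;
}

// True if `v` has the type named by a schema leaf ("int", "uint", ...).
bool leaf_matches(const std::string &type, const Value &v) {
  if (type == "int") return std::holds_alternative<long>(v);
  if (type == "uint")
    return std::holds_alternative<long>(v) && std::get<long>(v) >= 0;
  if (type == "boolean") return std::holds_alternative<bool>(v);
  if (type == "string") return std::holds_alternative<std::string>(v);
  return false;
}

// Walk only the keys the user supplied, so partial files are accepted.
// The schema is trusted: its leaves are assumed to be type-name strings.
void validate(const Object &schema, const Object &user,
              const std::string &path, std::vector<std::string> &errors) {
  for (const auto &[key, value] : user.fields) {
    const std::string here = path.empty() ? key : path + "." + key;
    auto it = schema.fields.find(key);
    if (it == schema.fields.end()) {
      errors.push_back("unknown key: " + here);
      continue;
    }
    const Value &expected = it->second;
    if (auto nested = std::get_if<ObjectPtr>(&expected)) {
      if (auto user_nested = std::get_if<ObjectPtr>(&value))
        validate(**nested, **user_nested, here, errors);
      else
        errors.push_back("expected object: " + here);
    } else if (!leaf_matches(std::get<std::string>(expected), value)) {
      errors.push_back("wrong type: " + here);
    }
  }
}

// A trimmed copy of the schema quoted in the RFC, for demonstration only.
ObjectPtr example_schema() {
  using S = std::string;
  return obj({{"tune_params",
               obj({{"issue_rate", S("int")},
                    {"fusible_ops", S("uint")},
                    {"function_align", S("string")},
                    {"prefetch",
                     obj({{"num_slots", S("int")},
                          {"prefetch_dynamic_strides", S("boolean")}})}})}});
}
```

A caller would build the user tree from the parsed file, call validate (*schema, *user, "", errors), and only override the tuning structure if errors is empty. Iterating over the user's keys rather than the schema's is what lets a file mention just the fields being experimented with.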