> On 7 May 2025, at 10:15 AM, Andrew Pinski <pins...@gmail.com> wrote:
> 
> On Tue, May 6, 2025 at 1:35 AM <soum...@nvidia.com> wrote:
>> 
>> From: Soumya AR <soum...@nvidia.com>
>> 
>> Hi,
>> 
>> This RFC and the subsequent patch series introduce support for printing and
>> parsing aarch64 tuning parameters in the form of JSON.
>> 
>> It is important to note that this mechanism is specifically intended for
>> power users to experiment with tuning parameters. This proposal does not
>> suggest the use of JSON tuning files in production. Additionally, the JSON
>> format should not be considered stable and may change as GCC evolves.
>> 
>> [1] Introduction
>> 
>> Currently, the aarch64 backend in GCC (15) stores the tuning parameters of
>> all CPUs under gcc/config/aarch64/tuning_models/. Since these parameters
>> are hardcoded for each CPU, this RFC proposes a technique to support the
>> adjustment of these parameters at runtime. This allows easier experimentation
>> with more aggressive parameters to find optimal numbers.
>> 
>> The tuning data is fed to the compiler in JSON format, which was primarily
>> chosen for the following reasons:
>> 
>> * JSON can represent hierarchical data. This is useful for incorporating the
>> nested nature of the tuning structures.
>> * JSON supports integers, strings, booleans, and arrays.
>> * GCC already has support for parsing and printing JSON, removing the need
>> for writing APIs to read and write the JSON files.
>> 
>> Thus, if we take the following example of some tuning parameters:
>> 
>> static struct cpu_addrcost_table generic_armv9_a_addrcost_table =
>> {
>>    {
>>      1, /* hi  */
>>      0, /* si  */
>>      0, /* di  */
>>      1, /* ti  */
>>    },
>>  0, /* pre_modify  */
>>  0, /* post_modify  */
>>  2, /* post_modify_ld3_st3  */
>>  2, /* post_modify_ld4_st4  */
>> };
>> 
>> static cpu_prefetch_tune generic_armv9a_prefetch_tune =
>> {
>>  0,                    /* num_slots  */
>>  -1,                   /* l1_cache_size  */
>>  64,                   /* l1_cache_line_size  */
>>  -1,                   /* l2_cache_size  */
>>  true,                 /* prefetch_dynamic_strides */
>> };
>> 
>> static struct tune_params neoversev3_tunings =
>> {
>>  &generic_armv9_a_addrcost_table,
>>  10, /* issue_rate  */
>>  AARCH64_FUSE_NEOVERSE_BASE, /* fusible_ops  */
>>  "32:16",      /* function_align.  */
>>  &generic_armv9a_prefetch_tune,
>>  AARCH64_LDP_STP_POLICY_ALWAYS,   /* ldp_policy_model.  */
>> };
>> 
>> We can represent them in JSON as:
>> 
>> {
>>  "tune_params": {
>>    "addr_cost": {
>>      "addr_scale_costs": { "hi": 1, "si": 0, "di": 0, "ti": 1 },
>>      "pre_modify": 0,
>>      "post_modify": 0,
>>      "post_modify_ld3_st3": 2,
>>      "post_modify_ld4_st4": 2
>>    },
>>    "issue_rate": 10,
>>    "fusible_ops": 1584,
>>    "function_align": "32:16",
>>    "prefetch": {
>>      "num_slots": 0,
>>      "l1_cache_size": -1,
>>      "l1_cache_line_size": 64,
>>      "l2_cache_size": -1,
>>      "prefetch_dynamic_strides": true
>>    },
>>    "ldp_policy_model": "AARCH64_LDP_STP_POLICY_ALWAYS"
>>  }
>> }
>> 
>> ---
>> 
>> [2] Methodology
>> 
>> Before the internal tuning parameters are overridden with user-provided
>> ones, we must ensure the validity of the provided data.
>> 
>> This is done using a "base" JSON schema, which contains information about the
>> tune_params data structure used by the aarch64 backend.
>> 
>> Example:
>> 
>> {
>>  "tune_params": {
>>    "addr_cost": {
>>      "addr_scale_costs": {
>>        "hi": "int",
>>        "si": "int",
>>        "di": "int",
>>        "ti": "int"
>>      },
>>      "pre_modify": "int",
>>      "post_modify": "int",
>>      "post_modify_ld3_st3": "int",
>>      "post_modify_ld4_st4": "int"
>>    },
>>    "issue_rate": "int",
>>    "fusible_ops": "uint",
>>    "function_align": "string",
>>    "prefetch": {
>>      "num_slots": "int",
>>      "l1_cache_size": "int",
>>      "l1_cache_line_size": "int",
>>      "l2_cache_size": "int",
>>      "prefetch_dynamic_strides": "boolean"
>>    },
>>    "ldp_policy_model": "string"
>>  }
>> }
>> 
>> Using this schema, we can:
>>        * Verify that the correct datatypes have been used.
>>        * Verify that each user-provided "key" (tuning parameter) exists.
>>        * Allow the user to specify only the required fields (in nested
>>        fashion), eliminating the need to list every single parameter if they
>>        only wish to experiment with some.
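
As an inline illustration of the three checks listed above, here is a minimal C++ sketch. It deliberately does not use GCC's json.h: the Node type, the helper constructors (num, flag, str, obj), the validate function and the error strings are all hypothetical names made up for this example, and the parser in the patch series may be structured quite differently.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parsed JSON value; this is NOT GCC's json::value.
struct Node {
  enum Kind { NUMBER, BOOLEAN, STRING, OBJECT } kind = OBJECT;
  double number = 0;
  bool boolean = false;
  std::string text;
  std::map<std::string, std::shared_ptr<Node>> members;
};
using NodePtr = std::shared_ptr<Node>;

static NodePtr make (Node::Kind k) { auto n = std::make_shared<Node> (); n->kind = k; return n; }
static NodePtr num (double d) { auto n = make (Node::NUMBER); n->number = d; return n; }
static NodePtr flag (bool b) { auto n = make (Node::BOOLEAN); n->boolean = b; return n; }
static NodePtr str (const std::string &s) { auto n = make (Node::STRING); n->text = s; return n; }
static NodePtr obj (std::map<std::string, NodePtr> m)
{ auto n = make (Node::OBJECT); n->members = std::move (m); return n; }

// Check USER against SCHEMA, appending one message per problem to ERRORS.
static void
validate (const Node &user, const Node &schema, const std::string &path,
          std::vector<std::string> &errors)
{
  if (schema.kind == Node::OBJECT)
    {
      if (user.kind != Node::OBJECT)
        {
          errors.push_back (path + ": expected an object");
          return;
        }
      // Only keys the user supplied are visited, so partial files are accepted.
      for (const auto &kv : user.members)
        {
          auto it = schema.members.find (kv.first);
          if (it == schema.members.end ())
            errors.push_back (path + "/" + kv.first + ": unknown key");
          else
            validate (*kv.second, *it->second, path + "/" + kv.first, errors);
        }
      return;
    }
  // Schema leaves are type names such as "int" or "boolean".
  assert (schema.kind == Node::STRING);
  const std::string &type = schema.text;
  bool integral = user.kind == Node::NUMBER
                  && std::floor (user.number) == user.number;
  bool ok = (type == "int" && integral)
            || (type == "uint" && integral && user.number >= 0)
            || (type == "string" && user.kind == Node::STRING)
            || (type == "boolean" && user.kind == Node::BOOLEAN);
  if (!ok)
    errors.push_back (path + ": expected " + type);
}
```

With a schema built as obj ({{"issue_rate", str ("int")}}), a user tree containing only "issue_rate": 10 passes with no messages. A string value for that key or a key the schema does not know each yield exactly one message. Walking only the user's keys is what makes partial files work in this sketch.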
>> 
>> The schema is currently stored as a raw JSON string in
>> config/aarch64/aarch64-json-schema.h.
>> 
>> 1: Parsing User Input and Overriding aarch64_tune_params
>> 
>> Once validated, the data can be extracted and stored into
>> aarch64_tune_params, overriding the default tunings.
>> 
>> Thus, if -muser-provided-CPU=<json_file> is specified, we can call the
>> following function in aarch64.cc to override the default tuning parameters:
>> 
>> void
>> aarch64_load_tuning_params_from_json (const char *data_filename,
>>                                      struct tune_params *tune);
>> 
>> 2: Dumping Back the Tuning Data (in JSON)
>> 
>> If needed, the user can choose to print back the tuning data used during
>> runtime. This is helpful for debugging and for getting access to a "starter"
>> tuning file template, which can then be modified and re-fed to the compiler.
>> 
>> Thus, if -muser-provided-CPU=<json_file> is specified, we can call the
>> following function in aarch64.cc after the final tuning structure has been
>> populated:
>> 
>> void
>> aarch64_print_tune_params (const tune_params &params, const char *filename);
>> 
>> ---
>> 
>> [3] Testing
>> 
>> To test the functionality of this change, we have to ensure that the
>> following things happen correctly:
>> 
>> 1. The JSON tunings printer is able to print back the correct values,
>> especially when it comes to trickier datatypes like enums.
>> 2. The error handling works as expected, especially in the case of incorrect
>> JSON syntax, incorrect datatypes, and incorrect tuning data structure.
>> 3. During GCC invocation, the values from JSON are correctly loaded into
>> aarch64_tune_params.
>> 
>> To test these, we make use of a combination of regression tests (in
>> gcc.target/aarch64/aarch64-json-tunings/) as well as self-tests to check the
>> contents of aarch64_tune_params during the GCC build.
>> 
>> ---
>> 
>> [4] Limitations:
>> 
>> Lack of comments in JSON:
>>        * JSON does not have the ability to store comments, which leads to the
>>        loss of useful information that is provided in the form of comments in
>>        the header files. A workaround is to have a dummy "comment" key and
>>        ignore it when parsing. (e.g., "comment": "parameter description")
>> 
>> No enum support in JSON:
>>        * The current workaround for this is to use strings instead of enums,
>>        but we lose the ability to pass enums as values, as well as to do
>>        bitwise operations on the enums, something used quite frequently for
>>        some parameters.
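
To make the trade-off concrete, here is a small C++ sketch of the string-to-enum lookup, plus one conceivable way to recover bitwise combinations by accepting names joined with '|'. This is only an illustration, not something the RFC proposes. The enum ldp_policy, both function names and the DEMO_FUSE_* names with their bit values are invented for the example. Only the string AARCH64_LDP_STP_POLICY_ALWAYS comes from the RFC text; the NEVER spelling is an assumption.

```cpp
#include <map>
#include <optional>
#include <sstream>
#include <string>

// Illustrative enum only; the real policy enum lives in the aarch64 backend.
enum class ldp_policy { ALWAYS, NEVER };

// Map a JSON string to an enum value, or nullopt if the name is unknown.
static std::optional<ldp_policy>
parse_policy (const std::string &name)
{
  static const std::map<std::string, ldp_policy> table = {
    {"AARCH64_LDP_STP_POLICY_ALWAYS", ldp_policy::ALWAYS},
    {"AARCH64_LDP_STP_POLICY_NEVER", ldp_policy::NEVER},  // assumed spelling
  };
  auto it = table.find (name);
  if (it == table.end ())
    return std::nullopt;
  return it->second;
}

// Parse "A|B|C" into a bitmask.  Names and bit values here are made up.
static std::optional<unsigned>
parse_flags (const std::string &text)
{
  static const std::map<std::string, unsigned> bits = {
    {"DEMO_FUSE_A", 1u << 0},
    {"DEMO_FUSE_B", 1u << 1},
    {"DEMO_FUSE_C", 1u << 2},
  };
  unsigned mask = 0;
  std::istringstream in (text);
  std::string name;
  // No whitespace trimming is done; "A | B" would be rejected.
  while (std::getline (in, name, '|'))
    {
      auto it = bits.find (name);
      if (it == bits.end ())
        return std::nullopt;
      mask |= it->second;
    }
  return mask;
}
```

With this table, parse_flags ("DEMO_FUSE_A|DEMO_FUSE_C") gives the mask 5, and any unknown name makes the whole string invalid. Writing a raw number such as 1584 for fusible_ops avoids the lookup table but is harder to read.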
>> 
>> No type distinction in JSON:
>>        * JSON uses the "number" type, which allows signed and unsigned
>>        integers as well as floats but provides no distinction between them.
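
In practice this means the reader has to decide, per field, whether a JSON number is acceptable as an int or an unsigned value. A minimal C++ sketch of such a check follows, assuming the JSON number arrives as a double. The function names as_int and as_uint are hypothetical and are not taken from the patch series.

```cpp
#include <cmath>
#include <limits>
#include <optional>

// Return D as an int if it is a whole number within int range.
static std::optional<int>
as_int (double d)
{
  if (!std::isfinite (d) || std::trunc (d) != d)
    return std::nullopt;
  if (d < std::numeric_limits<int>::min ()
      || d > std::numeric_limits<int>::max ())
    return std::nullopt;
  return static_cast<int> (d);
}

// Return D as an unsigned value if it is a whole, non-negative number
// within unsigned range.
static std::optional<unsigned>
as_uint (double d)
{
  if (!std::isfinite (d) || std::trunc (d) != d)
    return std::nullopt;
  if (d < 0 || d > std::numeric_limits<unsigned>::max ())
    return std::nullopt;
  return static_cast<unsigned> (d);
}
```

Under these rules -1 is a valid "int" (as used for l1_cache_size) but is rejected as a "uint", and 1.5 is rejected by both.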
>> 
>> Storing the JSON schema:
>>        * The JSON schema is currently stored as a raw JSON string in
>>        aarch64-json-schema.h. This is helpful in exposing the file to the
>>        testing framework, but is not the cleanest solution.
>> 
>>        * Theoretically, the schema could be stored in the installation
>>        directory, but this interferes with the idea of having self-tests for
>>        the JSON parser.
>> 
>> Maintaining the printer/parser routines and JSON schema:
>>        * Any change in the aarch64 tuning format will require manual changes
>>        to the routines for the JSON tunings printer, parser, and schema.
>> 
>> ---
>> 
>> [5] Follow-Up Ideas:
>> 
>> JSON to C++ File Conversion:
>>        * Once the user has a JSON file with tuning values they are satisfied
>>        with, they have to manually translate the file back to C++ header
>>        files using the correct structure formats. This can be automated
>>        using a script that reads the JSON data and generates the appropriate
>>        header file.
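
As a rough idea of what such a generator would emit, here is a C++ sketch that turns already-parsed prefetch values into initializer text in the style of the RFC example. The struct prefetch_values and the function emit_prefetch_tune are hypothetical. The sketch mirrors only the five fields shown in the RFC example and skips the JSON reading step entirely.

```cpp
#include <sstream>
#include <string>

// Hypothetical mirror of the prefetch fields shown in the RFC example.
struct prefetch_values {
  int num_slots;
  int l1_cache_size;
  int l1_cache_line_size;
  int l2_cache_size;
  bool prefetch_dynamic_strides;
};

// Render P as the text of a C++ initializer for a variable called NAME.
static std::string
emit_prefetch_tune (const std::string &name, const prefetch_values &p)
{
  std::ostringstream out;
  out << "static cpu_prefetch_tune " << name << " =\n"
      << "{\n"
      << "  " << p.num_slots << ", /* num_slots  */\n"
      << "  " << p.l1_cache_size << ", /* l1_cache_size  */\n"
      << "  " << p.l1_cache_line_size << ", /* l1_cache_line_size  */\n"
      << "  " << p.l2_cache_size << ", /* l2_cache_size  */\n"
      << "  " << (p.prefetch_dynamic_strides ? "true" : "false")
      << ", /* prefetch_dynamic_strides */\n"
      << "};\n";
  return out.str ();
}
```

Calling emit_prefetch_tune ("demo_prefetch_tune", {0, -1, 64, -1, true}) produces a block that can be pasted into a tuning header and then adjusted by hand where the real structure has more members.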
> 
> 
> One suggestion is to document this at least in the internals manual so
> folks don't need to point back to sources and can look at a decent
> description of how to use it. Note I don't think this should be
> documented in the user manual though, as I don't think users, even
> power ones, should depend on it being the same across versions; just in
> a similar fashion to how `--param` options are treated (or rather should be).
> My only worry about having this ability is that folks will mine to
> find the best idea for their program and their specific core at the
> time and somehow those become the standard for all the future (an
> example of this is
> https://www.eecis.udel.edu/~xli/publications/park2007dynamic.pdf).
> 

Hi Andrew!

Thanks for this suggestion. I'm sorry for the time it took to get back to you.

I’ve updated these patches here:
https://gcc.gnu.org/pipermail/gcc-patches/2025-July/689893.html

I checked, and --param comes with an explicit warning about this kind of use.

I guess for this feature as well, we would add similar warnings and make it
clear that the feature is version-dependent and unstable, and additionally that
the JSON format itself is subject to change, as I mentioned in the RFC.

Thanks for pointing this out.

Best,
Soumya


> Thanks,
> Andrew Pinski
> 
>> 
>> Soumya AR (5):
>>  aarch64 + arm: Remove const keyword from tune_params members and
>>    nested members
>>  aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON
>>  json: Add get_map() method to JSON object class
>>  aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters
>>  aarch64: Regression tests for parsing of user-provided AArch64 CPU
>>    tuning parameters
>> 
>> gcc/config.gcc                                |   2 +-
>> gcc/config/aarch64/aarch64-cost-tables.h      |  18 +-
>> gcc/config/aarch64/aarch64-json-schema.h      | 261 ++++++
>> .../aarch64/aarch64-json-tunings-parser.cc    | 837 ++++++++++++++++++
>> .../aarch64/aarch64-json-tunings-parser.h     |  29 +
>> .../aarch64/aarch64-json-tunings-printer.cc   | 517 +++++++++++
>> .../aarch64/aarch64-json-tunings-printer.h    |  28 +
>> gcc/config/aarch64/aarch64-protos.h           | 182 ++--
>> gcc/config/aarch64/aarch64.cc                 |  45 +-
>> gcc/config/aarch64/aarch64.opt                |   8 +
>> gcc/config/aarch64/t-aarch64                  |  19 +
>> gcc/config/aarch64/tuning_models/a64fx.h      |  14 +-
>> gcc/config/aarch64/tuning_models/ampere1.h    |   8 +-
>> gcc/config/aarch64/tuning_models/ampere1a.h   |   2 +-
>> gcc/config/aarch64/tuning_models/ampere1b.h   |   8 +-
>> gcc/config/aarch64/tuning_models/cortexa35.h  |   2 +-
>> gcc/config/aarch64/tuning_models/cortexa53.h  |   4 +-
>> gcc/config/aarch64/tuning_models/cortexa57.h  |   8 +-
>> gcc/config/aarch64/tuning_models/cortexa72.h  |   2 +-
>> gcc/config/aarch64/tuning_models/cortexa73.h  |   2 +-
>> gcc/config/aarch64/tuning_models/cortexx925.h |  18 +-
>> gcc/config/aarch64/tuning_models/emag.h       |   2 +-
>> gcc/config/aarch64/tuning_models/exynosm1.h   |  14 +-
>> .../aarch64/tuning_models/fujitsu_monaka.h    |   2 +-
>> gcc/config/aarch64/tuning_models/generic.h    |  18 +-
>> .../aarch64/tuning_models/generic_armv8_a.h   |  18 +-
>> .../aarch64/tuning_models/generic_armv9_a.h   |  22 +-
>> .../aarch64/tuning_models/neoverse512tvb.h    |  10 +-
>> gcc/config/aarch64/tuning_models/neoversen1.h |   2 +-
>> gcc/config/aarch64/tuning_models/neoversen2.h |  18 +-
>> gcc/config/aarch64/tuning_models/neoversen3.h |  18 +-
>> gcc/config/aarch64/tuning_models/neoversev1.h |  20 +-
>> gcc/config/aarch64/tuning_models/neoversev2.h |  18 +-
>> gcc/config/aarch64/tuning_models/neoversev3.h |  18 +-
>> .../aarch64/tuning_models/neoversev3ae.h      |  18 +-
>> gcc/config/aarch64/tuning_models/qdf24xx.h    |  12 +-
>> gcc/config/aarch64/tuning_models/saphira.h    |   2 +-
>> gcc/config/aarch64/tuning_models/thunderx.h   |  10 +-
>> .../aarch64/tuning_models/thunderx2t99.h      |  12 +-
>> .../aarch64/tuning_models/thunderx3t110.h     |  12 +-
>> .../aarch64/tuning_models/thunderxt88.h       |   4 +-
>> gcc/config/aarch64/tuning_models/tsv110.h     |  12 +-
>> gcc/config/aarch64/tuning_models/xgene1.h     |  14 +-
>> gcc/config/arm/aarch-common-protos.h          | 128 +--
>> gcc/config/arm/aarch-cost-tables.h            |  12 +-
>> gcc/config/arm/arm-protos.h                   |   2 +-
>> gcc/config/arm/arm.cc                         |  20 +-
>> gcc/json.h                                    |  21 +-
>> gcc/selftest-run-tests.cc                     |   1 +
>> gcc/selftest.h                                |   1 +
>> .../aarch64-json-tunings.exp                  |  35 +
>> .../aarch64/aarch64-json-tunings/boolean-1.c  |   6 +
>> .../aarch64-json-tunings/boolean-1.json       |   9 +
>> .../aarch64/aarch64-json-tunings/boolean-2.c  |   7 +
>> .../aarch64-json-tunings/boolean-2.json       |   9 +
>> .../aarch64-json-tunings/empty-brackets.c     |   6 +
>> .../aarch64-json-tunings/empty-brackets.json  |   1 +
>> .../aarch64/aarch64-json-tunings/empty.c      |   6 +
>> .../aarch64/aarch64-json-tunings/empty.json   |   0
>> .../aarch64/aarch64-json-tunings/enum-1.c     |   8 +
>> .../aarch64/aarch64-json-tunings/enum-1.json  |   7 +
>> .../aarch64/aarch64-json-tunings/enum-2.c     |   7 +
>> .../aarch64/aarch64-json-tunings/enum-2.json  |   7 +
>> .../aarch64/aarch64-json-tunings/integer-1.c  |   7 +
>> .../aarch64-json-tunings/integer-1.json       |   6 +
>> .../aarch64/aarch64-json-tunings/integer-2.c  |   7 +
>> .../aarch64-json-tunings/integer-2.json       |   6 +
>> .../aarch64/aarch64-json-tunings/integer-3.c  |   7 +
>> .../aarch64-json-tunings/integer-3.json       |   5 +
>> .../aarch64/aarch64-json-tunings/integer-4.c  |   6 +
>> .../aarch64-json-tunings/integer-4.json       |   5 +
>> .../aarch64/aarch64-json-tunings/string-1.c   |   8 +
>> .../aarch64-json-tunings/string-1.json        |   7 +
>> .../aarch64/aarch64-json-tunings/string-2.c   |   7 +
>> .../aarch64-json-tunings/string-2.json        |   5 +
>> .../aarch64-json-tunings/unidentified-key.c   |   6 +
>> .../unidentified-key.json                     |   5 +
>> 77 files changed, 2289 insertions(+), 381 deletions(-)
>> create mode 100644 gcc/config/aarch64/aarch64-json-schema.h
>> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.cc
>> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-parser.h
>> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.cc
>> create mode 100644 gcc/config/aarch64/aarch64-json-tunings-printer.h
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/aarch64-json-tunings.exp
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-1.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/boolean-2.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty-brackets.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/empty.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-1.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/enum-2.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-1.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-2.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-3.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/integer-4.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-1.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/string-2.json
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/aarch64-json-tunings/unidentified-key.json
>> 
>> --
>> 2.44.0

