https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118356
Bug ID: 118356
Summary: RISC-V: -falign-labels=0 should (probably) default to 4
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: cousteaulecommandant at gmail dot com
Target Milestone: ---
Some RISC-V implementations, including the CORE-V CVE4 family [1], allow having
instructions aligned to 2- or 4-byte boundaries, but introduce an extra clock
cycle penalty if the target of a branch instruction is a 4-byte instruction
that is not aligned to a 4-byte boundary (but not if the target instruction is
aligned to a 4-byte boundary, or if it's a 2-byte instruction).
In those cases, forcing alignment of branch targets to 4 bytes (which can be
achieved by providing `-falign-labels=4`) can provide a great improvement on
certain programs. For example, a tight `for` loop may take 9 clock cycles to
run if the branch target is aligned but 10 if it's not, resulting in a 10%
performance loss. (What's worse, this performance loss will only kick in
arbitrarily, and can appear or disappear even if I change a completely
different part of the code, which drove me crazy when I was trying to measure
the performance of a function affected by this issue; enabling
`-falign-labels=4` also has the advantage of removing this uncertainty.)
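To make the scenario concrete, here is a minimal sketch of the kind of tight loop I mean. The function name `sum_u32`, the file name `loop.c`, and the toolchain prefix `riscv32-unknown-elf-` are only illustrative assumptions, not taken from my actual code. Whether the loop head ends up being a 4-byte instruction at a non-4-byte-aligned address depends on the compiler version and on the surrounding code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical tight loop. The loop head is the target of the backward
 * branch, so its alignment is what matters on cores that penalize a
 * misaligned 4-byte branch target.
 *
 * Assumed commands for comparing the generated assembly (the toolchain
 * prefix and file name are placeholders):
 *
 *   riscv32-unknown-elf-gcc -march=rv32imc -mabi=ilp32 -O2 -S loop.c
 *   riscv32-unknown-elf-gcc -march=rv32imc -mabi=ilp32 -O2 \
 *       -falign-labels=4 -S loop.c
 */
uint32_t sum_u32(const uint32_t *data, size_t n)
{
    uint32_t acc = 0;

    /* The branch back to the top of this loop is taken once per iteration. */
    for (size_t i = 0; i < n; i++) {
        acc += data[i];
    }
    return acc;
}
```

Comparing the two assembly outputs should show whether an alignment directive is emitted before the loop label in the second case.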
Here, my expectation would be that enabling a certain optimization level (such
as `-O2`) would enable this particular optimization. In fact, the documentation [2]
states that `-O2` enables the `-falign-labels` flag, but without specifying an
alignment. It later states that `-falign-labels` without a value or with `=0`
will "use a machine-dependent default which is very likely to be ‘1’, meaning
no alignment".
Now, I don't quite understand why `-O2` would want to enable an optimization
option whose default behavior is to do nothing, but my guess is that this is so
that specific targets where setting `-falign-labels=X` can provide an advantage
(as is the case with RISC-V) use `X` as the default value rather than 1.
What do you think? Would it make sense for RISC-V targets to make
4 the default alignment value when `-falign-labels` is used
without providing an explicit value, so that this forced alignment will happen
when `-O2` or `-O3` is used?
[1]: https://docs.openhwgroup.org/projects/cv32e40p-user-manual/en/latest/pipeline.html#cycle-counts-per-instruction-type
[2]: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html