Re: [PATCH] doc: c: c++: Document the C/C++ extended asm empty input constraints
On Mon, 22 Feb 2021 at 16:30, Segher Boessenkool wrote:
>
> Hi!
>
> First off, thank you for the patch!

You're welcome!

> On Mon, Feb 15, 2021 at 11:22:52PM +0000, Neven Sajko via Gcc-patches wrote:
> > There is a long-standing, but undocumented GCC inline assembly feature
> > that's part of the extended asm GCC extension to C and C++: extended
> > asm empty input constraints.
>
> There is no such thing. *All* empty constraints have the same
> semantics: anything whatsoever will do. Any register, any constant, any
> memory.

What I was trying to express is that input operand constraints are
unlike output operand constraints in that they can be empty. I now
realize I ended up being slightly confusing, though.

> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -1131,7 +1131,102 @@ the addressing register.
> > @subsection Simple Constraints
> > @cindex simple constraints
> >
> > -The simplest kind of constraint is a string full of letters, each of
> > +An input constraint is allowed to be an empty string, in which case it is
> > +called an empty input constraint.
>
> That is just shorthand for "empty constraint that is used for an input
> operand". It is not special, and it *is* documented:
> https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#Simple-Constraints
>   The simplest kind of constraint is a string full of letters, each of
>   which describes one kind of operand that is permitted.
>
> A length zero string is allowed as well. This could be made more
> explicit sure; OTOH, it isn't very often useful. So your example
> (using it for making a dependency) is certainly useful to have. But
> it is not a special case at all.

Syntactically, it's not a special case; but I definitely think the
semantics could be better documented. Proof:

* There's a relevant Stack Overflow question.
  If I didn't know better I'd conclude from the discussion there that
  empty input constraints are undocumented and unsupported, and there
  would surely be an answer if the documentation on the GCC side was a
  bit better:
  https://stackoverflow.com/questions/63305223/gcc-asm-with-empty-input-operand-constraint

* Clang has erroneously not supported empty constraints for many years
  now (even though their internal documentation still says empty input
  constraints are supported, and external documentation says they
  support all the same constraints as GCC does). I suppose they may
  have been misled by the lack of explicit mention of the feature in
  GCC's documentation.

> > (When an empty input constraint is used,
> > +the assembler template will most probably also be empty. I.e., the @code{asm}
> > +declaration need not contain actual assembly code.)
>
> Don't use parentheses like this in documentation please.

OK.

> > An empty input
> > +constraint can be used to create an artificial dependency on a C or C++
> > +variable (the variable that appears in the expression associated with the
> > +constraint) without incurring unnecessary costs to performance.
>
> It still needs a register (or memory) reserved there (or sometimes a
> constant can be used, but you have no dependency in that case!)

Yeah, this is a bit more complicated than I perhaps implied. An asm
volatile can tell the compiler "I need this value calculated at this
point", but the compiler may still choose to eliminate the calculation
from the generated code if it can perform it itself at compilation
time. Thus currently the programmer must be able to predict whether GCC
will be able to compute the value of some variable or expression; the
good thing is that this is usually easy to predict.

> > +An example of where such behavior may be useful is for preventing compiler
> > +optimizations like dead store elimination or hoisting code outside a loop for
> > +certain pieces of C or C++ code.
>
> You should not think about preventing the compiler from doing something.
> Instead, you can give the compiler extra information that makes it *do*
> something: it has to, because it has to implement the semantics your
> source program has.
>
> > Specific applications may include direct
> > +interaction with hardware features; or things like testing, fuzzing and
> > +benchmarking.
>
> What does this mean?

The manual already has examples for "direct interaction with hardware
features". Benchmarking is another relatively well known example of an
activity during which we may be inconvenienced by the compiler doing
dead store elimination and loop hoisting at certain specific places in
the code. E.g., Google's Benchmark has DoNotOptimize and Facebook's
Folly has doNotOptimizeAway:
https://github.com/google/benchmark/blo
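
To illustrate the benchmarking use case described above, here is a
minimal sketch of a helper in that spirit. It is not the code from
Google Benchmark or Folly: the names keep_alive and sum_to are
hypothetical. The "r,m" fallback with a "memory" clobber is an
assumption, added only because Clang rejects empty constraints, as
mentioned earlier in this mail.

```cpp
#include <cstdint>

// Hypothetical helper: tells the compiler that `value` is used at this
// point, without emitting any instructions of its own.
template <typename T>
inline void keep_alive(const T &value) {
#if defined(__GNUC__) && !defined(__clang__)
    // GCC: an empty input constraint accepts any kind of operand.
    asm volatile("" :: ""(value));
#else
    // Assumed fallback for compilers that reject empty constraints:
    // the value must live in a register or in memory.
    asm volatile("" :: "r,m"(value) : "memory");
#endif
}

// Hypothetical workload: sums 1..n while keeping every intermediate
// total visible to the asm statement.
inline std::uint64_t sum_to(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i;
        keep_alive(total);
    }
    return total;
}
```

Whether the compiler still folds parts of such a loop at compile time
depends on what it can compute by itself, as discussed above, so a
sketch like this should be checked against the generated assembly.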
[PATCH] doc: c: c++: Document the C/C++ extended asm empty input constraints
There is a long-standing, but undocumented GCC inline assembly feature
that's part of the extended asm GCC extension to C and C++: extended
asm empty input constraints.

Although I don't really use extended asm much, and I never contributed
to GCC before, I tried to document the feature as far as I understand
it.

I ran make html to check that the changed Texinfo is well formed.

FTR, empty input constraints have been mentioned on the GCC mailing
lists, e.g.:
https://gcc.gnu.org/pipermail/gcc-help/2015-June/124410.html

I release this contribution into the public domain.

Neven Sajko

gcc/ChangeLog:

	* doc/md.texi: Document extended asm empty input constraints

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index e3686dbfe..deccfd38a 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -1131,7 +1131,102 @@ the addressing register.
 @subsection Simple Constraints
 @cindex simple constraints
 
-The simplest kind of constraint is a string full of letters, each of
+An input constraint is allowed to be an empty string, in which case it is
+called an empty input constraint. (When an empty input constraint is used,
+the assembler template will most probably also be empty. I.e., the @code{asm}
+declaration need not contain actual assembly code.) An empty input
+constraint can be used to create an artificial dependency on a C or C++
+variable (the variable that appears in the expression associated with the
+constraint) without incurring unnecessary costs to performance.
+
+An example of where such behavior may be useful is for preventing compiler
+optimizations like dead store elimination or hoisting code outside a loop for
+certain pieces of C or C++ code. Specific applications may include direct
+interaction with hardware features; or things like testing, fuzzing and
+benchmarking.
+
+Here's a simple C++20 program that is not useful in practice but demonstrates
+relevant behavior; store it as a file called asm.cc:
+
+@verbatim
+#include <vector>
+
+int
+main() {
+    // Greater than or equal to zero.
+    constexpr int asmV = ASM_V;
+
+    // The exact content of v is irrelevant for
+    // this example.
+    std::vector v{7, 6, 9, 3, 2, 0};
+
+    for (int i{0}; i < (1 << 28); i++) {
+        for (int j{0}; j < 6; j++) {
+            // The exact operation on the contents
+            // of v is not relevant for this
+            // example.
+            v[j]++;
+
+            if constexpr (1 <= asmV) {
+                asm volatile ("" :: ""(v.size()));
+                for (auto x: v) {
+                    asm volatile ("" :: ""(x));
+                }
+            }
+            if constexpr (2 <= asmV) {
+                asm volatile ("" :: ""(v.size()));
+                for (auto x: v) {
+                    asm volatile ("" :: ""(x));
+                }
+            }
+            if constexpr (3 <= asmV) {
+                asm volatile ("" :: ""(v.size()));
+                for (auto x: v) {
+                    asm volatile ("" :: ""(x));
+                }
+            }
+        }
+    }
+
+    return 0;
+}
+@end verbatim
+
+Compile with, e.g., the following command (with @code{XXX} equal to @code{0},
+@code{1}, @code{2}, and @code{3}).
+
+@verbatim
+g++ -std=c++20 -O3 -flto -march=native -D ASM_V=XXX -o XXX asm.cc
+@end verbatim
+
+Firstly, for @code{XXX} equal to @code{0}; all of the @code{asm} declarations
+are dead code, thus formally the contents of @var{v} are not observable,
+thus the program consists almost entirely of code that may be eliminated by a
+(valid) compiler. While this usually aligns with what the programming user
+wants, sometimes we might want to, e.g., measure how long does it take for
+some piece of code to execute, even if we aren't interested in its results
+(or already know what its results must be). Such is the case in, e.g.,
+benchmarking.
+
+Secondly, for @code{XXX} equal to @code{1}; only the first part with
+@code{asm} declarations (the body of the first @code{if} statement) is
+effective, and because of it the preceding code can not be eliminated,
+because the @code{asm} declarations depend on @var{v} and its contents as
+input operands. The same effect would exist with a nonempty input constraint
+in place of the empty input constraints, but probably with additional
+unnecessary code generation and diminished performance. The innermost loop
+should not cause any code to be generated, because the input constraint is
+empty.
+
+Thirdly, for @code{XXX} equal to @code{2} or @code{3}; assuming the required compiler
+optimizations are successful, the generated code should be the same as for
+@code{XXX} equal to @code{1}. This is again because of the empty input constraint
+preventing unnecessary code generation (a nonempty input constraint would
+probably require that the compiler store values into either registers or
+memory, even though the assembler template is empty).
+
+The simplest kind of constraint, apart from the empty constraint,
+is a string full of letters, each