On Fri, 2021-04-09 at 22:09 +0200, Peter Zijlstra wrote:
> On Fri, Apr 09, 2021 at 03:21:49PM -0400, David Malcolm wrote:
> > [Caveat: I'm a gcc developer, not a kernel expert]
> >
> > But it's not *quite* a global constant, or presumably you would be
> > simply using a global constant, right? As the optimizer gets smarter,
> > you don't want to have it one day decide that actually it really is
> > constant, and optimize away everything at compile-time (e.g. when LTO
> > is turned on, or whatnot).
>
> Right; as I said, the result is not a constant, but any invocation ever,
> will return the same result. Small but subtle difference :-)
>
> > I get the impression that you're resorting to assembler because you're
> > pushing beyond what the C language can express.
>
> Of course :-) I tend to always push waaaaay past what C considers sane.
> Lets say I'm firmly in the C-as-Optimizing-Assembler camp :-)

Yeah, I got that :)

> > Taking things to a slightly higher level, am I right in thinking that
> > what you're trying to achieve is a control flow construct that almost
> > always takes one of the given branches, but which can (very rarely) be
> > switched to permanently take one of the other branches, and that you
> > want the lowest possible overhead for the common case where the
> > control flow hasn't been touched yet?
>
> Correct, that's what it is. We do runtime code patching to flip the
> branch if/when needed. We've been doing this for many many years now.
>
> The issue of today is all this clever stuff defeating some simple
> optimizations.

It's certainly clever - though, if you'll forgive me, that's not always
a good thing :)

> > (and presumably little overhead for when it
> > has been?)... and that you want to be able to merge repeated such
> > conditionals.
>
> This.. So the 'static' branches have been upstream and in use ever since
> GCC added asm-goto, it was in fact the driving force to get asm-goto
> implemented. This was 2010 according to git history.
>
> So we emit, using asm goto, either a "NOP5" or "JMP.d32" (x86 speaking),
> and a special section entry into which we encode the key address and the
> instruction address and the jump target.
>
> GCC, not knowing what the asm does, only sees the 2 edges and all is
> well.
>
> Then, at runtime, when we decide we want the other edge for a given key,
> we iterate our section and rewrite the code to either nop5 or jmp.d32
> with the correct jump target.
>
> > It's kind of the opposite of "volatile" - something that the user is
> > happy for the compiler to treat as not changing much, as opposed to
> > something the user is warning the compiler about changing from under
> > it. A "const-ish" value?
>
> Just so. Encoded in text, not data.
>
> > Sorry if I'm being incoherent; I'm kind of thinking aloud here.
>
> No problem, we're way outside of what is generally considered normal,
> and I did somewhat assume people were familiar with our 'dodgy'
> construct (some on this list are more than others).
>
> I hope it's all a little clearer now.

Yeah. This is actually on two mailing lists; I'm only subscribed to
linux-toolchains, which AIUI is about sharing ideas between Linux and
the toolchains.

You've built a very specific thing out of asm-goto to fulfil the tough
requirements you outlined above - as well as the nops, there's a thing
in another section to contend with.

How to merge these asm-goto constructs?

Doing so feels very special-case to the kernel and not something that
other GCC users would find useful.

I can imagine a GCC plugin that implemented a custom optimization pass
for that - basically something that spots the asm-gotos in the gimple
IR and optimizes away duplicates by replacing them with jumps, but
having read about Linus's feelings about GCC plugins recently:
  https://lwn.net/Articles/851090/
I suspect that that isn't going to fly (and if you're going down the
route of adding an optimization pass via a plugin, there's probably a
way to do that that doesn't involve asm).

In theory, something to optimize the asm-gotos could be relatively
simple, but that said, we don't really have a GCC plugin API; all of
our internal APIs are exposed, and are liable to change from release to
release, which I know is a pain (I've managed to break one of my own
plugins with one of my own API changes at least once).

Hope this is constructive
Dave
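
For anyone following along, the construct under discussion can be
sketched roughly as below. This is an illustration under stated
assumptions, not the kernel's actual code: the names `sketch_key`,
`demo_key`, `SKETCH_SITE`, `sketch_jump_table` and `sketch_twice` are
made up for this example, the table layout is invented, and nothing
here performs the runtime patching. On x86-64 ELF targets each site
emits a 5-byte NOP plus a table entry; elsewhere it falls back to a
plain `nop`. The function contains two sites on the same key, which is
the kind of duplicate a merging pass would want to fold.

```c
/* Sketch only: needs a compiler with asm goto (GCC, or a recent Clang). */

/* Hypothetical key object; a real patcher would look sites up by its address. */
struct sketch_key { int enabled; };
struct sketch_key demo_key = { 0 };  /* non-static so the asm can name the symbol */

#if defined(__x86_64__) && defined(__ELF__)
/*
 * Emit a 5-byte NOP and record, in a hypothetical "sketch_jump_table"
 * section, the instruction address, the jump target and the key address
 * (all stored PC-relative so the entry needs no load-time relocation).
 */
#define SKETCH_SITE(label)                                    \
    __asm__ goto("1:\n\t"                                     \
                 ".byte 0x0f, 0x1f, 0x44, 0x00, 0x00\n\t"     \
                 ".pushsection sketch_jump_table, \"aw\"\n\t" \
                 ".balign 8\n\t"                              \
                 ".long 1b - ., %l[" #label "] - .\n\t"       \
                 ".quad demo_key - .\n\t"                     \
                 ".popsection"                                \
                 : : : : label)
#else
/* Assumed fallback for other targets: a bare NOP and no table entry. */
#define SKETCH_SITE(label) __asm__ goto("nop" : : : : label)
#endif

/*
 * Two sites on the same key. The compiler sees only an opaque asm with
 * two outgoing edges at each site, so it has no basis for concluding
 * that the second test must go the same way as the first.
 */
int sketch_twice(int x)
{
    int r = x;

    SKETCH_SITE(rare1);
    r += 1;              /* common path: falls through the NOP */
    goto second;
rare1:
    r += 100;            /* reachable only if the NOP were patched to a JMP */
second:
    SKETCH_SITE(rare2);  /* same key, emitted again as a separate site */
    return r + 1;
rare2:
    return r + 100;
}
```

Since nothing patches the NOPs in this sketch, both sites always fall
through to the common path. The two sites stay distinct in the compiler's
view even though they depend on the same key.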