Re: Builtin for consulting value analysis (better ffs() code gen)
在 2024-03-14 23:33, Andrew Cooper via Gcc 写道: And for x86's arch_ffs(), unsigned int arch_ffs(unsigned int x) { unsigned int res; if ( __builtin_constant_p(x > 0) && x > 0 ) { // Well defined when x is known non-zero asm("bsf %1, %0" : "=r"(res) : "rm"(x)); Even if you may assume that the destination operand is always destroyed, the hardware has no knowledge about that, so this statement has a dependency on `res`, and shouldn't be declared `=r`. I think it's better to remove this `if`. The other branch below clearly eliminates the dependency. } else { // The architects say this is safe even for 0. res = -1; asm("bsf %1, %0" : "+r"(res) : "rm"(x)); } return res + 1; } -- Best regards, LIU Hao OpenPGP_signature.asc Description: OpenPGP digital signature
Re: Builtin for consulting value analysis (better ffs() code gen)
On 14/03/2024 12:03 pm, Andreas Schwab wrote: > On Mär 14 2024, Andrew Cooper via Gcc wrote: > >> so any known-constant value can be folded. What I'm dealing with is the >> remainder of the cases. > Which cases remain? None, thanks to the answers on this thread. The overall structure I've got now is: unsigned int ffs(unsigned int x) { if ( __builtin_constant_p(x) ) return __builtin_ffs(x); // Allows constant folding #ifndef arch_ffs #define arch_ffs __builtin_ffs #endif return arch_ffs(x); } And for x86's arch_ffs(), unsigned int arch_ffs(unsigned int x) { unsigned int res; if ( __builtin_constant_p(x > 0) && x > 0 ) { // Well defined when x is known non-zero asm("bsf %1, %0" : "=r"(res) : "rm"(x)); } else { // The architects say this is safe even for 0. res = -1; asm("bsf %1, %0" : "+r"(res) : "rm"(x)); } return res + 1; } This gives the same code gen as before (give or take some register shuffling), without having to rely on the programmer to remember to get their ffs()'s separated from their __ffs()'s as it pertains to undefined input. The other architectures which have better-defined instructions don't need any of these games to retain the same good code-gen that is currently expressed with a maze of subtly-different functions. ~Andrew
Re: Builtin for consulting value analysis (better ffs() code gen)
On Mär 14 2024, Andrew Cooper via Gcc wrote: > so any known-constant value can be folded. What I'm dealing with is the > remainder of the cases. Which cases remain? -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
Re: Builtin for consulting value analysis (better ffs() code gen)
On 14/03/2024 11:02 am, Florian Weimer wrote: > * Andrew Cooper via Gcc: > >> Anyway - is there a way of doing this that I've managed to overlook? > Use __builtin_ffs? > > I think there's some concern for GCC that some of the alternative x86-64 > implementations do not follow the AMD and Intel behavior. This is why I'm not asking "please improve the code gen of __builtin_ffs()". Its a can of worms I don't expect anyone wants to open. But in the case that I can do better owing to a narrower scope, I'd like to do so. ~Andrew
Re: Builtin for consulting value analysis (better ffs() code gen)
On 14/03/2024 6:04 am, Alexander Monakov wrote: > On Thu, 14 Mar 2024, Andrew Cooper via Gcc wrote: > >> I suppose that what I'm looking for is something a little like >> __builtin_constant_p() which can either be used in a straight if(), or >> in a __builtin_choose_expr(). >> >> Anyway - is there a way of doing this that I've managed to overlook? > I am missing what is lacking for you with __builtin_constant_p, I would > do it like this: > > unsigned ffs(unsigned x) > { > unsigned res; > unsigned nonzero = x != 0; > if (__builtin_constant_p(nonzero) && nonzero) > asm("bsf %1, %0" : "=r"(res) : "rm"(x)); > else { > res = -1; > asm("bsf %1, %0" : "+r"(res) : "rm"(x)); > } > return res; > } Oh - so it does. I'd not considered that expressing it like that would still work. > > or with handling known-zero-input case like this: > > unsigned ffs(unsigned x) > { > unsigned res; > unsigned nonzero = x != 0; > if (!__builtin_constant_p(nonzero)) { > res = -1; > asm("bsf %1, %0" : "+r"(res) : "rm"(x)); > } else if (nonzero) { > asm("bsf %1, %0" : "=r"(res) : "rm"(x)); > } else { > res = -1; > } > return res; > } > > > Does it work for you? I simplified things when asking the question. The real implementation has a general if (__builtin_constant_p(x)) return __builtin_ffs(x); so any known-constant value can be folded. What I'm dealing with is the remainder of the cases. Anyway - thankyou for your help. ~Andrew
Re: Builtin for consulting value analysis (better ffs() code gen)
* Andrew Cooper via Gcc: > Anyway - is there a way of doing this that I've managed to overlook? Use __builtin_ffs? I think there's some concern for GCC that some of the alternative x86-64 implementations do not follow the AMD and Intel behavior. Thanks, Florian
Re: Builtin for consulting value analysis (better ffs() code gen)
On Thu, 14 Mar 2024, Andrew Cooper via Gcc wrote: > I suppose that what I'm looking for is something a little like > __builtin_constant_p() which can either be used in a straight if(), or > in a __builtin_choose_expr(). > > Anyway - is there a way of doing this that I've managed to overlook? I am missing what is lacking for you with __builtin_constant_p, I would do it like this: unsigned ffs(unsigned x) { unsigned res; unsigned nonzero = x != 0; if (__builtin_constant_p(nonzero) && nonzero) asm("bsf %1, %0" : "=r"(res) : "rm"(x)); else { res = -1; asm("bsf %1, %0" : "+r"(res) : "rm"(x)); } return res; } or with handling known-zero-input case like this: unsigned ffs(unsigned x) { unsigned res; unsigned nonzero = x != 0; if (!__builtin_constant_p(nonzero)) { res = -1; asm("bsf %1, %0" : "+r"(res) : "rm"(x)); } else if (nonzero) { asm("bsf %1, %0" : "=r"(res) : "rm"(x)); } else { res = -1; } return res; } Does it work for you? HTH Alexander