Re: Builtin for consulting value analysis (better ffs() code gen)

2024-03-21 Thread LIU Hao via Gcc

在 2024-03-14 23:33, Andrew Cooper via Gcc 写道:

And for x86's arch_ffs(),

unsigned int arch_ffs(unsigned int x)
{
     unsigned int res;

     if ( __builtin_constant_p(x > 0) && x > 0 )
     {
         // Well defined when x is known non-zero
         asm("bsf %1, %0" : "=r"(res) : "rm"(x));


Even if you may assume that the destination operand is always destroyed, the hardware has no 
knowledge about that, so this statement has a dependency on `res`, and shouldn't be declared `=r`.


I think it's better to remove this `if`. The other branch below clearly 
eliminates the dependency.



     }
     else
     {
         // The architects say this is safe even for 0.
         res = -1;
     asm("bsf %1, %0" : "+r"(res) : "rm"(x));
     }

     return res + 1;
}


--
Best regards,
LIU Hao



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Builtin for consulting value analysis (better ffs() code gen)

2024-03-14 Thread Andrew Cooper via Gcc
On 14/03/2024 12:03 pm, Andreas Schwab wrote:
> On Mär 14 2024, Andrew Cooper via Gcc wrote:
>
>> so any known-constant value can be folded.  What I'm dealing with is the
>> remainder of the cases.
> Which cases remain?

None, thanks to the answers on this thread.

The overall structure I've got now is:

unsigned int ffs(unsigned int x)
{
    if ( __builtin_constant_p(x) )
        return __builtin_ffs(x);  // Allows constant folding

#ifndef arch_ffs
#define arch_ffs __builtin_ffs
#endif

    return arch_ffs(x);
}

And for x86's arch_ffs(),

unsigned int arch_ffs(unsigned int x)
{
    unsigned int res;

    if ( __builtin_constant_p(x > 0) && x > 0 )
    {
        // Well defined when x is known non-zero
        asm("bsf %1, %0" : "=r"(res) : "rm"(x));
    }
    else
    {
        // The architects say this is safe even for 0.
        res = -1;
    asm("bsf %1, %0" : "+r"(res) : "rm"(x));
    }

    return res + 1;
}


This gives the same code gen as before (give or take some register
shuffling), without having to rely on the programmer to remember to get
their ffs()'s separated from their __ffs()'s as it pertains to undefined
input.

The other architectures which have better-defined instructions don't
need any of these games to retain the same good code-gen that is
currently expressed with a maze of subtly-different functions.

~Andrew


Re: Builtin for consulting value analysis (better ffs() code gen)

2024-03-14 Thread Andreas Schwab via Gcc
On Mär 14 2024, Andrew Cooper via Gcc wrote:

> so any known-constant value can be folded.  What I'm dealing with is the
> remainder of the cases.

Which cases remain?

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Builtin for consulting value analysis (better ffs() code gen)

2024-03-14 Thread Andrew Cooper via Gcc
On 14/03/2024 11:02 am, Florian Weimer wrote:
> * Andrew Cooper via Gcc:
>
>> Anyway - is there a way of doing this that I've managed to overlook?
> Use __builtin_ffs?
>
> I think there's some concern for GCC that some of the alternative x86-64
> implementations do not follow the AMD and Intel behavior.

This is why I'm not asking "please improve the code gen of
__builtin_ffs()".  Its a can of worms I don't expect anyone wants to open.

But in the case that I can do better owing to a narrower scope, I'd like
to do so.

~Andrew


Re: Builtin for consulting value analysis (better ffs() code gen)

2024-03-14 Thread Andrew Cooper via Gcc
On 14/03/2024 6:04 am, Alexander Monakov wrote:
> On Thu, 14 Mar 2024, Andrew Cooper via Gcc wrote:
>
>> I suppose that what I'm looking for is something a little like
>> __builtin_constant_p() which can either be used in a straight if(), or
>> in a __builtin_choose_expr().
>>
>> Anyway - is there a way of doing this that I've managed to overlook?
> I am missing what is lacking for you with __builtin_constant_p, I would
> do it like this:
>
> unsigned ffs(unsigned x)
> {
> unsigned res;
> unsigned nonzero = x != 0;
> if (__builtin_constant_p(nonzero) && nonzero)
> asm("bsf %1, %0" : "=r"(res) : "rm"(x));
> else {
> res = -1;
> asm("bsf %1, %0" : "+r"(res) : "rm"(x));
> }
> return res;
> }

Oh - so it does.  I'd not considered that expressing it like that would
still work.

>
> or with handling known-zero-input case like this:
>
> unsigned ffs(unsigned x)
> {
> unsigned res;
> unsigned nonzero = x != 0;
> if (!__builtin_constant_p(nonzero)) {
> res = -1;
> asm("bsf %1, %0" : "+r"(res) : "rm"(x));
> } else if (nonzero) {
> asm("bsf %1, %0" : "=r"(res) : "rm"(x));
> } else {
> res = -1;
> }
> return res;
> }
>
>
> Does it work for you?

I simplified things when asking the question.  The real implementation
has a general

    if (__builtin_constant_p(x))
        return __builtin_ffs(x);

so any known-constant value can be folded.  What I'm dealing with is the
remainder of the cases.

Anyway - thankyou for your help.

~Andrew


Re: Builtin for consulting value analysis (better ffs() code gen)

2024-03-14 Thread Florian Weimer via Gcc
* Andrew Cooper via Gcc:

> Anyway - is there a way of doing this that I've managed to overlook?

Use __builtin_ffs?

I think there's some concern for GCC that some of the alternative x86-64
implementations do not follow the AMD and Intel behavior.

Thanks,
Florian



Re: Builtin for consulting value analysis (better ffs() code gen)

2024-03-13 Thread Alexander Monakov


On Thu, 14 Mar 2024, Andrew Cooper via Gcc wrote:

> I suppose that what I'm looking for is something a little like
> __builtin_constant_p() which can either be used in a straight if(), or
> in a __builtin_choose_expr().
> 
> Anyway - is there a way of doing this that I've managed to overlook?

I am missing what is lacking for you with __builtin_constant_p, I would
do it like this:

unsigned ffs(unsigned x)
{
unsigned res;
unsigned nonzero = x != 0;
if (__builtin_constant_p(nonzero) && nonzero)
asm("bsf %1, %0" : "=r"(res) : "rm"(x));
else {
res = -1;
asm("bsf %1, %0" : "+r"(res) : "rm"(x));
}
return res;
}

or with handling known-zero-input case like this:

unsigned ffs(unsigned x)
{
unsigned res;
unsigned nonzero = x != 0;
if (!__builtin_constant_p(nonzero)) {
res = -1;
asm("bsf %1, %0" : "+r"(res) : "rm"(x));
} else if (nonzero) {
asm("bsf %1, %0" : "=r"(res) : "rm"(x));
} else {
res = -1;
}
return res;
}


Does it work for you?

HTH
Alexander