"Paul Koning" <paulkon...@comcast.net> wrote:

>> On Jan 9, 2023, at 10:20 AM, Stefan Kanthak <stefan.kant...@nexgo.de> wrote:
>> 
>> "Paul Koning" <paulkon...@comcast.net> wrote:
>> 
>>>> On Jan 9, 2023, at 7:20 AM, Stefan Kanthak <stefan.kant...@nexgo.de> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> GCC (and other C compilers too) support the widening multiplication
>>>> of i386/AMD64 processors, but DON'T support their narrowing division:
>>> 
>>> I wonder if this changed in the recent past.
>>> I have a pattern for this type of thing in pdp11.md:
>> [...]
>>> and I'm pretty sure this worked at some point in the past.  
>> 
>> Unfortunately the C standard requires that the operand of lesser
>> conversion rank, here the divisor, be converted to the "common real
>> type", i.e. the type of the operand of higher conversion rank, here the
>> dividend. Unless the information about this promotion/conversion is handed
>> over to the code generator, it can't apply such patterns -- as demonstrated
>> by the demo code.
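
For readers without the earlier demo at hand, here is a minimal sketch of the
case described in the quoted paragraph. The helper name div_64_by_32 and the
<stdint.h> types are hypothetical, chosen for illustration only. At the
source level the divisor is converted to 64 bits, so the fact that a
64-by-32-bit divide would suffice is visible only to the programmer:

```c
#include <stdint.h>

// Hypothetical helper (illustration only): 64-bit dividend, 32-bit divisor.
// The caller must guarantee divisor != 0 and that the quotient fits in 32 bits.
uint32_t div_64_by_32(uint64_t dividend, uint32_t divisor, uint32_t *remainder)
{
    // The usual arithmetic conversions turn divisor into a uint64_t here,
    // so at the source level both operations are 64-by-64-bit.
    *remainder = (uint32_t)(dividend % divisor);
    return (uint32_t)(dividend / divisor);
}
```

Whether a given compiler and target map this onto a single narrowing divide
instruction is exactly the question under discussion.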

> Yes, I was thinking the same.  But I spent a while on that pattern -- I
> wanted to support div/mod as a single operation because the machine has
> that primitive.  And I'm pretty sure I saw it work before I committed
> that change.  That's why I'm wondering if something changed.

I can't say how GCC behaved in the past, but today it can't (or doesn't)
use such patterns, at least not on i386/AMD64 processors.
Here is another example where the necessary information is evidently
NOT propagated from the front end to the back end:

--- clmul.c ---
// widening carry-less multiplication

unsigned long long clmul(unsigned long p, unsigned long q)
{
    unsigned long long r = 0;
    unsigned long      s = 1UL << 31;

    do {
        r <<= 1;
        if (q & s)
#ifdef _MSC_VER
            (unsigned long) r ^= p;
#else
            r ^= p; // no need to promote/convert p here!
#endif
    } while (s >>= 1);

    return r;
}
--- EOF ---
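
The demo relies on a 32-bit unsigned long, as on i386 or Windows. As a sketch
for checking the routine on other targets too, the same loop can be written
with fixed-width types; the name clmul32 and the use of <stdint.h> are my
additions for illustration, not part of the file above:

```c
#include <stdint.h>

// Hypothetical fixed-width variant of clmul() (illustration only):
// 32 x 32 -> 64-bit carry-less multiplication, most significant bit first.
uint64_t clmul32(uint32_t p, uint32_t q)
{
    uint64_t r = 0;
    uint32_t s = UINT32_C(1) << 31;     // mask for the current bit of q

    do {
        r <<= 1;                        // make room for the next partial product
        if (q & s)
            r ^= p;                     // "add" p without carries
    } while (s >>= 1);                  // stop once the mask is shifted out

    return r;
}
```

For example, clmul32(3, 3) yields 5, since (x + 1)^2 = x^2 + 1 over GF(2),
and clmul32(1, q) yields q.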

# https://gcc.godbolt.org/z/E99v7fEP3
clmul(unsigned long, unsigned long):
        push    ebp
        mov     ecx, -2147483648
        xor     eax, eax
        xor     edx, edx
        push    edi        # OOPS: superfluous
        xor     edi, edi   # OOPS: superfluous
        push    esi
        push    ebx        # OUCH: WTF?
        mov     ebp, DWORD PTR [esp+24]
        mov     ebx, 32    # OUCH: WTF?
        mov     esi, DWORD PTR [esp+20]
.L3:
        shld    edx, eax, 1
        add     eax, eax
        test    ebp, ecx
        je      .L2
        xor     eax, esi
        xor     edx, edi   # OOPS: superfluous
.L2:
        shr     ecx, 1
        sub     ebx, 1     # OUCH: WTF?
        jne     .L3
        pop     ebx        # OUCH: WTF?
        pop     esi
        pop     edi        # OOPS: superfluous
        pop     ebp
        ret

8 of the 25 instructions are superfluous!
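
To spell out why the marked instructions are not needed, here is a sketch of
the same loop written on explicit 32-bit halves. The name clmul_halves and the
fixed-width types are hypothetical, for illustration only. Since p is only
32 bits wide, the high half is never XORed with anything but zero, and the
mask s alone terminates the loop, so no separate counter is required:

```c
#include <stdint.h>

// Hypothetical rewrite of clmul() on explicit 32-bit halves (illustration only).
uint64_t clmul_halves(uint32_t p, uint32_t q)
{
    uint32_t lo = 0, hi = 0;            // low and high half of the product
    uint32_t s  = UINT32_C(1) << 31;    // mask, walks from bit 31 down to bit 0

    do {
        hi = (hi << 1) | (lo >> 31);    // 64-bit left shift by 1 across halves
        lo <<= 1;
        if (q & s)
            lo ^= p;                    // p is 32 bits wide: hi ^= 0 is a no-op
    } while (s >>= 1);                  // the mask itself ends the loop

    return ((uint64_t)hi << 32) | lo;   // recombine the halves
}
```

This is only meant to show the data flow; it is not a claim about what any
particular compiler generates for it.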

NOT AMUSED
Stefan
