https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #57 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #56)
> Created attachment 55244 [details]
> gcc14-bitint-wip-inc.patch
> 
> Incremental patch on top of the above patch.
> 
> I've tried to make some progress and implement the simplest large _BitInt
> cases, &/|/^/~, but ran into a problem there: both BIT_FIELD_REF and
> BIT_INSERT_EXPR disallow operating on non-mode precisions, while for
> _BitInt I think it would be really useful to use them on the large/huge
> _BitInts (which I will most likely force into memory during expansion).
> Sure, for huge _BitInts, what is handled in the loop will use either
> ARRAY_REFs on VIEW_CONVERT_EXPRs for the operands or TARGET_MEM_REFs on
> VAR_DECLs for the results, but even for those there is in some cases a
> partial most significant limb that needs to be handled separately.
> 
> So, do you think it is ok to make an exception for
> BIT_FIELD_REF/BIT_INSERT_EXPR and allow them on non-mode precision
> BITINT_TYPEs (the incremental patch enables that), and handle it during
> expansion?

The incremental patch doesn't implement the expansion part, right?  The
problem is that BIT_* are specified to work on the in-memory representation,
and a non-mode precision entity has no such representation specified - you'd
have to extend / shift it to some mode to be able to store it.

So to extract from or insert into some bit-precision entity you have to
perform this conversion somehow.  Why do you have this anyway?  Is it
really the case that the ABIs(?) allow the padding of the partial limb
(up to the limb size) to be not present (aka in unmapped memory)?  Why
can't you work on the full limb, just "pollute" the padding at will, and
then zero-extend on loads?
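
For illustration, a minimal C sketch of that "pollute the padding,
zero-extend on loads" scheme, assuming 64-bit limbs and a hypothetical
_BitInt(130) laid out as three limbs (the names below are made up for the
example, not actual GCC interfaces):

#include <stdint.h>

#define NLIMBS 3
/* 130 % 64: significant bits in the partial most significant limb.  */
#define TOP_BITS 2

/* Bitwise AND over the whole representation: operate on full limbs and
   let the padding bits of the result become garbage.  */
static void
bitint_and (uint64_t r[NLIMBS], const uint64_t a[NLIMBS],
            const uint64_t b[NLIMBS])
{
  for (int i = 0; i < NLIMBS; i++)
    r[i] = a[i] & b[i];
}

/* Loads of the partial limb zero-extend, i.e. mask away the padding,
   so consumers never observe the polluted bits.  */
static uint64_t
load_top_limb (const uint64_t a[NLIMBS])
{
  return a[NLIMBS - 1] & ((UINT64_C (1) << TOP_BITS) - 1);
}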

> Another thing: I've started to think about PLUS_EXPR/MINUS_EXPR.  We have
> the __builtin_ia32_addcarryx_u64/__builtin_ia32_sbb_u64 builtins on
> x86-64, but from what I can see we don't really pattern recognize even a
> simple add + adc.
> 
> Given:
> void
> foo (unsigned long *p, unsigned long *q, unsigned long *r)
> {
>   unsigned long p0 = p[0], q0 = q[0];
>   unsigned long p1 = p[1], q1 = q[1];
>   unsigned long r0 = p0 + q0;
>   unsigned long r1 = p1 + q1 + (r0 < p0);
>   r[0] = r0;
>   r[1] = r1;
> }
> 
> void
> bar (unsigned long *p, unsigned long *q, unsigned long *r)
> {
>   unsigned long p0 = p[0], q0 = q[0];
>   unsigned long p1 = p[1], q1 = q[1];
>   unsigned long p2 = p[2], q2 = q[2];
>   unsigned long r0 = p0 + q0;
>   unsigned long r1 = p1 + q1 + (r0 < p0);
>   unsigned long r2 = p2 + q2 + (r1 < p1 || r1 < q1);
>   r[0] = r0;
>   r[1] = r1;
>   r[2] = r2;
> }
> 
> llvm seems to pattern recognize foo, but doesn't pattern recognize bar as
> add; adc; adc (is that actually correct C for that, though?).
> 
> So, shouldn't we implement clang's
> https://clang.llvm.org/docs/LanguageExtensions.html#multiprecision-arithmetic-builtins
> builtins (at least the __builtin_{add,sub}c{,l,ll} ones), lower them into
> ifns early (similarly to .{ADD,SUB}_OVERFLOW returning a complex integer
> with 2 results), and add optabs so that targets can implement those
> efficiently?

Improving code-gen for add-with-carry would indeed be nice.  I'm not sure
we need the user-visible builtins though; matching the open-coded variants
to appropriate IFNs would work.  But can't the _OVERFLOW variants be used
here, at least for unsigned?
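
For the unsigned case the open-coded chain can indeed be written with the
existing __builtin_add_overflow; a sketch of a correct 3-limb variant of
bar (just an illustration of the pattern such IFN matching could target,
not code from the patch):

void
baz (unsigned long *p, unsigned long *q, unsigned long *r)
{
  unsigned long r0, r1, r2;
  /* Carry out of the lowest limb.  */
  unsigned long c1 = __builtin_add_overflow (p[0], q[0], &r0);
  /* Middle limb: add both operands and the carry-in.  At most one of
     the two additions can wrap, so the carry-outs can just be added.  */
  unsigned long c2 = __builtin_add_overflow (p[1], q[1], &r1);
  c2 += __builtin_add_overflow (r1, c1, &r1);
  /* Most significant limb: the final carry-out is discarded.  */
  __builtin_add_overflow (p[2], q[2], &r2);
  r2 += c2;
  r[0] = r0;
  r[1] = r1;
  r[2] = r2;
}

As an aside, this also sidesteps the correctness question about bar above:
the r1 < p1 || r1 < q1 test misses the carry when p1 and q1 are both
ULONG_MAX and there is a carry-in from the low limb.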
