[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-11-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #116 from CVS Commits  ---
The master branch has been updated by Gaius Mulley :

https://gcc.gnu.org/g:9693459e030977d6e906ea7eb587ed09ee4fddbd

commit r14-5054-g9693459e030977d6e906ea7eb587ed09ee4fddbd
Author: Gaius Mulley 
Date:   Wed Nov 1 09:05:10 2023 +

PR modula2/102989: reimplement overflow detection in ztype through
WIDE_INT_MAX_PRECISION

The ZTYPE in ISO Modula-2 is used to denote intermediate ordinal type const
expressions, and these are always converted into the appropriate language or
user ordinal type prior to code generation.
The increase in the number of bits supported by _BitInt caused the modula2
largeconst.mod expected-failure regression tests to pass.  The constant in
largeconst.mod has been increased so that the test fails again; however, the
char-at-a-time overflow check is now too slow to detect the failure.  The
overflow detection for the ZTYPE has therefore been rewritten to check
against exceeding WIDE_INT_MAX_PRECISION (many orders of magnitude faster).
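
As a rough illustration of checking a literal against a fixed maximum precision, here is a minimal sketch in C.  It is not the code from m2expr.cc: the name sketch_overflows_ztype and the 4-limb (128-bit) bound SKETCH_LIMBS are hypothetical stand-ins for the wide_int machinery and for WIDE_INT_MAX_PRECISION.  The literal is accumulated into a fixed limb array and the scan stops as soon as a carry leaves the top limb.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for WIDE_INT_MAX_PRECISION: 4 limbs of 32 bits.  */
#define SKETCH_LIMBS 4

/* Return 1 if the decimal literal STR needs more than SKETCH_LIMBS * 32
   bits, 0 if it fits, and -1 if STR contains a non-digit.  */
int
sketch_overflows_ztype (const char *str)
{
  uint32_t limb[SKETCH_LIMBS] = { 0 };

  for (; *str != '\0'; str++)
    {
      if (*str < '0' || *str > '9')
        return -1;
      /* value = value * 10 + digit, propagating the carry limb by limb.  */
      uint64_t carry = (uint64_t) (*str - '0');
      for (size_t i = 0; i < SKETCH_LIMBS; i++)
        {
          uint64_t t = (uint64_t) limb[i] * 10 + carry;
          limb[i] = (uint32_t) t;
          carry = t >> 32;
        }
      /* A carry out of the top limb means the precision was exceeded,
         so the remaining digits need not be examined.  */
      if (carry != 0)
        return 1;
    }
  return 0;
}
```

With the 128-bit bound assumed here, 2^128 - 1 is accepted and 2^128 is rejected.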

gcc/m2/ChangeLog:

PR modula2/102989
* gm2-compiler/SymbolTable.mod (OverflowZType): Import from m2expr.
(ConstantStringExceedsZType): Remove import.
(GetConstLitType): Replace ConstantStringExceedsZType with
OverflowZType.
* gm2-gcc/m2decl.cc (m2decl_ConstantStringExceedsZType): Remove.
(m2decl_BuildConstLiteralNumber): Re-write.
* gm2-gcc/m2decl.def (ConstantStringExceedsZType): Remove.
* gm2-gcc/m2decl.h (m2decl_ConstantStringExceedsZType): Remove.
* gm2-gcc/m2expr.cc (m2expr_StrToWideInt): Rewrite to check
overflow.
(m2expr_OverflowZType): New function.
(ToWideInt): New function.
* gm2-gcc/m2expr.def (OverflowZType): New procedure function
declaration.
* gm2-gcc/m2expr.h (m2expr_OverflowZType): New prototype.

gcc/testsuite/ChangeLog:

PR modula2/102989
* gm2/pim/fail/largeconst.mod: Updated foo to an outrageous value.
* gm2/pim/fail/largeconst2.mod: Duplicate test removed.

Signed-off-by: Gaius Mulley 

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-11-01 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #115 from Gaius Mulley  ---
Created attachment 56482
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56482&action=edit
modula2: proposed fix to fix largeconst.mod

Here is a patch set for the modula2 front end which re-implements the ZTYPE
overflow detection.

Bootstrapped on x86_64; all regression tests pass.

The ZTYPE in ISO Modula-2 is used to denote intermediate ordinal type const
expressions, and these are always converted into the appropriate language or
user ordinal type prior to code generation.
The increase in the number of bits supported by _BitInt caused the modula2
largeconst.mod expected-failure regression tests to pass.  The constant in
largeconst.mod has been increased so that the test fails again; however, the
char-at-a-time overflow check is now too slow to detect the failure.  The
overflow detection for the ZTYPE has therefore been rewritten to check
against exceeding WIDE_INT_MAX_PRECISION (many orders of magnitude faster).

gcc/m2/ChangeLog:

* gm2-compiler/SymbolTable.mod (OverflowZType): Import from m2expr.
(ConstantStringExceedsZType): Remove import.
(GetConstLitType): Replace ConstantStringExceedsZType with
OverflowZType.
* gm2-gcc/m2decl.cc (m2decl_ConstantStringExceedsZType): Remove.
(m2decl_BuildConstLiteralNumber): Re-write.
* gm2-gcc/m2decl.def (ConstantStringExceedsZType): Remove.
* gm2-gcc/m2decl.h (m2decl_ConstantStringExceedsZType): Remove.
* gm2-gcc/m2expr.cc (m2expr_StrToWideInt): Rewrite to check overflow.
(m2expr_OverflowZType): New function.
(ToWideInt): New function.
* gm2-gcc/m2expr.def (OverflowZType): New procedure function
declaration.
* gm2-gcc/m2expr.h (m2expr_OverflowZType): New prototype.

gcc/testsuite/ChangeLog:

* gm2/pim/fail/largeconst.mod: Updated foo to an outrageous value.
* gm2/pim/fail/largeconst2.mod: Duplicate test removed.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-10-14 Thread gaius at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Gaius Mulley  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gaius at gcc dot gnu.org

--- Comment #114 from Gaius Mulley  ---
This comment is to acknowledge the bug in cc1gm2 regarding the false positives:

 gm2/pim/fail/largeconst.mod
 gm2/pim/fail/largeconst2.mod

when encountering large ZTYPE constants.

Will fix - and thanks for the data type hint.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-10-14 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #113 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:cb0119242317c2a6f3127b4acff6aadbfd1dfbc4

commit r14-4635-gcb0119242317c2a6f3127b4acff6aadbfd1dfbc4
Author: Jakub Jelinek 
Date:   Sat Oct 14 09:35:44 2023 +0200

middle-end: Allow _BitInt(65535) [PR102989]

The following patch lifts further restrictions which limited _BitInt to at
most 16319 bits up to 65535.
The problem was mainly in INTEGER_CST representation, which had 3
unsigned char members to describe lengths in number of 64-bit limbs, which
it wanted to fit into 32 bits.  This patch removes the third one which was
just a cache to save a few compile time cycles for wi::to_offset and
enlarges the other two members to unsigned short.
Furthermore, the same problem has been in some uses of trailing_wide_int*
(in value-range-storage*) and value-range-storage* itself, while other
uses of trailing_wide_int* have been fine (e.g. CONST_POLY_INT, where no
constants will be larger than 3/5/9/11 limbs depending on target, so 255
limit is plenty).  The patch turns all those length representations to be
unsigned short for consistency, so value-range-storage* can handle even
16320-65535 bits BITINT_TYPE ranges.  The cc1plus growth is about 16K,
so not really significant for 38M .text section.

Note, the reason for the new limit is
  unsigned int precision : 16;
TYPE_PRECISION limit, if we wanted to overcome that, TYPE_PRECISION would
need to use some other member for BITINT_TYPE from all the others and
we could reach that way 4194239 limit (65535 * 64 - 1, again implied by
INTEGER_CST and value-range-storage*).  Dunno if that is
worth it or if it is something we want to do for GCC 14 though.
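
The limits quoted in these commit messages follow from simple arithmetic, which can be sketched in C.  The helper names below are hypothetical and only restate the formula given above (limb count times 64, minus 1), capped by the width of the TYPE_PRECISION bit-field.

```c
/* Bits per limb, matching the "64-bit limbs" in the commit message.  */
#define SKETCH_LIMB_BITS 64UL

/* Largest precision allowed when the limb count is stored in a field
   whose maximum value is MAX_LIMBS (the "65535 * 64 - 1" formula).  */
unsigned long
sketch_limb_limit (unsigned long max_limbs)
{
  return max_limbs * SKETCH_LIMB_BITS - 1;
}

/* Effective limit: the smaller of the limb-count limit and the largest
   value a PRECISION_FIELD_BITS wide precision bit-field can hold.
   PRECISION_FIELD_BITS is assumed to be smaller than 32.  */
unsigned long
sketch_bitint_limit (unsigned long max_limbs, unsigned precision_field_bits)
{
  unsigned long by_limbs = sketch_limb_limit (max_limbs);
  unsigned long by_field = (1UL << precision_field_bits) - 1;
  return by_limbs < by_field ? by_limbs : by_field;
}
```

With unsigned char lengths (255 limbs) this gives 16319; with unsigned short lengths (65535 limbs) the limb limit is 4194239, so the 16-bit precision field becomes the binding constraint at 65535.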

2023-10-14  Jakub Jelinek  

PR c/102989
gcc/
* tree-core.h (struct tree_base): Remove int_length.offset
member, change type of int_length.unextended and int_length.extended
from unsigned char to unsigned short.
* tree.h (TREE_INT_CST_OFFSET_NUNITS): Remove.
(wi::extended_tree ::get_len): Don't use TREE_INT_CST_OFFSET_NUNITS,
instead compute it at runtime from TREE_INT_CST_EXT_NUNITS and
TREE_INT_CST_NUNITS.
* tree.cc (wide_int_to_tree_1): Don't assert
TREE_INT_CST_OFFSET_NUNITS value.
(make_int_cst): Don't initialize TREE_INT_CST_OFFSET_NUNITS.
* wide-int.h (WIDE_INT_MAX_ELTS): Change from 255 to 1024.
(WIDEST_INT_MAX_ELTS): Change from 510 to 2048, adjust comment.
(trailing_wide_int_storage): Change m_len type from unsigned char *
to unsigned short *.
(trailing_wide_int_storage::trailing_wide_int_storage): Change second
argument from unsigned char * to unsigned short *.
(trailing_wide_ints): Change m_max_len type from unsigned char to
unsigned short.  Change m_len element type from
struct{unsigned char len;} to unsigned short.
(trailing_wide_ints ::operator []): Remove .len from m_len
accesses.
* value-range-storage.h (irange_storage::lengths_address): Change
return type from const unsigned char * to const unsigned short *.
(irange_storage::write_lengths_address): Change return type from
unsigned char * to unsigned short *.
* value-range-storage.cc (irange_storage::write_lengths_address):
Likewise.
(irange_storage::lengths_address): Change return type from
const unsigned char * to const unsigned short *.
(write_wide_int): Change len argument type from unsigned char *&
to unsigned short *&.
(irange_storage::set_irange): Change len variable type from
unsigned char * to unsigned short *.
(read_wide_int): Change len argument type from unsigned char to
unsigned short.  Use trailing_wide_int_storage 
instead of trailing_wide_int_storage and
trailing_wide_int  instead of trailing_wide_int.
(irange_storage::get_irange): Change len variable type from
unsigned char * to unsigned short *.
(irange_storage::size): Multiply n by sizeof (unsigned short)
in len_size variable initialization.
(irange_storage::dump): Change len variable type from
unsigned char * to unsigned short *.
gcc/cp/
* module.cc (trees_out::start, trees_in::start): Remove
TREE_INT_CST_OFFSET_NUNITS handling.
gcc/testsuite/
* gcc.dg/bitint-38.c: Change into dg-do run test, in addition
to checking the addition, division and right shift results at
compile time check it also at runtime.
* gcc.dg/bitint-39.c: New 

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-10-12 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #112 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:0d00385eaf72ccacff17935b0d214a26773e095f

commit r14-4592-g0d00385eaf72ccacff17935b0d214a26773e095f
Author: Jakub Jelinek 
Date:   Thu Oct 12 16:01:12 2023 +0200

wide-int: Allow up to 16320 bits wide_int and change widest_int precision
to 32640 bits [PR102989]

As mentioned in the _BitInt support thread, _BitInt(N) is currently limited
by the wide_int/widest_int maximum precision limitation, which, depending
on target, is 191, 319, 575 or 703 bits (one less than
WIDE_INT_MAX_PRECISION).  That is a fairly low limit for _BitInt, especially
on the targets with the 191 bit limitation.

The following patch bumps that limit to 16319 bits on all arches (which
support _BitInt at all), which is the limit imposed by INTEGER_CST
representation (unsigned char members holding number of HOST_WIDE_INT limbs).

In order to achieve that, wide_int is changed from a trivially copyable type
which contained just an inline array of WIDE_INT_MAX_ELTS (3, 5, 9 or
11 limbs depending on target) limbs into a non-trivially copy constructible,
copy assignable and destructible type which for the usual small cases (up
to WIDE_INT_MAX_INL_ELTS which is the former WIDE_INT_MAX_ELTS) still uses
an inline array of limbs, but for larger precisions uses heap allocated
limb array.  This makes wide_int unusable in GC structures, so for dwarf2out
which was the only place which needed it there is a new rwide_int type
(restricted wide_int) which supports only up to RWIDE_INT_MAX_ELTS limbs
inline and is trivially copyable (dwarf2out should never deal with large
_BitInt constants, those should have been lowered earlier).
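
The inline-versus-heap storage scheme described above can be sketched in plain C.  This is only an illustration of the idea, not the wide-int.h implementation (which is C++): the struct, the function names and the inline capacity SKETCH_INL_ELTS are all hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inline capacity, playing the role of WIDE_INT_MAX_INL_ELTS.  */
#define SKETCH_INL_ELTS 3

struct sketch_wide_int
{
  unsigned precision;                /* Precision in bits.  */
  union
  {
    uint64_t inl[SKETCH_INL_ELTS];   /* Used when the precision fits inline.  */
    uint64_t *heap;                  /* Used for larger precisions.  */
  } u;
};

/* Number of 64-bit limbs needed for PRECISION bits.  */
unsigned
sketch_limbs_for (unsigned precision)
{
  return (precision + 63) / 64;
}

/* Nonzero if W keeps its limbs on the heap.  */
int
sketch_uses_heap (const struct sketch_wide_int *w)
{
  return sketch_limbs_for (w->precision) > SKETCH_INL_ELTS;
}

/* Pointer to the limb array of W, wherever it lives.  */
uint64_t *
sketch_limbs (struct sketch_wide_int *w)
{
  return sketch_uses_heap (w) ? w->u.heap : w->u.inl;
}

/* Initialize W to zero with the given precision; return -1 on allocation
   failure, 0 otherwise.  */
int
sketch_init (struct sketch_wide_int *w, unsigned precision)
{
  w->precision = precision;
  if (sketch_uses_heap (w))
    {
      w->u.heap = calloc (sketch_limbs_for (precision), sizeof (uint64_t));
      if (w->u.heap == NULL)
        return -1;
    }
  else
    memset (w->u.inl, 0, sizeof w->u.inl);
  return 0;
}

/* Release heap storage, if any.  */
void
sketch_free (struct sketch_wide_int *w)
{
  if (sketch_uses_heap (w))
    free (w->u.heap);
}
```

Once the heap branch exists, a bytewise copy of the struct would duplicate the pointer rather than the limbs, which is one way to see why such a type is no longer trivially copyable.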

Similarly, widest_int has been changed from a trivially copyable type which
contained also an inline array of WIDE_INT_MAX_ELTS limbs (but unlike
wide_int didn't contain precision and assumed that to be
WIDE_INT_MAX_PRECISION) into a non-trivially copy constructible, copy
assignable and destructible type which has always WIDEST_INT_MAX_PRECISION
precision (32640 bits currently, twice as much as INTEGER_CST limitation
allows) and unlike wide_int decides depending on get_len () value whether
it uses an inline array (again, up to WIDE_INT_MAX_INL_ELTS) or heap
allocated one.  In wide-int.h this means we need to estimate an upper
bound on how many limbs wide-int.cc (usually, sometimes wide-int.h) will
need to write, and heap allocate if needed based on that estimate.  If we
guessed over WIDE_INT_MAX_INL_ELTS and allocated dynamically while we
actually need less than that, set_len, which is done at the end, has to
copy and deallocate.  The inexact guesses are needed because the exact
computation of the length in wide-int.cc is sometimes quite complex and
especially canonicalize at the end can decrease it.  Because of this,
widest_int is again not usable in GC structures, so cfgloop.h has been
changed to use fixed_wide_int_storage  and punt if
we'd have larger _BitInt based iterators; programs having more than 128-bit
iterators will hopefully be rare and I think it is fine to treat loops with
more than 2^127 iterations as effectively possibly infinite.  omp-general.cc
is changed to use fixed_wide_int_storage <1024>, as it better should support
scores with the same precision on all arches.

Code which used WIDE_INT_PRINT_BUFFER_SIZE sized buffers for printing
wide_int/widest_int into buffer had to be changed to use XALLOCAVEC for
larger lengths.

On x86_64, the patch in --enable-checking=yes,rtl,extra configured
bootstrapped cc1plus enlarges the .text section by 1.01% - from
0x25725a5 to 0x25e and similarly at least when compiling insn-recog.cc
with the usual bootstrap option slows compilation down by 1.01%,
user 4m22.046s and 4m22.384s on vanilla trunk vs.
4m25.947s and 4m25.581s on patched trunk.  I'm afraid some code size growth
and compile time slowdown is unavoidable in this case, we use wide_int and
widest_int everywhere, and while the rare cases are marked with UNLIKELY
macros, it still means extra checks for it.

The patch also regresses
+FAIL: gm2/pim/fail/largeconst.mod,  -O
+FAIL: gm2/pim/fail/largeconst.mod,  -O -g
+FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst.mod,  -Os
+FAIL: gm2/pim/fail/largeconst.mod,  -g
+FAIL: gm2/pim/fail/largeconst2.mod,  -O
+FAIL: gm2/pim/fail/largeconst2.mod,  -O -g
+FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst2.mod,  -Os

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-07 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #111 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:18c90eaa25363d34b5bef444fbbad04f5da2522d

commit r14-3774-g18c90eaa25363d34b5bef444fbbad04f5da2522d
Author: Jakub Jelinek 
Date:   Thu Sep 7 11:17:04 2023 +0200

middle-end: Avoid calling targetm.c.bitint_type_info inside of gcc_assert
[PR102989]

On Thu, Sep 07, 2023 at 10:36:02AM +0200, Thomas Schwinge wrote:
> Minor comment/question: are we doing away with the property that
> 'assert'-like "calls" must not have side effects?  Per 'gcc/system.h',
> this is "OK" for 'gcc_assert' for '#if ENABLE_ASSERT_CHECKING' or
> '#elif (GCC_VERSION >= 4005)' -- that is, GCC 4.5, which is always-true,
> thus the "offending" '#else' is never active.  However, it's different
> for standard 'assert' and 'gcc_checking_assert', so I'm not sure if
> that's a good property for 'gcc_assert' only?  For example, see also
>  "warn about asserts with side effects", or
> recent 
> "RFE: could -fanalyzer warn about assertions that have side effects?".

You're right, the
  #define gcc_assert(EXPR) ((void)(0 && (EXPR)))
fallback definition is incompatible with the way I've used it, so for
--disable-checking built by non-GCC it would not work properly.
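
The problem can be reproduced in isolation with a small C sketch.  The macro below copies the fallback definition quoted above under a different name; sketch_fill_info is a hypothetical stand-in for a hook such as targetm.c.bitint_type_info that fills an output argument and returns nonzero on success.

```c
/* Same shape as the fallback gcc_assert: EXPR is parsed and type-checked,
   but never evaluated because of the "0 &&" short circuit.  */
#define sketch_assert_disabled(EXPR) ((void) (0 && (EXPR)))

/* Hypothetical hook: fills *INFO and returns nonzero on success.  */
int
sketch_fill_info (int precision, int *info)
{
  *info = precision * 2;
  return 1;
}

/* Broken pattern: the call lives inside the assertion, so with the
   fallback macro INFO is never filled in.  */
int
sketch_broken (int precision)
{
  int info = -1;
  sketch_assert_disabled (sketch_fill_info (precision, &info));
  return info;
}

/* Fixed pattern: call first, then assert on the saved result.  */
int
sketch_fixed (int precision)
{
  int info = -1;
  int ok = sketch_fill_info (precision, &info);
  sketch_assert_disabled (ok);
  return info;
}
```

Here sketch_broken returns the untouched initial value, while sketch_fixed returns the value written by the hook regardless of how the assertion macro is defined.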

2023-09-07  Jakub Jelinek  

PR c/102989
* expr.cc (expand_expr_real_1): Don't call targetm.c.bitint_type_info
inside gcc_assert, as later code relies on it filling info variable.
* gimple-fold.cc (clear_padding_bitint_needs_padding_p,
clear_padding_type): Likewise.
* varasm.cc (output_constant): Likewise.
* fold-const.cc (native_encode_int, native_interpret_int):
Likewise.
* stor-layout.cc (finish_bitfield_representative, layout_type):
Likewise.
* gimple-lower-bitint.cc (bitint_precision_kind): Likewise.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #110 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:52e270e847d240fb68a27c88ee60189515a6

commit r14-3759-g52e270e847d240fb68a27c88ee60189515a6
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:52:24 2023 +0200

Additional _BitInt test coverage [PR102989]

On Tue, Sep 05, 2023 at 10:40:26PM +, Joseph Myers wrote:
> Additional tests I think should be added (for things I expect should
> already work):
>
> * Tests for BITINT_MAXWIDTH in .  Test that it's defined for
> C2x, but not defined for C11/C17 (the latter independent of whether the
> target has _BitInt support).  Test the value as well: _BitInt
> (BITINT_MAXWIDTH) should be OK (both signed and unsigned) but _BitInt
> (BITINT_MAXWIDTH + 1) should not be OK.  Also test that BITINT_MAXWIDTH >=
> ULLONG_MAX.
>
> * Test _BitInt (N) where N is a constexpr variable or enum constant (I
> expect these should work - the required call to convert_lvalue_to_rvalue
> for constexpr to work is present - but I don't see such tests in the
> testsuite).
>
> * Test that -funsigned-bitfields does not affect the signedness of _BitInt
> (N) bit-fields (the standard wording isn't entirely clear, but that's
> what's implemented in the patches).
>
> * Test the errors for _Sat used with _BitInt (though such a test might not
> actually run at present because no target supports both features).

The following patch does that, and also adds testsuite coverage for most of
the new changes requested during review of the C _BitInt support patch.
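
For readers unfamiliar with the feature, here is a small hedged sketch of _BitInt (N) with N given by an enum constant, one of the cases requested above.  It assumes the compiler predefines __BITINT_MAXWIDTH__ when _BitInt is available (an assumption on my part, not something stated in this thread) and falls back to a masked unsigned otherwise; the names are hypothetical and this is not one of the committed tests.

```c
/* The width comes from an enum constant, as in the requested tests.  */
enum { SKETCH_WIDTH = 5 };

/* Increment a SKETCH_WIDTH-bit unsigned value from its maximum (31) and
   return the result.  */
unsigned
sketch_wraparound (void)
{
#ifdef __BITINT_MAXWIDTH__
  unsigned _BitInt (SKETCH_WIDTH) x = 31;
  /* The sum is converted back to the 5-bit type, i.e. reduced modulo 2^5.  */
  x += 1;
  return (unsigned) x;
#else
  /* No _BitInt support assumed: emulate the 5-bit type with a mask.  */
  unsigned x = 31;
  x = (x + 1) & ((1u << SKETCH_WIDTH) - 1);
  return x;
#endif
}
```

Both branches wrap around to zero.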

2023-09-06  Jakub Jelinek  

PR c/102989
* gcc.dg/bitint-2.c (foo): Add tests for constexpr var or enumerator
arguments of _BitInt.
* gcc.dg/bitint-31.c: Remove forgotten 0 &&.
* gcc.dg/bitint-32.c: New test.
* gcc.dg/bitint-33.c: New test.
* gcc.dg/bitint-34.c: New test.
* gcc.dg/bitint-35.c: New test.
* gcc.dg/bitint-36.c: New test.
* gcc.dg/fixed-point/bitint-1.c: New test.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #109 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:dce6f6a974d4ecce8491c989c35e23c59223f762

commit r14-3758-gdce6f6a974d4ecce8491c989c35e23c59223f762
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:50:49 2023 +0200

Handle BITINT_TYPE in build_{,minus_}one_cst [PR102989]

Recent match.pd changes trigger ICE in build_minus_one_cst, apparently
I forgot to handle BITINT_TYPE in these (while I've handled it in
build_zero_cst).

2023-09-06  Jakub Jelinek  

PR c/102989
* tree.cc (build_one_cst, build_minus_one_cst): Handle BITINT_TYPE
like INTEGER_TYPE.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #107 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:c62c82dc98dcb7420498b7114bf4cd2ec1a81405

commit r14-3756-gc62c82dc98dcb7420498b7114bf4cd2ec1a81405
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:47:49 2023 +0200

Add further _BitInt <-> floating point tests [PR102989]

Here are just the testsuite additions from libgcc _BitInt patch review.

On Fri, Sep 01, 2023 at 09:48:22PM +, Joseph Myers wrote:
> 1. Test overflowing conversions to integers (including from inf or NaN)
> raise FE_INVALID.  (Note: it's not specified in the standard whether
> inexact conversions to integers raise FE_INEXACT or not, so testing that
> seems less important.)

This is in gcc.dg/bitint-28.c (FE_INVALID) and gcc.dg/bitint-29.c
(FE_INEXACT) for binary and dfp/bitint-8.c new tests.

> 2. Test conversions from integers to floating point raise FE_INEXACT when
> inexact, together with FE_OVERFLOW when overflowing (while exact
> conversions don't raise exceptions).

This is in gcc.dg/bitint-30.c new test.

> 3. Test conversions from integers to floating point respect the rounding
> mode.

This is in gcc.dg/bitint-31.c new test.

> 4. Test converting floating-point values in the range (-1.0, 0.0] to both
> unsigned and signed _BitInt; I didn't see such tests for binary floating
> types, only for decimal types, and the decimal tests didn't include tests
> of negative zero itself as the value converted to _BitInt.

This is done as incremental changes to existing tests.

> 5. Test conversions of noncanonical BID zero to integers (these tests
> would be specific to BID).  See below for a bug in this area.

This is done in dfp/bitint-7.c test.
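
Item 4 relies on the C rule that converting a floating-point value to an integer type discards the fractional part, so every value in (-1.0, 0.0] converts to zero even for unsigned targets.  A minimal sketch using plain int and unsigned as stand-ins for _BitInt types (the helper names are hypothetical, and this is not one of the committed tests):

```c
/* Convert D to unsigned; defined as long as the truncated value is
   representable, which holds for every D in (-1.0, 0.0].  */
unsigned
sketch_to_unsigned (double d)
{
  return (unsigned) d;
}

/* Convert D to int, truncating toward zero.  */
int
sketch_to_signed (double d)
{
  return (int) d;
}
```

Values such as -0.5, -0.999 and -0.0 all yield 0 for both helpers; -1.0 itself is only meaningful for the signed conversion, where it yields -1.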

2023-09-06  Jakub Jelinek  

PR c/102989
* gcc.dg/torture/bitint-21.c (main): Add tests for -1 for signed only,
-1 + epsilon, another (-1, 0) range value and -0.
* gcc.dg/torture/bitint-22.c (main): Likewise.
* gcc.dg/bitint-28.c: New test.
* gcc.dg/bitint-29.c: New test.
* gcc.dg/bitint-30.c: New test.
* gcc.dg/bitint-31.c: New test.
* gcc.dg/dfp/bitint-1.c (main): Add tests for -1 for signed only,
-1 + epsilon and -0.
* gcc.dg/dfp/bitint-2.c (main): Likewise.
* gcc.dg/dfp/bitint-3.c (main): Likewise.
* gcc.dg/dfp/bitint-7.c: New test.
* gcc.dg/dfp/bitint-8.c: New test.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #104 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:a2f50aa2c578eb0572935e61818e1f2b18b53fd6

commit r14-3753-ga2f50aa2c578eb0572935e61818e1f2b18b53fd6
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:37:53 2023 +0200

testsuite part 2 for _BitInt support [PR102989]

This is the second part of the testcase additions, split in order to fit
within mailing list size limits.  Most of these tests are for the floating
point conversions, atomics, __builtin_*_overflow and -fsanitize=undefined.

2023-09-06  Jakub Jelinek  

PR c/102989
gcc/testsuite/
* gcc.dg/torture/bitint-20.c: New test.
* gcc.dg/torture/bitint-21.c: New test.
* gcc.dg/torture/bitint-22.c: New test.
* gcc.dg/torture/bitint-23.c: New test.
* gcc.dg/torture/bitint-24.c: New test.
* gcc.dg/torture/bitint-25.c: New test.
* gcc.dg/torture/bitint-26.c: New test.
* gcc.dg/torture/bitint-27.c: New test.
* gcc.dg/torture/bitint-28.c: New test.
* gcc.dg/torture/bitint-29.c: New test.
* gcc.dg/torture/bitint-30.c: New test.
* gcc.dg/torture/bitint-31.c: New test.
* gcc.dg/torture/bitint-32.c: New test.
* gcc.dg/torture/bitint-33.c: New test.
* gcc.dg/torture/bitint-34.c: New test.
* gcc.dg/torture/bitint-35.c: New test.
* gcc.dg/torture/bitint-36.c: New test.
* gcc.dg/torture/bitint-37.c: New test.
* gcc.dg/torture/bitint-38.c: New test.
* gcc.dg/torture/bitint-39.c: New test.
* gcc.dg/torture/bitint-40.c: New test.
* gcc.dg/torture/bitint-41.c: New test.
* gcc.dg/torture/bitint-42.c: New test.
* gcc.dg/atomic/stdatomic-bitint-1.c: New test.
* gcc.dg/atomic/stdatomic-bitint-2.c: New test.
* gcc.dg/dfp/bitint-1.c: New test.
* gcc.dg/dfp/bitint-2.c: New test.
* gcc.dg/dfp/bitint-3.c: New test.
* gcc.dg/dfp/bitint-4.c: New test.
* gcc.dg/dfp/bitint-5.c: New test.
* gcc.dg/dfp/bitint-6.c: New test.
* gcc.dg/ubsan/bitint-1.c: New test.
* gcc.dg/ubsan/bitint-2.c: New test.
* gcc.dg/ubsan/bitint-3.c: New test.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #105 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:f76ae4369cb6f38e17510704e5b6e53847d2a648

commit r14-3754-gf76ae4369cb6f38e17510704e5b6e53847d2a648
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:39:15 2023 +0200

C _BitInt incremental fixes [PR102989]

On Wed, Aug 09, 2023 at 09:17:57PM +, Joseph Myers wrote:
> > - _Complex _BitInt(N) isn't supported; again mainly because none of the psABIs
> >   mention how those should be passed/returned; in a limited way they are
> >   supported internally because the internal functions into which
> >   __builtin_{add,sub,mul}_overflow{,_p} is lowered return COMPLEX_TYPE as a
> >   hack to return 2 values without using references/pointers
>
> What happens when the usual arithmetic conversions are applied to
> operands, one of which is a complex integer type and the other of which is
> a wider _BitInt type?  I don't see anything in the code to disallow this
> case (which would produce an expression with a _Complex _BitInt type), or
> any testcases for it.

I've added a sorry for that case (+ return the narrower COMPLEX_TYPE).
Also added testcase to verify we don't create VECTOR_TYPEs of BITINT_TYPE
even if they have mode precision and suitable size (others were rejected
already before).

> Other testcases I think should be present (along with any corresponding
> changes needed to the code itself):
>
> * Verifying that the new integer constant suffix is rejected for C++.

Done.

> * Verifying appropriate pedwarn-if-pedantic for the new constant suffix
> for versions of C before C2x (and probably for use of _BitInt type
> specifiers before C2x as well) - along with the expected -Wc11-c2x-compat
> handling (in C2x mode) / -pedantic -Wno-c11-c2x-compat in older modes.

Done.

Here is an incremental patch which does that.

2023-09-06  Jakub Jelinek  

PR c/102989
gcc/c/
* c-decl.cc (finish_declspecs): Emit pedwarn_c11 on _BitInt.
* c-typeck.cc (c_common_type): Emit sorry for common type between
_Complex integer and larger _BitInt and return the _Complex
integer.
gcc/c-family/
* c-attribs.cc (type_valid_for_vector_size): Reject vector types
with BITINT_TYPE elements even if they have mode precision and
suitable size.
gcc/testsuite/
* gcc.dg/bitint-19.c: New test.
* gcc.dg/bitint-20.c: New test.
* gcc.dg/bitint-21.c: New test.
* gcc.dg/bitint-22.c: New test.
* gcc.dg/bitint-23.c: New test.
* gcc.dg/bitint-24.c: New test.
* gcc.dg/bitint-25.c: New test.
* gcc.dg/bitint-26.c: New test.
* gcc.dg/bitint-27.c: New test.
* g++.dg/ext/bitint1.C: New test.
* g++.dg/ext/bitint2.C: New test.
* g++.dg/ext/bitint3.C: New test.
* g++.dg/ext/bitint4.C: New test.
libcpp/
* expr.cc (cpp_classify_number): Diagnose wb literal suffixes
for -pedantic* before C2X or -Wc11-c2x-compat.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #106 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:f6e0ec5696ec5f52baed71fe23f978bcef80d458

commit r14-3755-gf6e0ec5696ec5f52baed71fe23f978bcef80d458
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:42:37 2023 +0200

libgcc _BitInt helper documentation [PR102989]

On Mon, Aug 21, 2023 at 05:32:04PM +, Joseph Myers wrote:
> I think the libgcc functions (i.e. those exported by libgcc, to which
> references are generated by the compiler) need documenting in libgcc.texi.
> Internal functions or macros in the libgcc patch need appropriate comments
> specifying their semantics; especially FP_TO_BITINT and FP_FROM_BITINT
> which have a lot of arguments and no comments saying what the semantics of
> the macros and their arguments are supposed to be.

Here is an incremental patch which does that.

2023-09-06  Jakub Jelinek  

PR c/102989
gcc/
* doc/libgcc.texi (Bit-precise integer arithmetic functions):
Document general rules for _BitInt support library functions
and document __mulbitint3 and __divmodbitint4.
(Conversion functions): Document __fix{s,d,x,t}fbitint,
__floatbitint{s,d,x,t,h,b}f, __bid_fix{s,d,t}dbitint and
__bid_floatbitint{s,d,t}d.
libgcc/
* libgcc2.c (bitint_negate): Add function comment.
* soft-fp/bitint.h (bitint_negate): Add function comment.
(FP_TO_BITINT, FP_FROM_BITINT): Add comment explaining the macros.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #108 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:3ad9948b3e716885ce66bdf1c8e053880a843a2b

commit r14-3757-g3ad9948b3e716885ce66bdf1c8e053880a843a2b
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:49:44 2023 +0200

_BitInt profile fixes [PR102989]

On Thu, Aug 24, 2023 at 03:14:32PM +0200, Jan Hubicka via Gcc-patches wrote:
> this patch extends verifier to check that all probabilities and counts are
> initialized if profile is supposed to be present.  This is a bit complicated
> by the possibility that we inline !flag_guess_branch_probability function
> into function with profile defined and in this case we need to stop
> verification.  For this reason I added flag to cfg structure tracking this.

This patch broke a couple of _BitInt tests (in the admittedly still
uncommitted series - still waiting for review of the C FE bits).

Here is a minimal patch to make it work again, though I'm not sure
if in the if_then_else and if_then_if_then_else cases I shouldn't scale
count of the other bbs as well.  if_then method creates
if (COND) new_bb1;
in a middle of some pre-existing bb (with PROB that COND is true),
if_then_else
if (COND) new_bb1; else new_bb2;
and if_then_if_then_else
if (COND1) { if (COND2) new_bb2; else new_bb1; }
with PROB1 and PROB2 probabilities that COND1 and COND2 are true.
The lowering happens shortly after IPA.
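
To make the shapes above concrete, here is a hedged C sketch of how execution counts for the new blocks could be derived from the stated probabilities.  It is plain arithmetic, not the code in gimple-lower-bitint.cc: probabilities are modelled as parts per 10000 instead of GCC's profile_probability, all names are hypothetical, and the commit message does not spell out what values the patch actually assigns.

```c
/* Probabilities are expressed in parts per SKETCH_PROB_BASE.  */
#define SKETCH_PROB_BASE 10000

/* Counts for the two new blocks.  */
struct sketch_counts
{
  long bb1;   /* Count of new_bb1.  */
  long bb2;   /* Count of new_bb2.  */
};

/* Scale COUNT by PROB.  */
long
sketch_scale (long count, int prob)
{
  return count * prob / SKETCH_PROB_BASE;
}

/* if (COND) new_bb1; else new_bb2;
   PROB is the probability that COND is true.  */
struct sketch_counts
sketch_if_then_else (long count, int prob)
{
  struct sketch_counts c;
  c.bb1 = sketch_scale (count, prob);
  c.bb2 = count - c.bb1;
  return c;
}

/* if (COND1) { if (COND2) new_bb2; else new_bb1; }
   PROB1 and PROB2 are the probabilities that COND1 and COND2 are true.  */
struct sketch_counts
sketch_if_then_if_then_else (long count, int prob1, int prob2)
{
  long inner = sketch_scale (count, prob1);
  struct sketch_counts c;
  c.bb2 = sketch_scale (inner, prob2);
  c.bb1 = inner - c.bb2;
  return c;
}
```

For an enclosing block executed 1000 times, a 25% condition splits into 250 and 750, and nested 50% and 20% conditions give 100 executions of new_bb2 and 400 of new_bb1.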

2023-09-06  Jakub Jelinek  

PR c/102989
* gimple-lower-bitint.cc (bitint_large_huge::if_then_else,
bitint_large_huge::if_then_if_then_else): Use make_single_succ_edge
rather than make_edge, initialize bb->count.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #103 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:faff31701d50fab08d75fbb13affc82cff74a72c

commit r14-3752-gfaff31701d50fab08d75fbb13affc82cff74a72c
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:36:41 2023 +0200

testsuite part 1 for _BitInt support [PR102989]

This patch adds the first part of the testsuite support.  When creating the
testcases, I've been using the
https://defuse.ca/big-number-calculator.htm
tool, a randombitint tool I wrote (posted as a reply to the first series),
plus LLVM trunk on godbolt and the WIP GCC support, checking if both compilers
agree on stuff (and in case of differences tried constant evaluation etc.).
The whole testsuite has also been tested with
make -j32 -k check-gcc GCC_TEST_RUN_EXPENSIVE=1 \
RUNTESTFLAGS='GCC_TEST_RUN_EXPENSIVE=1 --target_board=unix\{-m32,-m64\}
ubsan.exp=bitint*.c dg.exp=bitint* dg-torture.exp=bitint*'
to verify it in all modes; normally I'm limiting the torture tests to just
-O0 and -O2 because they are quite large and expensive.
Generally, different _BitInt precisions need to be tested because they
are lowered differently (small vs. medium vs. large vs. huge, precisions
that are multiples of the limb precision vs. others, etc.).

2023-09-06  Jakub Jelinek  

PR c/102989
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_bitint,
check_effective_target_bitint128, check_effective_target_bitint575):
New effective targets.
* gcc.dg/bitint-1.c: New test.
* gcc.dg/bitint-2.c: New test.
* gcc.dg/bitint-3.c: New test.
* gcc.dg/bitint-4.c: New test.
* gcc.dg/bitint-5.c: New test.
* gcc.dg/bitint-6.c: New test.
* gcc.dg/bitint-7.c: New test.
* gcc.dg/bitint-8.c: New test.
* gcc.dg/bitint-9.c: New test.
* gcc.dg/bitint-10.c: New test.
* gcc.dg/bitint-11.c: New test.
* gcc.dg/bitint-12.c: New test.
* gcc.dg/bitint-13.c: New test.
* gcc.dg/bitint-14.c: New test.
* gcc.dg/bitint-15.c: New test.
* gcc.dg/bitint-16.c: New test.
* gcc.dg/bitint-17.c: New test.
* gcc.dg/bitint-18.c: New test.
* gcc.dg/torture/bitint-1.c: New test.
* gcc.dg/torture/bitint-2.c: New test.
* gcc.dg/torture/bitint-3.c: New test.
* gcc.dg/torture/bitint-4.c: New test.
* gcc.dg/torture/bitint-5.c: New test.
* gcc.dg/torture/bitint-6.c: New test.
* gcc.dg/torture/bitint-7.c: New test.
* gcc.dg/torture/bitint-8.c: New test.
* gcc.dg/torture/bitint-9.c: New test.
* gcc.dg/torture/bitint-10.c: New test.
* gcc.dg/torture/bitint-11.c: New test.
* gcc.dg/torture/bitint-12.c: New test.
* gcc.dg/torture/bitint-13.c: New test.
* gcc.dg/torture/bitint-14.c: New test.
* gcc.dg/torture/bitint-15.c: New test.
* gcc.dg/torture/bitint-16.c: New test.
* gcc.dg/torture/bitint-17.c: New test.
* gcc.dg/torture/bitint-18.c: New test.
* gcc.dg/torture/bitint-19.c: New test.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #101 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:2ce182e258d3ab11310442d5f4dd1d063018aca9

commit r14-3750-g2ce182e258d3ab11310442d5f4dd1d063018aca9
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:33:05 2023 +0200

libgcc _BitInt support [PR102989]

This patch adds the library helpers for multiplication, division + modulo
and casts from and to floating point (both binary and decimal).
As described in the intro, the first step is to try to further reduce the
passed-in precision by skipping over most significant limbs with just zeros
or sign bit copies.  For multiplication and division I've implemented
a simple algorithm; using something smarter like Karatsuba or Toom N-Way
might be faster for very large _BitInts (which we don't support right now
anyway), but could mean more code in libgcc, which maybe isn't what people
are willing to accept.
For the to/from floating point conversions the patch uses soft-fp, because
it already has tons of handy macros which can be used for that.  In theory
it could be implemented using {,unsigned} long long or {,unsigned} __int128
to/from floating point conversions with some frexp before/after, but at that
point we already need to force it into integer registers and analyze it
anyway.  Plus, for 32-bit arches there is no __int128 that could be used
for XF/TF mode stuff.
I know that soft-fp is owned by glibc and I think the op-common.h change
should be propagated there, but the bitint stuff is really GCC specific
and IMHO doesn't belong in the glibc copy.
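
The two ideas described above for multiplication (trim the operands first, then use a simple quadratic algorithm) can be sketched in portable C. This is only an illustration under stated assumptions, not the libgcc code: the 32-bit limb type and the names reduce_limbs and mul_limbs are hypothetical, the limbs are little-endian, and only unsigned operands are handled (so only zero limbs are skipped, not sign bit copies).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;   /* hypothetical limb type for this sketch */

/* Number of limbs that still matter once most significant zero limbs
   are skipped (unsigned values only; the result is at least 1).  */
static size_t reduce_limbs(const limb_t *x, size_t n)
{
    assert(n > 0);
    while (n > 1 && x[n - 1] == 0)
        n--;
    return n;
}

/* res[0 .. rn-1] = (a * b) mod 2^(32 * rn), schoolbook multiplication.
   RES must not overlap A or B.  */
static void mul_limbs(limb_t *res, size_t rn,
                      const limb_t *a, size_t an,
                      const limb_t *b, size_t bn)
{
    an = reduce_limbs(a, an);
    bn = reduce_limbs(b, bn);
    for (size_t i = 0; i < rn; i++)
        res[i] = 0;
    for (size_t i = 0; i < an && i < rn; i++) {
        uint64_t carry = 0;
        size_t j;
        for (j = 0; j < bn && i + j < rn; j++) {
            /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) == 2^64-1.  */
            uint64_t t = (uint64_t)a[i] * b[j] + res[i + j] + carry;
            res[i + j] = (limb_t)t;
            carry = t >> 32;
        }
        if (i + j < rn)
            res[i + j] = (limb_t)carry;
    }
}
```

Trimming first means the inner loops only run over limbs that can contribute to the result, which matters when a wide type holds a small value.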

2023-09-06  Jakub Jelinek  

PR c/102989
libgcc/
* config/aarch64/t-softfp (softfp_extras): Use += rather than :=.
* config/i386/64/t-softfp (softfp_extras): Likewise.
* config/i386/libgcc-glibc.ver (GCC_14.0.0): Export _BitInt support
routines.
* config/i386/t-softfp (softfp_extras): Add fixxfbitint and
bf, hf and xf mode floatbitint.
(CFLAGS-floatbitintbf.c, CFLAGS-floatbitinthf.c): Add -msse2.
* config/riscv/t-softfp32 (softfp_extras): Use += rather than :=.
* config/rs6000/t-e500v1-fp (softfp_extras): Likewise.
* config/rs6000/t-e500v2-fp (softfp_extras): Likewise.
* config/t-softfp (softfp_floatbitint_funcs): New.
(softfp_bid_list): New.
(softfp_func_list): Add sf and df mode from and to _BitInt
libcalls.
(softfp_bid_file_list): New.
(LIB2ADD_ST): Add $(softfp_bid_file_list).
* config/t-softfp-sfdftf (softfp_extras): Add fixtfbitint and
floatbitinttf.
* config/t-softfp-tf (softfp_extras): Likewise.
* libgcc2.c (bitint_reduce_prec): New inline function.
(BITINT_INC, BITINT_END): Define.
(bitint_mul_1, bitint_addmul_1): New helper functions.
(__mulbitint3): New function.
(bitint_negate, bitint_submul_1): New helper functions.
(__divmodbitint4): New function.
* libgcc2.h (LIBGCC2_UNITS_PER_WORD): When building _BitInt support
libcalls, redefine depending on __LIBGCC_BITINT_LIMB_WIDTH__.
(__mulbitint3, __divmodbitint4): Declare.
* libgcc-std.ver.in (GCC_14.0.0): Export _BitInt support routines.
* Makefile.in (lib2funcs): Add _mulbitint3.
(LIB2_DIVMOD_FUNCS): Add _divmodbitint4.
* soft-fp/bitint.h: New file.
* soft-fp/fixdfbitint.c: New file.
* soft-fp/fixsfbitint.c: New file.
* soft-fp/fixtfbitint.c: New file.
* soft-fp/fixxfbitint.c: New file.
* soft-fp/floatbitintbf.c: New file.
* soft-fp/floatbitintdf.c: New file.
* soft-fp/floatbitinthf.c: New file.
* soft-fp/floatbitintsf.c: New file.
* soft-fp/floatbitinttf.c: New file.
* soft-fp/floatbitintxf.c: New file.
* soft-fp/op-common.h (_FP_FROM_INT): Add support for rsize up to
4 * _FP_W_TYPE_SIZE rather than just 2 * _FP_W_TYPE_SIZE.
* soft-fp/bitintpow10.c: New file.
* soft-fp/fixsdbitint.c: New file.
* soft-fp/fixddbitint.c: New file.
* soft-fp/fixtdbitint.c: New file.
* soft-fp/floatbitintsd.c: New file.
* soft-fp/floatbitintdd.c: New file.
* soft-fp/floatbitinttd.c: New file.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #102 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:8c984a1c3693df63520558631c827bb2c2d8b5bc

commit r14-3751-g8c984a1c3693df63520558631c827bb2c2d8b5bc
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:34:49 2023 +0200

C _BitInt support [PR102989]

This patch adds the C FE support, c-family support, small libcpp change
so that 123wb and 42uwb suffixes are handled plus glimits.h change
to define BITINT_MAXWIDTH macro.

The previous patches really do nothing without this, which enables
all the support.

2023-09-06  Jakub Jelinek  

PR c/102989
gcc/
* glimits.h (BITINT_MAXWIDTH): Define if __BITINT_MAXWIDTH__ is
predefined.
gcc/c-family/
* c-common.cc (c_common_reswords): Add _BitInt as keyword.
(unsafe_conversion_p): Handle BITINT_TYPE like INTEGER_TYPE.
(c_common_signed_or_unsigned_type): Handle BITINT_TYPE.
(c_common_truthvalue_conversion, c_common_get_alias_set,
check_builtin_function_arguments): Handle BITINT_TYPE like
INTEGER_TYPE.
(sync_resolve_size): Add ORIG_FORMAT argument.  If
FETCH && !ORIG_FORMAT, type is BITINT_TYPE, return -1 if size isn't
one of 1, 2, 4, 8 or 16 or if it is 16 but TImode is not supported.
(atomic_bitint_fetch_using_cas_loop): New function.
(resolve_overloaded_builtin): Adjust sync_resolve_size caller.  If
-1 is returned, use atomic_bitint_fetch_using_cas_loop to lower it.
Formatting fix.
(keyword_begins_type_specifier): Handle RID_BITINT.
* c-common.h (enum rid): Add RID_BITINT enumerator.
* c-cppbuiltin.cc (c_cpp_builtins): For C call
targetm.c.bitint_type_info and predefine __BITINT_MAXWIDTH__
and for -fbuilding-libgcc also __LIBGCC_BITINT_LIMB_WIDTH__ and
__LIBGCC_BITINT_ORDER__ macros if _BitInt is supported.
* c-lex.cc (interpret_integer): Handle CPP_N_BITINT.
* c-pretty-print.cc (c_pretty_printer::simple_type_specifier,
c_pretty_printer::direct_abstract_declarator,
c_pretty_printer::direct_declarator, c_pretty_printer::declarator):
Handle BITINT_TYPE.
(pp_c_integer_constant): Handle printing of large precision
wide_ints which would buffer overflow digit_buffer.
* c-warn.cc (conversion_warning, warnings_for_convert_and_check,
warnings_for_convert_and_check): Handle BITINT_TYPE like
INTEGER_TYPE.
gcc/c/
* c-convert.cc (c_convert): Handle BITINT_TYPE like INTEGER_TYPE.
* c-decl.cc (check_bitfield_type_and_width): Allow BITINT_TYPE
bit-fields.
(finish_struct): Prefer to use BITINT_TYPE for BITINT_TYPE
bit-fields if possible.
(declspecs_add_type): Formatting fixes.  Handle cts_bitint.  Adjust
for added union in *specs.  Handle RID_BITINT.
(finish_declspecs): Handle cts_bitint.  Adjust for added union
in *specs.
* c-parser.cc (c_keyword_starts_typename, c_token_starts_declspecs,
c_parser_declspecs, c_parser_gnu_attribute_any_word): Handle
RID_BITINT.
(c_parser_omp_clause_schedule): Handle BITINT_TYPE like
INTEGER_TYPE.
* c-tree.h (enum c_typespec_keyword): Mention _BitInt in comment.
Add cts_bitint enumerator.
(struct c_declspecs): Move int_n_idx and floatn_nx_idx into a union
and add bitint_prec there as well.
* c-typeck.cc (c_common_type, comptypes_internal):
Handle BITINT_TYPE.
(perform_integral_promotions): Promote BITINT_TYPE bit-fields to
their declared type.
(build_array_ref, build_unary_op, build_conditional_expr,
build_c_cast, convert_for_assignment, digest_init,
build_binary_op): Handle BITINT_TYPE.
* c-fold.cc (c_fully_fold_internal): Handle BITINT_TYPE like
INTEGER_TYPE.
* c-aux-info.cc (gen_type): Handle BITINT_TYPE.
libcpp/
* expr.cc (interpret_int_suffix): Handle wb and WB suffixes.
* include/cpplib.h (CPP_N_BITINT): Define.
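
The atomic_bitint_fetch_using_cas_loop entry in the ChangeLog above refers to lowering an atomic fetch-and-op into a compare-and-swap loop. The general pattern can be sketched with C11 <stdatomic.h>; this is an assumption-laden illustration, not the code GCC emits. The function name fetch_add_narrow and the idea of keeping a PREC-bit unsigned value in a 32-bit container are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Atomically add VAL to a PREC-bit unsigned value stored in a 32-bit
   container and return the previous value.  The sum wraps modulo
   2^PREC, which a plain 32-bit atomic fetch-add would not do.  */
static uint32_t fetch_add_narrow(_Atomic uint32_t *p, uint32_t val,
                                 unsigned prec)
{
    assert(prec >= 1 && prec < 32);
    const uint32_t mask = (UINT32_C(1) << prec) - 1;
    uint32_t old = atomic_load(p);
    uint32_t desired;
    do {
        /* Recompute from the freshest OLD; on failure the
           compare-exchange stores the current value into OLD.  */
        desired = (old + val) & mask;
    } while (!atomic_compare_exchange_weak(p, &old, desired));
    return old;
}
```

The loop retries only when another thread changed the value between the load and the exchange, so the operation in the loop body can be arbitrary as long as it is recomputed on each attempt.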

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #100 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:7a610d44d855424518ecb4429ea5226ed2c32543

commit r14-3749-g7a610d44d855424518ecb4429ea5226ed2c32543
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:31:23 2023 +0200

libgcc: Generated tables for _BitInt <-> _Decimal* conversions [PR102989]

The following patch adds a header with generated helper tables to support
computation of powers of 10 from 10^0 to 10^6111 inclusive into a
sufficiently large array of _BitInt limbs.  This is split from the rest
of the libgcc _BitInt support because it is quite large and together it
would run into gcc-patches mail length limits.
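
For a sense of what "a power of 10 in an array of limbs" means, here is a small sketch that computes 10^exp at run time by repeated multiplication. It is not how the generated tables in the header work (those are precomputed); the function name pow10_limbs and the 32-bit little-endian limbs are assumptions made for the example. For scale, 10^6111 needs roughly 20,300 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store 10^EXP into limbs[0 .. n-1] (little-endian 32-bit limbs).
   Returns the number of limbs used, or 0 if N limbs are not enough.  */
static size_t pow10_limbs(uint32_t *limbs, size_t n, unsigned exp)
{
    assert(n > 0);
    size_t used = 1;
    limbs[0] = 1;
    for (size_t i = 1; i < n; i++)
        limbs[i] = 0;
    for (unsigned e = 0; e < exp; e++) {
        uint64_t carry = 0;
        /* Multiply the whole number by 10, limb by limb.  */
        for (size_t i = 0; i < used; i++) {
            uint64_t t = (uint64_t)limbs[i] * 10u + carry;
            limbs[i] = (uint32_t)t;
            carry = t >> 32;
        }
        if (carry != 0) {
            if (used == n)
                return 0;   /* would not fit */
            limbs[used++] = (uint32_t)carry;
        }
    }
    return used;
}
```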

2023-09-06  Jakub Jelinek  

PR c/102989
libgcc/
* soft-fp/bitintpow10.h: New file.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #99 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:95521e15b6ef00c192a1bbd7c13b5f35395c7c9e

commit r14-3748-g95521e15b6ef00c192a1bbd7c13b5f35395c7c9e
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:30:07 2023 +0200

ubsan: _BitInt -fsanitize=undefined support [PR102989]

The following patch introduces some -fsanitize=undefined support for
_BitInt, but some of the diagnostics are limited by lack of proper support
in the library.
I've filed https://github.com/llvm/llvm-project/issues/64100 to request
proper support; for now some of the diagnostics might have more or less
confusing or inaccurate wording, but UB should still be diagnosed when it
happens.

2023-09-06  Jakub Jelinek  

PR c/102989
gcc/
* internal-fn.cc (expand_ubsan_result_store): Add LHS, MODE and
DO_ERROR arguments.  For non-mode precision BITINT_TYPE results
check if all padding bits up to mode precision are zeros or sign
bit copies and if not, jump to DO_ERROR.
(expand_addsub_overflow, expand_neg_overflow, expand_mul_overflow):
Adjust expand_ubsan_result_store callers.
* ubsan.cc: Include target.h and langhooks.h.
(ubsan_encode_value): Pass BITINT_TYPE values which fit into
pointer size converted to pointer sized integer, pass BITINT_TYPE values
which fit into TImode (if supported) or DImode as those integer
types or otherwise for now punt (pass 0).
(ubsan_type_descriptor): Handle BITINT_TYPE.  For pstyle of
UBSAN_PRINT_FORCE_INT use TK_Integer (0x) mode with a
TImode/DImode precision rather than TK_Unknown used otherwise for
large/huge BITINT_TYPEs.
(instrument_si_overflow): Instrument BITINT_TYPE operations even
when they don't have mode precision.
* ubsan.h (enum ubsan_print_style): New enumerator.
gcc/c-family/
* c-ubsan.cc (ubsan_instrument_shift): Use UBSAN_PRINT_FORCE_INT
for type0 type descriptor.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #98 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:b38deff6127778fed453bb647e32738ba5c78e33

commit r14-3747-gb38deff6127778fed453bb647e32738ba5c78e33
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:29:17 2023 +0200

i386: Enable _BitInt on x86-64 [PR102989]

The following patch enables _BitInt support on x86-64, the only
target which has _BitInt specified in psABI.

2023-09-06  Jakub Jelinek  

PR c/102989
* config/i386/i386.cc (classify_argument): Handle BITINT_TYPE.
(ix86_bitint_type_info): New function.
(TARGET_C_BITINT_TYPE_INFO): Redefine.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #96 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:4f4fa2501186e43d115238ae938b3df322c9e02a

commit r14-3745-g4f4fa2501186e43d115238ae938b3df322c9e02a
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:25:49 2023 +0200

Middle-end _BitInt support [PR102989]

The following patch introduces the middle-end part of the _BitInt
support, a new BITINT_TYPE, handling it where needed, except the lowering
pass and sanitizer support.

2023-09-06  Jakub Jelinek  

PR c/102989
* tree.def (BITINT_TYPE): New type.
* tree.h (TREE_CHECK6, TREE_NOT_CHECK6): Define.
(NUMERICAL_TYPE_CHECK, INTEGRAL_TYPE_P): Include
BITINT_TYPE.
(BITINT_TYPE_P): Define.
(CONSTRUCTOR_BITFIELD_P): Return true even for BLKmode bit-fields
if they have BITINT_TYPE type.
(tree_check6, tree_not_check6): New inline functions.
(any_integral_type_check): Include BITINT_TYPE.
(build_bitint_type): Declare.
* tree.cc (tree_code_size, wide_int_to_tree_1, cache_integer_cst,
build_zero_cst, type_hash_canon_hash, type_cache_hasher::equal,
type_hash_canon): Handle BITINT_TYPE.
(bitint_type_cache): New variable.
(build_bitint_type): New function.
(signed_or_unsigned_type_for, verify_type_variant, verify_type):
Handle BITINT_TYPE.
(tree_cc_finalize): Free bitint_type_cache.
* builtins.cc (type_to_class): Handle BITINT_TYPE.
(fold_builtin_unordered_cmp): Handle BITINT_TYPE like INTEGER_TYPE.
* cfgexpand.cc (expand_debug_expr): Punt on BLKmode BITINT_TYPE
INTEGER_CSTs.
* convert.cc (convert_to_pointer_1, convert_to_real_1,
convert_to_complex_1): Handle BITINT_TYPE like INTEGER_TYPE.
(convert_to_integer_1): Likewise.  For BITINT_TYPE don't check
GET_MODE_PRECISION (TYPE_MODE (type)).
* doc/generic.texi (BITINT_TYPE): Document.
* doc/tm.texi.in (TARGET_C_BITINT_TYPE_INFO): New.
* doc/tm.texi: Regenerated.
* dwarf2out.cc (base_type_die, is_base_type, modified_type_die,
gen_type_die_with_usage): Handle BITINT_TYPE.
(rtl_for_decl_init): Punt on BLKmode BITINT_TYPE INTEGER_CSTs or
handle those which fit into shwi.
* expr.cc (expand_expr_real_1): Define EXTEND_BITINT macro, reduce
to bitfield precision reads from BITINT_TYPE vars, parameters or
memory locations.  Expand large/huge BITINT_TYPE INTEGER_CSTs into
memory.
* fold-const.cc (fold_convert_loc, make_range_step): Handle
BITINT_TYPE.
(extract_muldiv_1): For BITINT_TYPE use TYPE_PRECISION rather than
GET_MODE_SIZE (SCALAR_INT_TYPE_MODE).
(native_encode_int, native_interpret_int, native_interpret_expr):
Handle BITINT_TYPE.
* gimple-expr.cc (useless_type_conversion_p): Make BITINT_TYPE
to some other integral type or vice versa conversions non-useless.
* gimple-fold.cc (gimple_fold_builtin_memset): Punt for
BITINT_TYPE.
(clear_padding_unit): Mention in comment that _BitInt types don't
need to fit either.
(clear_padding_bitint_needs_padding_p): New function.
(clear_padding_type_may_have_padding_p): Handle BITINT_TYPE.
(clear_padding_type): Likewise.
* internal-fn.cc (expand_mul_overflow): For unsigned non-mode
precision operands force pos_neg? to 1.
(expand_MULBITINT, expand_DIVMODBITINT, expand_FLOATTOBITINT,
expand_BITINTTOFLOAT): New functions.
* internal-fn.def (MULBITINT, DIVMODBITINT, FLOATTOBITINT,
BITINTTOFLOAT): New internal functions.
* internal-fn.h (expand_MULBITINT, expand_DIVMODBITINT,
expand_FLOATTOBITINT, expand_BITINTTOFLOAT): Declare.
* match.pd (non-equality compare simplifications from fold_binary):
Punt if TYPE_MODE (arg1_type) is BLKmode.
* pretty-print.h (pp_wide_int): Handle printing of large precision
wide_ints which would buffer overflow digit_buffer.
* stor-layout.cc (finish_bitfield_representative): For bit-fields
with BITINT_TYPE, prefer representatives with precisions in
multiple of limb precision.
(layout_type): Handle BITINT_TYPE.  Handle COMPLEX_TYPE with
BLKmode element type and assert it is BITINT_TYPE.
* target.def (bitint_type_info): New C target hook.
* target.h (struct bitint_info): New type.
* targhooks.cc (default_bitint_type_info): New function.
* targhooks.h (default_bitint_type_info): 

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-09-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #97 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:a9d6c7fbeb374365058ffe2b9815d2b4b7193d38

commit r14-3746-ga9d6c7fbeb374365058ffe2b9815d2b4b7193d38
Author: Jakub Jelinek 
Date:   Wed Sep 6 17:27:41 2023 +0200

_BitInt lowering support [PR102989]

The following patch adds a new bitintlower lowering pass which lowers most
operations on medium _BitInt into operations on corresponding integer types,
large _BitInt into straight line code operating on 2 or more limbs and
finally huge _BitInt into a loop plus optional straight line code.
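
To make the "straight line code" versus "loop" distinction concrete, here is a hand-written C sketch of addition in both shapes. It only illustrates the idea; the pass itself works on GIMPLE, and the function names add_2limbs and add_limbs_loop as well as the 64-bit little-endian limbs are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "Large" shape: a fixed, small number of limbs handled without a loop.  */
static void add_2limbs(uint64_t r[2], const uint64_t a[2],
                       const uint64_t b[2])
{
    uint64_t lo = a[0] + b[0];
    r[1] = a[1] + b[1] + (lo < a[0]);   /* propagate carry from limb 0 */
    r[0] = lo;
}

/* "Huge" shape: a loop over N limbs carrying between iterations.
   Returns the carry out of the most significant limb.  */
static unsigned add_limbs_loop(uint64_t *r, const uint64_t *a,
                               const uint64_t *b, size_t n)
{
    assert(n > 0);
    unsigned carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t s = a[i] + carry;
        unsigned c1 = s < a[i];         /* carry from adding the carry-in */
        uint64_t sum = s + b[i];
        carry = c1 | (sum < s);         /* at most one of the two is set */
        r[i] = sum;
    }
    return carry;
}
```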

As the only supported architecture is little-endian, the lowering only
supports little-endian for now, because it would be impossible to test it
all for big-endian.  The rest is written with any-endian support in mind, but
of course only little-endian has actually been tested.
I hope it is ok to add big-endian support to the lowering pass incrementally
later, when the first big-endian target with backend support shows up.
There are 2 possibilities for adding such support.  One would be a minimal one:
just tweak the limb_access function and perhaps one or two other spots and
transform there the indexes from little endian (index 0 is least significant)
to big endian for just the memory access.  The advantage is, I think, lower
maintenance costs; the disadvantage is that the loops will still iterate from
0 to some number of limbs and we'd rely on IVOPTs or something similar
changing it later if needed.  Or we could make those indexes endian related
everywhere, though I'm afraid that would be several hundreds of changes.
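
The first (minimal) option can be pictured as follows: the algorithm keeps using logical indexes where 0 is the least significant limb, and only the memory access translates them. This is a sketch under assumptions; limb_mem_index and increment_limbs are hypothetical names and have nothing to do with the signature of GCC's limb_access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a logical limb index (0 = least significant) to its position
   in memory.  */
static size_t limb_mem_index(size_t logical, size_t nlimbs, int big_endian)
{
    assert(logical < nlimbs);
    return big_endian ? nlimbs - 1 - logical : logical;
}

/* Increment a multi-limb value in place.  The loop always runs from
   logical index 0 upwards; only the access is translated.  Returns the
   carry out of the most significant limb.  */
static unsigned increment_limbs(uint64_t *mem, size_t nlimbs, int big_endian)
{
    assert(nlimbs > 0);
    for (size_t i = 0; i < nlimbs; i++) {
        size_t m = limb_mem_index(i, nlimbs, big_endian);
        if (++mem[m] != 0)
            return 0;       /* no carry into the next limb */
    }
    return 1;
}
```

On a big-endian layout this loop walks memory backwards, which is the situation where one would hope an optimization such as IVOPTs cleans up the index arithmetic.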

For switches indexed by large/huge _BitInt the patch invokes what the switch
lowering pass does (but only on those specific switches, not all of them);
the switch lowering breaks the switches into clusters and none of the clusters
can have a range which doesn't fit into 64-bit UHWI, everything else will be
turned into a tree of comparisons.  For clusters normally emitted as smaller
switches, because we already have a guarantee that the low .. high range is
at most 64 bits, the patch forces subtraction of the low and turns it into
a 64-bit switch.  This is done before the actual pass starts.
Similarly, we cancel lowering of certain constructs like ABS_EXPR, ABSU_EXPR,
MIN_EXPR, MAX_EXPR and COND_EXPR and turn those back to simpler comparisons
etc., so that fewer operations need to be lowered later.
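
The "subtract the low and switch on 64 bits" trick can be shown by hand for a 128-bit index made of two 64-bit limbs. Everything here is an illustrative assumption: the u128 struct, the helper names and the particular case values (low = 2^64 + 7, with cases at low, low + 1 and low + 5) are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t lo, hi; } u128;   /* hypothetical 2-limb value */

static int u128_less(u128 a, u128 b)
{
    return a.hi < b.hi || (a.hi == b.hi && a.lo < b.lo);
}

/* a - b, requires a >= b.  */
static u128 u128_sub(u128 a, u128 b)
{
    assert(!u128_less(a, b));
    u128 r = { a.lo - b.lo, a.hi - b.hi - (a.lo < b.lo) };
    return r;
}

/* One cluster whose case values all lie within 64 bits of LOW.  */
static int classify(u128 x)
{
    const u128 low = { 7, 1 };      /* 2^64 + 7 */
    if (u128_less(x, low))
        return -1;                  /* below the cluster: default */
    u128 d = u128_sub(x, low);
    if (d.hi != 0)
        return -1;                  /* above the 64-bit range: default */
    switch (d.lo) {                 /* ordinary 64-bit switch */
    case 0: return 10;
    case 1: return 11;
    case 5: return 15;
    default: return -1;
    }
}
```

After the two range checks the switch itself never sees a multi-limb value, so an ordinary jump table or comparison tree on a 64-bit integer is enough.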

2023-09-06  Jakub Jelinek  

PR c/102989
* Makefile.in (OBJS): Add gimple-lower-bitint.o.
* passes.def: Add pass_lower_bitint after pass_lower_complex and
pass_lower_bitint_O0 after pass_lower_complex_O0.
* tree-pass.h (PROP_gimple_lbitint): Define.
(make_pass_lower_bitint_O0, make_pass_lower_bitint): Declare.
* gimple-lower-bitint.h: New file.
* tree-ssa-live.h (struct _var_map): Add bitint member.
(init_var_map): Adjust declaration.
(region_contains_p): Handle map->bitint like map->outofssa_p.
* tree-ssa-live.cc (init_var_map): Add BITINT argument, initialize
map->bitint and set map->outofssa_p to false if it is non-NULL.
* tree-ssa-coalesce.cc: Include gimple-lower-bitint.h.
(build_ssa_conflict_graph): Call build_bitint_stmt_ssa_conflicts if
map->bitint.
(create_coalesce_list_for_region): For map->bitint ignore SSA_NAMEs
not in that bitmap, and allow res without default def.
(compute_optimized_partition_bases): In map->bitint mode try hard
to coalesce any SSA_NAMEs with the same size.
(coalesce_bitint): New function.
(coalesce_ssa_name): In map->bitint mode, or map->bitmap into
used_in_copies and call coalesce_bitint.
* gimple-lower-bitint.cc: New file.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-08-14 Thread tmgross at umich dot edu via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Trevor Gross  changed:

   What|Removed |Added

 CC||tmgross at umich dot edu

--- Comment #95 from Trevor Gross  ---
Just as a heads up: there is an ongoing conversation at the x86 psABI about
adjusting `_BitInt(128)` to have the same alignment as `__int128`, which would
help address some of the issues mentioned here. Please join in the discussion
if you have any comments: https://groups.google.com/g/x86-64-abi/c/-JeR9HgUU20

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-08-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #94 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:8afe9d5d2fdd047cbd4e3531170af6b66d30e74a

commit r14-3128-g8afe9d5d2fdd047cbd4e3531170af6b66d30e74a
Author: Jakub Jelinek 
Date:   Thu Aug 10 17:29:23 2023 +0200

phiopt: Fix phiopt ICE on vops [PR102989]

I've run into an ICE on gcc.dg/torture/bitint-42.c with -O1 or -Os
when enabling expensive tests, and unfortunately I can't reproduce without
_BitInt.  The IL before phiopt3 has:
   [local count: 203190070]:
  # .MEM_428 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR(*.LC3);
  goto ; [100.00%]

   [local count: 203190070]:
  # .MEM_427 = VDEF <.MEM_367>
  bitint.159 = VIEW_CONVERT_EXPR(*.LC4);

   [local count: 406380139]:
  # .MEM_368 = PHI <.MEM_428(87), .MEM_427(88)>
  # VUSE <.MEM_368>
  _123 = VIEW_CONVERT_EXPR(r495[i_107].D.2780)[0];
and factor_out_conditional_operation is called on the vop PHI, it
sees it has exactly two operands and defining statements of both
PHI arguments are converts (VCEs in this case), so it thinks it is
a good idea to try to optimize that and while doing that it constructs
void type SSA_NAMEs and the like.

2023-08-10  Jakub Jelinek  

PR c/102989
* tree-ssa-phiopt.cc (single_non_singleton_phi_for_edges): Never
return virtual phis and return NULL if there is a virtual phi
where the arguments from E0 and E1 edges aren't equal.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-08-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #93 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:d5ad55a83d504df582d1e6f1c168454a028c0437

commit r14-3120-gd5ad55a83d504df582d1e6f1c168454a028c0437
Author: Jakub Jelinek 
Date:   Thu Aug 10 09:23:08 2023 +0200

lto-streamer-in: Adjust assert [PR102989]

With _BitInt(575) or any other _BitInt(513) or larger constants we can
run into this assertion.  MAX_BITSIZE_MODE_ANY_INT is just a value from
which WIDE_INT_MAX_PRECISION is derived.

2023-08-10  Jakub Jelinek  

PR c/102989
* lto-streamer-in.cc (lto_input_tree_1): Assert TYPE_PRECISION
is up to WIDE_INT_MAX_PRECISION rather than
MAX_BITSIZE_MODE_ANY_INT.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-08-10 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #92 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:b129d6b5f5f13995d57d677afcb3e94d0d9c327f

commit r14-3119-gb129d6b5f5f13995d57d677afcb3e94d0d9c327f
Author: Jakub Jelinek 
Date:   Thu Aug 10 09:22:03 2023 +0200

expr: Small optimization [PR102989]

Small optimization to avoid testing modifier multiple times.

2023-08-10  Jakub Jelinek  

PR c/102989
* expr.cc (expand_expr_real_1) : Add an early return for
EXPAND_WRITE or EXPAND_MEMORY modifiers to avoid testing it multiple
times.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55642|0   |1
is obsolete||

--- Comment #91 from Jakub Jelinek  ---
Created attachment 55649
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55649&action=edit
gcc14-bitint.patch

Full patch including ChangeLog I'll submit after testing finishes.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55637|0   |1
is obsolete||

--- Comment #90 from Jakub Jelinek  ---
Created attachment 55642
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55642&action=edit
gcc14-bitint-wip.patch

Inline asm support with large/huge _BitInt (of limited usefulness; it mostly
makes sense with the g constraint), abs/absu/min/max fixes (a bug in one
testcase had prevented those bugs from being seen) and one .{ADD,SUB}_OVERFLOW fix;
all the torture bitint run tests now pass even with -fsanitize=undefined.
I still have to do something about stmt_ends_bb_p calls with large/huge _BitInt
lhs and deal with debuginfo, then bootstrap/regtest it as a whole and submit.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55628|0   |1
is obsolete||

--- Comment #89 from Jakub Jelinek  ---
Created attachment 55637
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55637&action=edit
gcc14-bitint-wip.patch

Updated patch with -fsanitize=undefined _BitInt support.
Some of the runtime messages are inaccurate and some are totally incorrect, but
I'm afraid I can't do much until libubsan adds support for _BitInt, which I've
requested in https://github.com/llvm/llvm-project/issues/64100
For +-* overflow the messages look good up to and including _BitInt(128) on
64-bit arches (or _BitInt(64) on 32-bit ones); larger ones print  instead
of numbers and claim it is unsigned integer overflow rather than signed (but I
think that is better than what clang does, where stuff just crashes with what it
emits or prints random numbers).
For / overflow, again up to _BitInt(128) it works fine, otherwise prints
division by zero rather than minimum / -1.  For shifts with non-mode precision
_BitInts, even small ones, there are various inaccuracies, because libubsan
thinks the mode precision is the precision of the type.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55596|0   |1
is obsolete||

--- Comment #88 from Jakub Jelinek  ---
Created attachment 55628
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55628&action=edit
gcc14-bitint-wip.patch

Updated version which passes all the __builtin_*_overflow{,_p} tests.  I also
used gcov on gimple-lower-bitint.cc to make sure testsuite coverage covers almost
everything in the file.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-21 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55592|0   |1
is obsolete||

--- Comment #87 from Jakub Jelinek  ---
Created attachment 55596
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55596&action=edit
gcc14-bitint-wip.patch

large/huge _BitInt __builtin_{add,sub}_overflow mostly implemented (I've left 2
spots to finish - gcc_unreachable () - which only trigger rarely).
Though, e.g. in bitint-41.c test still
t113sub
t122mul
t125mul
t127mul
t160sub
t171mul
t174mul
t176mul
functions abort, so to be debugged next week, then ubsan, inline asm and then
hopefully submit.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-20 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55572|0   |1
is obsolete||

--- Comment #86 from Jakub Jelinek  ---
Created attachment 55592
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55592&action=edit
gcc14-bitint-wip.patch

small/medium _BitInt __builtin_{add,sub,mul}_overflow support (with testsuite
coverage) and large/huge _BitInt __builtin_mul_overflow support (just compile
tested on a simple testcase; more testing will need to wait until
__builtin_{add,sub}_overflow support is added for large/huge _BitInt).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55567|0   |1
is obsolete||

--- Comment #85 from Jakub Jelinek  ---
Created attachment 55572
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55572&action=edit
gcc14-bitint-wip.patch

At least the x86-64 _BitInt psABI says that the padding bits are undefined and
the various other psABI proposals do that as well.
Though, when looking at RTL expansion, we were doing REDUCE_BIT_FIELD after
operations, meaning that we effectively relied on those bits, at least for
small/medium _BitInt, to be sign or zero extended.
This change tries to force sign/zero extensions when reading _BitInt from
memory, parameters etc.
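
Since the padding bits are undefined, a reader cannot trust anything above the declared precision and has to re-extend the value itself. A minimal sketch of that step for a value held in a 64-bit container follows; the function names extend_unsigned and extend_signed are hypothetical and this is plain C, not the RTL expansion code.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low PREC bits of RAW, discarding the padding bits.  */
static uint64_t extend_unsigned(uint64_t raw, unsigned prec)
{
    assert(prec >= 1 && prec <= 64);
    if (prec == 64)
        return raw;
    return raw & ((UINT64_C(1) << prec) - 1);
}

/* Sign-extend the low PREC bits of RAW, discarding the padding bits.  */
static int64_t extend_signed(uint64_t raw, unsigned prec)
{
    uint64_t v = extend_unsigned(raw, prec);
    uint64_t sign = UINT64_C(1) << (prec - 1);
    /* Flip the sign bit, then subtract it again: values with the sign
       bit set wrap around to the matching negative number.  The final
       cast assumes the usual two's complement conversion.  */
    return (int64_t)((v ^ sign) - sign);
}
```

For example, a 7-bit value read from a container holding 0xFFFFFF7F yields 0x7F unsigned and -1 signed, regardless of what the upper bits contained.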

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-18 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55562|0   |1
is obsolete||

--- Comment #84 from Jakub Jelinek  ---
Created attachment 55567
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55567&action=edit
gcc14-bitint-wip.patch

Actually implemented support for switches first.  The switchlower pass
already has most of what is needed, so all we need to do when we detect a
switch indexed by large/huge _BitInt is to lower it at the start of the
bitintlower pass, with a small tweak in the switchlower pass
to transform jump tables from ones indexed by large/huge _BitInt into ones
indexed by unsigned long long; switchlower never creates clusters with a range
which doesn't fit into 64 bits, which makes this possible.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55561|0   |1
is obsolete||

--- Comment #83 from Jakub Jelinek  ---
Created attachment 55562
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55562&action=edit
gcc14-bitint-wip.patch

Now with support for passing large/huge _BitInt(N) INTEGER_CSTs as function
arguments (although the RTL could be improved later), -fnon-call-exceptions
support for large/huge _BitInt(N) loads/stores/divide/modulo and large/huge
_BitInt(N) -> floating point conversions and support for uninited large/huge
_BitInt SSA_NAMEs.
Next will be ubsan and __builtin_*_overflow.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-17 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55545|0   |1
is obsolete||

--- Comment #82 from Jakub Jelinek  ---
Created attachment 55561
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55561&action=edit
gcc14-bitint-wip.patch

Remaining _BitInt to floating point conversions.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55542|0   |1
is obsolete||

--- Comment #81 from Jakub Jelinek  ---
Created attachment 55545
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55545&action=edit
gcc14-bitint-wip.patch

_BitInt -> double conversion (float, long double, __float128, _Float16 and
__bf16 conversions still to be implemented).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-14 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

   What|Removed |Added

  Attachment #55538|0   |1
is obsolete||
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #80 from Jakub Jelinek  ---
Created attachment 55542
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55542=edit
gcc14-bitint-wip.patch

Now float,double,long double,__float128 -> {signed,unsigned} _BitInt(N)
conversions seem to work (at least on the testsuite coverage).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-13 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55530|0                           |1
        is obsolete|                            |

--- Comment #79 from Jakub Jelinek  ---
Created attachment 55538
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55538=edit
gcc14-bitint-wip.patch

double -> large/huge signed/unsigned _BitInt support.
Works on 8 conversions, will need to add larger testcase coverage and when
happy with double, add it for float, long double and __float128 as well.
And then large/huge signed/unsigned _BitInt support -> floating point.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55522|0                           |1
        is obsolete|                            |

--- Comment #78 from Jakub Jelinek  ---
Created attachment 55530
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55530=edit
gcc14-bitint-wip.patch

Division/modulo should now work too.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-11 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55500|0                           |1
        is obsolete|                            |

--- Comment #77 from Jakub Jelinek  ---
Created attachment 55522
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55522=edit
gcc14-bitint-wip.patch

Working multiplication now, division/modulo next.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55499|0                           |1
        is obsolete|                            |

--- Comment #76 from Jakub Jelinek  ---
Created attachment 55500
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55500=edit
gcc14-bitint-wip.patch

Now with support for INTEGER_CST PHI arguments.
Will start work on large/huge _BitInt multiplication/division next.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-07 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55482|0                           |1
        is obsolete|                            |

--- Comment #75 from Jakub Jelinek  ---
Created attachment 55499
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55499=edit
gcc14-bitint-wip.patch

Cast fixes, now it passes the whole testsuite.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-07-05 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55435|0                           |1
        is obsolete|                            |

--- Comment #74 from Jakub Jelinek  ---
Created attachment 55482
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55482=edit
gcc14-bitint-wip.patch

Further progress, bitint-22.c ICEs, so there is still further work needed in
handle_cast, but getting closer.  Also, fixed up the liveness analysis during
bitint coalescing.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-30 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55427|0                           |1
        is obsolete|                            |

--- Comment #73 from Jakub Jelinek  ---
Created attachment 55435
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55435=edit
gcc14-bitint-wip.patch

WIP on casts, casts from non-_BitInt integers or small/middle _BitInt to
large/huge _BitInt tested (at least when used as operands of mergeable
operations or comparisons/shifts/stores), cast between different precision
large/huge _BitInt implemented but so far untested, casts from large/huge
_BitInt to non-_BitInt integers or small/middle _BitInt yet to be implemented.
After that multiplication/division/modulo, then casts from/to floating point.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-29 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55416|0                           |1
        is obsolete|                            |

--- Comment #72 from Jakub Jelinek  ---
Created attachment 55427
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55427=edit
gcc14-bitint-wip.patch

Testsuite coverage for shifts and </<=/>/>= comparisons and associated fixes
discovered by that.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-28 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55392|0                           |1
        is obsolete|                            |

--- Comment #71 from Jakub Jelinek  ---
Created attachment 55416
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55416=edit
gcc14-bitint-wip.patch

Updated patch which newly handles arbitrary shifts and debug info.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #70 from Jakub Jelinek  ---
For right shifts, I wonder if we shouldn't emit inline (perhaps with exception
of -Os) something like:
__attribute__((noipa)) void
ashiftrt575 (unsigned long *p, unsigned long *q, int n)
{
  int prec = 575;
  int n1 = n & 63;
  int n2 = n / 64;
  int n3 = n1 != 0;
  int n4 = (-n1) & 63;
  unsigned long ext;
  int i;
  for (i = n2; i < prec / 64 - n3; ++i)
    p[i - n2] = (q[i] >> n1) | (q[i + n3] << n4);
  ext = ((signed long) (q[prec / 64] << (64 - (prec & 63)))) >> (64 - (prec & 63));
  if (n1 && i == prec / 64 - n3)
    {
      p[i - n2] = (q[i] >> n1) | (ext << n4);
      ++i;
    }
  i -= n2;
  p[i] = ((signed long) ext) >> n1;
  ext = ((signed long) ext) >> 63;
  for (++i; i < prec / 64 + 1; ++i)
    p[i] = ext;
}

__attribute__((noipa)) void
lshiftrt575 (unsigned long *p, unsigned long *q, int n)
{
  int prec = 575;
  int n1 = n & 63;
  int n2 = n / 64;
  int n3 = n1 != 0;
  int n4 = (-n1) & 63;
  unsigned long ext;
  int i;
  for (i = n2; i < prec / 64 - n3; ++i)
    p[i - n2] = (q[i] >> n1) | (q[i + n3] << n4);
  ext = q[prec / 64] & ((1UL << (prec % 64)) - 1);
  if (n1 && i == prec / 64 - n3)
    {
      p[i - n2] = (q[i] >> n1) | (ext << n4);
      ++i;
    }
  i -= n2;
  p[i] = ext >> n1;
  ext = 0;
  for (++i; i < prec / 64 + 1; ++i)
    p[i] = 0;
}
(for _BitInt(575) and 64-bit limb little endian).  If the shift count is
constant, it will allow further optimizations,
and if e.g. get_nonzero_bits tells us that n is variable but multiple of limb
precision, we can optimize some more as well.
Looking at what LLVM does, they seem to sign extend in memory to twice as many
bits and then just use an unrolled loop without any conditionals, but that
doesn't look good for memory usage etc.
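
As a rough illustration of the "multiple of limb precision" case mentioned
above: with n1 == 0 the logical right shift degenerates to limb moves plus
masking of the partial top limb.  This is only a sketch assuming 64-bit
little-endian limbs; the name lshiftrt575_limbs, the PREC/LIMBS constants and
the use of uint64_t are illustrative and not taken from the patch.

```c
#include <stdint.h>

enum { PREC = 575, LIMBS = PREC / 64 + 1 };  /* 9 limbs, top limb holds 63 bits */

/* Logical right shift of a 575-bit value by N bits, where N is assumed
   to be a multiple of 64 with 0 <= N < PREC.  Hypothetical helper.  */
void
lshiftrt575_limbs (uint64_t *p, const uint64_t *q, int n)
{
  int n2 = n / 64;  /* number of whole limbs shifted out */
  /* Read and mask the partial most significant limb first.  */
  uint64_t top = q[LIMBS - 1] & ((UINT64_C (1) << (PREC % 64)) - 1);
  int i;
  for (i = 0; i + n2 < LIMBS - 1; ++i)
    p[i] = q[i + n2];  /* plain limb moves, no bit shifting needed */
  p[i] = top;
  for (++i; i < LIMBS; ++i)
    p[i] = 0;          /* zero fill above the shifted value */
}
```

With n1 == 0 the generic lshiftrt575 above performs the same stores, so this
variant only drops the per-limb shift and or.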

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55386|0                           |1
        is obsolete|                            |

--- Comment #69 from Jakub Jelinek  ---
Created attachment 55392
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55392=edit
gcc14-bitint-wip.patch

Now with some runtime test coverage for +,-,|,^,&,~,<

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-22 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55376|0                           |1
        is obsolete|                            |

--- Comment #68 from Jakub Jelinek  ---
Created attachment 55386
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55386=edit
gcc14-bitint-wip.patch

Further progress, this handles also constants, left shifts by small amount (0
to limb_prec - 1) and ==/!= comparisons and calls.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-20 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55364|0                           |1
        is obsolete|                            |

--- Comment #67 from Jakub Jelinek  ---
Created attachment 55376
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55376=edit
gcc14-bitint-wip.patch

Further update which handles additions/subtractions/negations.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-19 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55329|0                           |1
        is obsolete|                            |

--- Comment #66 from Jakub Jelinek  ---
Created attachment 55364
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55364=edit
gcc14-bitint-wip.patch

Updated patch.  This can already do some simple lowering of the large/huge
_BitInt operations, like:
void
foo (_BitInt(192) *x, _BitInt(192) *y, _BitInt(135) *z, _BitInt(135) *w)
{
  x[0] &= y[0];
  x[1] |= y[1];
  x[2] ^= y[2];
  x[3] = ~y[3];
  z[0] &= w[0];
  z[1] |= w[1];
  z[2] ^= w[2];
  z[3] = ~w[3];
}

_BitInt(517) a, b, c, d, e, f;

void
bar (void)
{
  a &= b;
  c |= b;
  d ^= b;
  e = ~f;
}

Additions/subtractions/left shift by small constant next.
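
To give an idea of what such lowering amounts to, here is a hand-written
sketch of z[0] &= w[0] and z[3] = ~w[3] for an unsigned _BitInt(135) stored as
three 64-bit little-endian limbs.  The function names are made up, and the
masking of the padding bits in the top limb is an assumption; whether padding
has to be kept zeroed is an ABI question and the real pass may do it
differently.

```c
#include <stdint.h>

enum { PREC = 135, LIMBS = (PREC + 63) / 64 };  /* 3 limbs, top limb has 7 bits */

/* r = a & b, one limb at a time.  If the padding bits of A and B are
   zero, the padding bits of the result are zero as well.  */
void
and135 (uint64_t *r, const uint64_t *a, const uint64_t *b)
{
  for (int i = 0; i < LIMBS; ++i)
    r[i] = a[i] & b[i];
}

/* r = ~a for an unsigned value.  The complement sets the padding bits,
   so this sketch clears them again in the most significant limb.  */
void
not135 (uint64_t *r, const uint64_t *a)
{
  for (int i = 0; i < LIMBS - 1; ++i)
    r[i] = ~a[i];
  r[LIMBS - 1] = ~a[LIMBS - 1] & ((UINT64_C (1) << (PREC % 64)) - 1);
}
```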

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55327|0                           |1
        is obsolete|                            |

--- Comment #65 from Jakub Jelinek  ---
Created attachment 55329
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55329=edit
gcc14-bitint-wip.patch

Sorry for the false alarm, that was my screw-up on the coalescing side, now
fixed.
Here is an updated version, which already creates the temporary variables for
each of the partitions, so the next step will be to start implementing the
operations.
One thing I still have to figure out is loads from memory into large/huge
_BitInt.  I think we could in that case avoid copying into a temporary
VAR_DECL if we can prove that in all the use stmts of them the memory they are
loading from couldn't be clobbered (and for the case of a loop merging
multiple operations together the last statement from those), but those
statements might very well not have vops, so I'm unsure how to find out the
current vop SSA_NAME so that I can ask the alias oracle.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55240|0                           |1
        is obsolete|                            |
  Attachment #55244|0                           |1
        is obsolete|                            |

--- Comment #64 from Jakub Jelinek  ---
Created attachment 55327
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55327=edit
gcc14-bitint-wip.patch

Some further progress.  I found that out of SSA coalescing coalesces only a
very small subset of SSA_NAMEs; for _BitInt we need to coalesce significantly
more and try to use as few VAR_DECL arrays as possible so that we don't blow
up stack sizes.

So, I'm trying to find the large/huge _BitInt SSA_NAMEs, quickly find out some
which won't be needed as they could be handled inside of a single loop (to be
improved later), then do aggressive coalescing on those and eventually map
those SSA_NAMEs to VAR_DECLs.

On
void
foo (_BitInt(192) *x, _BitInt(192) *y, _BitInt(135) *z, _BitInt(135) *w)
{
  _BitInt(192) a;
  if (x[0] == y[0])
    a = 123wb;
  else if (x[0] == y[1])
    a = y[2];
  else if (x[0] == y[2])
    a = y[3];
  else
    a = 0wb;
  x[4] = a;
  x[5] = x[0] == y[0] ? x[6] : x[0] == y[1] ? x[7] : x[0] == y[2] ? x[8] : x[9];
  x[0] &= y[0];
  x[1] |= y[1];
  x[2] ^= y[2];
  x[3] = ~y[3];
  z[0] &= w[0];
  z[1] |= w[1];
  z[2] ^= w[2];
  z[3] = ~w[3];
}
I'm seeing weird results though, e.g.
  _1 = *x_32(D);
  _2 = *y_33(D);
  if (_1 == _2)
but
After Coalescing:

Partition map

Partition 0 (_1 - 1 2 3 4 5 6 7 8 10 11 13 14 16 29 30 34 35 37 38 39 40 )
Partition 1 (_9 - 9 )
Partition 2 (_12 - 12 )
Partition 3 (_15 - 15 )
Partition 4 (_17 - 17 )
Partition 5 (_18 - 18 19 21 22 24 25 27 )
Partition 6 (_20 - 20 )
Partition 7 (_23 - 23 )
Partition 8 (_26 - 26 )
Partition 9 (_28 - 28 )
Partition 10 (x_32(D) - 32 )
Partition 11 (y_33(D) - 33 )
Partition 12 (z_46(D) - 46 )
Partition 13 (w_47(D) - 47 )

Obviously, _1 and _2 need to conflict because they have overlapping live ranges
(sure, later on loads from memory should be handled in a smarter way, no need
to copy it into another array if at the point of a single use within the same
bb (at least) the memory couldn't be clobbered yet).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #63 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #62)
> What the patch including incremental one currently does is:
> 1) small _BitInt (on x86-64 N <= 64) - the BITINT_TYPEs are kept as is in
>    the IL and expanded, they always have non-BLKmode (QI/HI/SI/DI) and are
>    handled like any other INTEGER_TYPEs (except preserved in calls to ensure
>    correct ABI passing)
> 2) middle _BitInt (on x86-64 N <= 128) - I keep in the IL just copy
>    operations and casts between them and INTEGER_TYPE (TImode in this case),
>    actual arithmetics is done on the INTEGER_TYPE
> 3) large _BitInt (on x86-64 that will be <= 255) - the intent was using
>    BIT_FIELD_REFs/BIT_INSERT_EXPR to make the IL simple and perform stuff on
>    the up to 4 limbs in this case in straight line code

So these large _BitInt already have BLKmode?  If so I'd suggest to
initially handle them like the huge _BitInt code but "unrolled" and
iterate on the code-gen later - I can have a look once one can play
with the actual code and testcases.

> 4) huge _BitInt (on x86-64 N > 255) use loops, VAR_DECL destination,
>    VCE+ARRAY_REF on the sources,
>    dunno yet if I can get good code by making the VAR_DECL clobbered
>    immediately after I load a SSA_NAME from it (whether out of SSA/expansion
>    could then extend the lifetime of the VAR_DECL, or if I should have some
>    pass do that, or the bitint pass figure out the last use and put clobber
>    only after that, or even replace the SSA_NAME uses with accesses to
>    VAR_DECL
> 
> Anyway, I think I'll work now on the add/sub with carry now and continue on
> _BitInt only after that.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-05 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #62 from Jakub Jelinek  ---
What the patch including incremental one currently does is:
1) small _BitInt (on x86-64 N <= 64) - the BITINT_TYPEs are kept as is in the
   IL and expanded, they always have non-BLKmode (QI/HI/SI/DI) and are handled
   like any other INTEGER_TYPEs (except preserved in calls to ensure correct
   ABI passing)
2) middle _BitInt (on x86-64 N <= 128) - I keep in the IL just copy operations
   and casts between them and INTEGER_TYPE (TImode in this case), actual
   arithmetic is done on the INTEGER_TYPE
3) large _BitInt (on x86-64 that will be N <= 255) - the intent was using
   BIT_FIELD_REFs/BIT_INSERT_EXPR to make the IL simple and perform stuff on
   the up to 4 limbs in this case in straight line code
4) huge _BitInt (on x86-64 N > 255) use loops, VAR_DECL destination,
   VCE+ARRAY_REF on the sources,
   dunno yet if I can get good code by making the VAR_DECL clobbered
   immediately after I load a SSA_NAME from it (whether out of SSA/expansion
   could then extend the lifetime of the VAR_DECL, or if I should have some
   pass do that, or the bitint pass figure out the last use and put clobber
   only after that, or even replace the SSA_NAME uses with accesses to
   VAR_DECL)

Anyway, I think I'll work on the add/sub with carry now and continue on
_BitInt only after that.
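
For reference, the four categories can be written down as a tiny classifier.
This is just a sketch using the x86-64 thresholds quoted above; the enum and
function names are invented for illustration and other targets would use
different limits.

```c
/* Hypothetical names; thresholds are the x86-64 ones from this comment.  */
enum bitint_kind
{
  BITINT_SMALL,   /* N <= 64: fits a single QI/HI/SI/DI mode */
  BITINT_MIDDLE,  /* N <= 128: arithmetic done on a TImode INTEGER_TYPE */
  BITINT_LARGE,   /* N <= 255: up to 4 limbs, straight line code */
  BITINT_HUGE     /* N > 255: lowered to loops over limbs */
};

enum bitint_kind
classify_bitint_x86_64 (int n)
{
  if (n <= 64)
    return BITINT_SMALL;
  if (n <= 128)
    return BITINT_MIDDLE;
  if (n <= 255)
    return BITINT_LARGE;
  return BITINT_HUGE;
}
```

E.g. _BitInt(192) from the earlier testcases would be large and _BitInt(517)
or _BitInt(575) huge under these limits.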

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-05 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #61 from rguenther at suse dot de  ---
On Mon, 5 Jun 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989
> 
> --- Comment #60 from Jakub Jelinek  ---
> (In reply to Richard Biener from comment #59)
> > Oh, so BITINT_TYPE is INTEGRAL_TYPE_P but not INTEGER_TYPE (I think we
> > don't have any BLKmode integer types?).
> 
> Yes.  Some BITINT_TYPEs have BLKmode.
> 
> > I think the intent was to
> > restrict the operation on actual mode entities, BLKmode means memory
> > where it isn't necessary to restrict things.  So you could add
> > a BLKmode exception here (but then for _BitInt<63> you will likely
> > get DImode?)
> 
> Sure, _BitInt<63> has DImode, _BitInt<127> has TImode if it is supported.
> TYPE_MODE is set according to the rules for structures (so that it would help
> with function_arg etc. implementation on some targets), so I think say OImode
> for _BitInt<254> isn't impossible.
> 
> > Can't you use a MEM_REF to extract limb-size INTEGER_TYPE from the
> > _BitInt<> and then operate on those with BIT_FIELD_REF and BIT_INSERT_EXPR?
> > Of course when the whole _BitInt<> is a SSA name MEM_REF won't work
> > (but when you use ARRAY_REF/VIEW_CONVERT the same holds true)
> 
> I wanted to avoid forcing the smaller _BitInt results into VAR_DECLs and only
> do it for the ones where I'd use loops (the huge category).
> The plan for loops is to do 2 limbs per iteration initially, plus if there is
> odd number of limbs or even with partial limb 1-2 limbs done after the loop.
> So, the large category where loop isn't used would be up to 3 full limbs or
> 3 full limbs + 1 partial.

So for the large case you are not using BIT_FIELD_REF on _BitInt<>?  But
for the small case like _BitInt<63> with DImode you want to do that
and also the variables are likely in SSA form, right?  How is
endianness defined?  Probably per limb?  Consider

unsigned _BitInt<16> a, b;
unsigned _BitInt<32> c;

c = ((_BitInt<32>)a << 16) | (_BitInt<32>)b;

(not sure whether the cast are required).  It's all difficult enough
if you don't need to wrap your heads around padding.  Note that followup
optimization passes will refrain from touching the 
!type_has_mode_precision cases because of padding.  So I think it
would be good to work on full-limb precision for the actual
operations.  It should be possible to VIEW_CONVERT a
_BitInt<63> to _BitInt<64>, aka VIEW_CONVERT to the mode's precision
bit-int variant here (or to the actual integer mode integer type
which would be even better).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-05 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #60 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #59)
> Oh, so BITINT_TYPE is INTEGRAL_TYPE_P but not INTEGER_TYPE (I think we
> don't have any BLKmode integer types?).

Yes.  Some BITINT_TYPEs have BLKmode.

> I think the intent was to
> restrict the operation on actual mode entities, BLKmode means memory
> where it isn't necessary to restrict things.  So you could add
> a BLKmode exception here (but then for _BitInt<63> you will likely
> get DImode?)

Sure, _BitInt<63> has DImode, _BitInt<127> has TImode if it is supported.
TYPE_MODE is set according to the rules for structures (so that it would help
with function_arg etc. implementation on some targets), so I think say OImode
for _BitInt<254> isn't impossible.

> Can't you use a MEM_REF to extract limb-size INTEGER_TYPE from the
> _BitInt<> and then operate on those with BIT_FIELD_REF and BIT_INSERT_EXPR?
> Of course when the whole _BitInt<> is a SSA name MEM_REF won't work
> (but when you use ARRAY_REF/VIEW_CONVERT the same holds true)

I wanted to avoid forcing the smaller _BitInt results into VAR_DECLs and only
do it for the ones where I'd use loops (the huge category).
The plan for loops is to do 2 limbs per iteration initially, plus if there is
an odd number of limbs or an even number with a partial limb, 1-2 limbs done
after the loop.  So, the large category where a loop isn't used would be up to
3 full limbs or 3 full limbs + 1 partial.
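
A hand-written sketch of that loop shape, using bitwise or on 64-bit
little-endian limbs.  It assumes the most significant limb is always handled
after the loop (since it may be partial), together with one more limb when the
count of the remaining limbs is odd; that split, the function names and the
zeroing of padding bits are my reading of the plan, not code from the patch.

```c
#include <stdint.h>

/* r = a | b for an unsigned value of PREC bits (PREC >= 1).
   Hypothetical helper.  */
void
ior_huge (uint64_t *r, const uint64_t *a, const uint64_t *b, int prec)
{
  int limbs = (prec + 63) / 64;
  int paired = (limbs - 1) & ~1;  /* limbs covered by the loop, always even */
  int i;
  for (i = 0; i < paired; i += 2)
    {
      r[i] = a[i] | b[i];
      r[i + 1] = a[i + 1] | b[i + 1];
    }
  /* One or two limbs remain; only the last one can be partial.  */
  if (i < limbs - 1)
    {
      r[i] = a[i] | b[i];
      ++i;
    }
  uint64_t mask = (prec % 64
		   ? (UINT64_C (1) << (prec % 64)) - 1 : ~UINT64_C (0));
  r[i] = (a[i] | b[i]) & mask;
}

/* Number of limbs left for the code after the loop: always 1 or 2.  */
int
tail_limbs (int prec)
{
  int limbs = (prec + 63) / 64;
  return limbs - ((limbs - 1) & ~1);
}
```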

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-05 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #59 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #58)
> (In reply to Richard Biener from comment #57)
> > (In reply to Jakub Jelinek from comment #56)
> > > Created attachment 55244 [details]
> > > gcc14-bitint-wip-inc.patch
> > > 
> > > Incremental patch on top of the above patch.
> > > 
> > > I've tried to make some progress and implement the simplest large _BitInt
> > > cases, &/|/^/~, but ran into a problem there, both BIT_FIELD_REF and
> > > BIT_INSERT_EXPR disallow operating on non-mode precisions, while for
> > > _BitInt I think it would be really useful to use them on the large/huge
> > > _BitInts (which I will force into memory during expansion most likely).
> > > Sure, for huge _BitInts, what is handled in the loop will use either
> > > ARRAY_REF on VIEW_CONVERT_EXPR for operands or TARGET_MEM_REFs on
> > > VAR_DECLs for the results in the loop, but even for those there is the
> > > partial most significant limb in some cases that needs to be handled
> > > separately.
> > > 
> > > So, do you think it is ok to make an exception for
> > > BIT_FIELD_REF/BIT_INSERT_EXPR and allow them on non-mode precision
> > > BITINT_TYPEs (the incremental patch enables that) plus handle it during
> > > the expansion?
> > 
> > The incremental patch doesn't implement the expansion part, right?  The
> 
> Not yet.
> 
> > problem is that BIT_* are specified to work on the in-memory representation
> > and a non-mode precision entity doesn't have this specified - you'd have
> > to extend / shift that to some mode to be able to store it.
> 
> One thing is that the checking and expansion constraints preclude using it
> even on full limbs of a _BitInt which has precision in multiples of limb
> precision.  Say _BitInt(192) has on x86-64 3 64-bit limbs, but the type
> necessarily has BLKmode, because there are no 192-bit modes.
> If we allowed BIT_FIELD_REF/BIT_INSERT_EXPR on non-type_has_mode_precision_p
> BITINT_TYPEs, perhaps we could restrict it to the cases we really need and
> which can be easily implemented.  That is, they'd need to extract or insert
> bits within the same single limb, making it effectively operate on mode
> precision of the limb for all the limbs other than the most significant
> partial one if any, and in the case of the most significant one it could
> either ignore the padding bits above it or sign/zero extend into the padding
> bits when touching the MSB bit (depending on if target says those bits are
> well defined or undefined).

Oh, so BITINT_TYPE is INTEGRAL_TYPE_P but not INTEGER_TYPE (I think we
don't have any BLKmode integer types?).  I think the intent was to
restrict the operation on actual mode entities, BLKmode means memory
where it isn't necessary to restrict things.  So you could add
a BLKmode exception here (but then for _BitInt<63> you will likely
get DImode?)

Can't you use a MEM_REF to extract limb-size INTEGER_TYPE from the
_BitInt<> and then operate on those with BIT_FIELD_REF and BIT_INSERT_EXPR?
Of course when the whole _BitInt<> is a SSA name MEM_REF won't work
(but when you use ARRAY_REF/VIEW_CONVERT the same holds true)

> > Improving code-gen for add-with carry would be indeed nice, I'm not sure
> > we need the user-visible builtins though, matching the open-coded variants
> > to appropriate IFNs would work.  But can the _OVERFLOW variants not be
> > used here, at least for unsigned?
> 
> Yeah, just noticed the clang builtins are badly designed, see PR79173 for
> that, so will try to introduce new ifns and pattern detect them somewhere.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-05 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #58 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #57)
> (In reply to Jakub Jelinek from comment #56)
> > Created attachment 55244 [details]
> > gcc14-bitint-wip-inc.patch
> > 
> > Incremental patch on top of the above patch.
> > 
> > I've tried to make some progress and implement the simplest large _BitInt
> > cases, &/|/^/~, but ran into a problem there, both BIT_FIELD_REF and
> > BIT_INSERT_EXPR disallow operating on non-mode precisions, while for
> > _BitInt I think it would be really useful to use them on the large/huge
> > _BitInts (which I will force into memory during expansion most likely).
> > Sure, for huge _BitInts, what is handled in the loop will use either
> > ARRAY_REF on VIEW_CONVERT_EXPR for operands or TARGET_MEM_REFs on VAR_DECLs
> > for the results in the loop, but even for those there is the partial most
> > significant limb in some cases that needs to be handled separately.
> > 
> > So, do you think it is ok to make an exception for
> > BIT_FIELD_REF/BIT_INSERT_EXPR and allow them on non-mode precision
> > BITINT_TYPEs (the incremental patch enables that) plus handle it during
> > the expansion?
> 
> The incremental patch doesn't implement the expansion part, right?  The

Not yet.

> problem is that BIT_* are specified to work on the in-memory representation
> and a non-mode precision entity doesn't have this specified - you'd have
> to extend / shift that to some mode to be able to store it.

One thing is that the checking and expansion constraints preclude using it
even on full limbs of a _BitInt which has precision in multiples of limb
precision.  Say _BitInt(192) has on x86-64 3 64-bit limbs, but the type
necessarily has BLKmode, because there are no 192-bit modes.
If we allowed BIT_FIELD_REF/BIT_INSERT_EXPR on non-type_has_mode_precision_p
BITINT_TYPEs, perhaps we could restrict it to the cases we really need and
which can be easily implemented.  That is, they'd need to extract or insert
bits within the same single limb, making it effectively operate on mode
precision of the limb for all the limbs other than the most significant
partial one if any, and in the case of the most significant one it could
either ignore the padding bits above it or sign/zero extend into the padding
bits when touching the MSB bit (depending on if target says those bits are
well defined or undefined).
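
To make the last point concrete, here is a small sketch of extending into the
padding bits of a partial most significant limb, assuming 64-bit limbs.  The
helper names are made up; BITS is the number of value bits in the limb, e.g. 7
for the top limb of _BitInt(135).

```c
#include <stdint.h>

/* Zero-extend the low BITS (1..64) bits of LIMB into the padding.  */
uint64_t
zext_top_limb (uint64_t limb, int bits)
{
  return bits == 64 ? limb : limb & ((UINT64_C (1) << bits) - 1);
}

/* Sign-extend the low BITS (1..64) bits of LIMB into the padding,
   using only unsigned arithmetic.  */
uint64_t
sext_top_limb (uint64_t limb, int bits)
{
  if (bits == 64)
    return limb;
  uint64_t sign = UINT64_C (1) << (bits - 1);
  limb &= (sign << 1) - 1;       /* drop whatever was in the padding */
  return (limb ^ sign) - sign;   /* replicate the sign bit upwards */
}
```

A target which leaves the padding bits undefined could skip both on stores
and would instead need one of them on loads of the top limb.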

> Improving code-gen for add-with carry would be indeed nice, I'm not sure
> we need the user-visible builtins though, matching the open-coded variants
> to appropriate IFNs would work.  But can the _OVERFLOW variants not be
> used here, at least for unsigned?

Yeah, just noticed the clang builtins are badly designed, see PR79173 for that,
so will try to introduce new ifns and pattern detect them somewhere.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-05 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #57 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #56)
> Created attachment 55244 [details]
> gcc14-bitint-wip-inc.patch
> 
> Incremental patch on top of the above patch.
> 
> I've tried to make some progress and implement the simplest large _BitInt
> cases, &/|/^/~, but ran into a problem there, both BIT_FIELD_REF and
> BIT_INSERT_EXPR disallow operating on non-mode precisions, while for _BitInt
> I think it would be really useful to use them on the large/huge _BitInts
> (which I will force into memory during expansion most likely).  Sure, for
> huge _BitInts, what is handled in the loop will use either ARRAY_REF on
> VIEW_CONVERT_EXPR for operands or TARGET_MEM_REFs on VAR_DECLs for the
> results in the loop, but even for those there is the partial most
> significant limb in some cases that needs to be handled separately.
> 
> So, do you think it is ok to make an exception for
> BIT_FIELD_REF/BIT_INSERT_EXPR and allow them on non-mode precision
> BITINT_TYPEs (the incremental patch enables that) plus handle it during the
> expansion?

The incremental patch doesn't implement the expansion part, right?  The
problem is that BIT_* are specified to work on the in-memory representation
and a non-mode precision entity doesn't have this specified - you'd have
to extend / shift that to some mode to be able to store it.

So to extract from or insert into some bit-precision entity you have to
perform this conversion somehow.  Why do you have this anyway?  Is it
really that the ABIs(?) allow for the padding up to limb size of the
partial limb to be not present (aka in unmapped memory?)?  Why can't
you work on the full limb and just "pollute" the padding at will
but then also zero-extend on loads?

> Another thing, started to think about PLUS_EXPR/MINUS_EXPR, we have
> __builtin_ia32_addcarryx_u64/__builtin_ia32_sbb_u64 builtins on x86-64, but
> from what I can see don't really pattern recognize even simple add + adc.
> 
> Given:
> void
> foo (unsigned long *p, unsigned long *q, unsigned long *r)
> {
>   unsigned long p0 = p[0], q0 = q[0];
>   unsigned long p1 = p[1], q1 = q[1];
>   unsigned long r0 = p0 + q0;
>   unsigned long r1 = p1 + q1 + (r0 < p0);
>   r[0] = r0;
>   r[1] = r1;
> }
> 
> void
> bar (unsigned long *p, unsigned long *q, unsigned long *r)
> {
>   unsigned long p0 = p[0], q0 = q[0];
>   unsigned long p1 = p[1], q1 = q[1];
>   unsigned long p2 = p[2], q2 = q[2];
>   unsigned long r0 = p0 + q0;
>   unsigned long r1 = p1 + q1 + (r0 < p0);
>   unsigned long r2 = p2 + q2 + (r1 < p1 || r1 < q1);
>   r[0] = r0;
>   r[1] = r1;
>   r[2] = r2;
> }
> 
> llvm seems to pattern recognize foo, but doesn't pattern recognize bar as
> add; adc; adc (is that actually a correct C for that though?).
> 
> So, shouldn't we implement the clang's
> https://clang.llvm.org/docs/LanguageExtensions.html#multiprecision-arithmetic-builtins
> builtins (at least the __builtin_{add,sub}c{,l,ll} builtins), lower them
> into ifns early (similarly to .{ADD,SUB}_OVERFLOW returning complex integer
> with 2 returns) and add optabs so that targets can implement those
> efficiently?

Improving code-gen for add-with carry would be indeed nice, I'm not sure
we need the user-visible builtins though, matching the open-coded variants
to appropriate IFNs would work.  But can the _OVERFLOW variants not be
used here, at least for unsigned?

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-02 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #56 from Jakub Jelinek  ---
Created attachment 55244
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55244=edit
gcc14-bitint-wip-inc.patch

Incremental patch on top of the above patch.

I've tried to make some progress and implement the simplest large _BitInt
cases, &/|/^/~, but ran into a problem there, both BIT_FIELD_REF and
BIT_INSERT_EXPR disallow operating on non-mode precisions, while for _BitInt I
think it would be really useful to use them on the large/huge _BitInts (which
I will force into memory during expansion most likely).  Sure, for huge
_BitInts, what is handled in the loop will use either ARRAY_REF on
VIEW_CONVERT_EXPR for operands or TARGET_MEM_REFs on VAR_DECLs for the results
in the loop, but even for those there is the partial most significant limb in
some cases that needs to be handled separately.

So, do you think it is ok to make an exception for
BIT_FIELD_REF/BIT_INSERT_EXPR and allow them on non-mode precision
BITINT_TYPEs (the incremental patch enables that) plus handle it during the
expansion?

Another thing, started to think about PLUS_EXPR/MINUS_EXPR, we have
__builtin_ia32_addcarryx_u64/__builtin_ia32_sbb_u64 builtins on x86-64, but
from what I can see we don't really pattern recognize even simple add + adc.

Given:
void
foo (unsigned long *p, unsigned long *q, unsigned long *r)
{
  unsigned long p0 = p[0], q0 = q[0];
  unsigned long p1 = p[1], q1 = q[1];
  unsigned long r0 = p0 + q0;
  unsigned long r1 = p1 + q1 + (r0 < p0);
  r[0] = r0;
  r[1] = r1;
}

void
bar (unsigned long *p, unsigned long *q, unsigned long *r)
{
  unsigned long p0 = p[0], q0 = q[0];
  unsigned long p1 = p[1], q1 = q[1];
  unsigned long p2 = p[2], q2 = q[2];
  unsigned long r0 = p0 + q0;
  unsigned long r1 = p1 + q1 + (r0 < p0);
  unsigned long r2 = p2 + q2 + (r1 < p1 || r1 < q1);
  r[0] = r0;
  r[1] = r1;
  r[2] = r2;
}

LLVM seems to pattern recognize foo, but doesn't pattern recognize bar as
add; adc; adc (is that actually correct C for that though?).

So, shouldn't we implement the clang's
https://clang.llvm.org/docs/LanguageExtensions.html#multiprecision-arithmetic-builtins
builtins (at least the __builtin_{add,sub}c{,l,ll} builtins), lower them into
ifns early (similarly to .{ADD,SUB}_OVERFLOW returning complex integer with 2
returns) and add optabs so that targets can implement those efficiently?
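
On the question of whether bar is correct C for a three-limb add, a small
check suggests the expression r1 < p1 || r1 < q1 misses one input: when p1 and
q1 are both ULONG_MAX and the carry in is 1, r1 wraps back to ULONG_MAX, so
neither comparison fires although a carry out occurs.  The sketch below
compares that expression with a two-step reference; both helper names are
made up for illustration.

```c
/* Carry out of p1 + q1 + cin, computed with the expression bar uses for
   r2 (cin is 0 or 1).  */
int
carry_as_in_bar (unsigned long p1, unsigned long q1, unsigned long cin)
{
  unsigned long r1 = p1 + q1 + cin;
  return r1 < p1 || r1 < q1;
}

/* Reference carry: check each of the two additions separately.  */
int
carry_two_step (unsigned long p1, unsigned long q1, unsigned long cin)
{
  unsigned long t = p1 + q1;
  unsigned long r1 = t + cin;
  return (t < p1) | (r1 < t);
}
```

The two functions agree whenever p1 and q1 are not both ULONG_MAX with a
carry in of 1, which may be why the mismatch is easy to overlook.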

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-02 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #55 from Jakub Jelinek  ---
(In reply to rguent...@suse.de from comment #54)
> At least for -Os we probably want to consider moving everything but 
> small and maybe middle to out of line library functions?

Not sure about that, we need to judge the space savings vs. having all those
routines in libgcc_s.so.1, where the size price would be paid by all processes
even when they don't use the large/huge _BitInt at all.
I certainly plan to have multiplication and division/modulo in libgcc_s.so.1.
Admittedly, some entrypoints could be just in libgcc.a and not libgcc_s.so.1.
Don't we already have a case for that - the DFP stuff?
There are very cheap operations (say bitwise &/|/^/~) which have no
dependencies in between limbs, then some with small dependencies (e.g. +/- or
shifts or rotates by constant), but e.g. shifts/rotates by a variable count
are already going to be ugly, at least for the huge ones.
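
To make the difference in limb dependencies concrete, here is a minimal
sketch assuming 64-bit limbs stored least significant first; limb_t and both
function names are illustrative and not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* assumed 64-bit limb */

/* Bitwise AND: limb i of the result depends only on limb i of the
   inputs.  */
void
and_limbs (limb_t *r, const limb_t *a, const limb_t *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    r[i] = a[i] & b[i];
}

/* Left shift by a constant 0 < k < 64: limb i of the result needs limbs
   i and i - 1 of the input.  Walking from the top limb down keeps the
   loop correct even when r and a are the same array.  */
void
shl_limbs (limb_t *r, const limb_t *a, size_t n, unsigned k)
{
  for (size_t i = n; i-- > 0; )
    {
      limb_t low = i > 0 ? a[i - 1] >> (64 - k) : 0;
      r[i] = (a[i] << k) | low;
    }
}
```

With a variable shift count the limb offset itself becomes a runtime value,
which is where the code gets noticeably more involved.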

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-02 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #54 from rguenther at suse dot de  ---
On Fri, 2 Jun 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989
> 
> Jakub Jelinek  changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>   Attachment #55169|0                           |1
>         is obsolete|                            |
> 
> --- Comment #53 from Jakub Jelinek  ---
> Created attachment 55240
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55240&action=edit
> gcc14-bitint-wip.patch
> 
> Further updates.  This introduces a new bitintlower (and bitintlower0) pass,
> categorizes
> _BitInt types into 4 categories (small, which are kept as is as they work out
> of the box, middle, which have already more than one limb, but there exists
> DImode or TImode
> type which is supported and covers the precision, here lowering is done by
> casting to
> INTEGER_TYPE and back, large which is up to double that size (so it will be
> lowered to straight line code) and huge, which will use loops.  The lowering 
> is
> so far implemented for the middle _BitInts.
> Added some runtime testsuite coverage for the small and middle _BitInts (so on
> x86-64 up to 128 bits).

At least for -Os we probably want to consider moving everything but 
small and maybe middle to out of line library functions?

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-06-02 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55169|0                           |1
        is obsolete|                            |

--- Comment #53 from Jakub Jelinek  ---
Created attachment 55240
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55240&action=edit
gcc14-bitint-wip.patch

Further updates.  This introduces a new bitintlower (and bitintlower0) pass
and categorizes _BitInt types into 4 categories:
small, which are kept as is as they work out of the box;
middle, which already have more than one limb, but for which there exists a
supported DImode or TImode type that covers the precision (here lowering is
done by casting to INTEGER_TYPE and back);
large, which is up to double that size (so it will be lowered to straight
line code);
and huge, which will use loops.
The lowering is so far implemented for the middle _BitInts.
Added some runtime testsuite coverage for the small and middle _BitInts (so on
x86-64 up to 128 bits).
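
The four categories can be pictured with a small classifier.  This is only a
sketch of the description above, not code from the patch: the enum, the
function name and the exact thresholds are assumptions, and "double that
size" is read here as twice the widest supported integer mode.

```c
enum bitint_kind { BITINT_SMALL, BITINT_MIDDLE, BITINT_LARGE, BITINT_HUGE };

/* prec: _BitInt precision in bits.
   limb_prec: bits in one limb (64 on x86-64).
   max_int_prec: precision of the widest supported integer mode
   (128 when TImode is available).  */
enum bitint_kind
classify_bitint (unsigned prec, unsigned limb_prec, unsigned max_int_prec)
{
  if (prec <= limb_prec)
    return BITINT_SMALL;       /* works out of the box */
  if (prec <= max_int_prec)
    return BITINT_MIDDLE;      /* cast to INTEGER_TYPE and back */
  if (prec <= 2 * max_int_prec)
    return BITINT_LARGE;       /* straight line code */
  return BITINT_HUGE;          /* loops */
}
```

Under these assumptions, on x86-64 _BitInt(65) through _BitInt(128) would be
middle and _BitInt(257) would already be huge.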

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #52 from Jakub Jelinek  ---
(In reply to H.J. Lu from comment #14)
> (In reply to jos...@codesourcery.com from comment #13)
> > https://gitlab.com/x86-psABIs/i386-ABI/-/issues/5 to request such an ABI 
> > for 32-bit x86.  I don't know if there are other psABIs with public issue 
> > trackers where such issues can be filed (but we'll need some sensible 
> > default anyway for architectures where we can't get an ABI properly 
> > specified in an upstream-maintained ABI document).
> 
> ia32 psABI will follow x86-64 psABI.

Is it a good idea to use 64-bit limbs and 64-bit alignment for the ia32 ABI?
I mean, it is fine that _BitInt(N) for N in 33..64 has the
size/alignment/passing of long long, but I wonder whether for N > 64 the ABI
shouldn't use 32-bit limbs, 32-bit alignment and passing as a struct
containing the 32-bit limbs.
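
A quick way to see what is at stake is to count the storage needed under each
limb size.  The helper below is a hypothetical illustration that only rounds
the precision up to whole limbs; it does not model any psABI's alignment or
passing rules.

```c
/* Storage in bytes for _BitInt(n) kept as an array of limbs of
   limb_bits bits each (limb_bits is assumed to be a multiple of 8).  */
unsigned
bitint_storage_bytes (unsigned n, unsigned limb_bits)
{
  unsigned limbs = (n + limb_bits - 1) / limb_bits;
  return limbs * (limb_bits / 8);
}
```

For example, _BitInt(96) would take 16 bytes with 64-bit limbs but 12 bytes
with 32-bit limbs.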

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #51 from Jakub Jelinek  ---
Note, I've only tested it so far on
_BitInt(256) a = 0x1234ab461289cdab8d111007b461289cdab8d1wb;
_BitInt(256) b = 0x2385eabcd072311074bcaa385eabcd07111007b46128wb;
_BitInt(384) c = (_BitInt(384)) 0x1234ab461289cdab8d111007b461289cdab8d1wb *
0x2385eabcd072311074bcaa385eabcd07111007b46128wb;
_BitInt(384) d;
extern void __mulbitint3 (unsigned long *, int, const unsigned long *, int,
const unsigned long *, int);

void
foo ()
{
  __mulbitint3 (&d, 384, &a, 256, &b, 196);
}
multiplication, nothing else; I guess it will be easier to test it once we can
emit the call from the compiler.  And obviously there is no testing of the big
endian limb ordering handling until we add some arch that will support it (if
we do that at all).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55151|0                           |1
        is obsolete|                            |

--- Comment #50 from Jakub Jelinek  ---
Created attachment 55169
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55169&action=edit
gcc14-bitint-wip.patch

Update, this time with addition of libgcc _BitInt multiplication libcall (but
not really wiring it on the compiler side yet, that would be part of the new
_BitInt lowering pass).

The function currently is
void __mulbitint3 (__bitint_limb *ret, int retprec, const __bitint_limb *u, int
uprec, const __bitint_limb *v, int vprec);
which allows mixing different precisions (at compile time, or at runtime using
the bitint_reduce_prec function).  While in GIMPLE before the _BitInt lowering
pass MULT_EXPR will obviously have the same precision for the result and both
operands, the lowering pass could spot zero or sign extensions from narrower
_BitInts for the operands, or VRP could figure out smaller ranges of values
for the operands.
Negative uprec or vprec would mean the operand is sign extended from precision
-[uv]prec, positive means it is zero extended from [uv]prec.
u/v could be the same or overlapping, but as the function writes the result
before consuming all inputs, it doesn't allow aliasing between the operands
and the return value.
Also, at least in the x86-64 psABI, _BitInt(N) for N < 64 is special and it
isn't expected that this function would really be used for multiplication of
such _BitInts, but of course when multiplying say _BitInt(512) by _BitInt(24),
it is expected the lowering pass would push those 24 bits into a 64-bit,
64-bit aligned limb and pass 24 for that operand.
For inputs it assumes bits above the precision but still within a limb are
uninitialized (and so zero or sign extends when reading them), for the output
it always writes full limbs (with hopefully proper zero/sign extensions).
The implemented algorithm is the basic schoolbook multiplication; if really
needed, we could do Karatsuba for larger inputs.
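
For readers unfamiliar with the term, schoolbook multiplication on limb arrays
can be sketched as below.  This is not the libgcc routine: it uses 32-bit
limbs with a 64-bit intermediate so it stays portable, handles only unsigned
operands of whole-limb size, and the name mul_schoolbook is made up.  Like the
routine described above, it writes the result while still reading the inputs,
so the result must not alias either operand.

```c
#include <stddef.h>
#include <stdint.h>

/* r = u * v, least significant limb first.  r must have un + vn limbs
   and must not overlap u or v.  */
void
mul_schoolbook (uint32_t *r, const uint32_t *u, size_t un,
                const uint32_t *v, size_t vn)
{
  for (size_t i = 0; i < un + vn; i++)
    r[i] = 0;
  for (size_t i = 0; i < un; i++)
    {
      uint64_t carry = 0;
      for (size_t j = 0; j < vn; j++)
        {
          /* A 32x32-bit product plus one limb plus a carry limb always
             fits in 64 bits.  */
          uint64_t t = (uint64_t) u[i] * v[j] + r[i + j] + carry;
          r[i + j] = (uint32_t) t;
          carry = t >> 32;
        }
      /* r[i + vn] has not been touched by earlier rows yet.  */
      r[i + vn] = (uint32_t) carry;
    }
}
```

The cost is un * vn limb multiplications, which is why Karatsuba only becomes
interesting for large inputs.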

What do you think about this API?
Shall I continue and create similar API for divmod?

Also, I wonder what to do about _BitInt(N) in __builtin_mul_overflow{,_p}.  One
option would be to say that negative retprec is a request to return a nonzero
result for the overflow case, but I wonder how much larger the routine would be
in that case.  Or we could have two routines, one for multiplication and one
for multiplication with overflow checking.  Yet another possibility is to do a
dumb thing on the compiler side: call the multiplication with a temporary
result so large that it would never overflow and check for the overflow on the
caller side.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55148|0                           |1
        is obsolete|                            |

--- Comment #49 from Jakub Jelinek  ---
Created attachment 55151
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55151&action=edit
gcc14-bitint-wip.patch

Added a testcase with various operations with _BitInt(N) operands and tweaked
c-typeck.cc/fold-const.cc to accept those.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-24 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #48 from rguenther at suse dot de  ---
> On 24.05.2023 at 16:18, jakub at gcc dot gnu.org wrote:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989
> 
> --- Comment #47 from Jakub Jelinek  ---
> But then the pass effectively has to do lifetime analysis of the _BitInt(N) 
> for
> N > 128 etc. SSA_NAMEs and perform the partitioning of those SSA_NAMEs into
> VAR_DECLs/PARM_DECLs/RESULT_DECLs, so that we don't blow away the local stack;
> perhaps as you wrote with some local subgraphs turned into a loop which will
> handle multiple operations together instead of just one operation per loop.
> Or just use different VAR_DECLs but stick in clobbers where they will be dead
> and hope out of ssa can merge those.
> Anyway, more work than I hoped.
> Though, perhaps it can be also done incrementally, with bare minimum first and
> improvements later.

Sure, this is just what I think users will expect.  We don’t have the high
level infrastructure to do this afterwards such as loop fusion and variable
contraction (well, in theory graphite can do it but even there we lack actual
transform bits).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #47 from Jakub Jelinek  ---
But then the pass effectively has to do lifetime analysis of the _BitInt(N) for
N > 128 etc. SSA_NAMEs and perform the partitioning of those SSA_NAMEs into
VAR_DECLs/PARM_DECLs/RESULT_DECLs, so that we don't blow away the local stack;
perhaps as you wrote with some local subgraphs turned into a loop which will
handle multiple operations together instead of just one operation per loop.
Or just use different VAR_DECLs but stick in clobbers where they will be dead
and hope out of ssa can merge those.
Anyway, more work than I hoped.
Though, perhaps it can be also done incrementally, with bare minimum first and
improvements later.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-24 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #46 from rguenther at suse dot de  ---
On Wed, 24 May 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989
> 
> --- Comment #45 from Jakub Jelinek  ---
> Let's consider some simple testcase (where one doesn't really mix different
> _BitInt sizes etc.).
> _BitInt(512)
> foo (_BitInt(512) a, _BitInt(512) b, _BitInt(512) c, _BitInt(512) d)
> {
>   return (a + b) - (c + d);
> }
> With the patch, this now ICEs during expansion, because while we can handle
> copying of even the larger _BitInt vars, we don't handle (nor plan to) +/- 
> etc.
> during expansion for that, it would be in the earlier lowering pass.
> If I'd emit straight line code here, I suppose I could use
> BIT_FIELD_REFs/BIT_INSERT_EXPRs, but if I want loopy code, as you wrote 
> perhaps
> ARRAY_REF on VCE could work fine for the input operands, but dunno what to use
> for the
> result of the operation, forcing it into a VAR_DECL I'm afraid will mean we
> can't coalesce it much, the above would force the 2 + results and 1 - result
> into VAR_DECLs.
> Could we e.g. allow BIT_INSERT_EXPRs or have some new ref for this purpose to
> update a single limb in a BITINT_TYPE SSA_NAME?

I think for complex expressions that involve SSA temporaries the lowering
pass has to be more complex as well and gather as much of the expression
as possible so it can avoid _BitInt typed temporaries but instead create

  for (...)
    {
      limb_t tem1 = a[i] + b[i];
      limb_t tem2 = c[i] + d[i];
      limb_t tem3 = tem1 - tem2;
      res[i] = tem3;
    }

but yes, for the result you want to force a VAR_DECL (I suppose
DECL_RESULT for the above example will be one).  I'd probably avoid
rewriting user variables into SSA form and only have temporaries
created by gimplifications in SSA form.  You should be able to use
DECL_NOT_GIMPLE_REG_P to force this and make sure update-address-taken
leaves things this way unless, say, the user variable is only
initialized by a constant?
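
The loop above leaves the carries implicit.  A runnable sketch of the same
fused loop with explicit carry and borrow propagation might look like this,
assuming 64-bit limbs stored least significant first; the function name and
the plain-array interface are illustrative, not what the lowering pass would
emit.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* assumed 64-bit limb */

/* res = (a + b) - (c + d) over n limbs, in a single pass and without
   multi-limb temporaries.  Each of the three operations keeps its own
   carry or borrow from one iteration to the next.  */
void
add_sub_fused (limb_t *res, const limb_t *a, const limb_t *b,
               const limb_t *c, const limb_t *d, size_t n)
{
  limb_t carry_ab = 0, carry_cd = 0, borrow = 0;
  for (size_t i = 0; i < n; i++)
    {
      limb_t tem1 = a[i] + b[i];
      limb_t c1 = tem1 < a[i];
      tem1 += carry_ab;
      carry_ab = c1 | (tem1 < carry_ab);

      limb_t tem2 = c[i] + d[i];
      limb_t c2 = tem2 < c[i];
      tem2 += carry_cd;
      carry_cd = c2 | (tem2 < carry_cd);

      limb_t tem3 = tem1 - tem2;
      limb_t b3 = tem1 < tem2;
      limb_t out = tem3 - borrow;
      borrow = b3 | (tem3 < borrow);
      res[i] = out;
    }
}
```

Only three scalar flags live across iterations, which is the point of fusing
the operations instead of materializing each _BitInt temporary.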

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #45 from Jakub Jelinek  ---
Let's consider some simple testcase (where one doesn't really mix different
_BitInt sizes etc.).
_BitInt(512)
foo (_BitInt(512) a, _BitInt(512) b, _BitInt(512) c, _BitInt(512) d)
{
  return (a + b) - (c + d);
}
With the patch, this now ICEs during expansion, because while we can handle
copying of even the larger _BitInt vars, we don't handle (nor plan to handle)
+/- etc. on them during expansion; that would be done in the earlier lowering
pass.
If I'd emit straight line code here, I suppose I could use
BIT_FIELD_REFs/BIT_INSERT_EXPRs, but if I want loopy code, as you wrote perhaps
ARRAY_REF on VCE could work fine for the input operands, but I don't know what
to use for the result of the operation; forcing it into a VAR_DECL I'm afraid
will mean we can't coalesce it much, as the above would force the 2 + results
and 1 - result into VAR_DECLs.
Could we e.g. allow BIT_INSERT_EXPRs or have some new ref for this purpose to
update a single limb in a BITINT_TYPE SSA_NAME?

Now, looking what we do right now, detailed expand dump before emergency dump
shows:
Partition map

Partition 0 (_1 - 1 )
Partition 1 (_2 - 2 )
Partition 2 (_3 - 3 )
Partition 3 (a_4(D) - 4 )
Partition 4 (b_5(D) - 5 )
Partition 5 (c_6(D) - 6 )
Partition 6 (d_7(D) - 7 )
which I believe means it didn't actually coalesce anything at all.  For the
larger BITINT_TYPEs it will be very much desirable to coalesce as much as
possible; given that none of the default def SSA_NAMEs are really used
afterwards, I'd think ideally we'd do
a += b
c += d
result = a - c

For at least multiplication/division and I assume conversions to/from floating
point (and decimal), we'll need some library calls.
One question is what ABI to use for them, whether to e.g. pass a pointer to
the limbs (and when -fbuilding-libgcc predefine macros on what mode is the limb
mode, whether the limbs are ordered from least significant to most or vice
versa, etc.) and in addition to that the precision in bits for each argument
and whether it is zero or sign extended from that, so that we could e.g. handle
more efficiently
_BitInt(16384)
foo (unsigned _BitInt(2048) a, _BitInt(1024) b)
{
  return (_BitInt(16384)) a * b;
}
by passing e.g. _mulwhatever (, 16384, , 2048, , -1024)
where -1024 would mean 1024 bits sign extended, 2048 would mean 2048 bits zero
extended, and the result is 16384 bits.  And for GIMPLE a question is how to
express it before expansion, whether we use some ifn that is then lowered.
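
The signed-precision convention can be illustrated with a helper that fetches
one limb of an operand and applies the requested extension.  This is a sketch
under assumptions: 64-bit limbs stored least significant first, a nonzero
precision, and the name load_limb, none of which come from the proposal
itself.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;   /* assumed 64-bit limb */
#define LIMB_BITS 64

/* Return limb I of the operand stored at P.
   PREC > 0: the value is zero extended from PREC bits.
   PREC < 0: the value is sign extended from -PREC bits.
   Bits above the precision in the last stored limb are ignored.  */
limb_t
load_limb (const limb_t *p, int prec, size_t i)
{
  int is_signed = prec < 0;
  unsigned bits = is_signed ? (unsigned) -prec : (unsigned) prec;
  size_t nlimbs = (bits + LIMB_BITS - 1) / LIMB_BITS;
  /* The sign bit of the operand is bit number bits - 1.  */
  int negative
    = is_signed
      && ((p[(bits - 1) / LIMB_BITS] >> ((bits - 1) % LIMB_BITS)) & 1);
  limb_t fill = negative ? ~(limb_t) 0 : 0;

  if (i >= nlimbs)
    return fill;               /* limb lies entirely in the extension */

  limb_t limb = p[i];
  unsigned rem = bits % LIMB_BITS;
  if (i == nlimbs - 1 && rem != 0)
    {
      limb_t mask = ((limb_t) 1 << rem) - 1;
      limb = (limb & mask) | (fill & ~mask);
    }
  return limb;
}
```

A multiplication routine built on such a helper can accept a 2048-bit unsigned
and a 1024-bit signed operand without the caller widening either one first.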

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-24 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #44 from rguenther at suse dot de  ---
On Wed, 24 May 2023, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989
> 
> Jakub Jelinek  changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>   Attachment #55141|0                           |1
>         is obsolete|                            |
> 
> --- Comment #43 from Jakub Jelinek  ---
> Created attachment 55148
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55148&action=edit
> gcc14-bitint-wip.patch
> 
> Another update.  This version can emit _BitInt(N) values in non-automatic
> variable initializers, handles passing/returning _BitInt(N) and for N <= 64
> (i.e. what fits into a single limb) from what I can see handling it in GIMPLE
> passes and even expansion/RTL seems to work.
> Now, as discussed earlier, for N > GET_MODE_PRECISION (limb_mode) I think we
> want to lower it in some pass in between IPA and vectorization.  For N which
> fits into DImode if limb is 32-bit (currently no target does that as we have
> just x86-64 support) or which fits into TImode for 64-bit if TImode is
> supported, I guess we want to map arithmetics
> to TImode arithmetics, for say 2-4x larger emit code for arithmetics (except
> perhaps multiplication/division) inline as straight line code and for even
> larger as loops.
> In the last case, a question is if we could use e.g. TARGET_MEM_REF for the
> variable offset in those loops on the vars even when they aren't
> TREE_ADDRESSABLE (but would force them into memory during expansion).

Note you should use TARGET_MEM_REF only when it describes the actual
addressing mode you want to use.  Otherwise just synthesize ARRAY_REFs
like ARRAY_REF , index> with
an appropriate VLA limb[] array type.

I'd do the lowering right before pass_complete_unrolli and generally
emit loopy form (another pass placement required in the -Og pipeline).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55141|0                           |1
        is obsolete|                            |

--- Comment #43 from Jakub Jelinek  ---
Created attachment 55148
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55148&action=edit
gcc14-bitint-wip.patch

Another update.  This version can emit _BitInt(N) values in non-automatic
variable initializers and handles passing/returning _BitInt(N); for N <= 64
(i.e. what fits into a single limb), from what I can see, handling it in
GIMPLE passes and even expansion/RTL seems to work.
Now, as discussed earlier, for N > GET_MODE_PRECISION (limb_mode) I think we
want to lower it in some pass in between IPA and vectorization.  For N which
fits into DImode if the limb is 32-bit (currently no target does that as we
have just x86-64 support) or which fits into TImode for a 64-bit limb if
TImode is supported, I guess we want to map the arithmetic to TImode
arithmetic; for say 2-4x larger, emit code for the arithmetic (except perhaps
multiplication/division) inline as straight line code; and for even larger,
emit loops.
In the last case, a question is if we could use e.g. TARGET_MEM_REF for the
variable offset in those loops on the vars even when they aren't
TREE_ADDRESSABLE (but would force them into memory during expansion).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55094|0                           |1
        is obsolete|                            |

--- Comment #42 from Jakub Jelinek  ---
Created attachment 55141
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55141&action=edit
gcc14-bitint-wip.patch

Further progress, _BitInt constants now seem to work (up to the
__BITINT_MAXWIDTH__ limit, currently 575 bits) and folding can fold expressions
involving those.
No code generation yet though.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #41 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #40)
> Created attachment 55094
> gcc14-bitint-wip.patch
> 
> So, on IRC we've agreed with Richi that given the limits we have in the
> compiler
> (what wide_int/widest_int can represent at most without making the types have
> optional arbitrary length indirect payload, what INTEGER_CST can handle
> (right
> now 255 64-bit limbs) and TYPE_PRECISION limitation (max 65535 precision))
> it would be best to first try to implement _BitInt support with small
> BITINT_MAXWIDTH (in particular, what fits into wide_int, which is e.g. on
> x86_64
> 575 bits) and only when the implementation of that is complete, attempt to
> lift
> up some of the limits (start with the wide_int/widest_int one, INTEGER_CST
> could
> be handled by bumping the 2 counters from 8-bit to 16-bit and killing the
> cache,
> with that we'd be at 65535 as BITINT_MAXWIDTH and whether we'd want to grow
> it
> further is a question).
> 
> This patch implements some WIP, as the testcases show, it can already do
> something, but doesn't have any of the argument/return value passing code
> implemented, nor middle-end needed changes (promoting as much as possible to
> small INTEGER_TYPEs early for small BITINT_TYPEs and adding a lowering pass
> which will turn the larger ones into loops etc.).  Also, wb/uwb constants
> aren't
> really done yet.

Another idea is to have a large BITINT_MAXWIDTH (up to what TYPE_PRECISION
supports) but restrict constant folding to the cases we can represent in
INTEGER_CST.  For the cases where the language requires constant evaluation
we'd then sorry ().  I think we should be able to handle all-ones
encoded and since constant initializers are restricted it should handle
most practical cases already.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #55056|0                           |1
        is obsolete|                            |

--- Comment #40 from Jakub Jelinek  ---
Created attachment 55094
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55094&action=edit
gcc14-bitint-wip.patch

So, on IRC we've agreed with Richi that given the limits we have in the
compiler (what wide_int/widest_int can represent at most without making the
types have optional arbitrary length indirect payload, what INTEGER_CST can
handle (right now 255 64-bit limbs) and the TYPE_PRECISION limitation (max
65535 precision)) it would be best to first try to implement _BitInt support
with a small BITINT_MAXWIDTH (in particular, what fits into wide_int, which
is e.g. on x86_64 575 bits) and only when the implementation of that is
complete, attempt to lift some of the limits (start with the
wide_int/widest_int one; INTEGER_CST could be handled by bumping the 2
counters from 8-bit to 16-bit and killing the cache, with that we'd be at
65535 as BITINT_MAXWIDTH, and whether we'd want to grow it further is a
question).

This patch implements some WIP; as the testcases show, it can already do
something, but doesn't have any of the argument/return value passing code
implemented, nor the needed middle-end changes (promoting as much as possible
to small INTEGER_TYPEs early for small BITINT_TYPEs and adding a lowering
pass which will turn the larger ones into loops etc.).  Also, wb/uwb
constants aren't really done yet.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-12 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #39 from Richard Biener  ---
(In reply to Jakub Jelinek from comment #38)
> I guess there are other options.
> If we could make wide_int/widest_int non-POD, one option would be to turn
> their storage into a union of the normal small case we use now everywhere
> (i.e. fixed one) and one where the val array is not stored directly in the
> storage but pointed to by some pointer.
> E.g.
> class GTY(()) wide_int_storage
> {
> private:
>   HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
>   unsigned int len;
>   unsigned int precision;
> could be
> private:
>   union { HOST_WIDE_INT val[WIDE_INT_MAX_ELTS]; HOST_WIDE_INT *valp; };
>   unsigned int len;
>   unsigned int precision;
> and decide which one is which based on len > WIDE_INT_MAX_ELTS or something
> similar.
> Or, if we can't afford to make it non-POD, perhaps valp would refer to
> obstack destroyed at the end of each pass or something similar.
> Another problem is with INTEGER_CST (note, if we lower this stuff before
> expansion hopefully we wouldn't need something similar for rtxes).
> Currently INTEGER_CST has:
> /* The number of HOST_WIDE_INTs in an INTEGER_CST.  */
> struct {
>   /* The number of HOST_WIDE_INTs if the INTEGER_CST is accessed in
>  its native precision.  */
>   unsigned char unextended;
>
>   /* The number of HOST_WIDE_INTs if the INTEGER_CST is extended to
>  wider precisions based on its TYPE_SIGN.  */
>   unsigned char extended;
> 
>   /* The number of HOST_WIDE_INTs if the INTEGER_CST is accessed in
>  offset_int precision, with smaller integers being extended
>  according to their TYPE_SIGN.  This is equal to one of the two
>  fields above but is cached for speed.  */
>   unsigned char offset;   
> } int_length;
> Now, this obviously limits the largest representable constants to 0xFF
> HOST_WIDE_INTs,

It might be possible to elide 'offset' given it is just a cache.  Also
'extended' can possibly be computed as well.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-12 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #38 from Jakub Jelinek  ---
I guess there are other options.
If we could make wide_int/widest_int non-POD, one option would be to turn their
storage into a union of the normal small case we use now everywhere (i.e. fixed
one) and one where the val array is not stored directly in the storage but
pointed to by some pointer.
E.g.
class GTY(()) wide_int_storage
{
private:
  HOST_WIDE_INT val[WIDE_INT_MAX_ELTS];
  unsigned int len;
  unsigned int precision;
could be
private:
  union { HOST_WIDE_INT val[WIDE_INT_MAX_ELTS]; HOST_WIDE_INT *valp; };
  unsigned int len;
  unsigned int precision;
and decide which one is which based on len > WIDE_INT_MAX_ELTS or something
similar.
Or, if we can't afford to make it non-POD, perhaps valp would refer to an
obstack destroyed at the end of each pass or something similar.
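
The inline-or-pointer idea can be sketched in plain C as follows.  This is
not GCC's wide_int_storage: the struct, the helper names and the value of
INLINE_ELTS (a stand-in for WIDE_INT_MAX_ELTS) are assumptions, and heap
allocation stands in for whatever obstack or other allocator would really be
used.

```c
#include <stdlib.h>

#define INLINE_ELTS 9   /* stand-in for WIDE_INT_MAX_ELTS; value assumed */

struct wide_storage
{
  union
  {
    long long val[INLINE_ELTS];  /* small case: limbs stored inline */
    long long *valp;             /* large case: limbs stored elsewhere */
  } u;
  unsigned int len;
  unsigned int precision;
};

/* Pick the active union member based on len.  */
long long *
wide_storage_limbs (struct wide_storage *s)
{
  return s->len > INLINE_ELTS ? s->u.valp : s->u.val;
}

/* Prepare storage for LEN limbs.  Returns 0 on success, -1 on failure.  */
int
wide_storage_init (struct wide_storage *s, unsigned int len,
                   unsigned int precision)
{
  s->len = len;
  s->precision = precision;
  if (len > INLINE_ELTS)
    {
      s->u.valp = calloc (len, sizeof *s->u.valp);
      return s->u.valp ? 0 : -1;
    }
  return 0;
}

/* Release out-of-line limbs, if any.  */
void
wide_storage_release (struct wide_storage *s)
{
  if (s->len > INLINE_ELTS)
    free (s->u.valp);
}
```

The need for an explicit release step is the cost of going non-POD that the
comment above is weighing.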
Another problem is with INTEGER_CST (note, if we lower this stuff before
expansion hopefully we wouldn't need something similar for rtxes).
Currently INTEGER_CST has:
/* The number of HOST_WIDE_INTs in an INTEGER_CST.  */
struct {
  /* The number of HOST_WIDE_INTs if the INTEGER_CST is accessed in
     its native precision.  */
  unsigned char unextended;

  /* The number of HOST_WIDE_INTs if the INTEGER_CST is extended to
     wider precisions based on its TYPE_SIGN.  */
  unsigned char extended;

  /* The number of HOST_WIDE_INTs if the INTEGER_CST is accessed in
     offset_int precision, with smaller integers being extended
     according to their TYPE_SIGN.  This is equal to one of the two
     fields above but is cached for speed.  */
  unsigned char offset;
} int_length;
Now, this obviously limits the largest representable constants to 0xFF
HOST_WIDE_INTs, i.e. at most 16320 bits.  We have 8 spare bits there, so one
possibility would be to add a flag there and if that flag is true, ignore
the int_length.{unextended,extended,offset} fields and instead stick that
info somewhere into the val array.
Or kill TREE_INT_CST_OFFSET_NUNITS (replace it with
TREE_INT_CST_EXT_NUNITS (t) <= OFFSET_INT_ELTS ? TREE_INT_CST_EXT_NUNITS (t) :
TREE_INT_CST_NUNITS (t)) and turn unextended/extended into unsigned short.
Then we can handle at most _BitInt(4194240), slightly more than 2 times lower
than what LLVM chose; I guess that would still be acceptable.
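
The two limits quoted above follow directly from the width of the counter
fields.  A tiny sketch of the arithmetic, assuming 64-bit HOST_WIDE_INT (the
function name is made up):

```c
/* Largest precision in bits representable when the limb count is stored
   in a field whose maximum value is max_count, with 64-bit limbs.  */
unsigned long
max_precision_bits (unsigned long max_count)
{
  return max_count * 64UL;
}
```

With unsigned char counters this gives 255 * 64 = 16320 bits, and with
unsigned short counters 65535 * 64 = 4194240 bits, a little under half of the
8388608 mentioned for clang elsewhere in this thread.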

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-11 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #37 from joseph at codesourcery dot com  ---
If _BitInt constants aren't INTEGER_CST, then all places that expect that 
any integer constant expression is folded to an INTEGER_CST will need 
updating to handle whatever tree code is used for _BitInt constants.  (In 
some places that may be needed for correctness, in other places - where a 
large value wouldn't actually be valid - only for proper diagnostics about 
an invalid value, if INTEGER_CST is still used for smaller _BitInt 
constants.)

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-11 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #36 from Jakub Jelinek  ---
Created attachment 55056
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55056&action=edit
gcc14-bitint-wip.patch

Just WIP on the top of the above patch, which does parsing of the _BitInt type
specifier in C and introduces BITINT_TYPE (I'm afraid we can't use INTEGER_TYPE
for that, both because it can have different calling/returning convention in
different ABIs and because we need more than 16-bit precision for it as well),
but still doesn't use it (right where it would create it stops for now and
pretends it is integer).
I've added also wb/WB suffix parsing on the libcpp side, but that is where I
stopped today.  Obviously for CPP_N_BITINT we need different interpretation of
the number because cpp_interpret_integer can handle at most 128-bit integers
(and of course even for the integers that fit into 128-bit with wb/WB suffixes
we also want to use the right type; but I guess we can use INTEGER_CSTs for
them).  I'm afraid we'll need some other TREE_CODE for bit-precise integer
constants which don't fit into widest_int (perhaps better for all that don't
fit into 128 bits), because the amount of code that assumes wi::to_widest works
on INTEGER_CSTs is huge.  As I said earlier, I think something during
gimplification or soon after it could remap small _BitInts (up to 128-bit resp.
64-bit when TImode isn't supported) to normal integral types except on the
function boundaries (where ABI conventions can result in different rules for
them), but probably we can't make INTEGER_TYPE <-> BITINT_TYPE conversions
useless because _BitInt could be e.g. passed to varargs.
Looking at what clang does, they seem to have raised the limit from 128 to
8388608, but in many cases they emit extremely terrible code.  Everything is
done without library support inline and even for huge numbers it doesn't even
use any loops, so is extremely cache unfriendly.  I think we should do
something like that solely for very small cases, otherwise use loops and either
let normal unrolling do its job, or say do 4 limbs in the loop body at a time
or something similar.  And would be nice if the ranger could at least discover
ranges of how many real bits each SSA_NAME can contain (with bits above those
being zero or sign extended) so that we could use more efficient
additions/subtractions/multiplications/divisions etc.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-05-11 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #35 from Jakub Jelinek  ---
Created attachment 55055
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55055&action=edit
gcc14-set-precision.patch

Untested preparation patch which prepares for the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989#c25
idea of keeping 16-bit precision for all types except the new bit-precise
integer types, which would get 32-bit precision.  Unfortunately it isn't just a
matter of starting to use SET_TYPE_PRECISION where TYPE_PRECISION is used as an
lvalue.  The current TYPE_PRECISION definition, which is an unsigned:16
non-static data member, acts for -Wsign-compare as either signed or unsigned
int and no warning is emitted, while even if the new larger precision in some
types was unsigned:31, using those two options in a conditional leads to
-Wsign-compare warnings because all of a sudden the macro is considered to be
either int or unsigned depending on how exactly it is defined.  There are more
-Wsign-compare warnings if TYPE_PRECISION is signed int than when it is
unsigned int, so I want to implement the latter and this patch also adjusts all
spots I've noticed to avoid the -Wsign-compare warnings.  Precision is never
negative...
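
As a rough illustration of why the width of the bit-field matters here, the
following C sketch shows the promotion rule involved.  It is not GCC code:
struct type_bits and its members are hypothetical, and a 32-bit int is assumed.
A bit-field whose values all fit in int is promoted to int, while one as wide
as unsigned int stays unsigned, so the same comparison gives different results.

```c
#include <assert.h>
#include <limits.h>

/* Sketch only: assumes a 32-bit int.  */
static_assert (sizeof (int) * CHAR_BIT == 32, "example assumes 32-bit int");

/* Hypothetical stand-in for a type node; not GCC's tree_type_common.  */
struct type_bits
{
  unsigned narrow : 16;  /* all values fit in int: promoted to int */
  unsigned wide : 32;    /* does not fit in int: stays unsigned int */
};

/* Evaluates to 1 if E has type int after promotion, 0 if unsigned int.  */
#define PROMOTES_TO_INT(e) _Generic ((e) + 0, int: 1, unsigned int: 0)

int
narrow_promotes_to_int (void)
{
  struct type_bits t = { 0, 0 };
  return PROMOTES_TO_INT (t.narrow);
}

int
wide_promotes_to_int (void)
{
  struct type_bits t = { 0, 0 };
  return PROMOTES_TO_INT (t.wide);
}

/* -1 < 0 holds when the field is promoted to int ...  */
int
minus_one_less_than_narrow (void)
{
  struct type_bits t = { 0, 0 };
  return -1 < t.narrow;
}

/* ... but not when -1 is converted to unsigned int first.  */
int
minus_one_less_than_wide (void)
{
  struct type_bits t = { 0, 0 };
  return -1 < t.wide;
}
```

The last function is the kind of mixed signed/unsigned comparison that
-Wsign-compare is meant to flag.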

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-04-12 Thread george at bott dot gg via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

George Bott  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |george at bott dot gg

--- Comment #34 from George Bott  ---
I am currently using clang's support for up to 256-bit integers for crypto
related use cases and also non-power-of-2 widths such as 160 bits. These are
not just used as storage; we are performing integer math on them and using the
__builtin_checked family of functions. I understand that the standard family of
checked functions that replace these builtin functions will be used instead
when implemented on clang.

Limiting this to 128 bits, while being standard compliant, would not allow us to
compile on GCC.

Thanks

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2023-04-09 Thread leni536 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Lénárd Szolnoki  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |leni536 at gmail dot com

--- Comment #33 from Lénárd Szolnoki  ---
(In reply to jos...@codesourcery.com from comment #32)
> On Fri, 28 Oct 2022, jakub at gcc dot gnu.org via Gcc-bugs wrote:
> 
> > > That said, if C allows us to limit to 128bits then let's do that for now.
> > > 32bit targets will still see all the complication when we give that a
> > > stab.
> > 
> > I'm afraid once we define BITINT_MAXWIDTH, it will become part of the ABI,
> > so we can't increase it afterwards.
> 
> I don't think it's part of the ABI; I think it's always OK to increase 
> BITINT_MAXWIDTH, as long as the wider types don't need more alignment than 
> the previous choice of max_align_t.

It's not part of the ABI until people put _BitInt(BITINT_MAXWIDTH) on ABI
boundaries of their libraries. If a ridiculously large BITINT_MAXWIDTH does
nothing more than discourage usage of _BitInt(BITINT_MAXWIDTH) in general,
then that's already great. We don't need another intmax.

Also I don't want to think about the max N for _BitInt(N), similarly to how I
don't want to think about the max N for int[N]. There might be implementation
limits, but it should be high enough so I don't have to think about those for
everyday coding.

> Thus, starting with a 128-bit limit (or indeed a 64-bit limit on 32-bit 
> platforms, so that all the types fit within existing modes supported for 
> arithmetic), and adding support for wider _BitInt later, would be a 
> reasonable thing to do.

I disagree.

> (You still have ABI considerations even with such a limit: apart from the 
> padding question, on x86_64 the ABI says _BitInt(128) is 64-bit aligned 
> but __int128 is 128-bit aligned.)
> 
> > Anyway, I'm afraid we probably don't have enough time to implement this
> > properly in stage1, so might need to target GCC 14 with it.  Unless somebody
> > spends on it
> > the remaining 2 weeks full time.
> 
> I think https://gcc.gnu.org/pipermail/gcc/2022-October/239704.html is 
> still current as a list of C2x language features likely not to make it 
> into GCC 13.  (I hope to get auto and constexpr done in the next two 
> weeks, and the other C2x language features not on that list are done.)

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-28 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #32 from joseph at codesourcery dot com  ---
On Fri, 28 Oct 2022, jakub at gcc dot gnu.org via Gcc-bugs wrote:

> > That said, if C allows us to limit to 128bits then let's do that for now.
> > 32bit targets will still see all the complication when we give that a stab.
> 
> I'm afraid once we define BITINT_MAXWIDTH, it will become part of the ABI, so
> we can't increase it afterwards.

I don't think it's part of the ABI; I think it's always OK to increase 
BITINT_MAXWIDTH, as long as the wider types don't need more alignment than 
the previous choice of max_align_t.

Thus, starting with a 128-bit limit (or indeed a 64-bit limit on 32-bit 
platforms, so that all the types fit within existing modes supported for 
arithmetic), and adding support for wider _BitInt later, would be a 
reasonable thing to do.

(You still have ABI considerations even with such a limit: apart from the 
padding question, on x86_64 the ABI says _BitInt(128) is 64-bit aligned 
but __int128 is 128-bit aligned.)

> Anyway, I'm afraid we probably don't have enough time to implement this
> properly in stage1, so might need to target GCC 14 with it.  Unless somebody
> spends on it
> the remaining 2 weeks full time.

I think https://gcc.gnu.org/pipermail/gcc/2022-October/239704.html is 
still current as a list of C2x language features likely not to make it 
into GCC 13.  (I hope to get auto and constexpr done in the next two 
weeks, and the other C2x language features not on that list are done.)

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-28 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #31 from joseph at codesourcery dot com  ---
On Fri, 28 Oct 2022, rguenth at gcc dot gnu.org via Gcc-bugs wrote:

> I wouldn't go with a new tree code, given semantics are INTEGER_TYPE it should
> be an INTEGER_TYPE.

Implementation note in that case: bit-precise integer types aren't allowed 
as underlying types for enums, so the code in 
c-parser.cc:c_parser_enum_specifier checking underlying types:

  else if (TREE_CODE (specs->type) != INTEGER_TYPE
           && TREE_CODE (specs->type) != BOOLEAN_TYPE)
    {
      error_at (enum_loc, "invalid %<enum%> underlying type");

would then need to check that the type isn't a bit-precise type.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-28 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #30 from Andrew Pinski  ---
I have a use case for up to 1k bits, except I don't need division. It will come
in handy while translating the P4 language
(https://p4.org/p4-spec/docs/P4-16-v-1.2.3.html) to C. P4 supports any bit size
you want and there are some uses for > 128 for crypto; usually just a storage
area for the key at that point.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-28 Thread colomar.6.4.3 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #29 from Alejandro Colomar  ---
Hi!

On 10/28/22 12:51, rguenther at suse dot de wrote:
> Quite likely yes (OTOH __BIGGEST_ALIGNMENT__ changed as well).  That
> also means BITINT_MAXWIDTH should eventually be decided by the ABI
> groups?
> 
> I also can hardly see any use for very big N other than "oh, cool".  I
> mean, we don't have _Float(N) either for N == 65000 even though what
> would be cool as well.

I do have a use.  Okay, I don't need 8M bits, but 1k is something that would
help me.  Basically, it's a transparent bignum library, for which I can use
most standard C features.  BTW, it would also be nice if stdc_count_ones(3)
were implemented to support very wide _BitInt()s as an extension (C23 only
guarantees support for _BitInt()s that match a standard or extended type).

I have some program that works with matrices of 512x512, represented as arrays 
of 512 members of uint64_t[8], and it popcounts rows, which now means looping 
over an array of uint64_t[8] and using the builtin popcount.  And I'm not sure 
if I could still optimize it a little bit more.  If I could just call the 
type-generic stdc_count_ones(), and know that the implementation has written a 
quite optimal loop, that would be great (both for simplicity and performance).
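
For reference, the manual loop described above might look roughly like this.
It is a sketch, not the poster's actual code: ROW_LIMBS and row_popcount are
made-up names, and __builtin_popcountll is the GCC/clang builtin rather than
anything from C23.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ROW_LIMBS = 8 };  /* 8 * 64 = 512 bits per matrix row */

/* Count the set bits in one 512-bit row stored as uint64_t[8].  */
unsigned
row_popcount (const uint64_t row[ROW_LIMBS])
{
  unsigned count = 0;

  assert (row != NULL);
  for (size_t i = 0; i < ROW_LIMBS; i++)
    count += (unsigned) __builtin_popcountll (row[i]);
  return count;
}
```

With the extension asked for above, the whole function could presumably
collapse to a single stdc_count_ones call on an unsigned _BitInt(512) row.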

Cheers,

Alex

> 
>> Anyway, I'm afraid we probably don't have enough time to implement this
>> properly in stage1, so might need to target GCC 14 with it.  Unless somebody
>> spends on it
>> the remaining 2 weeks full time.
> 
> It's absolutely a GCC 14 task given the ABI and library issue.
>

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-28 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #28 from rguenther at suse dot de  ---
On Fri, 28 Oct 2022, jakub at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989
> 
> --- Comment #27 from Jakub Jelinek  ---
> (In reply to Richard Biener from comment #26)
> > Does the C standard limit the number of bits?  Does it allow
> > implementation defined limits?
> 
> The latter.  limits.h defines BITINT_MAXWIDTH, which must be at least as large
> as number of bits in unsigned long long.  AFAIK LLVM plans 8388608 maximum 
> (but
> due to the missing library support uses 128 as maximum right now).
> 
> > Constants are tricky indeed but I suppose there's no way to write a
> > 199 bit integer constant in source?  We can always resort to constants
> > of the intfast_t[n] representation (aka a CTOR).
> 
> One can specify even very large constants in the source.
> 123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789uwb
> will be _BitInt with the minimum number of bits to store the above unsigned
> constant.
> 
> > That said, if C allows us to limit to 128bits then let's do that for now.
> > 32bit targets will still see all the complication when we give that a stab.
> 
> I'm afraid once we define BITINT_MAXWIDTH, it will become part of the ABI, so
> we can't increase it afterwards.

Quite likely yes (OTOH __BIGGEST_ALIGNMENT__ changed as well).  That
also means BITINT_MAXWIDTH should eventually be decided by the ABI
groups?

I also can hardly see any use for very big N other than "oh, cool".  I 
mean, we don't have _Float(N) either for N == 65000 even though that
would be cool as well.

> Anyway, I'm afraid we probably don't have enough time to implement this
> properly in stage1, so might need to target GCC 14 with it.  Unless somebody
> spends on it
> the remaining 2 weeks full time.

It's absolutely a GCC 14 task given the ABI and library issue.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-28 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #27 from Jakub Jelinek  ---
(In reply to Richard Biener from comment #26)
> Does the C standard limit the number of bits?  Does it allow
> implementation defined limits?

The latter.  limits.h defines BITINT_MAXWIDTH, which must be at least as large
as number of bits in unsigned long long.  AFAIK LLVM plans 8388608 maximum (but
due to the missing library support uses 128 as maximum right now).

> Constants are tricky indeed but I suppose there's no way to write a
> 199 bit integer constant in source?  We can always resort to constants
> of the intfast_t[n] representation (aka a CTOR).

One can specify even very large constants in the source.
123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789123456789uwb
will be _BitInt with the minimum number of bits to store the above unsigned
constant.
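
To make "the minimum number of bits" concrete, here is a small sketch that
computes that width for an unsigned decimal literal by accumulating it into
32-bit limbs.  The function name min_unsigned_width and the MAX_LIMBS cap are
assumptions for illustration; this is not how libcpp or the C front end
interprets such constants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_LIMBS = 64 };  /* hypothetical cap: 64 * 32 = 2048 bits */

/* Returns the smallest N >= 1 such that the unsigned decimal literal
   DIGITS (without the uwb suffix) fits in N bits, or 0 if the string
   is malformed or the value exceeds the cap.  */
unsigned
min_unsigned_width (const char *digits)
{
  uint32_t limb[MAX_LIMBS] = { 0 };
  size_t used = 1;

  assert (digits != NULL);
  if (*digits == '\0')
    return 0;
  for (; *digits; digits++)
    {
      if (*digits < '0' || *digits > '9')
        return 0;
      /* value = value * 10 + digit, propagated limb by limb.  */
      uint64_t carry = (uint64_t) (*digits - '0');
      for (size_t i = 0; i < used; i++)
        {
          uint64_t t = (uint64_t) limb[i] * 10 + carry;
          limb[i] = (uint32_t) t;
          carry = t >> 32;
        }
      if (carry)
        {
          if (used == MAX_LIMBS)
            return 0;
          limb[used++] = (uint32_t) carry;
        }
    }
  /* Count the significant bits of the most significant limb.  */
  unsigned bits = 0;
  for (uint32_t top = limb[used - 1]; top; top >>= 1)
    bits++;
  unsigned width = (unsigned) (used - 1) * 32 + bits;
  return width ? width : 1;  /* zero still needs one bit */
}
```

For example, min_unsigned_width ("255") is 8 and min_unsigned_width ("256")
is 9, matching unsigned _BitInt(8) and unsigned _BitInt(9) respectively.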

> That said, if C allows us to limit to 128bits then let's do that for now.
> 32bit targets will still see all the complication when we give that a stab.

I'm afraid once we define BITINT_MAXWIDTH, it will become part of the ABI, so
we can't increase it afterwards.
Anyway, I'm afraid we probably don't have enough time to implement this
properly in stage1, so might need to target GCC 14 with it.  Unless somebody
spends the remaining 2 weeks on it full time.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-28 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #26 from Richard Biener  ---
Some random comments.

I wouldn't go with a new tree code, given semantics are INTEGER_TYPE it should
be an INTEGER_TYPE.  The TYPE_PRECISION issue is real - we have 16 spare bits
in tree_type_common so we could possibly afford to make it 16 bits.  Does the C
standard limit the number of bits?  Does it allow implementation defined
limits?

As for SSA representation and "lowering" this feels much like Middle-End Array
Expressions in the end.  I agree that first and foremost we should have
the types as registers but then we can simply lower early to a representation
supported by the target?  AKA make _BitInt(199) intfast_t[n] with appropriate
'n' and lower all accesses, doing arithmetic either via builtins or
internal functions on the whole object.

Constants are tricky indeed but I suppose there's no way to write a
199 bit integer constant in source?  We can always resort to constants
of the intfast_t[n] representation (aka a CTOR).

That said, if C allows us to limit to 128bits then let's do that for now.
32bit targets will still see all the complication when we give that a stab.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-26 Thread joseph at codesourcery dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #25 from joseph at codesourcery dot com  ---
On Wed, 26 Oct 2022, jakub at gcc dot gnu.org via Gcc-bugs wrote:

> Seems LLVM currently only supports _BitInt up to 128, which is kind of useless
> for users, those sizes can be easily handled as bitfields and performing 
> normal
> arithmetics on them.

Well, it would be useful for users of 32-bit targets who want 128-bit 
arithmetic, since we only support __int128 for 64-bit targets.

> As for implementation, I'd like to brainstorm about it a little bit.
> I'd say we want a new tree code for it, say BITINT_TYPE.

OK.  The signed and unsigned types of each precision do need to be 
distinguished from all the existing kinds of integer types (including the 
ones used for bit-fields: _BitInt types aren't subject to integer 
promotions, whereas bit-fields narrower than int are).

In general the types operate like integer types (in terms of allowed 
operations etc.) so INTEGRAL_TYPE_P would be true for them.  The main 
difference at front-end level is the lack of integer promotions, so that 
arithmetic can be carried out directly on narrower-than-int operands (but 
a bit-field declared with a _BitInt type gets promoted to that _BitInt 
type, e.g. unsigned _BitInt(7):2 acts as unsigned _BitInt(7) in 
arithmetic).

Unlike the bit-field types, there's no such thing as a signed _BitInt(1); 
signed bit-precise integer types must have at least two bits.

> TYPE_PRECISION unfortunately is only 10-bit, that is not enough, so it 
> would need the full precision to be specified somewhere else.

That may complicate things because of code expecting TYPE_PRECISION to be 
meaningful for all integer types.  But that could be addressed without 
needing to review every use of TYPE_PRECISION by e.g. changing 
TYPE_PRECISION to check wherever the _BitInt precision is specified, and 
instead using e.g. TYPE_RAW_PRECISION for direct access to the tree field 
(so only lvalue uses of TYPE_PRECISION would then need updating, other 
accesses would automatically get the full precision).

> And have targetm specify the ABI
> details (size of a limb (which would need to be exposed to libgcc with
> -fbuilding-libgcc), unless it is everywhere the same whether the limbs are
> least significant to most significant or vice versa, and whether the highest
> limb is sign/zero extended or unspecified beyond the precision.

I haven't seen an ABI specified for any architecture supporting big-endian 
yet, but I'd tend to expect such architectures to use big-endian ordering 
for the _BitInt representation to be consistent with existing integer 
types.

> What about the large ones?

I think we can at least slightly simplify things by assuming for now 
_BitInt multiplication / division / modulo are unlikely to be used much 
for arguments large enough that Karatsuba or asymptotically faster 
algorithms become relevant; that is, that naive quadratic-time algorithms 
are sufficient for those operations.
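
A naive quadratic-time multiplication of the kind meant here could be sketched
as follows.  The name limb_mul, the 32-bit limbs and the little-endian limb
order are assumptions for illustration only, not a proposed libgcc interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Schoolbook multiplication: r[0 .. an+bn-1] = a[0 .. an-1] * b[0 .. bn-1],
   all in little-endian 32-bit limbs.  R must not alias A or B.  */
void
limb_mul (uint32_t *r, const uint32_t *a, size_t an,
          const uint32_t *b, size_t bn)
{
  assert (r != a && r != b);
  for (size_t i = 0; i < an + bn; i++)
    r[i] = 0;
  for (size_t i = 0; i < an; i++)
    {
      uint64_t carry = 0;
      for (size_t j = 0; j < bn; j++)
        {
          /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) == 2^64 - 1.  */
          uint64_t t = (uint64_t) a[i] * b[j] + r[i + j] + carry;
          r[i + j] = (uint32_t) t;
          carry = t >> 32;
        }
      r[i + bn] = (uint32_t) carry;
    }
}
```

The two nested loops perform an * bn limb products, which is the quadratic
cost referred to above.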

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-26 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #24 from Jonathan Wakely  ---
(In reply to Jakub Jelinek from comment #23)
> What about the large ones?  Say for arbitrary size generic vectors we keep
> them in SSA form until late (generic vector lowering) and at that point
> lower, perhaps we could do the same for _BitInt?  The unary as well as most
> of binary operations  can be handled by simple loops over extraction of
> limbs from the large number, then there is multiplication and
> division/modulo.  I think the latter is why LLVM restricts it to 128 bits
> right now,

Right.

> https://gcc.gnu.org/pipermail/gcc/2022-May/thread.html#238657
> was an proposal from the LLVM side but I don't see it being actually further
> developed and don't see it on LLVM trunk.

I think work on it stalled after that thread. See also
https://discourse.llvm.org/t/rfc-add-support-for-division-of-large-bitint-builtins-selectiondag-globalisel-clang/60329/

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-26 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org

--- Comment #23 from Jakub Jelinek  ---
Seems LLVM currently only supports _BitInt up to 128, which is kind of useless
for users: those sizes can easily be handled as bit-fields with normal
arithmetic performed on them.
As for implementation, I'd like to brainstorm about it a little bit.
I'd say we want a new tree code for it, say BITINT_TYPE.  TYPE_PRECISION
unfortunately is only 10-bit, that is not enough, so it would need the full
precision to be specified somewhere else.  And have targetm specify the ABI
details: the size of a limb (which would need to be exposed to libgcc with
-fbuilding-libgcc, unless it is everywhere the same), whether the limbs are
ordered least significant to most significant or vice versa, and whether the
highest limb is sign/zero extended or unspecified beyond the precision.
We'll need to handle the wide constants somehow, but we have a problem with
wide ints that widest_int is not wide enough to handle arbitrarily long
constants.
Shall the type be a GIMPLE reg type?
I assume for _BitInt <= 128 (or when TImode isn't supported <= 64) we just want
to keep the new type on the function parameter/return value boundaries and use
INTEGER_TYPEs from say gimplification.
What about the large ones?  Say for arbitrary size generic vectors we keep them
in SSA form until late (generic vector lowering) and at that point lower,
perhaps we could do the same for _BitInt?  The unary as well as most of the
binary operations can be handled by simple loops over extraction of limbs from
the large number, then there is multiplication and division/modulo.  I think the
latter is why LLVM restricts it to 128 bits right now;
https://gcc.gnu.org/pipermail/gcc/2022-May/thread.html#238657
was a proposal from the LLVM side but I don't see it being actually further
developed and don't see it on LLVM trunk.
I wonder if for these libgcc APIs (and, is just __divmod/__udivmod enough, or
do we want also multiplication, or for -Os purposes also other APIs?) it
wouldn't be better to have more GMP/mpn like APIs where we don't specify number
of limbs like in the above thread, but number of bits and perhaps don't specify
it just for one argument but for multiple, so that we can then for the lowering
match sign/zero extensions of the arguments and can handle say _BitInt(2048) /
_BitInt(16) efficiently.
Thoughts on this?
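
As a sketch of what a bit-count based helper could look like for the
_BitInt(2048) / _BitInt(16) style case, the following divides a wide value by
a divisor that fits in one limb.  The name udivmod_small, its signature and
the 32-bit little-endian limbs are hypothetical; this is not an existing or
proposed libgcc entry point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divide the NBITS-bit unsigned value N (little-endian 32-bit limbs, bits
   above NBITS in the top limb assumed zero) by the nonzero divisor D.
   Stores the quotient in Q (same number of limbs) and returns the
   remainder.  */
uint32_t
udivmod_small (uint32_t *q, const uint32_t *n, size_t nbits, uint32_t d)
{
  size_t limbs = (nbits + 31) / 32;
  uint64_t rem = 0;

  assert (d != 0);
  /* Long division from the most significant limb downwards.  */
  for (size_t i = limbs; i-- > 0;)
    {
      uint64_t cur = (rem << 32) | n[i];  /* rem < d, so this fits */
      q[i] = (uint32_t) (cur / d);
      rem = cur % d;
    }
  return (uint32_t) rem;
}
```

Because the divisor is known to be narrow, one pass over the dividend's limbs
is enough, which is the kind of saving a per-argument bit count would allow.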

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-26 Thread uweigand at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #22 from Ulrich Weigand  ---
(In reply to Jakub Jelinek from comment #15)
> PowerPC I think does, not sure about s390.

For s390x see here:
https://github.com/IBM/s390x-abi

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #21 from Segher Boessenkool  ---
(In reply to Jakub Jelinek from comment #19)
> (In reply to Segher Boessenkool from comment #16)
> > (In reply to Jakub Jelinek from comment #15)
> > > PowerPC I think does, not sure about s390.
> > 
> > Does what?
> 
> Published psABI which ought to specify how to pass/return _BitInt(N) and
> unsigned _BitInt(N).

psABI is an x86 thing?  But there are various ABIs for PowerPC that have
public documentation, six or so, and GCC has support for most of those.

None of them are "processor specific" (most are OS specific, instead), and
they differ in very fundamental things, in places.  They are much related
as well of course, either because there is an obvious choice, or history.

Many of those ABIs have not seen updates for decades, and are unlikely to
anymore.  OTOH the GCC support for them has been updated over time, there
often is only one sane choice anyway.

We'll make decisions on what ELFv2 will do for _BitInt when it is closer
in time than it is now.  The only interesting choice is whether values in
memory have undefined bits -- and they likely should, simply because all
other padding bits are undefined as well.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-25 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #20 from Segher Boessenkool  ---
(In reply to Andrew Pinski from comment #18)
> (In reply to Segher Boessenkool from comment #16)
> > (In reply to Jakub Jelinek from comment #15)
> > > PowerPC I think does, not sure about s390.
> > 
> > Does what?
> 
> Have a public place to submit issues against the powerpc abis.

Only the ELFv2 ABI really (it's on github).  The rest doesn't have (public)
maintained documents at all.

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-25 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #19 from Jakub Jelinek  ---
(In reply to Segher Boessenkool from comment #16)
> (In reply to Jakub Jelinek from comment #15)
> > PowerPC I think does, not sure about s390.
> 
> Does what?

Published psABI which ought to specify how to pass/return _BitInt(N) and
unsigned _BitInt(N).

[Bug c/102989] Implement C2x's n2763 (_BitInt)

2022-10-25 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102989

--- Comment #18 from Andrew Pinski  ---
(In reply to Segher Boessenkool from comment #16)
> (In reply to Jakub Jelinek from comment #15)
> > PowerPC I think does, not sure about s390.
> 
> Does what?

Have a public place to submit issues against the powerpc abis.
