[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #17 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:c74131e77f1a6b7afe700d3526a8992dc9744b0c

commit r14-9038-gc74131e77f1a6b7afe700d3526a8992dc9744b0c
Author: Jonathan Wakely 
Date:   Wed Feb 7 11:31:10 2024 +

libstdc++: Fix FAIL: 26_numerics/random/pr60037-neg.cc again [PR113961]

PR libstdc++/87744
PR libstdc++/113961

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-16 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #16 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:7f3d900684ad989164114f25eb46a33871c6533d

commit r14-9028-g7f3d900684ad989164114f25eb46a33871c6533d
Author: Jonathan Wakely 
Date:   Wed Feb 7 11:31:10 2024 +

libstdc++: Fix FAIL: 26_numerics/random/pr60037-neg.cc [PR113931]

PR libstdc++/87744
PR libstdc++/113931

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-15 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #15 from GCC Commits  ---
The master branch has been updated by Jonathan Wakely :

https://gcc.gnu.org/g:c9ce332b557bb95987d038d98ea929cdfd1eae1d

commit r14-8998-gc9ce332b557bb95987d038d98ea929cdfd1eae1d
Author: Jonathan Wakely 
Date:   Wed Feb 7 11:31:10 2024 +

libstdc++: Use 128-bit arithmetic for std::linear_congruential_engine
[PR87744]

For 32-bit targets without __int128 we need to implement the LCG
transition function by hand using 64-bit types.

We can also slightly simplify the __mod function by using if-constexpr
unconditionally, disabling -Wc++17-extensions warnings with diagnostic
pragmas.

libstdc++-v3/ChangeLog:

PR libstdc++/87744
* include/bits/random.h [!__SIZEOF_INT128__]
(_Select_uint_least_t):
Define specialization for 64-bit generators with
non-power-of-two modulus and large constants.
(__mod): Use if constexpr unconditionally.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.
* testsuite/26_numerics/random/linear_congruential_engine/87744.cc:
New test.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-07 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #14 from Jonathan Wakely  ---
(In reply to Lewis Fox from comment #12)
> (In reply to Jonathan Wakely from comment #2)
> 
> My original comment about libc++ was in reference to the LLVM bugzilla
> report #27839: https://bugs.llvm.org/show_bug.cgi?id=27839

Thanks, that got copied to github as
https://github.com/llvm/llvm-project/issues/28213

> It looks like the issue you discovered is LLVM bugzilla report #34206:
> https://bugs.llvm.org/show_bug.cgi?id=34206

And that is now https://github.com/llvm/llvm-project/issues/33554

> It seems like since I made that comment here, libc++ has updated to fix the
> misuse of Schrage's algorithm (though, looking at the current source code,
> it still looks wrong to me), so it does mean my initial comment is a little
> out of date.

Unsurprising when it took me more than 5 years to look into it properly ;-)

> This is a bit of an edge case that I don't think most users will encounter,
> so performance is probably less important here than accuracy.

100% agreed

> I'd personally
> prioritize minimizing branches (i.e. improving simplicity) than optimizing
> the operand sizes for performance, but that's just my opinion.

Agreed again, for although as I said in comment 13 I think the extra branch in
operator% might be worthwhile. Maybe with __builtin_expect(__l._M_hi == 0, 0))
as a branch prediction hint.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-07 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #13 from Jonathan Wakely  ---
(In reply to Jakub Jelinek from comment #11)
> Though, if it is common enough, one could try to optimize the __ll[0] == 0
> && __xx[0] == 0 case, one can do then either 32x32->64 or 64x64->64
> multiplication and be done with it.  But if it is rare in random's usage, it
> would just make the code larger.

We will only use this new type when the calculation a*(m-1)+c doesn't fit in 64
bits, and the input values should be uniformly distributed in [0,m) for a good
choice of parameters (and for a bad choice of parameters, you have bigger
problems than the calculation being slow!)

For a small value of a and large value of c we would use this new type but the
multiplication step would overflow infrequently, but I don't think that's a
good choice of parameters and not worth optimizing for.

So I think we should just use the full 64x64 multiplication unconditionally.

For operator% the if (__l._M_hi == 0) branch is probably worth it, because the
general case is so much more expensive.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-07 Thread lrflew.coll at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #12 from Lewis Fox  ---
(In reply to Jonathan Wakely from comment #2)

My original comment about libc++ was in reference to the LLVM bugzilla report
#27839: https://bugs.llvm.org/show_bug.cgi?id=27839

It looks like the issue you discovered is LLVM bugzilla report #34206:
https://bugs.llvm.org/show_bug.cgi?id=34206

It seems like since I made that comment here, libc++ has updated to fix the
misuse of Schrage's algorithm (though, looking at the current source code, it
still looks wrong to me), so it does mean my initial comment is a little out of
date.

Either way, though, this issue wasn't in comparison to libc++, but rather that
libstdc++ seems to contradict the C++ standard. For reference, MSVC doesn't
have a native 128-bit integer type, but still handles these correctly by using
64-bit integer arithmetic (though MSVC could still optimize their
implementation for x86_64 using intrinsics if they wanted to).

This is a bit of an edge case that I don't think most users will encounter, so
performance is probably less important here than accuracy. I'd personally
prioritize minimizing branches (i.e. improving simplicity) than optimizing the
operand sizes for performance, but that's just my opinion.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #11 from Jakub Jelinek  ---
(In reply to Jakub Jelinek from comment #10)
> As discussed on IRC, probably the
> if (!__builtin_mul_overflow(__l._M_lo, __x, &__lo))
> optimization isn't a good idea, because most likely the compiler will in
> that case
> expand that to a full 64x64->128 multiplication plus testing that the upper
> 64 bits are or aren't zero, so computing everything this function needs and
> doing it again if there is overflow, without possibility to query what it
> computed.

Though, if it is common enough, one could try to optimize the __ll[0] == 0 &&
__xx[0] == 0 case, one can do then either 32x32->64 or 64x64->64 multiplication
and be done with it.  But if it is rare in random's usage, it would just make
the code larger.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #10 from Jakub Jelinek  ---
As discussed on IRC, probably the
if (!__builtin_mul_overflow(__l._M_lo, __x, &__lo))
optimization isn't a good idea, because most likely the compiler will in that
case
expand that to a full 64x64->128 multiplication plus testing that the upper 64
bits are or aren't zero, so computing everything this function needs and doing
it again if there is overflow, without possibility to query what it computed.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #9 from Jonathan Wakely  ---
Oops yes!

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #8 from Jakub Jelinek  ---
if (!__l._M_hi)
  {
__l._M_lo %= __m;
return __l;
  }
auto __n = __l._M_hi ? __builtin_clzll(__l._M_hi)
   : 64 + __builtin_clzll(__l._M_lo);
So, on the __n initializer you already know that __l._M_hi is non-zero, no need
to test it again.  Sure, the compiler will optimize it, but the shorter the
method in the header the better.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

Jonathan Wakely  changed:

   What|Removed |Added

  Attachment #57346|0   |1
is obsolete||

--- Comment #7 from Jonathan Wakely  ---
Created attachment 57347
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57347=edit
With Jakub's suggested changes

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #6 from Jonathan Wakely  ---
(In reply to Jakub Jelinek from comment #4)
> Why not just
> __l._M_hi += __builtin_uaddll_overflow(__l._M_lo, __c,
> &__l._M_lo);
> and similarly for subtraction?

No good reason - I'll change it.

> Is the reason for using the clang compat builtins instead of
> __builtin_{add,sub,mul}_overflow the compatibility with clang versions which
> don't
> support these?

Again, no reason, I was just using the ones specific to the type because all
the types are fixed.

Thanks for the suggestions.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

Jonathan Wakely  changed:

   What|Removed |Added

  Attachment #57345|0   |1
is obsolete||

--- Comment #5 from Jonathan Wakely  ---
Created attachment 57346
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57346=edit
Improved operator%

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #4 from Jakub Jelinek  ---
(In reply to Jonathan Wakely from comment #1)
> Created attachment 57345 [details]
> Use 64-bit integers to do 128-bit arithmetic
> 
> This patch defines a custom type that implements the necessary 128-bit
> arithmetic for linear_congruential_engine without __int128.
> 
> It's an order of magnitude slower than using __int128 natively on x86_64,
> but it gives the right results.

if (__builtin_uaddll_overflow(__l._M_lo, __c, &__l._M_lo))
  __l._M_hi++;

Why not just
__l._M_hi += __builtin_uaddll_overflow(__l._M_lo, __c, &__l._M_lo);
and similarly for subtraction?
Is the reason for using the clang compat builtins instead of
__builtin_{add,sub,mul}_overflow the compatibility with clang versions which
don't
support these?

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #3 from Jonathan Wakely  ---
Based on that, I'm not sure what libc++ does is actually better than what
libstdc++ does today (i.e. refuse to compile if the results would be wrong).

I think the patch above would make sense though. I'm sure it could be
optimized, I was more concerned with getting the results right than with
performance.

The first loop in operator% could just be a single left shift by the difference
in the number of leading zeros between __l and __x. And halve() should just be
a right shift (and "halve" isn't a reserved name).

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #2 from Jonathan Wakely  ---
(In reply to Lewis Fox from comment #0)
> libc++'s
> implementation of linear_congruential_engine (though libc++ incorrectly uses
> Schrage's method).

I haven't checked what libc++ actually does, but it gives consistent results
for -m32 and -m64. However, apart from the first one, its results don't match
what libstdc++'s.

Libstdc++ and Boost give the same results:

864691128455147579
20120904253298372
499698276788149646
1283859513473858339
2211358525164835637
1565794316594927048
433037752037085551
3876224293909804757
842334705173512733
614226679431558764
3068544072891686009
780734117187050918
1963664848033545542
3791874062716605653
38634810437006
1377239172204792920
1808622055437870614
464217244571328665
2604268511046649463
2831603438105397950
3223716036697852745
784694016100218461
1453099037541982349
2259076772827085306
2590915097279144408
3831773502251465123
320081643915990134
2376324005742734684
2037587339862127157
893313058057432898
118956510249458009
973816490706440657
2982490375421836301
1236945357523402079
3905294429470053119
2304284986302062495
2454379621599061616
1553811809817066890
2904663527490058979
2904663527490058979
2904663527490058979
2904663527490058979
2904663527490058979
2904663527490058979
2904663527490058979
etc.

Libc++ gets stuck  oscillating between two values right away:

864691128455147579
1965622972376959002
4035225266123976763
1965622972376959002
4035225266123976763
1965622972376959002
4035225266123976763
1965622972376959002
4035225266123976763
1965622972376959002
4035225266123976763
1965622972376959002
4035225266123976763
1965622972376959002

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2024-02-06 Thread redi at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

--- Comment #1 from Jonathan Wakely  ---
Created attachment 57345
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57345=edit
Use 64-bit integers to do 128-bit arithmetic

This patch defines a custom type that implements the necessary 128-bit
arithmetic for linear_congruential_engine without __int128.

It's an order of magnitude slower than using __int128 natively on x86_64, but
it gives the right results.

[Bug libstdc++/87744] Some valid instantiations of linear_congruential_engine produce compiler errors when __int128 isn't available

2018-10-25 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87744

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2018-10-25
 Ever confirmed|0   |1