from:"pdimov at gmail dot com"

[Bug c++/114986] New: Seemingly incorrect "ignoring packed attribute" warning

2024-05-08 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114986

Bug ID: 114986
   Summary: Seemingly incorrect "ignoring packed attribute"
warning
   Product: gcc
   Version: 14.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The following program
```
#include <cstdint>
#include <cstddef>

struct uuid
{
    std::uint8_t data[ 16 ] = {};
};

struct __attribute__((packed)) X
{
    uuid a;
    unsigned char b;
    unsigned c;
    unsigned char d;
};

static_assert( offsetof(X, c) == 17 );
static_assert( sizeof(X) == 22 );
```
(https://godbolt.org/z/WvxjM3eqn)

gives
```
<source>:11:10: warning: ignoring packed attribute because of unpacked non-POD
field 'uuid X::a'
```

However, the attribute is applied, because the static assertions pass.

If `__attribute__((packed))` is removed, the assertions (correctly) fail
(https://godbolt.org/z/hP4oG98fq).

Therefore, the warning seems wrong.

GCC 14, 13, 12 warn; 11 and earlier do not.
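
As background for the "non-POD" wording in the diagnostic, here is a minimal sketch of how `uuid` looks to the standard type traits. It assumes (the report does not confirm this) that the classification is related to the default member initializer, which makes the default constructor non-trivial. `plain_uuid` and the two `constexpr` flags are hypothetical names introduced only for comparison.

```cpp
#include <cstdint>
#include <type_traits>

// Same shape as `uuid` in the report: the `= {}` initializer makes the
// default constructor non-trivial.
struct uuid
{
    std::uint8_t data[ 16 ] = {};
};

// Hypothetical counterpart without the initializer, for comparison only.
struct plain_uuid
{
    std::uint8_t data[ 16 ];
};

constexpr bool uuid_is_trivial = std::is_trivial<uuid>::value;
constexpr bool plain_uuid_is_trivial = std::is_trivial<plain_uuid>::value;

// Not trivial (hence not a POD in the C++11 sense)...
static_assert( !uuid_is_trivial, "uuid has a non-trivial default constructor" );
static_assert( plain_uuid_is_trivial, "plain_uuid is trivial" );

// ...but still standard-layout, trivially copyable, and byte-aligned, so
// nothing in its layout needs more than 1-byte alignment.
static_assert( std::is_standard_layout<uuid>::value, "" );
static_assert( std::is_trivially_copyable<uuid>::value, "" );
static_assert( sizeof(uuid) == 16 && alignof(uuid) == 1, "" );
```

If that assumption holds, dropping the `= {}` initializer would be one way to check whether the warning is tied to triviality rather than to layout.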

[Bug libstdc++/114865] [13/14/15 Regression] std::atomic::compare_exchange_strong seems to hang under GCC 13 for C++11

2024-05-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

--- Comment #19 from Peter Dimov  ---
This should work.

I still don't understand why JF so insisted on all these padding shenanigans.

[Bug libstdc++/114865] [13/14/15 Regression] std::atomic::compare_exchange_strong seems to hang under GCC 13 for C++11

2024-04-26 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

--- Comment #13 from Peter Dimov  ---
(In reply to Andrew Pinski from comment #10)
> #if __cplusplus >= 201402L && __has_builtin(__builtin_clear_padding)
> if _GLIBCXX17_CONSTEXPR (__atomic_impl::__maybe_has_padding<_Tp>())
>   __builtin_clear_padding(std::__addressof(_M_i));
> #endif
> 
> So yes it is definitely dependent on C++ level ...
> That is for C++14+ it is working correctly.

Oh, that's the constructor of `atomic`. I thought it was the compiler
initializing the padding in C++14 and above.

I wonder why `__cplusplus >= 201402L` is here.

[Bug libstdc++/114865] [13/14/15 Regression] std::atomic::compare_exchange_strong seems to hang under GCC 13 for C++11

2024-04-26 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

--- Comment #11 from Peter Dimov  ---
So, basically, C++14 and above initialize the padding of

```
std::atomic state{{ 0, 0x }};
```

in `main` to zero, which masks the problem in `generate`. (The problem in
`generate` still exists because the assembly is identical - it just doesn't
trigger because the padding is zero. If we manually poke something nonzero into
the padding, it would (ought to) still break.)

Static variables work for the same reason - the padding is guaranteed zero.

[Bug libstdc++/114865] [13/14/15 Regression] std::atomic::compare_exchange_strong seems to hang under GCC 13 on Ubuntu 23.04

2024-04-26 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

--- Comment #9 from Peter Dimov  ---
Oh, my mistake. C++14 does mov QWORD, and C++11 does mov WORD.

[Bug libstdc++/114865] [13/14/15 Regression] std::atomic::compare_exchange_strong seems to hang under GCC 13 on Ubuntu 23.04

2024-04-26 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

--- Comment #8 from Peter Dimov  ---
(In reply to Andrew Pinski from comment #6)
> No it is dependent on the standard level. C++11 fails but C++14, C++17 and
> C++20 all pass.

That's interesting because I see basically no difference in the generated code
on CE between 11 and 14:

--- "C++-x86-64 gcc 13.2-1 (1).asm" 2024-04-27 03:25:11.149385400 +0300
+++ "C++-x86-64 gcc 13.2-1.asm" 2024-04-27 03:24:59.207244400 +0300
@@ -76,10 +76,11 @@
 jmp .L8
 main:
 push rbx
+mov eax, 8738
 mov ebx, 1024
 sub rsp, 16
 mov QWORD PTR [rsp], 0
-mov QWORD PTR [rsp+8], 8738
+mov WORD PTR [rsp+8], ax
 .L14:
 mov rdi, rsp
 call generate(std::atomic*)

(https://godbolt.org/z/nexn5W4Ph)

[Bug libstdc++/114865] [13/14/15 Regression] std::atomic::compare_exchange_strong seems to hang under GCC 13 on Ubuntu 23.04

2024-04-26 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

--- Comment #4 from Peter Dimov  ---
This

https://raw.githubusercontent.com/boostorg/uuid/feature/gcc_pr_114865/test/test_gcc_pr114865.cpp

exhibits the problem for me on GCC 13/14. I'm only seeing the hang with
-std=c++11 -m32 in the CI run because this combination is tested first, but I
believe it's independent of standard level and address model.

[Bug libstdc++/114865] std::atomic::compare_exchange_strong seems to hang under GCC 13 on Ubuntu 23.04

2024-04-26 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

--- Comment #2 from Peter Dimov  ---
The issue is also present for GCC 14 on Ubuntu 24.04:

https://github.com/boostorg/uuid/actions/runs/8853249656/job/24313667955

[Bug libstdc++/114865] std::atomic::compare_exchange_strong seems to hang under GCC 13 on Ubuntu 23.04

2024-04-26 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

--- Comment #1 from Peter Dimov  ---
> The problem may well be inside libatomic, of course; I have no way to tell.

Narrator: but he did, in fact, have a way to tell.

This is a GHA run with GCC 9 to 13 tested on both Ubuntu 23.04 and Ubuntu
23.10, and only GCC 13 hangs.

https://github.com/boostorg/uuid/actions/runs/8852038822/job/24309866206

And indeed, the codegen from GCC 12 (https://godbolt.org/z/xc7oT76fn) is
radically different from (and much simpler than) the GCC 13 one
(https://godbolt.org/z/eP48xv3Mr).

I think that the problem has been introduced with this commit:

https://github.com/gcc-mirror/gcc/commit/157236dbd621644b3cec50b6cf38811959f3e78c

which, ironically enough, was supposed to improve the handling of types with
padding bits, but broke them entirely.

(I told the committee there was nothing wrong with compare_exchange as
specified, but did they listen to me? To ask the question is to answer it.)

[Bug libstdc++/114865] New: std::atomic::compare_exchange_strong seems to hang under GCC 13 on Ubuntu 23.04

2024-04-26 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114865

Bug ID: 114865
   Summary: std::atomic::compare_exchange_strong seems to hang
under GCC 13 on Ubuntu 23.04
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

I'm getting weird hangs on GitHub Actions when using
`std::atomic<state_type>::compare_exchange_strong` under GCC 13 on Ubuntu 23.04
(only; GCC 12 and earlier on Ubuntu 22.04 and earlier work). `state_type` is
defined as

```
struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
};
```

and the code doing the CAS is

```
auto oldst = ps_->load( std::memory_order_relaxed );

for( ;; )
{
    auto newst = get_new_state( oldst );

    if( ps_->compare_exchange_strong( oldst, newst, std::memory_order_relaxed,
        std::memory_order_relaxed ) )
    {
        state_ = newst;
        break;
    }
}
```

where `ps_` is of type `std::atomic<state_type>*`.

At a glance, I see nothing immediately wrong with the generated code
(https://godbolt.org/z/8Ee3hrTz8).

However, when I change `state_type` to

```
struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
    std::uint16_t padding[ 3 ];
};
```
the hangs disappear. This leads me to think that the problem is caused by the
original struct having padding, which isn't being handled correctly for some
reason.
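
Whether a struct has padding can be checked at compile time. The following is a minimal sketch using C++17's `std::has_unique_object_representations`; `padded_state_type` and the two flags are names introduced here for illustration, and the assertions assume a common ABI where `std::uint64_t` is aligned to at least 4 bytes.

```cpp
#include <cstdint>
#include <type_traits>

// The original struct from the report: 10 bytes of members, so the
// compiler has to add tail padding to reach a multiple of the alignment.
struct state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
};

// The variant with the padding spelled out as a member.
struct padded_state_type
{
    std::uint64_t timestamp;
    std::uint16_t clock_seq;
    std::uint16_t padding[ 3 ];
};

// A type has unique object representations only if it has no padding bits.
constexpr bool original_has_padding =
    !std::has_unique_object_representations_v<state_type>;

constexpr bool explicit_has_padding =
    !std::has_unique_object_representations_v<padded_state_type>;

static_assert( original_has_padding, "state_type is expected to have padding" );
static_assert( !explicit_has_padding, "padded_state_type should have none" );
```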

As we know, `std::atomic::compare_exchange_strong` is carefully specified to
take and return `expected` by reference, such that it can both compare the
entire object as if via `memcmp` (including the padding), and return it as if
by `memcpy`, again including the padding. Even though the padding bits of the
initial value returned by the atomic load are unspecified, at most one
iteration of the loop would be required for the padding bits to converge and
for the CAS to succeed.
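
The loop shape described above can be sketched in a self-contained form. This is only an illustration, not the Boost.Uuid code: `small_state`, `next_state`, and `advance` are hypothetical names, and the struct is deliberately padding-free and 8 bytes wide so the sketch does not depend on the behavior under discussion.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical 8-byte state with no padding bytes.
struct small_state
{
    std::uint32_t timestamp;
    std::uint16_t clock_seq;
    std::uint16_t reserved;
};

// Stand-in for get_new_state: derive the next state from the current one.
inline small_state next_state( small_state s )
{
    ++s.timestamp;
    return s;
}

// CAS loop: on failure, compare_exchange_strong writes the value it found
// into `oldst`, so the next iteration recomputes from fresh data.
inline small_state advance( std::atomic<small_state>& a )
{
    small_state oldst = a.load( std::memory_order_relaxed );

    for( ;; )
    {
        small_state newst = next_state( oldst );

        if( a.compare_exchange_strong( oldst, newst,
            std::memory_order_relaxed, std::memory_order_relaxed ) )
        {
            return newst;
        }
    }
}
```

With no other thread modifying `a`, the first CAS succeeds; the retry path only matters under contention or, as argued above, when padding bits differ on the first attempt.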

However, going by the symptoms alone, this doesn't seem to be the case here.

The problem may well be inside libatomic, of course; I have no way to tell.

One GHA run showing the issue is
https://github.com/boostorg/uuid/actions/runs/8821753835, where only the GCC 13
job times out.

[Bug c++/113256] New: False -Wdangling-reference positive

2024-01-06 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113256

Bug ID: 113256
   Summary: False -Wdangling-reference positive
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The following code
```
#include <utility>
#include <cassert>

template<class M, class T, class A> auto bind( M T::* pm, A a )
{
    return [=]( auto&& x ) -> M const& { return x.*pm; };
}

template<int I> struct arg {};

arg<1> _1;

int main()
{
    std::pair<int, int> pair;
    int const& x = bind( &std::pair<int, int>::first, _1 )( pair );
    assert( &x == &pair.first );
}
```
(https://godbolt.org/z/a555MMTqo)

(reduced from a Boost.Bind test case)

causes

```
<source>:16:16: warning: possibly dangling reference to a temporary
[-Wdangling-reference]
   16 |     int const& x = bind( &std::pair<int, int>::first, _1 )( pair );
      |                ^
```

with GCC 13.2 (but not trunk).

There are indeed two temporaries created in that full expression, but `int
const&` can't possibly bind to any of them.

[Bug libstdc++/113200] std::char_traits<char>::move is not constexpr when the argument is a string literal

2024-01-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113200

--- Comment #4 from Peter Dimov  ---
I didn't notice your subsequent comment, sorry. :-)

[Bug libstdc++/113200] std::char_traits<char>::move is not constexpr when the argument is a string literal

2024-01-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113200

--- Comment #3 from Peter Dimov  ---
I think that the compiler is correct; string literal address comparisons aren't
constant expressions. Clang gives the same error:
https://godbolt.org/z/xPWEf4z63.

[Bug libstdc++/113200] New: std::char_traits<char>::move is not constexpr when the argument is a string literal

2024-01-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113200

Bug ID: 113200
   Summary: std::char_traits<char>::move is not constexpr when the
argument is a string literal
   Product: gcc
   Version: 13.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

```
#include <string>
#include <cassert>

template<std::size_t N> struct S
{
    char data_[ N ];

    using traits_type = std::char_traits<char>;

    constexpr S( char const* p ): data_{}
    {
        std::size_t n = traits_type::length( p );

        assert( n < N );

        traits_type::move( data_, p, n + 1 );
    }
};

template<std::size_t N> S( char const(&)[N] ) -> S<N>;

constexpr S s( "test" );
```
(https://godbolt.org/z/PofY8MP6G)

fails with

```
In file included from
/opt/compiler-explorer/gcc-trunk-20240102/include/c++/14.0.0/string:42,
 from <source>:1:
<source>:22:23:   in 'constexpr' expansion of 'S<5>(((const char*)"test"))'
<source>:16:26:   in 'constexpr' expansion of
'std::char_traits<char>::move(((char*)(&((S<5>*)this)->S<5>::data_)), p, (n +
1))'
/opt/compiler-explorer/gcc-trunk-20240102/include/c++/14.0.0/bits/char_traits.h:423:50:
  in 'constexpr' expansion of '__gnu_cxx::char_traits<char>::move(__s1, __s2,
__n)'
/opt/compiler-explorer/gcc-trunk-20240102/include/c++/14.0.0/bits/char_traits.h:230:20:
error: '(((const __gnu_cxx::char_traits<char>::char_type*)(& s.S<5>::data_)) ==
((const char*)"test"))' is not a constant expression
  230 |   if (__s1 == __s2) // unlikely, but saves a lot of work
  |   ~^~~
```

(Reduced from a similar failure in Boost.StaticString.)
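
For a reduced case like the one above, one way to sidestep the pointer comparison inside `move` is to copy element by element. This is a sketch of that idea only, not the fix adopted by libstdc++ or Boost.StaticString; `S2` and `s2` are hypothetical names, and the sketch does not diagnose a source string longer than `N - 1` characters.

```cpp
#include <cstddef>

template<std::size_t N> struct S2
{
    char data_[ N ];

    constexpr S2( char const* p ): data_{}
    {
        // Copy up to and including the terminator, never writing past
        // data_[ N - 1 ]. No pointer comparison is involved, so the loop
        // can be evaluated at compile time (C++14 and later).
        for( std::size_t i = 0; i < N; ++i )
        {
            data_[ i ] = p[ i ];
            if( p[ i ] == '\0' ) break;
        }
    }
};

constexpr S2<5> s2( "test" );

static_assert( s2.data_[ 0 ] == 't', "first character copied" );
static_assert( s2.data_[ 4 ] == '\0', "terminator copied" );
```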

[Bug libstdc++/113099] locale without RTTI uses dynamic_cast before gcc 13.2 or has ODR violation since gcc 13.2

2023-12-24 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113099

--- Comment #10 from Peter Dimov  ---
Maybe the right thing to do is to use dynamic_cast only for virtual inheritance
(either have a trait or check whether static_cast isn't a valid expression),
otherwise static_cast, in both cases (standard and user-defined Facet.)

[Bug libstdc++/113099] locale without RTTI uses dynamic_cast before gcc 13.2 or has ODR violation since gcc 13.2

2023-12-24 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113099

--- Comment #7 from Peter Dimov  ---
You don't necessarily need dynamic_cast because facets are always installed and
obtained by their exact type, not via a reference to base. You can store the
Facet* as given (like shared_ptr does), and return it.

The only reason dynamic_cast is needed here is because you can't static_cast
from facet* to Facet* when virtual inheritance is involved. But you are not required to
store facet* in the actual container; you can store the original Facet* as
void*.
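
The idea of storing the original pointer as `void*` can be sketched as follows. This is an illustration under stated assumptions, not libstdc++'s `std::locale` implementation: `facet_base`, `my_facet`, and `facet_table` are hypothetical, the address of a static `id` member stands in for `std::locale::id`, and ownership is not handled.

```cpp
#include <map>

// Hypothetical base, standing in for std::locale::facet.
struct facet_base
{
    virtual ~facet_base() {}
};

// A facet that uses virtual inheritance: static_cast from facet_base*
// to my_facet* would be ill-formed here.
struct my_facet: virtual facet_base
{
    static int id; // only its address is used, as a per-type key
    int value;

    explicit my_facet( int v ): value( v ) {}
};

int my_facet::id = 0;

class facet_table
{
    std::map<void const*, void*> slots_;

public:

    // Store the pointer exactly as given, with its static type erased.
    template<class Facet> void install( Facet* f )
    {
        slots_[ &Facet::id ] = f;
    }

    // Cast back to the same type that was stored. A void* -> Facet*
    // static_cast is valid because the pointer started out as a Facet*.
    template<class Facet> Facet* get() const
    {
        auto it = slots_.find( &Facet::id );
        return it == slots_.end()? nullptr: static_cast<Facet*>( it->second );
    }
};
```

Neither `dynamic_cast` nor RTTI is needed, because the type used for lookup is always the exact type used for installation.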

[Bug c++/86355] [10 Regression] Internal compiler error with pack expansion and fold expression

2023-07-07 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86355

--- Comment #14 from Peter Dimov  ---
Should I open another bug for the failure to compile the original example?

[Bug c++/110476] constexpr floating point regression with -std=c++XX

2023-06-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110476

--- Comment #2 from Peter Dimov  ---
Discussion of FLT_EVAL_METHOD notwithstanding, I think that this behavior is
not allowed by https://eel.is/c++draft/lex.fcon#3.

"If the scaled value is not in the range of representable values for its type,
the program is ill-formed. Otherwise, the value of a floating-point-literal is
the scaled value if representable, else the larger or smaller representable
value nearest the scaled value, chosen in an implementation-defined manner."

I don't see any license here for the value of 3.14f to be 3.14L.

[Bug c++/110477] -fexcess-precision=standard not applied consistently

2023-06-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110477

--- Comment #8 from Peter Dimov  ---
As I commented on the duplicate bug, I don't think this behavior is allowed by
https://eel.is/c++draft/lex.fcon#3.

"If the scaled value is not in the range of representable values for its type,
the program is ill-formed. Otherwise, the value of a floating-point-literal is
the scaled value if representable, else the larger or smaller representable
value nearest the scaled value, chosen in an implementation-defined manner."

I don't see any license here for the value of 3.14f to be 3.14L.

[Bug target/108742] Incorrect constant folding with (or exposed by) -fexcess-precision=standard

2023-06-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108742

--- Comment #13 from Peter Dimov  ---
I think that https://eel.is/c++draft/lex.fcon#3 disagrees.

"If the scaled value is not in the range of representable values for its type,
the program is ill-formed. Otherwise, the value of a floating-point-literal is
the scaled value if representable, else the larger or smaller representable
value nearest the scaled value, chosen in an implementation-defined manner."

I don't see any license here for the value of 3.14f to be 3.14L.

[Bug c++/110477] -fexcess-precision=standard not applied consistently

2023-06-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110477

--- Comment #6 from Peter Dimov  ---
I suppose this is unfixable because there's all sorts of code assuming that the
value of (long double)3.14 is 3.14L and not (long double)(double)3.14L.

I doubt that anyone sane expects this from (long double)3.14F, but it's not
feasible to change one but not the other.

Bafflings will continue until morale improves.

[Bug c++/110477] -fexcess-precision=standard not applied consistently

2023-06-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110477

--- Comment #3 from Peter Dimov  ---
That's true, but the normal expectation of anyone using
-fexcess-precision=standard would be for it to apply consistently everywhere
(that is, as if FLT_EVAL_METHOD is 0.)

Of course given that FLT_EVAL_METHOD is in a header, so unaffected by -f
options, it's not clear what can be done here.

[Bug c++/110476] constexpr floating point regression with -std=c++XX

2023-06-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110476

--- Comment #1 from Peter Dimov  ---
As discussed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108742, this is a
consequence of applying the FLT_EVAL_METHOD=2 rules, and can be fixed by
casting 3.14f to (float).

That's... incredibly surprising, though. 3.14f is already a float.

For context, I encountered this regression in the Boost.Variant2 test suite
when I added GCC 13 to CI.
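
The workaround mentioned above (casting the literal to `float`) would look roughly like this. It is a sketch based on the comment, not something verified here against the affected `-m32 -std=c++XX` configuration; `cast_compare_ok` is a name introduced for illustration.

```cpp
struct X
{
    float f;
};

constexpr X x{ 3.14f };

// The explicit cast is redundant as far as the type is concerned (3.14f is
// already a float); per the comment, its purpose is to force the right-hand
// side to be rounded to float before the comparison.
constexpr bool cast_compare_ok = ( x.f == static_cast<float>( 3.14f ) );

static_assert( cast_compare_ok, "x.f == (float)3.14f" );
```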

[Bug c++/110477] -fexcess-precision=standard not applied consistently

2023-06-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110477

--- Comment #1 from Peter Dimov  ---
Looks like a duplicate of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108742
and is fixed by casting the rhs to (float), but any ordinary programmer would
be baffled.

For context, I encountered this regression in the Boost.Variant2 test suite
when I added GCC 13 to CI.
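
For the runtime case, the workaround of casting the right-hand side to `float` can be sketched as below. The function `same_sum` is a hypothetical wrapper added so the comparison can be called; the claim that this fixes the mismatch under `-m32 -std=c++XX` comes from the comment above and is not verified here.

```cpp
float f( float x, int y )
{
    return x + y;
}

// Compare against the sum explicitly rounded to float, so both sides have
// gone through the same float rounding step regardless of excess precision.
bool same_sum()
{
    return f( 3.14f, 1 ) == static_cast<float>( 3.14f + 1 );
}
```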

[Bug c++/110477] New: -fexcess-precision=standard not applied consistently

2023-06-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110477

Bug ID: 110477
   Summary: -fexcess-precision=standard not applied consistently
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The following program

float f( float x, int y )
{
    return x + y;
}

int main()
{
    return f( 3.14f, 1 ) == 3.14f + 1;
}

returns different values with -std=c++XX (https://godbolt.org/z/8dK98ondM) and
-std=gnu++XX (https://godbolt.org/z/4Y4qfsKzM) under GCC 13/14 -m32, because
-fexcess-precision=standard is not consistently applied to both sides of the
comparison. Under -fexcess-precision=fast (and hence under previous GCC
versions), the comparison always succeeds because both sides use excess
precision (https://godbolt.org/z/dzdoxdnM9).

This is the runtime equivalent of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110476.

[Bug c++/110476] New: constexpr floating point regression with -std=c++XX

2023-06-28 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110476

Bug ID: 110476
   Summary: constexpr floating point regression with -std=c++XX
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The following program

#define STATIC_ASSERT(...) static_assert(__VA_ARGS__, #__VA_ARGS__)

struct X
{
    float f;
};

int main()
{
    constexpr X x{ 3.14f };
    STATIC_ASSERT( x.f == 3.14f );
}

fails under GCC 13/14 with

<source>: In function 'int main()':
<source>:11:24: error: static assertion failed: x.f == 3.14f
   11 | STATIC_ASSERT( x.f == 3.14f );
  |^~~~
<source>:1:42: note: in definition of macro 'STATIC_ASSERT'
1 | #define STATIC_ASSERT(...) static_assert(__VA_ARGS__, #__VA_ARGS__)
  |  ^~~
<source>:11:24: note: the comparison reduces to '(3.1410490417480469e+0l ==
3.141e+0l)'
   11 | STATIC_ASSERT( x.f == 3.14f );
  |^~~~

when compiled with -m32 -std=c++XX under x86 (https://godbolt.org/z/Ghs7j5Teq).
The reason is that -std=c++XX implies -fexcess-precision=standard
(https://godbolt.org/z/zx4rn4j5W).

Previous versions worked fine.

[Bug target/110096] Would be nice if __builtin_ia32_pause had a portable equivalent as it's applicable to ARM

2023-06-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110096

--- Comment #13 from Peter Dimov  ---
Even if we assume that WFE on lock (and SEV on unlock) is the correct approach
on ARM instead of YIELD (though this seems very much domain-specific, depending
on the expected amount of contention and who knows what else), isn't the
existence of pause/yield instructions on MIPS, POWER, and apparently RISC-V (*)
enough further evidence in favor of having a portable intrinsic for emitting
such an instruction?

(*) https://doc.rust-lang.org/src/core/hint.rs.html#178-191 (implementation of
https://doc.rust-lang.org/std/hint/fn.spin_loop.html)

[Bug target/110096] Would be nice if __builtin_ia32_pause had a portable equivalent as it's applicable to ARM

2023-06-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110096

--- Comment #9 from Peter Dimov  ---
I don't think I want WFE here, based on what I read about it. Putting the core
to sleep seems like something to do in an embedded system where I have full
control of what cores do, not something to do on the application level, in a
portable C++ library.

[Bug target/110096] Would be nice if __builtin_ia32_pause had a portable equivalent as it's applicable to ARM

2023-06-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110096

--- Comment #7 from Peter Dimov  ---
These intrinsics are typically used in spinlocks as in
```
while( flag_.test_and_set() )
{
    // issue a power-saving NOP here
}
```
(where `flag_` is `std::atomic_flag`) and this use is generic and not
target-dependent.
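
A fuller sketch of that usage, with the target-specific part isolated in one helper, might look like this. `spin_pause`, `SKETCH_HAS_IA32_PAUSE`, and `spinlock` are hypothetical names; the ARM detection is a simplified version of the condition quoted in the original report and may well be incomplete.

```cpp
#include <atomic>

// Detect the x86 builtin without tripping compilers that lack __has_builtin.
#if defined(__has_builtin)
# if __has_builtin(__builtin_ia32_pause)
#  define SKETCH_HAS_IA32_PAUSE 1
# endif
#endif

// Emit a spin-wait hint where one is known; otherwise do nothing.
inline void spin_pause() noexcept
{
#if defined(SKETCH_HAS_IA32_PAUSE)
    __builtin_ia32_pause();
#elif defined(__GNUC__) && ( defined(__aarch64__) || ( defined(__ARM_ARCH) && __ARM_ARCH >= 8 ) )
    __asm__ __volatile__( "yield" : : : "memory" );
#endif
}

class spinlock
{
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;

public:

    bool try_lock() noexcept
    {
        return !flag_.test_and_set( std::memory_order_acquire );
    }

    void lock() noexcept
    {
        // The loop itself is generic; only spin_pause() is target-dependent.
        while( flag_.test_and_set( std::memory_order_acquire ) )
        {
            spin_pause();
        }
    }

    void unlock() noexcept
    {
        flag_.clear( std::memory_order_release );
    }
};
```

A portable builtin would let `spin_pause` collapse to a single call with no preprocessor conditions.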

[Bug target/110096] Would be nice if __builtin_ia32_pause had a portable equivalent as it's applicable to ARM

2023-06-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110096

--- Comment #5 from Peter Dimov  ---
This works for the specific case of ARM, even though I don't find it
substantially better than just using `asm("yield")`, but the benefit of having
a portable intrinsic for this functionality is that as such instructions are
added to targets and GCC gains support of them (as has happened with ARM), code
would automatically take advantage of them, without having to acquire new
ifdefs for each supported target.

[Bug target/110096] Would be nice if __builtin_ia32_pause had a portable equivalent as it's applicable to ARM

2023-06-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110096

--- Comment #3 from Peter Dimov  ---
How does the user know when to include `arm_acle.h`?

[Bug target/110096] New: Would be nice if __builtin_ia32_pause had a portable equivalent as it's applicable to ARM

2023-06-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110096

Bug ID: 110096
   Summary: Would be nice if __builtin_ia32_pause had a portable
equivalent as it's applicable to ARM
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

This is more of a feature request than a bug.

Currently `__builtin_ia32_pause()` only applies to Intel/AMD CPUs (hence the
`ia32` in the name), but it has a straightforward and equivalent meaning for
ARM (issue a YIELD instruction, which is the exact ARM equivalent to the PAUSE
x86 one.)

This forces us to do things like

```
#if __has_builtin(__builtin_ia32_pause)

__builtin_ia32_pause();

#elif defined(__GNUC__) && ( (defined(__ARM_ARCH) && __ARM_ARCH >= 8) || defined(__ARM_ARCH_8A__) || defined(__aarch64__) )

__asm__ __volatile__( "yield" : : : "memory" );

// ...
```

(E.g.
https://github.com/boostorg/core/blob/3b96d237c0e3ada30c9beca0f60062a2576dcafd/include/boost/core/detail/sp_thread_pause.hpp)

This can be solved in one of two ways; one, extend `__builtin_ia32_pause` to do
the right thing for ARM - unprincipled because of ia32 in the name, but will
automagically "fix" all code using `#if __has_builtin(__builtin_ia32_pause)`.

Or two, add a portable spelling for the intrinsic, either `__builtin_pause()`
or `__builtin_yield()`.

(Failing that, an ARM-specific `__builtin_arm_yield()` would still be an
improvement over the above because it at least will allow us to not hardcode
the ARM target detection, which we are probably getting wrong.)

[Bug c++/109985] New: __builtin_prefetch ignored by GCC 12/13

2023-05-26 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109985

Bug ID: 109985
   Summary: __builtin_prefetch ignored by GCC 12/13
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

We are investigating a Boost.Unordered performance regression with GCC 12,
on the following benchmark:

https://github.com/boostorg/boost_unordered_benchmarks/blob/4c717baac1bff8d3e51cb8485b72bbb63d533265/scattered_lookup.cpp

and it looks like the reason is that GCC 12 (and 13) ignore a call to
`__builtin_prefetch`.

While GCC 11 generates this:

```
.L108:
mov r8, r12
movdqa  xmm0, xmm1
sal r8, 4
lea r14, [r10+r8]
pcmpeqb xmm0, XMMWORD PTR [r14]
pmovmskb edx, xmm0
and edx, 32767
je  .L104
sub r8, r12
sal r8, 4
add r8, QWORD PTR [rbx+32]
prefetcht0  [r8]
.L106:
xor r15d, r15d
rep bsf r15d, edx
movsx   r15, r15d
sal r15, 4
add r15, r8
cmp rsi, QWORD PTR [r15]
jne .L144
add r9, QWORD PTR [r15+8]
mov rax, rdi
cmp r11, rdi
jne .L145
```
(https://godbolt.org/z/d663fdM16 - prefetcht0 [r8] right before L106)

GCC 12 generates this in the same function:
```
.L108:
mov r8, r10
movdqa  xmm0, xmm1
sal r8, 4
lea r9, [rbp+0+r8]
pcmpeqb xmm0, XMMWORD PTR [r9]
pmovmskb edx, xmm0
and edx, 32767
je  .L104
mov rdi, QWORD PTR [rsp+16]
sub r8, r10
mov QWORD PTR [rsp+24], rax
sal r8, 4
mov rdi, QWORD PTR [rdi+32]
mov QWORD PTR [rsp+8], rdi
mov rax, rdi
.L106:
xor edi, edi
rep bsf edi, edx
movsx   rdi, edi
sal rdi, 4
add rdi, r8
add rdi, rax
cmp r11, QWORD PTR [rdi]
jne .L143
add rsi, 8
add rbx, QWORD PTR [rdi+8]
cmp r12, rsi
jne .L109
```
(https://godbolt.org/z/T7csq7TPz - no prefetcht0 instruction before L106)

Simplifying this code unfortunately leads to the prefetcht0 being generated.
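
For readers unfamiliar with the builtin, this is the general pattern in which `__builtin_prefetch` is typically used: request a cache line shortly before it is read. The sketch is unrelated to the Boost.Unordered code in the benchmark; `prefetch_hint` and `gather_sum` are hypothetical names, and a loop this small is exactly the kind of simplified case where, per the report, the instruction does get emitted.

```cpp
#include <cstddef>
#include <cstdint>

// Wrap the builtin so the rest of the code compiles on other compilers too.
inline void prefetch_hint( void const* p ) noexcept
{
#if defined(__GNUC__)
    __builtin_prefetch( p );
#else
    (void)p;
#endif
}

// Sum table[ idx[i] ] for scattered indices, prefetching the element needed
// by the next iteration while the current one is being processed.
inline std::uint64_t gather_sum( std::uint64_t const* table,
    std::size_t const* idx, std::size_t n )
{
    std::uint64_t sum = 0;

    for( std::size_t i = 0; i < n; ++i )
    {
        if( i + 1 < n )
        {
            prefetch_hint( &table[ idx[ i + 1 ] ] );
        }

        sum += table[ idx[ i ] ];
    }

    return sum;
}
```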

[Bug libstdc++/108952] Regression in uses_allocator_construction_args for pair of rvalue references

2023-02-27 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108952

Peter Dimov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pdimov at gmail dot com

--- Comment #4 from Peter Dimov  ---
An easy fix is to use `std::get<0>` instead of `.first`.
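
The difference between the two spellings can be checked with type traits. This sketch presumes the regression involves the value category of the member access, which the comment does not spell out; `rref_pair` and the two flags are names introduced for illustration.

```cpp
#include <type_traits>
#include <utility>

using rref_pair = std::pair<int&&, int&&>;

// Naming a member of reference type always yields an lvalue, even when the
// pair itself is an rvalue.
constexpr bool first_is_lvalue = std::is_same<
    decltype(( std::declval<rref_pair>().first )), int&>::value;

// std::get<0> on an rvalue pair keeps the rvalue reference.
constexpr bool get_keeps_rvalue = std::is_same<
    decltype( std::get<0>( std::declval<rref_pair>() ) ), int&&>::value;

static_assert( first_is_lvalue, ".first is an lvalue of type int" );
static_assert( get_keeps_rvalue, "std::get<0> yields int&&" );
```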

[Bug libstdc++/108836] std::mutex disappears in single-threaded libstdc++ builds

2023-02-17 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108836

--- Comment #4 from Peter Dimov  ---
A compromise between no mutex at all, and a mutex that is silently a no-op,
could be a no-op mutex with [[deprecated]] members, although the atomic_flag is
probably better.

[Bug libstdc++/108836] std::mutex disappears in single-threaded libstdc++ builds

2023-02-17 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108836

--- Comment #2 from Peter Dimov  ---
That's good to hear, but I don't think the issue is specific to mingw32. The
other report, https://github.com/boostorg/system/issues/92, was about "B&R
PLC", whatever this means. :-)

[Bug libstdc++/108836] New: std::mutex disappears in single-threaded libstdc++ builds

2023-02-17 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108836

Bug ID: 108836
   Summary: std::mutex disappears in single-threaded libstdc++
builds
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

We've been getting reports in Boost that our uses of <mutex> and std::mutex
don't work in a single-threaded build of libstdc++, so we had to add
configuration macros to avoid these issues. One example is
https://github.com/boostorg/system/commit/53c00841fc0d892bf43cda60e3ea2f05c4362b32,
another https://github.com/boostorg/url/issues/684.

Is there a reason not to make std::mutex available in single threaded builds,
with its operations being no-ops?
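
What such a no-op mutex could look like is sketched below. `null_mutex`, `scoped_guard_sketch`, and `increment_guarded` are hypothetical names; nothing here is libstdc++ code, and the small guard class is used instead of `std::lock_guard` only to keep the sketch independent of what `<mutex>` provides in a single-threaded build.

```cpp
// A mutex whose operations do nothing, for builds with a single thread.
struct null_mutex
{
    void lock() noexcept {}
    bool try_lock() noexcept { return true; }
    void unlock() noexcept {}
};

// Minimal RAII guard, generic over the mutex type.
template<class Mutex> class scoped_guard_sketch
{
    Mutex& m_;

public:

    explicit scoped_guard_sketch( Mutex& m ): m_( m ) { m_.lock(); }
    ~scoped_guard_sketch() { m_.unlock(); }

    scoped_guard_sketch( scoped_guard_sketch const& ) = delete;
    scoped_guard_sketch& operator=( scoped_guard_sketch const& ) = delete;
};

// Code written against the mutex interface compiles unchanged.
inline int increment_guarded( int& counter, null_mutex& m )
{
    scoped_guard_sketch<null_mutex> guard( m );
    return ++counter;
}
```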

[Bug c++/100157] Support `__type_pack_element` like Clang

2022-11-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100157

Peter Dimov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pdimov at gmail dot com

--- Comment #11 from Peter Dimov  ---
Just import mp11 wholesale and use mp_at_c and mp_find :-)

[Bug target/107590] __atomic_test_and_set broken on PowerPC

2022-11-10 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107590

--- Comment #9 from Peter Dimov  ---
The easiest way to reproduce the issue is with the following test:

https://github.com/boostorg/smart_ptr/blob/c577d68b0272fd0bddc88ea60a8db07219391589/test/spinlock_test.cpp

This crashes because - presumably - sp2 is on an odd address.

The definition of the spinlock class is here:

https://github.com/boostorg/smart_ptr/blob/c577d68b0272fd0bddc88ea60a8db07219391589/include/boost/smart_ptr/detail/spinlock_gcc_atomic.hpp

[Bug target/107590] __atomic_test_and_set broken on PowerPC

2022-11-10 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107590

Peter Dimov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pdimov at gmail dot com

--- Comment #7 from Peter Dimov  ---
The spinlock is indeed using an `unsigned char`:

https://github.com/boostorg/smart_ptr/blob/c577d68b0272fd0bddc88ea60a8db07219391589/include/boost/smart_ptr/detail/spinlock_gcc_atomic.hpp#L33

That's because `__atomic_test_and_set` is documented to work on either `bool`
or `char`:

https://gcc.gnu.org/onlinedocs/gcc/extensions-to-the-c-language-family/built-in-functions-for-memory-model-aware-atomic-operations.html#_CPPv421__atomic_test_and_setPvi

"bool __atomic_test_and_set(void *ptr, int memorder)

This built-in function performs an atomic test-and-set operation on the byte at
*ptr. The byte is set to some implementation defined nonzero ‘set’ value and
the return value is true if and only if the previous contents were ‘set’. It
should be only used for operands of type bool or char. For other types only
part of the value may be set."

I don't see an alignment requirement being mentioned here.
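
A minimal sketch of a byte-sized spinlock built on these builtins, loosely modeled on the pattern in the linked header but not copied from it: `byte_spinlock` and `lock_pair` are hypothetical names. Placing two locks next to each other puts the second one at an odd offset, which is the situation suspected in the crash.

```cpp
#include <cstddef>

struct byte_spinlock
{
    unsigned char v_ = 0;

    // Returns true if the lock was acquired (the byte was not already set).
    bool try_lock() noexcept
    {
        return !__atomic_test_and_set( &v_, __ATOMIC_ACQUIRE );
    }

    void lock() noexcept
    {
        while( !try_lock() ) {}
    }

    void unlock() noexcept
    {
        __atomic_clear( &v_, __ATOMIC_RELEASE );
    }
};

// Two adjacent one-byte locks: `b` has no alignment beyond 1.
struct lock_pair
{
    byte_spinlock a;
    byte_spinlock b;
};

static_assert( sizeof(byte_spinlock) == 1, "one byte per lock" );
static_assert( offsetof(lock_pair, b) == 1, "b sits at an odd offset" );
```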

[Bug tree-optimization/105545] [12/13 Regression] Warning for string assignment with _GLIBCXX_ASSERTIONS since r12-3347-g8af8abfbbace49e6

2022-11-03 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105545

--- Comment #9 from Peter Dimov  ---
My Godbolt link above no longer reproduces the warning because of
https://github.com/boostorg/describe/commit/c8c46bfdf78022a8a7e9e06983d8b04ccb921991,
but this one does: https://godbolt.org/z/oT1M31osa.

Looks like trunk has fixed the issue, though: https://godbolt.org/z/1GGvYWxKG.

[Bug tree-optimization/105545] [12/13 Regression] Warning for string assignment with _GLIBCXX_ASSERTIONS since r12-3347-g8af8abfbbace49e6

2022-06-21 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105545

Peter Dimov  changed:

   What|Removed |Added

 CC||pdimov at gmail dot com

--- Comment #7 from Peter Dimov  ---
FWIW, I'm getting this warning in one of the Boost.Describe examples
(https://godbolt.org/z/WKMjeTdne) from innocent-looking code that concatenates
std::strings with op+.

[Bug target/105992] New: memcmp(p, q, 7) == 0 can be optimized better on x86

2022-06-15 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105992

Bug ID: 105992
   Summary: memcmp(p, q, 7) == 0 can be optimized better on x86
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

bool eq( char const* p )
{
return __builtin_memcmp( p, "literal", 7 ) == 0;
}

generates

eq(char const*):
cmp DWORD PTR [rdi], 1702127980
je  .L6
.L2:
mov eax, 1
test eax, eax
sete al
ret
.L6:
xor eax, eax
cmp DWORD PTR [rdi+3], 1818325605
jne .L2
test eax, eax
sete al
ret

(https://godbolt.org/z/68MKqGz9T)

but LLVM does

eq(char const*):   # @eq(char const*)
mov eax, 1702127980
xor eax, dword ptr [rdi]
mov ecx, 1818325605
xor ecx, dword ptr [rdi + 3]
or  ecx, eax
sete al
ret

(https://godbolt.org/z/jxcb85Ysa)

There are similar bugs for ARM
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104611) and AVX512
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104610) but I haven't found one
for vanilla x86.

Recent changes to std::string::operator== make it use the above pattern:
https://godbolt.org/z/8KxqqG9cx
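
The LLVM sequence can be written out by hand as two overlapping 4-byte loads, at offsets 0 and 3, whose differences are OR-ed together. The sketch below is only an illustration of that idea for two arbitrary 7-byte buffers; `eq7` and `eq7_demo` are hypothetical names, and no claim is made about what either compiler generates for it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: compares exactly 7 bytes without branching.
inline bool eq7( char const* p, char const* q ) noexcept
{
    std::uint32_t p0, p1, q0, q1;

    std::memcpy( &p0, p, 4 );     // bytes 0..3
    std::memcpy( &p1, p + 3, 4 ); // bytes 3..6, overlapping byte 3
    std::memcpy( &q0, q, 4 );
    std::memcpy( &q1, q + 3, 4 );

    // Any differing byte leaves a nonzero bit in one of the XORs.
    return ( ( p0 ^ q0 ) | ( p1 ^ q1 ) ) == 0;
}

inline void eq7_demo()
{
    assert( eq7( "literal", "literal" ) );
    assert( !eq7( "literam", "literal" ) ); // last byte differs
}
```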

[Bug c++/102168] -Wnon-virtual-dtor shouldn't fire for protected dtor in a class with a friend declaration

2022-06-06 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102168

--- Comment #6 from Peter Dimov  ---
Yes, I suppose you're right. The warning warns that Derived _can be_ deleted
via Base*, and that's correct - if not very useful in practice in this specific
case.

In fact the private destructor case is even less useful. ~Derived won't compile
unless it's Derived that is Base's friend, in which case everything is actually
fine.

I suppose we'll have to live with it. The annoying part is that there's no
warning, and then you add an unrelated friend declaration while refactoring,
and the warning appears.

[Bug c++/102168] -Wnon-virtual-dtor shouldn't fire for protected dtor in a class with a friend declaration

2022-06-06 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102168

--- Comment #4 from Peter Dimov  ---
Warning on a private destructor + a friend declaration makes sense, because a
private destructor implies that the type is not intended to be derived from.
But warning on a protected destructor + a friend does not.

[Bug c++/102168] -Wnon-virtual-dtor shouldn't fire for protected dtor in a class with a friend declaration

2022-06-02 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102168

Peter Dimov  changed:

   What|Removed |Added

 CC||pdimov at gmail dot com

--- Comment #1 from Peter Dimov  ---
An issue against Boost.System has just been filed as a result of this warning:
https://github.com/boostorg/system/issues/83

It took me a while to figure out why the warning fires: because of the `friend
class error_code` declaration in `error_category`
(https://godbolt.org/z/z6x14P7M4).

Please reconsider issuing the warning when the destructor is protected and a
friend declaration exists. A protected destructor is a clear indication that
the type is intended as a base class.
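
A minimal sketch of the pattern in question, with hypothetical names (`category`, `code`, `derived_category`) standing in for the Boost.System types; it assumes nothing beyond what is described above.

```cpp
#include <type_traits>

class code; // hypothetical stand-in for error_code

class category
{
public:
    virtual int value() const noexcept { return 0; }

protected:
    ~category() = default; // protected and non-virtual: intended as a base class

private:
    friend class code; // the friend declaration that makes the warning appear
};

class derived_category final: public category
{
public:
    int value() const noexcept override { return 1; }
};

// Outside the class and its friends, a category cannot be destroyed directly.
static_assert( !std::is_destructible<category>::value, "protected destructor" );
static_assert( std::is_destructible<derived_category>::value, "derived is destructible" );
```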

[Bug c++/105482] New: Regression with `>=` in a template argument

2022-05-04 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105482

Bug ID: 105482
   Summary: Regression with `>=` in a template argument
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The current GCC trunk in C++11 or C++14 mode gives an error when including
boost/mp11/algorithm.hpp (from Boost 1.79):

/opt/compiler-explorer/libs/boost_1_79_0/boost/mp11/algorithm.hpp:445:102:
error: '>=' should be '> =' to terminate a template argument list
  445 | struct mp_take_c_impl, typename std::enable_if= 10>::type>
  |
 ^~
  |
 > =
/opt/compiler-explorer/libs/boost_1_79_0/boost/mp11/algorithm.hpp:445:107:
error: template argument 3 is invalid
  445 | struct mp_take_c_impl, typename std::enable_if= 10>::type>
  |
  ^
/opt/compiler-explorer/libs/boost_1_79_0/boost/mp11/algorithm.hpp:445:114:
error: expected unqualified-id before '>' token
  445 | struct mp_take_c_impl, typename std::enable_if= 10>::type>
  |
 ^

See e.g. https://godbolt.org/z/nE9754and and
https://github.com/boostorg/mp11/issues/73.

GCC 11 works, as does trunk in C++17 mode or later.

[Bug libstdc++/104945] New: std::hash ignores the top 32 bits when size_t is 32 bit

2022-03-15 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104945

Bug ID: 104945
   Summary: std::hash ignores the top 32 bits when
size_t is 32 bit
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

A benchmark in Boost.Unordered uses uint64_t keys of the form `i << 40`:

https://github.com/boostorg/unordered/blob/33f81fd49039bccd1aa3dfd5a29ef6073b93009c/benchmark/uint64.cpp#L65

which leads to pathological behavior with std::unordered_map in 32 bit mode,
because std::hash ignores the top 32 bits:

https://godbolt.org/z/PncKbT7aq

I'm not entirely sure whether this would be considered a bug, but decided that
it's worth reporting. Ideally, std::hash ought to take into account all bits of
the integral input, instead of truncating it to size_t.
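
To illustrate the difference, here is a small sketch comparing truncation with a simple fold of the two halves. Both function names are hypothetical, `std::uint32_t` stands in for a 32 bit size_t, and XOR folding is just one possible way to involve the top bits, not a proposal for what libstdc++ should do.

```cpp
#include <cstdint>

// Models the reported behavior: only the low 32 bits survive.
constexpr std::uint32_t truncating_hash( std::uint64_t v ) noexcept
{
    return static_cast<std::uint32_t>( v );
}

// Hypothetical alternative: fold the high half into the low half.
constexpr std::uint32_t folding_hash( std::uint64_t v ) noexcept
{
    return static_cast<std::uint32_t>( v ) ^ static_cast<std::uint32_t>( v >> 32 );
}

// Keys of the form i << 40 all collide under truncation...
static_assert( truncating_hash( std::uint64_t( 1 ) << 40 ) ==
               truncating_hash( std::uint64_t( 2 ) << 40 ), "collision" );

// ...but remain distinct once the top 32 bits are taken into account.
static_assert( folding_hash( std::uint64_t( 1 ) << 40 ) !=
               folding_hash( std::uint64_t( 2 ) << 40 ), "no collision" );
```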

[Bug c++/104867] New: Base class matching ignores type of `auto` template parameter

2022-03-10 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104867

Bug ID: 104867
   Summary: Base class matching ignores type of `auto` template
parameter
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The following program

```
enum class Foo
{
A1
};

enum class Bar
{
B1
};

template  struct enum_
{
};

template struct list {};

struct enum_type_map: list, int>, list, double>
{};

template V f( list, V> const& )
{
return {};
}

int main()
{
f( enum_type_map() );
}
```

yields

```
: In function 'int main()':
:26:6: error: no matching function for call to 'f(enum_type_map)'
   26 | f( enum_type_map() );
  | ~^~~
:19:21: note: candidate: 'template V f(const
list, V>&)'
   19 | template V f( list, V> const& )
  | ^
:19:21: note:   template argument deduction/substitution failed:
:26:6: note:   'const list, V>' is an ambiguous base
class of 'enum_type_map'
   26 | f( enum_type_map() );
  | ~^~~
```

which is caused by `A1` and `B1` having the same value 0, even though their
types differ. (https://godbolt.org/z/3854zrY7x)

Clang successfully compiles the code (https://godbolt.org/z/eKEdf1zdo).

This is a distilled version of a bug report against `mp_map_find` from Mp11:
https://github.com/boostorg/mp11/issues/72

[Bug c++/104426] -fsanitize=undefined causes constexpr failures

2022-02-07 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104426

--- Comment #4 from Peter Dimov  ---
FWIW, I agree with everything Martin Sebor says in PR71962.
-fallow-address-zero is an entirely separate feature, and shouldn't be implied
by -fsanitize=undefined.

[Bug c++/104426] New: -fsanitize=undefined causes constexpr failures

2022-02-07 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104426

Bug ID: 104426
   Summary: -fsanitize=undefined causes constexpr failures
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The following program

```
struct category
{
constexpr bool failed() const noexcept
{
return true;
}
};

inline constexpr category s_cat;

struct condition
{
category const* cat_;

constexpr bool failed() const noexcept
{
if( cat_ )
{
return cat_->failed();
}
else
{
return false;
}
}
};

int main()
{
constexpr condition cond{ &s_cat };
static_assert( cond.failed() );
}
```

compiles without -fsanitize=undefined (https://godbolt.org/z/Pn9M5ocfz), but
fails with it (https://godbolt.org/z/KKc8Tb9qe) with

```
: In function 'int main()':
:31:31: error: non-constant condition for static assertion
   31 | static_assert( cond.failed() );
  |~~~^~
:31:31:   in 'constexpr' expansion of 'cond.condition::failed()'
:17:13: error: '((& s_cat) != 0)' is not a constant expression
   17 | if( cat_ )
  | ^~~~
```

This happens under all GCC versions starting from 7.

(The above is an extract from the test suite for
boost::system::error_condition.)

[Bug c++/102651] New: typeid(X**) instantiates X

2021-10-08 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102651

Bug ID: 102651
   Summary: typeid(X**) instantiates X
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

Using f.ex. `typeid(std::pair**)` tries to instantiate
`std::pair` and fails: https://godbolt.org/z/GhbYe3P8j

(One asterisk should be enough, but doesn't work either.)

The example without pair is

```
#include <typeinfo>

template <class T>
struct S{
T x;
};

void foo()
{
typeid( S** );
}
```

https://godbolt.org/z/nG3Kr3Te7

Interestingly, if `S` is incomplete, it works:

```
#include <typeinfo>

template <class T>
struct S;

void foo()
{
typeid( S** );
}
```

https://godbolt.org/z/nK8b5n1qn

[Bug libstdc++/102425] New: std::error_code() does not compare equal to std::error_condition()

2021-09-21 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102425

Bug ID: 102425
   Summary: std::error_code() does not compare equal to
std::error_condition()
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

As the title says. https://godbolt.org/z/er7qsjvoo.

[Bug c++/102350] __builtin_source_location not available in earlier language modes

2021-09-16 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102350

--- Comment #18 from Peter Dimov  ---
I would use it like this: https://godbolt.org/z/1eqEx6678

#include <source_location>

struct error_category
{
};

error_category const& system_category();

struct error_code
{
error_code( int v, error_category const& cat, void const* loc =
__builtin_source_location() );
};

int main()
{
error_code ec( 5, system_category() );
}

provided, of course, I have some not-undefined way to interpret its result.

[Bug c++/102350] __builtin_source_location not available in earlier language modes

2021-09-16 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102350

--- Comment #16 from Peter Dimov  ---
(In reply to Jakub Jelinek from comment #14)
> But we haven't done that that way and how would headers know if the
> __builtin_source_location that is available is the old or new one?

The header could do

namespace std {

  struct __source_location_impl { ... };

  class source_location {

using __impl = __source_location_impl;

// ...

  };
}

unless the compiler looks specifically for a nested struct, in which case

  class source_location {

struct __impl: __source_location_impl {};

// ...

  };

[Bug c++/102350] __builtin_source_location not available in earlier language modes

2021-09-16 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102350

--- Comment #15 from Peter Dimov  ---
(In reply to Jonathan Wakely from comment #13)
> It wouldn't work correctly in all cases, as Jakub points out, because
> std::source_location::current() is part of the magic.
> 
> And I'm not convinced we want/need to support those uses.

I think that users of __builtin_source_location will be content with the subset
of uses it supports. :-)

[Bug c++/102350] __builtin_source_location not available in earlier language modes

2021-09-16 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102350

--- Comment #10 from Peter Dimov  ---
(In reply to Jakub Jelinek from comment #9)
> That would be an aliasing violation.
> The artificial vars created by __builtin_source_location have the
> std::source_location::__impl type, so accessing those using some other
> dynamic type is invalid.

In that case, the only valid way to use the result of __builtin_source_location
would just be std::source_location itself. :-/

I wonder whether there's a conformance problem in making it available. It's
true that the identifier `source_location` isn't reserved, but only programs
that include `<source_location>` can tell the difference, and these programs
(assuming they existed and worked) will probably be broken anyway because now
they'll be including the standard header instead of their own.

[Bug c++/102350] __builtin_source_location not available in earlier language modes

2021-09-16 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102350

--- Comment #8 from Peter Dimov  ---
(In reply to Jakub Jelinek from comment #6)
> Note, having the struct somewhere else isn't that useful unless you know
> exactly how its non-static data members are named and what they mean, so
> ideally a class with accessor methods, which is what std::source_location
> provides currently.

I was going to undefined-behavior my way to victory by making
boost::source_location layout-compatible with the internal struct, and just
casting the result of __builtin_source_location to boost::source_location
const*. I think this works under the GCC object model?

[Bug c++/102350] __builtin_source_location not available in earlier language modes

2021-09-16 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102350

--- Comment #7 from Peter Dimov  ---
(In reply to Jonathan Wakely from comment #5)
> Sure. It's just a question of whether we're trying to provide a general
> purpose extension available for users, or an internal helper for the
> std::lib. IIRC we explicitly decided we only cared about supporting the
> latter.

Yes, of course. It's just that __builtin_source_location is so painfully close
to exactly what I want - it gives a single pointer representing the location -
that it would be a pity not being able to use it without -std=c++20.

[Bug c++/102350] __builtin_source_location not available in earlier language modes

2021-09-15 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102350

--- Comment #4 from Peter Dimov  ---
On the surface this looks not hard to fix - use ::__source_location_impl (or
std::__source_location_impl) instead of std::source_location::__impl as the
layout struct - but I'm not sure whether this would pose some further problems.

[Bug c++/102350] __builtin_source_location not available in earlier language modes

2021-09-15 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102350

--- Comment #2 from Peter Dimov  ---
(In reply to Jakub Jelinek from comment #1)
> __builtin_source_location doesn't require -std=c++20, but indeed does
> require <source_location> or some compatible definition of
> std::source_location::__impl class, and as it doesn't have hardcoded layout
> of that structure but instead matches whatever the source header provides
> (looks up fields it needs in there and uses whatever types and layout they
> have), there is no way around that.

That was my guess. I suppose that's inevitable then, and there's nothing to ask
for on the compiler side, <source_location> just needs to not disable itself
completely for pre-C++20 as it does now. (It's not going to work, but it can
still provide __impl so that the builtin can see it.)

[Bug c++/102350] New: __builtin_source_location not available in earlier language modes

2021-09-15 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102350

Bug ID: 102350
   Summary: __builtin_source_location not available in earlier
language modes
   Product: gcc
   Version: 12.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

Currently, `__builtin_source_location` requires (a) `<source_location>` to be
included and (b) -std=c++20. Are there good reasons for these restrictions? The
builtin would still be extremely valuable in earlier language modes.

Libraries that still support 03/11/14/17 (Boost.System, for instance) could
transparently supply source locations by using a default `void const* loc =
__builtin_source_location()` argument.
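
For comparison, the same default-argument technique can be sketched with `__builtin_FILE()` and `__builtin_LINE()`, which do not depend on a language mode. This is a different mechanism from the one requested here (two values instead of a single pointer), and `src_loc`, `current_loc` and `src_loc_demo` are hypothetical names used for illustration only.

```cpp
#include <cassert>

// Hypothetical location record.
struct src_loc
{
    char const* file;
    int line;
};

// The defaults are evaluated at the call site, so the caller's location is captured.
inline src_loc current_loc( char const* file = __builtin_FILE(),
                            int line = __builtin_LINE() ) noexcept
{
    return { file, line };
}

inline void src_loc_demo()
{
    src_loc a = current_loc();
    src_loc b = current_loc(); // one line below the previous call
    assert( b.line == a.line + 1 );
}
```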

[Bug c++/100827] Compiler crash with Boost.Bimap and Boost.Xpressive

2021-05-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100827

--- Comment #1 from Peter Dimov  ---
Update: GCC 10.2 doesn't have the issue, but 10.1 and 10.3 do. :-)

[Bug c++/100827] New: Compiler crash with Boost.Bimap and Boost.Xpressive

2021-05-29 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100827

Bug ID: 100827
   Summary: Compiler crash with Boost.Bimap and Boost.Xpressive
   Product: gcc
   Version: 11.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

I'm seeing a compiler crash with g++ 10/11 in -std=c++03 mode

https://godbolt.org/z/h4dY9oKGc
https://godbolt.org/z/o1P6nrhqG

when compiling one file from the Boost.Bimap test suite:

```
#include 

#include 

#include 

#include 
#include 

using namespace boost::bimaps;
using namespace boost::xpressive;
namespace xp = boost::xpressive;

int main()
{
//[ code_bimap_and_boost_xpressive

typedef bimap< std::string, int > bm_type;
bm_type bm;

std::string rel_str("one <--> 1 two <--> 2  three <--> 3");

sregex rel = ( (s1= +_w) >> " <--> " >> (s2= +_d) )
[
xp::ref(bm)->*insert( xp::construct(s1,
as(s2)) )
];

sregex relations = rel >> *(+_s >> rel);

regex_match(rel_str, relations);

assert( bm.size() == 3 );
//]

return 0;
}
```
(https://github.com/boostorg/bimap/blob/03bf1d222914d0c15563414f2e51b6a4ce0e0f69/example/bimap_and_boost/xpressive.cpp)

Trunk no longer has the issue, and neither do earlier versions or language
modes.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94938 looks related, but is
considered fixed in GCC 10.

[Bug c++/99495] constexpr virtual destructor is used before its definition

2021-04-23 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99495

Peter Dimov  changed:

   What|Removed |Added

 CC||pdimov at gmail dot com

--- Comment #1 from Peter Dimov  ---
Simplified:

```
struct Base
{
constexpr virtual ~Base() = default;
};

constexpr Base b;
```

https://godbolt.org/z/qGn1nx9ET

[Bug tree-optimization/14721] jump optimization involving a sibling call within a jump table

2021-01-12 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14721

Peter Dimov  changed:

   What|Removed |Added

 CC||pdimov at gmail dot com

--- Comment #5 from Peter Dimov  ---
It's becoming more common with the use of std::variant and compatible libraries
(such as Boost.Variant2.) https://godbolt.org/z/414e6j shows a reduced example.

[Bug c++/98649] New: Trivial jump table not eliminated

2021-01-12 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98649

Bug ID: 98649
   Summary: Trivial jump table not eliminated
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

Trivial jump tables where all entries are the same are sometimes not
eliminated. E.g. the following example

```
struct Base { virtual void run( float f ) = 0; };
struct T0: Base { void run( float f ); };
struct T1: Base { void run( float f ); };
struct T2: Base { void run( float f ); };
struct T3: Base { void run( float f ); };
struct T4: Base { void run( float f ); };

template<int I> struct mp_int {};

struct variant
{
unsigned index_;

union
{
T0 t0_;
T1 t1_;
T2 t2_;
T3 t3_;
T4 t4_;
};

T0& get( mp_int<0> ) { return t0_; }
T1& get( mp_int<1> ) { return t1_; }
T2& get( mp_int<2> ) { return t2_; }
T3& get( mp_int<3> ) { return t3_; }
T4& get( mp_int<4> ) { return t4_; }
};

template<int I> decltype(auto) get( variant& v )
{
return v.get( mp_int<I>() );
}

void f1( variant& v, float f )
{
switch( v.index_ )
{
case 0: get<0>(v).run( f ); break;
case 1: get<1>(v).run( f ); break;
case 2: get<2>(v).run( f ); break;
case 3: get<3>(v).run( f ); break;
case 4: get<4>(v).run( f ); break;
default: __builtin_unreachable();
}
}

```

(https://godbolt.org/z/MxzGh8)

results in

```
f1(variant&, float):
mov eax, DWORD PTR [rdi]
lea r8, [rdi+8]
jmp [QWORD PTR .L4[0+rax*8]]
.L4:
.quad   .L3
.quad   .L3
.quad   .L3
.quad   .L3
.quad   .L3
.L3:
mov rax, QWORD PTR [rdi+8]
mov rdi, r8
mov rax, QWORD PTR [rax]
jmp rax
```

This case may seem contrived, but it's not that rare in practice, because code
using std::variant or equivalent (such as Boost.Variant2, from which the
example has been reduced) is becoming more and more common nowadays.

[Bug c++/63707] Brace initialization of array sometimes fails if no copy constructor

2021-01-12 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63707

Peter Dimov  changed:

   What|Removed |Added

 CC||pdimov at gmail dot com

--- Comment #14 from Peter Dimov  ---
FWIW, I just hit this problem as well when trying to make changes to
Boost.Variant2. My reduced test case is https://godbolt.org/z/zG6ddP; I was
going to submit that as a bug but found this one.

I support the petition to have this fixed.

[Bug c++/97464] New: Missed redundant store optimization opportunity

2020-10-16 Thread pdimov at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97464

Bug ID: 97464
   Summary: Missed redundant store optimization opportunity
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The code

void f( int& x, float& y )
{
++x;
y = 1;
--x;
}

compiles to

f(int&, float&):
mov eax, DWORD PTR [rdi]
mov DWORD PTR [rsi], 0x3f800000
mov DWORD PTR [rdi], eax
ret

(https://godbolt.org/z/so4h3v)

but the load from, and the store to, [rdi] are redundant. It's obvious that
TBAA is active, but it for some reason doesn't go far enough.

This is a simplified example from "realer" code where x is a reference count
whose unnecessary manipulations could have been optimized out entirely.

[Bug libstdc++/66146] call_once not C++11-compliant on ppc64le

2020-06-27 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146

Peter Dimov  changed:

   What|Removed |Added

 CC||pdimov at gmail dot com

--- Comment #30 from Peter Dimov  ---
I was going to ask the stupid question "why not just use the straightforward
double-checked locking here" but the answer is probably "ABI break".
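
For concreteness, this is roughly what I mean by straightforward double-checked locking. It is only a sketch: `dcl_once` and `dcl_once_demo` are hypothetical names, the layout has nothing to do with the existing std::once_flag, and a callable that re-enters `call` on the same object would deadlock on the non-recursive mutex.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical once-flag; not ABI-compatible with std::once_flag.
class dcl_once
{
    std::atomic<bool> done_{ false };
    std::mutex m_;

public:
    template<class F> void call( F&& f )
    {
        // Fast path: an acquire load pairs with the release store below.
        if( done_.load( std::memory_order_acquire ) ) return;

        std::lock_guard<std::mutex> lock( m_ );

        // Second check under the lock; relaxed suffices while holding the mutex.
        if( !done_.load( std::memory_order_relaxed ) )
        {
            f(); // if f throws, done_ stays false and a later call retries
            done_.store( true, std::memory_order_release );
        }
    }
};

inline void dcl_once_demo()
{
    dcl_once once;
    int n = 0;

    once.call( [&]{ ++n; } );
    once.call( [&]{ ++n; } ); // no effect, already done

    assert( n == 1 );
}
```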

[Bug c++/92985] missed optimization opportunity for switch linear transformation

2019-12-18 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92985

--- Comment #2 from Peter Dimov  ---
Two more reformulations that activate the linear transformation are:

int operator[]( std::size_t i ) const noexcept
{
std::ptrdiff_t offset;

switch( i )
{
case 0: offset = offsetof(X, x); break;
case 1: offset = offsetof(X, y); break;
case 2: offset = offsetof(X, z); break;
default: __builtin_unreachable();
}

return *(int const*)((char const*)this + offset);
}

(https://godbolt.org/z/cJDB_m)

and

int operator[]( std::size_t i ) const noexcept
{
int X::* p;

switch( i )
{
case 0: p = &X::x; break;
case 1: p = &X::y; break;
case 2: p = &X::z; break;
default: __builtin_unreachable();
}

return this->*p;
}

(https://godbolt.org/z/xfsKh5)
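
A further way to express the same mapping, without a switch at all, is a constant table of pointers to members. This sketch is not from the test cases above; `X3` and `x3_demo` are hypothetical names, and I make no claim about the code GCC generates for it.

```cpp
#include <cassert>
#include <cstddef>

struct X3
{
    int x, y, z;

    int operator[]( std::size_t i ) const noexcept
    {
        // The index selects a pointer to member; i must be 0, 1 or 2.
        constexpr int X3::* members[] = { &X3::x, &X3::y, &X3::z };
        return this->*members[ i ];
    }
};

inline void x3_demo()
{
    X3 v{ 10, 20, 30 };

    assert( v[ 0 ] == 10 );
    assert( v[ 1 ] == 20 );
    assert( v[ 2 ] == 30 );
}
```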

[Bug c++/92985] missed optimization opportunity for switch linear transformation

2019-12-18 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92985

--- Comment #1 from Peter Dimov  ---
Reformulating the switch in terms of integral offsets

struct X2
{
int x, y, z;

int operator[]( std::size_t i ) const noexcept
{
std::ptrdiff_t k0 = &x - &x;
std::ptrdiff_t k1 = &y - &x;
std::ptrdiff_t k2 = &z - &x;

std::ptrdiff_t k;

switch( i )
{
case 0: k = k0; break;
case 1: k = k1; break;
case 2: k = k2; break;
default: __builtin_unreachable();
}

return *( &x + k );
}
};

results in the desired

f2(X2 const&, unsigned long):
mov eax, DWORD PTR [rdi+rsi*4]
ret

(https://godbolt.org/z/YxhNSx)

[Bug c++/92005] [10 Regression] switch code generation regression

2019-10-06 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92005

--- Comment #2 from Peter Dimov  ---
r276416 makes -O2 inline less, and -O3 does fix this specific case. However,
there appears to be some deeper issue here. I've reduced the number of cases
from 10 to 5 for the example, but when I increase them back to 10 as in
https://godbolt.org/z/VyCeWQ, gcc 9.2 still generates a simple lookup table at
-O2, whereas gcc 10 generates a jump table even at -O3:
https://godbolt.org/z/qJqNDh.

```
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

struct T0 {};
struct T1 {};
struct T2 {};
struct T3 {};
struct T4 {};
struct T5 {};
struct T6 {};
struct T7 {};
struct T8 {};
struct T9 {};

struct variant
{
unsigned index_;

union
{
T0 t0_;
T1 t1_;
T2 t2_;
T3 t3_;
T4 t4_;
T5 t5_;
T6 t6_;
T7 t7_;
T8 t8_;
T9 t9_;
};
};

template<class F> int visit( F f, variant const& v )
{
switch( v.index_ )
{
case 0: return f( v.t0_ );
case 1: return f( v.t1_ );
case 2: return f( v.t2_ );
case 3: return f( v.t3_ );
case 4: return f( v.t4_ );
case 5: return f( v.t5_ );
case 6: return f( v.t6_ );
case 7: return f( v.t7_ );
case 8: return f( v.t8_ );
case 9: return f( v.t9_ );
default: __builtin_unreachable();
}
}

int do_visit(variant const& v) {
 return visit(overloaded{
[](T0 val) { return 3; },
[](T1 val) { return 5; },
[](T2 val) { return 8; },
[](T3 val) { return 9; },
[](T4 val) { return 10; },
[](T5 val) { return 11; },
[](T6 val) { return 12; },
[](T7 val) { return 13; },
[](T8 val) { return 14; },
[](T9 val) { return 233; }
}, v);
}

```

[Bug c++/92005] New: switch code generation regression

2019-10-06 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92005

Bug ID: 92005
   Summary: switch code generation regression
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The following code:

```
template<class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

struct T0 {};
struct T1 {};
struct T2 {};
struct T3 {};
struct T4 {};

struct variant
{
unsigned index_;

union
{
T0 t0_;
T1 t1_;
T2 t2_;
T3 t3_;
T4 t4_;
};
};

template<class F> int visit( F f, variant const& v )
{
switch( v.index_ )
{
case 0: return f( v.t0_ );
case 1: return f( v.t1_ );
case 2: return f( v.t2_ );
case 3: return f( v.t3_ );
case 4: return f( v.t4_ );
default: __builtin_unreachable();
}
}

int do_visit(variant const& v) {
 return visit(overloaded{
[](T0 val) { return 3; },
[](T1 val) { return 5; },
[](T2 val) { return 8; },
[](T3 val) { return 9; },
[](T4 val) { return 10; }
}, v);
}
```

(https://godbolt.org/z/uxQ6KF)

generates

```
do_visit(variant const&):
mov eax, DWORD PTR [rdi]
jmp [QWORD PTR .L4[0+rax*8]]
.L4:
.quad   .L8
.quad   .L7
.quad   .L9
.quad   .L5
.quad   .L3
.L9:
mov eax, 8
ret
.L7:
mov eax, 5
ret
.L8:
mov eax, 3
ret
.L5:
mov eax, 9
ret
.L3:
mov eax, 10
ret
```

with the current gcc trunk on godbolt.org (g++ (Compiler-Explorer-Build) 10.0.0
20191005 (experimental)) and

```
do_visit(variant const&):
mov eax, DWORD PTR [rdi]
mov eax, DWORD PTR CSWTCH.7[0+rax*4]
ret
CSWTCH.7:
.long   3
.long   5
.long   8
.long   9
.long   10
```

with gcc 9.2.

[Bug c++/89029] __builtin_constant_p erroneously true

2019-01-24 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89029

--- Comment #4 from Peter Dimov  ---
(In reply to Jonathan Wakely from comment #3)
> c.f. https://gcc.gnu.org/ml/libstdc++/2018-03/msg00031.html and the replies

Yes, pretty much.

> I doubt we would catch many bugs that way, as most bugs would involve
> non-constant indices and vectors that have changed size dynamically at
> run-time.

It's still pretty cool when it works, f.ex. here: https://godbolt.org/z/fHCB16

Annoying that we're so close to useful static analysis but it doesn't _quite_
work. (Also note how the code for g() goes straight to assert without telling
anyone.)

>RESOLVED INVALID

Too bad. FWIW, Clang trunk doesn't seem to suffer from the false positive
problem. It also "proves" the assertion failure in g, but not in f:
https://godbolt.org/z/92WyvR. (It also doesn't support __attribute((error)),
which makes this technique limited in value.)

Maybe the correct way to go about this is just to mark __assert_fail in some
manner ("warn if unconditionally called"), instead of trying to (ab)use
__builtin_constant_p.

[Bug c++/89029] New: __builtin_constant_p erroneously true

2019-01-23 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89029

Bug ID: 89029
   Summary: __builtin_constant_p erroneously true
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

In the following code:

```
inline bool assert_static_check( bool ev )
{
return __builtin_constant_p( ev ) && !ev;
}

struct X1
{
int v1_;
};

inline bool operator==( X1 const& x1, X1 const& x2 )
{
return x1.v1_ == x2.v1_;
}

int compare( X1 s1, X1 s2 )
{
return assert_static_check( s1 == s2 );
}

struct X2
{
int v1_;
int v2_;
};

inline bool operator==( X2 const& x1, X2 const& x2 )
{
return x1.v1_ == x2.v1_ && x1.v2_ == x2.v2_;
}

int compare( X2 s1, X2 s2 )
{
return assert_static_check( s1 == s2 );
}
```

the generated code for the first compare function is, correctly,

```
compare(X1, X1):
xor eax, eax
ret
```

but for the second, it is

```
compare(X2, X2):
mov eax, 1
cmp edi, esi
je  .L6
ret
.L6:
sar rdi, 32
sar rsi, 32
xor eax, eax
cmp edi, esi
setne   al
ret
```

which is incorrect, as `s1 == s2` isn't a constant here.

(g++ -std=c++17 -O3, https://godbolt.org/z/nn0-4k)

(This is simplified from an attempt to create a statically-diagnosed assert
facility that would warn when the asserted expression is known to be false:
https://godbolt.org/z/i5SXSd. Would've been cool if it worked.)

[Bug tree-optimization/87205] Inefficient code generation for switch

2018-09-04 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87205

--- Comment #9 from Peter Dimov  ---
For more context, see https://godbolt.org/z/SzfpKr

```
#include <type_traits>

template<class... T> struct variant
{
    std::aligned_union_t<0, T...> storage_;
    unsigned index_;
};

template<class T0, class T1, class T2, class T3, class T4, class T5, class F>
auto visit( variant<T0, T1, T2, T3, T4, T5>& v, F f )
{
    switch( v.index_ )
    {
        case 0: return f( (T0&)v.storage_ );
        case 1: return f( (T1&)v.storage_ );
        case 2: return f( (T2&)v.storage_ );
        case 3: return f( (T3&)v.storage_ );
        case 4: return f( (T4&)v.storage_ );
        case 5: return f( (T5&)v.storage_ );
        default: __builtin_unreachable();
    }
}

struct X
{
    int v;
};

template<int I> struct Y: X
{
};

using V = variant<Y<0>, Y<1>, Y<2>, Y<3>, Y<4>, Y<5>>;

void f( X& );
int g( int );

int h1( V& v )
{
    return visit( v, [](X const& x){ return x.v; } );
}

int h2( V& v )
{
    return visit( v, [](auto&& x){ return x.v; } );
}

void h3( V& v )
{
    return visit( v, [](auto&& x){ f(x); } );
}

int h4( V& v )
{
    return visit( v, [](auto&& x){ return g(x.v); } );
}
```

This generates

```
h1(variant<Y<0>, Y<1>, Y<2>, Y<3>, Y<4>, Y<5> >&):
  mov eax, DWORD PTR [rdi]
  ret
h2(variant<Y<0>, Y<1>, Y<2>, Y<3>, Y<4>, Y<5> >&):
  mov eax, DWORD PTR [rdi]
  ret
h3(variant<Y<0>, Y<1>, Y<2>, Y<3>, Y<4>, Y<5> >&):
  cmp DWORD PTR [rdi+4], 5
  jbe .L15
.L15:
  jmp f(X&)
h4(variant<Y<0>, Y<1>, Y<2>, Y<3>, Y<4>, Y<5> >&):
  cmp DWORD PTR [rdi+4], 5
  jbe .L19
.L19:
  mov edi, DWORD PTR [rdi]
  jmp g(int)
```

so the member access is folded in both cases (which is good!), even though the
first occurs through X& and the second through Y<I>&.

I've been unable to determine what makes the optimizations misfire. This code
should in principle be the same as the simplified one, but it isn't.

[Bug tree-optimization/87205] Inefficient code generation for switch

2018-09-04 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87205

--- Comment #8 from Peter Dimov  ---
(In reply to Martin Liška from comment #7)
> I'm not sure here Y are different types here and member access based on
> the type is distinct.

Yes, one could argue that, I suppose. But in the `return ((Y<0>*)p)->v;` case
the member access _is_ lifted outside the jump table. If that's correct there,
it should also be correct here. :-)

[Bug tree-optimization/87205] Inefficient code generation for switch

2018-09-04 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87205

--- Comment #5 from Peter Dimov  ---
Another:

```
struct X
{
    int v;
};

template<int I> struct Y: X
{
};

void f( int v );

void h( unsigned ix, void* p )
{
    switch( ix )
    {
        case 0: f( ((Y<0>*)p)->v ); break;
        case 1: f( ((Y<1>*)p)->v ); break;
        case 2: f( ((Y<2>*)p)->v ); break;
        case 3: f( ((Y<3>*)p)->v ); break;
        case 4: f( ((Y<4>*)p)->v ); break;
        case 5: f( ((Y<5>*)p)->v ); break;
        default: __builtin_unreachable();
    }
}
```

```
h(unsigned int, void*):
  mov edi, edi
  jmp [QWORD PTR .L4[0+rdi*8]]
.L4:
  .quad .L3
  .quad .L3
  .quad .L3
  .quad .L3
  .quad .L3
  .quad .L3
.L3:
  mov edi, DWORD PTR [rsi]
  jmp f(int)
```

https://godbolt.org/z/pGVx6W

This however demonstrates a different problem, so it may need to go into a
separate bug.

[Bug tree-optimization/87205] Inefficient code generation for switch

2018-09-04 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87205

--- Comment #4 from Peter Dimov  ---
If the code is not the same, the jump table is not optimized out and there's no
extra check. But the redundant check also appears with code that is not the
same on the C++ side, for example:

```
struct X
{
    int v;
};

template<int I> struct Y: X
{
};

void f( X* x );

void h( unsigned ix, void* p )
{
    switch( ix )
    {
        case 0: f( (Y<0>*)p ); break;
        case 1: f( (Y<1>*)p ); break;
        case 2: f( (Y<2>*)p ); break;
        case 3: f( (Y<3>*)p ); break;
        case 4: f( (Y<4>*)p ); break;
        case 5: f( (Y<5>*)p ); break;
        default: __builtin_unreachable();
    }
}
```

```
h(unsigned int, void*):
  cmp edi, 5
  jbe .L5
.L5:
  mov rdi, rsi
  jmp f(X*)
```

https://godbolt.org/z/2Lh_GZ

A variation on the same theme, which demonstrates another kind of missed
optimization:

```
struct X
{
    int v;
};

template<int I> struct Y: X
{
};

int h( unsigned ix, void* p )
{
    switch( ix )
    {
        case 0: return ((Y<0>*)p)->v;
        case 1: return ((Y<1>*)p)->v;
        case 2: return ((Y<2>*)p)->v;
        case 3: return ((Y<3>*)p)->v;
        case 4: return ((Y<4>*)p)->v;
        case 5: return ((Y<5>*)p)->v;
        default: __builtin_unreachable();
    }
}
```

```
h(unsigned int, void*):
  mov edi, edi
  mov eax, DWORD PTR [rsi]
  jmp [QWORD PTR .L4[0+rdi*8]]
.L4:
  .quad .L9
  .quad .L8
  .quad .L7
  .quad .L6
  .quad .L5
  .quad .L3
.L5:
  ret
.L3:
  ret
.L9:
  ret
.L8:
  ret
.L7:
  ret
.L6:
  ret
```

https://godbolt.org/z/lCzlR2

There's a table, so there's no redundant check.

[Bug c++/87205] New: Inefficient code generation for switch

2018-09-03 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87205

Bug ID: 87205
   Summary: Inefficient code generation for switch
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

For the following code:

```
void f( int x );

void h( unsigned ix )
{
    switch( ix )
    {
        case 0: f(42); break;
        case 1: f(42); break;
        case 2: f(42); break;
        case 3: f(42); break;
        case 4: f(42); break;
        case 5: f(42); break;
        default: __builtin_unreachable();
    }
}
```

g++ 9.0 -O2, -O3 generates:

```
h(unsigned int):
  cmp edi, 5
  jbe .L5
.L5:
  mov edi, 42
  jmp f(int)
```

https://godbolt.org/z/4I_Chu

The initial part that compares edi to 5 is redundant.

At -O1 the result is a jump table that doesn't check edi, as expected:

```
h(unsigned int):
  sub rsp, 8
  mov edi, edi
  jmp [QWORD PTR .L4[0+rdi*8]]
```

This is a simplified example; I've stripped the metaprogramming that produces
it. :-)

For comparison, g++ 8.2 produces

```
h(unsigned int):
  cmp edi, 5
  ja .L2
  mov edi, 42
  jmp f(int)
h(unsigned int) [clone .cold.0]:
.L2:
```

[Bug c++/86356] New: "invalid use of pack expansion" with fold expressions

2018-06-28 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86356

Bug ID: 86356
   Summary: "invalid use of pack expansion" with fold expressions
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The code

```
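#include <type_traits>

template<bool B> using mp_bool = std::integral_constant<bool, B>;

template<class... T> using mp_all = mp_bool<(static_cast<bool>(T::value) && ...)>;
template<class... T> using mp_any = mp_bool<(static_cast<bool>(T::value) || ...)>;

template<bool, bool, class... T>
struct variant_base_impl {};
template<class... T> using variant_base =
    variant_base_impl<mp_all<std::is_trivially_destructible<T>...>::value,
        mp_any<mp_all<std::is_nothrow_move_constructible<T>...>,
            std::is_nothrow_default_constructible<T>...>::value, T...>;

int main()
{
    variant_base<void, int, float>();
}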
```

yields with g++ 7.3 (and 8.1)

```
testbed2017.cpp: In substitution of 'template using variant_base =
variant_base_impl(std::is_trivially_destructible::value)  && ...)>::value,
std::integral_constant(std::integral_constant(std::is_nothrow_move_constructible::value)  &&
...)>::value) ||
static_cast(std::is_nothrow_default_constructible::value)
...)>::value, T ...> [with T = {void, int, float}]':
testbed2017.cpp:13:31:   required from here
testbed2017.cpp:6:82: error: invalid use of pack expansion expression
 template using mp_any = mp_bool<(static_cast(T::value) ||
...)>;

  ^
```

clang++ accepts.
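
For reference, the two fold-expression aliases behave as expected when used on
their own; the error only shows up once they are nested inside the
`variant_base` alias. A minimal check, assuming C++17, with the template
parameter lists written out in full and test types picked here purely for
illustration:

```cpp
#include <type_traits>

template<bool B> using mp_bool = std::integral_constant<bool, B>;

// true when every T::value is true (and for an empty pack)
template<class... T> using mp_all = mp_bool<(static_cast<bool>(T::value) && ...)>;

// true when at least one T::value is true (false for an empty pack)
template<class... T> using mp_any = mp_bool<(static_cast<bool>(T::value) || ...)>;

static_assert( mp_all<std::is_integral<int>, std::is_integral<long>>::value );
static_assert( !mp_all<std::is_integral<int>, std::is_integral<float>>::value );
static_assert( mp_any<std::is_integral<float>, std::is_integral<int>>::value );
static_assert( mp_all<>::value );
static_assert( !mp_any<>::value );
```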

[Bug c++/86355] New: Internal compiler error with pack expansion and fold expression

2018-06-28 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86355

Bug ID: 86355
   Summary: Internal compiler error with pack expansion and fold
expression
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The following code

```
#include <type_traits>

template<bool B> using mp_bool = std::integral_constant<bool, B>;
using mp_true = mp_bool<true>;

template<class... T> using mp_all = mp_bool<(static_cast<bool>(T::value) && ...)>;

template using check2 = mp_all,
mp_all...,
mp_all...>>>;

static_assert( std::is_same<check2<void, int, float>, mp_true>::value );
```

with g++ 7.3 (and 8.1) yields

```
testbed2017.cpp: In substitution of 'template using
check2
 = mp_all, std::integral_constant(T::value)  && ...)> > [with V = void; T = {int, float}]':
testbed2017.cpp:10:52:   required from here
testbed2017.cpp:8:156: internal compiler error: Segmentation fault
 template using check2 = mp_all,
mp_all...,
mp_all...>>>;

^
```

clang++ accepts it.

[Bug c++/86354] New: Address comparison not a constant expression

2018-06-28 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86354

Bug ID: 86354
   Summary: Address comparison not a constant expression
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

The following code:

```
struct X {};

struct Y: X {};
struct Z: X {};

extern Y y;
extern Z z;

constexpr X const& x1() { return y; }
constexpr X const& x2() { return z; }

static_assert( &x1() != &x2() );
```

yields (with g++ 7.3 and 8.1)

```
testbed2017.cpp:12:1: error: non-constant condition for static assertion
 static_assert( &x1() != &x2() );
 ^
testbed2017.cpp:12:22: error: '(((const X*)(& y)) != ((const X*)(& z)))' is not
a constant expression
 static_assert( &x1() != &x2() );
~~^~~~
```

but other compilers accept it.
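
The expected result is easy to confirm at run time. The sketch below is not
part of the report: it adds definitions for `y` and `z` (the report only
declares them) and a helper, `addresses_differ`, whose name is made up here.
Since `y` and `z` are distinct complete objects, their `X` base subobjects are
two separate objects of the same type and so cannot share an address.

```cpp
struct X {};

struct Y: X {};
struct Z: X {};

// Definitions added for this sketch; the report only has extern declarations.
Y y;
Z z;

constexpr X const& x1() { return y; }
constexpr X const& x2() { return z; }

// Hypothetical helper: the same comparison as in the static_assert,
// evaluated at run time instead.
inline bool addresses_differ()
{
    return &x1() != &x2();
}
```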

[Bug c++/83835] New: constexpr constructor rejected in c++17 mode (regression WRT c++14)

2018-01-14 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83835

Bug ID: 83835
   Summary: constexpr constructor rejected in c++17 mode
(regression WRT c++14)
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

This code:

```
class X
{
public:

    virtual ~X();
};

class Y
{
private:

    class Z: public X
    {
    private:

        Y const * p_;

    public:

        constexpr explicit Z( Y const * p ): p_( p ) {}
    };

    Z z_;

public:

    constexpr Y() noexcept: z_( this ) {}
};

int main()
{
}
```

is accepted in C++14 mode, and in C++1z mode by g++ 5 or 6, but fails in C++17
mode under g++ 7 or 8 with

```
error: temporary of non-literal type 'Y::Z' in a constant expression
```

[Bug c++/81311] New: An std::ref argument calls copy constructor instead of template constructor in C++17 mode

2017-07-04 Thread pdimov at gmail dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81311

Bug ID: 81311
   Summary: An std::ref argument calls copy constructor instead of
template constructor in C++17 mode
   Product: gcc
   Version: 8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pdimov at gmail dot com
  Target Milestone: ---

In the following code:

```
#include <functional>
#include <iostream>

struct function
{
    function()
    {
        std::cout << "function()\n";
    }

    template<class F> function( F )
    {
        std::cout << "function(F)\n";
    }

    function( function const& )
    {
        std::cout << "function(function const&)\n";
    }
};

int main()
{
    function f1;
    function f2( std::ref(f1) );
}
```

g++ 7.1 and 8 with -std=c++1z/17 call the copy constructor, whereas with
-std=c++14 they call the template constructor (as do other compilers in all
language modes.)
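
To check a particular compiler and language mode without reading the output by
eye, the constructors can record which one ran. This is a sketch, not part of
the report: the enum `ctor`, the member `last` and the function
`chosen_for_ref` are names introduced here.

```cpp
#include <functional>

// Hypothetical tag recording which constructor initialized an object.
enum class ctor { default_, template_, copy };

struct function
{
    ctor last;

    function(): last( ctor::default_ ) {}

    template<class F> function( F ): last( ctor::template_ ) {}

    function( function const& ): last( ctor::copy ) {}
};

// Returns the constructor selected for `function f2( std::ref(f1) )`.
inline ctor chosen_for_ref()
{
    function f1;
    function f2( std::ref(f1) );
    return f2.last;
}
```

Going by the behaviour described above, `ctor::template_` is the result seen
with -std=c++14 and with other compilers, while the affected g++ versions in
C++17 mode would give `ctor::copy`.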

[Bug middle-end/44164] [4.5 Regression] Aliasing bug triggered by Boost.Bind/Boost.Function

2010-05-17 Thread pdimov at gmail dot com



--- Comment #9 from pdimov at gmail dot com  2010-05-17 20:12 ---
But the standard says in [basic.types] that "For any trivially copyable type T,
if two pointers to T point to distinct T objects obj1 and obj2, where neither
obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making
up obj1 are copied into obj2,40 obj2 shall subsequently hold the same value as
obj1."

"float" is a trivially copyable type. Copying X results in copying the bytes of
X::data (because the default copy constructor of a class does a memberwise
copy, and the default copy constructor of an array does an elementwise copy).
Therefore, the underlying bytes of the object of type float, initialized at
x1.data, are copied into x2.data, which then must, if interpreted as a float,
hold the same value as the original object.
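
The byte-copy guarantee itself can be illustrated with `std::memcpy`. This
sketch is not the code under discussion in this bug; it only shows the rule
quoted above, and the function name `roundtrip` is made up here.

```cpp
#include <cstring>

// Copies the bytes of a float into a char buffer, copies that buffer the way
// an elementwise array copy would, and reads the bytes back into a float.
inline float roundtrip( float value )
{
    char data1[ sizeof( float ) ];
    std::memcpy( data1, &value, sizeof( float ) );

    char data2[ sizeof( float ) ];
    std::memcpy( data2, data1, sizeof( float ) );

    float result;
    std::memcpy( &result, data2, sizeof( float ) );
    return result;
}
```

Because only bytes are moved, the returned value compares equal to the input.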


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44164

[Bug middle-end/44164] [4.5 Regression] Aliasing bug triggered by Boost.Bind/Boost.Function

2010-05-17 Thread pdimov at gmail dot com



--- Comment #7 from pdimov at gmail dot com  2010-05-17 19:10 ---
(In reply to comment #6)
> Basically the middle-end sees this the same as
>   int i = 1, j;
>   float *p = new (&i) float(0.0);
>   j = i;
>   return *reinterpret_cast<float*>(&j);
> and you expect to return 0.0.

The int/float example does violate the aliasing rules, but I don't think that
it properly describes what's happening.

I see it more like a combination of the following two examples:

#include <iostream>
#include <new>

struct X
{
    char data[ sizeof( float ) ];
};

int main()
{
    X x1;
    new( &x1.data ) float( 3.14f );

    X x2 = x1;

    std::cout << *(float const*)&x2.data << std::endl;
}

and

#include <iostream>

union Y
{
    int i;
    float f;
};

int main()
{
    Y y1;
    y1.f = 3.14f;

    Y y2 = y1;

    std::cout << y2.f << std::endl;
}

I don't think either of them violates the standard.


-- 

pdimov at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pdimov at gmail dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44164

[Bug c++/21528] Boost shared_ptr_test.cpp fails with -O3

2005-05-12 Thread pdimov at gmail dot com


--- Additional Comments From pdimov at gmail dot com  2005-05-12 08:42 ---
Created an attachment (id=8871)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8871&action=view)
Preprocessed source


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21528

[Bug c++/21528] New: Boost shared_ptr_test.cpp fails with -O3

2005-05-12 Thread pdimov at gmail dot com

The following portion of shared_ptr_test.cpp:

#include <boost/shared_ptr.hpp>
#include <boost/detail/lightweight_test.hpp>

int main()
{
    boost::shared_ptr<int> pi(new int);
    boost::shared_ptr<void> pv(pi);

    boost::shared_ptr<int> pi2 = boost::static_pointer_cast<int>(pv);
    BOOST_TEST(pi.get() == pi2.get());
    BOOST_TEST(!(pi < pi2 || pi2 < pi));
    BOOST_TEST(pi.use_count() == 3);
    BOOST_TEST(pv.use_count() == 3);
    BOOST_TEST(pi2.use_count() == 3);

    return boost::report_errors();
}

(using the current Boost CVS HEAD that will become 1.33 soon) fails the three
use_count() tests with g++ 4.0.0 -O3, but not with -O0, 1, 2. It seems that a
reference count increment is being optimized away. This does not happen if
BOOST_TEST is replaced with assert.
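
For comparison, the same sequence can be written against `std::shared_ptr`,
which postdates this report and is used here only as an assumed stand-in for
`boost::shared_ptr`. The function name `expected_use_count` is made up for
this sketch. Three owners share the object, so the count should be 3.

```cpp
#include <memory>

// Hypothetical helper mirroring the test above with std::shared_ptr.
inline long expected_use_count()
{
    std::shared_ptr<int> pi( new int );
    std::shared_ptr<void> pv( pi );

    std::shared_ptr<int> pi2 = std::static_pointer_cast<int>( pv );

    // pi, pv and pi2 all share ownership of the same int
    return pi2.use_count();
}
```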

-- 
   Summary: Boost shared_ptr_test.cpp fails with -O3
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: pdimov at gmail dot com
CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: i686-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21528
