[Bug c++/89481] constexpr function allows writing one active union member and reading another

2019-02-25 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89481

--- Comment #3 from Michael Veksler  ---
Thanks for looking into it.

With the fix, does it behave the same way for:
 - runtime evaluation of all_zeros()
 - compile-time evaluation such as std::integral_constant<bool, all_zeros()>::value

Currently (trunk 20190223 (experimental) with -std=c++2a), it returns different
results for the two cases, which does not feel right.

[Bug c++/89481] New: constexpr function allows writing one active union member and reading another

2019-02-24 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89481

Bug ID: 89481
   Summary: constexpr function allows writing one active union
member and reading another
   Product: gcc
   Version: 8.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mickey.veksler at gmail dot com
  Target Milestone: ---

The following code should not compile, certainly not when its result is passed
as a template argument to std::integral_constant:

constexpr bool all_zeros()
{
    union mix {
        double numeric;
        char bytes[sizeof(double)+1];
    };
    mix zero{-0.0}; // active is 'numeric'

    // setting a different active member(?):
    zero.bytes[sizeof(double)] = '\0';
    for (unsigned i=0 ; i != sizeof(double) ; ++i)
    {
        // or this is illegal (since it was never initialized through 'bytes')
        if (zero.bytes[i])
            return false;
    }
    return true;
}

The puzzling thing is that the following two give different results:

int main()
{
    // return all_zeros();
    return std::integral_constant<bool, all_zeros()>::value;
}

Unlike gcc, clang-7.0.0 emits diagnostics:
  :3:16: error: constexpr function never produces a constant expression
[-Winvalid-constexpr]
  constexpr bool all_zeros()
 ^
  :10:32: note: assignment to member 'bytes' of union with active
member 'numeric' is not allowed in a constant expression
  zero.bytes[sizeof(double)] = '\0';
   ^
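
For contrast, here is a minimal sketch (assuming a C++14-or-later compiler) of a constexpr function that reads only the union member it initialized. Because no inactive member is touched, it is a valid constant expression, and its compile-time and runtime results can be compared directly. The names numeric_is_zero and at_compile_time are hypothetical and do not come from the report.

```cpp
#include <type_traits>

union mix {
    double numeric;
    char bytes[sizeof(double) + 1];
};

// Hypothetical helper: reads only 'numeric', the member that was
// initialized, so no inactive union member is ever accessed.
constexpr bool numeric_is_zero()
{
    mix zero{-0.0};              // active member is 'numeric'
    return zero.numeric == 0.0;  // -0.0 compares equal to 0.0
}

// Compile-time evaluation, forced through a template argument.
constexpr bool at_compile_time =
    std::integral_constant<bool, numeric_is_zero()>::value;

static_assert(at_compile_time, "usable as a constant expression");
```

Calling numeric_is_zero() at run time and comparing it with at_compile_time is the kind of consistency check the report asks for.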

[Bug libstdc++/86860] Reject valid overloads of subclass ostream operator<<

2018-08-06 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86860

--- Comment #2 from Michael Veksler  ---
Sounds reasonable. Thanks.

[Bug libstdc++/86860] New: Reject valid overloads of subclass ostream operator<<

2018-08-05 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86860

Bug ID: 86860
   Summary: Reject valid overloads of subclass ostream operator<<
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mickey.veksler at gmail dot com
  Target Milestone: ---

The following is rejected by gcc (not by other compilers):
#include <sstream>
#include <string>

struct St : std::ostringstream {
    template <typename Tp> St& operator<<( const Tp& value ) { return *this; }
    operator std::string() const { return str(); }
    friend std::ostream& operator<<( std::ostream& s, const St& ss ) { return s << ss.str(); }
};
struct Memory_type {
    std::string to_string() const { return St() << "s=" << s; }
    const char* s;
};

Because of an ambiguity between the user-defined:
    template <typename Tp> St& operator<<( const Tp& value ) { return *this; }

and the function defined by GCC (in the ostream header):
  template<typename _Ostream, typename _Tp>
    inline
    typename enable_if<__and_<__not_<is_lvalue_reference<_Ostream>>,
                              __is_convertible_to_basic_ostream<_Ostream>,
                              __is_insertable<
                                __rvalue_ostream_type<_Ostream>,
                                const _Tp&>>::value,
                       __rvalue_ostream_type<_Ostream>>::type
    operator<<(_Ostream&& __os, const _Tp& __x)
    {  }


The above function does not seem to be part of the standard, and it seems that
the other compilers can work without it.


This is described in
https://stackoverflow.com/questions/51637953/what-enable-if-or-other-hint-is-need-for-the-following-overloaded-to-compile/51692662#51692662
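
One possible way to sidestep the ambiguity, sketched here under the assumption that St does not actually need to be a std::ostream, is to hold the std::ostringstream as a member instead of inheriting from it. The class is then not convertible to a basic_ostream, so the rvalue-stream overload quoted above should no longer be a candidate. The member name buffer is hypothetical, and unlike the original St this version really forwards the value to the stream.

```cpp
#include <sstream>
#include <string>

// Composition instead of inheritance: St is not an ostream, so only the
// member template operator<< applies to it.
struct St {
    template <typename Tp>
    St& operator<<(const Tp& value)
    {
        buffer << value;  // forward to the wrapped stream
        return *this;
    }
    operator std::string() const { return buffer.str(); }

private:
    std::ostringstream buffer;  // hypothetical member name
};

struct Memory_type {
    std::string to_string() const { return St() << "s=" << s; }
    const char* s;
};
```

This is a workaround for user code only; it does not answer whether the libstdc++ overload should participate in overload resolution here.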

[Bug libstdc++/86852] map and unordered_map wrong deduction guides for inilializer_list

2018-08-04 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86852

--- Comment #3 from Michael Veksler  ---
I agree that this is a ridiculous example. That's why there should be an
official DR for it. It is a bad idea to have each compiler do a different
thing; that's why there is a C++ standard. clang is sticking to the standard, so
code that compiles under clang does not compile under gcc and vice versa.

As you mention, there is https://cplusplus.github.io/LWG/issue3025 ,
but even if this proposal is accepted things are still too brittle in libstdc++:

std::unordered_map m{{1,2}, {3,4}} does not work, and forcing
std::unordered_map m{std::pair{1,2}, {3,4}} is counterintuitive.

Worse: 
std::unordered_map m(
  std::initializer_list>{
  {1, 2}, {2, 3}});
does not work, which means that:
std::unordered_map m(
  std::initializer_list>{
  {1, 2}, {2, 3}});
does not work either.

It took me some time to find the right combination that makes this work, which
you mentioned above. I have seen others struggle with this, so it is not just
me. The way GCC currently does it is not very intuitive.

[Bug libstdc++/86852] New: map and unordered_map wrong deduction guides for inilializer_list

2018-08-03 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86852

Bug ID: 86852
   Summary: map and unordered_map wrong deduction guides for
inilializer_list
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mickey.veksler at gmail dot com
  Target Milestone: ---

According to
https://en.cppreference.com/w/cpp/container/unordered_map/deduction_guides

Deduction guides for unordered_map:
template<class Key, class T, class Hash = std::hash<Key>,
         class Pred = std::equal_to<Key>,
         class Alloc = std::allocator<std::pair<const Key, T>>>
unordered_map(std::initializer_list<std::pair<const Key, T>>,
         typename /*see below*/::size_type = /*see below*/,
         Hash = Hash(), Pred = Pred(), Alloc = Alloc())
-> unordered_map<Key, T, Hash, Pred, Alloc>;


Note that the guide is for std::pair<const Key, T>, i.e., the key is const.
In libstdc++'s unordered_map:

unordered_map(initializer_list<pair<_Key, _Tp>>,
  typename unordered_map<int, int>::size_type = {},
  _Hash = _Hash(), _Pred = _Pred(), _Allocator = _Allocator())
-> unordered_map<_Key, _Tp, _Hash, _Pred, _Allocator>;


Note that it uses pair<_Key, _Tp>, i.e., the key is not const.

This breaks unordered_map and map deduction guides:
  #include <unordered_map>
  #include <initializer_list>
  int main() {
     std::unordered_map m1(std::initializer_list<
          std::pair<const int, int>>({{1, 2}, {2, 3}}));
  }

This fails with:
: In function 'int main()':

:5:69: error: class template argument deduction failed:

 std::pair<const int, int>>({{1, 2}, {2, 3}}));




However, if const is removed from the key:
  #include <unordered_map>
  #include <initializer_list>
  int main() {
     std::unordered_map m1(std::initializer_list<
          std::pair< int, int>>({{1, 2}, {2, 3}}));
  }

The deduction guide works, but the result is unusable:
: In function 'int main()':

:5:64: error: no matching function for call to 'std::unordered_map<int, int>::unordered_map(std::initializer_list<std::pair<int, int> >)'

 std::pair< int, int>>({{1, 2}, {2, 3}}));

=
The only way to make this work in gcc is
   std::unordered_map m{std::pair{1,2}, {3,4}};

But this does not seem to be correct.
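
A small sketch of the two forms that can be expected to compile, assuming a C++17 compiler: explicit template arguments, which bypass deduction guides entirely, and the std::pair-led initializer list quoted above. The function names make_explicit and make_deduced are hypothetical.

```cpp
#include <type_traits>
#include <unordered_map>
#include <utility>

// Explicit template arguments: no deduction guide is involved.
inline std::unordered_map<int, int> make_explicit()
{
    return std::unordered_map<int, int>{{1, 2}, {3, 4}};
}

// Class template argument deduction driven by a std::pair first element,
// the form the report says gcc accepts.
inline auto make_deduced()
{
    std::unordered_map m{std::pair{1, 2}, {3, 4}};
    return m;
}

static_assert(std::is_same_v<decltype(make_deduced()),
                             std::unordered_map<int, int>>,
              "deduced type matches the explicit one");
```

The static_assert only confirms what gets deduced; it says nothing about which initializer_list element type the standard's guide ought to name.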

[Bug c++/86773] New: GCC accepts junk before fold expressions

2018-08-01 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86773

Bug ID: 86773
   Summary: GCC accepts junk before fold expressions
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mickey.veksler at gmail dot com
  Target Milestone: ---

#include <iostream>

template <typename ...Param>
auto work(Param && ...param)
{
    return ("hi" ... / param);
}

int main()
{
    std::cout << work(1.0, 2.0, 5, 4.0) << "\n";
}


GCC simply ignores the "hi" junk before the fold expression, with no
diagnostics.
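
For reference, the well-formed C++17 fold forms over operator/ can be sketched as follows. The function names are hypothetical; the point is that a leading operand is only valid in the binary form, where it is followed by the operator itself (init / ... / pack).

```cpp
// Unary left fold: ((p1 / p2) / p3) / ...
template <typename... Param>
constexpr auto divide_left(Param... param)
{
    return (... / param);
}

// Unary right fold: p1 / (p2 / (p3 / ...))
template <typename... Param>
constexpr auto divide_right(Param... param)
{
    return (param / ...);
}

// Binary left fold with an explicit initial operand:
// ((init / p1) / p2) / ...
template <typename... Param>
constexpr auto divide_from(double init, Param... param)
{
    return (init / ... / param);
}

static_assert(divide_left(8.0, 2.0, 2.0) == 2.0, "(8 / 2) / 2");
static_assert(divide_right(8.0, 2.0, 2.0) == 8.0, "8 / (2 / 2)");
static_assert(divide_from(16.0, 2.0, 2.0) == 4.0, "(16 / 2) / 2");
```

An expression such as ("hi" ... / param) matches none of these forms, which is why a diagnostic is expected.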

[Bug tree-optimization/18501] [6/7/8/9 Regression] Missing 'used uninitialized' warning (CCP)

2018-07-24 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18501

Michael Veksler  changed:

   What|Removed |Added

 CC||mickey.veksler at gmail dot com

--- Comment #84 from Michael Veksler  ---
Ping.  
At least -Wmaybe-uninitialized should emit warnings.

This still happens on the trunk (gcc version 9.0.0 20180723 (experimental)
(GCC-Explorer-Build)) :
  int f(int a)
  {
      int ret;
      if (a) {
          ret = 1;
      }

      return ret;
  }

No warning, including with  -Wmaybe-uninitialized. All other compilers warn
about this (at least clang and Visual C++).
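
For comparison, a variant in which ret is assigned on every path looks like the sketch below. The name f_initialized and the default value 0 are assumptions made for illustration; the report does not say which value is intended when a is zero.

```cpp
// Variant of f() in which 'ret' has a defined value on every path,
// so no "used uninitialized" diagnostic applies.
int f_initialized(int a)
{
    int ret = 0;  // assumed default; not specified in the report
    if (a) {
        ret = 1;
    }
    return ret;
}
```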

[Bug c++/86619] Missed optimization opportunity with array aliasing

2018-07-23 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619

--- Comment #4 from Michael Veksler  ---
It is interesting to check the impact on numerical C++ benchmarks.

Fortran has a conceptual restrict on all its parameter arrays, 
since aliasing is not allowed.

void f(int * __restrict__ v1, int * __restrict__ v2, int n)
{
    for (int i=0 ; i < n ; i++)
        v1[0] += v2[i];
}
and Fortran:
  subroutine f(v1, v2, n)
  integer :: v1(100)
  integer :: v2(100)
  integer :: n

  DO i=1, n
    v1(1) = v1(1) + v2(i)
  END DO
  end subroutine f

Generate the same loop:
.L3:
    addl    (%rdx), %eax
    addq    $4, %rdx
    cmpq    %rdx, %r8
    jne     .L3


But without restrict, as expected, g++ generates:
.L8:
    addl    (%rdx), %eax
    addq    $4, %rdx
    cmpq    %r8, %rdx
    movl    %eax, (%rcx)
    jne     .L8

Running both variants from a loop (in a separate translation unit, 
without whole program optimization) (g++ 7.2.0 with -O2 on 64 bit cygwin):
#include <ctime>
#include <iostream>
void f(int * __restrict__ v1, int *__restrict__  v2, int SIZE);
void g(int * v1, int * v2, int SIZE);
constexpr int SIZE = 1'000'000;
int v2[SIZE];
int main()
{
    int v1 = 0;

    f(&v1, v2, SIZE); // Warm up cache

    auto start = std::clock();
    constexpr int TIMES = 10'000;
    for (int i=0 ; i < TIMES; ++i) {
        v1 = 0;
        f(&v1, v2, SIZE);
    }

    auto t1 = std::clock();
    for (int i=0 ; i < TIMES; ++i) {
        v1 = 0;
        g(&v1, v2, SIZE);
    }

    auto t2 = std::clock();
    std::cout << "with restrict: "
       << double(t1 - start) / CLOCKS_PER_SEC << " sec\n";
    std::cout << "without restrict: "
       << double(t2 - t1) / CLOCKS_PER_SEC  << " sec\n";
}

And the results are:
  with restrict: 4.477 sec
  without restrict: 5.756 sec
Which clearly demonstrates the impact of good alias analysis.

With plain C pointers, this is an unavoidable price.
But unfortunately this also happens when passing pointers or 
references to arrays of different sizes, or when inheriting two 
different types from std::array, in order to mark the parameters
as non-aliasing.
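
The second translation unit used by the benchmark is not shown above. A plausible sketch follows: f() is the restrict-qualified function quoted earlier, and g() is assumed to have the same body without the qualifiers, since only its declaration appears in the driver. __restrict__ is a GNU extension.

```cpp
// With __restrict__, the compiler may assume v1 and v2 never overlap,
// so v1[0] can stay in a register for the whole loop.
void f(int* __restrict__ v1, int* __restrict__ v2, int n)
{
    for (int i = 0; i < n; i++)
        v1[0] += v2[i];
}

// Assumed definition of g(): identical body, no restrict qualifiers,
// so v1[0] may alias v2[i] and must be stored on every iteration.
void g(int* v1, int* v2, int n)
{
    for (int i = 0; i < n; i++)
        v1[0] += v2[i];
}
```

Both functions compute the same sum when the arguments do not overlap; only the generated loop differs.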

[Bug c++/86619] Missed optimization opportunity with array aliasing

2018-07-23 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619

--- Comment #2 from Michael Veksler  ---
>> type-based alias analysis doesn't distinguish between int[2] and int[3]. 

Is it just the way GCC implements type-based alias analysis, 
or is it defined that way in the C and C++ standards?

I suspect  that the weaker alias analysis of arrays (int [size] and
std::array) is one of the things that make C++ slower than 
Fortran on some benchmarks.

[Bug c++/86619] New: Missed optimization opportunity with array aliasing

2018-07-21 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86619

Bug ID: 86619
   Summary: Missed optimization opportunity with array aliasing
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mickey.veksler at gmail dot com
  Target Milestone: ---

// gcc version 9.0.0 20180720 (experimental) 
// Compiled with -O3

int f(std::array<int, 2> & a, std::array<int, 3> & b)
{
  a[0] = 1;
  b[0] = 2;
  return a[0];
}

Produces:
f(std::array&, std::array&):
  mov DWORD PTR [rdi], 1
  mov DWORD PTR [rsi], 2
  mov eax, DWORD PTR [rdi]
  ret

Instead of
  mov DWORD PTR [rdi], 1
  mov eax, 1
  mov DWORD PTR [rsi], 2
  ret

But this does not seem to be something that libstdc++ can do anything about.
Consider a simplified array implementation:

template <typename T, size_t size>
struct ar
{
  T ar[size];
  T & operator[](size_t offset) { return ar[offset]; }
};

int f1(ar<int, 2> & a, ar<int, 3> & b)
{
  a.ar[0] = 1;
  b.ar[0] = 2;
  return a.ar[0];
// This is perfect:
/*
  mov DWORD PTR [rdi], 1
  mov eax, 1
  mov DWORD PTR [rsi], 2
  ret
*/
}

// BUT:
int f2(ar<int, 2> & a, ar<int, 3> & b)
{
  a[0] = 1;
  b[0] = 2;
  return a[0];
// Too conservative alias analysis 
/*
  mov DWORD PTR [rdi], 1
  mov DWORD PTR [rsi], 2
  mov eax, DWORD PTR [rdi]
*/
}

It seems that by returning a reference, operator[] makes the compiler lose the
fact that a and b can't alias.

I'm not a language lawyer, but the following also seems to be another lost
optimization opportunity for arrays. After all, a and b have different types:
int g(int (&a)[2], int (&b)[3])
{
   a[0] = 1;
   b[0] = 2;
   return a[0];
/*
  mov DWORD PTR [rdi], 1
  mov DWORD PTR [rsi], 2
  mov eax, DWORD PTR [rdi]
  ret
*/
}
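
A self-contained version of the simplified wrapper, suitable for pasting into a compiler explorer, might look like the sketch below. The element type int and the sizes 2 and 3 are assumptions chosen to match g() above, and the data member is renamed to data (the report calls it ar) purely for readability.

```cpp
#include <cstddef>

// Simplified array wrapper; 'data' is a hypothetical member name.
template <typename T, std::size_t size>
struct ar {
    T data[size];
    T& operator[](std::size_t offset) { return data[offset]; }
};

// Direct member access: the report says this is optimized well.
int f1(ar<int, 2>& a, ar<int, 3>& b)
{
    a.data[0] = 1;
    b.data[0] = 2;
    return a.data[0];
}

// Access through operator[], which returns a plain int&: the report
// says the final load is not folded to the constant 1 here.
int f2(ar<int, 2>& a, ar<int, 3>& b)
{
    a[0] = 1;
    b[0] = 2;
    return a[0];
}
```

Both functions return 1 when a and b are distinct objects; the difference the report describes is only in the code generated for the final load.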

[Bug middle-end/55217] False -Wstrict-overflow warning

2014-10-03 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55217

--- Comment #5 from Michael Veksler mickey.veksler at gmail dot com ---
Running the delta.c example with -fdump-tree-all-all-lineno produces
delta.c.125t.vrp2. 

For some reason, stop_9 (which is the first stop_.* in the file) is initialized
with   stop_9 = barD.1593 (), but it should have been initialized with 0.
=
  # i_17 = PHI <[delta.c : 5:36] i_10(4), [delta.c : 5:14] 10(2)>
  # .MEM_18 = PHI <.MEM_8(4), .MEM_4(D)(2)>
  [delta.c : 6:13] # .MEM_8 = VDEF <.MEM_18>
  # USE = nonlocal 
  # CLB = nonlocal 
  stop_9 = barD.1593 ();  <== Weird reorder
  [delta.c : 5:36] i_10 = i_17 + -1;
  [delta.c : 5:22] _5 = i_10 >= 0;
  [delta.c : 5:29] _6 = stop_9 == 0;
  [delta.c : 5:26] _7 = _6 & _5;
  [delta.c : 5:5] if (_7 != 0)
    goto <bb 4>;
  else
    goto <bb 5>;

==
This seems wrong because the first time stop == 0 is checked in the source is:
    int stop= 0;
    for (int i=10 ; i>=0 && !stop; --i) {
                            ^ <=== First time stop == 0 is checked.
        stop= bar();
    }
}


=
It seems that VRP sees the call to bar() in the wrong place.

Another issue is that VRP sees i>=0 && !stop, which it translates to:
Visiting statement:

=== This gives a don't know: ===
[delta.c : 5:26] _7 = _6 & _5;

Found new range for _7: [0, +INF]
[snip]
Predicate evaluates to: DON'T KNOW

==
From there things go downhill. Instead of knowing that _7 implies _5 (i.e.,
i>=0), it loses this information. So VRP does not understand that in the loop
i >= 0.

=== This causes the following: 

i_17: loop information indicates does not overflow
Induction variable (int) 9 + -1 * iteration does not wrap in statement i_10 =
i_17 + -1;
 in loop 1.
Statement i_10 = i_17 + -1;
 is executed at most 2147483657 (bounded by 2147483657) + 1 times in loop 1.
Found new range for i_17: [-INF, 10]

=
If it knew that _7 implies _5 and hence i>=0, then it could have understood that
i_17: [0, 10].

= This could lead to missed optimizations (in other cases), and bogus
warnings: ===
Visiting statement:
[delta.c : 5:36] i_10 = i_17 + -1;

Found new range for i_10: [-INF(OVF), 9]
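
The range that VRP would ideally derive for i inside the loop body can be checked empirically with a small harness. This is only a sketch: Range and observe_loop are hypothetical names, and bar is passed in as a callable so that the stop condition can be controlled.

```cpp
struct Range {
    int min;
    int max;
    int iterations;
};

// Runs the delta.c loop and records the values i takes in the body.
template <typename Bar>
Range observe_loop(Bar bar)
{
    Range seen{10, 10, 0};
    int stop = 0;
    for (int i = 10; i >= 0 && !stop; --i) {
        if (i < seen.min) seen.min = i;
        if (i > seen.max) seen.max = i;
        ++seen.iterations;
        stop = bar();
    }
    return seen;
}
```

If bar() never returns non-zero, the body runs 11 times with i going from 10 down to 0, so i stays within [0, 10] and the decrement cannot overflow.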


[Bug libstdc++/62237] New: ostream single character printing is slower than fprintf with %c.

2014-08-23 Thread mickey.veksler at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62237

Bug ID: 62237
   Summary: ostream single character printing is slower than
fprintf with %c.
   Product: gcc
   Version: 4.9.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: mickey.veksler at gmail dot com

Created attachment 33383
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33383&action=edit
ostream vs. fprintf

I have tried to write a micro-benchmark to compare the performance of the
ostream and FILE interfaces. It turns out that GCC's ostream is performing
quite well except in one case:
out << ch;

It takes almost 2.5 times the wall-time of the fprintf(out, "%c", ch) version
(Please see the attached files). This does not make sense to me. Even worse,
the fprintf version takes only 60% CPU instead of the 99% of the fprintf
version.

In order not to make you follow my analysis, which is possibly wrong, I do not
add more of my results. I would say that from my analysis it seems that it
should be relatively easy to speed-up this case, significantly.
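
The attached benchmark is not reproduced here. A rough sketch of the two code paths being compared, with hypothetical function names and a simple std::clock-based timer, could look like this; it is not the attachment itself.

```cpp
#include <cstddef>
#include <cstdio>
#include <ctime>
#include <sstream>

// Writes 'count' copies of 'ch' with the stream inserter, the slow case
// described in the report.
void write_with_ostream(std::ostream& out, char ch, long count)
{
    for (long i = 0; i < count; ++i)
        out << ch;
}

// Writes the same characters through the FILE interface.
void write_with_fprintf(std::FILE* out, char ch, long count)
{
    for (long i = 0; i < count; ++i)
        std::fprintf(out, "%c", ch);
}

// CPU seconds consumed by an arbitrary callable.
template <typename Fn>
double cpu_seconds(Fn fn)
{
    std::clock_t start = std::clock();
    fn();
    return double(std::clock() - start) / CLOCKS_PER_SEC;
}

// Sanity check: size of the ostream variant's output in memory.
std::size_t ostream_output_size(char ch, long count)
{
    std::ostringstream sink;
    write_with_ostream(sink, ch, count);
    return sink.str().size();
}
```

Timing both writers against the same kind of sink (for example a file opened once with std::ofstream and once with std::fopen) via cpu_seconds would approximate the comparison described above. Wall-clock time needs a separate timer.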


[Bug c/55217] False -Wstrict-overflow warning

2013-12-14 Thread mickey.veksler at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55217

Michael Veksler mickey.veksler at gmail dot com changed:

   What|Removed |Added

 CC||mickey.veksler at gmail dot com

--- Comment #1 from Michael Veksler mickey.veksler at gmail dot com ---
(Strange that this hasn't been confirmed for over a year!)

I have a similar issue with gcc-4.8 with a slightly different test-case. 
So first I looked into your test case. To me it seems that gcc-4.7 warning does
look strange here, but with gcc-4.8 things look better:


gcc -c -O2 -Wstrict-overflow=2 beta.c -std=c99
beta.c: In function ‘f’:
beta.c:7:20: warning: assuming signed overflow does not occur when simplifying
conditional to constant [-Wstrict-overflow]
 if (r)
^
beta.c:10:17: warning: assuming signed overflow does not occur when simplifying
conditional to constant [-Wstrict-overflow]
 for (int j = 0; j < r; j++)
---

The first warning for if(r) makes sense.
However, the second warning does not make sense. Even after removing the 'if'
the warning stays:
---
void f(int n, int s)
{
    int r = 1;
    for (int i = 1; i < n; i++)
        if (r)
            r++;
    for (int j = 0; j < r; j++)
        h(s);
}


gamma.c:9:9: warning: assuming signed overflow does not occur when simplifying
conditional to constant [-Wstrict-overflow]
 for (int j = 0; j < r; j++)
---

It is as if gcc transforms the for loop to:
    int j=0;
    if (j >= r) goto done;  // <--- does the warning come from here?
  loop:
    h(s);
    j++;
    if (j < r) goto loop;
  done:

I assume that gcc then notices that r>=1, unless it overflows, and hence
j>=r must be false for the first j, i.e., j=0.
If this is what happens, then this is the wrong way to do it. If the same
expression is duplicated then -Wstrict-overflow should be emitted only if
it applies to both duplicates, or am I missing something?
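
The hypothesized transformation (often called loop rotation) can be written out as compilable code to confirm that the two shapes are equivalent. In this sketch the call h(s) is replaced by a counter, and both function names are hypothetical.

```cpp
// Original shape: the condition is tested at the top of every iteration.
int calls_with_for(int r)
{
    int calls = 0;
    for (int j = 0; j < r; j++)
        ++calls;  // stands in for h(s)
    return calls;
}

// Rotated shape: one guard test before the loop, then a test at the
// bottom of each iteration. The condition now appears twice.
int calls_rotated(int r)
{
    int calls = 0;
    int j = 0;
    if (j >= r) goto done;  // duplicated copy of the loop condition
loop:
    ++calls;                // stands in for h(s)
    j++;
    if (j < r) goto loop;
done:
    return calls;
}
```

Only the guard copy can be simplified to a constant when r >= 1 is known, which is consistent with the guess that the warning is tied to one duplicate rather than to the condition as written in the source.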

[Bug c/55217] False -Wstrict-overflow warning

2013-12-14 Thread mickey.veksler at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55217

--- Comment #2 from Michael Veksler mickey.veksler at gmail dot com ---
A much more clear-cut, weird, and severe case:
$ cat delta.c
int bar();
void foo()
{
    int stop= 0;
    for (int i=10 ; i>=0 && !stop; --i) {
        stop= bar();
    }
}

$ gcc -c -O3 -Wstrict-overflow=3 delta.c -std=c99
delta.c: In function ‘foo’:
delta.c:5:22: warning: assuming signed overflow does not occur when changing X
+- C1 cmp C2 to X cmp C1 +- C2 [-Wstrict-overflow]
     for (int i=10 ; i>=0 && !stop; --i) {
                      ^


This makes no sense at all and significantly lowers the usability of
-Wstrict-overflow=3. Either VRP or constant-propagation should have realized that
overflow is impossible, or does VRP come into play only after the warning is
emitted? Or maybe VRP can't do it because such reasoning requires induction?

Oh, and:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
4.8.1-10ubuntu9' --with-bugurl=file:///usr/share/doc/gcc-4.8/README.Bugs
--enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-4.8 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.8 --libdir=/usr/lib --enable-nls
--with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --enable-gnu-unique-object --enable-plugin
--with-system-zlib --disable-browser-plugin --enable-java-awt=gtk
--enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64/jre
--enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.8-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.8-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 4.8.1 (Ubuntu/Linaro 4.8.1-10ubuntu9)