[Bug target/116014] Missed optimization opportunity: inverted shift count

2024-07-24 Thread bisqwit at iki dot fi via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116014

--- Comment #2 from Joel Yliluoma  ---
(In reply to Andi Kleen from comment #1)
> is that from some real code? why would a programmer write shifts like that?

Yes, it is from actual code:

uint64_t readvlq()
{
    uint64_t x, f = ~(uint64_t)0, ones8 = f / 255, pat80 = ones8*0x80,
             pat7F = ones8*0x7F;
    memcpy(&x, ptr, sizeof(x));
    uint8_t n = __builtin_ctzll(~(x|pat7F)) + 1;
    ptr += n/8;
    return _pext_u64(x, pat7F >> (64-n));
}

This function reads a variable-length encoded integer (as in General MIDI) from
a bytestream without loops or branches. It essentially does the same as this:

uint64_t readvlq()
{
    uint64_t result = 0;
    do { result = (result << 7) | (*ptr & 0x7F); } while(*ptr++ & 0x80);
    return result;
}
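
As a self-contained illustration of the loop form above, here is a sketch that decodes from an explicit buffer instead of the global `ptr` the snippets rely on. The name `read_vlq`, its pointer-reference parameter, and the sample bytes are assumptions made for this example; they are not part of the original code.

```cpp
#include <cassert>
#include <cstdint>

// Decodes one variable-length quantity (7 payload bits per byte, high bit set
// on every byte except the last) and advances `p` past it.
// `read_vlq` and its signature are hypothetical; the original uses a global `ptr`.
std::uint64_t read_vlq(const std::uint8_t*& p)
{
    std::uint64_t result = 0;
    std::uint8_t byte;
    do {
        byte = *p++;
        result = (result << 7) | (byte & 0x7F);
    } while (byte & 0x80);
    return result;
}

// Small self-check: 0x81 0x00 encodes 128, and the cursor moves by two bytes.
inline void read_vlq_example()
{
    const std::uint8_t data[] = {0x81, 0x00, 0x7F};
    const std::uint8_t* p = data;
    assert(read_vlq(p) == 128);
    assert(p == data + 2);
    assert(read_vlq(p) == 127);
}
```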

It isn’t too hard to think of plausible other cases where bitshifts with
numberofbits(tgt)-variable may occur. In fact, after just 2 minutes of
searching with `grep`, I found this line in LLVM
(llvm-17/llvm/Bitstream/BitstreamWriter.h), where CurValue is a 32-bit entity:

CurValue = Val >> (32-CurBit);

[Bug middle-end/116013] Missed optimization opportunity with andn involving consts

2024-07-20 Thread bisqwit at iki dot fi via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116013

--- Comment #1 from Joel Yliluoma  ---
It should be noted that this is not x86_64 specific; andn exists for other
platforms too, and even for platforms that don’t have it, changing
`~(expr|const)` into `~expr & ~const` is unlikely to be a pessimization.

[Bug tree-optimization/116014] New: Missed optimization opportunity: inverted shift count

2024-07-20 Thread bisqwit at iki dot fi via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116014

Bug ID: 116014
   Summary: Missed optimization opportunity: inverted shift count
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Below are six short functions which perform bit-shifts by a non-constant
inverted amount. GCC fails to generate optimal code for them. Further
explanation is given below the assembler code.

#include <stdint.h>
uint64_t shl_m64(uint64_t value, uint8_t k)
{
    return value << (64-k);
}
uint64_t shl_m63(uint64_t value, uint8_t k)
{
    return value << (63-k);
}
uint64_t shr_m64(uint64_t value, uint8_t k)
{
    return value >> (64-k);
}
uint64_t shr_m63(uint64_t value, uint8_t k)
{
    return value >> (63-k);
}
int64_t asr_m64(int64_t value, uint8_t k)
{
    return value >> (64-k);
}
int64_t asr_m63(int64_t value, uint8_t k)
{
    return value >> (63-k);
}

Below is the code generated by GCC, using -Ofast -mbmi2 -masm=intel. BMI2 is
used just to make the assembler code more succinct; it is not relevant for the
report.

shl_m64:
        mov     eax, 64
        sub     eax, esi
        shlx    rax, rdi, rax
        ret
shl_m63:
        mov     eax, 63
        sub     eax, esi
        shlx    rax, rdi, rax
        ret
shr_m64:
        mov     eax, 64
        sub     eax, esi
        shrx    rax, rdi, rax
        ret
shr_m63:
        mov     eax, 63
        sub     eax, esi
        shrx    rax, rdi, rax
        ret
asr_m64:
        mov     eax, 64
        sub     eax, esi
        sarx    rax, rdi, rax
        ret
asr_m63:
        mov     eax, 63
        sub     eax, esi
        sarx    rax, rdi, rax
        ret

GCC fails to utilize the fact that on x86, the shift instructions
automatically mask the shift count to the operand width. That is, a shift of
a 64-bit operand by 68 is the same as a shift by 68%64 = 4, and a shift of a
32-bit operand by 100 is the same as a shift by 100%32 = 4. Since 64-k and -k
agree modulo 64, as do 63-k and ~k, a single-insn neg/not can replace the
subtract, which requires two insns.
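
The arithmetic behind the neg/not replacement can be checked exhaustively for every 8-bit `k`. The following is a small sketch; `masked` and `counts_agree` are hypothetical helper names introduced for this illustration. It compares shift counts only and performs no shifts, since a shift by 64 is undefined in C and C++.

```cpp
#include <cassert>
#include <cstdint>

// For a 64-bit shift the hardware only looks at the low 6 bits of the count,
// so two counts are interchangeable whenever they agree modulo 64.
constexpr unsigned masked(unsigned count) { return count & 63u; }

// Checks 64-k == -k and 63-k == ~k (mod 64) for every 8-bit k.
// `counts_agree` is a hypothetical helper written for this illustration.
inline bool counts_agree()
{
    for (unsigned k = 0; k < 256; ++k)
    {
        if (masked(64u - k) != masked(0u - k)) return false; // neg
        if (masked(63u - k) != masked(~k))     return false; // not
    }
    return true;
}

inline void counts_example()
{
    assert(counts_agree());
    assert(masked(68u) == 4u);   // the "shift by 68" case from the text
}
```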

In comparison, Clang (version 16) produces this (optimal) code:

shl_m64:
        neg     sil
        shlx    rax, rdi, rsi
        ret
shl_m63:
        not     sil
        shlx    rax, rdi, rsi
        ret
shr_m64:
        neg     sil
        shrx    rax, rdi, rsi
        ret
shr_m63:
        not     sil
        shrx    rax, rdi, rsi
        ret
asr_m64:
        neg     sil
        sarx    rax, rdi, rsi
        ret
asr_m63:
        not     sil
        sarx    rax, rdi, rsi
        ret

Tested GCC version: GCC: (Debian 14-20240330-1) 14.0.1 20240330 (experimental)
[master r14-9728-g6fc84f680d0]

[Bug tree-optimization/116013] New: Missed optimization opportunity with andn involving consts

2024-07-20 Thread bisqwit at iki dot fi via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116013

Bug ID: 116013
   Summary: Missed optimization opportunity with andn involving
consts
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Below are two short functions which work identically. While GCC utilizes the
ANDN instruction (of Intel BMI1) for test2, it fails to see that it could do
the same with test1.

#include <stdint.h>
uint64_t test1(uint64_t value)
{
    return ~(value | 0x7F7F7F7F7F7F7F7F);
}
uint64_t test2(uint64_t value)
{
    return ~value & ~0x7F7F7F7F7F7F7F7F;
}

Assembler listings of both functions are below (-Ofast -mbmi):

test1:
        movabsq $9187201950435737471, %rdx
        movq    %rdi, %rax
        orq     %rdx, %rax
        notq    %rax
        ret
test2:
        movabsq $-9187201950435737472, %rax
        andn    %rax, %rdi, %rax
        ret

Tested compiler version:
GCC: (Debian 14-20240330-1) 14.0.1 20240330 (experimental) [master
r14-9728-g6fc84f680d0]

This optimization only makes sense if one of the operands is a compile-time
constant. If neither operand is a compile-time constant, then the opposite
optimization makes more sense, and GCC already does that.
It is also worth noting that GCC already compiles ~(var1 | ~var2) into ~var1 &
var2, utilizing ANDN. This is good.
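
The equivalence of the two forms is De Morgan's law, and it can be confirmed at compile time. This is a minimal sketch: the `constexpr` qualifiers and the named constant `pat7F` are additions for the illustration, and the sample inputs are arbitrary.

```cpp
#include <cstdint>

constexpr std::uint64_t pat7F = 0x7F7F7F7F7F7F7F7Full;

// Same bodies as in the report, made constexpr so they can be compared
// inside static_assert.
constexpr std::uint64_t test1(std::uint64_t value) { return ~(value | pat7F); }
constexpr std::uint64_t test2(std::uint64_t value) { return ~value & ~pat7F; }

static_assert(test1(0) == 0x8080808080808080ull, "only the top bit of each byte survives");
static_assert(test1(0) == test2(0), "");
static_assert(test1(0x0123456789ABCDEFull) == test2(0x0123456789ABCDEFull), "");
static_assert(test1(~0ull) == test2(~0ull), "");
```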

[Bug c++/99895] New: Function parameters generated wrong in call to member of non-type template parameter in lambda

2021-04-03 Thread bisqwit at iki dot fi via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99895

Bug ID: 99895
   Summary: Function parameters generated wrong in call to member
of non-type template parameter in lambda
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

GCC produces a false error message:

bug1.cc: In instantiation of ‘consteval void VerifyHash() [with unsigned int
expected_hash = 5; fixed_string<...auto...> ...s = {fixed_string<6>{"khaki"},
fixed_string<6>{"plums"}}]’:
bug1.cc:24:37:   required from here
bug1.cc:19:41: error: no matching function for call to
‘fixed_string<6>::data(const fixed_string<6>*)’
   19 |   [](auto){static_assert(hash(s.data(), s.size()) ==
expected_hash);}(s)
  |   ~~^~
bug1.cc:11:27: note: candidate: ‘consteval const char* fixed_string::data()
const [with long unsigned int N = 6]’
   11 | consteval const char* data() const { return str; }
  |   ^~~~
bug1.cc:11:27: note:   candidate expects 0 arguments, 1 provided

On this code:

#include <algorithm> // copy_n and size_t
static constexpr unsigned hash(const char* s, std::size_t length)
{
    s=s;
    return length;
}
template<std::size_t N>
struct fixed_string
{
    constexpr fixed_string(const char (&s)[N]) { std::copy_n(s, N, str); }
    consteval const char* data() const { return str; }
    consteval std::size_t size() const { return N-1; }
    char str[N];
};
template<unsigned expected_hash, fixed_string... s>
static consteval void VerifyHash()
{
    (
      [](auto){static_assert(hash(s.data(), s.size()) == expected_hash);}(s)
    ,...);
    // The compiler mistakenly translates s.data() into s.data(&s)
    // and then complains that the call is not valid, because
    // the function expects 0 parameters and 1 "was provided".
}
void foo()
{
    VerifyHash<5, "khaki", "plums">();
}


Compiler version:
g++-10 (Debian 10.2.1-6) 10.2.1 20210110

[Bug c++/99893] New: C++20 unexpanded parameter packs falsely not detected (lambda is involved)

2021-04-03 Thread bisqwit at iki dot fi via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99893

Bug ID: 99893
   Summary: C++20 unexpanded parameter packs falsely not detected
(lambda is involved)
   Product: gcc
   Version: 10.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

GCC produces a false error message:

bug1.cc: In function ‘consteval void VerifyHash()’:
bug1.cc:20:70: error: operand of fold expression has no unexpanded parameter
packs
   20 |   [](){static_assert(hash(s.data(), s.size()) == expected_hash);}()
  |   ~~~^~

On this code:

#include <algorithm> // copy_n and size_t
static constexpr unsigned hash(const char* s, std::size_t length)
{
    s=s;
    return length;
}
template<std::size_t N>
struct fixed_string
{
    constexpr fixed_string(const char (&s)[N]) { std::copy_n(s, N, str); }
    consteval const char* data() const { return str; }
    consteval std::size_t size() const { return N-1; }
    char str[N];
};
template<unsigned expected_hash, fixed_string... s>
static consteval void VerifyHash()
{
    (
      [](){static_assert(hash(s.data(), s.size()) == expected_hash);}()
    ,...);
    // ^ Falsely reports that there are no unexpanded parameter packs,
    //   while there definitely is ("s" is used).
}
void foo()
{
    VerifyHash<5, "khaki", "plums">();
}


Compiler version:
g++-10 (Debian 10.2.1-6) 10.2.1 20210110

[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity

2020-04-21 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #25 from Joel Yliluoma  ---
(In reply to Jakub Jelinek from comment #24)
> on x86 read e.g. about MXCSR register and in the description of each
> instruction on which Exceptions it can raise.

So the quick answer to #15 is that addps instruction may raise exceptions. Ok,
thanks for clearing that up. My bad. So it seems that LLVM relies on the
assumption that the upper portions of the register are zeroed, and this is what
you said in the first place.
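
The mechanism in question can be observed from C++ through `<cfenv>`: the hardware records exceptional conditions in status flags (MXCSR for SSE on x86), and `fetestexcept` reads them back. Below is a minimal sketch, assuming an implementation that supports the floating-point environment. `add_raises_overflow` is a hypothetical helper, and the scalar addition merely stands in for one lane of a packed `addps`.

```cpp
#include <cassert>
#include <cfenv>
#include <cfloat>

// Adds two floats at run time and reports whether the addition raised
// the FE_OVERFLOW floating-point exception flag.
// Hypothetical helper for illustration only.
bool add_raises_overflow(float a, float b)
{
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile float x = a, y = b;   // volatile: keep the addition out of constant folding
    volatile float sum = x + y;
    (void)sum;
    return std::fetestexcept(FE_OVERFLOW) != 0;
}

inline void fenv_example()
{
    assert(add_raises_overflow(FLT_MAX, FLT_MAX));  // result does not fit in a float
    assert(!add_raises_overflow(1.0f, 2.0f));       // ordinary addition, no overflow
}
```

If the upper lanes of a vector register hold arbitrary values, a packed addition performs this kind of operation on them as well, so a flag can be set even though the program never looks at those lanes.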

[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity

2020-04-21 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #23 from Joel Yliluoma  ---
(In reply to Jakub Jelinek from comment #21)
> (In reply to Joel Yliluoma from comment #20)
> > Which exceptions would be generated by data in an unused portion of a
> > register?
> 
> addps adds 4 float elements, there is no "unused" portion.
> If some of the elements contain garbage, it can trigger for e.g. the addition
> FE_INVALID, FE_OVERFLOW, FE_UNDERFLOW or FE_INEXACT (FE_DIVBYZERO obviously
> isn't relevant to addition).
> Please read the standard about floating point exceptions, fenv.h etc.

There is an “unused” portion, for the purposes of the data use. It is the same
as with padding in structs: the memory is unused because no part of the
program relies on its contents, even though the CPU may load those portions
into registers when e.g. moving and copying the struct. The CPU won’t know
whether it’s used or not.

You mention FE_INVALID etc., but those are concepts within the C standard
library, not in the hardware. The C standard library will not make judgments on
the upper portions of the register. So if you have two float[2]s, and you add
them together into another float[2], and the compiler uses addps to achieve
this task, what is the mechanism that would supposedly generate an exception,
when no part of the software depends on or makes judgments about the
irrelevant parts of the register?

[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity

2020-04-21 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #20 from Joel Yliluoma  ---
(In reply to Jakub Jelinek from comment #16)
> (In reply to Joel Yliluoma from comment #15)
> > (In reply to Richard Biener from comment #14)
> > > I also think llvms code generation is bogus since it appears the ABI
> > > does not guarantee zeroed upper elements of the xmm0 argument
> > > which means they could contain sNaNs:
> > 
> > Why would it matter that the unused portions of the register contain NaNs?
> 
> Because it could then raise exceptions that shouldn't be raised?

Which exceptions would be generated by data in an unused portion of a register?
Does for example “addps” generate an exception if one or two of the operands
contains NaNs? Which instructions would generate exceptions?

I can only think of divps, when dividing by a zero, but it does not seem that
even LLVM compiles the two-element vector division into divps.

If the register is passed as a parameter to a library function, they would not
make judgments based on the values of the unused portions of the registers.

[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity

2020-04-21 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #15 from Joel Yliluoma  ---
(In reply to Richard Biener from comment #14)
> I also think llvms code generation is bogus since it appears the ABI
> does not guarantee zeroed upper elements of the xmm0 argument
> which means they could contain sNaNs:

Why would it matter that the unused portions of the register contain NaNs?

[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity

2020-04-21 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #13 from Joel Yliluoma  ---
GCC 4.1.2 is indicated in the bug report headers.
Luckily, Compiler Explorer has a copy of that exact version, and it indeed
vectorizes the second function: https://godbolt.org/z/DC_SSb

On my own system, the earliest I have is 4.6. The Compiler Explorer has 4.4,
and it, or anything newer than that, no longer vectorizes either function.

[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity

2020-04-21 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #11 from Joel Yliluoma  ---
Looks like this issue has taken a step or two *backwards* in the past years.

Whereas the second function used to be vectorized properly, today it seems
neither of them is.

Contrast this with Clang, which compiles *both* functions into a single
instruction:

  vaddps xmm0, xmm1, xmm0

or some variant thereof depending on the -m options.

Compiler Explorer link: https://godbolt.org/z/2AKhnt

[Bug c++/94575] Bogus warning: Used variable is “not” used

2020-04-20 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94575

--- Comment #2 from Joel Yliluoma  ---
Sorry, the error Marek Polacek mentions is due to a copypaste mistake on my
part. The correct code that demonstrates the problem is here. The difference is
the && instead of &.

#include <cstdio>
template<typename T>
static void Use(T&& plot)
{
    plot(1);
}
int main()
{
    static const int table[1] = {123456};
    Use([&](auto x)
    {
        unsigned var = table[x];
        unsigned ui = var;
        std::printf("%u\n", ui);
    });
}

[Bug c++/94575] New: Bogus warning: Used variable is “not” used

2020-04-13 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94575

Bug ID: 94575
   Summary: Bogus warning: Used variable is “not” used
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Incorrect warning (-std=c++14 -Wall )
Tested and occurs on GCC 5.4.1, 6.5.0, 7.5.0, 8.4.0, 9.3.0, and 10.0.1 (master
revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536))

tmp.cc: In function ‘int main()’:
tmp.cc:9:22: warning: variable ‘table’ set but not used
[-Wunused-but-set-variable]
9 | static const int table[1] = {123456};
  |  ^

Table is in fact used; program prints 123456.

#include <cstdio>
template<typename T>
static void Use(T& plot)
{
    plot(1);
}
int main()
{
    static const int table[1] = {123456};
    Use([&](auto x)
    {
        unsigned var = table[x];
        unsigned ui = var;
        std::printf("%u\n", ui);
    });
}

This is a non-exhaustive list of changes that will make the warning go away:

— Changing the lambda auto parameter into a static type such as int
— Changing Use() into a lambda function in main()
— Removing the store into temporary variable “ui”

[Bug c++/94571] Error: Expected comma or semicolon, comma found

2020-04-12 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94571

--- Comment #1 from Joel Yliluoma  ---
  |   ^

(Missing line from the paste)

The problem exists since GCC 7. (GCC 6 and earlier did not support structured
bindings.)

[Bug c++/94571] New: Error: Expected comma or semicolon, comma found

2020-04-12 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94571

Bug ID: 94571
   Summary: Error: Expected comma or semicolon, comma found
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

void foo()
{
int test1[2], test2[2];
auto [a,b] = test1, [c,d] = test2;
}

The error message given for this (invalid) C++17 code is a bit confusing.

tmp.cc: In function ‘void foo()’:
tmp.cc:4:23: error: expected ‘,’ or ‘;’ before ‘,’ token
4 | auto [a,b] = test1, [c,d] = test2;

You expected comma, found comma. So what is the problem? The proper error
message would be to only expect a semicolon.
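
For comparison, the valid way to write this is one structured binding per declaration, since a structured binding cannot share a declarator list with anything else. A short sketch, requiring C++17; the function name `sum_of_bindings` and the array contents are made up for the example.

```cpp
#include <cassert>

// Each structured binding gets its own declaration statement.
// `sum_of_bindings` is a hypothetical function used only for illustration.
inline int sum_of_bindings()
{
    int test1[2] = {1, 2}, test2[2] = {3, 4};
    auto [a, b] = test1;   // copies test1 into a hidden array; a and b name its elements
    auto [c, d] = test2;
    return a + b + c + d;
}

inline void bindings_example()
{
    assert(sum_of_bindings() == 10);
}
```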

[Bug tree-optimization/58195] Missed optimization opportunity when returning a conditional

2020-04-10 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58195

--- Comment #4 from Joel Yliluoma  ---
Still confirmed on GCC 10 (Debian 10-20200324-1) 10.0.1 20200324 (experimental)
[master revision
596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536]


Seems I lack the oomph to update the "confirmed" state of this report.

[Bug c++/94546] New: unimplemented: unexpected AST of kind nontype_argument_pack

2020-04-10 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94546

Bug ID: 94546
   Summary: unimplemented: unexpected AST of kind
nontype_argument_pack
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Keywords: rejects-valid
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Rejects valid code.

$ g++-10 --version
g++-10 (Debian 10-20200324-1) 10.0.1 20200324 (experimental) [master revision
596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536]

$ g++-10 tmp.cc -std=c++20
tmp.cc: In instantiation of ‘void test(auto:1&&) [with auto:1 =
main()::&]’:
tmp.cc:18:14:   required from here
tmp.cc:8:5: sorry, unimplemented: unexpected AST of kind nontype_argument_pack
8 | [&](T&&... rest)
  | ^~~~
9 | {
  | ~
   10 | plot(std::forward(rest)...);
  | ~~~
   11 | };
  | ~
tmp.cc:8: confused by earlier errors, bailing out

#include <utility>
void test(auto&& plot)
{
    // Note: For brevity, this lambda function is only
    // defined, not called nor assigned to a variable.
    // Doing those things won’t fix the error.
    [&]<typename... T>(T&&... rest)
    {
        plot(std::forward<T>(rest)...);
    };
}
int main()
{
    auto Plot = [](auto&&...)
    {
    };
    test(Plot);
}

[Bug c++/94490] New: Ternary expression with 3 consts is “not” a constant expression

2020-04-04 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94490

Bug ID: 94490
   Summary: Ternary expression with 3 consts is “not” a constant
expression
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Versions tested: GCC 9.3.0, GCC 10.0.1 20200324 (experimental) [master revision
596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536]

Error message:

test6.cc: In instantiation of ‘auto Mul(const T&, const T2&) [with T =
std::array; T2 = std::array]’:
test6.cc:20:58:   required from here
test6.cc:16:78: error: ‘(false ?  std::tuple_size_v > :
std::tuple_size_v >)’ is not a constant expression
   16 |  return typename arith_result::type {
std::get(vec) ... };
  |
 ^
test6.cc:20:6: error: ‘void x’ has incomplete type
   20 | auto x = Mul(std::array{}, std::array{});
  |  ^


Compiler is erroneously claiming that an expression of type (x ? y : z) where
all of x,y,z are constant expressions, is not a constant expression.



Code:

#include 
#include 

template, std::size_t
B=std::tuple_size_v, std::size_t N = std::min(A,B), class S =
std::make_index_sequence<(A>
struct arith_result
{
using type = std::conditional_t(std::index_sequence) ->
std::array...,
std::tuple_element_t...>, N>{}(S{})),
decltype([](std::index_sequence) ->
std::tuple,
std::tuple_element_t>...>{}(S{}))>;
};

template, typename T2 = T>
auto Mul(const T& vec, const T2& val)
{
return [&](std::index_sequence) {
 return typename arith_result::type { std::get(vec) ... };
} (std::make_index_sequence<2>{});
}

auto x = Mul(std::array{}, std::array{});


Note that if I replace the Mul function with this (inline the lambda call), the
problem goes away:

template, typename T2 = T>
auto Mul(const T& vec, const T2& val)
{
return typename arith_result::type {
std::get<0>(vec),std::get<1>(vec) };
}

Somehow the compiler forgets to do constant folding while it is processing the
lambda.
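
For reference, a conditional expression whose three operands are constant expressions is itself usable wherever a constant expression is required, for example as a default template argument. The following is a reduced sketch with a hypothetical `smaller_of` template; it is not the reporter's test case and does not reproduce the bug.

```cpp
#include <cstddef>

// Hypothetical stand-in for the report's arith_result: N defaults to the
// smaller of A and B, computed with a conditional expression.
template<std::size_t A, std::size_t B, std::size_t N = (A < B ? A : B)>
struct smaller_of
{
    static constexpr std::size_t value = N;
};

static_assert(smaller_of<2, 3>::value == 2, "");
static_assert(smaller_of<5, 4>::value == 4, "");
```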

[Bug c++/94489] New: ICE: unexpected expression ‘std::min’ of kind overload

2020-04-04 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94489

Bug ID: 94489
   Summary: ICE: unexpected expression ‘std::min’ of kind overload
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

On GCC 9.3.0:

$ g++-9 test5.cc -std=c++2a -g -fconcepts
test5.cc: In instantiation of ‘auto Mul(const T&, const T2&) [with T =
std::array; T2 = std::array]’:
test5.cc:33:62:   required from here
test5.cc:28:117: internal compiler error: unexpected expression ‘std::min’ of
kind overload
   28 |  return typename arith_result::type {
std::plus(std::get(vec), std::get(val)) ... };
  |
^
0x7f78e300ce0a __libc_start_main
../csu/libc-start.c:308

On 10.0.1 20200324 (experimental) [master revision
596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536]:

$ g++-10 test5.cc -std=c++20 -g
test5.cc: In instantiation of ‘auto Mul(const T&, const T2&) [with T =
std::array; T2 = std::array]’:
test5.cc:33:62:   required from here
test5.cc:28:117: internal compiler error: unexpected expression ‘std::min’ of
kind overload
   28 |  return typename arith_result::type {
std::plus(std::get(vec), std::get(val)) ... };
  |
^
0x63d82b cxx_eval_constant_expression
../../src/gcc/cp/constexpr.c:6301
0x637ded cxx_eval_call_expression
../../src/gcc/cp/constexpr.c:2055
0x63ad65 cxx_eval_constant_expression
../../src/gcc/cp/constexpr.c:5483
0x63ccc2 cxx_eval_indirect_ref
../../src/gcc/cp/constexpr.c:4213
0x63ccc2 cxx_eval_constant_expression
../../src/gcc/cp/constexpr.c:5704
0x63dbe0 cxx_eval_outermost_constant_expr
../../src/gcc/cp/constexpr.c:6502
0x63e5ec cxx_constant_value(tree_node*, tree_node*)
../../src/gcc/cp/constexpr.c:6659
0x733823 expand_integer_pack
../../src/gcc/cp/pt.c:3751
0x733823 expand_builtin_pack_call
../../src/gcc/cp/pt.c:3790
0x733823 tsubst_pack_expansion(tree_node*, tree_node*, int, tree_node*)
../../src/gcc/cp/pt.c:12714
0x735561 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
../../src/gcc/cp/pt.c:13078
0x73a678 tsubst_argument_pack(tree_node*, tree_node*, int, tree_node*)
../../src/gcc/cp/pt.c:13040
0x735534 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
../../src/gcc/cp/pt.c:13090
0x7356e5 tsubst_aggr_type
../../src/gcc/cp/pt.c:13295
0x72c6f4 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool,
bool)
../../src/gcc/cp/pt.c:20100
0x7304e4 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool,
bool)
../../src/gcc/cp/pt.c:15745
0x7304e4 tsubst(tree_node*, tree_node*, int, tree_node*)
../../src/gcc/cp/pt.c:15745
0x735466 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
../../src/gcc/cp/pt.c:13092
0x7356e5 tsubst_aggr_type
../../src/gcc/cp/pt.c:13295
0x7307fe tsubst(tree_node*, tree_node*, int, tree_node*)
../../src/gcc/cp/pt.c:15633

Code is listed below.

#include 
#include 
#include 

template
concept IsTuple = requires(T t) { {std::get<0>(t) };} and
(std::tuple_size_v>-MinSz) <= (MaxSz-MinSz);

template,std::tuple_size_v),
 class seq = decltype(std::make_index_sequence{})>
struct arith_result
{
template
static auto t(std::index_sequence)
-> std::tuple,
std::tuple_element_t>...>;

template
static auto a(std::index_sequence)
-> std::array...,
std::tuple_element_t...>, dim>;

using type = std::conditional_t;
};

template, typename T2 = T>
auto Mul(const T& vec, const T2& val)
{
return [&](std::index_sequence) {
 return typename arith_result::type {
std::plus(std::get(vec), std::get(val)) ... };
}
(std::make_index_sequence>,

std::tuple_size_v>)>{});
}

auto x = Mul(std::array{}, std::array{});

[Bug c++/94128] ICE on C++20 "requires requires" with lambda

2020-03-11 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94128

--- Comment #2 from Joel Yliluoma  ---
Yes, it is valid.

— The auto parameter is valid since C++20. It is called a “placeholder type”,
which has existed since C++11. C++20 made it valid also in function parameters.

— The “requires” is a valid keyword since C++20. It specifies constraints that
the parameter must match. The double “requires” manifests in certain
situations.

— Until C++20, lambdas were not permitted in unevaluated contexts. Changed in
C++20.

[Bug c++/94128] New: ICE on C++20 "requires requires" with lambda

2020-03-10 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94128

Bug ID: 94128
   Summary: ICE on C++20 "requires requires" with lambda
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

For this code:

void test(auto param)
requires requires{ { [](auto p){return p;}(param) }; };

void test2() { test(1); }

On this compiler:

g++-10 (Debian 10-20200222-1) 10.0.1 20200222 (experimental) [master
revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779]

Compiling with this commandline:

g++-10 -v tmp.cc -std=c++20

We get:

Using built-in specs.
COLLECT_GCC=g++-10
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 10-20200222-1'
--with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs
--enable-languages=c,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr
--with-gcc-major-version-only --program-suffix=-10
--program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
--enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new
--enable-gnu-unique-object --disable-vtable-verify --enable-plugin
--enable-default-pie --with-system-zlib --with-target-system-zlib=auto
--enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic --enable-offload-targets=nvptx-none,amdgcn-amdhsa,hsa
--without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.1 20200222 (experimental) [master revision
01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779] (Debian
10-20200222-1) 
COLLECT_GCC_OPTIONS='-v' '-std=c++2a' '-shared-libgcc' '-mtune=generic'
'-march=x86-64'
 /usr/lib/gcc/x86_64-linux-gnu/10/cc1plus -quiet -v -imultiarch
x86_64-linux-gnu -D_GNU_SOURCE tmp.cc -quiet -dumpbase tmp.cc -mtune=generic
-march=x86-64 -auxbase tmp -std=c++2a -version -fasynchronous-unwind-tables -o
/tmp/cc8CWcEJ.s
GNU C++17 (Debian 10-20200222-1) version 10.0.1 20200222 (experimental) [master
revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779]
(x86_64-linux-gnu)
compiled by GNU C version 10.0.1 20200222 (experimental) [master
revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779], GMP
version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version isl-0.22-GMP

warning: GMP header version 6.2.0 differs from library version 6.1.2.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring duplicate directory "/usr/include/x86_64-linux-gnu/c++/10"
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory
"/usr/lib/gcc/x86_64-linux-gnu/10/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/10
 /usr/include/x86_64-linux-gnu/c++/10
 /usr/include/c++/10/backward
 /usr/lib/gcc/x86_64-linux-gnu/10/include
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/10/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C++17 (Debian 10-20200222-1) version 10.0.1 20200222 (experimental) [master
revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779]
(x86_64-linux-gnu)
compiled by GNU C version 10.0.1 20200222 (experimental) [master
revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779], GMP
version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version isl-0.22-GMP

warning: GMP header version 6.2.0 differs from library version 6.1.2.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: f533434f622c23e753fbd5b6135ebdd3
tmp.cc: In instantiation of ‘void test(auto:1) requires
requires{{()(test::param)};} [with auto:1 = int]’:
tmp.cc:3:22:   required from here
tmp.cc:2:26: internal compiler error: Segmentation fault
2 | requires requires{ { [](auto p){return p;}(param) }; };
  |  ^
0xc248ef crash_signal
../../src/gcc/toplev.c:328
0x7fb53dd2d0ff ???
   
/build/glibc-kSJANG/glibc-2.29/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x733be8 tsubst_template_args(tree_node*, tree_node*, int, tree_node*)
../../src/gcc/cp/pt.c:13090
0x738602 tsubst_fu

[Bug c++/92766] New: [Rejects valid] pointer+0 erroneously treated as rvalue

2019-12-03 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92766

Bug ID: 92766
   Summary: [Rejects valid] pointer+0 erroneously treated as
rvalue
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

template<typename T>
void foo(T&& begin, T&& end);

void test()
{
    unsigned char buffer[16];
    const unsigned char* ptr = buffer;
    foo(ptr+0, ptr+8);
}

On GCC 8.3, this compiles fine.
On GCC 9.1 and trunk, fails to compile (-std=c++11, c++14, c++17, c++2a):

<source>:8:21: error: no matching function for call to 'foo(const unsigned
char*&, const unsigned char*)'
    8 |     foo(ptr+0, ptr+8);
      |                     ^
<source>:2:6: note: candidate: 'template<class T> void foo(T&&, T&&)'
    2 | void foo(T&& begin, T&& end);
      |      ^~~
<source>:2:6: note:   template argument deduction/substitution failed:
<source>:8:21: note:   deduced conflicting types for parameter 'T' ('const
unsigned char*&' and 'const unsigned char*')
    8 |     foo(ptr+0, ptr+8);
      |                     ^
Compiler returned: 1
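
The expected deduction can be made visible with a small probe. `ptr+0` is a prvalue just like `ptr+8`, so a forwarding reference should deduce plain `const unsigned char*` for both, whereas the named variable `ptr` is an lvalue. This is a sketch with a hypothetical helper `bound_to_lvalue`; the comments state what the language rules call for, which is exactly what the affected GCC versions get wrong for `ptr+0`.

```cpp
#include <cassert>
#include <type_traits>

// Reports whether a forwarding reference saw an lvalue (T deduced as U&)
// or an rvalue (T deduced as plain U). Hypothetical helper for illustration.
template<typename T>
constexpr bool bound_to_lvalue(T&&)
{
    return std::is_lvalue_reference<T>::value;
}

inline void deduction_example()
{
    unsigned char buffer[16] = {};
    const unsigned char* ptr = buffer;
    assert(bound_to_lvalue(ptr));        // a named variable is an lvalue
    assert(!bound_to_lvalue(ptr + 0));   // pointer arithmetic yields a prvalue
    assert(!bound_to_lvalue(ptr + 8));
}
```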

[Bug c/91526] Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)

2019-08-23 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526

Joel Yliluoma  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bisqwit at iki dot fi

--- Comment #2 from Joel Yliluoma  ---
The theory that it is related to RVO seems to be confirmed by the fact that if
the code is changed like this:

   struct Vec { float v[8]; };
   void multiply(struct Vec* result,
 const struct Vec* __restrict__ v1,
 const struct Vec* __restrict__ v2)
   {
   for(unsigned i = 0; i < 8; ++i)
   result->v[i] = v1->v[i] * v2->v[i];
   }

Then it gets compiled in the shorter and proper form. Interestingly, even if
the __restrict__ attribute is removed, it still gets vectorized. Is this
correct behavior?
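
For contrast, here is a sketch of the two calling styles being compared. The by-value variant is only an assumption about the shape of the code in the original report (it is not quoted in this comment); the out-parameter variant mirrors the code above with `__restrict__` removed. The function names are placeholders.

```cpp
struct Vec { float v[8]; };

// Assumed shape of the by-value version: the result is built in the
// return slot provided by the caller (this is where RVO comes in).
inline Vec multiply_by_value(Vec v1, Vec v2)
{
    Vec result;
    for (unsigned i = 0; i < 8; ++i)
        result.v[i] = v1.v[i] * v2.v[i];
    return result;
}

// Out-parameter version without __restrict__: `result` is allowed to
// alias `v1` or `v2`, so the compiler has to account for overlap.
inline void multiply_into(Vec* result, const Vec* v1, const Vec* v2)
{
    for (unsigned i = 0; i < 8; ++i)
        result->v[i] = v1->v[i] * v2->v[i];
}
```

Both functions compute the same values. Without `__restrict__`, a compiler may still vectorize the loop as long as it guards the vector path with a runtime check that `result` does not partially overlap the inputs; that would be consistent with the behaviour observed above, though the comment does not confirm that this is what GCC emits.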

[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array

2019-08-05 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #24 from Joel Yliluoma  ---
The simple horizontal 8-bit add seems to work nicely. Very nice work.

However, the original bug report (that the code snippet quoted below no longer
receives love from the SIMD optimization unless you explicitly say “#pragma omp
simd”) still seems unaddressed.

#define num_words 2

typedef unsigned long long E;
E bytes[num_words];
unsigned char sum()
{
    E b[num_words] = {};
    //#pragma omp simd
    for(unsigned n=0; n<num_words; ++n)
    {
        E temp = bytes[n];
        temp += (temp >> 32);
        temp += (temp >> 16);
        temp += (temp >> 8);
        // Save that number in an array
        b[n] = temp;
    }
    // Calculate sum of those sums
    unsigned char result = 0;
    //#pragma omp simd
    for(unsigned n=0; n<num_words; ++n) result += b[n];
    return result;
}

https://godbolt.org/z/XL3cIK

[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array

2019-08-01 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #19 from Joel Yliluoma  ---
If the function return type is changed to "unsigned short", the AVX code with
"vpextrb" will do a spurious "movzx eax, al" at the end — but if the return
type is "unsigned int", it will not. The code with "(v)movd" should of course
do it, if the vector element size is shorter than the return type.

[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array

2019-08-01 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #18 from Joel Yliluoma  ---
Great, thanks. I can test this in a few days, but I would like to make sure
that the proper thing still happens if the vector is of bytes but the return
value of the function is a larger-than-byte integer type. Will it still
generate a movd in this case? Because that would be wrong. :-)
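
The concern can be illustrated without any SIMD at all. In the sketch below, a 32-bit word stands in for the low dword of the vector register: the byte-wide sum sits in its lowest lane and unrelated lane contents sit above it. Copying the whole dword (what a bare movd does) leaks the neighbouring lanes into a wider return type, whereas extracting and zero-extending the low byte (what pextrb, or movd followed by movzx, does) does not. The function names are illustrative only.

```cpp
#include <cstdint>

// Models `movd eax, xmm0`: all 32 low bits of the register are copied,
// including the three byte lanes above the one holding the sum.
inline std::uint32_t take_low_dword(std::uint32_t low_dword)
{
    return low_dword;
}

// Models `pextrb eax, xmm0, 0` (or movd followed by `movzx eax, al`):
// only the lowest byte lane survives, zero-extended to 32 bits.
inline std::uint32_t take_low_byte(std::uint32_t low_dword)
{
    return low_dword & 0xFFu;
}
```

When the function returns `unsigned char`, the two agree after truncation, so a plain movd suffices; for any wider return type only the second form gives a value in the range 0..255.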

[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array

2019-08-01 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #16 from Joel Yliluoma  ---
In reference to my previous comment, this is the code I tested with and the
compiler flags were -Ofast -mno-avx.

unsigned char bytes[128];
unsigned char sum (void)
{
  unsigned char r = 0;
  const unsigned char *p = (const unsigned char *) bytes;
  int n;
  for (n = 0; n < sizeof (bytes); ++n)
r += p[n];
  return r;
}

[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array

2019-08-01 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #15 from Joel Yliluoma  ---
Seems to work neatly now.
Any reason why, on vector size 128 without AVX, it does the low byte move
through the red zone? Are pextrb or movd instructions not available? Or does
the ABI specify that the upper bits of the eax register must be zero?

movaps  XMMWORD PTR [rsp-40], xmm2
movzx   eax, BYTE PTR [rsp-40]

Clang does just a simple movd here.

movd    eax, xmm1

[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array

2019-07-29 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #6 from Joel Yliluoma  ---
Maybe "horizontal checksum" is a bit of an obscure term. An 8-bit checksum is
what is being accomplished, nonetheless. Yes, there are simpler ways to do it…

But I tried a number of different approaches in order to get maximum-performance
SIMD code out of GCC, and I came upon this curious case, which is what I posted
this bug report about.

To another compiler, I reported a related bug concerning code that looks like
this:

unsigned char calculate_checksum(const void* ptr)
{
unsigned char bytes[16], result = 0;
memcpy(bytes, ptr, 16);
// The reason the memcpy is there in place is because to
// my knowledge, it is the only _safe_ way permitted by
// the standard to do conversions between representations.
// Union, pointer casting, etc. are not safe.
for(unsigned n=0; n<16; ++n) result += bytes[n];
return result;
}

After my report, their compiler now generates:

vmovdqu xmm0, xmmword ptr [rdi]
vpshufd xmm1, xmm0, 78 # xmm1 = xmm0[2,3,0,1]
vpaddb xmm0, xmm0, xmm1
vpxor xmm1, xmm1, xmm1
vpsadbw xmm0, xmm0, xmm1
vpextrb eax, xmm0, 0
ret

This is what GCC generates for the same code.

vmovdqu xmm0, XMMWORD PTR [rdi]
vpsrldq xmm1, xmm0, 8
vpaddb  xmm0, xmm0, xmm1
vpsrldq xmm1, xmm0, 4
vpaddb  xmm0, xmm0, xmm1
vpsrldq xmm1, xmm0, 2
vpaddb  xmm0, xmm0, xmm1
vpsrldq xmm1, xmm0, 1
vpaddb  xmm0, xmm0, xmm1
vpextrb eax, xmm0, 0
ret

So the bottom line is, (v)psadbw reductions should be added as M. Glisse
correctly indicated.
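
As a rough illustration of what a psadbw-based reduction looks like at the source level, here is a sketch using SSE2 intrinsics. It is one possible formulation, not the code either compiler generates: it applies psadbw first and then adds the two halves, whereas the Clang listing above adds the halves bytewise before psadbw. The name `checksum16` is hypothetical, and the scalar fallback exists only so the sketch also builds on targets without SSE2.

```cpp
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// 8-bit checksum of 16 bytes starting at ptr.
inline unsigned char checksum16(const void* ptr)
{
#if defined(__SSE2__)
    // Unaligned 16-byte load (movdqu).
    __m128i v = _mm_loadu_si128(static_cast<const __m128i*>(ptr));
    // psadbw against zero: each 64-bit half receives the sum of its 8 bytes.
    __m128i sums = _mm_sad_epu8(v, _mm_setzero_si128());
    // Bring the upper half's sum down and add it to the lower half's sum.
    __m128i total = _mm_add_epi64(sums, _mm_srli_si128(sums, 8));
    // The total is at most 16*255; truncating to 8 bits gives the checksum.
    return static_cast<unsigned char>(_mm_cvtsi128_si32(total));
#else
    // Portable fallback, same as the loop quoted above.
    unsigned char bytes[16], result = 0;
    std::memcpy(bytes, ptr, 16);
    for (unsigned n = 0; n < 16; ++n) result += bytes[n];
    return result;
#endif
}
```

Because psadbw accumulates into 16-bit-or-wider lanes, no carries are lost before the final truncation, so the result matches the plain byte loop.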

[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array

2019-07-19 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #3 from Joel Yliluoma  ---
For the record, for this particular case (8-bit checksum of an array, 16 bytes
in this case) there exists even more optimal SIMD code, which ICC (version 18
or greater) generates automatically.

vmovups   xmm0, XMMWORD PTR bytes[rip]  #5.9
vpxor     xmm2, xmm2, xmm2  #4.41
vpaddb    xmm0, xmm2, xmm0  #4.41
vpsrldq   xmm1, xmm0, 8 #4.41
vpaddb    xmm3, xmm0, xmm1  #4.41
vpsadbw   xmm4, xmm2, xmm3  #4.41
vmovd     eax, xmm4 #4.41
movsx     rax, al   #4.41
ret #7.16

[Bug tree-optimization/91201] New: [7~9 Regression] SIMD not generated for horizontal sum of bytes in array

2019-07-18 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

Bug ID: 91201
   Summary: [7~9 Regression] SIMD not generated for horizontal sum
of bytes in array
   Product: gcc
   Version: 9.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

For this code —

typedef unsigned long long E;
const unsigned D = 2;
E bytes[D];
unsigned char sum()
{
    E b[D]{};
    //#pragma omp simd
    for(unsigned n=0; n<D; ++n)
    {
        E temp = bytes[n];
        temp += (temp >> 32);
        temp += (temp >> 16);
        temp += (temp >> 8);
        b[n] = temp;
    }
    E result = 0;
    //#pragma omp simd
    for(unsigned n=0; n<D; ++n) result += b[n];
    return result;
}

https://godbolt.org/z/azkXiL

[Bug rtl-optimization/88770] New: Redundant load opt. or CSE pessimizes code

2019-01-09 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88770

Bug ID: 88770
   Summary: Redundant load opt. or CSE pessimizes code
   Product: gcc
   Version: 8.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

For this code (-xc -std=c99 or -xc++ -std=c++17):

struct guu { int a; int b; float c; char d; };

extern void test(struct guu);

void caller()
{
test( (struct guu){.a = 3, .b = 5, .c = 7, .d = 9} );
test( (struct guu){.a = 3, .b = 5, .c = 7, .d = 9} );
}

CSE (or some other form of redundant loads optimization) pessimizes the code.
Problem occurs on optimization levels -O1 and higher, including -Os.

If the function "caller" calls test() just once, the resulting code is (-O3
-fno-optimize-sibling-calls, stack alignment/push/pops omitted for brevity):

movabs  rdi, 21474836483
movabs  rsi, 39743127552
call    test

If "caller" calls test() twice, the code is a lot longer and not just twice as
long. (Stack alignment/push/pops omitted for brevity):

movabs  rbp, 21474836483
mov rdi, rbp
movabs  rbx, 38654705664
mov rsi, rbx
or  rbx, 1088421888
or  rsi, 1088421888
call    test
mov rsi, rbx
mov rdi, rbp
call    test

If we change caller() such that the parameters in the two calls are not
identical:

void caller()
{
test( (struct guu){.a = 3, .b = 5, .c = 7, .d = 9} );
test( (struct guu){.a = 3, .b = 6, .c = 7, .d = 10} );
}

The generated code is optimal again as expected:

movabs  rdi, 21474836483
movabs  rsi, 39743127552
call    test
movabs  rdi, 25769803779
movabs  rsi, 44038094848
call    test

The problem in the first examples is that the compiler sees that the same
parameter is used twice, and it tries to save it in a callee-saved register in
order to reuse the same values on the second call. However, re-initializing the
registers from scratch would have been more efficient.
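
For readers wondering where the movabs immediates come from: on x86-64 the 16-byte struct is passed in two integer registers, and each immediate is simply two 32-bit fields packed into one 64-bit word. The sketch below reproduces the constants from the listings above. It assumes a little-endian layout, a 32-bit IEEE-754 `float`, and zeroed padding after `d`; the helper names are made up.

```cpp
#include <cstdint>
#include <cstring>

// First eightbyte of `struct guu`: a in the low half, b in the high half.
inline std::uint64_t pack_ints(std::int32_t a, std::int32_t b)
{
    return static_cast<std::uint64_t>(static_cast<std::uint32_t>(a))
         | (static_cast<std::uint64_t>(static_cast<std::uint32_t>(b)) << 32);
}

// Second eightbyte: the bit pattern of c in the low half, d in the next
// byte, and the three padding bytes assumed to be zero.
inline std::uint64_t pack_float_char(float c, char d)
{
    static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");
    std::uint32_t c_bits;
    std::memcpy(&c_bits, &c, sizeof c_bits); // well-defined type punning
    return static_cast<std::uint64_t>(c_bits)
         | (static_cast<std::uint64_t>(static_cast<unsigned char>(d)) << 32);
}
```

With these, `pack_ints(3, 5)` is 21474836483 and `pack_float_char(7.0f, 9)` is 39743127552, the two immediates of the single-call version. In the two-call version the second value is built in two steps: 38654705664 is `9 << 32`, and the `or` with 1088421888 adds the bit pattern of 7.0f.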

The problem occurs on GCC versions 4.8.1 and newer. It does not occur in GCC
version 4.7.4, which generated different code that is otherwise inefficient.

For reference, the problem also exists in Clang versions 3.5 and newer, but not
in versions 3.4 and earlier.

[Bug tree-optimization/63259] Detecting byteswap sequence

2018-11-19 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63259

Joel Yliluoma  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #23 from Joel Yliluoma  ---
It would seem to be fixed since GCC 5.

[Bug c++/84556] New: C++17, lambda, OpenMP simd: sorry, unimplemented: unexpected AST

2018-02-25 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84556

Bug ID: 84556
   Summary: C++17, lambda, OpenMP simd: sorry, unimplemented:
unexpected AST
   Product: gcc
   Version: 7.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

This code generates an AST error when compiled with -std=c++17 -fopenmp.

void foo()
{
auto keymaker = [](void)
{
#pragma omp simd
for(unsigned pos = 0; pos < 4; ++pos)
{
}
};
}

test.cc: In lambda function:
test.cc:9:5: sorry, unimplemented: unexpected AST of kind omp_simd
 };
 ^
test.cc:9: confused by earlier errors, bailing out

Compiling without -fopenmp, or using an earlier standard mode such as
-std=c++14 or -std=c++11, the error is not produced.

Tested on: g++-7 (Debian 7.2.0-19) 7.2.0
Tested on: g++-7 (Debian 7.2.0-18) 7.2.0
Tested on: g++-7.1 (GCC) 7.1.0

Problem does NOT occur on:
g++-6 (Debian 6.4.0-11) 6.4.0 20171206

Problem does NOT occur with #pragma omp parallel for.

[Bug lto/71536] New: lto1 ICE: func-static constant in openmp offloaded function

2016-06-14 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71536

Bug ID: 71536
   Summary: lto1 ICE: func-static constant in openmp offloaded
function
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

With this example program:

#pragma omp declare target
void Process()
{
static const int value = 12;
}
#pragma omp end declare target

int main()
{
#pragma omp target
{
Process();
}
}

g++ crash.cc -fopenmp -O1

lto1: internal compiler error: Segmentation fault
0x93bcaf crash_signal
../../gcc/toplev.c:333
0x82d2eb input_offload_tables(bool)
../../gcc/lto-cgraph.c:1931
0x5c6590 read_cgraph_and_symbols
../../gcc/lto/lto.c:2858
0x5c6590 lto_main()
../../gcc/lto/lto.c:3304
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
mkoffload-intelmic: fatal error:
x86_64-pc-linux-gnu-accel-x86_64-intelmicemul-linux-gnu-gcc returned 1 exit
status
compilation terminated.
lto-wrapper: fatal error:
/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0//accel/x86_64-intelmicemul-linux-gnu/mkoffload
returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

The main() can be deleted from the program, and the error still occurs.
This error requires at least -O1 to trigger it.

GCC version: 6.1.0

[Bug lto/71535] New: ICE in LTO1 with -fopenmp offloading

2016-06-14 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71535

Bug ID: 71535
   Summary: ICE in LTO1 with -fopenmp offloading
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

With this example program:

#include <algorithm>

void Process(unsigned* Target)
{
for(int s=0; s<4; ++s)
Target[s] = 100u * std::min(255, std::max(0, 0))
 +  200u * std::min(255, std::max(0, 0));
}

int main()
{
#pragma omp target teams distribute parallel for
for(unsigned y=0; y<16; ++y)
{
unsigned Line[16];
Process(Line);
}
}

g++ tmps.cc -fopenmp

lto1: internal compiler error: in input_overwrite_node, at lto-cgraph.c:1203
0x82f4d5 input_overwrite_node
../../gcc/lto-cgraph.c:1201
0x82f4d5 input_node
../../gcc/lto-cgraph.c:1296
0x82f4d5 input_cgraph_1
../../gcc/lto-cgraph.c:1546
0x82f4d5 input_symtab()
../../gcc/lto-cgraph.c:1849
0x5c657b read_cgraph_and_symbols
../../gcc/lto/lto.c:2856
0x5c657b lto_main()
../../gcc/lto/lto.c:3304
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
mkoffload-intelmic: fatal error:
x86_64-pc-linux-gnu-accel-x86_64-intelmicemul-linux-gnu-gcc returned 1 exit
status
compilation terminated.
lto-wrapper: fatal error:
/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0//accel/x86_64-intelmicemul-linux-gnu/mkoffload
returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

GCC version: 6.1.0
This bug is very likely related to PR71499.

[Bug lto/71499] ICE in LTO1 when attempting NVPTX offloading (-fopenacc)

2016-06-10 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71499

--- Comment #1 from Joel Yliluoma  ---
Addendum: While this works, reading LTO data and producing HOST code:

/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto1 -dumpbase tmpe.o -auxbase
tmpe -version -fopenacc tmpe.o -o /tmp/ccZgHvRO.s 

This does not:

/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/accel/nvptx-none/lto1
-dumpbase tmpe.o -auxbase tmpe -version -fopenacc tmpe.o -o /tmp/ccZgHvRO.s
GNU GIMPLE (GCC) version 6.1.0 (nvptx-none)
compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4,
MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU GIMPLE (GCC) version 6.1.0 (nvptx-none)
compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4,
MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
options passed:  -fopenacc tmpe.o
options enabled:  -faggressive-loop-optimizations -fauto-inc-dec
 -fchkp-check-incomplete-type -fchkp-check-read -fchkp-check-write
 -fchkp-instrument-calls -fchkp-narrow-bounds -fchkp-optimize
 -fchkp-store-bounds -fchkp-use-static-bounds
 -fchkp-use-static-const-bounds -fchkp-use-wrappers -fcommon
 -fdelete-null-pointer-checks -fearly-inlining
 -feliminate-unused-debug-types -ffunction-cse -fgcse-lm -fgnu-runtime
 -fgnu-unique -fident -finline-atomics -fipa-pta -fira-hoist-pressure
 -fira-share-save-slots -fira-share-spill-slots -fivopts
 -fkeep-static-consts -fleading-underscore -flifetime-dse
 -flto-odr-type-merging -fmath-errno -fmerge-debug-strings -fpeephole -fplt
 -fprefetch-loop-arrays -freg-struct-return -fsched-critical-path-heuristic
 -fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock
 -fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec
 -fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fschedule-fusion
 -fsemantic-interposition -fshow-column -fsigned-zeros
 -fsplit-ivs-in-unroller -fssa-backprop -fstdarg-opt
 -fstrict-volatile-bitfields -fsync-libcalls -ftoplevel-reorder
 -ftrapping-math -ftree-cselim -ftree-forwprop -ftree-loop-if-convert
 -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize
 -ftree-parallelize-loops= -ftree-phiprop -ftree-reassoc -ftree-scev-cprop
 -funit-at-a-time -fvar-tracking-assignments -fzero-initialized-in-bss -m64
Reading object files: tmpe.o {GC start 776k} 
Reading the callgraph
lto1: internal compiler error: in input_overwrite_node, at lto-cgraph.c:1203
0x7a73c5 input_overwrite_node
../../gcc/lto-cgraph.c:1201
0x7a73c5 input_node
../../gcc/lto-cgraph.c:1296
0x7a73c5 input_cgraph_1
../../gcc/lto-cgraph.c:1546
0x7a73c5 input_symtab()
../../gcc/lto-cgraph.c:1849
0x5537fb read_cgraph_and_symbols
../../gcc/lto/lto.c:2856
0x5537fb lto_main()
../../gcc/lto/lto.c:3304
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.

I also tried using different combinations of --enable-languages=c,c++,lto ,
--enable-lto , including neither, but none affected the problem. I also tried
using the svn version of gcc, but it also exhibited the same problem.
The nvptx-newlib revision is aadc8eb0ec43b7cd0dd2dfb484bae63c8b05ef24 and
nvptx-tools revision is c28050f60193b3b95a18866a96f03334e874e78f.

[Bug lto/71499] New: ICE in LTO1 when attempting NVPTX offloading (-fopenacc)

2016-06-10 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71499

Bug ID: 71499
   Summary: ICE in LTO1 when attempting NVPTX offloading
(-fopenacc)
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Summary: Error message:

lto1: internal compiler error: in input_overwrite_node, at
lto-cgraph.c:1203
On GCC 6.1.0

Compiling this code:

void test()
{
}
int main()
{
  #pragma acc parallel
  test();
}

With this commandline:

gcc tmpe.c  -O0 -fopenacc -v

Complete output of GCC:

Using built-in specs.
COLLECT_GCC=/usr/local/bin/gcc
   
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
Target: x86_64-pc-linux-gnu
Configured with: ../configure --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--enable-offload-targets=nvptx-none=/usr/local/nvptx-none
--enable-languages=c,c++ --with-cuda-driver=/usr --disable-bootstrap
Thread model: posix
gcc version 6.1.0 (GCC) 
COLLECT_GCC_OPTIONS='-O0' '-fopenacc' '-v' '-mtune=generic' '-march=x86-64'
'-pthread'
 /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/cc1 -quiet -v -imultiarch
x86_64-linux-gnu -D_REENTRANT tmpe.c -quiet -dumpbase tmpe.c -mtune=generic
-march=x86-64 -auxbase tmpe -O0 -version -fopenacc -o /tmp/ccPHnCW0.s
GNU C11 (GCC) version 6.1.0 (x86_64-pc-linux-gnu)
compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4,
MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu"
ignoring nonexistent directory
"/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../x86_64-pc-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include
 /usr/local/include
 /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include-fixed
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
GNU C11 (GCC) version 6.1.0 (x86_64-pc-linux-gnu)
compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4,
MPC version 1.0.3, isl version 0.15
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 1c46fde4e47f1157bf1461541c266a3c
COLLECT_GCC_OPTIONS='-O0' '-fopenacc' '-v' '-mtune=generic' '-march=x86-64'
'-pthread'
 as -v --64 -o /tmp/ccd60m4w.o /tmp/ccPHnCW0.s
GNU assembler version 2.26 (x86_64-linux-gnu) using BFD version (GNU
Binutils for Debian) 2.26
   
COMPILER_PATH=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/libexec/gcc/x86_64-pc-linux-gnu/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/
   
LIBRARY_PATH=/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../lib64/:/lib/x86_64-linux-gnu/:/lib/../lib64/:/usr/lib/x86_64-linux-gnu/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../:/lib/:/usr/lib/
Reading specs from
/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../lib64/libgomp.spec
COLLECT_GCC_OPTIONS='-O0' '-fopenacc' '-v' '-mtune=generic' '-march=x86-64'
'-pthread'
 /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/collect2 -plugin
/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/liblto_plugin.so
-plugin-opt=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto-wrapper
-plugin-opt=-fresolution=/tmp/ccPr7Oc3.res -plugin-opt=-pass-through=-lgcc
-plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lpthread
-plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc
-plugin-opt=-pass-through=-lgcc_s --eh-frame-hdr -m elf_x86_64 -dynamic-linker
/lib64/ld-linux-x86-64.so.2 /usr/lib/x86_64-linux-gnu/crt1.o
/usr/lib/x86_64-linux-gnu/crti.o
/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtbegin.o
/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtoffloadbegin.o
-L/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0
-L/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../lib64
-L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu
-L/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../.. /tmp/ccd60m4w.o -lgomp
-lgcc --as-needed -lgcc_s --no-as-needed -lpthread -lc -lgcc --as-needed
-lgcc_s --no-as-needed /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtend.o
/usr/lib/x86_64-linux-gnu/crtn.o
/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtoffloadend.o
   
/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0//accel/nvptx-none/mkoffload
@/tmp/ccDURU7y
/usr/local/bin/x86_64-pc-linux-gnu-accel-nvp

[Bug libstdc++/70411] Stack overflow with std::regex_match

2016-03-25 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70411

--- Comment #1 from Joel Yliluoma  ---
Minimal regex that causes the same crash: "^0+ .*"

[Bug libstdc++/70411] New: Stack overflow with std::regex_match

2016-03-25 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70411

Bug ID: 70411
   Summary: Stack overflow with std::regex_match
   Product: gcc
   Version: 5.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

When running this code, libstdc++ crashes with a stack overflow (segmentation
fault) in std::regex_match. This regular expression is not the type that should
require exponential backtracking.

The crash occurs in code compiled by GCC 5.3.1 on x86_64-linux-gnu. Clang++
produces the same crash when using libstdc++ from GCC.

Code compiled by GCC 4.9 does _not_ produce a crash, as it evidently uses a
different version of libstdc++.

#include <regex>
#include <string>

std::string make_test_string()
{
std::string result = " 16777216 1 ";
for(unsigned n=0; n<1; ++n) result += "EA   NOP%";
return result;
}
std::regex testregex("^([0-9A-F]+) +([0-9]+) +([0-9]+) (.*)$");

int main()
{
std::string teststr = make_test_string();

std::smatch res;
std::regex_match(teststr, res, testregex);
}

[Bug c++/67838] New: Rejects-valid-code: templated lambda variable.

2015-10-04 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67838

Bug ID: 67838
   Summary: Rejects-valid-code: templated lambda variable.
   Product: gcc
   Version: 5.2.1
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

GCC 5.2 fails to compile the code below, erroneously citing "error: use of
'TestFunc' before deduction of 'auto'".
Compiles fine in Clang 3.5.

#include <cstdio>

template
static auto TestFunc = [](int param1)
{
return param1;
};

template<typename Func>
static void test(Func func)
{
printf("%d\n", func(12345));
}

int main()
{
test(TestFunc);
test(TestFunc);
}


[Bug c++/67838] Rejects-valid-code: templated lambda variable.

2015-10-04 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67838

--- Comment #1 from Joel Yliluoma  ---
Note that the use of "template" here is to declare a parametric variable. It is
not for the function's parameter list. It works the same way as in this
expression:

template<int LMode>
int v = 3*LMode;
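
A small self-contained sketch of the C++14 feature in question may help. The first declaration is a plain variable template in the spirit of the expression above; the second uses a lambda as the initializer, which is the combination the report is about. `AddOffset`, `Offset` and `call_with_12345` are placeholder names, and the `int` parameter type is an assumption.

```cpp
// Variable template: one distinct variable per value of LMode.
template<int LMode>
int v = 3 * LMode;

// The same mechanism with a lambda initializer: every Offset gets its
// own closure object, and `auto` is deduced per instantiation.
template<int Offset>
auto AddOffset = [](int param1) { return param1 + Offset; };

// Takes any callable and invokes it, like test() in the report.
template<typename Func>
int call_with_12345(Func func)
{
    return func(12345);
}
```

Here `v<2>` is 6, and `call_with_12345(AddOffset<5>)` yields 12350. Each instantiation of the variable template is referenced only after its `auto` type has been deduced from the lambda.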


[Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd

2015-09-17 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

Bug ID: 67609
   Summary: [Regression] Generates wrong code for SSE2 _mm_load_pd
   Product: gcc
   Version: 5.2.1
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: regression
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

For this program (needs -msse2 to compile).

#include <emmintrin.h>
__m128d reg;
void set_lower(double b)
{
double v[2];
_mm_store_pd(v, reg);
v[0] = b;
reg = _mm_load_pd(v);
}

On optimization levels -O1 and up, GCC 5.2 incorrectly generates code that
destroys the upper half of reg.
movapd  %xmm0, %xmm1
movaps  %xmm1, reg(%rip)

On -O0, the bug does not occur.
If the index expression is changed into an expression whose value is not known
at compile-time, the code will work properly.

GCC 4.9 does this correctly (if with a bit too much labor):

movdqa  reg(%rip), %xmm1
movaps  %xmm1, -24(%rsp)
movsd   %xmm0, -24(%rsp)
movapd  -24(%rsp), %xmm2
movaps  %xmm2, reg(%rip)

For comparison, Clang 3.4 and 3.5:
movlpd  %xmm0, reg(%rip)

For comparison, Clang 3.6:
movaps  reg(%rip), %xmm1
movsd   %xmm0, %xmm1
movaps  %xmm1, reg(%rip)


[Bug regression/67609] [Regression] Generates wrong code for SSE2 _mm_load_pd

2015-09-17 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

--- Comment #1 from Joel Yliluoma  ---
For the record, changing _mm_load_pd(v) into _mm_set_pd(v[1],v[0]) will coax
GCC into generating correct code. The bug is related to _mm_load_pd().
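
For reference, the intended semantics (replace element 0, keep element 1) can also be written without the round trip through memory. The sketch below shows that form next to the `_mm_set_pd` workaround described above. It assumes an SSE2-capable x86 target, and the function names are hypothetical.

```cpp
#if defined(__SSE2__)
#include <emmintrin.h>

// Register-only form: _mm_set_sd(b) builds {b, 0.0}, and _mm_move_sd
// takes element 0 from its second operand and element 1 from its first.
inline __m128d with_lower(__m128d reg, double b)
{
    return _mm_move_sd(reg, _mm_set_sd(b));
}

// The workaround: store, patch element 0, rebuild with _mm_set_pd.
// Note that _mm_set_pd takes the high element first.
inline __m128d with_lower_via_set(__m128d reg, double b)
{
    alignas(16) double v[2];
    _mm_store_pd(v, reg);
    v[0] = b;
    return _mm_set_pd(v[1], v[0]);
}
#endif
```

Both functions must leave the upper element of `reg` untouched, which is exactly the property the miscompiled `set_lower` loses. The first form corresponds to the single movsd in the Clang 3.6 listing.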


[Bug regression/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd

2015-09-17 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609

Joel Yliluoma  changed:

   What|Removed |Added

  Component|target  |regression

--- Comment #6 from Joel Yliluoma  ---
And also for _mm_load_ps in a similar situation. I did manage to get some error
to occur with floats too, but I have yet to isolate the problem.


[Bug rtl-optimization/67577] New: Trivial float-vectorization foiled by a loop

2015-09-14 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67577

Bug ID: 67577
   Summary: Trivial float-vectorization foiled by a loop
   Product: gcc
   Version: 5.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

This code is written as if tailored to be SIMD-optimized by GCC...
But GCC somehow blows it.

template<typename T, unsigned N>
struct vec
{
T d[N];

vec<T,N> operator* (const T& b)
{
vec<T,N> result;
for(unsigned n=0u; n<N; ++n) result.d[n] = d[n] * b;
return result;
}
vec<T,N> operator+ (const vec<T,N>& b)
{
vec<T,N> result;
for(unsigned n=0u; n<N; ++n) result.d[n] = d[n] + b.d[n];
return result;
}
vec<T,N> operator- (const vec<T,N>& b)
{
vec<T,N> result;
for(unsigned n=0u; n<N; ++n) result.d[n] = d[n] - b.d[n];
return result;
}
};


float scale;
vec<float,8> a, b, c;

void x()
{
for(int n=0; n<1; ++n)
{
vec<float,8> result = b + (a - b) * scale;
c = result;
}
}

Generated code (inner loop):

movss   b+4(%rip), %xmm6
movss   a+4(%rip), %xmm7
subss   %xmm6, %xmm7
movss   scale(%rip), %xmm0
movss   b+8(%rip), %xmm5
movss   b+12(%rip), %xmm4
movss   b+16(%rip), %xmm3
mulss   %xmm0, %xmm7
movss   b+20(%rip), %xmm1
movss   b+24(%rip), %xmm2
movss   b+28(%rip), %xmm9
movss   b(%rip), %xmm8
addss   %xmm6, %xmm7
movss   a+8(%rip), %xmm6
subss   %xmm5, %xmm6
movss   %xmm7, c+4(%rip)
mulss   %xmm0, %xmm6
addss   %xmm5, %xmm6
movss   a+12(%rip), %xmm5
subss   %xmm4, %xmm5
movss   %xmm6, c+8(%rip)
mulss   %xmm0, %xmm5
addss   %xmm4, %xmm5
movss   a+16(%rip), %xmm4
subss   %xmm3, %xmm4
movss   %xmm5, c+12(%rip)
mulss   %xmm0, %xmm4
addss   %xmm3, %xmm4
movss   a+20(%rip), %xmm3
subss   %xmm1, %xmm3
movss   %xmm4, c+16(%rip)
mulss   %xmm0, %xmm3
addss   %xmm1, %xmm3
movss   a+24(%rip), %xmm1
subss   %xmm2, %xmm1
movss   %xmm3, c+20(%rip)
mulss   %xmm0, %xmm1
addss   %xmm2, %xmm1
movss   a+28(%rip), %xmm2
subss   %xmm9, %xmm2
movss   %xmm1, c+24(%rip)
mulss   %xmm0, %xmm2
addss   %xmm9, %xmm2
movss   a(%rip), %xmm9
subss   %xmm8, %xmm9
movss   %xmm2, c+28(%rip)
mulss   %xmm9, %xmm0
addss   %xmm8, %xmm0
movss   %xmm0, c(%rip)

Platform: amd64; GCC version 5.2.1.

If I comment out the dummy for-loop, or I change the float "scale" variable
into a function parameter, the inner loop changes into much simpler code that
vectorizes as I intended:

movaps  b(%rip), %xmm3
movaps  b+16(%rip), %xmm1
movaps  a+16(%rip), %xmm0
movaps  a(%rip), %xmm2
subps   %xmm1, %xmm0
movss   scale(%rip), %xmm4
subps   %xmm3, %xmm2
shufps  $0, %xmm4, %xmm4
mulps   %xmm4, %xmm0
mulps   %xmm4, %xmm2
addps   %xmm1, %xmm0
addps   %xmm3, %xmm2
movaps  %xmm0, -24(%rsp)
movq    -16(%rsp), %rax
movaps  %xmm2, -40(%rsp)
movq    %xmm2, c(%rip)
movq    %xmm0, c+16(%rip)
movq    -32(%rsp), %rdx
movq    %rax, c+24(%rip)
movq    %rdx, c+8(%rip)

Although there's still some glitch in the generated code causing dummy memory
transfers, at least it now did the calculations using packed registers.

If I change the global "scale" variable into a function parameter, the
following shorter code is generated instead (essentially the same as what Clang
successfully produces for all three cases).

movaps  b+16(%rip), %xmm2
shufps  $0, %xmm0, %xmm0
movaps  a+16(%rip), %xmm1
subps   %xmm2, %xmm1
movaps  b(%rip), %xmm3
mulps   %xmm0, %xmm1
addps   %xmm2, %xmm1
movaps  a(%rip), %xmm2
subps   %xmm3, %xmm2
movaps  %xmm1, c+16(%rip)
mulps   %xmm2, %xmm0
addps   %xmm3, %xmm0
movaps  %xmm0, c(%rip)

Something causes GCC's tree-vectorization to be really rickety and easily
foiled by trivial changes in code, and I'd like to see it fixed at least in
these particular cases.


[Bug rtl-optimization/67577] Trivial float-vectorization foiled by a loop

2015-09-14 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67577

--- Comment #1 from Joel Yliluoma  ---
It may also be worth mentioning that adding an explicit '#pragma omp simd'
before each of those loops, inside the operator functions, will make sure that
GCC at least does the mathematics using packed registers. The memory store
apparently cannot be forced to occur without redundant temporaries, though.
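
A sketch of what that looks like in the class from the report. The template parameter list `<typename T, unsigned N>` and the `const` qualifiers are assumptions added so the sketch compiles on its own, and `blend` is a hypothetical wrapper around the expression from x(). The pragma only takes effect with GCC's -fopenmp or -fopenmp-simd; otherwise it is ignored and the code still computes the same values.

```cpp
template<typename T, unsigned N>
struct vec
{
    T d[N];

    vec operator*(const T& b) const
    {
        vec result;
        #pragma omp simd
        for (unsigned n = 0u; n < N; ++n) result.d[n] = d[n] * b;
        return result;
    }
    vec operator+(const vec& b) const
    {
        vec result;
        #pragma omp simd
        for (unsigned n = 0u; n < N; ++n) result.d[n] = d[n] + b.d[n];
        return result;
    }
    vec operator-(const vec& b) const
    {
        vec result;
        #pragma omp simd
        for (unsigned n = 0u; n < N; ++n) result.d[n] = d[n] - b.d[n];
        return result;
    }
};

// Hypothetical wrapper: the expression from x(), with scale passed in.
inline vec<float, 8> blend(const vec<float, 8>& a, const vec<float, 8>& b,
                           float scale)
{
    return b + (a - b) * scale;
}
```

The pragma asserts to the compiler that the loop iterations are independent, so it does not have to prove that itself; it does not change what the loops compute.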


[Bug c++/67559] [C++] [regression] Passing non-trivially copyable objects through '...' doesn't generate warning or error

2015-09-13 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67559

--- Comment #2 from Joel Yliluoma  ---
But when compiling for earlier standard versions that explicitly label this as
undefined behavior, it should at least give a warning.


[Bug c++/67561] New: [C++14] ICE in tsubst_copy (nested auto lambdas may be involved)

2015-09-12 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67561

Bug ID: 67561
   Summary: [C++14] ICE in tsubst_copy (nested auto lambdas may be
involved)
   Product: gcc
   Version: 5.2.1
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

When compiling this program, 

struct Polygon {};

template
struct View {};

template
void RenderLight(PixType, Params&&... pack)
{
View tmp;
DrawView(pack..., tmp);
}

template
static void DrawPolygon(Plotter plot_pixel)
{
plot_pixel(1,1,1, [&](unsigned n) { return n; });
}

template
static void DrawView(PlotFunc&& GetPlotFunc,
 View& view)
{
DrawPolygon(GetPlotFunc(view, Polygon{}, false));
}

static auto LightmapRenderer = [](unsigned round)
{
return [round](auto& view, const Polygon& polygon, bool)
{
return [=,,](unsigned x,unsigned y, float z, auto&&
prop)
{
if(true) { if(round == 1) {} }
else { if(round > 1) {} }
};
};
};

void CalculateLightmap()
{
RenderLight( 0, LightmapRenderer(1) );
}

The following error is produced:

testv.cc: In instantiation of '<lambda(unsigned int)>::<lambda(auto:1&,
const Polygon&, bool)>::<lambda(unsigned int, unsigned int, float, auto:2&&)>
[with auto:2 = DrawPolygon(Plotter) [with Plotter = <lambda(unsigned
int)>::<lambda(auto:1&, const Polygon&, bool)> [with auto:1 =
View]::<lambda(unsigned int, unsigned int, float,
auto:2&&)>]::<lambda(unsigned int)>; auto:1 = View]':
testv.cc:16:15:   required from 'void DrawPolygon(Plotter) [with Plotter =
<lambda(unsigned int)>::<lambda(auto:1&, const Polygon&, bool)> [with auto:1 =
View]::<lambda(unsigned int, unsigned int, float, auto:2&&)>]'
testv.cc:23:16:   required from 'void DrawView(PlotFunc&&, View&) [with
View = View; PlotFunc = <lambda(unsigned int)>::<lambda(auto:1&, const
Polygon&, bool)>&]'
testv.cc:10:13:   required from 'void RenderLight(PixType, Params&& ...)
[with PixType = int; Params = {<lambda(unsigned int)>::<lambda(auto:1&, const
Polygon&, bool)>}]'
testv.cc:40:41:   required from here
testv.cc:29:5: internal compiler error: in tsubst_copy, at cp/pt.c:12997
 {
 ^
0x632bbe tsubst_copy
../../src/gcc/cp/pt.c:12997
0x623a08 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*,
bool, bool)
../../src/gcc/cp/pt.c:15740
0x6246c8 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*,
bool, bool)
../../src/gcc/cp/pt.c:14771
0x6242d1 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*,
bool, bool)
../../src/gcc/cp/pt.c:15522
0x62c002 tsubst_expr
../../src/gcc/cp/pt.c:14552
0x63492d tsubst_decl
../../src/gcc/cp/pt.c:11500
0x632a7a tsubst_copy
../../src/gcc/cp/pt.c:13127
0x623a08 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*,
bool, bool)
../../src/gcc/cp/pt.c:15740
0x624ae7 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*,
bool, bool)
../../src/gcc/cp/pt.c:14930
0x62c002 tsubst_expr
../../src/gcc/cp/pt.c:14552
0x62b3c5 tsubst_expr
../../src/gcc/cp/pt.c:14113
0x62bf4c tsubst_expr
../../src/gcc/cp/pt.c:14135
0x62b3e5 tsubst_expr
../../src/gcc/cp/pt.c:14115
0x62bf4c tsubst_expr
../../src/gcc/cp/pt.c:14135
0x62b8f4 tsubst_expr
../../src/gcc/cp/pt.c:13949
0x62bf4c tsubst_expr
../../src/gcc/cp/pt.c:14135
0x62acb7 instantiate_decl(tree_node*, int, bool)
../../src/gcc/cp/pt.c:20582
0x659f62 mark_used(tree_node*, int)
../../src/gcc/cp/decl2.c:5035
0x5f9800 build_over_call
../../src/gcc/cp/call.c:7501
0x5fbf31 build_op_call_1
../../src/gcc/cp/call.c:4345
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <file:///usr/share/doc/gcc-5/README.Bugs> for instructions.

Compiler version 5.2.1, on Debian x86_64. More information (g++ -v):

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 5.2.1-16'
--with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-5 --enable-shared --enable-linker-build-id
--libex

[Bug c++/67561] [C++14] ICE in tsubst_copy (nested auto lambdas may be involved)

2015-09-12 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67561

Joel Yliluoma  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #2 from Joel Yliluoma  ---
Looks like a duplicate of PR67411.

*** This bug has been marked as a duplicate of bug 67411 ***


[Bug c++/67559] New: [C++] [regression] Passing non-trivially copyable objects through '...' doesn't generate warning or error

2015-09-12 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67559

Bug ID: 67559
   Summary: [C++] [regression] Passing non-trivially copyable
objects through '...' doesn't generate warning or
error
   Product: gcc
   Version: 5.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

In GCC 4.9, this code generates an error. In GCC 5.2, it generates no warning
or error, even on -Wall -Wextra -pedantic.

struct test { test(){} ~test(){} };
void a(int, ...) {}
int main()
{
test object;
a(5, object);
}

Tried different standards modes: -std=c++98, -std=c++03, -std=c++11, -std=c++14

Tried also lambda functions with variadic args, same result.

The error message in GCC 4.9 (and earlier down to 4.6) was:

cannot pass objects of non-trivially-copyable type 'struct test' through
'...'

In GCC 5.2, no error or warning message is given in any of the standard modes.
In the standard version C++03, this behavior is undefined (§5.2.2/7). In C++11,
it is conditionally supported with implementation-defined semantics.
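
The property in question can be checked with a type trait, and a variadic template avoids the C-style '...' entirely. The following is a minimal sketch, not part of the report; the names a_checked and CountExtraArguments are hypothetical, and std::is_trivially_copyable is assumed to be available (C++11 library support).

```cpp
#include <type_traits>

struct test { test(){} ~test(){} };

// The user-provided destructor makes 'test' non-trivially-copyable,
// which is why passing it through '...' is problematic.
static_assert(!std::is_trivially_copyable<test>::value,
              "test is not trivially copyable");

// Hypothetical type-safe alternative to 'void a(int, ...)': the extra
// arguments are received by reference, so no copy through '...' occurs.
template<typename... Args>
int a_checked(int, const Args&...)
{
    return static_cast<int>(sizeof...(Args));
}

// Passes the same object as in the report and returns how many extra
// arguments were received.
int CountExtraArguments()
{
    test object;
    return a_checked(5, object);
}
```

This does not address the missing diagnostic itself; it only shows a form in which the question never arises.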

[Bug c++/67411] [5/6 Regression] internal compiler error: in tsubst_copy, at cp/pt.c:13473

2015-09-12 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67411

Joel Yliluoma  changed:

   What|Removed |Added

 CC||bisqwit at iki dot fi

--- Comment #2 from Joel Yliluoma  ---
*** Bug 67561 has been marked as a duplicate of this bug. ***


[Bug c++/67561] [C++14] ICE in tsubst_copy (nested auto lambdas may be involved)

2015-09-12 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67561

--- Comment #1 from Joel Yliluoma  ---
Further reduced example:

template<typename PlotFunc>
void DrawView(PlotFunc GetPlotFunc)
{
GetPlotFunc(1)(2);
}

void CalculateLightmap()
{
auto LightmapRenderer = [](unsigned round)
{
return [round](const auto& view)
{
return [=](auto prop)
{
round + 0;
};
};
};

DrawView(LightmapRenderer(0));
}

Replacing the [=] with [] or [&] retains the error. Replacing it with [round]
removes the error.


[Bug c++/67558] New: [C++] OpenMP "if" clause does not utilize compile-time constants

2015-09-12 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67558

Bug ID: 67558
   Summary: [C++] OpenMP "if" clause does not utilize compile-time
constants
   Product: gcc
   Version: 5.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Consider this example code.

unsigned x;

template<bool Threads>
void plain_if(unsigned y)
{
    if(Threads)
    {
        #pragma omp task firstprivate(y) shared(x)
        x = y >> 1;
    }
    else
    {
        x = y >> 1;
    }
}

template<bool Threads>
void omp_if(unsigned y)
{
    #pragma omp task if(Threads) firstprivate(y) shared(x)
    x = y >> 1;
}

void plain_if_false(unsigned y) { plain_if<false>(y); }
void plain_if_true(unsigned y) { plain_if<true>(y); }

void omp_if_false(unsigned y) { omp_if<false>(y); }
void omp_if_true(unsigned y) { omp_if<true>(y); }

plain_if and omp_if do essentially the same thing. In both of them, the
template parameter "Threads" controls whether to create an OpenMP task for the
action or not.

However, when the code is compiled, all functions explicitly call GOMP_task,
except plain_if<false>.
It is clear that GCC treats a plain if() differently than an OpenMP if(). This
is a missed optimization.
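
Until the if() clause is folded at compile time, the choice can be made by the template machinery instead. The following is a minimal sketch under assumptions, not code from the report: shifter, shift_in_task, shift_inline, run_with_task and run_without_task are hypothetical names. When compiled without -fopenmp the pragmas are ignored and both paths run inline.

```cpp
unsigned x;

// Hypothetical helper: the task-creating variant.
inline void shift_in_task(unsigned y)
{
    #pragma omp task firstprivate(y) shared(x)
    x = y >> 1;
}

// Hypothetical helper: the plain variant, with no OpenMP construct.
inline void shift_inline(unsigned y)
{
    x = y >> 1;
}

// Primary template handles Threads == true.
template<bool Threads>
struct shifter
{
    static void run(unsigned y) { shift_in_task(y); }
};

// The specialization for Threads == false never mentions the task
// construct, so no task-creation call can be emitted for it.
template<>
struct shifter<false>
{
    static void run(unsigned y) { shift_inline(y); }
};

unsigned run_without_task(unsigned y)
{
    shifter<false>::run(y);
    return x;
}

unsigned run_with_task(unsigned y)
{
    shifter<true>::run(y);
    #pragma omp taskwait   // make sure the task has finished before reading x
    return x;
}
```

This is the same idea as plain_if, only with the branch moved into the type system.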


[Bug c++/66644] Rejects C++11 in-class anonymous union members initialization

2015-06-23 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66644

--- Comment #1 from Joel Yliluoma bisqwit at iki dot fi ---
The last code piece should have test2{0,0}; there. Something ate a couple of
characters off the end of that line.


[Bug c++/66644] New: Rejects C++11 in-class anonymous union members initialization

2015-06-23 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66644

Bug ID: 66644
   Summary: Rejects C++11 in-class anonymous union members
initialization
   Product: gcc
   Version: 5.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Accepted by GCC:

struct test  
{   union  
{
struct { char a=0, b; };
char buffer[16];
};  };

NOT accepted by GCC (multiple fields in union 'test::anonymous union'
initialized):

struct test  
{   union  
{
struct { char a=0, b=0; };
char buffer[16];
};  };

Still accepted by GCC:

struct test
{   union
{
struct { char a, b; } test2{0,
char buffer[16];
};  };

I think there's a compiler bug here. It should not complain about initializing
multiple fields in a struct that is nested inside the union, because this does
not comprise a conflict.

Tested on GCC versions 4.7.4, 4.8.4, 4.9.2, and 5.1.1.


[Bug rtl-optimization/63259] New: Detecting byteswap sequence

2014-09-13 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63259

Bug ID: 63259
   Summary: Detecting byteswap sequence
   Product: gcc
   Version: 4.9.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi

This is just silly. GCC optimizes the first function into single opcode
(bswap), but not the other. For Clang, it's the other way around.

unsigned byteswap_gcc(unsigned result)
{
    result = ((result & 0xFFFF0000u) >> 16) | ((result & 0x0000FFFFu) << 16);
    result = ((result & 0xFF00FF00u) >>  8) | ((result & 0x00FF00FFu) <<  8);
    return result;
}
unsigned byteswap_clang(unsigned result)
{
    result = ((result & 0xFF00FF00u) >>  8) | ((result & 0x00FF00FFu) <<  8);
    result = ((result & 0xFFFF0000u) >> 16) | ((result & 0x0000FFFFu) << 16);
    return result;
}

unsigned byteswap(unsigned v)
{
#ifdef __clang__
 return byteswap_clang(v);
#else
 return byteswap_gcc(v);
#endif
}

GCC output:

byteswap_gcc:
movl    %edi, %eax
bswap   %eax
ret

byteswap_clang:
movl    %edi, %eax
andl    $-16711936, %eax
shrl    $8, %eax
movl    %eax, %edx
movl    %edi, %eax
andl    $16711935, %eax
sall    $8, %eax
orl     %edx, %eax
roll    $16, %eax
ret

byteswap:
movl    %edi, %eax
bswap   %eax
ret

Clang output:

byteswap_gcc:   # @byteswap_gcc
roll    $16, %edi
movl    %edi, %eax
shrl    $8, %eax
andl    $16711935, %eax     # imm = 0xFF00FF
shll    $8, %edi
andl    $-16711936, %edi    # imm = 0xFF00FF00
orl     %eax, %edi
movl    %edi, %eax
retq

byteswap_clang: # @byteswap_clang
bswapl  %edi
movl    %edi, %eax
retq

byteswap:   # @byteswap
bswapl  %edi
movl    %edi, %eax
retq


Tested both -m32 and -m64, with options: -Ofast -S
Tested versions:
- gcc (Debian 4.9.1-11) 4.9.1  Target: x86_64-linux-gnu
- Debian clang version 3.5.0-+rc1-2 (tags/RELEASE_35/rc1) (based on LLVM 3.5.0)
 Target: x86_64-pc-linux-gnu


[Bug c++/61323] New: 'static' and 'const' attributes cause non-type template argument matching failure

2014-05-26 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61323

Bug ID: 61323
   Summary: 'static' and 'const' attributes cause non-type
template argument matching failure
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bisqwit at iki dot fi

// Works:

char* table1[10];
template<unsigned size, char*(&table)[size]> void test1() { }
void tester1() { test1<10,table1>(); }

// Doesn't work:

static char* table2[10];
template<unsigned size, char*(&table)[size]> void test2() { }
void tester2() { test2<10,table2>(); } // error: 'table2' cannot appear in a
                                       // constant-expression

// Works:

const char* table3[10];
template<unsigned size, const char*(&table)[size]> void test3() { }
void tester3() { test3<10,table3>(); }

// Doesn't work:

const char* const table4[10] = {};
template<unsigned size, const char*const (&table)[size]> void test4() { }
void tester4() { test4<10,table4>(); } // error: 'table4' cannot appear in a
                                       // constant-expression

// Works:

const char* volatile table5[10] = {};
template<unsigned size, const char* volatile (&table)[size]> void test5() { }
void tester5() { test5<10,table5>(); }

// Doesn't work:

const char* const table6[10] = {};
template<unsigned size, const char*const (&table)[size]> void test6() { }
void tester6() { test6<10,table6>(); } // error: 'table6' cannot appear in a
                                       // constant-expression
--
Compiler versions tested:
  g++-4.4 (Debian 4.4.7-7) 4.4.7
  g++-4.5 (Debian 4.5.3-12) 4.5.3
  g++-4.6 (Debian 4.6.4-7) 4.6.4
  g++-4.7 (Debian 4.7.3-13) 4.7.3
  g++-4.8 (Debian 4.8.2-21) 4.8.2
  g++-4.9 (Debian 4.9.0-2) 4.9.0
Giving -std=c++11 did not make a difference.
--
Also tested: CLANG++ 3.5
CLANG++ gives diagnostic message: non-type template argument referring to
object 'table2' with internal linkage is a C++11 extension on all those cases
that GCC failed, when compiled without -std=c++11. Compiling with -std=c++11 or
even with -std=c++1y did not work on GCC.


[Bug c++/61323] 'static' and 'const' attributes cause non-type template argument matching failure

2014-05-26 Thread bisqwit at iki dot fi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61323

--- Comment #1 from Joel Yliluoma bisqwit at iki dot fi ---
Interestingly enough, only if you add the term constexpr to the array
declaration, you get an actually meaningful error message:

constexpr const char* table7[10] = {};
template<unsigned size, const char*const (&table)[size]> void test7() { }
void tester7() { test7<10,table7>(); }

Produces:
test.cc:42:35: error: 'table7' is not a valid template argument for type 'const
char* const (&)[10]' because object 'table7' has not external linkage
 void tester7() { test7<10,table7>(); }

The problem is that this very setting (non external linkage object as template
argument) should be allowed by C++11 -- the same standard that also gives us
constexpr.


[Bug rtl-optimization/58195] Missed optimization opportunity when returning a conditional

2014-03-11 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58195

Joel Yliluoma bisqwit at iki dot fi changed:

   What|Removed |Added

 CC||bisqwit at iki dot fi

--- Comment #1 from Joel Yliluoma bisqwit at iki dot fi ---
Problem confirmed on gcc (GCC) 4.9.0 20140303 (experimental) (SVN version) in
both 32-bit and 64-bit mode using -Ofast.

For comparison, Clang++ produces this instead (even on -O1): 

negl%edi
movl%edi, %eax
ret

GCC misses an optimization opportunity here.


[Bug c++/56794] New: C++11 Error in range-based for with parameter pack array

2013-03-31 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56794

 Bug #: 56794
   Summary: C++11 Error in range-based for with parameter pack
array
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


G++ 4.7.2 and 4.8.0 give the following error message for the for-loop in the
code below:

tmp.cc:10:17: error: range-based 'for' expression of type 'const int []' has
incomplete type

On G++ 4.6.3 (and Clang++), it compiles fine. Regression?

template<int... values>
static void Colors()
{
    static const int colors[] = { values... };
    // ^ This version passes in G++ 4.6 and Clang++ 3.0, fails in G++ 4.7 and
    // 4.8

    //static const int colors[sizeof...(values)] = { values... };
    // ^ This passes in all of them

    for(auto c: colors) { }
    // ^ This line is the one that gets the error message
}

int main()
{
    Colors<0,1,2> ();
}
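
The explicitly sized variant mentioned in the comment can serve as a workaround. The following is a minimal sketch, assuming at least one value is supplied (a zero-length array would be ill-formed); SumColors is a hypothetical name, and the loop body sums the values only so that the result is observable.

```cpp
// Uses the explicitly sized array form that the report says is accepted
// by all of the tested compilers.
template<int... values>
int SumColors()
{
    static const int colors[sizeof...(values)] = { values... };

    int sum = 0;
    for(auto c: colors) sum += c;   // range-based for over a complete array type
    return sum;
}
```

For example, SumColors<0,1,2>() iterates over the three array elements and returns their sum.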


[Bug c++/55250] New: [C++0x][constexpr] enum declarations within constexpr function are allowed, constexpr declarations are not

2012-11-09 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55250

 Bug #: 55250
   Summary: [C++0x][constexpr] enum declarations within constexpr
function are allowed, constexpr declarations are not
Classification: Unclassified
   Product: gcc
   Version: 4.7.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


The following code compiles in GCC without warnings on -Wall -W -pedantic:
  constexpr int Test1(int x)   { enum { y = 1 };   return x+y; }

The following one does not:
  constexpr int Test2(int x)   { constexpr int y = 1;  return x+y; }

For the second code, GCC gives "error: body of constexpr function 'constexpr
int Test2(int)' not a return-statement".

In comparison, Clang++ gives an error for Test1: "error: types cannot be
defined in a constexpr function", and for Test2: "error: variables cannot be
declared in a constexpr function".

Now, reading http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2235.pdf,
it is not entirely clear which behavior is correct.

While I would like both samples to work without warnings, I suggest that
attempting to declare an enum within a constexpr function be made a
-pedantic warning.

[Tested on GCC 4.6.3 through 4.7.2. On GCC 4.5.3, both functions compiled
without warnings.]
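
For reference, the later C++14 standard relaxed the rules for constexpr function bodies so that both forms are permitted. The following is a minimal sketch, assuming a compiler running in -std=c++14 mode or newer; it is not a statement about the compilers tested in this report.

```cpp
// Local enumeration inside a constexpr function (permitted since C++14).
constexpr int Test1(int x) { enum { y = 1 }; return x + y; }

// Local constexpr variable inside a constexpr function (permitted since C++14).
constexpr int Test2(int x) { constexpr int y = 1; return x + y; }

// Both functions are usable in constant expressions.
static_assert(Test1(1) == 2, "Test1 is evaluated at compile time");
static_assert(Test2(1) == 2, "Test2 is evaluated at compile time");
```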


[Bug c++/55239] New: Spurious unused variable warning on function-local objects with a destructor and an initializer

2012-11-08 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55239

 Bug #: 55239
   Summary: Spurious unused variable warning on function-local
objects with a destructor and an initializer
Classification: Unclassified
   Product: gcc
   Version: 4.7.1
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


In the code below, a function-local object is declared with a destructor whose
role is to ensure that some action is taken at the end of the scope, no matter
by which route the function is exited.

#include <stdio.h>
void LoadSomeFile(const char* fn)
{
    /* Open file */
    FILE* fp = fopen(fn, "rb");
    /* Ensure that the file is automatically closed no matter which path
       this function is exited */
    struct closer { FILE* f;  ~closer() { if(f) fclose(f); }
                  } autoclosefp = {fp};
    /* Some code here that deals with fp, and may include several return;
       clauses */
}
int main() { LoadSomeFile(__FILE__); } // test

But GCC gives a spurious "unused variable 'autoclosefp'" warning for this code,
implying that autoclosefp has no function. It does. Without it, the file would
not be closed and resources would be leaked.

The problem also occurs when the code is rewritten like this:

#include <stdio.h>
void LoadSomeFile(const char* fn)
{
    /* Open file */
    FILE* fp = fopen(fn, "rb");
    /* Ensure that the file is automatically closed no matter which path
       this function is exited */
    struct closer { FILE* f;  ~closer() { if(f) fclose(f); } };
    closer autoclosefp = {fp};
    /* Some code here that deals with fp, and may include several return;
       clauses */
}
int main() { LoadSomeFile(__FILE__); } // test

Changing the "= {fp}" into C++11 style "{fp}" does not take away the warning,
either.

Only changing the initialization-by-initializer into a member assignment takes
away the warning.

#include <stdio.h>
void LoadSomeFile(const char* fn)
{
    /* Open file */
    FILE* fp = fopen(fn, "rb");
    /* Ensure that the file is automatically closed no matter which path
       this function is exited */
    struct closer { FILE* f;  ~closer() { if(f) fclose(f); } } autoclosefp;
    autoclosefp.f = fp;
    /* Some code here that deals with fp, and may include several return;
       clauses */
}
int main() { LoadSomeFile(__FILE__); } // test

I would argue that this is inconvenient, and wrong behavior by GCC.

Tested and verified on GCC 3.3 through 4.7.1. The -Wunused-variable (or -Wall)
option is required.
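
That such an object is not "unused" can be demonstrated by making the destructor's effect observable. The following is a minimal sketch, not code from the report: ScopeFlagger and DestructorRanAtScopeExit are hypothetical names, and the destructor sets a flag instead of closing a file. The unused attribute is a GCC/Clang extension that GCC documents as suppressing the warning for the marked variable; whether it helps on every affected version has not been verified here.

```cpp
// Hypothetical stand-in for the 'closer' struct: instead of closing a file,
// the destructor records that it ran.
struct ScopeFlagger
{
    bool* flag;
    ~ScopeFlagger() { if(flag) *flag = true; }
};

bool DestructorRanAtScopeExit()
{
    bool ran = false;
    {
        // Same initialization-by-initializer form as 'autoclosefp = {fp}'.
        // The attribute marks the variable as intentionally not referenced.
        ScopeFlagger guard __attribute__((unused)) = { &ran };
    }
    // The guard's destructor has run by now, so its effect is visible.
    return ran;
}
```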


[Bug c++/55240] New: [c++0x] ICE on non-static data member initialization using 'auto' variable from containing function

2012-11-08 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55240

 Bug #: 55240
   Summary: [c++0x] ICE on non-static data member initialization
using 'auto' variable from containing function
Classification: Unclassified
   Product: gcc
   Version: 4.7.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


This code causes an ICE in GCC 4.7.1 and 4.7.2:

int main()
{
    int q = 1;
    struct test { int x = q; } instance;
}

tmpq.cc: In constructor 'constexpr main()::test::test()':
tmpq.cc:4:12: internal compiler error: in expand_expr_real_1, at expr.c:9122

It is notable that if the code is written like this, the error message changes.

int main()
{
    int q = 1;
    struct test { int x; test():x(q){} } instance;
}

tmpq.cc:5:35: error: use of 'auto' variable from containing function
tmpq.cc:3:9: error:   'int q' declared here
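
The second diagnostic reflects a language rule: a local class may not use an automatic variable of the enclosing function. A valid way to get the value into the object is to pass it as a constructor argument. The following is a minimal sketch; the function name LocalStructFromLocalValue and the parameter name value are hypothetical.

```cpp
// The local variable is handed to the local struct through a constructor
// argument, so the struct's own code never names 'q'.
int LocalStructFromLocalValue()
{
    int q = 1;
    struct test
    {
        int x;
        explicit test(int value) : x(value) {}
    } instance(q);
    return instance.x;
}
```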


[Bug c++/55239] Spurious unused variable warning on function-local objects with a destructor and an initializer

2012-11-08 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55239

--- Comment #5 from Joel Yliluoma bisqwit at iki dot fi 2012-11-08 15:16:48 UTC ---
Nice. I had no idea this was first reported in 2003 and fixed in 2012 in a
version recent enough to be still unreleased :)


[Bug c++/54946] New: ICE on template parameter from cast char-pointer in C++11 constexpr struct

2012-10-17 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54946

 Bug #: 54946
   Summary: ICE on template parameter from cast char-pointer in
C++11 constexpr struct
Classification: Unclassified
   Product: gcc
   Version: 4.7.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


The following C++11 code causes an ICE.

template<conts char*s>static void testfunc();

constexpr struct testtype { const char* str; } test = { "abc" };

void (*functionpointer)() = testfunc<(const char*) test.str>;

Sample GCC commandline to invoke the error: g++ test1.cc -std=c++0x

Error message on g++-4.7.1 (Debian 4.7.1-7 on x86_64):
test1.cc:5:29: internal compiler error: in convert_nontype_argument, at
cp/pt.c:5794

Error message on g++-4.6.3 (Debian 4.6.3-11 on x86_64):
test1.cc:5:29: internal compiler error: in convert_nontype_argument, at
cp/pt.c:5430

I do not know whether the code is valid.

Things that do not affect the error:
- Adding / removing "const" at any point or changing pointers into arrays at
any point
- Changing functionpointer into an array of function pointers
- Any code generation related options (such as -m32 or optimization levels)

Things that do hide the error:
- Removing the (const char*) cast entirely
- Changing the string pointers into integers
- Removing the struct encapsulation from str (making it "constexpr const char*
str = "abc";" and removing "test." from the third line)


[Bug c++/54946] ICE on template parameter from cast char-pointer in C++11 constexpr struct

2012-10-17 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54946

--- Comment #1 from Joel Yliluoma bisqwit at iki dot fi 2012-10-17 12:10:21 UTC ---
Please excuse the "conts" typo in the post; naturally it meant to say "const"
there. The typo is not relevant to the bug report.

I changed the code a few times trying to figure out what triggers the error and
what does not, and the version I copypasted was not a compiled one.


[Bug libstdc++/53630] New: C+11 regex compiler produces SIGSEGV

2012-06-11 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53630

 Bug #: 53630
   Summary: C+11 regex compiler produces SIGSEGV
Classification: Unclassified
   Product: gcc
   Version: 4.7.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


This simple code produces a segmentation fault. Tested in GCC 4.7.0, GCC 4.6.3,
and Clang 3.1 (where the latter uses libstdc++ from GCC 4.7).

#include <regex>

int main()
{
    std::regex r("(go |)((n(orth|))|(s(outh|))|(w(est|))|(e(ast|)))",
                 std::regex::extended);
    return 0;
}

Omitting the std::regex::extended option does not make a difference.

Replacing all of the "|)" with ")" makes the regex compile, but obviously with
a completely different expression. As of now, libstdc++ does not yet support
the '?' operator, so the expression cannot be rewritten as "(go
)?((n(orth)?)|(s(outh)?)|(w(est)?)|(e(ast)?))". There is also no non-capturing
grouping operator, so writing e.g. "(n(?:orth|))" is not an option.

A minimal regexp that duplicates the crash is: "((a(b|))|x)". Simple
reorderings such as "((a(|b))|x)" or "(x|(a(|b)))" do not make a difference.
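
For comparison, here is how the '?' form of the expression would be used on a standard library whose <regex> implementation is complete. This is a minimal sketch under that assumption (it does not apply to the libstdc++ versions tested above); is_direction_command is a hypothetical name, and the default ECMAScript grammar is used rather than std::regex::extended.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the whole command matches the direction
// grammar from the report, written with '?' instead of empty alternatives.
bool is_direction_command(const std::string& command)
{
    static const std::regex pattern(
        "(go )?((n(orth)?)|(s(outh)?)|(w(est)?)|(e(ast)?))");
    return std::regex_match(command, pattern);
}
```

With this pattern, inputs such as "go north" or "s" match as a whole, while "go up" does not.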

GDB backtrace below:

(gdb) bt
#0  0x7732cdbd in malloc_consolidate (av=0x77639e60) at
malloc.c:5169
#1  0x7732f2a4 in _int_malloc (av=0x77639e60, bytes=1280) at
malloc.c:4373
#2  0x77331960 in *__GI___libc_malloc (bytes=1280) at malloc.c:3660
#3  0x77b39e6d in operator new(unsigned long) () from
/usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00404ce4 in
__gnu_cxx::new_allocatorstd::__regex::_State::allocate (this=0x7fffe980,
__n=16)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/ext/new_allocator.h:94
#5  0x004046b0 in std::_Vector_basestd::__regex::_State,
std::allocatorstd::__regex::_State ::_M_allocate (this=0x7fffe980,
__n=16)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/stl_vector.h:169
#6  0x00404338 in
_ZNSt6vectorINSt7__regex6_StateESaIS1_EE19_M_emplace_back_auxIJS1_EEEvDpOT_
(this=0x7fffe980, __args=0x7fffe010)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/vector.tcc:402
#7  0x00404270 in
_ZNSt6vectorINSt7__regex6_StateESaIS1_EE12emplace_backIJS1_EEEvDpOT_
(this=0x7fffe980, __args=0x7fffe010)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/vector.tcc:102
#8  0x00403690 in std::vectorstd::__regex::_State,
std::allocatorstd::__regex::_State ::push_back(std::__regex::_State) (
this=0x7fffe980, __x=0x7fffe010) at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/stl_vector.h:900
#9  0x00403052 in
std::__regex::_Nfa::_M_insert_subexpr_begin(std::functionvoid
(std::__regex::_PatternCursor const, std::__regex::_Results) const)
(this=0x7fffe978, __t=...) at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_nfa.h:312
#10 0x0040848f in std::__regex::_Compilerchar const*,
std::regex_traitschar ::_M_atom (this=0x7fffe928)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:943
#11 0x00407b98 in std::__regex::_Compilerchar const*,
std::regex_traitschar ::_M_term (this=0x7fffe928)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:793
#12 0x00405be9 in std::__regex::_Compilerchar const*,
std::regex_traitschar ::_M_alternative (this=0x7fffe928)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:771
#13 0x00403119 in std::__regex::_Compilerchar const*,
std::regex_traitschar ::_M_disjunction (this=0x7fffe928)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:756
#14 0x004084d5 in std::__regex::_Compilerchar const*,
std::regex_traitschar ::_M_atom (this=0x7fffe928)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:945
#15 0x00407b98 in std::__regex::_Compilerchar const*,
std::regex_traitschar ::_M_term (this=0x7fffe928)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:793
#16 0x00405be9 in std::__regex::_Compilerchar const*,
std::regex_traitschar ::_M_alternative (this=0x7fffe928)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:771
#17 0x00405c2f in std::__regex::_Compilerchar const*,
std::regex_traitschar ::_M_alternative (this=0x7fffe928)
at
/usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:774
#18 0x00403119 in std::__regex::_Compilerchar const*,
std::regex_traitschar ::_M_disjunction 

[Bug c++/50276] [C++0x] Wrong used uninitialized in this function warning

2012-01-03 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50276

--- Comment #5 from Joel Yliluoma bisqwit at iki dot fi 2012-01-03 23:16:07 
UTC ---
It also accepts this code without complaints, which is another error:

template<int i>
bool test()
{
  if (bool value = this_identifier_has_not_been_declared( []() {} ))
    return value;

  __builtin_abort();

  return false;
}

int main()
{
  test<0>();
}

The wrong-code problem occurs also with this code:

template<int i>
bool test()
{
  if (bool value = []() { return 1; } )
    return value;

  __builtin_abort();

  return false;
}

int main()
{
  test<0>();
}


[Bug c++/50276] New: Wrong used uninitialized in this function warning [C++0x]

2011-09-02 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50276

 Bug #: 50276
   Summary: Wrong used uninitialized in this function warning
[C++0x]
Classification: Unclassified
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


For this example code, GCC mistakenly produces the following warning:
tmp.cc:10:5: warning: 'value' is used uninitialized in this function
[-Wuninitialized]
The warning is wrongly given, because there is no execution path that does not
assign a well-defined value to the variable. In fact, there are no branches at
all between the declaring and the assigning of the variable.

template<typename T>
unsigned testfun(const T func)
{
    return func();
}

template<int i>
unsigned test()
{
    if(unsigned value = testfun( [] () { return 0; }))
    {
        return value;
    }
    return i;
}

int main()
{
    return test<1>();
}

The warning being wrongly given depends on the following conditions:
- test() being a template function: changing i into an actual parameter
removes the warning
- func being a functor: changing it into an integer parameter removes the
warning
- the variable value being declared and assigned to in the if-condition:
declaring and assigning it separately removes the warning.
- the func parameter being a lambda function: changing it into a static
method of a class removes the warning.

The following aspects do not affect the warning:
- testfun() being a template function: changing T into an explicit int(*)()
retains the warning
- whether i is used within test() or not
- adding static or inline attributes to any function did not change the
warning.

Tested on GCC 4.5.3 and GCC 4.6.1, on x86_64-linux-gnu in both 32-bit and
64-bit mode on all optimization modes.


[Bug c++/50276] Wrong used uninitialized in this function warning [C++0x]

2011-09-02 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50276

--- Comment #1 from Joel Yliluoma bisqwit at iki dot fi 2011-09-02 13:04:31 
UTC ---
Even this produces the warning. Changing any of the 0s into 1 did not
affect the warning.

static inline unsigned testfun(void*)
{
return 0;
}

template<int i>
static inline unsigned test()
{
    if(unsigned value = testfun( []() { return 0; } ))
        return value;
    return 0;
}

int main()
{
    return test<0>();
}


[Bug c++/49100] [OpenMP]: Compiler error when inline method defined within OpenMP loop

2011-07-25 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49100

--- Comment #3 from Joel Yliluoma bisqwit at iki dot fi 2011-07-25 10:01:08 
UTC ---
While it's true that one should not reference the original variable within the
loop, the question is why the inner function references the original variable
rather than the in-loop variable when there is no explicit reference to the
original variable. A reference is established, by name, within the loop, but
within the loop there should be no way to reference the outside-loop
variable, because the inner-loop scope shadows the outer one. Hence the
reference should bind to the inner-loop variable, thus conforming to the OpenMP
specification.


[Bug c++/49100] [OpenMP]: Compiler error when inline method defined within OpenMP loop

2011-07-25 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49100

--- Comment #5 from Joel Yliluoma <bisqwit at iki dot fi> 2011-07-25 10:24:20 UTC ---
Obviously :) All right, thanks.


[Bug c++/49100] New: [OpenMP]: Compiler error when inline method defined within OpenMP loop

2011-05-21 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49100

   Summary: [OpenMP]: Compiler error when inline method defined
within OpenMP loop
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


This report is similar to PR49043, but unlike it, this does not involve C++0x.

This valid code fails to compile on GCC.
GCC emits an "invalid exit from OpenMP structured block" error message.

int main()
{
#pragma omp parallel for
for(int a=0; a<10; ++a)
{
struct x
{
void test() { return; };
};
}
}

If the explicit return statement is removed, it compiles.
It is also triggered by code such as this:

struct y
{
static bool test(int c) { return c==5; }
};

if put inside the OpenMP loop construct, meaning it happens for static and
non-static methods as long as they include an explicit return statement.

The purpose of this error is to catch exits from an OpenMP construct (return,
break, goto). No such thing happens when a function is called or defined. The
error is not given when the struct is defined outside the loop (even if invoked
inside the loop). It is clearly a parser error.

It failed on all GCC versions that I tried that support OpenMP. These include
GCC 4.2.4, 4.3.5, 4.4.6, 4.5.3 and 4.6.1.

I have not tested whether the patch committed as a result of PR49043 also fixes
this bug.


[Bug c++/49100] [OpenMP]: Compiler error when inline method defined within OpenMP loop

2011-05-21 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49100

--- Comment #1 from Joel Yliluoma <bisqwit at iki dot fi> 2011-05-21 11:31:46 UTC ---
It also does not happen with C's nested functions. This for instance compiles
and works just fine (to my surprise).

#include <stdio.h>
int main()
{
int a;
#pragma omp parallel for
for(a=0; a<10; ++a)
{
int c() { return 65; }
putchar( c() );
}
return 0;
}

I venture into another bug report here, but I wonder if this is a bug or
intentional behavior, that the code below outputs YY, as though the
variable a within c() is bound to the a from the surrounding context rather
than the OpenMP loop's private copy of a. If the OpenMP loop is removed, it
outputs ABCDEFGHIJ as expected.

#include <stdio.h>
int main()
{
int a = 24;
#pragma omp parallel for
for(a=0; a<10; ++a)
{
int c() { return a+65; }
putchar( c() );
}
return 0;
}


[Bug c++/49043] [OpenMP C++0x]: Compiler error when lambda-function within OpenMP loop

2011-05-19 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49043

--- Comment #2 from Joel Yliluoma <bisqwit at iki dot fi> 2011-05-19 08:10:06 UTC ---
Even if the lambda function is not called, it happens. Merely defining the
function causes it.
Interestingly though, it does not happen if a method body is defined within the
loop. The code below does not cause the error. So it is restricted to lambda
function bodies.
It also does not happen when calling lambda functions that are defined outside
the loop.

int main()
{
#pragma omp parallel for
for(int a=0; a<10; ++a)
{
struct tmp
{
static int test() { return 0; }
};
}
}


[Bug c++/49043] [OpenMP C++0x]: Compiler error when lambda-function within OpenMP loop

2011-05-19 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49043

Joel Yliluoma bisqwit at iki dot fi changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[C++0x] Returns from lambda |[OpenMP C++0x]: Compiler
                   |functions incorrectly       |error when lambda-function
                   |detected as exits from      |within OpenMP loop
                   |OpenMP loops in surrounding |
                   |code                        |

--- Comment #1 from Joel Yliluoma <bisqwit at iki dot fi> 2011-05-19 08:05:26 UTC ---
It also happens if the lambda-function does not explicitly contain the return
statement.

For example, this code triggers the same error.

int main()
{
#pragma omp parallel for
for(int a=0; a<10; ++a)
{
auto func = [] () -> void { };
func();
}
}


[Bug c++/49043] New: Returns from lambda functions incorrectly detected as exits from OpenMP loops in surrounding code

2011-05-18 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49043

   Summary: Returns from lambda functions incorrectly detected as
exits from OpenMP loops in surrounding code
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


GCC incorrectly considers return statements within lambda functions as exits
from an OpenMP structured block.

For the code below, this error message is generated:

tmp3.cc: In lambda function:
tmp3.cc:7:40: error: invalid exit from OpenMP structured block

#include <iostream>
int main()
{
#pragma omp parallel for
for(int a=0; a<10; ++a)
{
auto func = [=] () { return a; };
std::cout << func();
}
}

Compiled with: -fopenmp -std=gnu++0x
Tested versions: 4.5.3 , 4.6.1

The purpose of this error is to catch exits from an OpenMP construct (return,
break, goto). No such thing happens when a lambda function is called, which is
not different from calling an inlined function, therefore the error message is
misplaced.


[Bug libstdc++/48933] New: Infinite recursion in tr1/cmath functions with complex parameters

2011-05-09 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48933

   Summary: Infinite recursion in tr1/cmath functions with complex
parameters
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi
CC: w...@iki.fi


All of the function calls in this example code produce a stack overflow due to
infinite recursion, regardless of optimization level.
Compile with: g++ code.cc
Tested on the following gcc versions: 4.2.4  4.3.5  4.4.6  4.5.2  4.6.1
No compiler warnings or errors are emitted. (Tried even -Wall -W -pedantic
-ansi).

Does not happen on gcc 4.0.4, because tr1/cmath is unavailable.

#include <tr1/cmath>
#include <complex>
int main()
{
std::tr1::tgamma( std::complex<double> (0.5, 0.0) );
std::tr1::cbrt( std::complex<double> (0.5, 0.0) );
std::tr1::asinh( std::complex<double> (0.5, 0.0) );
std::tr1::acosh( std::complex<double> (1.5, 0.0) );
std::tr1::atanh( std::complex<double> (0.5, 0.0) );
std::tr1::erf( std::complex<double> (0.5, 0.0) );
std::tr1::hypot( std::complex<double> (1.0, 0.0) ,
                 std::complex<double> (1.0, 0.0) );
std::tr1::logb( std::complex<double> (0.5, 0.0) );
std::tr1::round( std::complex<double> (0.5, 0.0) );
std::tr1::trunc( std::complex<double> (0.5, 0.0) );
}

The bug can be traced to all functions in tr1/cmath that look like this:

  template<typename _Tp>
inline typename __gnu_cxx::__promote<_Tp>::__type
cbrt(_Tp __x)
{
  typedef typename __gnu_cxx::__promote<_Tp>::__type __type;
  return cbrt(__type(__x)); // <-- infinite recursion here
}
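
The recursion mechanism described above can be reproduced in a small
stand-alone sketch. Everything named *_sketch below is hypothetical and is not
libstdc++ code: promote_sketch merely imitates the idea of __promote (integers
become double, everything else is left unchanged). The static_assert is one
conceivable guard, shown only to make the failure visible at compile time; it
is not claimed to be how libstdc++ addressed the report.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Hypothetical stand-in for __gnu_cxx::__promote: integral types promote to
// double, every other type promotes to itself.
template<typename T, bool = std::is_integral<T>::value>
struct promote_sketch { typedef T type; };
template<typename T>
struct promote_sketch<T, true> { typedef double type; };

// Non-template overload that does the real work.
inline double cbrt_sketch(double x) { return std::cbrt(x); }

// Generic forwarding overload with the same shape as the tr1/cmath template.
template<typename T>
inline typename promote_sketch<T>::type cbrt_sketch(T x)
{
    typedef typename promote_sketch<T>::type type;
    // If promotion leaves the type unchanged, the call below would select
    // this very template again and never terminate. Reject that case here.
    static_assert(!std::is_same<type, T>::value,
                  "promotion did not change the type: forwarding would recurse");
    return cbrt_sketch(type(x));
}
```

With this sketch, cbrt_sketch(27) promotes int to double and reaches the
non-template overload, while an argument such as std::complex<double> is
rejected at compile time instead of overflowing the stack at run time.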


[Bug libstdc++/48933] Infinite recursion in tr1/cmath functions with complex parameters

2011-05-09 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48933

--- Comment #4 from Joel Yliluoma <bisqwit at iki dot fi> 2011-05-09 10:51:28 UTC ---
There is, however, an asinh, a cbrt, a hypot etc. for complex types. I don't
know about the standard, but mathematically they are well defined. (for example,
hypot(x,y) = sqrt(x*x + y*y), asinh(x) = log(x + sqrt(x*x + 1)))

For trunc and other rounding functions, probably not so.
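
The two formulas quoted above can be written down directly with std::complex.
This is only a sketch: the function names asinh_by_formula and
hypot_by_formula are made up for illustration, and branch cuts are simply
whatever std::sqrt and std::log give (their principal values).

```cpp
#include <cassert>
#include <cmath>
#include <complex>

typedef std::complex<double> cplx;

// asinh(z) = log(z + sqrt(z*z + 1)), evaluated with principal sqrt and log.
inline cplx asinh_by_formula(cplx z)
{
    return std::log(z + std::sqrt(z * z + 1.0));
}

// hypot(x, y) = sqrt(x*x + y*y), taken literally for complex arguments.
inline cplx hypot_by_formula(cplx x, cplx y)
{
    return std::sqrt(x * x + y * y);
}
```

For the arguments used in the report, asinh_by_formula(cplx(0.5, 0.0)) agrees
with the real-valued std::asinh(0.5), and hypot_by_formula applied to two
copies of cplx(1.0, 0.0) gives the square root of 2.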


[Bug c++/46764] New: std=c++0x causes compilation failure on SFINAE test for methods

2010-12-02 Thread bisqwit at iki dot fi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46764

   Summary: std=c++0x causes compilation failure on SFINAE test
for methods
   Product: gcc
   Version: 4.5.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bisq...@iki.fi


This code tests whether a class defines a method of a certain name or not.
It fails to compile on GCC when -std=c++0x is used. Without -std=c++0x, it
compiles and works fine.

#include <iostream>

struct Hello { int helloworld() { return 0; } };
struct Generic {};

// SFINAE test
template <typename T>
class has_helloworld
{
typedef char yes;
typedef struct { char dummy[2]; } no;

template <typename C> static  yes test( typeof(&C::helloworld) ) ;
template <typename C> static   no test(...);

public:
enum { value = sizeof(test<T>(0)) == sizeof(yes) };
};

int main()
{
std::cout << has_helloworld<Hello>::value << std::endl;
std::cout << has_helloworld<Generic>::value << std::endl;
return 0;
}

With -std=c++0x, we get the following error message:

tmp5.cc:13:68: error: ISO C++ forbids in-class initialization of non-const
static member 'test'
tmp5.cc:13:68: error: template declaration of 'has_helloworld::yes test'

Without -std=c++0x, the code compiles without warnings.

This indicates that GCC misinterprets test() as a member variable
initialization rather than a member function declaration, even though the
parameter expression yields a type rather than a value.
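
For comparison, the same detection can be sketched with decltype, which is a
standard C++0x keyword, instead of typeof. As far as I know, typeof is a GNU
extension keyword that is not recognized under strict -std=c++0x (while
__typeof__ remains available), which would be consistent with the parse
described above; treat that as an assumption rather than a diagnosis. The use
of std::true_type and std::false_type is a choice made for this sketch, not
part of the original test case.

```cpp
#include <type_traits>

struct Hello { int helloworld() { return 0; } };
struct Generic {};

// SFINAE test written with decltype instead of typeof.
template <typename T>
class has_helloworld
{
    // Viable only if &C::helloworld is a well-formed expression.
    template <typename C> static std::true_type  test(decltype(&C::helloworld));
    // Fallback chosen when the overload above is discarded.
    template <typename C> static std::false_type test(...);

public:
    static const bool value = decltype(test<T>(nullptr))::value;
};

static_assert(has_helloworld<Hello>::value, "Hello defines helloworld()");
static_assert(!has_helloworld<Generic>::value, "Generic does not");
```

The check happens entirely at compile time, so no run-time output is needed to
see which branch was selected.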


[Bug c++/42697] ice-on-legal-code: template class template function local objects

2010-01-17 Thread bisqwit at iki dot fi


--- Comment #9 from bisqwit at iki dot fi  2010-01-17 22:37 ---
Out of curiosity... What does it mean it's not a regression, and what are its
practical implications?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42697



[Bug c++/42697] ice-on-legal-code: template class template function local objects

2010-01-17 Thread bisqwit at iki dot fi


--- Comment #11 from bisqwit at iki dot fi  2010-01-18 07:59 ---
Ah, I see. So the reason it is not fixed in 4.5 is that it may cause new
regressions?


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42697



[Bug c++/42743] New: Inexplicable error message with constructing SIMD values

2010-01-14 Thread bisqwit at iki dot fi
This code:

#include <emmintrin.h>

template<typename vec_t>
void x()
{
vec_t tmp1 = vec_t(); // Works with myvec, causes error with __m128
vec_t tmp2 = {};  // Causes warnings about uninitialized members in myvec
vec_t tmp3;   // This may cause a warning about use of uninitialized variables if tmp3 is later read-accessed.
}


struct myvec
{
struct tmp
{
float data[2];
} d;
};
void y()
{
x<__m128> ();
x<myvec>  ();
}

Produces this error when vec_t is __m128:
tmp.cc:6: error: can't convert between vector values of different size
And this warning when vec_t is myvec:
tmp.cc:7: warning: missing initializer for member 'myvec::d'

It is my understanding that constructor calls should never be treated as syntax
errors. Is there really no way to write this code so that it causes neither a
compile error nor a warning?


-- 
   Summary: Inexplicable error message with constructing SIMD values
   Product: gcc
   Version: 4.4.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bisqwit at iki dot fi
 GCC build triplet: x86_64-pc-linux-gnu
  GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42743



[Bug c++/42697] New: ice-on-legal-code: template class template function local objects

2010-01-11 Thread bisqwit at iki dot fi
Example source code:

template<class Value_t>
class fparser
{
template<bool Option>
void eval2(Value_t r[2]);
public: 
void evaltest();
};


/*template<class Value_t>
template<bool Option>
void fparser<Value_t>::eval2(Value_t r[2])
{
}*/

template<>
template<bool Option>
void fparser<int>::eval2(int r[2])
{
struct ObjType
{
int tmp;
};
ObjType Object = { 5 };
}


template<class Value_t>
void fparser<Value_t>::evaltest
()
{
eval2<false>(0);
}


template class fparser<int>;

Compilation result:

Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.2-8'
--with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared
--enable-multiarch --enable-linker-build-id --with-system-zlib
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls
--enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc
--with-arch-32=i486 --with-tune=generic --enable-checking=release
--build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.2 (Debian 4.4.2-8) 
COLLECT_GCC_OPTIONS='-Wall' '-W' '-O3' '-c' '-v' '-shared-libgcc'
'-mtune=generic'
 /usr/lib/gcc/x86_64-linux-gnu/4.4.2/cc1plus -quiet -v -D_GNU_SOURCE tmp2.cc
-quiet -dumpbase tmp2.cc -mtune=generic -auxbase tmp2 -O3 -Wall -W -version -o
/tmp/ccPuGyBP.s
ignoring nonexistent directory /usr/local/include/x86_64-linux-gnu
ignoring nonexistent directory
/usr/lib/gcc/x86_64-linux-gnu/4.4.2/../../../../x86_64-linux-gnu/include
ignoring nonexistent directory /usr/include/x86_64-linux-gnu
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/4.4
 /usr/include/c++/4.4/x86_64-linux-gnu
 /usr/include/c++/4.4/backward
 /usr/local/include
 /usr/lib/gcc/x86_64-linux-gnu/4.4.2/include
 /usr/lib/gcc/x86_64-linux-gnu/4.4.2/include-fixed
 /usr/include
End of search list.
GNU C++ (Debian 4.4.2-8) version 4.4.2 (x86_64-linux-gnu)
compiled by GNU C version 4.4.2, GMP version 4.3.1, MPFR version
2.4.2-p1.
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 545dc413edcab7204151d021d996af39
tmp2.cc: In member function 'void fparser<Value_t>::eval2(Value_t*) [with bool
Option = false, Value_t = int]':
tmp2.cc:33:   instantiated from 'void fparser<Value_t>::evaltest() [with
Value_t = int]'
tmp2.cc:37:   instantiated from here
tmp2.cc:19: internal compiler error: in tsubst, at cp/pt.c:9339
Please submit a full bug report,
with preprocessed source if appropriate.
See file:///usr/share/doc/gcc-4.4/README.Bugs for instructions.


Occurs on g++-4.1: internal compiler error: in tsubst, at cp/pt.c:7267
Occurs on g++-4.2: internal compiler error: in tsubst, at cp/pt.c:7465
Occurs on g++-4.3: internal compiler error: in tsubst, at cp/pt.c:9031
Occurs on g++-4.4: internal compiler error: in tsubst, at cp/pt.c:9339
Does not occur on g++-4.0, because 4.0 instead gives the error: template-id
'eval2' for 'void fparser<int>::eval2(int*)' does not match any template
declaration.

Occurs at all optimization levels.


-- 
   Summary: ice-on-legal-code: template class template function
local objects
   Product: gcc
   Version: 4.4.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bisqwit at iki dot fi
 GCC build triplet: x86_64-pc-linux-gnu
  GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42697



[Bug c++/34953] New: ICC on destructor + noreturn-function at -O3

2008-01-24 Thread bisqwit at iki dot fi
This code crashes GCC versions 4.1.2, 4.1.3, and 4.2.3, when compiled using the
-O3 option.


void B_CLEAR(void* ret);
void B_NeverReturns(void* ret) __attribute__((noreturn));

int main()
{
const struct AutoErrPop { ~AutoErrPop() { } } AutoErrPopper = { };
B_NeverReturns(0);
}

void B_NeverReturns(void* ret)
{
B_CLEAR(ret); /* Never returns (does a setjmp/goto) */
}

Tested on x86_64 and i386. To reproduce: g++ a.cc -O3

Expected result:
a.cc:4: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
For Debian GNU/Linux specific bug reporting instructions,
see <URL:file:///usr/share/doc/gcc-4.2/README.Bugs>.


-- 
   Summary: ICC on destructor + noreturn-function at -O3
   Product: gcc
   Version: 4.1.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bisqwit at iki dot fi
 GCC build triplet: x86_64-pc-linux-gnu
  GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34953



[Bug c++/34953] ICC on destructor + noreturn-function at -O3

2008-01-24 Thread bisqwit at iki dot fi


--- Comment #1 from bisqwit at iki dot fi  2008-01-24 13:52 ---
The body of the function B_CLEAR() is not included, and not relevant, since the
error happens without the body as well, and does not progress to linking.


-- 

bisqwit at iki dot fi changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bisqwit at iki dot fi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34953



[Bug c++/32767] ICE in constructing a template object using statement expression on AMD64.

2007-07-15 Thread bisqwit at iki dot fi


--- Comment #4 from bisqwit at iki dot fi  2007-07-15 21:17 ---
It is also reported that on some 32-bit platforms, instead of causing an ICE,
it causes runaway memory consumption.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32767



[Bug c++/32767] ICE in constructing a template object using statement expression on AMD64.

2007-07-14 Thread bisqwit at iki dot fi


--- Comment #2 from bisqwit at iki dot fi  2007-07-14 17:34 ---
Also, yay, bug report #32767 :)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32767



[Bug rtl-optimization/31485] New: C complex numbers, amd64 SSE, missed optimization opportunity

2007-04-05 Thread bisqwit at iki dot fi
Considering that complex turns basically any basic type into a vector type,
complex number addition and subtraction could utilize SSE instructions to
perform the operation on real and imaginary parts simultaneously. (Only applies
to addition and subtraction.)
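
To illustrate the claim that a complex number behaves like a two-element
vector under addition, here is a small C++ sketch (the report itself uses C;
add_lanewise is a name invented for this example). It relies on the C++11
guarantee that a std::complex<float> may be accessed as an array of two
floats, real part first.

```cpp
#include <cassert>
#include <complex>

// Adds two complex numbers by treating each one as float[2]
// (index 0 = real part, index 1 = imaginary part).
inline std::complex<float> add_lanewise(std::complex<float> a,
                                        std::complex<float> b)
{
    const float (&pa)[2] = reinterpret_cast<const float (&)[2]>(a);
    const float (&pb)[2] = reinterpret_cast<const float (&)[2]>(b);
    float out[2];
    for (int lane = 0; lane < 2; ++lane)
        out[lane] = pa[lane] + pb[lane];  // same operation in every lane
    return std::complex<float>(out[0], out[1]);
}
```

Every lane performs the same independent operation, which is exactly the shape
a packed SSE addition handles. Multiplication mixes real and imaginary parts
across lanes, which is why the observation is limited to addition and
subtraction.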

Code:

#include <complex.h>

typedef float complex ss1;
typedef float ss2 __attribute__((vector_size(sizeof(ss1))));

ss1 add1(ss1 a, ss1 b) { return a + b; }
ss2 add2(ss2 a, ss2 b) { return a + b; }

Produces:

add1:
        movq    %xmm0, -8(%rsp)
        movq    %xmm1, -16(%rsp)
        movss   -4(%rsp), %xmm0
        movss   -8(%rsp), %xmm1
        addss   -12(%rsp), %xmm0
        addss   -16(%rsp), %xmm1
        movss   %xmm0, -20(%rsp)
        movss   %xmm1, -24(%rsp)
        movq    -24(%rsp), %xmm0
        ret
add2:
        movlps  %xmm0, -16(%rsp)
        movlps  %xmm1, -24(%rsp)
        movaps  -24(%rsp), %xmm0
        addps   -16(%rsp), %xmm0
        movaps  %xmm0, -56(%rsp)
        movlps  -56(%rsp), %xmm0
        ret

Command line:
gcc -msse  -O3 -S test2.c
(Results are same with -ffast-math)
Architecture:
CPU=AMD Athlon(tm) 64 X2 Dual Core Processor 4600+
CPU features=fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow
pni lahf_lm cmp_legacy

GCC is:
Target: x86_64-linux-gnu
Configured with: ../src/configure -v
--enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr
--enable-shared --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --enable-nls
--program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu
--enable-libstdcxx-debug --enable-mpfr --enable-checking=release
x86_64-linux-gnu
Thread model: posix
gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)


-- 
   Summary: C complex numbers, amd64 SSE, missed optimization
opportunity
   Product: gcc
   Version: 4.1.2
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bisqwit at iki dot fi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485



[Bug rtl-optimization/26098] New: ICE in multiplication of 16-byte longlong vector on x86_64

2006-02-04 Thread bisqwit at iki dot fi
This code causes ICE on gcc 4.0.3 on x86_64.

typedef long long vec __attribute__ ((vector_size(16)));
vec vecsqr(vec a) { return a*a; }

Commandline:

gcc -O1 -S -o - tmp.c

Resulting output:

.file   tmp.c
tmp.c: In function 'vecsqr':
tmp.c:2: error: unrecognizable insn:
(insn 13 12 15 0 (set (reg:DI 58 [ D.1470 ])
(vec_select:DI (reg/v:V2DI 61 [ a ])
(parallel [
(const_int 1 [0x1])
]))) -1 (nil)
(expr_list:REG_DEAD (reg/v:V2DI 61 [ a ])
(nil)))
tmp.c:2: internal compiler error: in extract_insn, at recog.c:2020

The ICE occurs whenever the -O level is >= 1; -O0 does not trigger it.
Option -mno-sse also avoids the ICE, but then GCC gives the error: SSE register
return with SSE disabled. -mno-sse2 does not avoid it.
Signedness of the element type has no effect on the result. Without
__attribute__((vector_size)), there is no ICE.

GCC version (gcc -v):

Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v
--enable-languages=c,c++,java,f95,objc,ada,treelang --prefix=/usr
--enable-shared --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --enable-nls
--program-suffix=-4.0 --enable-__cxa_atexit --enable-clocale=gnu
--enable-libstdcxx-debug --enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-4.0-1.4.2.0/jre --enable-mpfr
--disable-werror --enable-checking=release x86_64-linux-gnu
Thread model: posix
gcc version 4.0.3 20051201 (prerelease) (Debian 4.0.2-5)


-- 
   Summary: ICE in multiplication of 16-byte longlong vector on
x86_64
   Product: gcc
   Version: 4.0.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: bisqwit at iki dot fi
 GCC build triplet: x86_64-linux-gnu
  GCC host triplet: x86_64-linux-gnu
GCC target triplet: x86_64-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26098