[Bug target/116014] Missed optimization opportunity: inverted shift count
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116014

--- Comment #2 from Joel Yliluoma ---
(In reply to Andi Kleen from comment #1)
> is that from some real code? why would a programmer write shifts like that?

Yes, it is from actual code:

    uint64_t readvlq()
    {
        uint64_t x, f = ~(uint64_t)0, ones8 = f / 255, pat80 = ones8*0x80, pat7F = ones8*0x7F;
        memcpy(&x, ptr, sizeof(x));
        uint8_t n = __builtin_ctzll(~(x|pat7F)) + 1;
        ptr += n/8;
        return _pext_u64(x, pat7F >> (64-n));
    }

This function reads a variable-length encoded integer (as in General MIDI) from a bytestream without loops or branches. It essentially does the same as this:

    uint64_t readvlq()
    {
        uint64_t result = 0;
        do { result = (result << 7) | (*ptr & 0x7F); } while(*ptr++ & 0x80);
        return result;
    }

It isn’t too hard to think of other plausible cases where bit shifts by numberofbits(tgt)-variable may occur. In fact, after just 2 minutes of searching with `grep`, I found this line in LLVM (llvm-17/llvm/Bitstream/BitstreamWriter.h), where CurValue is a 32-bit entity:

    CurValue = Val >> (32-CurBit);
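As a self-contained illustration of the loop form quoted above, here is a minimal sketch. The name `read_vlq_loop` and the pointer-reference parameter are assumptions made for this example; the quoted code uses a `ptr` variable whose declaration is not shown.

```cpp
#include <cstdint>

// Hypothetical stand-alone variant of the loop-based reader quoted above.
// Assumption: the stream position is passed by reference instead of
// living in an outer `ptr` variable as in the quoted code.
std::uint64_t read_vlq_loop(const unsigned char*& ptr)
{
    std::uint64_t result = 0;
    unsigned char byte;
    do {
        byte = *ptr++;                           // consume one byte
        result = (result << 7) | (byte & 0x7F);  // append its 7 payload bits
    } while (byte & 0x80);                       // high bit set: more bytes follow
    return result;
}
```

With the two-byte input 0x81 0x00, the first byte contributes 1 and the second shifts it left by 7, giving 128.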
[Bug middle-end/116013] Missed optimization opportunity with andn involving consts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116013 --- Comment #1 from Joel Yliluoma --- Should be noted that this is not x86_64 specific; andn exists for other platforms too, and even for platforms that don’t have it, changing `~(expr|const)` into `~expr & ~const` is unlikely to be a pessimization.
[Bug tree-optimization/116014] New: Missed optimization opportunity: inverted shift count
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116014

            Bug ID: 116014
           Summary: Missed optimization opportunity: inverted shift count
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Below are six short functions which perform bit-shifts by a non-constant inverted amount. GCC fails to generate optimal code. Further explanation is given below the assembler code.

    #include <stdint.h>
    uint64_t shl_m64(uint64_t value, uint8_t k) { return value << (64-k); }
    uint64_t shl_m63(uint64_t value, uint8_t k) { return value << (63-k); }
    uint64_t shr_m64(uint64_t value, uint8_t k) { return value >> (64-k); }
    uint64_t shr_m63(uint64_t value, uint8_t k) { return value >> (63-k); }
    int64_t asr_m64(int64_t value, uint8_t k) { return value >> (64-k); }
    int64_t asr_m63(int64_t value, uint8_t k) { return value >> (63-k); }

Below is the code generated by GCC, using -Ofast -mbmi2 -masm=intel. BMI2 is used just to make the assembler code more succinct; it is not relevant for the report.

    shl_m64:
            mov     eax, 64
            sub     eax, esi
            shlx    rax, rdi, rax
            ret
    shl_m63:
            mov     eax, 63
            sub     eax, esi
            shlx    rax, rdi, rax
            ret
    shr_m64:
            mov     eax, 64
            sub     eax, esi
            shrx    rax, rdi, rax
            ret
    shr_m63:
            mov     eax, 63
            sub     eax, esi
            shrx    rax, rdi, rax
            ret
    asr_m64:
            mov     eax, 64
            sub     eax, esi
            sarx    rax, rdi, rax
            ret
    asr_m63:
            mov     eax, 63
            sub     eax, esi
            sarx    rax, rdi, rax
            ret

GCC fails to utilize the fact that on Intel, the shift instructions automatically mask the shift count to the target register width. That is, a shift of a 64-bit operand by 68 is the same as a shift by 68%64 = 4, and a shift of a 32-bit operand by 100 is the same as a shift by 100%32 = 4. Utilizing this knowledge permits the use of a single-insn neg/not to replace the subtract, which requires two insns.

In comparison, Clang (version 16) produces this (optimal) code:

    shl_m64:
            neg     sil
            shlx    rax, rdi, rsi
            ret
    shl_m63:
            not     sil
            shlx    rax, rdi, rsi
            ret
    shr_m64:
            neg     sil
            shrx    rax, rdi, rsi
            ret
    shr_m63:
            not     sil
            shrx    rax, rdi, rsi
            ret
    asr_m64:
            neg     sil
            sarx    rax, rdi, rsi
            ret
    asr_m63:
            not     sil
            sarx    rax, rdi, rsi
            ret

Tested GCC version:
GCC: (Debian 14-20240330-1) 14.0.1 20240330 (experimental) [master r14-9728-g6fc84f680d0]
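The arithmetic behind the neg/not replacement can be checked in isolation. The sketch below is an illustration, not part of the report: it models the hardware's 6-bit count masking with `& 63` (the helper names `masked`, `neg_matches`, `not_matches` and `all_match` are made up for this example) and verifies that 64-k and -k, and likewise 63-k and ~k, agree after masking for every 8-bit k.

```cpp
#include <cstdint>

// Model of the 64-bit shift-count masking described in the report:
// only the low 6 bits of the count are used.
constexpr unsigned masked(unsigned count) { return count & 63u; }

// 64-k and -k differ by a multiple of 64, so they agree after masking.
constexpr bool neg_matches(unsigned k) { return masked(64u - k) == masked(0u - k); }

// ~k equals -k-1, and 63 is congruent to -1 modulo 64, so 63-k and ~k agree too.
constexpr bool not_matches(unsigned k) { return masked(63u - k) == masked(~k); }

// Check every value a uint8_t shift parameter can take.
constexpr bool all_match()
{
    for (unsigned k = 0; k < 256; ++k)
        if (!neg_matches(k) || !not_matches(k)) return false;
    return true;
}

static_assert(all_match(), "neg/not produce the same masked shift count");
```

Note that in the C source, counts of 64 or more (for example k = 0 in shl_m64) are undefined behavior, so the compiler is free to pick whatever the masked instruction happens to produce for those inputs.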
[Bug tree-optimization/116013] New: Missed optimization opportunity with andn involving consts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116013

            Bug ID: 116013
           Summary: Missed optimization opportunity with andn involving consts
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Below are two short functions which work identically. While GCC utilizes the ANDN instruction (of Intel BMI1) for test2, it fails to see that it could do the same with test1.

    #include <stdint.h>
    uint64_t test1(uint64_t value)
    {
        return ~(value | 0x7F7F7F7F7F7F7F7F);
    }
    uint64_t test2(uint64_t value)
    {
        return ~value & ~0x7F7F7F7F7F7F7F7F;
    }

Assembler listings of both functions are below (-Ofast -mbmi):

    test1:
            movabsq $9187201950435737471, %rdx
            movq    %rdi, %rax
            orq     %rdx, %rax
            notq    %rax
            ret
    test2:
            movabsq $-9187201950435737472, %rax
            andn    %rax, %rdi, %rax
            ret

Tested compiler version:
GCC: (Debian 14-20240330-1) 14.0.1 20240330 (experimental) [master r14-9728-g6fc84f680d0]

This optimization only makes sense if one of the operands is a compile-time constant. If neither operand is a compile-time constant, then the opposite optimization makes more sense, which GCC already does.

It is also worth noting that GCC already compiles ~(var1 | ~var2) into ~var1 & var2, utilizing ANDN. This is good.
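The equivalence of the two forms is De Morgan's law, and it can be checked at compile time. The following is a small sketch for illustration; the names `kMask`, `or_then_not` and `andn_form` are made up here and do not appear in the report.

```cpp
#include <cstdint>

// The constant from the report.
constexpr std::uint64_t kMask = 0x7F7F7F7F7F7F7F7FULL;

// Form used by test1: OR with the constant, then invert.
constexpr std::uint64_t or_then_not(std::uint64_t v) { return ~(v | kMask); }

// Form used by test2: invert both sides, then AND. ~kMask is itself a
// compile-time constant, which is what lets a single ANDN do the work.
constexpr std::uint64_t andn_form(std::uint64_t v) { return ~v & ~kMask; }

static_assert(~kMask == 0x8080808080808080ULL, "inverted constant");
static_assert(or_then_not(0) == andn_form(0), "De Morgan, v = 0");
static_assert(or_then_not(0x123456789ABCDEFULL) == andn_form(0x123456789ABCDEFULL),
              "De Morgan, arbitrary v");
```

The inverted constant 0x8080808080808080 is the same bit pattern as the -9187201950435737472 immediate in the test2 listing.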
[Bug c++/99895] New: Function parameters generated wrong in call to member of non-type template parameter in lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99895

            Bug ID: 99895
           Summary: Function parameters generated wrong in call to member of non-type template parameter in lambda
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

GCC produces a false error message:

    bug1.cc: In instantiation of ‘consteval void VerifyHash() [with unsigned int expected_hash = 5; fixed_string<...auto...> ...s = {fixed_string<6>{"khaki"}, fixed_string<6>{"plums"}}]’:
    bug1.cc:24:37: required from here
    bug1.cc:19:41: error: no matching function for call to ‘fixed_string<6>::data(const fixed_string<6>*)’
       19 | [](auto){static_assert(hash(s.data(), s.size()) == expected_hash);}(s)
          | ~~^~
    bug1.cc:11:27: note: candidate: ‘consteval const char* fixed_string::data() const [with long unsigned int N = 6]’
       11 | consteval const char* data() const { return str; }
          | ^~~~
    bug1.cc:11:27: note: candidate expects 0 arguments, 1 provided

On this code:

    #include <algorithm> // copy_n and size_t
    static constexpr unsigned hash(const char* s, std::size_t length)
    {
        s=s;
        return length;
    }
    template<std::size_t N>
    struct fixed_string
    {
        constexpr fixed_string(const char (&s)[N]) { std::copy_n(s, N, str); }
        consteval const char* data() const { return str; }
        consteval std::size_t size() const { return N-1; }
        char str[N];
    };
    template<unsigned expected_hash, fixed_string... s>
    static consteval void VerifyHash()
    {
        (
          [](auto){static_assert(hash(s.data(), s.size()) == expected_hash);}(s)
        ,...);
        // The compiler mistakenly translates s.data() into s.data(&s)
        // and then complains that the call is not valid, because
        // the function expects 0 parameters and 1 "was provided".
    }
    void foo()
    {
        VerifyHash<5, "khaki", "plums">();
    }

Compiler version: g++-10 (Debian 10.2.1-6) 10.2.1 20210110
[Bug c++/99893] New: C++20 unexpanded parameter packs falsely not detected (lambda is involved)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99893

            Bug ID: 99893
           Summary: C++20 unexpanded parameter packs falsely not detected (lambda is involved)
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

GCC produces a false error message:

    bug1.cc: In function ‘consteval void VerifyHash()’:
    bug1.cc:20:70: error: operand of fold expression has no unexpanded parameter packs
       20 | [](){static_assert(hash(s.data(), s.size()) == expected_hash);}()
          | ~~~^~

On this code:

    #include <algorithm> // copy_n and size_t
    static constexpr unsigned hash(const char* s, std::size_t length)
    {
        s=s;
        return length;
    }
    template<std::size_t N>
    struct fixed_string
    {
        constexpr fixed_string(const char (&s)[N]) { std::copy_n(s, N, str); }
        consteval const char* data() const { return str; }
        consteval std::size_t size() const { return N-1; }
        char str[N];
    };
    template<unsigned expected_hash, fixed_string... s>
    static consteval void VerifyHash()
    {
        (
          [](){static_assert(hash(s.data(), s.size()) == expected_hash);}()
        ,...);
        // ^ Falsely reports that there are no unexpanded parameter packs,
        //   while there definitely is one ("s" is used).
    }
    void foo()
    {
        VerifyHash<5, "khaki", "plums">();
    }

Compiler version: g++-10 (Debian 10.2.1-6) 10.2.1 20210110
[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485 --- Comment #25 from Joel Yliluoma --- (In reply to Jakub Jelinek from comment #24) > on x86 read e.g. about MXCSR register and in the description of each > instruction on which Exceptions it can raise. So the quick answer to #15 is that addps instruction may raise exceptions. Ok, thanks for clearing that up. My bad. So it seems that LLVM relies on the assumption that the upper portions of the register are zeroed, and this is what you said in the first place.
[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #23 from Joel Yliluoma ---
(In reply to Jakub Jelinek from comment #21)
> (In reply to Joel Yliluoma from comment #20)
> > Which exceptions would be generated by data in an unused portion of a
> > register?
>
> addps adds 4 float elements, there is no "unused" portion.
> If some of the elements contain garbage, it can trigger for e.g. the addition
> FE_INVALID, FE_OVERFLOW, FE_UNDERFLOW or FE_INEXACT (FE_DIVBYZERO obviously
> isn't relevant to addition).
> Please read the standard about floating point exceptions, fenv.h etc.

There is an “unused” portion, for the purposes of the data use. It is the same as with padding in structs: the memory is unused because no part of the program relies on its contents, even though the CPU may load those portions into registers when e.g. moving and copying the struct. The CPU won’t know whether it’s used or not.

You mention FE_INVALID etc., but those are concepts within the C standard library, not in the hardware. The C standard library will not make judgments on the upper portions of the register.

So if you have two float[2]s, and you add them together into another float[2], and the compiler uses addps to achieve this task, what is the mechanism that would supposedly generate an exception, when no part of the software depends on or makes judgments about the irrelevant parts of the register?
[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #20 from Joel Yliluoma ---
(In reply to Jakub Jelinek from comment #16)
> (In reply to Joel Yliluoma from comment #15)
> > (In reply to Richard Biener from comment #14)
> > > I also think llvms code generation is bogus since it appears the ABI
> > > does not guarantee zeroed upper elements of the xmm0 argument
> > > which means they could contain sNaNs:
> >
> > Why would it matter that the unused portions of the register contain NaNs?
>
> Because it could then raise exceptions that shouldn't be raised?

Which exceptions would be generated by data in an unused portion of a register? Does for example “addps” generate an exception if one or two of the operands contain NaNs? Which instructions would generate exceptions?

I can only think of divps, when dividing by zero, but it does not seem that even LLVM compiles the two-element vector division into divps.

If the register is passed as a parameter to a library function, that function would not make judgments based on the values of the unused portions of the register.
[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485 --- Comment #15 from Joel Yliluoma --- (In reply to Richard Biener from comment #14) > I also think llvms code generation is bogus since it appears the ABI > does not guarantee zeroed upper elements of the xmm0 argument > which means they could contain sNaNs: Why would it matter that the unused portions of the register contain NaNs?
[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485 --- Comment #13 from Joel Yliluoma --- GCC 4.1.2 is indicated in the bug report headers. Luckily, Compiler Explorer has a copy of that exact version, and it indeed vectorizes the second function: https://godbolt.org/z/DC_SSb On my own system, the earliest I have is 4.6. The Compiler Explorer has 4.4, and it, or anything newer than that, no longer vectorizes either function.
[Bug rtl-optimization/31485] C complex numbers, amd64 SSE, missed optimization opportunity
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485

--- Comment #11 from Joel Yliluoma ---
Looks like this issue has taken a step or two *backwards* in the past years. Whereas the second function used to be vectorized properly, today it seems neither of them is.

Contrast this with Clang, which compiles *both* functions into a single instruction:

    vaddps xmm0, xmm1, xmm0

or some variant thereof depending on the -m options.

Compiler Explorer link: https://godbolt.org/z/2AKhnt
[Bug c++/94575] Bogus warning: Used variable is “not” used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94575

--- Comment #2 from Joel Yliluoma ---
Sorry, the error Marek Polacek mentions is due to a copy-paste mistake on my part. The correct code that demonstrates the problem is here. The difference is the && instead of &.

    #include <cstdio>
    template<typename T>
    static void Use(T&& plot)
    {
        plot(1);
    }
    int main()
    {
        static const int table[1] = {123456};
        Use([&](auto x)
        {
            unsigned var = table[x];
            unsigned ui = var;
            std::printf("%u\n", ui);
        });
    }
[Bug c++/94575] New: Bogus warning: Used variable is “not” used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94575

            Bug ID: 94575
           Summary: Bogus warning: Used variable is “not” used
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

Incorrect warning (-std=c++14 -Wall)

Tested and occurs on GCC 5.4.1, 6.5.0, 7.5.0, 8.4.0, 9.3.0, and 10.0.1 (master revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536)

    tmp.cc: In function ‘int main()’:
    tmp.cc:9:22: warning: variable ‘table’ set but not used [-Wunused-but-set-variable]
        9 | static const int table[1] = {123456};
          | ^

The table is in fact used; the program prints 123456.

    #include <cstdio>
    template<typename T>
    static void Use(T& plot)
    {
        plot(1);
    }
    int main()
    {
        static const int table[1] = {123456};
        Use([&](auto x)
        {
            unsigned var = table[x];
            unsigned ui = var;
            std::printf("%u\n", ui);
        });
    }

This is a non-exhaustive list of changes that will make the warning go away:

- Changing the lambda auto parameter into a static type such as int
- Changing Use() into a lambda function in main()
- Removing the store into temporary variable “ui”
[Bug c++/94571] Error: Expected comma or semicolon, comma found
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94571 --- Comment #1 from Joel Yliluoma --- | ^ (Missing line from the paste) The problem exists since GCC 7. (GCC 6 and earlier did not support structured bindings.)
[Bug c++/94571] New: Error: Expected comma or semicolon, comma found
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94571

            Bug ID: 94571
           Summary: Error: Expected comma or semicolon, comma found
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

    void foo()
    {
        int test1[2], test2[2];
        auto [a,b] = test1, [c,d] = test2;
    }

The error message given for this (invalid) C++17 code is a bit confusing.

    tmp.cc: In function ‘void foo()’:
    tmp.cc:4:23: error: expected ‘,’ or ‘;’ before ‘,’ token
        4 | auto [a,b] = test1, [c,d] = test2;

You expected a comma, and found a comma. So what is the problem?

The proper error message would be to only expect a semicolon.
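For contrast, the following sketch shows the valid C++17 spelling, in which each structured binding gets its own declaration. The function name `sum_pairs` and the initializer values are made up for illustration.

```cpp
// A structured binding declaration introduces exactly one [..] group,
// so the two bindings must be declared separately.
int sum_pairs()
{
    int test1[2] = {1, 2}, test2[2] = {3, 4};
    auto [a, b] = test1;  // binds to a copy of test1
    auto [c, d] = test2;  // separate declaration for the second pair
    return a + b + c + d;
}
```

This is why only a semicolon can legitimately follow `test1` in the reported code.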
[Bug tree-optimization/58195] Missed optimization opportunity when returning a conditional
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58195 --- Comment #4 from Joel Yliluoma --- Still confirmed on GCC 10 (Debian 10-20200324-1) 10.0.1 20200324 (experimental) [master revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536] Seems I lack the oomph to update the "confirmed" state of this report.
[Bug c++/94546] New: unimplemented: unexpected AST of kind nontype_argument_pack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94546 Bug ID: 94546 Summary: unimplemented: unexpected AST of kind nontype_argument_pack Product: gcc Version: 10.0 Status: UNCONFIRMED Keywords: rejects-valid Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- Rejects valid code. $ g++-10 --version g++-10 (Debian 10-20200324-1) 10.0.1 20200324 (experimental) [master revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536] $ g++-10 tmp.cc -std=c++20 tmp.cc: In instantiation of ‘void test(auto:1&&) [with auto:1 = main()::&]’: tmp.cc:18:14: required from here tmp.cc:8:5: sorry, unimplemented: unexpected AST of kind nontype_argument_pack 8 | [&](T&&... rest) | ^~~~ 9 | { | ~ 10 | plot(std::forward(rest)...); | ~~~ 11 | }; | ~ tmp.cc:8: confused by earlier errors, bailing out #include void test(auto&& plot) { // Note: For brevity, this lambda function is only // defined, not called nor assigned to a variable. // Doing those things won’t fix the error. [&](T&&... rest) { plot(std::forward(rest)...); }; } int main() { auto Plot = [](auto&&...) { }; test(Plot); }
[Bug c++/94490] New: Ternary expression with 3 consts is “not” a constant expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94490 Bug ID: 94490 Summary: Ternary expression with 3 consts is “not” a constant expression Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- Versions tested: GCC 9.3.0, GCC 10.0.1 20200324 (experimental) [master revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536] Error message: test6.cc: In instantiation of ‘auto Mul(const T&, const T2&) [with T = std::array; T2 = std::array]’: test6.cc:20:58: required from here test6.cc:16:78: error: ‘(false ? std::tuple_size_v > : std::tuple_size_v >)’ is not a constant expression 16 | return typename arith_result::type { std::get(vec) ... }; | ^ test6.cc:20:6: error: ‘void x’ has incomplete type 20 | auto x = Mul(std::array{}, std::array{}); | ^ Compiler is erroneously claiming that an expression of type (x ? y : z) where all of x,y,z are constant expressions, is not a constant expression. Code: #include #include template, std::size_t B=std::tuple_size_v, std::size_t N = std::min(A,B), class S = std::make_index_sequence<(A> struct arith_result { using type = std::conditional_t(std::index_sequence) -> std::array..., std::tuple_element_t...>, N>{}(S{})), decltype([](std::index_sequence) -> std::tuple, std::tuple_element_t>...>{}(S{}))>; }; template, typename T2 = T> auto Mul(const T& vec, const T2& val) { return [&](std::index_sequence) { return typename arith_result::type { std::get(vec) ... }; } (std::make_index_sequence<2>{}); } auto x = Mul(std::array{}, std::array{}); Note that if I replace the Mul function with this (inline the lambda call), the problem goes away: template, typename T2 = T> auto Mul(const T& vec, const T2& val) { return typename arith_result::type { std::get<0>(vec),std::get<1>(vec) }; } Somehow the compiler forgets to do constant folding while it is processing the lambda.
[Bug c++/94489] New: ICE: unexpected expression ‘std::min’ of kind overload
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94489 Bug ID: 94489 Summary: ICE: unexpected expression ‘std::min’ of kind overload Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- On GCC 9.3.0: $ g++-9 test5.cc -std=c++2a -g -fconcepts test5.cc: In instantiation of ‘auto Mul(const T&, const T2&) [with T = std::array; T2 = std::array]’: test5.cc:33:62: required from here test5.cc:28:117: internal compiler error: unexpected expression ‘std::min’ of kind overload 28 | return typename arith_result::type { std::plus(std::get(vec), std::get(val)) ... }; | ^ 0x7f78e300ce0a __libc_start_main ../csu/libc-start.c:308 On 10.0.1 20200324 (experimental) [master revision 596c90d3559:023579257f5:906b3eb9df6c577d3f6e9c3ea5c9d7e4d1e90536]: $ g++-10 test5.cc -std=c++20 -g test5.cc: In instantiation of ‘auto Mul(const T&, const T2&) [with T = std::array; T2 = std::array]’: test5.cc:33:62: required from here test5.cc:28:117: internal compiler error: unexpected expression ‘std::min’ of kind overload 28 | return typename arith_result::type { std::plus(std::get(vec), std::get(val)) ... 
}; | ^ 0x63d82b cxx_eval_constant_expression ../../src/gcc/cp/constexpr.c:6301 0x637ded cxx_eval_call_expression ../../src/gcc/cp/constexpr.c:2055 0x63ad65 cxx_eval_constant_expression ../../src/gcc/cp/constexpr.c:5483 0x63ccc2 cxx_eval_indirect_ref ../../src/gcc/cp/constexpr.c:4213 0x63ccc2 cxx_eval_constant_expression ../../src/gcc/cp/constexpr.c:5704 0x63dbe0 cxx_eval_outermost_constant_expr ../../src/gcc/cp/constexpr.c:6502 0x63e5ec cxx_constant_value(tree_node*, tree_node*) ../../src/gcc/cp/constexpr.c:6659 0x733823 expand_integer_pack ../../src/gcc/cp/pt.c:3751 0x733823 expand_builtin_pack_call ../../src/gcc/cp/pt.c:3790 0x733823 tsubst_pack_expansion(tree_node*, tree_node*, int, tree_node*) ../../src/gcc/cp/pt.c:12714 0x735561 tsubst_template_args(tree_node*, tree_node*, int, tree_node*) ../../src/gcc/cp/pt.c:13078 0x73a678 tsubst_argument_pack(tree_node*, tree_node*, int, tree_node*) ../../src/gcc/cp/pt.c:13040 0x735534 tsubst_template_args(tree_node*, tree_node*, int, tree_node*) ../../src/gcc/cp/pt.c:13090 0x7356e5 tsubst_aggr_type ../../src/gcc/cp/pt.c:13295 0x72c6f4 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool) ../../src/gcc/cp/pt.c:20100 0x7304e4 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool) ../../src/gcc/cp/pt.c:15745 0x7304e4 tsubst(tree_node*, tree_node*, int, tree_node*) ../../src/gcc/cp/pt.c:15745 0x735466 tsubst_template_args(tree_node*, tree_node*, int, tree_node*) ../../src/gcc/cp/pt.c:13092 0x7356e5 tsubst_aggr_type ../../src/gcc/cp/pt.c:13295 0x7307fe tsubst(tree_node*, tree_node*, int, tree_node*) ../../src/gcc/cp/pt.c:15633 Code is listed below. 
#include #include #include template concept IsTuple = requires(T t) { {std::get<0>(t) };} and (std::tuple_size_v>-MinSz) <= (MaxSz-MinSz); template,std::tuple_size_v), class seq = decltype(std::make_index_sequence{})> struct arith_result { template static auto t(std::index_sequence) -> std::tuple, std::tuple_element_t>...>; template static auto a(std::index_sequence) -> std::array..., std::tuple_element_t...>, dim>; using type = std::conditional_t; }; template, typename T2 = T> auto Mul(const T& vec, const T2& val) { return [&](std::index_sequence) { return typename arith_result::type { std::plus(std::get(vec), std::get(val)) ... }; } (std::make_index_sequence>, std::tuple_size_v>)>{}); } auto x = Mul(std::array{}, std::array{});
[Bug c++/94128] ICE on C++20 "requires requires" with lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94128 --- Comment #2 from Joel Yliluoma --- Yes, it is valid. — The auto parameter is valid since C++20. It is called a “placeholder type”, which has existed since C++11. C++20 made it valid also in function parameters. — The “requires” is a valid keyword since C++20. It specifies constraints that the parameter must match. The double “requires” manifests in certain situations. — Until C++20, lambdas were not permitted in unevaluated contexts. Changed in C++20.
[Bug c++/94128] New: ICE on C++20 "requires requires" with lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94128 Bug ID: 94128 Summary: ICE on C++20 "requires requires" with lambda Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- For this code: void test(auto param) requires requires{ { [](auto p){return p;}(param) }; }; void test2() { test(1); } On this compiler: g++-10 (Debian 10-20200222-1) 10.0.1 20200222 (experimental) [master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779] Compiling with this commandline: g++-10 -v tmp.cc -std=c++20 We get: Using built-in specs. COLLECT_GCC=g++-10 COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 10-20200222-1' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-10 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none,amdgcn-amdhsa,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 10.0.1 20200222 (experimental) 
[master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779] (Debian 10-20200222-1) COLLECT_GCC_OPTIONS='-v' '-std=c++2a' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /usr/lib/gcc/x86_64-linux-gnu/10/cc1plus -quiet -v -imultiarch x86_64-linux-gnu -D_GNU_SOURCE tmp.cc -quiet -dumpbase tmp.cc -mtune=generic -march=x86-64 -auxbase tmp -std=c++2a -version -fasynchronous-unwind-tables -o /tmp/cc8CWcEJ.s GNU C++17 (Debian 10-20200222-1) version 10.0.1 20200222 (experimental) [master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779] (x86_64-linux-gnu) compiled by GNU C version 10.0.1 20200222 (experimental) [master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779], GMP version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version isl-0.22-GMP warning: GMP header version 6.2.0 differs from library version 6.1.2. GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 ignoring duplicate directory "/usr/include/x86_64-linux-gnu/c++/10" ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu" ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/10/../../../../x86_64-linux-gnu/include" #include "..." search starts here: #include <...> search starts here: /usr/include/c++/10 /usr/include/x86_64-linux-gnu/c++/10 /usr/include/c++/10/backward /usr/lib/gcc/x86_64-linux-gnu/10/include /usr/local/include /usr/lib/gcc/x86_64-linux-gnu/10/include-fixed /usr/include/x86_64-linux-gnu /usr/include End of search list. 
GNU C++17 (Debian 10-20200222-1) version 10.0.1 20200222 (experimental) [master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779] (x86_64-linux-gnu) compiled by GNU C version 10.0.1 20200222 (experimental) [master revision 01af7e0a0c2:487fe13f218:e99b18cf7101f205bfdd9f0f29ed51caaec52779], GMP version 6.2.0, MPFR version 4.0.2, MPC version 1.1.0, isl version isl-0.22-GMP warning: GMP header version 6.2.0 differs from library version 6.1.2. GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 Compiler executable checksum: f533434f622c23e753fbd5b6135ebdd3 tmp.cc: In instantiation of ‘void test(auto:1) requires requires{{()(test::param)};} [with auto:1 = int]’: tmp.cc:3:22: required from here tmp.cc:2:26: internal compiler error: Segmentation fault 2 | requires requires{ { [](auto p){return p;}(param) }; }; | ^ 0xc248ef crash_signal ../../src/gcc/toplev.c:328 0x7fb53dd2d0ff ??? /build/glibc-kSJANG/glibc-2.29/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 0x733be8 tsubst_template_args(tree_node*, tree_node*, int, tree_node*) ../../src/gcc/cp/pt.c:13090 0x738602 tsubst_fu
[Bug c++/92766] New: [Rejects valid] pointer+0 erroneously treated as rvalue
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92766

            Bug ID: 92766
           Summary: [Rejects valid] pointer+0 erroneously treated as rvalue
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

    template<typename T>
    void foo(T&& begin, T&& end);
    void test()
    {
        unsigned char buffer[16];
        const unsigned char* ptr = buffer;
        foo(ptr+0, ptr+8);
    }

On GCC 8.3, this compiles fine.
On GCC 9.1 and trunk, it fails to compile (-std=c++11, c++14, c++17, c++2a):

    :8:21: error: no matching function for call to 'foo(const unsigned char*&, const unsigned char*)'
        8 | foo(ptr+0, ptr+8);
          | ^
    :2:6: note: candidate: 'template void foo(T&&, T&&)'
        2 | void foo(T&& begin, T&& end);
          | ^~~
    :2:6: note: template argument deduction/substitution failed:
    :8:21: note: deduced conflicting types for parameter 'T' ('const unsigned char*&' and 'const unsigned char*')
        8 | foo(ptr+0, ptr+8);
          | ^
    Compiler returned: 1
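The expected value categories can be spelled out with type traits. This is an illustrative sketch only; the alias `Ptr` and the helper `binds_as_lvalue` are names invented for this example. On a conforming compiler, `ptr+0` is a prvalue, so a forwarding reference deduces a non-reference `T` for it, the same as for `ptr+8`.

```cpp
#include <type_traits>
#include <utility>

using Ptr = const unsigned char*;

// Pointer arithmetic yields a prvalue: decltype gives a non-reference type.
static_assert(std::is_same<decltype(std::declval<Ptr&>() + 0), Ptr>::value,
              "ptr+0 is a prvalue of pointer type");

// The named pointer itself is an lvalue.
static_assert(std::is_same<decltype((std::declval<Ptr&>())), Ptr&>::value,
              "a plain pointer variable is an lvalue");

// Reports whether a forwarding reference deduced T as an lvalue reference.
template<typename T>
constexpr bool binds_as_lvalue(T&&) { return std::is_lvalue_reference<T>::value; }
```

With both arguments deducing `T = const unsigned char*`, the call in the report has no conflict, which matches the GCC 8.3 behavior.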
[Bug c/91526] Unnecessary SSE and other instructions generated when compiling in C mode (vs. C++ mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91526 Joel Yliluoma changed: What|Removed |Added CC||bisqwit at iki dot fi --- Comment #2 from Joel Yliluoma --- The theory that it is related to RVO seems to be confirmed by the fact that if the code is changed like this: struct Vec { float v[8]; }; void multiply(struct Vec* result, const struct Vec* __restrict__ v1, const struct Vec* __restrict__ v2) { for(unsigned i = 0; i < 8; ++i) result->v[i] = v1->v[i] * v2->v[i]; } Then it gets compiled in the shorter and proper form. Interestingly, even if the __restrict__ attribute is removed, it still gets vectorized. Is this correct behavior?
[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201

--- Comment #24 from Joel Yliluoma ---
The simple horizontal 8-bit add seems to work nicely. Very nice work.

However, the original bug report (that the code snippet quoted below no longer receives love from the SIMD optimization unless you explicitly say “#pragma omp simd”) seems still unaddressed.

    #define num_words 2
    typedef unsigned long long E;
    E bytes[num_words];
    unsigned char sum()
    {
        E b[num_words] = {};
        //#pragma omp simd
        for(unsigned n=0; n> 32);
        temp += (temp >> 16);
        temp += (temp >> 8);
        // Save that number in an array
        b[n] = temp;
        }
        // Calculate sum of those sums
        unsigned char result = 0;
        //#pragma omp simd
        for(unsigned n=0; n

https://godbolt.org/z/XL3cIK
[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201 --- Comment #19 from Joel Yliluoma --- If the function return type is changed to "unsigned short", the AVX code with "vpextrb" will do a spurious "movzx eax, al" at the end — but if the return type is "unsigned int", it will not. The code with "(v)movd" should of course do it, if the vector element size is shorter than the return type.
[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201 --- Comment #18 from Joel Yliluoma --- Great, thanks. I can test this in a few days, but I would like to make sure that the proper thing still happens if the vector is of bytes but the return value of the function is a larger-than-byte integer type. Will it still generate a movd in this case? Because that would be wrong. :-)
[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201 --- Comment #16 from Joel Yliluoma --- In reference to my previous comment, this is the code I tested with and the compiler flags were -Ofast -mno-avx. unsigned char bytes[128]; unsigned char sum (void) { unsigned char r = 0; const unsigned char *p = (const unsigned char *) bytes; int n; for (n = 0; n < sizeof (bytes); ++n) r += p[n]; return r; }
[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201
--- Comment #15 from Joel Yliluoma ---
Seems to work neatly now. Any reason why on vector size 128, non-AVX, it does the low byte move through the red zone? Are pextrb or movd instructions not available? Or does ABI specify that the upper bits of the eax register must be zero?

    movaps XMMWORD PTR [rsp-40], xmm2
    movzx  eax, BYTE PTR [rsp-40]

Clang does just a simple movd here.

    movd   eax, xmm1
[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201
--- Comment #6 from Joel Yliluoma ---
Maybe "horizontal checksum" is a bit of an obscure term. An 8-bit checksum is what is being accomplished, nonetheless. Yes, there are simpler ways to do it… But I tried a number of different approaches in order to get maximum-performance SIMD code out of GCC, and I came upon this curious case that I posted this bug report about.

To another compiler, I reported a related bug concerning code that looks like this:

    unsigned char calculate_checksum(const void* ptr)
    {
        unsigned char bytes[16], result = 0;
        memcpy(bytes, ptr, 16);
        // The reason the memcpy is there is that, to
        // my knowledge, it is the only _safe_ way permitted by
        // the standard to do conversions between representations.
        // Union, pointer casting, etc. are not safe.
        for(unsigned n=0; n<16; ++n) result += bytes[n];
        return result;
    }

After my report, their compiler now generates:

    vmovdqu xmm0, xmmword ptr [rdi]
    vpshufd xmm1, xmm0, 78   # xmm1 = xmm0[2,3,0,1]
    vpaddb  xmm0, xmm0, xmm1
    vpxor   xmm1, xmm1, xmm1
    vpsadbw xmm0, xmm0, xmm1
    vpextrb eax, xmm0, 0
    ret

This is what GCC generates for the same code:

    vmovdqu xmm0, XMMWORD PTR [rdi]
    vpsrldq xmm1, xmm0, 8
    vpaddb  xmm0, xmm0, xmm1
    vpsrldq xmm1, xmm0, 4
    vpaddb  xmm0, xmm0, xmm1
    vpsrldq xmm1, xmm0, 2
    vpaddb  xmm0, xmm0, xmm1
    vpsrldq xmm1, xmm0, 1
    vpaddb  xmm0, xmm0, xmm1
    vpextrb eax, xmm0, 0
    ret

So the bottom line is, (v)psadbw reductions should be added as M. Glisse correctly indicated.
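Why the psadbw-based sequence gives the same answer as the byte loop can be checked with a scalar model. The following is only an illustrative sketch, not compiler output: the names `checksum_naive` and `checksum_sad_model` are made up here, and the model mirrors the steps of the first listing quoted in the comment (fold the two halves with a wrapping byte add, sum eight bytes against zero, take the low byte).

```cpp
#include <cstdint>
#include <cstring>

// Reference: 8-bit wrapping sum of 16 bytes, as in the reported function.
inline std::uint8_t checksum_naive(const void* ptr)
{
    unsigned char bytes[16];
    std::memcpy(bytes, ptr, 16);
    std::uint8_t result = 0;
    for (unsigned n = 0; n < 16; ++n)
        result = static_cast<std::uint8_t>(result + bytes[n]);
    return result;
}

// Scalar model of the vpshufd / vpaddb / vpsadbw / vpextrb sequence.
// (Hypothetical helper, written only to illustrate the reduction.)
inline std::uint8_t checksum_sad_model(const void* ptr)
{
    unsigned char bytes[16];
    std::memcpy(bytes, ptr, 16);

    // vpshufd + vpaddb: add the upper 8 bytes onto the lower 8 bytes,
    // each lane wrapping modulo 256.
    unsigned char folded[8];
    for (unsigned n = 0; n < 8; ++n)
        folded[n] = static_cast<unsigned char>(bytes[n] + bytes[n + 8]);

    // vpsadbw against zero: sum of |x - 0| over 8 bytes, i.e. a plain
    // sum that is wide enough not to wrap (at most 8 * 255).
    std::uint16_t sad = 0;
    for (unsigned n = 0; n < 8; ++n)
        sad = static_cast<std::uint16_t>(sad + folded[n]);

    // vpextrb ..., 0: keep only the low byte of that sum.
    return static_cast<std::uint8_t>(sad & 0xFF);
}
```

Both functions compute the sum modulo 256, so dropping carries early (in the byte add) or late (at the final extract) makes no difference.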
[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201
--- Comment #3 from Joel Yliluoma ---
For the record, for this particular case (8-bit checksum of an array, 16 bytes in this case) there exists even more optimal SIMD code, which ICC (version 18 or greater) generates automatically.

    vmovups xmm0, XMMWORD PTR bytes[rip]  #5.9
    vpxor   xmm2, xmm2, xmm2              #4.41
    vpaddb  xmm0, xmm2, xmm0              #4.41
    vpsrldq xmm1, xmm0, 8                 #4.41
    vpaddb  xmm3, xmm0, xmm1              #4.41
    vpsadbw xmm4, xmm2, xmm3              #4.41
    vmovd   eax, xmm4                     #4.41
    movsx   rax, al                       #4.41
    ret                                   #7.16
[Bug tree-optimization/91201] New: [7~9 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201 Bug ID: 91201 Summary: [7~9 Regression] SIMD not generated for horizontal sum of bytes in array Product: gcc Version: 9.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- For this code — typedef unsigned long long E; const unsigned D = 2; E bytes[D]; unsigned char sum() { E b[D]{}; //#pragma omp simd for(unsigned n=0; n> 32); temp += (temp >> 16); temp += (temp >> 8); b[n] = temp; } E result = 0; //#pragma omp simd for(unsigned n=0; nhttps://godbolt.org/z/azkXiL
[Bug rtl-optimization/88770] New: Redundant load opt. or CSE pessimizes code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88770
Bug ID: 88770
Summary: Redundant load opt. or CSE pessimizes code
Product: gcc
Version: 8.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: bisqwit at iki dot fi
Target Milestone: ---

For this code (-xc -std=c99 or -xc++ -std=c++17):

    struct guu { int a; int b; float c; char d; };
    extern void test(struct guu);
    void caller()
    {
        test( (struct guu){.a = 3, .b = 5, .c = 7, .d = 9} );
        test( (struct guu){.a = 3, .b = 5, .c = 7, .d = 9} );
    }

CSE (or some other form of redundant loads optimization) pessimizes the code. Problem occurs on optimization levels -O1 and higher, including -Os.

If the function "caller" calls test() just once, the resulting code is (-O3 -fno-optimize-sibling-calls, stack alignment/push/pops omitted for brevity):

    movabs rdi, 21474836483
    movabs rsi, 39743127552
    call   test

If "caller" calls test() twice, the code is a lot longer and not just twice as long. (Stack alignment/push/pops omitted for brevity):

    movabs rbp, 21474836483
    mov    rdi, rbp
    movabs rbx, 38654705664
    mov    rsi, rbx
    or     rbx, 1088421888
    or     rsi, 1088421888
    call   test
    mov    rsi, rbx
    mov    rdi, rbp
    call   test

If we change caller() such that the parameters in the two calls are not identical:

    void caller()
    {
        test( (struct guu){.a = 3, .b = 5, .c = 7, .d = 9} );
        test( (struct guu){.a = 3, .b = 6, .c = 7, .d = 10} );
    }

The generated code is optimal again as expected:

    movabs rdi, 21474836483
    movabs rsi, 39743127552
    call   test
    movabs rdi, 25769803779
    movabs rsi, 44038094848
    call   test

The problem in the first examples is that the compiler sees that the same parameter is used twice, and it tries to save it in a callee-saves register, in order to reuse the same values on the second call. However re-initializing the registers from scratch would have been more efficient.

The problem occurs on GCC versions 4.8.1 and newer. It does not occur in GCC version 4.7.4, which generated different code that is otherwise inefficient. For reference, the problem also exists in Clang versions 3.5 and newer, but not in versions 3.4 and earlier.
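The constants in the listings of this report can be reproduced by hand. The sketch below assumes the x86-64 System V convention, under which this 16-byte struct is passed as two 64-bit values in rdi and rsi, and assumes IEEE-754 single-precision floats with zeroed padding bytes. The helper names `float_bits`, `first_eightbyte` and `second_eightbyte` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>
#include <cstring>

struct guu { int a; int b; float c; char d; };

// Bit pattern of a float (assumes a 32-bit IEEE-754 representation).
inline std::uint32_t float_bits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Value loaded into rdi: a in bits 0..31, b in bits 32..63.
inline std::uint64_t first_eightbyte(const guu& g)
{
    return std::uint64_t(std::uint32_t(g.a))
         | (std::uint64_t(std::uint32_t(g.b)) << 32);
}

// Value loaded into rsi: bits of c in 0..31, d in bits 32..39,
// remaining padding bits assumed to be zero.
inline std::uint64_t second_eightbyte(const guu& g)
{
    return std::uint64_t(float_bits(g.c))
         | (std::uint64_t(static_cast<unsigned char>(g.d)) << 32);
}
```

For {3, 5, 7, 9} this gives 3 + 5·2^32 = 21474836483 and 0x40E00000 + 9·2^32 = 39743127552. In the two-call listing the second value is built in two pieces: 38654705664 is 9·2^32 and 1088421888 is 0x40E00000, the bit pattern of 7.0f.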
[Bug tree-optimization/63259] Detecting byteswap sequence
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63259 Joel Yliluoma changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #23 from Joel Yliluoma --- It would seem to be fixed since GCC 5.
[Bug c++/84556] New: C++17, lambda, OpenMP simd: sorry, unimplemented: unexpected AST
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84556 Bug ID: 84556 Summary: C++17, lambda, OpenMP simd: sorry, unimplemented: unexpected AST Product: gcc Version: 7.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- This code generates an AST error when compiled with -std=c++17 -fopenmp. void foo() { auto keymaker = [](void) { #pragma omp simd for(unsigned pos = 0; pos < 4; ++pos) { } }; } test.cc: In lambda function: test.cc:9:5: sorry, unimplemented: unexpected AST of kind omp_simd }; ^ test.cc:9: confused by earlier errors, bailing out Compiling without -fopenmp, or using an earlier standard mode such as -std=c++14 or -std=c++11, the error is not produced. Tested on: g++-7 (Debian 7.2.0-19) 7.2.0 Tested on: g++-7 (Debian 7.2.0-18) 7.2.0 Tested on: g++-7.1 (GCC) 7.1.0 Problem does NOT occur on: g++-6 (Debian 6.4.0-11) 6.4.0 20171206 Problem does NOT occur with #pragma omp parallel for.
[Bug lto/71536] New: lto1 ICE: func-static constant in openmp offloaded function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71536 Bug ID: 71536 Summary: lto1 ICE: func-static constant in openmp offloaded function Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- With this example program: #pragma omp declare target void Process() { static const int value = 12; } #pragma omp end declare target int main() { #pragma omp target { Process(); } } g++ crash.cc -fopenmp -O1 lto1: internal compiler error: Segmentation fault 0x93bcaf crash_signal ../../gcc/toplev.c:333 0x82d2eb input_offload_tables(bool) ../../gcc/lto-cgraph.c:1931 0x5c6590 read_cgraph_and_symbols ../../gcc/lto/lto.c:2858 0x5c6590 lto_main() ../../gcc/lto/lto.c:3304 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. mkoffload-intelmic: fatal error: x86_64-pc-linux-gnu-accel-x86_64-intelmicemul-linux-gnu-gcc returned 1 exit status compilation terminated. lto-wrapper: fatal error: /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0//accel/x86_64-intelmicemul-linux-gnu/mkoffload returned 1 exit status compilation terminated. /usr/bin/ld: error: lto-wrapper failed collect2: error: ld returned 1 exit status The main() can be deleted from the program, and the error still occurs. This error requires at least -O1 to trigger it. GCC version: 6.1.0
[Bug lto/71535] New: ICE in LTO1 with -fopenmp offloading
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71535 Bug ID: 71535 Summary: ICE in LTO1 with -fopenmp offloading Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- With this example program: #include void Process(unsigned* Target) { for(int s=0; s<4; ++s) Target[s] = 100u * std::min(255, std::max(0, 0)) + 200u * std::min(255, std::max(0, 0)); } int main() { #pragma omp target teams distribute parallel for for(unsigned y=0; y<16; ++y) { unsigned Line[16]; Process(Line); } } g++ tmps.cc -fopenmp lto1: internal compiler error: in input_overwrite_node, at lto-cgraph.c:1203 0x82f4d5 input_overwrite_node ../../gcc/lto-cgraph.c:1201 0x82f4d5 input_node ../../gcc/lto-cgraph.c:1296 0x82f4d5 input_cgraph_1 ../../gcc/lto-cgraph.c:1546 0x82f4d5 input_symtab() ../../gcc/lto-cgraph.c:1849 0x5c657b read_cgraph_and_symbols ../../gcc/lto/lto.c:2856 0x5c657b lto_main() ../../gcc/lto/lto.c:3304 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. mkoffload-intelmic: fatal error: x86_64-pc-linux-gnu-accel-x86_64-intelmicemul-linux-gnu-gcc returned 1 exit status compilation terminated. lto-wrapper: fatal error: /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0//accel/x86_64-intelmicemul-linux-gnu/mkoffload returned 1 exit status compilation terminated. /usr/bin/ld: error: lto-wrapper failed collect2: error: ld returned 1 exit status GCC version: 6.1.0 This bug is very likely related to PR71499.
[Bug lto/71499] ICE in LTO1 when attempting NVPTX offloading (-fopenacc)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71499 --- Comment #1 from Joel Yliluoma --- Addendum: While this works, reading LTO data and producing HOST code: /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto1 -dumpbase tmpe.o -auxbase tmpe -version -fopenacc tmpe.o -o /tmp/ccZgHvRO.s This does not: /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/accel/nvptx-none/lto1 -dumpbase tmpe.o -auxbase tmpe -version -fopenacc tmpe.o -o /tmp/ccZgHvRO.s GNU GIMPLE (GCC) version 6.1.0 (nvptx-none) compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 GNU GIMPLE (GCC) version 6.1.0 (nvptx-none) compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 options passed: -fopenacc tmpe.o options enabled: -faggressive-loop-optimizations -fauto-inc-dec -fchkp-check-incomplete-type -fchkp-check-read -fchkp-check-write -fchkp-instrument-calls -fchkp-narrow-bounds -fchkp-optimize -fchkp-store-bounds -fchkp-use-static-bounds -fchkp-use-static-const-bounds -fchkp-use-wrappers -fcommon -fdelete-null-pointer-checks -fearly-inlining -feliminate-unused-debug-types -ffunction-cse -fgcse-lm -fgnu-runtime -fgnu-unique -fident -finline-atomics -fipa-pta -fira-hoist-pressure -fira-share-save-slots -fira-share-spill-slots -fivopts -fkeep-static-consts -fleading-underscore -flifetime-dse -flto-odr-type-merging -fmath-errno -fmerge-debug-strings -fpeephole -fplt -fprefetch-loop-arrays -freg-struct-return -fsched-critical-path-heuristic -fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fschedule-fusion -fsemantic-interposition -fshow-column -fsigned-zeros -fsplit-ivs-in-unroller -fssa-backprop -fstdarg-opt 
-fstrict-volatile-bitfields -fsync-libcalls -ftoplevel-reorder -ftrapping-math -ftree-cselim -ftree-forwprop -ftree-loop-if-convert -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize -ftree-parallelize-loops= -ftree-phiprop -ftree-reassoc -ftree-scev-cprop -funit-at-a-time -fvar-tracking-assignments -fzero-initialized-in-bss -m64 Reading object files: tmpe.o {GC start 776k} Reading the callgraph lto1: internal compiler error: in input_overwrite_node, at lto-cgraph.c:1203 0x7a73c5 input_overwrite_node ../../gcc/lto-cgraph.c:1201 0x7a73c5 input_node ../../gcc/lto-cgraph.c:1296 0x7a73c5 input_cgraph_1 ../../gcc/lto-cgraph.c:1546 0x7a73c5 input_symtab() ../../gcc/lto-cgraph.c:1849 0x5537fb read_cgraph_and_symbols ../../gcc/lto/lto.c:2856 0x5537fb lto_main() ../../gcc/lto/lto.c:3304 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. I also tried using different combinations of --enable-languages=c,c++,lto , --enable-lto , including neither, but none affected the problem. I also tried using the svn version of gcc, but it also exhibited the same problem. The nvptx-newlib revision is aadc8eb0ec43b7cd0dd2dfb484bae63c8b05ef24 and nvptx-tools revision is c28050f60193b3b95a18866a96f03334e874e78f.
[Bug lto/71499] New: ICE in LTO1 when attempting NVPTX offloading (-fopenacc)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71499 Bug ID: 71499 Summary: ICE in LTO1 when attempting NVPTX offloading (-fopenacc) Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: lto Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- Summary: Error message: lto1: internal compiler error: in input_overwrite_node, at lto-cgraph.c:1203 On GCC 6.1.0 Compiling this code: void test() { } int main() { #pragma acc parallel test(); } With this commandline: gcc tmpe.c -O0 -fopenacc -v Complete output of GCC: Using built-in specs. COLLECT_GCC=/usr/local/bin/gcc COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none Target: x86_64-pc-linux-gnu Configured with: ../configure --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --enable-offload-targets=nvptx-none=/usr/local/nvptx-none --enable-languages=c,c++ --with-cuda-driver=/usr --disable-bootstrap Thread model: posix gcc version 6.1.0 (GCC) COLLECT_GCC_OPTIONS='-O0' '-fopenacc' '-v' '-mtune=generic' '-march=x86-64' '-pthread' /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/cc1 -quiet -v -imultiarch x86_64-linux-gnu -D_REENTRANT tmpe.c -quiet -dumpbase tmpe.c -mtune=generic -march=x86-64 -auxbase tmpe -O0 -version -fopenacc -o /tmp/ccPHnCW0.s GNU C11 (GCC) version 6.1.0 (x86_64-pc-linux-gnu) compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 ignoring nonexistent directory "/usr/local/include/x86_64-linux-gnu" ignoring nonexistent directory "/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../x86_64-pc-linux-gnu/include" #include "..." 
search starts here: #include <...> search starts here: /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include /usr/local/include /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include-fixed /usr/include/x86_64-linux-gnu /usr/include End of search list. GNU C11 (GCC) version 6.1.0 (x86_64-pc-linux-gnu) compiled by GNU C version 6.1.0, GMP version 6.0.0, MPFR version 3.1.4, MPC version 1.0.3, isl version 0.15 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 Compiler executable checksum: 1c46fde4e47f1157bf1461541c266a3c COLLECT_GCC_OPTIONS='-O0' '-fopenacc' '-v' '-mtune=generic' '-march=x86-64' '-pthread' as -v --64 -o /tmp/ccd60m4w.o /tmp/ccPHnCW0.s GNU assembler version 2.26 (x86_64-linux-gnu) using BFD version (GNU Binutils for Debian) 2.26 COMPILER_PATH=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/libexec/gcc/x86_64-pc-linux-gnu/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/ LIBRARY_PATH=/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../lib64/:/lib/x86_64-linux-gnu/:/lib/../lib64/:/usr/lib/x86_64-linux-gnu/:/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../:/lib/:/usr/lib/ Reading specs from /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../lib64/libgomp.spec COLLECT_GCC_OPTIONS='-O0' '-fopenacc' '-v' '-mtune=generic' '-march=x86-64' '-pthread' /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/collect2 -plugin /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/liblto_plugin.so -plugin-opt=/usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0/lto-wrapper -plugin-opt=-fresolution=/tmp/ccPr7Oc3.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lpthread -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 
/usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtbegin.o /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtoffloadbegin.o -L/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0 -L/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/../../.. /tmp/ccd60m4w.o -lgomp -lgcc --as-needed -lgcc_s --no-as-needed -lpthread -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtend.o /usr/lib/x86_64-linux-gnu/crtn.o /usr/local/lib/gcc/x86_64-pc-linux-gnu/6.1.0/crtoffloadend.o /usr/local/libexec/gcc/x86_64-pc-linux-gnu/6.1.0//accel/nvptx-none/mkoffload @/tmp/ccDURU7y /usr/local/bin/x86_64-pc-linux-gnu-accel-nvp
[Bug libstdc++/70411] Stack overflow with std::regex_match
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70411 --- Comment #1 from Joel Yliluoma --- Minimal regex that causes the same crash: "^0+ .*"
[Bug libstdc++/70411] New: Stack overflow with std::regex_match
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70411
Bug ID: 70411
Summary: Stack overflow with std::regex_match
Product: gcc
Version: 5.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: bisqwit at iki dot fi
Target Milestone: ---

When running this code, libstdc++ crashes with a stack overflow (segmentation fault) in std::regex_match. This regular expression is not the type that should require exponential backtracking.

Crash occurs in code compiled by GCC 5.3.1 on x86_64-linux-gnu. Clang++ produces the same crash when using libstdc++ from GCC. Code compiled by GCC 4.9 does _not_ produce a crash, as it evidently uses a different version of libstdc++.

    #include <regex>
    #include <string>

    std::string make_test_string()
    {
        std::string result = " 16777216 1 ";
        for(unsigned n=0; n<1; ++n) result += "EA NOP%";
        return result;
    }
    std::regex testregex("^([0-9A-F]+) +([0-9]+) +([0-9]+) (.*)$");
    int main()
    {
        std::string teststr = make_test_string();
        std::smatch res;
        std::regex_match(teststr, res, testregex);
    }
[Bug c++/67838] New: Rejects-valid-code: templated lambda variable.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67838 Bug ID: 67838 Summary: Rejects-valid-code: templated lambda variable. Product: gcc Version: 5.2.1 Status: UNCONFIRMED Severity: major Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- GCC 5.2 fails to compile the code below, erroneously citing "error: use of 'TestFunc' before deduction of 'auto'". Compiles fine in Clang 3.5. #include template static auto TestFunc = [](int param1) { return param1; }; template static void test(Func func) { printf("%d\n", func(12345)); } int main() { test(TestFunc); test(TestFunc); }
[Bug c++/67838] Rejects-valid-code: templated lambda variable.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67838 --- Comment #1 from Joel Yliluoma --- Note that the use of "template" here is to declare a parametric variable. It is not for the function's parameter list. It works the same way as in this expression: template int v = 3*LMode;
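As an illustration of the distinction drawn in this comment, here is a minimal C++14 sketch of a variable template next to a lambda-valued variable template. The archive dropped the original template parameter lists, so `<bool LMode>` and `<typename Func>` are assumptions (only the name LMode appears in the comment), and the lambda body is simplified for demonstration; the function `test` returns the value instead of printing it.

```cpp
// A variable template: one variable per template argument.
// The parameter list <bool LMode> is assumed, not taken from the report.
template<bool LMode>
int v = 3 * LMode;

// The same mechanism with a lambda as the initializer. Each
// specialization TestFunc<true> / TestFunc<false> is a distinct
// variable holding a distinct closure object.
template<bool LMode>
static auto TestFunc = [](int param1) { return LMode ? param1 : -param1; };

// An ordinary function template that accepts any callable.
template<typename Func>
static int test(Func func)
{
    return func(12345);
}
```

Here `template<...>` on `TestFunc` parameterizes the variable itself; the lambda's own parameter list stays `(int param1)` in every specialization.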
[Bug regression/67609] New: [Regression] Generates wrong code for SSE2 _mm_load_pd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609
Bug ID: 67609
Summary: [Regression] Generates wrong code for SSE2 _mm_load_pd
Product: gcc
Version: 5.2.1
Status: UNCONFIRMED
Severity: major
Priority: P3
Component: regression
Assignee: unassigned at gcc dot gnu.org
Reporter: bisqwit at iki dot fi
Target Milestone: ---

For this program (needs -msse2 to compile):

    #include <emmintrin.h>
    __m128d reg;
    void set_lower(double b)
    {
        double v[2];
        _mm_store_pd(v, reg);
        v[0] = b;
        reg = _mm_load_pd(v);
    }

On optimization levels -O1 and up, GCC 5.2 incorrectly generates code that destroys the upper half of reg.

    movapd %xmm0, %xmm1
    movaps %xmm1, reg(%rip)

On -O0, the bug does not occur. If the index expression is changed into an expression whose value is not known at compile-time, the code will work properly.

GCC 4.9 does this correctly (if with a bit too much labor):

    movdqa reg(%rip), %xmm1
    movaps %xmm1, -24(%rsp)
    movsd  %xmm0, -24(%rsp)
    movapd -24(%rsp), %xmm2
    movaps %xmm2, reg(%rip)

For comparison, Clang 3.4 and 3.5:

    movlpd %xmm0, reg(%rip)

For comparison, Clang 3.6:

    movaps reg(%rip), %xmm1
    movsd  %xmm0, %xmm1
    movaps %xmm1, reg(%rip)
[Bug regression/67609] [Regression] Generates wrong code for SSE2 _mm_load_pd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609 --- Comment #1 from Joel Yliluoma --- For the record, changing _mm_load_pd(v) into _mm_set_pd(v[1],v[0]) will coax GCC into generating correct code. The bug is related to _mm_load_pd().
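The workaround described in this comment might look as follows. This is a sketch for x86/x86-64 targets with SSE2 only; the helper `lane` and the `alignas(16)` qualifiers are additions for this illustration (the aligned store and load intrinsics require 16-byte-aligned memory), while `reg` and `set_lower` follow the names in the original report.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86 / x86-64 only)

__m128d reg;

// Replace the lower element of reg, rebuilding the vector with
// _mm_set_pd instead of reloading the array with _mm_load_pd.
// _mm_set_pd takes the high element first, then the low element.
inline void set_lower(double b)
{
    alignas(16) double v[2];
    _mm_store_pd(v, reg);
    v[0] = b;
    reg = _mm_set_pd(v[1], v[0]);
}

// Hypothetical helper for inspecting one element of reg (i is 0 or 1).
inline double lane(int i)
{
    alignas(16) double v[2];
    _mm_store_pd(v, reg);
    return v[i];
}
```

After `reg = _mm_set_pd(2.0, 1.0); set_lower(5.0);` the lower element should read 5.0 while the upper element keeps its value of 2.0, which is the behavior the miscompiled variant loses.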
[Bug regression/67609] [5/6 Regression] Generates wrong code for SSE2 _mm_load_pd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67609 Joel Yliluoma changed: What|Removed |Added Component|target |regression --- Comment #6 from Joel Yliluoma --- And also for _mm_load_ps in a similar situation. I did manage to get some error to occur with floats too, but I have yet to isolate the problem.
[Bug rtl-optimization/67577] New: Trivial float-vectorization foiled by a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67577 Bug ID: 67577 Summary: Trivial float-vectorization foiled by a loop Product: gcc Version: 5.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- This code is written as if tailored to be SIMD-optimized by GCC... But GCC somehow blows it. template struct vec { T d[N]; vec<T,N> operator* (const T& b) { vec<T,N> result; for(unsigned n=0u; n<N; ++n) result.d[n] = d[n] * b; return result; } vec<T,N> operator+ (const vec<T,N>& b) { vec<T,N> result; for(unsigned n=0u; n<N; ++n) result.d[n] = d[n] + b.d[n]; return result; } vec<T,N> operator- (const vec<T,N>& b) { vec<T,N> result; for(unsigned n=0u; n<N; ++n) result.d[n] = d[n] - b.d[n]; return result; } }; float scale; vec<float,8> a, b, c; void x() { for(int n=0; n<1; ++n) { vec<float,8> result = b + (a - b) * scale; c = result; } } Generated code (inner loop): movss b+4(%rip), %xmm6 movss a+4(%rip), %xmm7 subss %xmm6, %xmm7 movss scale(%rip), %xmm0 movss b+8(%rip), %xmm5 movss b+12(%rip), %xmm4 movss b+16(%rip), %xmm3 mulss %xmm0, %xmm7 movss b+20(%rip), %xmm1 movss b+24(%rip), %xmm2 movss b+28(%rip), %xmm9 movss b(%rip), %xmm8 addss %xmm6, %xmm7 movss a+8(%rip), %xmm6 subss %xmm5, %xmm6 movss %xmm7, c+4(%rip) mulss %xmm0, %xmm6 addss %xmm5, %xmm6 movss a+12(%rip), %xmm5 subss %xmm4, %xmm5 movss %xmm6, c+8(%rip) mulss %xmm0, %xmm5 addss %xmm4, %xmm5 movss a+16(%rip), %xmm4 subss %xmm3, %xmm4 movss %xmm5, c+12(%rip) mulss %xmm0, %xmm4 addss %xmm3, %xmm4 movss a+20(%rip), %xmm3 subss %xmm1, %xmm3 movss %xmm4, c+16(%rip) mulss %xmm0, %xmm3 addss %xmm1, %xmm3 movss a+24(%rip), %xmm1 subss %xmm2, %xmm1 movss %xmm3, c+20(%rip) mulss %xmm0, %xmm1 addss %xmm2, %xmm1 movss a+28(%rip), %xmm2 subss %xmm9, %xmm2 movss %xmm1, c+24(%rip) mulss %xmm0, %xmm2 addss %xmm9, %xmm2 movss a(%rip), %xmm9 subss %xmm8, %xmm9 movss %xmm2, c+28(%rip) mulss %xmm9, %xmm0 addss 
%xmm8, %xmm0 movss %xmm0, c(%rip)

Platform: amd64; GCC version 5.2.1.

If I comment away the dummy for-loop, or I change the float "scale" variable into a function parameter, the inner loop changes into much simpler code that vectorizes like I meant it to:

    movaps b(%rip), %xmm3
    movaps b+16(%rip), %xmm1
    movaps a+16(%rip), %xmm0
    movaps a(%rip), %xmm2
    subps  %xmm1, %xmm0
    movss  scale(%rip), %xmm4
    subps  %xmm3, %xmm2
    shufps $0, %xmm4, %xmm4
    mulps  %xmm4, %xmm0
    mulps  %xmm4, %xmm2
    addps  %xmm1, %xmm0
    addps  %xmm3, %xmm2
    movaps %xmm0, -24(%rsp)
    movq   -16(%rsp), %rax
    movaps %xmm2, -40(%rsp)
    movq   %xmm2, c(%rip)
    movq   %xmm0, c+16(%rip)
    movq   -32(%rsp), %rdx
    movq   %rax, c+24(%rip)
    movq   %rdx, c+8(%rip)

Although there's still some glitch in the generated code causing dummy memory transfers, at least it now did the calculations using packed registers.

If I change the global "scale" variable into a function parameter, the following shorter code is generated instead (essentially the same as what Clang successfully produces for all three cases).

    movaps b+16(%rip), %xmm2
    shufps $0, %xmm0, %xmm0
    movaps a+16(%rip), %xmm1
    subps  %xmm2, %xmm1
    movaps b(%rip), %xmm3
    mulps  %xmm0, %xmm1
    addps  %xmm2, %xmm1
    movaps a(%rip), %xmm2
    subps  %xmm3, %xmm2
    movaps %xmm1, c+16(%rip)
    mulps  %xmm2, %xmm0
    addps  %xmm3, %xmm0
    movaps %xmm0, c(%rip)

Something causes GCC's tree-vectorization to be really rickety and easily foiled by trivial changes in code, and I'd like to see it fixed at least in these particular cases.
[Bug rtl-optimization/67577] Trivial float-vectorization foiled by a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67577 --- Comment #1 from Joel Yliluoma --- It may also be worth mentioning that adding an explicit '#pragma omp simd' before each of those loops, inside the operator functions, will make sure that GCC at least does the mathematics using packed registers. The memory store apparently cannot be forced to occur without redundant temporaries, though.
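A sketch of what that change might look like is given below. The template parameter list `<typename T, unsigned N>` is an assumption (the archive dropped it), the operators are marked const so they can be used on const references, and the free function `blend` is a hypothetical stand-in for the expression in the original x(). The pragma only takes effect when the compiler is invoked with -fopenmp or -fopenmp-simd; otherwise it is ignored and the code still computes the same result.

```cpp
template<typename T, unsigned N>
struct vec
{
    T d[N];

    vec operator*(const T& b) const
    {
        vec result;
        #pragma omp simd
        for (unsigned n = 0u; n < N; ++n) result.d[n] = d[n] * b;
        return result;
    }
    vec operator+(const vec& b) const
    {
        vec result;
        #pragma omp simd
        for (unsigned n = 0u; n < N; ++n) result.d[n] = d[n] + b.d[n];
        return result;
    }
    vec operator-(const vec& b) const
    {
        vec result;
        #pragma omp simd
        for (unsigned n = 0u; n < N; ++n) result.d[n] = d[n] - b.d[n];
        return result;
    }
};

// Hypothetical wrapper around the expression from the report,
// with scale passed as a parameter rather than read from a global.
inline vec<float, 8> blend(const vec<float, 8>& a,
                           const vec<float, 8>& b,
                           float scale)
{
    return b + (a - b) * scale;
}
```

Whether the packed instructions actually appear depends on the compiler version and flags, so the generated assembly should be inspected rather than assumed.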
[Bug c++/67559] [C++] [regression] Passing non-trivially copyable objects through '...' doesn't generate warning or error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67559 --- Comment #2 from Joel Yliluoma --- But when compiling for earlier standard versions that explicitly label this as undefined behavior, it should at least give a warning.
[Bug c++/67561] New: [C++14] ICE in tsubst_copy (nested auto lambdas may be involved)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67561 Bug ID: 67561 Summary: [C++14] ICE in tsubst_copy (nested auto lambdas may be involved) Product: gcc Version: 5.2.1 Status: UNCONFIRMED Severity: major Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: --- When compiling this program, struct Polygon {}; template struct View {}; template void RenderLight(PixType, Params&&... pack) { View tmp; DrawView(pack..., tmp); } template static void DrawPolygon(Plotter plot_pixel) { plot_pixel(1,1,1, [&](unsigned n) { return n; }); } template static void DrawView(PlotFunc&& GetPlotFunc, View& view) { DrawPolygon(GetPlotFunc(view, Polygon{}, false)); } static auto LightmapRenderer = [](unsigned round) { return [round](auto& view, const Polygon& polygon, bool) { return [=,,](unsigned x,unsigned y, float z, auto&& prop) { if(true) { if(round == 1) {} } else { if(round > 1) {} } }; }; }; void CalculateLightmap() { RenderLight( 0, LightmapRenderer(1) ); } The following error is produced: testv.cc: In instantiation of '<lambda(unsigned int)>::<lambda(auto:1&, const Polygon&, bool)>::<lambda(unsigned int, unsigned int, float, auto:2&&)> [with auto:2 = DrawPolygon(Plotter) [with Plotter = <lambda(unsigned int)>::<lambda(auto:1&, const Polygon&, bool)> [with auto:1 = View]::<lambda(unsigned int, unsigned int, float, auto:2&&)>]::<lambda(unsigned int)>; auto:1 = View]': testv.cc:16:15: required from 'void DrawPolygon(Plotter) [with Plotter = <lambda(unsigned int)>::<lambda(auto:1&, const Polygon&, bool)> [with auto:1 = View]::<lambda(unsigned int, unsigned int, float, auto:2&&)>]' testv.cc:23:16: required from 'void DrawView(PlotFunc&&, View&) [with View = View; PlotFunc = <lambda(unsigned int)>::<lambda(auto:1&, const Polygon&, bool)>&]' testv.cc:10:13: required from 'void RenderLight(PixType, Params&& ...) 
[with PixType = int; Params = {<lambda(unsigned int)>::<lambda(auto:1&, const Polygon&, bool)>}]' testv.cc:40:41: required from here testv.cc:29:5: internal compiler error: in tsubst_copy, at cp/pt.c:12997 { ^ 0x632bbe tsubst_copy ../../src/gcc/cp/pt.c:12997 0x623a08 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool) ../../src/gcc/cp/pt.c:15740 0x6246c8 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool) ../../src/gcc/cp/pt.c:14771 0x6242d1 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool) ../../src/gcc/cp/pt.c:15522 0x62c002 tsubst_expr ../../src/gcc/cp/pt.c:14552 0x63492d tsubst_decl ../../src/gcc/cp/pt.c:11500 0x632a7a tsubst_copy ../../src/gcc/cp/pt.c:13127 0x623a08 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool) ../../src/gcc/cp/pt.c:15740 0x624ae7 tsubst_copy_and_build(tree_node*, tree_node*, int, tree_node*, bool, bool) ../../src/gcc/cp/pt.c:14930 0x62c002 tsubst_expr ../../src/gcc/cp/pt.c:14552 0x62b3c5 tsubst_expr ../../src/gcc/cp/pt.c:14113 0x62bf4c tsubst_expr ../../src/gcc/cp/pt.c:14135 0x62b3e5 tsubst_expr ../../src/gcc/cp/pt.c:14115 0x62bf4c tsubst_expr ../../src/gcc/cp/pt.c:14135 0x62b8f4 tsubst_expr ../../src/gcc/cp/pt.c:13949 0x62bf4c tsubst_expr ../../src/gcc/cp/pt.c:14135 0x62acb7 instantiate_decl(tree_node*, int, bool) ../../src/gcc/cp/pt.c:20582 0x659f62 mark_used(tree_node*, int) ../../src/gcc/cp/decl2.c:5035 0x5f9800 build_over_call ../../src/gcc/cp/call.c:7501 0x5fbf31 build_op_call_1 ../../src/gcc/cp/call.c:4345 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See for instructions. Compiler version 5.2.1, on Debian x86_64. More information (g++ -v): Using built-in specs. 
COLLECT_GCC=g++ COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper Target: x86_64-linux-gnu Configured with: ../src/configure -v --with-pkgversion='Debian 5.2.1-16' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libex
[Bug c++/67561] [C++14] ICE in tsubst_copy (nested auto lambdas may be involved)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67561

Joel Yliluoma changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #2 from Joel Yliluoma ---
Looks like a duplicate of PR67411.

*** This bug has been marked as a duplicate of bug 67411 ***
[Bug c++/67559] New: [C++] [regression] Passing non-trivially copyable objects through '...' doesn't generate warning or error
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67559 Bug ID: 67559 Summary: [C++] [regression] Passing non-trivially copyable objects through '...' doesn't generate warning or error Product: gcc Version: 5.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: ---

In GCC 4.9, this code generates an error. In GCC 5.2, it generates no warning or error, even with -Wall -Wextra -pedantic.

    struct test { test(){} ~test(){} };
    void a(int, ...) {}
    int main() { test object; a(5, object); }

Tried different standards modes: -std=c++98, -std=c++03, -std=c++11, -std=c++14
Tried also lambda functions with variadic args, same result.

The error message in GCC 4.9 (and earlier, down to 4.6) was:

    cannot pass objects of non-trivially-copyable type 'struct test' through '...'

In GCC 5.2, no error or warning message is given in any of the standard modes.

In C++03, this behavior is undefined (§5.2.2/7). In C++11, it is conditionally supported with implementation-defined semantics.
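For readers who want to see which types fall under the diagnostic discussed above, a compile-time check along these lines may help. This is only a sketch: `test` mirrors the struct from the report, while `trivial_pod` and `passes_cleanly_through_varargs` are hypothetical names, and `std::is_trivially_copyable` is used as an approximation of the rule (it is slightly stricter than the wording in the C++11 standard).

```cpp
#include <cassert>
#include <type_traits>

// Mirrors the struct from the report: the user-provided destructor makes
// the type non-trivially-copyable.
struct test { test() {} ~test() {} };

// Hypothetical counterexample with no user-provided special members.
struct trivial_pod { int value; };

// Hypothetical helper: true for types that can be passed through '...'
// without relying on conditionally-supported behavior (approximation).
template<typename T>
constexpr bool passes_cleanly_through_varargs()
{
    return std::is_trivially_copyable<T>::value;
}

static_assert(!passes_cleanly_through_varargs<test>(),
              "test has a non-trivial destructor");
static_assert(passes_cleanly_through_varargs<trivial_pod>(),
              "trivial_pod is trivially copyable");
```

A variadic wrapper could place such a `static_assert` in front of the call to `a(5, object)` to get a hard error back on compilers that stay silent.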
[Bug c++/67411] [5/6 Regression] internal compiler error: in tsubst_copy, at cp/pt.c:13473
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67411

Joel Yliluoma changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bisqwit at iki dot fi

--- Comment #2 from Joel Yliluoma ---
*** Bug 67561 has been marked as a duplicate of this bug. ***
[Bug c++/67561] [C++14] ICE in tsubst_copy (nested auto lambdas may be involved)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67561 --- Comment #1 from Joel Yliluoma ---
Further reduced example:

    template<typename PlotFunc>
    void DrawView(PlotFunc GetPlotFunc) { GetPlotFunc(1)(2); }

    void CalculateLightmap()
    {
        auto LightmapRenderer = [](unsigned round)
        {
            return [round](const auto& view)
            {
                return [=](auto prop) { round + 0; };
            };
        };
        DrawView(LightmapRenderer(0));
    }

Replacing the [=] with [] or [&] retains the error. Replacing it with [round] removes the error.
[Bug c++/67558] New: [C++] OpenMP "if" clause does not utilize compile-time constants
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67558 Bug ID: 67558 Summary: [C++] OpenMP "if" clause does not utilize compile-time constants Product: gcc Version: 5.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: ---

Consider this example code.

    unsigned x;

    template<bool Threads>
    void plain_if(unsigned y)
    {
        if(Threads)
        {
            #pragma omp task firstprivate(y) shared(x)
            x = y >> 1;
        }
        else
        {
            x = y >> 1;
        }
    }

    template<bool Threads>
    void omp_if(unsigned y)
    {
        #pragma omp task if(Threads) firstprivate(y) shared(x)
        x = y >> 1;
    }

    void plain_if_false(unsigned y) { plain_if<false>(y); }
    void plain_if_true(unsigned y)  { plain_if<true>(y); }
    void omp_if_false(unsigned y)   { omp_if<false>(y); }
    void omp_if_true(unsigned y)    { omp_if<true>(y); }

plain_if and omp_if do essentially the same thing. In both of them, the template parameter "Threads" controls whether to create an OpenMP task for the action or not.

However, when the code is compiled, all functions explicitly call GOMP_task, except plain_if. It is clear that GCC treats a plain if() differently than an OpenMP if(). It is a case of a missed optimization.
[Bug c++/66644] Rejects C++11 in-class anonymous union members initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66644 --- Comment #1 from Joel Yliluoma bisqwit at iki dot fi --- The last code piece should have test2{0,0}; there. Something ate a couple of characters off the end of that line.
[Bug c++/66644] New: Rejects C++11 in-class anonymous union members initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66644 Bug ID: 66644 Summary: Rejects C++11 in-class anonymous union members initialization Product: gcc Version: 5.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi Target Milestone: ---

Accepted by GCC:

    struct test { union { struct { char a=0, b; }; char buffer[16]; }; };

NOT accepted by GCC (multiple fields in union 'test::<anonymous union>' initialized):

    struct test { union { struct { char a=0, b=0; }; char buffer[16]; }; };

Still accepted by GCC:

    struct test { union { struct { char a, b; } test2{0, char buffer[16]; }; };

I think there's a compiler bug here. It should not complain about initializing multiple fields in a struct that is nested inside the union, because this does not constitute a conflict.

Tested on GCC versions 4.7.4, 4.8.4, 4.9.2, and 5.1.1.
[Bug rtl-optimization/63259] New: Detecting byteswap sequence
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63259 Bug ID: 63259 Summary: Detecting byteswap sequence Product: gcc Version: 4.9.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi

This is just silly. GCC optimizes the first function into a single opcode (bswap), but not the other. For Clang, it's the other way around.

    unsigned byteswap_gcc(unsigned result)
    {
        result = ((result & 0xFFFF0000u) >> 16) | ((result & 0x0000FFFFu) << 16);
        result = ((result & 0xFF00FF00u) >>  8) | ((result & 0x00FF00FFu) <<  8);
        return result;
    }
    unsigned byteswap_clang(unsigned result)
    {
        result = ((result & 0xFF00FF00u) >>  8) | ((result & 0x00FF00FFu) <<  8);
        result = ((result & 0xFFFF0000u) >> 16) | ((result & 0x0000FFFFu) << 16);
        return result;
    }
    unsigned byteswap(unsigned v)
    {
    #ifdef __clang__
        return byteswap_clang(v);
    #else
        return byteswap_gcc(v);
    #endif
    }

GCC output:

    byteswap_gcc:
            movl    %edi, %eax
            bswap   %eax
            ret
    byteswap_clang:
            movl    %edi, %eax
            andl    $-16711936, %eax
            shrl    $8, %eax
            movl    %eax, %edx
            movl    %edi, %eax
            andl    $16711935, %eax
            sall    $8, %eax
            orl     %edx, %eax
            roll    $16, %eax
            ret
    byteswap:
            movl    %edi, %eax
            bswap   %eax
            ret

Clang output:

    byteswap_gcc:                           # @byteswap_gcc
            roll    $16, %edi
            movl    %edi, %eax
            shrl    $8, %eax
            andl    $16711935, %eax         # imm = 0xFF00FF
            shll    $8, %edi
            andl    $-16711936, %edi        # imm = 0xFF00FF00
            orl     %eax, %edi
            movl    %edi, %eax
            retq
    byteswap_clang:                         # @byteswap_clang
            bswapl  %edi
            movl    %edi, %eax
            retq
    byteswap:                               # @byteswap
            bswapl  %edi
            movl    %edi, %eax
            retq

Tested both -m32 and -m64, with options: -Ofast -S

Tested versions:
- gcc (Debian 4.9.1-11) 4.9.1 Target: x86_64-linux-gnu
- Debian clang version 3.5.0-+rc1-2 (tags/RELEASE_35/rc1) (based on LLVM 3.5.0) Target: x86_64-pc-linux-gnu
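To confirm that the two orderings in this report really are the same byteswap, they can be compared against the compiler builtin. The following is a minimal sketch, assuming a GCC- or Clang-compatible compiler (for `__builtin_bswap32`) and the conventional masks for a 32-bit byteswap; the function names are hypothetical and chosen to describe the ordering rather than the compiler.

```cpp
#include <cassert>
#include <cstdint>

// Swap the 16-bit halves first, then the bytes within each half.
inline std::uint32_t byteswap_halves_first(std::uint32_t v)
{
    v = ((v & 0xFFFF0000u) >> 16) | ((v & 0x0000FFFFu) << 16);
    v = ((v & 0xFF00FF00u) >>  8) | ((v & 0x00FF00FFu) <<  8);
    return v;
}

// Swap the bytes within each half first, then the 16-bit halves.
inline std::uint32_t byteswap_bytes_first(std::uint32_t v)
{
    v = ((v & 0xFF00FF00u) >>  8) | ((v & 0x00FF00FFu) <<  8);
    v = ((v & 0xFFFF0000u) >> 16) | ((v & 0x0000FFFFu) << 16);
    return v;
}

// True if both orderings agree with the compiler builtin for x.
inline bool orderings_agree(std::uint32_t x)
{
    return byteswap_halves_first(x) == __builtin_bswap32(x)
        && byteswap_bytes_first(x)  == __builtin_bswap32(x);
}
```

Since both functions compute the same value for every input, a compiler that recognizes one form as bswap could in principle recognize the other as well.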
[Bug c++/61323] New: 'static' and 'const' attributes cause non-type template argument matching failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61323 Bug ID: 61323 Summary: 'static' and 'const' attributes cause non-type template argument matching failure Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bisqwit at iki dot fi

    // Works:
    char* table1[10];
    template<unsigned size, char*(&table)[size]> void test1() { }
    void tester1() { test1<10,table1>(); }

    // Doesn't work:
    static char* table2[10];
    template<unsigned size, char*(&table)[size]> void test2() { }
    void tester2() { test2<10,table2>(); }
    // error: 'table2' cannot appear in a constant-expression

    // Works:
    const char* table3[10];
    template<unsigned size, const char*(&table)[size]> void test3() { }
    void tester3() { test3<10,table3>(); }

    // Doesn't work:
    const char* const table4[10] = {};
    template<unsigned size, const char*const (&table)[size]> void test4() { }
    void tester4() { test4<10,table4>(); }
    // error: 'table4' cannot appear in a constant-expression

    // Works:
    const char* volatile table5[10] = {};
    template<unsigned size, const char* volatile (&table)[size]> void test5() { }
    void tester5() { test5<10,table5>(); }

    // Doesn't work:
    const char* const table6[10] = {};
    template<unsigned size, const char*const (&table)[size]> void test6() { }
    void tester6() { test6<10,table6>(); }
    // error: 'table6' cannot appear in a constant-expression

-- Compiler versions tested:

    g++-4.4 (Debian 4.4.7-7) 4.4.7
    g++-4.5 (Debian 4.5.3-12) 4.5.3
    g++-4.6 (Debian 4.6.4-7) 4.6.4
    g++-4.7 (Debian 4.7.3-13) 4.7.3
    g++-4.8 (Debian 4.8.2-21) 4.8.2
    g++-4.9 (Debian 4.9.0-2) 4.9.0

Giving -std=c++11 did not make a difference.

-- Also tested: CLANG++ 3.5

CLANG++ gives the diagnostic message "non-type template argument referring to object 'table2' with internal linkage is a C++11 extension" in all those cases where GCC failed, when compiled without -std=c++11.

Compiling with -std=c++11 or even with -std=c++1y did not work on GCC.
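For comparison, here is what the two failing declaration styles look like when the template arguments are actually used. This is only a sketch, assuming a compiler that implements the C++11 rule allowing objects with internal linkage as non-type template arguments; `count_entries` and `first_entry` are hypothetical names, and the tables mirror `table2` and `table4` from the report.

```cpp
#include <cassert>

static char* table2[10];             // internal linkage through 'static'
const char* const table4[10] = {};   // internal linkage through namespace-scope 'const'

// Returns the array bound that was passed as a template argument.
template<unsigned size, char* (&table)[size]>
unsigned count_entries() { return size; }

// Returns the first element of the referenced array.
template<unsigned size, const char* const (&table)[size]>
const char* first_entry() { return table[0]; }
```

With a conforming C++11 compiler, `count_entries<10, table2>()` should yield 10 and `first_entry<10, table4>()` should yield a null pointer, since `table4` is zero-initialized.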
[Bug c++/61323] 'static' and 'const' attributes cause non-type template argument matching failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61323 --- Comment #1 from Joel Yliluoma <bisqwit at iki dot fi> ---
Interestingly enough, only if you add the term "constexpr" to the array declaration do you get an actually meaningful error message:

    constexpr const char* table7[10] = {};
    template<unsigned size, const char*const (&table)[size]> void test7() { }
    void tester7() { test7<10,table7>(); }

Produces:

    test.cc:42:35: error: 'table7' is not a valid template argument for type 'const char* const (&)[10]' because object 'table7' has not external linkage
     void tester7() { test7<10,table7>(); }

The problem is that this very setting (an object without external linkage as a template argument) should be allowed by C++11 -- the same standard that also gives us constexpr.
[Bug rtl-optimization/58195] Missed optimization opportunity when returning a conditional
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58195

Joel Yliluoma <bisqwit at iki dot fi> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bisqwit at iki dot fi

--- Comment #1 from Joel Yliluoma <bisqwit at iki dot fi> ---
Problem confirmed on gcc (GCC) 4.9.0 20140303 (experimental) (SVN version) in both 32-bit and 64-bit mode using -Ofast.

For comparison, Clang++ produces this instead (even on -O1):

            negl    %edi
            movl    %edi, %eax
            ret

GCC misses an optimization opportunity here.
[Bug c++/56794] New: C++11 Error in range-based for with parameter pack array
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56794 Bug #: 56794 Summary: C++11 Error in range-based for with parameter pack array Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: bisq...@iki.fi

G++ 4.7.2 and 4.8.0 give the following error message for the for-loop in the code below:

    tmp.cc:10:17: error: range-based 'for' expression of type 'const int []' has incomplete type

On G++ 4.6.3 (and Clang++), it compiles fine. Regression?

    template<int... values>
    static void Colors()
    {
        static const int colors[] = { values... };
        // ^ This version passes in G++ 4.6 and Clang++ 3.0, fails in G++ 4.7 and 4.8

        //static const int colors[sizeof...(values)] = { values... };
        // ^ This passes in all of them

        for(auto c: colors) { }
        // ^ This line is the one that gets the error message
    }
    int main() { Colors<0,1,2> (); }
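The commented-out line in the report, with the explicit array bound, can be turned into a small runnable workaround. The sketch below is an illustration only; `SumColors` is a hypothetical name, and the summing loop is there just to give the range-based for something observable to do.

```cpp
#include <cassert>

// Workaround variant: the array bound is spelled out with sizeof...(values),
// so the array type is complete when the range-based for is processed.
template<int... values>
int SumColors()
{
    static const int colors[sizeof...(values)] = { values... };
    int sum = 0;
    for (auto c : colors) sum += c;
    return sum;
}
```

For example, `SumColors<0, 1, 2>()` returns 3. Note that this form requires a non-empty parameter pack, because a zero-length array is not valid standard C++.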
[Bug c++/55250] New: [C++0x][constexpr] enum declarations within constexpr function are allowed, constexpr declarations are not
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55250 Bug #: 55250 Summary: [C++0x][constexpr] enum declarations within constexpr function are allowed, constexpr declarations are not Classification: Unclassified Product: gcc Version: 4.7.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: bisq...@iki.fi

The following code compiles in GCC without warnings on -Wall -W -pedantic:

    constexpr int Test1(int x) { enum { y = 1 }; return x+y; }

The following one does not:

    constexpr int Test2(int x) { constexpr int y = 1; return x+y; }

For the second code, GCC gives:

    error: body of constexpr function 'constexpr int Test2(int)' not a return-statement

In comparison, Clang++ gives an error for Test1:

    error: types cannot be defined in a constexpr function

and for Test2:

    error: variables cannot be declared in a constexpr function

Now, reading http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2235.pdf , it is not entirely unambiguous which behavior is correct. While I would prefer that both samples work without warnings, I suggest that declaring an enum within a constexpr function be made a -pedantic warning.

[Tested on GCC 4.6.3 through 4.7.2. On GCC 4.5.3, both functions compiled without warnings.]
[Bug c++/55239] New: Spurious unused variable warning on function-local objects with a destructor and an initializer
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55239 Bug #: 55239 Summary: Spurious unused variable warning on function-local objects with a destructor and an initializer Classification: Unclassified Product: gcc Version: 4.7.1 Status: UNCONFIRMED Severity: minor Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: bisq...@iki.fi

In the code below, a function-local object is declared with a destructor whose role is to ensure that some action is taken at the end of the scope, no matter by which route the function is exited.

    #include <stdio.h>
    void LoadSomeFile(const char* fn)
    {
        /* Open file */
        FILE* fp = fopen(fn, "rb");
        /* Ensure that the file is automatically closed
           no matter which path this function is exited */
        struct closer { FILE* f; ~closer() { if(f) fclose(f); } } autoclosefp = {fp};
        /* Some code here that deals with fp,
           and may include several return; clauses */
    }
    int main() { LoadSomeFile(__FILE__); } // test

Bug: GCC gives a spurious "unused variable 'autoclosefp'" warning for this code, implying that autoclosefp has no function. It does. Without it, the file would not be closed and resources would be leaked.

The problem also occurs when the code is rewritten like this:

    #include <stdio.h>
    void LoadSomeFile(const char* fn)
    {
        /* Open file */
        FILE* fp = fopen(fn, "rb");
        /* Ensure that the file is automatically closed
           no matter which path this function is exited */
        struct closer { FILE* f; ~closer() { if(f) fclose(f); } };
        closer autoclosefp = {fp};
        /* Some code here that deals with fp,
           and may include several return; clauses */
    }
    int main() { LoadSomeFile(__FILE__); } // test

Changing the = {fp} into C++11 style {fp} does not take away the warning, either. Only changing the initialization-by-initializer into a member assignment takes away the warning.

    #include <stdio.h>
    void LoadSomeFile(const char* fn)
    {
        /* Open file */
        FILE* fp = fopen(fn, "rb");
        /* Ensure that the file is automatically closed
           no matter which path this function is exited */
        struct closer { FILE* f; ~closer() { if(f) fclose(f); } } autoclosefp;
        autoclosefp.f = fp;
        /* Some code here that deals with fp,
           and may include several return; clauses */
    }
    int main() { LoadSomeFile(__FILE__); } // test

I would argue that this is inconvenient, and wrong behavior on GCC's part.

Tested and verified on GCC 3.3 through 4.7.1. The -Wunused-variable (or -Wall) option is required.
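To make the point that `autoclosefp` does real work observable, the guard can be instrumented with a counter. The following is a sketch under a few assumptions: the `bail_out_early` parameter, the `bool` return value and the `g_files_closed` counter are additions for illustration and are not part of the original report, and the `(void)` cast is one common way to silence the warning on affected compilers.

```cpp
#include <cassert>
#include <cstdio>

// Counts how many files the guard has closed (illustration only).
static int g_files_closed = 0;

// Returns false if the file could not be opened, true otherwise.
bool LoadSomeFile(const char* fn, bool bail_out_early)
{
    std::FILE* fp = std::fopen(fn, "rb");

    // The guard from the report: closes the file on every exit path.
    struct closer
    {
        std::FILE* f;
        ~closer() { if (f) { std::fclose(f); ++g_files_closed; } }
    } autoclosefp = { fp };
    (void)autoclosefp;                  // marks the object as intentionally unused

    if (!fp) return false;              // exit path 1: nothing to close
    if (bail_out_early) return true;    // exit path 2: closed by the guard
    (void)std::fgetc(fp);               // stand-in for real work on fp
    return true;                        // exit path 3: closed by the guard
}
```

Each successful call increments `g_files_closed` exactly once, regardless of which return statement is taken, which is precisely the effect the warning claims the variable does not have.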
[Bug c++/55240] New: [c++0x] ICE on non-static data member initialization using 'auto' variable from containing function
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55240 Bug #: 55240 Summary: [c++0x] ICE on non-static data member initialization using 'auto' variable from containing function Classification: Unclassified Product: gcc Version: 4.7.2 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: bisq...@iki.fi

This code causes an ICE in GCC 4.7.1 and 4.7.2:

    int main()
    {
        int q = 1;
        struct test { int x = q; } instance;
    }

    tmpq.cc: In constructor 'constexpr main()::test::test()':
    tmpq.cc:4:12: internal compiler error: in expand_expr_real_1, at expr.c:9122

It is notable that if the code is written like this, the error message changes.

    int main()
    {
        int q = 1;
        struct test { int x; test():x(q){} } instance;
    }

    tmpq.cc:5:35: error: use of 'auto' variable from containing function
    tmpq.cc:3:9: error: 'int q' declared here
[Bug c++/55239] Spurious unused variable warning on function-local objects with a destructor and an initializer
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55239 --- Comment #5 from Joel Yliluoma bisqwit at iki dot fi 2012-11-08 15:16:48 UTC --- Nice. I had no idea this was first reported in 2003 and fixed in 2012 in a version recent enough to be still unreleased :)
[Bug c++/54946] New: ICE on template parameter from cast char-pointer in C++11 constexpr struct
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54946 Bug #: 54946 Summary: ICE on template parameter from cast char-pointer in C++11 constexpr struct Classification: Unclassified Product: gcc Version: 4.7.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: bisq...@iki.fi

The following C++11 code causes an ICE.

    template<conts char*s> static void testfunc();
    constexpr struct testtype { const char* str; } test = { "abc" };
    void (*functionpointer)() = testfunc<(const char*) test.str>;

Sample GCC commandline to invoke the error:

    g++ test1.cc -std=c++0x

Error message on g++-4.7.1 (Debian 4.7.1-7 on x86_64):

    test1.cc:5:29: internal compiler error: in convert_nontype_argument, at cp/pt.c:5794

Error message on g++-4.6.3 (Debian 4.6.3-11 on x86_64):

    test1.cc:5:29: internal compiler error: in convert_nontype_argument, at cp/pt.c:5430

I do not know whether the code is valid.

Things that do not affect the error:
- Adding / removing const at any point or changing pointers into arrays at any point
- Changing functionpointer into an array of function pointers
- Any code generation related options (such as -m32 or optimization levels)

Things that do hide the error:
- Removing the (const char*) cast entirely
- Changing the string pointers into integers
- Removing the struct encapsulation from str (making it constexpr const char* str = "abc"; and removing test. from the third line)
[Bug c++/54946] ICE on template parameter from cast char-pointer in C++11 constexpr struct
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54946 --- Comment #1 from Joel Yliluoma bisqwit at iki dot fi 2012-10-17 12:10:21 UTC --- Please excuse the conts typo in the post; naturally it meant to say const there. The typo is not relevant to the bug report. I changed the code a few times trying to figure out what triggers the error and what does not, and the version I copypasted was not a compiled one.
[Bug libstdc++/53630] New: C+11 regex compiler produces SIGSEGV
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53630 Bug #: 53630 Summary: C+11 regex compiler produces SIGSEGV Classification: Unclassified Product: gcc Version: 4.7.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: libstdc++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: bisq...@iki.fi

This simple code produces a segmentation fault. Tested in GCC 4.7.0, GCC 4.6.3, and Clang 3.1 (where the latter uses libstdc++ from GCC 4.7).

    #include <regex>
    int main()
    {
        std::regex r("(go |)((n(orth|))|(s(outh|))|(w(est|))|(e(ast|)))", std::regex::extended);
        return 0;
    }

Omitting the std::regex::extended option does not make a difference. Replacing all of the "|)" with ")" makes it compile, but obviously with a completely different expression.

As of now, libstdc++ does not yet support the '?' operator, so the expression cannot be rewritten as "(go )?((n(orth)?)|(s(outh)?)|(w(est)?)|(e(ast)?))". There is also no non-capturing grouping operator, so writing e.g. "(n(?:orth|))" is not an option.

A minimal regexp that duplicates the crash is: "((a(b|))|x)". Simple reorderings such as "((a(|b))|x)" or "(x|(a(|b)))" do not make a difference.

GDB backtrace below: (gdb) bt #0 0x7732cdbd in malloc_consolidate (av=0x77639e60) at malloc.c:5169 #1 0x7732f2a4 in _int_malloc (av=0x77639e60, bytes=1280) at malloc.c:4373 #2 0x77331960 in *__GI___libc_malloc (bytes=1280) at malloc.c:3660 #3 0x77b39e6d in operator new(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #4 0x00404ce4 in __gnu_cxx::new_allocatorstd::__regex::_State::allocate (this=0x7fffe980, __n=16) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/ext/new_allocator.h:94 #5 0x004046b0 in std::_Vector_basestd::__regex::_State, std::allocatorstd::__regex::_State ::_M_allocate (this=0x7fffe980, __n=16) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/stl_vector.h:169 #6 0x00404338 in _ZNSt6vectorINSt7__regex6_StateESaIS1_EE19_M_emplace_back_auxIJS1_EEEvDpOT_ (this=0x7fffe980, __args=0x7fffe010) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/vector.tcc:402 #7 0x00404270 in _ZNSt6vectorINSt7__regex6_StateESaIS1_EE12emplace_backIJS1_EEEvDpOT_ (this=0x7fffe980, __args=0x7fffe010) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/vector.tcc:102 #8 0x00403690 in std::vectorstd::__regex::_State, std::allocatorstd::__regex::_State ::push_back(std::__regex::_State) ( this=0x7fffe980, __x=0x7fffe010) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/stl_vector.h:900 #9 0x00403052 in std::__regex::_Nfa::_M_insert_subexpr_begin(std::functionvoid (std::__regex::_PatternCursor const, std::__regex::_Results) const) (this=0x7fffe978, __t=...) 
at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_nfa.h:312 #10 0x0040848f in std::__regex::_Compilerchar const*, std::regex_traitschar ::_M_atom (this=0x7fffe928) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:943 #11 0x00407b98 in std::__regex::_Compilerchar const*, std::regex_traitschar ::_M_term (this=0x7fffe928) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:793 #12 0x00405be9 in std::__regex::_Compilerchar const*, std::regex_traitschar ::_M_alternative (this=0x7fffe928) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:771 #13 0x00403119 in std::__regex::_Compilerchar const*, std::regex_traitschar ::_M_disjunction (this=0x7fffe928) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:756 #14 0x004084d5 in std::__regex::_Compilerchar const*, std::regex_traitschar ::_M_atom (this=0x7fffe928) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:945 #15 0x00407b98 in std::__regex::_Compilerchar const*, std::regex_traitschar ::_M_term (this=0x7fffe928) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:793 #16 0x00405be9 in std::__regex::_Compilerchar const*, std::regex_traitschar ::_M_alternative (this=0x7fffe928) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:771 #17 0x00405c2f in std::__regex::_Compilerchar const*, std::regex_traitschar ::_M_alternative (this=0x7fffe928) at /usr/bin/../lib/gcc/x86_64-linux-gnu/4.7/../../../../include/c++/4.7/bits/regex_compiler.h:774 #18 0x00403119 in std::__regex::_Compilerchar const*, std::regex_traitschar ::_M_disjunction
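For reference, the rewritten form of the expression mentioned in this report (with '?' and non-capturing groups) might look as follows on a standard library whose std::regex fully implements the default ECMAScript grammar. This is a sketch under that assumption and does not apply to the libstdc++ versions discussed above; `is_direction_command` is a hypothetical helper name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Matches an optional "go " prefix followed by a compass direction,
// either abbreviated (n, s, w, e) or spelled out in full.
// Assumes a std::regex implementation with full ECMAScript support.
inline bool is_direction_command(const std::string& input)
{
    static const std::regex r(
        "(go )?(?:n(?:orth)?|s(?:outh)?|w(?:est)?|e(?:ast)?)");
    return std::regex_match(input, r);
}
```

With such an implementation, inputs like "go north", "n" and "go w" match, while "gonorth" and "up" do not.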
[Bug c++/50276] [C++0x] Wrong used uninitialized in this function warning
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50276 --- Comment #5 from Joel Yliluoma <bisqwit at iki dot fi> 2012-01-03 23:16:07 UTC ---
It also accepts this code without complaints, which is another error:

    template<int i>
    bool test()
    {
        if (bool value = this_identifier_has_not_been_declared( []() {} ))
            return value;
        __builtin_abort();
        return false;
    }
    int main() { test<0>(); }

The wrong-code problem occurs also with this code:

    template<int i>
    bool test()
    {
        if (bool value = []() { return 1; } )
            return value;
        __builtin_abort();
        return false;
    }
    int main() { test<0>(); }
[Bug c++/50276] New: Wrong used uninitialized in this function warning [C++0x]
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50276 Bug #: 50276 Summary: Wrong "used uninitialized in this function" warning [C++0x] Classification: Unclassified Product: gcc Version: 4.6.1 Status: UNCONFIRMED Severity: minor Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: bisq...@iki.fi

For this example code, GCC mistakenly produces the following warning:

    tmp.cc:10:5: warning: 'value' is used uninitialized in this function [-Wuninitialized]

The warning is wrongly given, because there is no execution path that does not assign a well-defined value to the variable. In fact, there are no branches at all between the declaring and the assigning of the variable.

    template<typename T>
    unsigned testfun(const T func) { return func(); }

    template<int i>
    unsigned test()
    {
        if(unsigned value = testfun( [] () { return 0; }))
        {
            return value;
        }
        return i;
    }
    int main() { return test<1>(); }

The warning being wrongly given depends on the following conditions:
- test() being a template function: changing i into an actual parameter removes the warning
- func being a functor: changing it into an integer parameter removes the warning
- the variable value being declared and assigned to in the if-condition: declaring and assigning it separately removes the warning
- the func parameter being a lambda function: changing it into a static method of a class removes the warning

The following aspects do not affect the warning:
- testfun() being a template function: changing T into an explicit int(*)() retains the warning
- whether i is used within test() or not
- adding static or inline attributes to any function did not change the warning

Tested on GCC 4.5.3 and GCC 4.6.1, on x86_64-linux-gnu in both 32-bit and 64-bit mode on all optimization modes.
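Of the conditions listed in this report, the third one (declaring and assigning the variable outside the if-condition) is the least invasive to apply. A sketch of that variant follows; it keeps the names from the report and only moves the declaration, and whether it silences the warning on a given GCC version is the report's claim rather than something this snippet can prove.

```cpp
#include <cassert>

template<typename T>
unsigned testfun(const T func) { return func(); }

template<int i>
unsigned test()
{
    // Declared and assigned separately instead of inside the if-condition.
    unsigned value = testfun([]() { return 0; });
    if (value)
    {
        return value;
    }
    return i;
}
```

The behavior is unchanged: the lambda returns 0, so `test<1>()` still falls through and returns 1.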
[Bug c++/50276] Wrong used uninitialized in this function warning [C++0x]
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50276 --- Comment #1 from Joel Yliluoma <bisqwit at iki dot fi> 2011-09-02 13:04:31 UTC ---
Even this produces the warning. Changing any of the 0s into 1 did not affect the warning.

    static inline unsigned testfun(void*) { return 0; }

    template<int i>
    static inline unsigned test()
    {
        if(unsigned value = testfun( []() { return 0; } ))
            return value;
        return 0;
    }
    int main() { return test<0>(); }
[Bug c++/49100] [OpenMP]: Compiler error when inline method defined within OpenMP loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49100 --- Comment #3 from Joel Yliluoma <bisqwit at iki dot fi> 2011-07-25 10:01:08 UTC ---
While it's true that one should not reference the original variable within the loop, the question is: why does the inner function reference the original variable rather than the in-loop variable, when there is no explicit reference to the original variable?

A reference is established, by name, within the loop. Within the loop, however, there should be no possible way to reference the outside-loop variable, because the inner-loop namespace shadows the outer one. Hence the reference should bind to the inner-loop variable, thus conforming to the OpenMP specification.
[Bug c++/49100] [OpenMP]: Compiler error when inline method defined within OpenMP loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49100 --- Comment #5 from Joel Yliluoma bisqwit at iki dot fi 2011-07-25 10:24:20 UTC --- Obviously :) All right, thanks.
[Bug c++/49100] New: [OpenMP]: Compiler error when inline method defined within OpenMP loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49100 Summary: [OpenMP]: Compiler error when inline method defined within OpenMP loop Product: gcc Version: 4.6.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: bisq...@iki.fi

This report is similar to PR49043, but unlike it, this does not involve C++0x.

This valid code fails to compile on GCC. GCC spews an "invalid exit from OpenMP structured block" error message.

    int main()
    {
        #pragma omp parallel for
        for(int a=0; a<10; ++a)
        {
            struct x { void test() { return; }; };
        }
    }

If the explicit return statement is removed, it compiles. It is also triggered by code such as this:

    struct y { static bool test(int c) { return c==5; } };

if put inside the OpenMP loop construct, meaning it happens for static and non-static methods as long as they include an explicit return statement.

The purpose of this error is to catch exits from an OpenMP construct (return, break, goto). No such thing happens when a function is called or defined. The error is not given when the struct is defined outside the loop (even if invoked inside the loop). It is clearly a parser error.

It failed on all GCC versions that I tried that support OpenMP. These include GCC 4.2.4, 4.3.5, 4.4.6, 4.5.3 and 4.6.1.

I have not tested whether the patch committed as a result of PR49043 also fixes this bug.
[Bug c++/49100] [OpenMP]: Compiler error when inline method defined within OpenMP loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49100 --- Comment #1 from Joel Yliluoma <bisqwit at iki dot fi> 2011-05-21 11:31:46 UTC ---
It also does not happen with C's nested functions. This for instance compiles and works just fine (to my surprise).

    #include <stdio.h>
    int main()
    {
        int a;
        #pragma omp parallel for
        for(a=0; a<10; ++a)
        {
            int c() { return 65; }
            putchar( c() );
        }
        return 0;
    }

I venture into another bug report here, but I wonder if this is a bug or intentional behavior, that the code below outputs YY, as though the variable a within c() is bound to the a from the surrounding context rather than the OpenMP loop's private copy of a. If the OpenMP loop is removed, it outputs ABCDEFGHIJ as expected.

    #include <stdio.h>
    int main()
    {
        int a = 24;
        #pragma omp parallel for
        for(a=0; a<10; ++a)
        {
            int c() { return a+65; }
            putchar( c() );
        }
        return 0;
    }
[Bug c++/49043] [OpenMP C++0x]: Compiler error when lambda-function within OpenMP loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49043 --- Comment #2 from Joel Yliluoma <bisqwit at iki dot fi> 2011-05-19 08:10:06 UTC ---
It happens even if the lambda function is not called. Merely defining the function causes it.

Interestingly though, it does not happen if a method body is defined within the loop. The code below does not cause the error. So it is restricted to lambda function bodies. It also does not happen when calling lambda functions that are defined outside the loop.

    int main()
    {
        #pragma omp parallel for
        for(int a=0; a<10; ++a)
        {
            struct tmp { static int test() { return 0; } };
        }
    }
[Bug c++/49043] [OpenMP C++0x]: Compiler error when lambda-function within OpenMP loop
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49043

Joel Yliluoma <bisqwit at iki dot fi> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[C++0x] Returns from lambda |[OpenMP C++0x]: Compiler
                   |functions incorrectly       |error when lambda-function
                   |detected as exits from      |within OpenMP loop
                   |OpenMP loops in surrounding |
                   |code                        |

--- Comment #1 from Joel Yliluoma <bisqwit at iki dot fi> 2011-05-19 08:05:26 UTC ---
It also happens if the lambda function does not explicitly contain the return statement. For example, this code triggers the same error.

    int main()
    {
        #pragma omp parallel for
        for(int a=0; a<10; ++a)
        {
            auto func = [] () -> void { };
            func();
        }
    }
[Bug c++/49043] New: Returns from lambda functions incorrectly detected as exits from OpenMP loops in surrounding code
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49043 Summary: Returns from lambda functions incorrectly detected as exits from OpenMP loops in surrounding code Product: gcc Version: 4.6.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ AssignedTo: unassig...@gcc.gnu.org ReportedBy: bisq...@iki.fi

GCC incorrectly considers return statements within lambda functions as exits from an OpenMP structured block. For the code below, this error message is generated:

    tmp3.cc: In lambda function:
    tmp3.cc:7:40: error: invalid exit from OpenMP structured block

    #include <iostream>
    int main()
    {
        #pragma omp parallel for
        for(int a=0; a<10; ++a)
        {
            auto func = [=] () { return a; };
            std::cout << func();
        }
    }

Compiled with: -fopenmp -std=gnu++0x
Tested versions: 4.5.3, 4.6.1

The purpose of this error is to catch exits from an OpenMP construct (return, break, goto). No such thing happens when a lambda function is called, which is no different from calling an inlined function; therefore the error message is misplaced.
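A follow-up comment in this thread notes that the error does not occur when the lambda is defined outside the loop and only called inside it. A workaround along those lines could look like the sketch below; `sum_with_outer_lambda` is a hypothetical name, the loop variable is passed as an argument instead of being captured, and the reduction clause is an addition so that the result is well defined when compiled with -fopenmp.

```cpp
#include <cassert>

// Sums func(a) for a in [0, 10). The lambda is defined before the OpenMP
// construct, so its return statement is not inside the structured block.
inline int sum_with_outer_lambda()
{
    int total = 0;
    auto func = [](int a) { return a; };   // defined outside the loop

    #pragma omp parallel for reduction(+:total)
    for (int a = 0; a < 10; ++a)
    {
        total += func(a);                  // only called inside the loop
    }
    return total;
}
```

Without -fopenmp the pragma is ignored and the loop runs serially; in both cases the function returns 45.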
[Bug libstdc++/48933] New: Infinite recursion in tr1/cmath functions with complex parameters
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48933

           Summary: Infinite recursion in tr1/cmath functions with complex parameters
           Product: gcc
           Version: 4.6.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: bisq...@iki.fi
                CC: w...@iki.fi

All of the function calls in this example code produce a stack overflow due to infinite recursion, regardless of optimization level.

Compile with: g++ code.cc
Tested on the following gcc versions: 4.2.4 4.3.5 4.4.6 4.5.2 4.6.1

No compiler warnings or errors are emitted (even with -Wall -W -pedantic -ansi). Does not happen on gcc 4.0.4, because tr1/cmath is unavailable there.

    #include <tr1/cmath>
    #include <complex>
    int main()
    {
        std::tr1::tgamma( std::complex<double> (0.5, 0.0) );
        std::tr1::cbrt( std::complex<double> (0.5, 0.0) );
        std::tr1::asinh( std::complex<double> (0.5, 0.0) );
        std::tr1::acosh( std::complex<double> (1.5, 0.0) );
        std::tr1::atanh( std::complex<double> (0.5, 0.0) );
        std::tr1::erf( std::complex<double> (0.5, 0.0) );
        std::tr1::hypot( std::complex<double> (1.0, 0.0) , std::complex<double> (1.0, 0.0) );
        std::tr1::logb( std::complex<double> (0.5, 0.0) );
        std::tr1::round( std::complex<double> (0.5, 0.0) );
        std::tr1::trunc( std::complex<double> (0.5, 0.0) );
    }

The bug can be traced to all functions in tr1/cmath that look like this:

    template<typename _Tp>
      inline typename __gnu_cxx::__promote<_Tp>::__type
      cbrt(_Tp __x)
      {
        typedef typename __gnu_cxx::__promote<_Tp>::__type __type;
        return cbrt(__type(__x)); // <-- infinite recursion here
      }
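The recursion pattern in the report can be modelled outside the library. The sketch below is not libstdc++ code: the namespace model, the promote trait, the depth counter and the kMaxDepth guard are all hypothetical, and the trait is assumed to map integers to double while leaving every other type unchanged. Under that assumption a complex argument promotes to itself, so the generic template keeps selecting itself; the guard only exists so the sketch terminates.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <type_traits>

namespace model {

// Hypothetical promotion trait: integers become double,
// every other type (including std::complex) is left as it is.
template<typename T, bool IsInt = std::is_integral<T>::value>
struct promote { typedef double type; };
template<typename T>
struct promote<T, false> { typedef T type; };

int depth = 0;            // number of times the generic template was entered
const int kMaxDepth = 3;  // artificial guard so the model terminates

// The "real" implementation exists for double only.
inline double cbrt(double x) { return std::cbrt(x); }

// Generic forwarding template, shaped like the one quoted in the report.
template<typename T>
typename promote<T>::type cbrt(T x)
{
    typedef typename promote<T>::type type;
    if (++depth > kMaxDepth)
        return type(x);    // bail out; the code in the report has no such guard
    return cbrt(type(x));  // for complex, type == T, so this calls itself
}

} // namespace model
```

With an int argument the template is entered once and then forwards to the double overload. With a std::complex<double> argument no better overload exists, so the template re-enters itself until the guard stops it.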
[Bug libstdc++/48933] Infinite recursion in tr1/cmath functions with complex parameters
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48933

--- Comment #4 from Joel Yliluoma <bisqwit at iki dot fi> 2011-05-09 10:51:28 UTC ---
There is, however, an asinh, a cbrt, a hypot etc. for complex types. I don't know about the standard, but mathematically they are well defined (for example, hypot(x,y) = sqrt(x*x + y*y) and asinh(x) = log(x + sqrt(x*x + 1))). For trunc and the other rounding functions, probably not.
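The two formulas quoted in the comment can be written directly on top of std::complex. This is only a sketch of the mathematical definitions; the names Complex, asinh_formula and hypot_formula are hypothetical and are not part of tr1/cmath or any library.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

typedef std::complex<double> Complex;

// asinh(x) = log(x + sqrt(x*x + 1)), using the principal branches
// of std::sqrt and std::log for complex arguments.
inline Complex asinh_formula(Complex x)
{
    return std::log(x + std::sqrt(x * x + 1.0));
}

// hypot(x, y) = sqrt(x*x + y*y), applied to complex arguments.
inline Complex hypot_formula(Complex x, Complex y)
{
    return std::sqrt(x * x + y * y);
}
```

For arguments on the real axis, such as Complex(0.5, 0.0), asinh_formula should agree with the real-valued asinh up to rounding, which is a convenient sanity check.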
[Bug c++/46764] New: std=c++0x causes compilation failure on SFINAE test for methods
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46764

           Summary: std=c++0x causes compilation failure on SFINAE test for methods
           Product: gcc
           Version: 4.5.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: bisq...@iki.fi

This code tests whether a class defines a method of a certain name or not. It fails to compile on GCC when -std=c++0x is used. Without -std=c++0x, it compiles and works fine.

    #include <iostream>

    struct Hello { int helloworld() { return 0; } };
    struct Generic {};

    // SFINAE test
    template <typename T>
    class has_helloworld
    {
        typedef char yes;
        typedef struct { char dummy[2]; } no;

        template <typename C> static yes test( typeof(C::helloworld) ) ;
        template <typename C> static no test(...);
    public:
        enum { value = sizeof(test<T>(0)) == sizeof(yes) };
    };

    int main()
    {
        std::cout << has_helloworld<Hello>::value << std::endl;
        std::cout << has_helloworld<Generic>::value << std::endl;
        return 0;
    }

With -std=c++0x, we get the following error message:

    tmp5.cc:13:68: error: ISO C++ forbids in-class initialization of non-const static member 'test'
    tmp5.cc:13:68: error: template declaration of 'has_helloworld<T>::yes test'

Without -std=c++0x, the code compiles without warnings. This indicates that GCC misinterprets test() as a member variable initialization rather than a method declaration, even though the parameter expression yields a type rather than a value.
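For comparison, the same detection idiom can be sketched with the standard decltype keyword instead of the GNU typeof extension. This is an illustration under assumptions, not the code from the report: it takes the address of the member (decltype(&C::helloworld)), which may differ from what the original test expression intended, and it only works when helloworld is a single, non-overloaded member.

```cpp
#include <cassert>

struct Hello { int helloworld() { return 0; } };
struct Generic {};

// SFINAE test: 'value' is 1 if T has a member named helloworld, 0 otherwise.
template <typename T>
class has_helloworld
{
    typedef char yes;
    typedef struct { char dummy[2]; } no;

    // Viable only if &C::helloworld is well-formed; the literal 0
    // converts to a null pointer-to-member, which beats the ellipsis.
    template <typename C> static yes test(decltype(&C::helloworld));
    // Fallback chosen when substitution above fails.
    template <typename C> static no test(...);
public:
    enum { value = sizeof(test<T>(0)) == sizeof(yes) };
};
```

Here has_helloworld<Hello>::value evaluates to 1 and has_helloworld<Generic>::value to 0, matching what the original program prints when it compiles.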
[Bug c++/42697] ice-on-legal-code: template class template function local objects
--- Comment #9 from bisqwit at iki dot fi 2010-01-17 22:37 ---
Out of curiosity... What does it mean that it's not a regression, and what are the practical implications?

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42697
[Bug c++/42697] ice-on-legal-code: template class template function local objects
--- Comment #11 from bisqwit at iki dot fi 2010-01-18 07:59 --- Ah, I see. So the reason it is not fixed in 4.5 is that it may cause new regressions? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42697
[Bug c++/42743] New: Inexplicable error message with constructing SIMD values
This code:

    #include <emmintrin.h>

    template<typename vec_t>
    void x()
    {
        vec_t tmp1 = vec_t(); // Works with myvec, causes error with __m128
        vec_t tmp2 = {};      // Causes warnings about uninitialized members in myvec
        vec_t tmp3;           // This may cause a warning about use of uninitialized variables if tmp3 is later read-accessed.
    }

    struct myvec { struct tmp { float data[2]; } d; };

    void y()
    {
        x<__m128> ();
        x<myvec> ();
    }

Produces this error when vec_t is __m128:

    tmp.cc:6: error: can't convert between vector values of different size

And this warning when vec_t is myvec:

    tmp.cc:7: warning: missing initializer for member 'myvec::d'

It is my understanding that constructor calls should never be treated as syntax errors. Is there really no way to write this code so that it causes neither a compile error nor a warning?

-- 
           Summary: Inexplicable error message with constructing SIMD values
           Product: gcc
           Version: 4.4.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bisqwit at iki dot fi
 GCC build triplet: x86_64-pc-linux-gnu
  GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42743
[Bug c++/42697] New: ice-on-legal-code: template class template function local objects
Example source code:

    template<class Value_t>
    class fparser
    {
        template<bool Option>
        void eval2(Value_t r[2]);
    public:
        void evaltest();
    };

    /*template<class Value_t>
    template<bool Option>
    void fparser<Value_t>::eval2(Value_t r[2])
    {
    }*/

    template<>
    template<bool Option>
    void fparser<int>::eval2(int r[2])
    {
        struct ObjType { int tmp; };
        ObjType Object = { 5 };
    }

    template<class Value_t>
    void fparser<Value_t>::evaltest ()
    {
        eval2<false>(0);
    }

    template class fparser<int>;

Compilation result:

    Using built-in specs.
    Target: x86_64-linux-gnu
    Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.2-8' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
    Thread model: posix
    gcc version 4.4.2 (Debian 4.4.2-8)
    COLLECT_GCC_OPTIONS='-Wall' '-W' '-O3' '-c' '-v' '-shared-libgcc' '-mtune=generic'
     /usr/lib/gcc/x86_64-linux-gnu/4.4.2/cc1plus -quiet -v -D_GNU_SOURCE tmp2.cc -quiet -dumpbase tmp2.cc -mtune=generic -auxbase tmp2 -O3 -Wall -W -version -o /tmp/ccPuGyBP.s
    ignoring nonexistent directory /usr/local/include/x86_64-linux-gnu
    ignoring nonexistent directory /usr/lib/gcc/x86_64-linux-gnu/4.4.2/../../../../x86_64-linux-gnu/include
    ignoring nonexistent directory /usr/include/x86_64-linux-gnu
    #include "..." search starts here:
    #include <...> search starts here:
     /usr/include/c++/4.4
     /usr/include/c++/4.4/x86_64-linux-gnu
     /usr/include/c++/4.4/backward
     /usr/local/include
     /usr/lib/gcc/x86_64-linux-gnu/4.4.2/include
     /usr/lib/gcc/x86_64-linux-gnu/4.4.2/include-fixed
     /usr/include
    End of search list.
    GNU C++ (Debian 4.4.2-8) version 4.4.2 (x86_64-linux-gnu)
        compiled by GNU C version 4.4.2, GMP version 4.3.1, MPFR version 2.4.2-p1.
    GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
    Compiler executable checksum: 545dc413edcab7204151d021d996af39
    tmp2.cc: In member function 'void fparser<Value_t>::eval2(Value_t*) [with bool Option = false, Value_t = int]':
    tmp2.cc:33:   instantiated from 'void fparser<Value_t>::evaltest() [with Value_t = int]'
    tmp2.cc:37:   instantiated from here
    tmp2.cc:19: internal compiler error: in tsubst, at cp/pt.c:9339
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See <file:///usr/share/doc/gcc-4.4/README.Bugs> for instructions.

Occurs on g++-4.1: internal compiler error: in tsubst, at cp/pt.c:7267
Occurs on g++-4.2: internal compiler error: in tsubst, at cp/pt.c:7465
Occurs on g++-4.3: internal compiler error: in tsubst, at cp/pt.c:9031
Occurs on g++-4.4: internal compiler error: in tsubst, at cp/pt.c:9339

Does not occur on g++-4.0, because 4.0 instead gives the error "template-id 'eval2' for 'void fparser<int>::eval2(int*)' does not match any template declaration".

Occurs at all optimization levels.

-- 
           Summary: ice-on-legal-code: template class template function local objects
           Product: gcc
           Version: 4.4.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bisqwit at iki dot fi
 GCC build triplet: x86_64-pc-linux-gnu
  GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42697
[Bug c++/34953] New: ICC on destructor + noreturn-function at -O3
This code crashes GCC versions 4.1.2, 4.1.3, and 4.2.3, when compiled using the -O3 option.

    void B_CLEAR(void* ret);
    void B_NeverReturns(void* ret) __attribute__((noreturn));

    int main()
    {
        const struct AutoErrPop { ~AutoErrPop() { } } AutoErrPopper = { };
        B_NeverReturns(0);
    }

    void B_NeverReturns(void* ret)
    {
        B_CLEAR(ret); /* Never returns (does a setjmp/goto) */
    }

Tested on x86_64 and i386.

To reproduce: g++ a.cc -O3

Expected result:

    a.cc:4: internal compiler error: Segmentation fault
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See <URL:http://gcc.gnu.org/bugs.html> for instructions.
    For Debian GNU/Linux specific bug reporting instructions,
    see <URL:file:///usr/share/doc/gcc-4.2/README.Bugs>.

-- 
           Summary: ICC on destructor + noreturn-function at -O3
           Product: gcc
           Version: 4.1.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bisqwit at iki dot fi
 GCC build triplet: x86_64-pc-linux-gnu
  GCC host triplet: x86_64-pc-linux-gnu
GCC target triplet: x86_64-pc-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34953
[Bug c++/34953] ICC on destructor + noreturn-function at -O3
--- Comment #1 from bisqwit at iki dot fi 2008-01-24 13:52 ---
The body of the function B_CLEAR() is not included, and is not relevant, since the error happens without the body as well and compilation never progresses to linking.

-- 
bisqwit at iki dot fi changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bisqwit at iki dot fi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34953
[Bug c++/32767] ICE in constructing a template object using statement expression on AMD64.
--- Comment #4 from bisqwit at iki dot fi 2007-07-15 21:17 ---
It is also reported that on some 32-bit platforms, instead of causing an ICE, it causes rampant memory consumption.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32767
[Bug c++/32767] ICE in constructing a template object using statement expression on AMD64.
--- Comment #2 from bisqwit at iki dot fi 2007-07-14 17:34 --- Also, yay, bug report #32767 :) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32767
[Bug rtl-optimization/31485] New: C complex numbers, amd64 SSE, missed optimization opportunity
Considering that complex turns basically any basic type into a vector type, complex number addition and subtraction could utilize SSE instructions to perform the operation on the real and imaginary parts simultaneously. (This only applies to addition and subtraction.)

Code:

    #include <complex.h>

    typedef float complex ss1;
    typedef float ss2 __attribute__((vector_size(sizeof(ss1))));

    ss1 add1(ss1 a, ss1 b) { return a + b; }
    ss2 add2(ss2 a, ss2 b) { return a + b; }

Produces:

    add1:
            movq    %xmm0, -8(%rsp)
            movq    %xmm1, -16(%rsp)
            movss   -4(%rsp), %xmm0
            movss   -8(%rsp), %xmm1
            addss   -12(%rsp), %xmm0
            addss   -16(%rsp), %xmm1
            movss   %xmm0, -20(%rsp)
            movss   %xmm1, -24(%rsp)
            movq    -24(%rsp), %xmm0
            ret
    add2:
            movlps  %xmm0, -16(%rsp)
            movlps  %xmm1, -24(%rsp)
            movaps  -24(%rsp), %xmm0
            addps   -16(%rsp), %xmm0
            movaps  %xmm0, -56(%rsp)
            movlps  -56(%rsp), %xmm0
            ret

Command line: gcc -msse -O3 -S test2.c
(Results are the same with -ffast-math.)

Architecture:
CPU=AMD Athlon(tm) 64 X2 Dual Core Processor 4600+
CPU features=fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm cmp_legacy

GCC is:

    Target: x86_64-linux-gnu
    Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release x86_64-linux-gnu
    Thread model: posix
    gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)

-- 
           Summary: C complex numbers, amd64 SSE, missed optimization opportunity
           Product: gcc
           Version: 4.1.2
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bisqwit at iki dot fi

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31485
[Bug rtl-optimization/26098] New: ICE in multiplication of 16-byte longlong vector on x86_64
This code causes an ICE on gcc 4.0.3 on x86_64.

    typedef long long vec __attribute__ ((vector_size(16)));
    vec vecsqr(vec a) { return a*a; }

Command line: gcc -O1 -S -o - tmp.c

Resulting output:

            .file   "tmp.c"
    tmp.c: In function 'vecsqr':
    tmp.c:2: error: unrecognizable insn:
    (insn 13 12 15 0 (set (reg:DI 58 [ D.1470 ])
            (vec_select:DI (reg/v:V2DI 61 [ a ])
                (parallel [
                        (const_int 1 [0x1])
                    ]))) -1 (nil)
        (expr_list:REG_DEAD (reg/v:V2DI 61 [ a ])
            (nil)))
    tmp.c:2: internal compiler error: in extract_insn, at recog.c:2020

The ICE occurs at -O1; -O0 does not trigger it. The option -mno-sse also avoids the ICE, but then GCC gives the error "SSE register return with SSE disabled". -mno-sse2 does not avoid it. Whether the element type is signed or unsigned has no effect on the result. Without __attribute__((vector_size)), it does not ICE.

GCC version (gcc -v):

    Using built-in specs.
    Target: x86_64-linux-gnu
    Configured with: ../src/configure -v --enable-languages=c,c++,java,f95,objc,ada,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.0 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-4.0-1.4.2.0/jre --enable-mpfr --disable-werror --enable-checking=release x86_64-linux-gnu
    Thread model: posix
    gcc version 4.0.3 20051201 (prerelease) (Debian 4.0.2-5)

-- 
           Summary: ICE in multiplication of 16-byte longlong vector on x86_64
           Product: gcc
           Version: 4.0.3
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: bisqwit at iki dot fi
 GCC build triplet: x86_64-linux-gnu
  GCC host triplet: x86_64-linux-gnu
GCC target triplet: x86_64-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26098