Re: GCC does not support *mmintrin.h with function specific opts
On Mon, Jun 17, 2013 at 10:49 AM, Sriraman Tallam tmsri...@google.com wrote: On Fri, Jun 14, 2013 at 11:08 AM, Sriraman Tallam tmsri...@google.com wrote: On Fri, Jun 14, 2013 at 1:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote: * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. This patch is OK if it passes testing. Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing because of always_inline functions being present that cannot be inlined and the compiler is now generating error messages. I will fix them and resend the patch. Quick look - pr43791.c is not expected to work at -O0, so skip -O0 for example by guarding the whole thing with #if __OPTIMIZED__ 0. Similar for pr44043.c. Seems like __OPTIMIZED__ is not defined in my config, dont know why. I see other tests checking for __OPTIMIZED. I fixed these two tests by adding a dg-prune-output at the end. Is that reasonable? I have attached the patch with all tests passing now. I have attached the latest patch with a small error in the previous patch fixed. I will submit this patch if there are no objections. I have committed this patch. Thanks Sri Thanks Sri Thanks Sri That we didn't error at -O0 before is a bug. Eventually I was suggesting to error if we end up outputting the body of an always_inline function, that is, any uses remain (including indirect calls or address-takens which is where we do not error right now). Richard. Thanks Sri Thanks for your patience! Honza
Re: GCC does not support *mmintrin.h with function specific opts
On Fri, Jun 14, 2013 at 11:08 AM, Sriraman Tallam tmsri...@google.com wrote: On Fri, Jun 14, 2013 at 1:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote: * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. This patch is OK if it passes testing. Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing because of always_inline functions being present that cannot be inlined and the compiler is now generating error messages. I will fix them and resend the patch. Quick look - pr43791.c is not expected to work at -O0, so skip -O0 for example by guarding the whole thing with #if __OPTIMIZED__ 0. Similar for pr44043.c. Seems like __OPTIMIZED__ is not defined in my config, dont know why. I see other tests checking for __OPTIMIZED. I fixed these two tests by adding a dg-prune-output at the end. Is that reasonable? I have attached the patch with all tests passing now. I have attached the latest patch with a small error in the previous patch fixed. I will submit this patch if there are no objections. Thanks Sri Thanks Sri That we didn't error at -O0 before is a bug. Eventually I was suggesting to error if we end up outputting the body of an always_inline function, that is, any uses remain (including indirect calls or address-takens which is where we do not error right now). Richard. Thanks Sri Thanks for your patience! Honza * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. * gcc.c-torture/compile/pr44043.c: Fix test to expect an error. * gcc.c-torture/compile/pr43791.c: Fix test to expect an error. Index: testsuite/gcc.c-torture/compile/pr44043.c === --- testsuite/gcc.c-torture/compile/pr44043.c (revision 200034) +++ testsuite/gcc.c-torture/compile/pr44043.c (working copy) @@ -85,3 +85,5 @@ int raw_sendmsg(struct sock *sk, struct msghdr *ms { raw_send_hdrinc(sk, msg-msg_iov, len, (void *)0, msg-msg_flags); } + +/* { dg-prune-output (inlining failed in call to always_inline.*indirect function call with a yet undetermined callee|called from here) } */ Index: testsuite/gcc.c-torture/compile/pr43791.c === --- testsuite/gcc.c-torture/compile/pr43791.c (revision 200034) +++ testsuite/gcc.c-torture/compile/pr43791.c (working copy) @@ -1,3 +1,4 @@ + int owner(); int clear(); @@ -18,4 +19,4 @@ void fasttrylock(void (*slowfn)()) { void trylock(void) { fasttrylock(slowtrylock); } - +/* { dg-prune-output (inlining failed in call to always_inline.*indirect function call with a yet undetermined callee|called from here) } */ Index: testsuite/gcc.target/i386/inline_error.c === --- testsuite/gcc.target/i386/inline_error.c(revision 0) +++ testsuite/gcc.target/i386/inline_error.c(revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options -O0 -mno-popcnt } */ + +inline int __attribute__ ((__gnu_inline__, __always_inline__, target(popcnt))) +foo () /* { dg-error inlining failed in call to always_inline .* target specific option mismatch } */ +{ + return 0; +} + +int bar() +{ + return foo (); /* { dg-error called from here } */ +} Index: ipa-inline.c === --- ipa-inline.c(revision 200034) +++ ipa-inline.c(working copy) @@ -1911,7 +1911,15 @@ inline_always_inline_functions (struct cgraph_node } if (!can_early_inline_edge_p (e)) - continue; + { + /* Set inlined to true if the callee is marked always_inline but +is not inlinable. This will allow flagging an error later in +expand_call_inline in tree-inline.c. */ + if (lookup_attribute (always_inline, +DECL_ATTRIBUTES (callee-symbol.decl)) != NULL) + inlined = true; + continue; + } if (dump_file) fprintf (dump_file, Inlining %s into %s (always_inline).\n, Index: tree-inline.c === --- tree-inline.c (revision 200034) +++ tree-inline.c (working copy) @@ -3905,8 +3905,6 @@ expand_call_inline (basic_block bb, gimple stmt, c for inlining, but
Re: GCC does not support *mmintrin.h with function specific opts
On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote: * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. This patch is OK if it passes testing. Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing because of always_inline functions being present that cannot be inlined and the compiler is now generating error messages. I will fix them and resend the patch. Quick look - pr43791.c is not expected to work at -O0, so skip -O0 for example by guarding the whole thing with #if __OPTIMIZED__ 0. Similar for pr44043.c. That we didn't error at -O0 before is a bug. Eventually I was suggesting to error if we end up outputting the body of an always_inline function, that is, any uses remain (including indirect calls or address-takens which is where we do not error right now). Richard. Thanks Sri Thanks for your patience! Honza
Re: GCC does not support *mmintrin.h with function specific opts
On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote: * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. This patch is OK if it passes testing. Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing because of always_inline functions being present that cannot be inlined and the compiler is now generating error messages. I will fix them and resend the patch. Quick look - pr43791.c is not expected to work at -O0, so skip -O0 for example by guarding the whole thing with #if __OPTIMIZED__ 0. Similar for pr44043.c. That we didn't error at -O0 before is a bug. Eventually I was suggesting to error if we end up outputting the body of an always_inline function, that is, any uses remain (including indirect calls or address-takens which is where we do not error right now). Well, as dicussed earlier, this will make kernel folks crazy, since they define inline to be always inline and likes to take address of that function. (of course that is crazy already) Honza Richard. Thanks Sri Thanks for your patience! Honza
Re: GCC does not support *mmintrin.h with function specific opts
On Fri, Jun 14, 2013 at 10:43:34AM +0200, Richard Biener wrote: On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote: * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. This patch is OK if it passes testing. Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing because of always_inline functions being present that cannot be inlined and the compiler is now generating error messages. I will fix them and resend the patch. Quick look - pr43791.c is not expected to work at -O0, so skip -O0 for example by guarding the whole thing with #if __OPTIMIZED__ 0. Similar for pr44043.c. That we didn't error at -O0 before is a bug. Eventually I was suggesting to error if we end up outputting the body of an always_inline function, that is, any uses remain (including indirect calls or address-takens which is where we do not error right now). Well, for indirect calls/address-takens and gnu_inline style extern inline always_inline, it should be fine if we just emit calls to the library function, otherwise you'd break a lot of code, because glibc uses always_inline heavily for -D_FORTIFY_SOURCE wrapper inlines, open argument checking etc. But that body supposedly doesn't inherit the always_inline attribute. Jakub
Re: GCC does not support *mmintrin.h with function specific opts
On Fri, Jun 14, 2013 at 1:43 AM, Richard Biener richard.guent...@gmail.com wrote: On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote: * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. This patch is OK if it passes testing. Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing because of always_inline functions being present that cannot be inlined and the compiler is now generating error messages. I will fix them and resend the patch. Quick look - pr43791.c is not expected to work at -O0, so skip -O0 for example by guarding the whole thing with #if __OPTIMIZED__ 0. Similar for pr44043.c. Seems like __OPTIMIZED__ is not defined in my config, dont know why. I see other tests checking for __OPTIMIZED. I fixed these two tests by adding a dg-prune-output at the end. Is that reasonable? I have attached the patch with all tests passing now. Thanks Sri That we didn't error at -O0 before is a bug. Eventually I was suggesting to error if we end up outputting the body of an always_inline function, that is, any uses remain (including indirect calls or address-takens which is where we do not error right now). Richard. Thanks Sri Thanks for your patience! Honza * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. * gcc.c-torture/compile/pr44043.c: Fix test to expect an error. * gcc.c-torture/compile/pr43791.c: Fix test to expect an error. Index: tree-inline.c === --- tree-inline.c (revision 200034) +++ tree-inline.c (working copy) @@ -3905,8 +3905,6 @@ expand_call_inline (basic_block bb, gimple stmt, c for inlining, but we can't do that because frontends overwrite the body. */ !cg_edge-callee-local.redefined_extern_inline - /* Avoid warnings during early inline pass. */ - cgraph_global_info_ready /* PR 20090218-1_0.c. Body can be provided by another module. */ (reason != CIF_BODY_NOT_AVAILABLE || !flag_generate_lto)) { Index: ipa-inline.c === --- ipa-inline.c(revision 200034) +++ ipa-inline.c(working copy) @@ -1911,7 +1911,15 @@ inline_always_inline_functions (struct cgraph_node } if (!can_early_inline_edge_p (e)) - continue; + { + /* Set inlined to true if the callee is marked always_inline but +is not inlinable. This will allow flagging an error later in +expand_call_inline in tree-inline.c. */ + if (lookup_attribute (always_inline, +DECL_ATTRIBUTES (callee-symbol.decl)) != NULL) + inlined = true; + continue; + } if (dump_file) fprintf (dump_file, Inlining %s into %s (always_inline).\n, Index: testsuite/gcc.target/i386/inline_error.c === --- testsuite/gcc.target/i386/inline_error.c(revision 0) +++ testsuite/gcc.target/i386/inline_error.c(revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options -O0 -mno-popcnt } */ + +inline int __attribute__ ((__gnu_inline__, __always_inline__, target(popcnt))) +foo () /* { dg-error inlining failed in call to always_inline .* target specific option mismatch } */ +{ + return 0; +} + +int bar() +{ + return foo (); /* { dg-error called from here } */ +} Index: testsuite/gcc.c-torture/compile/pr44043.c === --- testsuite/gcc.c-torture/compile/pr44043.c (revision 200034) +++ testsuite/gcc.c-torture/compile/pr44043.c (working copy) @@ -85,3 +85,5 @@ int raw_sendmsg(struct sock *sk, struct msghdr *ms { raw_send_hdrinc(sk, msg-msg_iov, len, (void *)0, msg-msg_flags); } + +/* { dg-prune-output (inlining failed in call to always_inline.*indirect function call with a yet undetermined callee|called from here) } */ Index: testsuite/gcc.c-torture/compile/pr43791.c === --- testsuite/gcc.c-torture/compile/pr43791.c (revision 200034) +++ testsuite/gcc.c-torture/compile/pr43791.c (working copy) @@ -1,3 +1,4 @@ + int owner(); int clear(); @@ -6,16 +7,16 @@ static void fixup() { } inline __attribute__ ((always_inline)) -void slowtrylock(void) {
Re: GCC does not support *mmintrin.h with function specific opts
Can you create a helper function to flag the error and perhaps also put that check inside can_inline_edge_p ? David On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com wrote: Hi Honza, I have isolated the ipa-inline.c part into a separate patch with a test and attached it here. The patch is simple. Could you please take a look? * ipa-inline.c (can_early_inline_edge_p): Flag an error when the function that cannot be inlined is target specific. * gcc.target/i386/inline_error.c: New test. Thanks Sri On Mon, Jun 10, 2013 at 4:10 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Tue, Jun 4, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Thu, May 23, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping, for review of ipa-inline.c change. Sri On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com wrote: On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote: --- ipa-inline.c (revision 198950) +++ ipa-inline.c (working copy) @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e) return false; } if (!can_inline_edge_p (e, true)) -return false; +{ + enum availability avail; + struct cgraph_node *callee += cgraph_function_or_thunk_node (e-callee, avail); + /* Flag an error when the inlining cannot happen because of target option + mismatch but the callee is marked as always_inline. In -O0 mode + this will go undetected because the error flagged in + expand_call_inline in tree-inline.c might not execute and the + inlining will not happen. Then, the linker could complain about a + missing body for the callee if it turned out that the callee was + also marked gnu_inline with extern inline keyword as bodies of such + functions are not generated. */ + if ((!optimize +|| flag_no_inline) This should be if ((!optimize || flag_no_inline) on one line. I'd prefer also the testcase for the ICEs, something like: /* Test case to check if AVX intrinsics and function specific target optimizations work together. Check by including x86intrin.h */ /* { dg-do compile } */ /* { dg-options -O2 -mno-sse -mno-avx } */ #include x86intrin.h __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } and another testcase that does: /* { dg-do compile } */ #pragma GCC target (mavx) /* { dg-error whatever } */ Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review it too (and Honza for ipa-inline.c?). Honza, could you please take a look at the ipa-inline.c fix? I will split the patches and submit after Honza's review. I will also make the changes mentioned. Thanks Sri Jakub
Re: GCC does not support *mmintrin.h with function specific opts
Can you create a helper function to flag the error and perhaps also put that check inside can_inline_edge_p ? David On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com wrote: Hi Honza, I have isolated the ipa-inline.c part into a separate patch with a test and attached it here. The patch is simple. Could you please take a look? * ipa-inline.c (can_early_inline_edge_p): Flag an error when the function that cannot be inlined is target specific. * gcc.target/i386/inline_error.c: New test. Sorry for taking ages to look at the patch, I was too hooked into other problems. I also think can_early_inline_edge_p should not produce diagnostic - it is supposed to be predicate. So your problem is that the hard worker in tree-inline is not called at -O0 and thus errors are not output? I would suggest arranging inline_always_inline_functions to return true even if inlining failed and thus making inline_calls to be called. Honza
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka hubi...@ucw.cz wrote: Can you create a helper function to flag the error and perhaps also put that check inside can_inline_edge_p ? David On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com wrote: Hi Honza, I have isolated the ipa-inline.c part into a separate patch with a test and attached it here. The patch is simple. Could you please take a look? * ipa-inline.c (can_early_inline_edge_p): Flag an error when the function that cannot be inlined is target specific. * gcc.target/i386/inline_error.c: New test. Sorry for taking ages to look at the patch, I was too hooked into other problems. I also think can_early_inline_edge_p should not produce diagnostic - it is supposed to be predicate. So your problem is that the hard worker in tree-inline is not called at -O0 and thus errors are not output? I would suggest arranging inline_always_inline_functions to return true even if inlining failed and thus making inline_calls to be called. Thanks Honza! Yes, that is the problem. Should I just make it return true for these special conditions (TARGET_MISMATCH + gnu_inline + always_inline + ...)? Thanks, Sri Honza
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka hubi...@ucw.cz wrote: Can you create a helper function to flag the error and perhaps also put that check inside can_inline_edge_p ? David On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com wrote: Hi Honza, I have isolated the ipa-inline.c part into a separate patch with a test and attached it here. The patch is simple. Could you please take a look? * ipa-inline.c (can_early_inline_edge_p): Flag an error when the function that cannot be inlined is target specific. * gcc.target/i386/inline_error.c: New test. Sorry for taking ages to look at the patch, I was too hooked into other problems. I also think can_early_inline_edge_p should not produce diagnostic - it is supposed to be predicate. So your problem is that the hard worker in tree-inline is not called at -O0 and thus errors are not output? I would suggest arranging inline_always_inline_functions to return true even if inlining failed and thus making inline_calls to be called. Thanks Honza! Yes, that is the problem. Should I just make it return true for these special conditions (TARGET_MISMATCH + gnu_inline + always_inline + ...)? Can't it just return true if there is any alwaysinline call? I think if inline fails, we always want to error, right? Honza
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Jun 13, 2013 at 10:19 AM, Jan Hubicka hubi...@ucw.cz wrote: On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka hubi...@ucw.cz wrote: Can you create a helper function to flag the error and perhaps also put that check inside can_inline_edge_p ? David On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com wrote: Hi Honza, I have isolated the ipa-inline.c part into a separate patch with a test and attached it here. The patch is simple. Could you please take a look? * ipa-inline.c (can_early_inline_edge_p): Flag an error when the function that cannot be inlined is target specific. * gcc.target/i386/inline_error.c: New test. Sorry for taking ages to look at the patch, I was too hooked into other problems. I also think can_early_inline_edge_p should not produce diagnostic - it is supposed to be predicate. So your problem is that the hard worker in tree-inline is not called at -O0 and thus errors are not output? I would suggest arranging inline_always_inline_functions to return true even if inlining failed and thus making inline_calls to be called. Thanks Honza! Yes, that is the problem. Should I just make it return true for these special conditions (TARGET_MISMATCH + gnu_inline + always_inline + ...)? Can't it just return true if there is any alwaysinline call? I think if inline fails, we always want to error, right? Ok, patch attached that does this. Please let me know what you think. Thanks Sri Honza * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. Index: tree-inline.c === --- tree-inline.c (revision 200034) +++ tree-inline.c (working copy) @@ -3905,8 +3905,6 @@ expand_call_inline (basic_block bb, gimple stmt, c for inlining, but we can't do that because frontends overwrite the body. */ !cg_edge-callee-local.redefined_extern_inline - /* Avoid warnings during early inline pass. */ - cgraph_global_info_ready /* PR 20090218-1_0.c. Body can be provided by another module. */ (reason != CIF_BODY_NOT_AVAILABLE || !flag_generate_lto)) { Index: ipa-inline.c === --- ipa-inline.c(revision 200034) +++ ipa-inline.c(working copy) @@ -1911,7 +1911,15 @@ inline_always_inline_functions (struct cgraph_node } if (!can_early_inline_edge_p (e)) - continue; + { + /* Set inlined to true if the callee is marked always_inline but +is not inlinable. This will allow flagging an error later in +expand_call_inline in tree-inline.c. */ + if (lookup_attribute (always_inline, +DECL_ATTRIBUTES (callee-symbol.decl)) != NULL) + inlined = true; + continue; + } if (dump_file) fprintf (dump_file, Inlining %s into %s (always_inline).\n, Index: testsuite/gcc.target/i386/inline_error.c === --- testsuite/gcc.target/i386/inline_error.c(revision 0) +++ testsuite/gcc.target/i386/inline_error.c(revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options -O0 -mno-popcnt } */ + +inline int __attribute__ ((__gnu_inline__, __always_inline__, target(popcnt))) +foo () /* { dg-error inlining failed in call to always_inline .* target specific option mismatch } */ +{ + return 0; +} + +int bar() +{ + return foo (); /* { dg-error called from here } */ +}
Re: GCC does not support *mmintrin.h with function specific opts
If you want to flag errors for all possible wrongly used always_inline attribute, should this change be done in can_inline_edge_p? Or keep your current change, but also add a warning (something like 'always inline function is ignored etc') in inline_always_inline_functions when inline transformation can not meaningfully give a useful error. David On Thu, Jun 13, 2013 at 11:33 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Jun 13, 2013 at 10:19 AM, Jan Hubicka hubi...@ucw.cz wrote: On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka hubi...@ucw.cz wrote: Can you create a helper function to flag the error and perhaps also put that check inside can_inline_edge_p ? David On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com wrote: Hi Honza, I have isolated the ipa-inline.c part into a separate patch with a test and attached it here. The patch is simple. Could you please take a look? * ipa-inline.c (can_early_inline_edge_p): Flag an error when the function that cannot be inlined is target specific. * gcc.target/i386/inline_error.c: New test. Sorry for taking ages to look at the patch, I was too hooked into other problems. I also think can_early_inline_edge_p should not produce diagnostic - it is supposed to be predicate. So your problem is that the hard worker in tree-inline is not called at -O0 and thus errors are not output? I would suggest arranging inline_always_inline_functions to return true even if inlining failed and thus making inline_calls to be called. Thanks Honza! Yes, that is the problem. Should I just make it return true for these special conditions (TARGET_MISMATCH + gnu_inline + always_inline + ...)? Can't it just return true if there is any alwaysinline call? I think if inline fails, we always want to error, right? Ok, patch attached that does this. Please let me know what you think. Thanks Sri Honza
Re: GCC does not support *mmintrin.h with function specific opts
* tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. This patch is OK if it passes testing. Thanks for your patience! Honza
Re: GCC does not support *mmintrin.h with function specific opts
If you want to flag errors for all possible wrongly used always_inline attribute, should this change be done in can_inline_edge_p? Or keep your current change, but also add a warning (something like 'always inline function is ignored etc') in inline_always_inline_functions when inline transformation can not meaningfully give a useful error. We have the CIF error codes to be able to give useful diagnostic at transformation time. I think it is better to have all the diagnostic output at one place unless we have really good reasons to fork it. We are not losing any precision here, right? Sure, other option would be to move all alwaysinline diagnostic into inline_always_inline_functions and remove the code path in tree-inline. The warnings however are quite meaningfully places in tree-inline, because we want to warn only after all the inlining algorithms has finished and inlining really did not happen. They can be moved out there if we walk the edges at end and output diagnostic, but I do not see anything really wrong with their current location. Honza
Re: GCC does not support *mmintrin.h with function specific opts
yes, what you said makes sense. thanks, David On Thu, Jun 13, 2013 at 12:45 PM, Jan Hubicka hubi...@ucw.cz wrote: If you want to flag errors for all possible wrongly used always_inline attribute, should this change be done in can_inline_edge_p? Or keep your current change, but also add a warning (something like 'always inline function is ignored etc') in inline_always_inline_functions when inline transformation can not meaningfully give a useful error. We have the CIF error codes to be able to give useful diagnostic at transformation time. I think it is better to have all the diagnostic output at one place unless we have really good reasons to fork it. We are not losing any precision here, right? Sure, other option would be to move all alwaysinline diagnostic into inline_always_inline_functions and remove the code path in tree-inline. The warnings however are quite meaningfully places in tree-inline, because we want to warn only after all the inlining algorithms has finished and inlining really did not happen. They can be moved out there if we walk the edges at end and output diagnostic, but I do not see anything really wrong with their current location. Honza
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote: * tree-inline.c (expand_call_inline): Allow the error to be flagged in early inline pass. * ipa-inline.c (inline_always_inline_functions): Pretend always_inline functions are inlined during failures to flag an error. * gcc.target/i386/inline_error.c: New test. This patch is OK if it passes testing. Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing because of always_inline functions being present that cannot be inlined and the compiler is now generating error messages. I will fix them and resend the patch. Thanks Sri Thanks for your patience! Honza
Re: GCC does not support *mmintrin.h with function specific opts
Hi Honza, I have isolated the ipa-inline.c part into a separate patch with a test and attached it here. The patch is simple. Could you please take a look? * ipa-inline.c (can_early_inline_edge_p): Flag an error when the function that cannot be inlined is target specific. * gcc.target/i386/inline_error.c: New test. Thanks Sri On Mon, Jun 10, 2013 at 4:10 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Tue, Jun 4, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Thu, May 23, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping, for review of ipa-inline.c change. Sri On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com wrote: On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote: --- ipa-inline.c (revision 198950) +++ ipa-inline.c (working copy) @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e) return false; } if (!can_inline_edge_p (e, true)) -return false; +{ + enum availability avail; + struct cgraph_node *callee += cgraph_function_or_thunk_node (e-callee, avail); + /* Flag an error when the inlining cannot happen because of target option + mismatch but the callee is marked as always_inline. In -O0 mode + this will go undetected because the error flagged in + expand_call_inline in tree-inline.c might not execute and the + inlining will not happen. Then, the linker could complain about a + missing body for the callee if it turned out that the callee was + also marked gnu_inline with extern inline keyword as bodies of such + functions are not generated. */ + if ((!optimize +|| flag_no_inline) This should be if ((!optimize || flag_no_inline) on one line. I'd prefer also the testcase for the ICEs, something like: /* Test case to check if AVX intrinsics and function specific target optimizations work together. Check by including x86intrin.h */ /* { dg-do compile } */ /* { dg-options -O2 -mno-sse -mno-avx } */ #include x86intrin.h __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } and another testcase that does: /* { dg-do compile } */ #pragma GCC target (mavx) /* { dg-error whatever } */ Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review it too (and Honza for ipa-inline.c?). Honza, could you please take a look at the ipa-inline.c fix? I will split the patches and submit after Honza's review. I will also make the changes mentioned. Thanks Sri Jakub * ipa-inline.c (can_early_inline_edge_p): Flag an error when the function that cannot be inlined is target specific. * gcc.target/i386/inline_error.c: New test. Index: testsuite/gcc.target/i386/inline_error.c === --- testsuite/gcc.target/i386/inline_error.c(revision 0) +++ testsuite/gcc.target/i386/inline_error.c(revision 0) @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options -O0 -mno-popcnt } */ + +inline int __attribute__ ((__gnu_inline__, __always_inline__, target(popcnt))) +foo () /* { dg-error inlining failed in call to extern gnu_inline .* target specific option mismatch } */ +{ + return 0; +} + +int bar() +{ + return foo (); +} /* { dg-error called from here } */ Index: ipa-inline.c === --- ipa-inline.c(revision 200034) +++ ipa-inline.c(working copy) @@ -374,7 +374,31 @@ can_early_inline_edge_p (struct cgraph_edge *e) return false; } if (!can_inline_edge_p (e, true)) -return false; +{ + enum availability avail; + struct cgraph_node *callee += cgraph_function_or_thunk_node (e-callee, avail); + /* Flag an error here when the inlining cannot happen because of target +option mismatch but the callee is marked as always_inline. +Otherwise, in -O0 mode this could go unreported because the error +flagged in expand_call_inline in tree-inline.c might not execute. +Then, the linker could complain about a missing body for the callee +if it turned out that the callee was also marked gnu_inline with +extern inline keyword as bodies of such functions are not +generated. */ + if (!optimize + e-inline_failed == CIF_TARGET_OPTION_MISMATCH + lookup_attribute (always_inline, DECL_ATTRIBUTES (callee-symbol.decl)) + lookup_attribute (gnu_inline, DECL_ATTRIBUTES (callee-symbol.decl)) + DECL_DECLARED_INLINE_P (callee-symbol.decl)) + { + error (inlining failed in call to extern gnu_inline %q+F: %s, +callee-symbol.decl, +cgraph_inline_failed_string
Re: GCC does not support *mmintrin.h with function specific opts
Ping. On Tue, Jun 4, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Thu, May 23, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping, for review of ipa-inline.c change. Sri On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com wrote: On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote: --- ipa-inline.c (revision 198950) +++ ipa-inline.c (working copy) @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e) return false; } if (!can_inline_edge_p (e, true)) -return false; +{ + enum availability avail; + struct cgraph_node *callee += cgraph_function_or_thunk_node (e-callee, avail); + /* Flag an error when the inlining cannot happen because of target option + mismatch but the callee is marked as always_inline. In -O0 mode + this will go undetected because the error flagged in + expand_call_inline in tree-inline.c might not execute and the + inlining will not happen. Then, the linker could complain about a + missing body for the callee if it turned out that the callee was + also marked gnu_inline with extern inline keyword as bodies of such + functions are not generated. */ + if ((!optimize +|| flag_no_inline) This should be if ((!optimize || flag_no_inline) on one line. I'd prefer also the testcase for the ICEs, something like: /* Test case to check if AVX intrinsics and function specific target optimizations work together. Check by including x86intrin.h */ /* { dg-do compile } */ /* { dg-options -O2 -mno-sse -mno-avx } */ #include x86intrin.h __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } and another testcase that does: /* { dg-do compile } */ #pragma GCC target (mavx) /* { dg-error whatever } */ Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review it too (and Honza for ipa-inline.c?). Honza, could you please take a look at the ipa-inline.c fix? I will split the patches and submit after Honza's review. I will also make the changes mentioned. Thanks Sri Jakub
Re: GCC does not support *mmintrin.h with function specific opts
Ping. On Thu, May 23, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping, for review of ipa-inline.c change. Sri On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com wrote: On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote: --- ipa-inline.c (revision 198950) +++ ipa-inline.c (working copy) @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e) return false; } if (!can_inline_edge_p (e, true)) -return false; +{ + enum availability avail; + struct cgraph_node *callee += cgraph_function_or_thunk_node (e-callee, avail); + /* Flag an error when the inlining cannot happen because of target option + mismatch but the callee is marked as always_inline. In -O0 mode + this will go undetected because the error flagged in + expand_call_inline in tree-inline.c might not execute and the + inlining will not happen. Then, the linker could complain about a + missing body for the callee if it turned out that the callee was + also marked gnu_inline with extern inline keyword as bodies of such + functions are not generated. */ + if ((!optimize +|| flag_no_inline) This should be if ((!optimize || flag_no_inline) on one line. I'd prefer also the testcase for the ICEs, something like: /* Test case to check if AVX intrinsics and function specific target optimizations work together. Check by including x86intrin.h */ /* { dg-do compile } */ /* { dg-options -O2 -mno-sse -mno-avx } */ #include x86intrin.h __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } and another testcase that does: /* { dg-do compile } */ #pragma GCC target (mavx) /* { dg-error whatever } */ Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review it too (and Honza for ipa-inline.c?). Honza, could you please take a look at the ipa-inline.c fix? I will split the patches and submit after Honza's review. I will also make the changes mentioned. Thanks Sri Jakub
Re: GCC does not support *mmintrin.h with function specific opts
Ping, for review of ipa-inline.c change. Sri On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com wrote: On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote: --- ipa-inline.c (revision 198950) +++ ipa-inline.c (working copy) @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e) return false; } if (!can_inline_edge_p (e, true)) -return false; +{ + enum availability avail; + struct cgraph_node *callee += cgraph_function_or_thunk_node (e-callee, avail); + /* Flag an error when the inlining cannot happen because of target option + mismatch but the callee is marked as always_inline. In -O0 mode + this will go undetected because the error flagged in + expand_call_inline in tree-inline.c might not execute and the + inlining will not happen. Then, the linker could complain about a + missing body for the callee if it turned out that the callee was + also marked gnu_inline with extern inline keyword as bodies of such + functions are not generated. */ + if ((!optimize +|| flag_no_inline) This should be if ((!optimize || flag_no_inline) on one line. I'd prefer also the testcase for the ICEs, something like: /* Test case to check if AVX intrinsics and function specific target optimizations work together. Check by including x86intrin.h */ /* { dg-do compile } */ /* { dg-options -O2 -mno-sse -mno-avx } */ #include x86intrin.h __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } and another testcase that does: /* { dg-do compile } */ #pragma GCC target (mavx) /* { dg-error whatever } */ Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review it too (and Honza for ipa-inline.c?). Honza, could you please take a look at the ipa-inline.c fix? I will split the patches and submit after Honza's review. I will also make the changes mentioned. Thanks Sri Jakub
Re: GCC does not support *mmintrin.h with function specific opts
On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote: On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote: --- ipa-inline.c (revision 198950) +++ ipa-inline.c (working copy) @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e) return false; } if (!can_inline_edge_p (e, true)) -return false; +{ + enum availability avail; + struct cgraph_node *callee += cgraph_function_or_thunk_node (e-callee, avail); + /* Flag an error when the inlining cannot happen because of target option + mismatch but the callee is marked as always_inline. In -O0 mode + this will go undetected because the error flagged in + expand_call_inline in tree-inline.c might not execute and the + inlining will not happen. Then, the linker could complain about a + missing body for the callee if it turned out that the callee was + also marked gnu_inline with extern inline keyword as bodies of such + functions are not generated. */ + if ((!optimize +|| flag_no_inline) This should be if ((!optimize || flag_no_inline) on one line. I'd prefer also the testcase for the ICEs, something like: /* Test case to check if AVX intrinsics and function specific target optimizations work together. Check by including x86intrin.h */ /* { dg-do compile } */ /* { dg-options -O2 -mno-sse -mno-avx } */ #include x86intrin.h __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } and another testcase that does: /* { dg-do compile } */ #pragma GCC target (mavx) /* { dg-error whatever } */ Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review it too (and Honza for ipa-inline.c?). Honza, could you please take a look at the ipa-inline.c fix? I will split the patches and submit after Honza's review. I will also make the changes mentioned. Thanks Sri Jakub
Re: GCC does not support *mmintrin.h with function specific opts
On Sat, May 18, 2013 at 6:00 AM, Sriraman Tallam tmsri...@google.com wrote: I don't really understand why you made the change to x86intrin.h instead of making it inside each *mmintrin.h header. The code would be the same size, it would let us include smmintrin.h directly if we wanted to, and x86intrin.h would also automatically work. Right, I should have done that instead! Yeah, definitely. For the standalone headers, which have currently __FEATURE__ guards inside of it, please replace it by the larger snippets involving #pragma, and in the x86intrin.h/immintrin.h headers include those unconditionally, instead of just if __FEATURE__ is defined. For the non-standalone headers (newer ones like avxintrin.h), replace the #ifdef __FEATURE__ in immintrin.h/x86intrin.h with larger snippets. * I did mostly as suggested except that even for avx and avx2 headers I did not see the harm in doing it in the header itself. AVX header did not have the #ifndef _AVXINTRIN_H_INCLUDED which I added before doing this. I have added test cases to show it is doing the right thing for avx. * I also found that when the caller to these intrinsics do not have the right target attribute, an error is raised in -O2 mode but not in -O0 mode. I have fixed this with a patch to ipa-inline.c, please see if this is alright. Test case intrinsics_5.c checks if an error is raised. * LZCNT needed to be handled which is done now. * common/config/i386/i386-common.c: Handle LZCNT. The above part is OK for mainline and release patches. Please commit LZCNT part as a separate patch to mainline. Also, please get middle-end part reviewed and committed before we proceed with target-dependent part. Thanks, Uros.
Re: GCC does not support *mmintrin.h with function specific opts
On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote: --- ipa-inline.c (revision 198950) +++ ipa-inline.c (working copy) @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e) return false; } if (!can_inline_edge_p (e, true)) -return false; +{ + enum availability avail; + struct cgraph_node *callee += cgraph_function_or_thunk_node (e-callee, avail); + /* Flag an error when the inlining cannot happen because of target option + mismatch but the callee is marked as always_inline. In -O0 mode + this will go undetected because the error flagged in + expand_call_inline in tree-inline.c might not execute and the + inlining will not happen. Then, the linker could complain about a + missing body for the callee if it turned out that the callee was + also marked gnu_inline with extern inline keyword as bodies of such + functions are not generated. */ + if ((!optimize +|| flag_no_inline) This should be if ((!optimize || flag_no_inline) on one line. I'd prefer also the testcase for the ICEs, something like: /* Test case to check if AVX intrinsics and function specific target optimizations work together. Check by including x86intrin.h */ /* { dg-do compile } */ /* { dg-options -O2 -mno-sse -mno-avx } */ #include x86intrin.h __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } and another testcase that does: /* { dg-do compile } */ #pragma GCC target (mavx) /* { dg-error whatever } */ Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review it too (and Honza for ipa-inline.c?). Jakub
Re: GCC does not support *mmintrin.h with function specific opts
Hi Jakub, I have taken your proposed changes and made patch for this. Please let me know what you think. I have changed only the headers mmintrin.h and x86intrin.h as that includes all the other headers. The builtins get enabled automatically when the pragma target is specified so need to do any thing to def_builtin. I have included 4 test case, where intrinsics_4.c uses your example with __mm256_and_ps. I had to fix a bug with lzcnt builtins in i386-common.c as that was not handled there. Thanks Sri On Wed, May 15, 2013 at 7:25 PM, Sriraman Tallam tmsri...@google.com wrote: On Tue, May 14, 2013 at 3:04 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote: When trying with -O2 -mno-avx: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__ ((__vector_size__ (32))); typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } we get bogus errors and ICE: tty2.c: In function '_mm256_and_ps': tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6 tty2.c: In function 'foo': tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32 tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*) ../../gcc/expr.c:3485 (I have added 1 || instead of your generate_builtins into i386.c (def_builtin)), that just shows that target attribute/pragma support still has very severe issues that need to be fixed, instead of papered around. Note, we ICE on: #pragma GCC target (mavx) That should be fixed too. Ok, I had a brief look at the above two issues. The first testcase has the problem that the ix86_previous_fndecl cache gets out of date. When set_cfun is called on _mm256_and_ps (with the implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps, TARGET_AVX is set to true, target reinited. Then set_cfun is called with NULL, we don't do anything. Later on #pragma GCC pop_options appears, sets !TARGET_AVX (as that is the new target_option_current_node). Next foo is being parsed, avx attribute is noticed, the same target node is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's target node is the same as foo's and so we don't do cl_target_restore_option at all, so !TARGET_AVX remains, while it should be set. That is the reason for the bogus inform etc. Fixed by resetting the ix86_previous_fndecl cache on any #pragma GCC target below. The #pragma GCC target (mavx) is also fixed below. The patch also includes the 1 || to enable building all builtins. We still ICE with: #0 fancy_abort (file=0x11d8fad ../../gcc/expr.c, line=316, function=0x11dada3 convert_move) at ../../gcc/diagnostic.c:1180 #1 0x00771c39 in convert_move (to=0x71b2df00, from=0x71b314e0, unsignedp=0) at ../../gcc/expr.c:316 #2 0x0078009f in store_expr (exp=0x719ab390, target=0x71b2df00, call_param_p=0, nontemporal=false) at ../../gcc/expr.c:5300 #3 0x0077eba1 in expand_assignment (to=0x71b35090, from=0x719ab390, nontemporal=false) at ../../gcc/expr.c:5025 on the first testcase. We don't ICE say on: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__ ((__vector_size__ (32))); typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif __m256 a[10], b[10], c[10]; void __attribute__((target (avx))) foo (void) { a[0] = _mm256_and_ps (b[0], c[0]); } The problem is that in the first testcase, the VAR_DECL c (guess also b and a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode (it has been laid out while -mno-avx has been the current) and also DECL_RTL which is a mem:BLK. Guess expr.c would need to special case TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM to the desired mode? --- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.0 +0100 +++ gcc/config/i386/i386-c.c2013-05-14
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, 16 May 2013, Sriraman Tallam wrote: Hi Jakub, I have taken your proposed changes and made patch for this. Please let me know what you think. I have changed only the headers mmintrin.h and x86intrin.h as that includes all the other headers. I don't really understand why you made the change to x86intrin.h instead of making it inside each *mmintrin.h header. The code would be the same size, it would let us include smmintrin.h directly if we wanted to, and x86intrin.h would also automatically work. -- Marc Glisse
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, May 16, 2013 at 3:55 PM, Marc Glisse marc.gli...@inria.fr wrote: On Thu, 16 May 2013, Sriraman Tallam wrote: Hi Jakub, I have taken your proposed changes and made patch for this. Please let me know what you think. I have changed only the headers mmintrin.h and x86intrin.h as that includes all the other headers. I don't really understand why you made the change to x86intrin.h instead of making it inside each *mmintrin.h header. The code would be the same size, it would let us include smmintrin.h directly if we wanted to, and x86intrin.h would also automatically work. Right, I should have done that instead! Sri -- Marc Glisse
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, May 16, 2013 at 04:00:53PM -0700, Sriraman Tallam wrote: On Thu, May 16, 2013 at 3:55 PM, Marc Glisse marc.gli...@inria.fr wrote: I don't really understand why you made the change to x86intrin.h instead of making it inside each *mmintrin.h header. The code would be the same size, it would let us include smmintrin.h directly if we wanted to, and x86intrin.h would also automatically work. Right, I should have done that instead! Yeah, definitely. For the standalone headers, which have currently __FEATURE__ guards inside of it, please replace it by the larger snippets involving #pragma, and in the x86intrin.h/immintrin.h headers include those unconditionally, instead of just if __FEATURE__ is defined. For the non-standalone headers (newer ones like avxintrin.h), replace the #ifdef __FEATURE__ in immintrin.h/x86intrin.h with larger snippets. Jakub
Re: GCC does not support *mmintrin.h with function specific opts
On Tue, May 14, 2013 at 3:04 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote: When trying with -O2 -mno-avx: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__ ((__vector_size__ (32))); typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } we get bogus errors and ICE: tty2.c: In function '_mm256_and_ps': tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6 tty2.c: In function 'foo': tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32 tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*) ../../gcc/expr.c:3485 (I have added 1 || instead of your generate_builtins into i386.c (def_builtin)), that just shows that target attribute/pragma support still has very severe issues that need to be fixed, instead of papered around. Note, we ICE on: #pragma GCC target (mavx) That should be fixed too. Ok, I had a brief look at the above two issues. The first testcase has the problem that the ix86_previous_fndecl cache gets out of date. When set_cfun is called on _mm256_and_ps (with the implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps, TARGET_AVX is set to true, target reinited. Then set_cfun is called with NULL, we don't do anything. Later on #pragma GCC pop_options appears, sets !TARGET_AVX (as that is the new target_option_current_node). Next foo is being parsed, avx attribute is noticed, the same target node is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's target node is the same as foo's and so we don't do cl_target_restore_option at all, so !TARGET_AVX remains, while it should be set. That is the reason for the bogus inform etc. Fixed by resetting the ix86_previous_fndecl cache on any #pragma GCC target below. The #pragma GCC target (mavx) is also fixed below. The patch also includes the 1 || to enable building all builtins. We still ICE with: #0 fancy_abort (file=0x11d8fad ../../gcc/expr.c, line=316, function=0x11dada3 convert_move) at ../../gcc/diagnostic.c:1180 #1 0x00771c39 in convert_move (to=0x71b2df00, from=0x71b314e0, unsignedp=0) at ../../gcc/expr.c:316 #2 0x0078009f in store_expr (exp=0x719ab390, target=0x71b2df00, call_param_p=0, nontemporal=false) at ../../gcc/expr.c:5300 #3 0x0077eba1 in expand_assignment (to=0x71b35090, from=0x719ab390, nontemporal=false) at ../../gcc/expr.c:5025 on the first testcase. We don't ICE say on: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__ ((__vector_size__ (32))); typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif __m256 a[10], b[10], c[10]; void __attribute__((target (avx))) foo (void) { a[0] = _mm256_and_ps (b[0], c[0]); } The problem is that in the first testcase, the VAR_DECL c (guess also b and a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode (it has been laid out while -mno-avx has been the current) and also DECL_RTL which is a mem:BLK. Guess expr.c would need to special case TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM to the desired mode? --- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.0 +0100 +++ gcc/config/i386/i386-c.c2013-05-14 11:46:50.773806894 +0200 @@ -369,20 +369,23 @@ ix86_pragma_target_parse (tree args, tre if (! args) { - cur_tree = ((pop_target) - ? pop_target - : target_option_default_node); + cur_tree = (pop_target ? pop_target : target_option_default_node); cl_target_option_restore (global_options, TREE_TARGET_OPTION (cur_tree)); } else { cur_tree = ix86_valid_target_attribute_tree (args); - if (!cur_tree) - return false; + if (!cur_tree || cur_tree ==
Re: GCC does not support *mmintrin.h with function specific opts
On Mon, May 13, 2013 at 11:46 PM, H.J. Lu hjl.to...@gmail.com wrote: On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 16 Apr 2013, Sriraman Tallam wrote: Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. Any new option needs documenting in invoke.texi. Added and new patch attached. Thanks Sri -- Joseph S. Myers jos...@codesourcery.com It looks good to me. But I can't approve it. * config/i386/i386.c (construct_container): Do not issue SSE return error for extern gnu_inline functions. (def_builtin): Do not generate builtins when -mno-generate-builtins is used. * doc/invoke.texi: Document option -mgenerate-builtins. * config/i386/i386.opt (mgenerate-builtins): New target option. * config/i386/i386-c.c (ix86_target_macros_internal): Define macro __ALL_ISA__ when generate_target_builtins is true. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * config/i386/lzcntintrin.h: Expose header when __ALL_ISA__ is defined. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/bmiintrin.h: Ditto. * config/i386/fma4intrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/mmintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. I think that the option should be named -mtarget-builtins. Since HJ is OK with this user-visible change, the patch is OK for mainline (with eventually renamed option). We also have an option to switch this new functionality off, and we are early enough in the development cycle to find out if anything is fundamentaly broken with this approach. BTW, does this patch address the request in PR57202? Thanks, Uros.
Re: GCC does not support *mmintrin.h with function specific opts
On Tue, May 14, 2013 at 08:58:55AM +0200, Uros Bizjak wrote: I think that the option should be named -mtarget-builtins. There shouldn't be an option for it at all. If constructing the builtins is slow (it is), we should just create them lazily, the *builtin_decl_{explicit,implicit}* APIs were a first step for that, plus we should build some gperf table of all the generic and all the target builtins and record prefixes used to find them (__builtin_, __sync_, __atomic_, what else?), then just teach FE that if they are looking up a symbol prefixed with one of these, they should also look it up in the perfect hash table and if found there, call some function to construct the builtin. Of course, this isn't a prerequisite of the changes you are looking for, but introducing an option that hopefully will be completely useless in a few months is just a bad idea. Since HJ is OK with this user-visible change, the patch is OK for mainline (with eventually renamed option). We also have an option to switch this new functionality off, and we are early enough in the development cycle to find out if anything is fundamentaly broken with this approach. BTW, does this patch address the request in PR57202? I'm strongly against the patch in its current form, it is a hack rather than a solution. I don't see how it could be properly tested, when say immintrin.h and x86intrin.h is still full of code like: #ifdef __AVX__ #include avxintrin.h #endif so when you just #include x86intrin.h rather than the few headers that were tested, you are out of luck. The right solution is really IMNSHO something along the lines of: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __AVXINTRIN_H_POP_OPTIONS__ #endif ... #ifdef __AVXINTRIN_H_POP_OPTIONS__ #pragma GCC pop_options #undef __AVXINTRIN_H_POP_OPTIONS__ #endif around the headers. As can be seen on say -O2 -mno-avx: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__((vector_size (32))); extern __v8sf a, b; extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) bar (int x) { a = b + 1.0f; return x + 1; } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif int __attribute__((target (avx2))) avx2 (int x) { return bar (x) + bar (x + 1); } int __attribute__((target (avx))) avx (int x) { return bar (x) + bar (x + 1); } int __attribute__((target (xop))) xop (int x) { return bar (x) + bar (x + 1); } int __attribute__((target (sse2))) sse2 (int x) { return bar (x) + bar (x + 1); } int nothing (int x) { return bar (x) + bar (x + 1); } the inliner happily inlines the avx target function into avx/avx2/xop targetted functions (correct), inlines it even into sse2 (something that should be fixed not to), and doesn't inline into nothing (IMHO correct, we want an error in that case, one shouldn't use avx intrinsics in say sse2 only targetted function (unless -mavx is used on command line, i.e. the check should always be if the caller's target set is equal or superset of callee's target set). When trying with -O2 -mno-avx: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__ ((__vector_size__ (32))); typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } we get bogus errors and ICE: tty2.c: In function '_mm256_and_ps': tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6 tty2.c: In function 'foo': tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32 tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*) ../../gcc/expr.c:3485 (I have added 1 || instead of your generate_builtins into i386.c (def_builtin)), that just shows that target attribute/pragma support still has very severe issues that need to be fixed, instead of papered around. Note, we ICE on: #pragma GCC target (mavx) That should be fixed too. Jakub
Re: GCC does not support *mmintrin.h with function specific opts
On Tue, May 14, 2013 at 10:39 AM, Jakub Jelinek ja...@redhat.com wrote: On Tue, May 14, 2013 at 08:58:55AM +0200, Uros Bizjak wrote: I think that the option should be named -mtarget-builtins. There shouldn't be an option for it at all. If constructing the builtins is slow (it is), we should just create them lazily, the *builtin_decl_{explicit,implicit}* APIs were a first step for that, plus we should build some gperf table of all the generic and all the target builtins and record prefixes used to find them (__builtin_, __sync_, __atomic_, what else?), then just teach FE that if they are looking up a symbol prefixed with one of these, they should also look it up in the perfect hash table and if found there, call some function to construct the builtin. Of course, this isn't a prerequisite of the changes you are looking for, but introducing an option that hopefully will be completely useless in a few months is just a bad idea. Since HJ is OK with this user-visible change, the patch is OK for mainline (with eventually renamed option). We also have an option to switch this new functionality off, and we are early enough in the development cycle to find out if anything is fundamentaly broken with this approach. BTW, does this patch address the request in PR57202? I'm strongly against the patch in its current form, it is a hack rather than a solution. I don't see how it could be properly tested, when say immintrin.h and x86intrin.h is still full of code like: #ifdef __AVX__ #include avxintrin.h #endif so when you just #include x86intrin.h rather than the few headers that were tested, you are out of luck. Jakub, thanks for your thorough analysis. It looks that the approach in the patch is fundamentally flawed and that infrastructure is not developed/fixed enough for alternative solution. Based on expressed concerns, I retract my previous approval. Thanks, Uros.
Re: GCC does not support *mmintrin.h with function specific opts
On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote: When trying with -O2 -mno-avx: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__ ((__vector_size__ (32))); typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } we get bogus errors and ICE: tty2.c: In function '_mm256_and_ps': tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6 tty2.c: In function 'foo': tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32 tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*) ../../gcc/expr.c:3485 (I have added 1 || instead of your generate_builtins into i386.c (def_builtin)), that just shows that target attribute/pragma support still has very severe issues that need to be fixed, instead of papered around. Note, we ICE on: #pragma GCC target (mavx) That should be fixed too. Ok, I had a brief look at the above two issues. The first testcase has the problem that the ix86_previous_fndecl cache gets out of date. When set_cfun is called on _mm256_and_ps (with the implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps, TARGET_AVX is set to true, target reinited. Then set_cfun is called with NULL, we don't do anything. Later on #pragma GCC pop_options appears, sets !TARGET_AVX (as that is the new target_option_current_node). Next foo is being parsed, avx attribute is noticed, the same target node is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's target node is the same as foo's and so we don't do cl_target_restore_option at all, so !TARGET_AVX remains, while it should be set. That is the reason for the bogus inform etc. Fixed by resetting the ix86_previous_fndecl cache on any #pragma GCC target below. The #pragma GCC target (mavx) is also fixed below. The patch also includes the 1 || to enable building all builtins. We still ICE with: #0 fancy_abort (file=0x11d8fad ../../gcc/expr.c, line=316, function=0x11dada3 convert_move) at ../../gcc/diagnostic.c:1180 #1 0x00771c39 in convert_move (to=0x71b2df00, from=0x71b314e0, unsignedp=0) at ../../gcc/expr.c:316 #2 0x0078009f in store_expr (exp=0x719ab390, target=0x71b2df00, call_param_p=0, nontemporal=false) at ../../gcc/expr.c:5300 #3 0x0077eba1 in expand_assignment (to=0x71b35090, from=0x719ab390, nontemporal=false) at ../../gcc/expr.c:5025 on the first testcase. We don't ICE say on: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__ ((__vector_size__ (32))); typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif __m256 a[10], b[10], c[10]; void __attribute__((target (avx))) foo (void) { a[0] = _mm256_and_ps (b[0], c[0]); } The problem is that in the first testcase, the VAR_DECL c (guess also b and a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode (it has been laid out while -mno-avx has been the current) and also DECL_RTL which is a mem:BLK. Guess expr.c would need to special case TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM to the desired mode? --- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.0 +0100 +++ gcc/config/i386/i386-c.c2013-05-14 11:46:50.773806894 +0200 @@ -369,20 +369,23 @@ ix86_pragma_target_parse (tree args, tre if (! args) { - cur_tree = ((pop_target) - ? pop_target - : target_option_default_node); + cur_tree = (pop_target ? pop_target : target_option_default_node); cl_target_option_restore (global_options, TREE_TARGET_OPTION (cur_tree)); } else { cur_tree = ix86_valid_target_attribute_tree (args); - if (!cur_tree) - return false; + if (!cur_tree || cur_tree == error_mark_node) + { + cl_target_option_restore (global_options, + TREE_TARGET_OPTION (prev_tree)); +
Re: GCC does not support *mmintrin.h with function specific opts
On Tue, May 14, 2013 at 12:04 PM, Jakub Jelinek ja...@redhat.com wrote: On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote: When trying with -O2 -mno-avx: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__ ((__vector_size__ (32))); typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif __m256 a, b, c; void __attribute__((target (avx))) foo (void) { a = _mm256_and_ps (b, c); } we get bogus errors and ICE: tty2.c: In function '_mm256_and_ps': tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6 tty2.c: In function 'foo': tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32 tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*) ../../gcc/expr.c:3485 (I have added 1 || instead of your generate_builtins into i386.c (def_builtin)), that just shows that target attribute/pragma support still has very severe issues that need to be fixed, instead of papered around. Note, we ICE on: #pragma GCC target (mavx) That should be fixed too. Ok, I had a brief look at the above two issues. The first testcase has the problem that the ix86_previous_fndecl cache gets out of date. When set_cfun is called on _mm256_and_ps (with the implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps, TARGET_AVX is set to true, target reinited. Then set_cfun is called with NULL, we don't do anything. Later on #pragma GCC pop_options appears, sets !TARGET_AVX (as that is the new target_option_current_node). Next foo is being parsed, avx attribute is noticed, the same target node is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's target node is the same as foo's and so we don't do cl_target_restore_option at all, so !TARGET_AVX remains, while it should be set. That is the reason for the bogus inform etc. Fixed by resetting the ix86_previous_fndecl cache on any #pragma GCC target below. The #pragma GCC target (mavx) is also fixed below. The patch also includes the 1 || to enable building all builtins. We still ICE with: #0 fancy_abort (file=0x11d8fad ../../gcc/expr.c, line=316, function=0x11dada3 convert_move) at ../../gcc/diagnostic.c:1180 #1 0x00771c39 in convert_move (to=0x71b2df00, from=0x71b314e0, unsignedp=0) at ../../gcc/expr.c:316 #2 0x0078009f in store_expr (exp=0x719ab390, target=0x71b2df00, call_param_p=0, nontemporal=false) at ../../gcc/expr.c:5300 #3 0x0077eba1 in expand_assignment (to=0x71b35090, from=0x719ab390, nontemporal=false) at ../../gcc/expr.c:5025 on the first testcase. We don't ICE say on: #ifndef __AVX__ #pragma GCC push_options #pragma GCC target(avx) #define __DISABLE_AVX__ #endif typedef float __v8sf __attribute__ ((__vector_size__ (32))); typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__)); extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); } #ifdef __DISABLE_AVX__ #pragma GCC pop_options #undef __DISABLE_AVX__ #endif __m256 a[10], b[10], c[10]; void __attribute__((target (avx))) foo (void) { a[0] = _mm256_and_ps (b[0], c[0]); } The problem is that in the first testcase, the VAR_DECL c (guess also b and a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode (it has been laid out while -mno-avx has been the current) and also DECL_RTL which is a mem:BLK. Guess expr.c would need to special case TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM to the desired mode? I think any entity with static storage (maybe even automatic storage) should have BLKmode (or rather its mode should not matter) and what matters is the mode we use for the access - that is, the mode of the MEM_REF we expand, for example. That TYPE_MODE is dynamic for vector types is a bad thing. It also means that type layout may be variable (consider PPC where double has different alignment in structs, so what layout would a struct with a vector_size(16) double vector get with -mvsx vs. -mno-vsx?) Richard. --- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.0 +0100 +++ gcc/config/i386/i386-c.c2013-05-14 11:46:50.773806894 +0200 @@ -369,20 +369,23 @@ ix86_pragma_target_parse (tree
Re: GCC does not support *mmintrin.h with function specific opts
On Tue, May 14, 2013 at 12:22:05PM +0200, Richard Biener wrote: The problem is that in the first testcase, the VAR_DECL c (guess also b and a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode (it has been laid out while -mno-avx has been the current) and also DECL_RTL which is a mem:BLK. Guess expr.c would need to special case TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM to the desired mode? --- gcc/expr.c.jj 2013-05-14 08:25:40.0 +0200 +++ gcc/expr.c 2013-05-14 12:55:46.331983406 +0200 @@ -9310,6 +9310,8 @@ expand_expr_real_1 (tree exp, rtx target set_curr_insn_location (saved_loc); if (REG_P (r) !REG_EXPR (r)) set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (exp), r); + if (MEM_P (r) GET_MODE (r) == BLKmode mode != BLKmode) + r = adjust_address (r, mode, 0); return r; } fixes the ICE on that testcase. I think any entity with static storage (maybe even automatic storage) should have BLKmode (or rather its mode should not matter) and what matters is the mode we use for the access - that is, the mode of the MEM_REF we expand, for example. There wasn't any MEM_REF in that case, just SSA_NAME with SSA_NAME_DEF_STMT of a load from VAR_DECL. That TYPE_MODE is dynamic for vector types is a bad thing. It also means that type layout may be variable (consider PPC where double has different alignment in structs, so what layout would a struct with a vector_size(16) double vector get with -mvsx vs. -mno-vsx?) I haven't introduced the dynamic TYPE_MODE, but getting rid of it would be terribly hard I'm afraid. Anyway, if struct layout depends on modes and handles the vector modes differently from BLKmode, then it is a target bug, it really should look at the types instead. Otherwise if you have such struct in a header, then -mvsx code will be ABI incompatible with -mno-vsx code using the same header. Jakub
Re: GCC does not support *mmintrin.h with function specific opts
Ping. On Thu, May 9, 2013 at 2:20 PM, Sriraman Tallam tmsri...@google.com wrote: cc:Diego On Tue, May 7, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Thu, May 2, 2013 at 3:51 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 16 Apr 2013, Sriraman Tallam wrote: Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. Any new option needs documenting in invoke.texi. Added and new patch attached. Thanks Sri -- Joseph S. Myers jos...@codesourcery.com
Re: GCC does not support *mmintrin.h with function specific opts
On Mon, May 13, 2013 at 2:21 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Thu, May 9, 2013 at 2:20 PM, Sriraman Tallam tmsri...@google.com wrote: cc:Diego On Tue, May 7, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Thu, May 2, 2013 at 3:51 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 16 Apr 2013, Sriraman Tallam wrote: Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. Any new option needs documenting in invoke.texi. Added and new patch attached. Thanks Sri -- Joseph S. Myers jos...@codesourcery.com It looks good to me. But I can't approve it. Add Uros, -- H.J.
Re: GCC does not support *mmintrin.h with function specific opts
cc:Diego On Tue, May 7, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Thu, May 2, 2013 at 3:51 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 16 Apr 2013, Sriraman Tallam wrote: Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. Any new option needs documenting in invoke.texi. Added and new patch attached. Thanks Sri -- Joseph S. Myers jos...@codesourcery.com
Re: GCC does not support *mmintrin.h with function specific opts
Ping. On Thu, May 2, 2013 at 3:51 PM, Sriraman Tallam tmsri...@google.com wrote: Ping. On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 16 Apr 2013, Sriraman Tallam wrote: Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. Any new option needs documenting in invoke.texi. Added and new patch attached. Thanks Sri -- Joseph S. Myers jos...@codesourcery.com
Re: GCC does not support *mmintrin.h with function specific opts
Ping. On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 16 Apr 2013, Sriraman Tallam wrote: Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. Any new option needs documenting in invoke.texi. Added and new patch attached. Thanks Sri -- Joseph S. Myers jos...@codesourcery.com
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers jos...@codesourcery.com wrote: On Tue, 16 Apr 2013, Sriraman Tallam wrote: Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. Any new option needs documenting in invoke.texi. Added and new patch attached. Thanks Sri -- Joseph S. Myers jos...@codesourcery.com * config/i386/i386.c (construct_container): Do not issue SSE return error for extern gnu_inline functions. (def_builtin): Do not generate builtins when -mno-generate-builtins is used. * doc/invoke.texi: Document option -mgenerate-builtins. * config/i386/i386.opt (mgenerate-builtins): New target option. * config/i386/i386-c.c (ix86_target_macros_internal): Define macro __ALL_ISA__ when generate_target_builtins is true. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * config/i386/lzcntintrin.h: Expose header when __ALL_ISA__ is defined. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/bmiintrin.h: Ditto. * config/i386/fma4intrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/mmintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. Index: config/i386/smmintrin.h === --- config/i386/smmintrin.h (revision 198212) +++ config/i386/smmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _SMMINTRIN_H_INCLUDED #define _SMMINTRIN_H_INCLUDED -#ifndef __SSE4_1__ +#if !defined (__SSE4_1__) !defined (__ALL_ISA__) # error SSE4.1 instruction set not enabled #else Index: config/i386/f16cintrin.h === --- config/i386/f16cintrin.h(revision 198212) +++ config/i386/f16cintrin.h(working copy) @@ -25,7 +25,7 @@ # error Never use f16intrin.h directly; include x86intrin.h or immintrin.h instead. #endif -#ifndef __F16C__ +#if !defined (__F16C__) !defined (__ALL_ISA__) # error F16C instruction set not enabled #else Index: config/i386/wmmintrin.h === --- config/i386/wmmintrin.h (revision 198212) +++ config/i386/wmmintrin.h (working copy) @@ -30,7 +30,7 @@ /* We need definitions from the SSE2 header file. */ #include emmintrin.h -#if !defined (__AES__) !defined (__PCLMUL__) +#if !defined (__AES__) !defined (__PCLMUL__) !defined (__ALL_ISA__) # error AES/PCLMUL instructions not enabled #else Index: config/i386/bmi2intrin.h === --- config/i386/bmi2intrin.h(revision 198212) +++ config/i386/bmi2intrin.h(working copy) @@ -25,7 +25,7 @@ # error Never use bmi2intrin.h directly; include x86intrin.h instead. #endif -#ifndef __BMI2__ +#if !defined (__BMI2__) !defined (__ALL_ISA__) # error BMI2 instruction set not enabled #endif /* __BMI2__ */ Index: config/i386/pmmintrin.h === --- config/i386/pmmintrin.h (revision 198212) +++ config/i386/pmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _PMMINTRIN_H_INCLUDED #define _PMMINTRIN_H_INCLUDED -#ifndef __SSE3__ +#if !defined (__SSE3__) !defined (__ALL_ISA__) # error SSE3 instruction set not enabled #else Index: config/i386/lzcntintrin.h === --- config/i386/lzcntintrin.h (revision 198212) +++ config/i386/lzcntintrin.h (working copy) @@ -25,7 +25,7 @@ # error Never use lzcntintrin.h directly; include x86intrin.h instead. #endif -#ifndef __LZCNT__ +#if !defined (__LZCNT__) !defined (__ALL_ISA__) # error LZCNT instruction is not enabled #endif /* __LZCNT__ */ Index: config/i386/tmmintrin.h === --- config/i386/tmmintrin.h (revision 198212) +++ config/i386/tmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _TMMINTRIN_H_INCLUDED #define _TMMINTRIN_H_INCLUDED -#ifndef __SSSE3__ +#if !defined (__SSSE3__) !defined (__ALL_ISA__) # error SSSE3 instruction set not enabled #else Index: config/i386/xmmintrin.h
Re: GCC does not support *mmintrin.h with function specific opts
On Tue, 16 Apr 2013, Sriraman Tallam wrote: Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. Any new option needs documenting in invoke.texi. -- Joseph S. Myers jos...@codesourcery.com
Re: GCC does not support *mmintrin.h with function specific opts
Ping. On Wed, Apr 17, 2013 at 7:13 PM, Sriraman Tallam tmsri...@google.com wrote: +HJ On Tue, Apr 16, 2013 at 1:54 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, I have attached an updated patch that addresses all the comments raised. On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote: I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. 1) this shouldn't be an option, either it can be made to work reliably, then it should be done always, or it can't, then it shouldn't be done Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. 2) have you verified that if you always generate all builtins, that the builtins not supported by the ISA selected from the command line are created with the right vector modes? This issue does not arise. When the target builtin is expanded, it is checked if the ISA support is there, either via function specific target opts or global target opts. If not, an error is raised. Test case added for this, please see intrinsic_4.c in patch. 3) the *intrin.h headers in the case where the guarding macro isn't defined should be surrounded by something like #ifndef __FMA4__ #pragma GCC push options #pragma GCC target(fma4) #endif ... #ifndef __FMA4__ #pragma GCC pop options #endif so that everything that is in the headers is compiled with the ISA in question I do not think this should be done because it will break the inlining ability of the header function and cause issues if the caller does not specify the required ISA. The fact that the header functions are marked extern __inline, with gnu_inline guarantees that a body will not be generated and they will be inlined. If the caller does not have the required ISA, appropriate errors will be raised. Test cases added, see intrinsics_1.c, intrinsics_2.c 4) what happens if you use the various vector types typedefed in the *intrin.h headers in code that doesn't support those ISAs? As TYPE_MODE for VECTOR_TYPE is a function call, perhaps it will just be handled as generic BLKmode vectors, which is desirable I think I checked some tests here. With -mno-sse for instance, vector types are not permitted in function arguments and return values and gcc raises a warning/error in each case. With return values, gcc always gives an error if a SSE register is required in a return value. I even fixed this message to not do it for functions marked as extern inline, with gnu_inline keyword as a body for them will not be generated. 5) what happens if you use a target builtin in a function not supporting the corresponding ISA, do you get proper error explaining what you are doing wrong? Yes, please sse intrinsic_4.c test in patch. 6) what happens if you use some intrinsics in a function not supporting the corresponding ISA? Dunno if the inliner chooses not to inline it and error out because it is always_inline, or what exactly will happen then Same deal here. The intrinsic function will, guaranteed, to be inlined into the caller which will be a corresponding builtin call. That builtin call will trigger an error if the ISA is not supported. Thanks Sri For all this you certainly need testcases. Jakub
Re: GCC does not support *mmintrin.h with function specific opts
+HJ On Tue, Apr 16, 2013 at 1:54 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, I have attached an updated patch that addresses all the comments raised. On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote: I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. 1) this shouldn't be an option, either it can be made to work reliably, then it should be done always, or it can't, then it shouldn't be done Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. 2) have you verified that if you always generate all builtins, that the builtins not supported by the ISA selected from the command line are created with the right vector modes? This issue does not arise. When the target builtin is expanded, it is checked if the ISA support is there, either via function specific target opts or global target opts. If not, an error is raised. Test case added for this, please see intrinsic_4.c in patch. 3) the *intrin.h headers in the case where the guarding macro isn't defined should be surrounded by something like #ifndef __FMA4__ #pragma GCC push options #pragma GCC target(fma4) #endif ... #ifndef __FMA4__ #pragma GCC pop options #endif so that everything that is in the headers is compiled with the ISA in question I do not think this should be done because it will break the inlining ability of the header function and cause issues if the caller does not specify the required ISA. The fact that the header functions are marked extern __inline, with gnu_inline guarantees that a body will not be generated and they will be inlined. If the caller does not have the required ISA, appropriate errors will be raised. Test cases added, see intrinsics_1.c, intrinsics_2.c 4) what happens if you use the various vector types typedefed in the *intrin.h headers in code that doesn't support those ISAs? As TYPE_MODE for VECTOR_TYPE is a function call, perhaps it will just be handled as generic BLKmode vectors, which is desirable I think I checked some tests here. With -mno-sse for instance, vector types are not permitted in function arguments and return values and gcc raises a warning/error in each case. With return values, gcc always gives an error if a SSE register is required in a return value. I even fixed this message to not do it for functions marked as extern inline, with gnu_inline keyword as a body for them will not be generated. 5) what happens if you use a target builtin in a function not supporting the corresponding ISA, do you get proper error explaining what you are doing wrong? Yes, please sse intrinsic_4.c test in patch. 6) what happens if you use some intrinsics in a function not supporting the corresponding ISA? Dunno if the inliner chooses not to inline it and error out because it is always_inline, or what exactly will happen then Same deal here. The intrinsic function will, guaranteed, to be inlined into the caller which will be a corresponding builtin call. That builtin call will trigger an error if the ISA is not supported. Thanks Sri For all this you certainly need testcases. Jakub
Re: GCC does not support *mmintrin.h with function specific opts
Hi, I have attached an updated patch that addresses all the comments raised. On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote: I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. 1) this shouldn't be an option, either it can be made to work reliably, then it should be done always, or it can't, then it shouldn't be done Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. 2) have you verified that if you always generate all builtins, that the builtins not supported by the ISA selected from the command line are created with the right vector modes? This issue does not arise. When the target builtin is expanded, it is checked if the ISA support is there, either via function specific target opts or global target opts. If not, an error is raised. Test case added for this, please see intrinsic_4.c in patch. 3) the *intrin.h headers in the case where the guarding macro isn't defined should be surrounded by something like #ifndef __FMA4__ #pragma GCC push options #pragma GCC target(fma4) #endif ... #ifndef __FMA4__ #pragma GCC pop options #endif so that everything that is in the headers is compiled with the ISA in question I do not think this should be done because it will break the inlining ability of the header function and cause issues if the caller does not specify the required ISA. The fact that the header functions are marked extern __inline, with gnu_inline guarantees that a body will not be generated and they will be inlined. If the caller does not have the required ISA, appropriate errors will be raised. Test cases added, see intrinsics_1.c, intrinsics_2.c 4) what happens if you use the various vector types typedefed in the *intrin.h headers in code that doesn't support those ISAs? As TYPE_MODE for VECTOR_TYPE is a function call, perhaps it will just be handled as generic BLKmode vectors, which is desirable I think I checked some tests here. With -mno-sse for instance, vector types are not permitted in function arguments and return values and gcc raises a warning/error in each case. With return values, gcc always gives an error if a SSE register is required in a return value. I even fixed this message to not do it for functions marked as extern inline, with gnu_inline keyword as a body for them will not be generated. 5) what happens if you use a target builtin in a function not supporting the corresponding ISA, do you get proper error explaining what you are doing wrong? Yes, please sse intrinsic_4.c test in patch. 6) what happens if you use some intrinsics in a function not supporting the corresponding ISA? Dunno if the inliner chooses not to inline it and error out because it is always_inline, or what exactly will happen then Same deal here. The intrinsic function will, guaranteed, to be inlined into the caller which will be a corresponding builtin call. That builtin call will trigger an error if the ISA is not supported. Thanks Sri For all this you certainly need testcases. Jakub * config/i386/i386.c (construct_container): Do not issue SSE return error for extern gnu_inline functions. (def_builtin): Do not generate builtins when -mno-generate-builtins is used. * config/i386/i386.opt (mgenerate-builtins): New target option. * config/i386/i386-c.c (ix86_target_macros_internal): Define macro __ALL_ISA__ when generate_target_builtins is true. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * config/i386/lzcntintrin.h: Expose header when __ALL_ISA__ is defined. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/bmiintrin.h: Ditto. * config/i386/fma4intrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/mmintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h:
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Apr 11, 2013 at 10:18 PM, Xinliang David Li davi...@google.com wrote: Great that it is already covered. I expect that with more convoluted code you'll ICE eventually. There is also a bugreport about exactly this issue which should be referenced when a fix is committed. Oh, and target maintainers disagree about if there is anything to fix and what it would take to fix it ... Note that ICC happily expands the intrinsics to SSEX code even if SSEX is not enabled. Richard. David On Thu, Apr 11, 2013 at 1:09 PM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com wrote: What is the compile time impact for turning it on? Code not including the intrinsic headers should not be affected too much. If the impact is small, why not turning on this option by default -- which seems to be the behavior of ICC. I will get back with data on this. With this option, all functions without the appropriate target options will be allowed to reference not supported builtins, without warnings or errors. Is it possible to delay the check till the builtin expansion time? Right now, an error is generated if a function accesses an unsupported builtin. Since the intrinsic functions are marked inline and call some target builtin, this will always be caught. To be clear, same example without the target attribute and with -mgenerate-builtins results in an error: #include smmintrin.h __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc -mgenerate-builtins smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option -m32 -msse4.1 return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); Sri thanks, David On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, *mmintrin headers does not work with function specific opts. Example 1: #include smmintrin.h __attribute__((target(sse4.1))) __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled # error SSE4.1 instruction set not enabled This error happens even though foo is marked sse4.1 There are multiple issues at play here. One, the headers are guarded by macros that are switched on only when the target specific options, like -msse4.1 in this case, are present in the command line. Also, the target specific builtins, like __builtin_ia32_movntdqa called by _mm_stream_load_si128, are exposed only in the presence of the appropriate target ISA option. I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. This feature will greatly benefit the function multiversioning usability too. Comments? Thanks Sri
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote: I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. 1) this shouldn't be an option, either it can be made to work reliably, then it should be done always, or it can't, then it shouldn't be done 2) have you verified that if you always generate all builtins, that the builtins not supported by the ISA selected from the command line are created with the right vector modes? 3) the *intrin.h headers in the case where the guarding macro isn't defined should be surrounded by something like #ifndef __FMA4__ #pragma GCC push options #pragma GCC target(fma4) #endif ... #ifndef __FMA4__ #pragma GCC pop options #endif so that everything that is in the headers is compiled with the ISA in question 4) what happens if you use the various vector types typedefed in the *intrin.h headers in code that doesn't support those ISAs? As TYPE_MODE for VECTOR_TYPE is a function call, perhaps it will just be handled as generic BLKmode vectors, which is desirable I think 5) what happens if you use a target builtin in a function not supporting the corresponding ISA, do you get proper error explaining what you are doing wrong? 6) what happens if you use some intrinsics in a function not supporting the corresponding ISA? Dunno if the inliner chooses not to inline it and error out because it is always_inline, or what exactly will happen then For all this you certainly need testcases. Jakub
Re: GCC does not support *mmintrin.h with function specific opts
On Fri, Apr 12, 2013 at 1:22 AM, Richard Biener richard.guent...@gmail.com wrote: On Thu, Apr 11, 2013 at 10:18 PM, Xinliang David Li davi...@google.com wrote: Great that it is already covered. I expect that with more convoluted code you'll ICE eventually. Yes, those are bugs that need to be fixed. There is also a bugreport about exactly this issue which should be referenced when a fix is committed. there is a related thread from 2 years ago : http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01417.html The patch was adding individual ISA macros eagerly which is not a good idea (as they are also used in user code). Sri's patch is certainly cleaner. Oh, and target maintainers disagree about if there is anything to fix and what it would take to fix it ... There are concerns about buggy function level target option handling, vector type lowering etc. It would be great if there are actual cases to demonstrate it. Note that ICC happily expands the intrinsics to SSEX code even if SSEX is not enabled. This seems handy, but may hide user errors. David Richard. David On Thu, Apr 11, 2013 at 1:09 PM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com wrote: What is the compile time impact for turning it on? Code not including the intrinsic headers should not be affected too much. If the impact is small, why not turning on this option by default -- which seems to be the behavior of ICC. I will get back with data on this. With this option, all functions without the appropriate target options will be allowed to reference not supported builtins, without warnings or errors. Is it possible to delay the check till the builtin expansion time? Right now, an error is generated if a function accesses an unsupported builtin. Since the intrinsic functions are marked inline and call some target builtin, this will always be caught. To be clear, same example without the target attribute and with -mgenerate-builtins results in an error: #include smmintrin.h __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc -mgenerate-builtins smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option -m32 -msse4.1 return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); Sri thanks, David On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, *mmintrin headers does not work with function specific opts. Example 1: #include smmintrin.h __attribute__((target(sse4.1))) __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled # error SSE4.1 instruction set not enabled This error happens even though foo is marked sse4.1 There are multiple issues at play here. One, the headers are guarded by macros that are switched on only when the target specific options, like -msse4.1 in this case, are present in the command line. Also, the target specific builtins, like __builtin_ia32_movntdqa called by _mm_stream_load_si128, are exposed only in the presence of the appropriate target ISA option. I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. This feature will greatly benefit the function multiversioning usability too. Comments? Thanks Sri
Re: GCC does not support *mmintrin.h with function specific opts
On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek ja...@redhat.com wrote: On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote: I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. 1) this shouldn't be an option, either it can be made to work reliably, then it should be done always, or it can't, then it shouldn't be done Except that if there is compile time/memory consumption concerns, users can use the option to turn it off. 2) have you verified that if you always generate all builtins, that the builtins not supported by the ISA selected from the command line are created with the right vector modes? 3) the *intrin.h headers in the case where the guarding macro isn't defined should be surrounded by something like #ifndef __FMA4__ #pragma GCC push options #pragma GCC target(fma4) #endif ... #ifndef __FMA4__ #pragma GCC pop options #endif so that everything that is in the headers is compiled with the ISA in question For the inline functions? (The caller functions should have the target option), or FE needs the target option properly set? 4) what happens if you use the various vector types typedefed in the *intrin.h headers in code that doesn't support those ISAs? As TYPE_MODE for VECTOR_TYPE is a function call, perhaps it will just be handled as generic BLKmode vectors, which is desirable I think Will the veclower pass deal with it (lowered into scalar operations)? 5) what happens if you use a target builtin in a function not supporting the corresponding ISA, do you get proper error explaining what you are doing wrong? Yes, in ix86_expand_builtin there is a check -- that is what Sri mentioned in a previous email. 6) what happens if you use some intrinsics in a function not supporting the corresponding ISA? Dunno if the inliner chooses not to inline it and error out because it is always_inline, or what exactly will happen then I think it may end up with errors about unsupported builtins as above thanks, David For all this you certainly need testcases. Jakub
GCC does not support *mmintrin.h with function specific opts
Hi, *mmintrin headers does not work with function specific opts. Example 1: #include smmintrin.h __attribute__((target(sse4.1))) __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled # error SSE4.1 instruction set not enabled This error happens even though foo is marked sse4.1 There are multiple issues at play here. One, the headers are guarded by macros that are switched on only when the target specific options, like -msse4.1 in this case, are present in the command line. Also, the target specific builtins, like __builtin_ia32_movntdqa called by _mm_stream_load_si128, are exposed only in the presence of the appropriate target ISA option. I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. This feature will greatly benefit the function multiversioning usability too. Comments? Thanks Sri Index: emmintrin.h === --- emmintrin.h (revision 197691) +++ emmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _EMMINTRIN_H_INCLUDED #define _EMMINTRIN_H_INCLUDED -#ifndef __SSE2__ +#if !defined (__SSE2__) !defined (__ALL_ISA__) # error SSE2 instruction set not enabled #else Index: fma4intrin.h === --- fma4intrin.h(revision 197691) +++ fma4intrin.h(working copy) @@ -28,7 +28,7 @@ #ifndef _FMA4INTRIN_H_INCLUDED #define _FMA4INTRIN_H_INCLUDED -#ifndef __FMA4__ +#if !defined (__FMA4__) !defined (__ALL_ISA__) # error FMA4 instruction set not enabled #else Index: lwpintrin.h === --- lwpintrin.h (revision 197691) +++ lwpintrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _LWPINTRIN_H_INCLUDED #define _LWPINTRIN_H_INCLUDED -#ifndef __LWP__ +#if !defined (__LWP__) !defined (__ALL_ISA__) # error LWP instruction set not enabled #else Index: xopintrin.h === --- xopintrin.h (revision 197691) +++ xopintrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _XOPMMINTRIN_H_INCLUDED #define _XOPMMINTRIN_H_INCLUDED -#ifndef __XOP__ +#if !defined (__XOP__) !defined (__ALL_ISA__) # error XOP instruction set not enabled #else Index: fmaintrin.h === --- fmaintrin.h (revision 197691) +++ fmaintrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _FMAINTRIN_H_INCLUDED #define _FMAINTRIN_H_INCLUDED -#ifndef __FMA__ +#if !defined (__FMA__) !defined (__ALL_ISA__) # error FMA instruction set not enabled #else Index: bmiintrin.h === --- bmiintrin.h (revision 197691) +++ bmiintrin.h (working copy) @@ -25,7 +25,7 @@ # error Never use bmiintrin.h directly; include x86intrin.h instead. #endif -#ifndef __BMI__ +#if !defined (__BMI__) !defined (__ALL_ISA__) # error BMI instruction set not enabled #endif /* __BMI__ */ Index: mmintrin.h === --- mmintrin.h (revision 197691) +++ mmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _MMINTRIN_H_INCLUDED #define _MMINTRIN_H_INCLUDED -#ifndef __MMX__ +#if !defined (__MMX__) !defined (__ALL_ISA__) # error MMX instruction set not enabled #else /* The Intel API is flexible enough that we must allow aliasing with other Index: nmmintrin.h === --- nmmintrin.h (revision 197691) +++ nmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _NMMINTRIN_H_INCLUDED #define _NMMINTRIN_H_INCLUDED -#ifndef __SSE4_2__ +#if !defined (__SSE4_2__) !defined (__ALL_ISA__) # error SSE4.2 instruction set not enabled #else /* We just include SSE4.1 header file. */ Index: tbmintrin.h === --- tbmintrin.h (revision 197691) +++ tbmintrin.h (working copy) @@ -25,7 +25,7 @@ # error Never use tbmintrin.h directly; include x86intrin.h instead. #endif -#ifndef __TBM__ +#if !defined (__TBM__) !defined (__ALL_ISA__) # error TBM instruction set not enabled #endif /* __TBM__ */ Index: f16cintrin.h === --- f16cintrin.h(revision 197691) +++ f16cintrin.h(working copy) @@ -25,7 +25,7 @@ # error Never use f16intrin.h directly; include x86intrin.h or immintrin.h instead. #endif -#ifndef __F16C__ +#if !defined (__F16C__) !defined (__ALL_ISA__) # error F16C instruction set not enabled #else Index: i386.opt === ---
Re: GCC does not support *mmintrin.h with function specific opts
What is the compile time impact for turning it on? Code not including the intrinsic headers should not be affected too much. If the impact is small, why not turning on this option by default -- which seems to be the behavior of ICC. With this option, all functions without the appropriate target options will be allowed to reference not supported builtins, without warnings or errors. Is it possible to delay the check till the builtin expansion time? thanks, David On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, *mmintrin headers does not work with function specific opts. Example 1: #include smmintrin.h __attribute__((target(sse4.1))) __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled # error SSE4.1 instruction set not enabled This error happens even though foo is marked sse4.1 There are multiple issues at play here. One, the headers are guarded by macros that are switched on only when the target specific options, like -msse4.1 in this case, are present in the command line. Also, the target specific builtins, like __builtin_ia32_movntdqa called by _mm_stream_load_si128, are exposed only in the presence of the appropriate target ISA option. I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. This feature will greatly benefit the function multiversioning usability too. Comments? Thanks Sri
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com wrote: What is the compile time impact for turning it on? Code not including the intrinsic headers should not be affected too much. If the impact is small, why not turning on this option by default -- which seems to be the behavior of ICC. I will get back with data on this. With this option, all functions without the appropriate target options will be allowed to reference not supported builtins, without warnings or errors. Is it possible to delay the check till the builtin expansion time? Right now, an error is generated if a function accesses an unsupported builtin. Since the intrinsic functions are marked inline and call some target builtin, this will always be caught. Sri thanks, David On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, *mmintrin headers does not work with function specific opts. Example 1: #include smmintrin.h __attribute__((target(sse4.1))) __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled # error SSE4.1 instruction set not enabled This error happens even though foo is marked sse4.1 There are multiple issues at play here. One, the headers are guarded by macros that are switched on only when the target specific options, like -msse4.1 in this case, are present in the command line. Also, the target specific builtins, like __builtin_ia32_movntdqa called by _mm_stream_load_si128, are exposed only in the presence of the appropriate target ISA option. I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. This feature will greatly benefit the function multiversioning usability too. Comments? Thanks Sri
Re: GCC does not support *mmintrin.h with function specific opts
On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com wrote: What is the compile time impact for turning it on? Code not including the intrinsic headers should not be affected too much. If the impact is small, why not turning on this option by default -- which seems to be the behavior of ICC. I will get back with data on this. With this option, all functions without the appropriate target options will be allowed to reference not supported builtins, without warnings or errors. Is it possible to delay the check till the builtin expansion time? Right now, an error is generated if a function accesses an unsupported builtin. Since the intrinsic functions are marked inline and call some target builtin, this will always be caught. To be clear, same example without the target attribute and with -mgenerate-builtins results in an error: #include smmintrin.h __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc -mgenerate-builtins smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option -m32 -msse4.1 return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); Sri thanks, David On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, *mmintrin headers does not work with function specific opts. Example 1: #include smmintrin.h __attribute__((target(sse4.1))) __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled # error SSE4.1 instruction set not enabled This error happens even though foo is marked sse4.1 There are multiple issues at play here. One, the headers are guarded by macros that are switched on only when the target specific options, like -msse4.1 in this case, are present in the command line. Also, the target specific builtins, like __builtin_ia32_movntdqa called by _mm_stream_load_si128, are exposed only in the presence of the appropriate target ISA option. I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. This feature will greatly benefit the function multiversioning usability too. Comments? Thanks Sri
Re: GCC does not support *mmintrin.h with function specific opts
Great that it is already covered. David On Thu, Apr 11, 2013 at 1:09 PM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam tmsri...@google.com wrote: On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com wrote: What is the compile time impact for turning it on? Code not including the intrinsic headers should not be affected too much. If the impact is small, why not turning on this option by default -- which seems to be the behavior of ICC. I will get back with data on this. With this option, all functions without the appropriate target options will be allowed to reference not supported builtins, without warnings or errors. Is it possible to delay the check till the builtin expansion time? Right now, an error is generated if a function accesses an unsupported builtin. Since the intrinsic functions are marked inline and call some target builtin, this will always be caught. To be clear, same example without the target attribute and with -mgenerate-builtins results in an error: #include smmintrin.h __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc -mgenerate-builtins smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option -m32 -msse4.1 return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); Sri thanks, David On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com wrote: Hi, *mmintrin headers does not work with function specific opts. Example 1: #include smmintrin.h __attribute__((target(sse4.1))) __m128i foo(__m128i *V) { return _mm_stream_load_si128(V); } $ g++ test.cc smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled # error SSE4.1 instruction set not enabled This error happens even though foo is marked sse4.1 There are multiple issues at play here. One, the headers are guarded by macros that are switched on only when the target specific options, like -msse4.1 in this case, are present in the command line. Also, the target specific builtins, like __builtin_ia32_movntdqa called by _mm_stream_load_si128, are exposed only in the presence of the appropriate target ISA option. I have attached a patch that fixes this. I have added an option -mgenerate-builtins that will do two things. It will define a macro __ALL_ISA__ which will expose the *intrin.h functions. It will also expose all the target specific builtins. -mgenerate-builtins will not affect code generation. This feature will greatly benefit the function multiversioning usability too. Comments? Thanks Sri