Re: GCC does not support *mmintrin.h with function specific opts

2013-06-18 Thread Sriraman Tallam
On Mon, Jun 17, 2013 at 10:49 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Fri, Jun 14, 2013 at 11:08 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Fri, Jun 14, 2013 at 1:43 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote:
   * tree-inline.c (expand_call_inline): Allow the error to be flagged
   in early inline pass.
   * ipa-inline.c (inline_always_inline_functions): Pretend 
 always_inline
   functions are inlined during failures to flag an error.
   * gcc.target/i386/inline_error.c: New test.

 This patch is OK if it passes testing.

 Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing
 because of always_inline functions being present that cannot be
 inlined and the compiler is now generating error messages. I will fix
 them and resend the patch.

 Quick look - pr43791.c is not expected to work at -O0, so skip -O0
 for example by guarding the whole thing with #if __OPTIMIZED__  0.
 Similar for pr44043.c.

 Seems like __OPTIMIZED__ is not defined in my config, dont know why. I
 see other tests checking for __OPTIMIZED.

 I fixed these two tests by adding a dg-prune-output at the end. Is
 that reasonable? I have attached the patch with all tests passing now.

 I have attached the latest patch with a small error in the previous
 patch fixed. I will submit this patch if there are no objections.

I have committed this patch.

Thanks
Sri


 Thanks
 Sri


 Thanks
 Sri


 That we didn't error at -O0 before is a bug.  Eventually I was suggesting
 to error if we end up outputting the body of an always_inline function,
 that is, any uses remain (including indirect calls or address-takens which
 is where we do not error right now).

 Richard.

 Thanks
 Sri

 Thanks for your patience!




 Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-17 Thread Sriraman Tallam
On Fri, Jun 14, 2013 at 11:08 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Fri, Jun 14, 2013 at 1:43 AM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote:
   * tree-inline.c (expand_call_inline): Allow the error to be flagged
   in early inline pass.
   * ipa-inline.c (inline_always_inline_functions): Pretend 
 always_inline
   functions are inlined during failures to flag an error.
   * gcc.target/i386/inline_error.c: New test.

 This patch is OK if it passes testing.

 Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing
 because of always_inline functions being present that cannot be
 inlined and the compiler is now generating error messages. I will fix
 them and resend the patch.

 Quick look - pr43791.c is not expected to work at -O0, so skip -O0
 for example by guarding the whole thing with #if __OPTIMIZED__  0.
 Similar for pr44043.c.

 Seems like __OPTIMIZED__ is not defined in my config, dont know why. I
 see other tests checking for __OPTIMIZED.

 I fixed these two tests by adding a dg-prune-output at the end. Is
 that reasonable? I have attached the patch with all tests passing now.

I have attached the latest patch with a small error in the previous
patch fixed. I will submit this patch if there are no objections.

Thanks
Sri


 Thanks
 Sri


 That we didn't error at -O0 before is a bug.  Eventually I was suggesting
 to error if we end up outputting the body of an always_inline function,
 that is, any uses remain (including indirect calls or address-takens which
 is where we do not error right now).

 Richard.

 Thanks
 Sri

 Thanks for your patience!




 Honza
* tree-inline.c (expand_call_inline): Allow the error to be flagged
in early inline pass.
* ipa-inline.c (inline_always_inline_functions): Pretend always_inline
functions are inlined during failures to flag an error.
* gcc.target/i386/inline_error.c: New test.
* gcc.c-torture/compile/pr44043.c: Fix test to expect an error.
* gcc.c-torture/compile/pr43791.c: Fix test to expect an error.

Index: testsuite/gcc.c-torture/compile/pr44043.c
===
--- testsuite/gcc.c-torture/compile/pr44043.c   (revision 200034)
+++ testsuite/gcc.c-torture/compile/pr44043.c   (working copy)
@@ -85,3 +85,5 @@ int raw_sendmsg(struct sock *sk, struct msghdr *ms
 {
   raw_send_hdrinc(sk, msg-msg_iov, len, (void *)0, msg-msg_flags);
 }
+
+/* { dg-prune-output (inlining failed in call to always_inline.*indirect 
function call with a yet undetermined callee|called from here) } */
Index: testsuite/gcc.c-torture/compile/pr43791.c
===
--- testsuite/gcc.c-torture/compile/pr43791.c   (revision 200034)
+++ testsuite/gcc.c-torture/compile/pr43791.c   (working copy)
@@ -1,3 +1,4 @@
+
 int owner();
 int clear();
 
@@ -18,4 +19,4 @@ void fasttrylock(void (*slowfn)()) {
 void trylock(void) {
  fasttrylock(slowtrylock);
 }
-
+/* { dg-prune-output (inlining failed in call to always_inline.*indirect 
function call with a yet undetermined callee|called from here) } */
Index: testsuite/gcc.target/i386/inline_error.c
===
--- testsuite/gcc.target/i386/inline_error.c(revision 0)
+++ testsuite/gcc.target/i386/inline_error.c(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options -O0 -mno-popcnt } */
+
+inline int __attribute__ ((__gnu_inline__, __always_inline__, 
target(popcnt)))
+foo () /* { dg-error inlining failed in call to always_inline .* target 
specific option mismatch } */
+{
+  return 0;
+}
+
+int bar()
+{
+  return foo (); /* { dg-error called from here } */
+}
Index: ipa-inline.c
===
--- ipa-inline.c(revision 200034)
+++ ipa-inline.c(working copy)
@@ -1911,7 +1911,15 @@ inline_always_inline_functions (struct cgraph_node
}
 
   if (!can_early_inline_edge_p (e))
-   continue;
+   {
+ /* Set inlined to true if the callee is marked always_inline but
+is not inlinable.  This will allow flagging an error later in
+expand_call_inline in tree-inline.c.  */
+ if (lookup_attribute (always_inline,
+DECL_ATTRIBUTES (callee-symbol.decl)) != NULL)
+   inlined = true;
+ continue;
+   }
 
   if (dump_file)
fprintf (dump_file,   Inlining %s into %s (always_inline).\n,
Index: tree-inline.c
===
--- tree-inline.c   (revision 200034)
+++ tree-inline.c   (working copy)
@@ -3905,8 +3905,6 @@ expand_call_inline (basic_block bb, gimple stmt, c
 for inlining, but 

Re: GCC does not support *mmintrin.h with function specific opts

2013-06-14 Thread Richard Biener
On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote:
   * tree-inline.c (expand_call_inline): Allow the error to be flagged
   in early inline pass.
   * ipa-inline.c (inline_always_inline_functions): Pretend always_inline
   functions are inlined during failures to flag an error.
   * gcc.target/i386/inline_error.c: New test.

 This patch is OK if it passes testing.

 Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing
 because of always_inline functions being present that cannot be
 inlined and the compiler is now generating error messages. I will fix
 them and resend the patch.

Quick look - pr43791.c is not expected to work at -O0, so skip -O0
for example by guarding the whole thing with #if __OPTIMIZED__  0.
Similar for pr44043.c.

That we didn't error at -O0 before is a bug.  Eventually I was suggesting
to error if we end up outputting the body of an always_inline function,
that is, any uses remain (including indirect calls or address-takens which
is where we do not error right now).

Richard.

 Thanks
 Sri

 Thanks for your patience!




 Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-14 Thread Jan Hubicka
 On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote:
  On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote:
* tree-inline.c (expand_call_inline): Allow the error to be flagged
in early inline pass.
* ipa-inline.c (inline_always_inline_functions): Pretend 
  always_inline
functions are inlined during failures to flag an error.
* gcc.target/i386/inline_error.c: New test.
 
  This patch is OK if it passes testing.
 
  Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing
  because of always_inline functions being present that cannot be
  inlined and the compiler is now generating error messages. I will fix
  them and resend the patch.
 
 Quick look - pr43791.c is not expected to work at -O0, so skip -O0
 for example by guarding the whole thing with #if __OPTIMIZED__  0.
 Similar for pr44043.c.
 
 That we didn't error at -O0 before is a bug.  Eventually I was suggesting
 to error if we end up outputting the body of an always_inline function,
 that is, any uses remain (including indirect calls or address-takens which
 is where we do not error right now).

Well, as dicussed earlier, this will make kernel folks crazy, since they 
define inline to be always inline and likes to take address of that function.
(of course that is crazy already)

Honza
 
 Richard.
 
  Thanks
  Sri
 
  Thanks for your patience!
 
 
 
 
  Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-14 Thread Jakub Jelinek
On Fri, Jun 14, 2013 at 10:43:34AM +0200, Richard Biener wrote:
 On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote:
  On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote:
* tree-inline.c (expand_call_inline): Allow the error to be flagged
in early inline pass.
* ipa-inline.c (inline_always_inline_functions): Pretend 
  always_inline
functions are inlined during failures to flag an error.
* gcc.target/i386/inline_error.c: New test.
 
  This patch is OK if it passes testing.
 
  Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing
  because of always_inline functions being present that cannot be
  inlined and the compiler is now generating error messages. I will fix
  them and resend the patch.
 
 Quick look - pr43791.c is not expected to work at -O0, so skip -O0
 for example by guarding the whole thing with #if __OPTIMIZED__  0.
 Similar for pr44043.c.
 
 That we didn't error at -O0 before is a bug.  Eventually I was suggesting
 to error if we end up outputting the body of an always_inline function,
 that is, any uses remain (including indirect calls or address-takens which
 is where we do not error right now).

Well, for indirect calls/address-takens and gnu_inline style extern inline
always_inline, it should be fine if we just emit calls to the library
function, otherwise you'd break a lot of code, because glibc uses
always_inline heavily for -D_FORTIFY_SOURCE wrapper inlines, open argument
checking etc.  But that body supposedly doesn't inherit the always_inline
attribute.

Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-14 Thread Sriraman Tallam
On Fri, Jun 14, 2013 at 1:43 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Fri, Jun 14, 2013 at 4:52 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote:
   * tree-inline.c (expand_call_inline): Allow the error to be flagged
   in early inline pass.
   * ipa-inline.c (inline_always_inline_functions): Pretend 
 always_inline
   functions are inlined during failures to flag an error.
   * gcc.target/i386/inline_error.c: New test.

 This patch is OK if it passes testing.

 Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing
 because of always_inline functions being present that cannot be
 inlined and the compiler is now generating error messages. I will fix
 them and resend the patch.

 Quick look - pr43791.c is not expected to work at -O0, so skip -O0
 for example by guarding the whole thing with #if __OPTIMIZED__  0.
 Similar for pr44043.c.

Seems like __OPTIMIZED__ is not defined in my config, dont know why. I
see other tests checking for __OPTIMIZED.

I fixed these two tests by adding a dg-prune-output at the end. Is
that reasonable? I have attached the patch with all tests passing now.

Thanks
Sri


 That we didn't error at -O0 before is a bug.  Eventually I was suggesting
 to error if we end up outputting the body of an always_inline function,
 that is, any uses remain (including indirect calls or address-takens which
 is where we do not error right now).

 Richard.

 Thanks
 Sri

 Thanks for your patience!




 Honza
* tree-inline.c (expand_call_inline): Allow the error to be flagged
in early inline pass.
* ipa-inline.c (inline_always_inline_functions): Pretend always_inline
functions are inlined during failures to flag an error.
* gcc.target/i386/inline_error.c: New test.
* gcc.c-torture/compile/pr44043.c: Fix test to expect an error.
* gcc.c-torture/compile/pr43791.c: Fix test to expect an error.

Index: tree-inline.c
===
--- tree-inline.c   (revision 200034)
+++ tree-inline.c   (working copy)
@@ -3905,8 +3905,6 @@ expand_call_inline (basic_block bb, gimple stmt, c
 for inlining, but we can't do that because frontends overwrite
 the body.  */
   !cg_edge-callee-local.redefined_extern_inline
- /* Avoid warnings during early inline pass. */
-  cgraph_global_info_ready
  /* PR 20090218-1_0.c. Body can be provided by another module. */
   (reason != CIF_BODY_NOT_AVAILABLE || !flag_generate_lto))
{
Index: ipa-inline.c
===
--- ipa-inline.c(revision 200034)
+++ ipa-inline.c(working copy)
@@ -1911,7 +1911,15 @@ inline_always_inline_functions (struct cgraph_node
}
 
   if (!can_early_inline_edge_p (e))
-   continue;
+   {
+ /* Set inlined to true if the callee is marked always_inline but
+is not inlinable.  This will allow flagging an error later in
+expand_call_inline in tree-inline.c.  */
+ if (lookup_attribute (always_inline,
+DECL_ATTRIBUTES (callee-symbol.decl)) != NULL)
+   inlined = true;
+ continue;
+   }
 
   if (dump_file)
fprintf (dump_file,   Inlining %s into %s (always_inline).\n,
Index: testsuite/gcc.target/i386/inline_error.c
===
--- testsuite/gcc.target/i386/inline_error.c(revision 0)
+++ testsuite/gcc.target/i386/inline_error.c(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options -O0 -mno-popcnt } */
+
+inline int __attribute__ ((__gnu_inline__, __always_inline__, 
target(popcnt)))
+foo () /* { dg-error inlining failed in call to always_inline .* target 
specific option mismatch } */
+{
+  return 0;
+}
+
+int bar()
+{
+  return foo (); /* { dg-error called from here } */
+}
Index: testsuite/gcc.c-torture/compile/pr44043.c
===
--- testsuite/gcc.c-torture/compile/pr44043.c   (revision 200034)
+++ testsuite/gcc.c-torture/compile/pr44043.c   (working copy)
@@ -85,3 +85,5 @@ int raw_sendmsg(struct sock *sk, struct msghdr *ms
 {
   raw_send_hdrinc(sk, msg-msg_iov, len, (void *)0, msg-msg_flags);
 }
+
+/* { dg-prune-output (inlining failed in call to always_inline.*indirect 
function call with a yet undetermined callee|called from here) } */
Index: testsuite/gcc.c-torture/compile/pr43791.c
===
--- testsuite/gcc.c-torture/compile/pr43791.c   (revision 200034)
+++ testsuite/gcc.c-torture/compile/pr43791.c   (working copy)
@@ -1,3 +1,4 @@
+
 int owner();
 int clear();
 
@@ -6,16 +7,16 @@ static void fixup() {
 }
 
 inline __attribute__ ((always_inline))
-void slowtrylock(void) {

Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Xinliang David Li
Can you create a helper function to flag the error and perhaps also
put that check inside can_inline_edge_p ?

David


On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com wrote:
 Hi Honza,

I have isolated the ipa-inline.c part into a separate patch with a
 test and attached it here. The patch  is simple. Could you please take
 a look?

 * ipa-inline.c (can_early_inline_edge_p): Flag an error when
 the function that cannot be inlined is target specific.
 * gcc.target/i386/inline_error.c: New test.



 Thanks
 Sri

 On Mon, Jun 10, 2013 at 4:10 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Tue, Jun 4, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Thu, May 23, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com 
 wrote:
 Ping, for review of ipa-inline.c change.

 Sri

 On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote:
 --- ipa-inline.c  (revision 198950)
 +++ ipa-inline.c  (working copy)
 @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e)
return false;
  }
if (!can_inline_edge_p (e, true))
 -return false;
 +{
 +  enum availability avail;
 +  struct cgraph_node *callee
 += cgraph_function_or_thunk_node (e-callee, avail);
 +  /* Flag an error when the inlining cannot happen because of 
 target option
 +  mismatch but the callee is marked as always_inline.  In -O0 
 mode
 +  this will go undetected because the error flagged in
 +  expand_call_inline in tree-inline.c might not execute and the
 +  inlining will not happen.  Then, the linker could complain about 
 a
 +  missing body for the callee if it turned out that the callee was
 +  also marked gnu_inline with extern inline keyword as bodies of 
 such
 +  functions are not generated.  */
 +  if ((!optimize
 +|| flag_no_inline)

 This should be if ((!optimize || flag_no_inline) on one line.

 I'd prefer also the testcase for the ICEs, something like:

 /* Test case to check if AVX intrinsics and function specific target
optimizations work together.  Check by including x86intrin.h  */

 /* { dg-do compile } */
 /* { dg-options -O2 -mno-sse -mno-avx } */

 #include x86intrin.h

 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }

 and another testcase that does:

 /* { dg-do compile } */
 #pragma GCC target (mavx) /* { dg-error whatever } */

 Otherwise it looks good to me, but I'd prefer the i?86 maintainers to 
 review
 it too (and Honza for ipa-inline.c?).

 Honza, could you please take a look at the ipa-inline.c fix? I will
 split the patches and submit after Honza's review. I will also make
 the changes mentioned.

 Thanks
 Sri



 Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Jan Hubicka
 Can you create a helper function to flag the error and perhaps also
 put that check inside can_inline_edge_p ?
 
 David
 
 
 On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com wrote:
  Hi Honza,
 
 I have isolated the ipa-inline.c part into a separate patch with a
  test and attached it here. The patch  is simple. Could you please take
  a look?
 
  * ipa-inline.c (can_early_inline_edge_p): Flag an error when
  the function that cannot be inlined is target specific.
  * gcc.target/i386/inline_error.c: New test.

Sorry for taking ages to look at the patch, I was too hooked into other 
problems.
I also think can_early_inline_edge_p should not produce diagnostic - it is 
supposed
to be predicate.

So your problem is that the hard worker in tree-inline is not called at -O0
and thus errors are not output? I would suggest arranging 
inline_always_inline_functions
to return true even if inlining failed and thus making inline_calls to be 
called.

Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Sriraman Tallam
On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Can you create a helper function to flag the error and perhaps also
 put that check inside can_inline_edge_p ?

 David


 On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com wrote:
  Hi Honza,
 
 I have isolated the ipa-inline.c part into a separate patch with a
  test and attached it here. The patch  is simple. Could you please take
  a look?
 
  * ipa-inline.c (can_early_inline_edge_p): Flag an error when
  the function that cannot be inlined is target specific.
  * gcc.target/i386/inline_error.c: New test.

 Sorry for taking ages to look at the patch, I was too hooked into other 
 problems.
 I also think can_early_inline_edge_p should not produce diagnostic - it is 
 supposed
 to be predicate.

 So your problem is that the hard worker in tree-inline is not called at -O0
 and thus errors are not output? I would suggest arranging 
 inline_always_inline_functions
 to return true even if inlining failed and thus making inline_calls to be 
 called.

Thanks Honza!  Yes, that is the problem.  Should I just make it return
true for these special conditions (TARGET_MISMATCH + gnu_inline +
always_inline + ...)?

Thanks,
Sri


 Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Jan Hubicka
 On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Can you create a helper function to flag the error and perhaps also
  put that check inside can_inline_edge_p ?
 
  David
 
 
  On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com 
  wrote:
   Hi Honza,
  
  I have isolated the ipa-inline.c part into a separate patch with a
   test and attached it here. The patch  is simple. Could you please take
   a look?
  
   * ipa-inline.c (can_early_inline_edge_p): Flag an error when
   the function that cannot be inlined is target specific.
   * gcc.target/i386/inline_error.c: New test.
 
  Sorry for taking ages to look at the patch, I was too hooked into other 
  problems.
  I also think can_early_inline_edge_p should not produce diagnostic - it is 
  supposed
  to be predicate.
 
  So your problem is that the hard worker in tree-inline is not called at -O0
  and thus errors are not output? I would suggest arranging 
  inline_always_inline_functions
  to return true even if inlining failed and thus making inline_calls to be 
  called.
 
 Thanks Honza!  Yes, that is the problem.  Should I just make it return
 true for these special conditions (TARGET_MISMATCH + gnu_inline +
 always_inline + ...)?

Can't it just return true if there is any alwaysinline call?  I think if inline 
fails,
we always want to error, right?

Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Sriraman Tallam
On Thu, Jun 13, 2013 at 10:19 AM, Jan Hubicka hubi...@ucw.cz wrote:
 On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Can you create a helper function to flag the error and perhaps also
  put that check inside can_inline_edge_p ?
 
  David
 
 
  On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com 
  wrote:
   Hi Honza,
  
  I have isolated the ipa-inline.c part into a separate patch with a
   test and attached it here. The patch  is simple. Could you please take
   a look?
  
   * ipa-inline.c (can_early_inline_edge_p): Flag an error when
   the function that cannot be inlined is target specific.
   * gcc.target/i386/inline_error.c: New test.
 
  Sorry for taking ages to look at the patch, I was too hooked into other 
  problems.
  I also think can_early_inline_edge_p should not produce diagnostic - it is 
  supposed
  to be predicate.
 
  So your problem is that the hard worker in tree-inline is not called at -O0
  and thus errors are not output? I would suggest arranging 
  inline_always_inline_functions
  to return true even if inlining failed and thus making inline_calls to be 
  called.

 Thanks Honza!  Yes, that is the problem.  Should I just make it return
 true for these special conditions (TARGET_MISMATCH + gnu_inline +
 always_inline + ...)?

 Can't it just return true if there is any alwaysinline call?  I think if 
 inline fails,
 we always want to error, right?

Ok, patch attached that does this. Please let me know what you think.

Thanks
Sri


 Honza
* tree-inline.c (expand_call_inline): Allow the error to be flagged
in early inline pass.
* ipa-inline.c (inline_always_inline_functions): Pretend always_inline
functions are inlined during failures to flag an error.
* gcc.target/i386/inline_error.c: New test.

Index: tree-inline.c
===
--- tree-inline.c   (revision 200034)
+++ tree-inline.c   (working copy)
@@ -3905,8 +3905,6 @@ expand_call_inline (basic_block bb, gimple stmt, c
 for inlining, but we can't do that because frontends overwrite
 the body.  */
   !cg_edge-callee-local.redefined_extern_inline
- /* Avoid warnings during early inline pass. */
-  cgraph_global_info_ready
  /* PR 20090218-1_0.c. Body can be provided by another module. */
   (reason != CIF_BODY_NOT_AVAILABLE || !flag_generate_lto))
{
Index: ipa-inline.c
===
--- ipa-inline.c(revision 200034)
+++ ipa-inline.c(working copy)
@@ -1911,7 +1911,15 @@ inline_always_inline_functions (struct cgraph_node
}
 
   if (!can_early_inline_edge_p (e))
-   continue;
+   {
+ /* Set inlined to true if the callee is marked always_inline but
+is not inlinable.  This will allow flagging an error later in
+expand_call_inline in tree-inline.c.  */
+ if (lookup_attribute (always_inline,
+DECL_ATTRIBUTES (callee-symbol.decl)) != NULL)
+   inlined = true;
+ continue;
+   }
 
   if (dump_file)
fprintf (dump_file,   Inlining %s into %s (always_inline).\n,
Index: testsuite/gcc.target/i386/inline_error.c
===
--- testsuite/gcc.target/i386/inline_error.c(revision 0)
+++ testsuite/gcc.target/i386/inline_error.c(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options -O0 -mno-popcnt } */
+
+inline int __attribute__ ((__gnu_inline__, __always_inline__, 
target(popcnt)))
+foo () /* { dg-error inlining failed in call to always_inline .* target 
specific option mismatch } */
+{
+  return 0;
+}
+
+int bar()
+{
+  return foo (); /* { dg-error called from here } */
+}


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Xinliang David Li
If you want to flag errors for all possible wrongly used always_inline
attribute, should this change be done in can_inline_edge_p? Or keep
your current change, but also add a warning (something like 'always
inline function is ignored etc') in inline_always_inline_functions
when inline transformation can not meaningfully give a useful error.

David

On Thu, Jun 13, 2013 at 11:33 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Jun 13, 2013 at 10:19 AM, Jan Hubicka hubi...@ucw.cz wrote:
 On Thu, Jun 13, 2013 at 10:07 AM, Jan Hubicka hubi...@ucw.cz wrote:
  Can you create a helper function to flag the error and perhaps also
  put that check inside can_inline_edge_p ?
 
  David
 
 
  On Wed, Jun 12, 2013 at 6:00 PM, Sriraman Tallam tmsri...@google.com 
  wrote:
   Hi Honza,
  
  I have isolated the ipa-inline.c part into a separate patch with a
   test and attached it here. The patch  is simple. Could you please take
   a look?
  
   * ipa-inline.c (can_early_inline_edge_p): Flag an error when
   the function that cannot be inlined is target specific.
   * gcc.target/i386/inline_error.c: New test.
 
  Sorry for taking ages to look at the patch, I was too hooked into other 
  problems.
  I also think can_early_inline_edge_p should not produce diagnostic - it 
  is supposed
  to be predicate.
 
  So your problem is that the hard worker in tree-inline is not called at 
  -O0
  and thus errors are not output? I would suggest arranging 
  inline_always_inline_functions
  to return true even if inlining failed and thus making inline_calls to be 
  called.

 Thanks Honza!  Yes, that is the problem.  Should I just make it return
 true for these special conditions (TARGET_MISMATCH + gnu_inline +
 always_inline + ...)?

 Can't it just return true if there is any alwaysinline call?  I think if 
 inline fails,
 we always want to error, right?

 Ok, patch attached that does this. Please let me know what you think.

 Thanks
 Sri


 Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Jan Hubicka
   * tree-inline.c (expand_call_inline): Allow the error to be flagged
   in early inline pass.
   * ipa-inline.c (inline_always_inline_functions): Pretend always_inline
   functions are inlined during failures to flag an error.
   * gcc.target/i386/inline_error.c: New test.
This patch is OK if it passes testing.
Thanks for your patience!

Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Jan Hubicka
 If you want to flag errors for all possible wrongly used always_inline
 attribute, should this change be done in can_inline_edge_p? Or keep
 your current change, but also add a warning (something like 'always
 inline function is ignored etc') in inline_always_inline_functions
 when inline transformation can not meaningfully give a useful error.

We have the CIF error codes to be able to give useful diagnostic at
transformation time.  I think it is better to have all the diagnostic output at
one place unless we have really good reasons to fork it.  We are not losing any
precision here, right?

Sure, other option would be to move all alwaysinline diagnostic into
inline_always_inline_functions and remove the code path in tree-inline.  The
warnings however are quite meaningfully places in tree-inline, because we want
to warn only after all the inlining algorithms has finished and inlining really
did not happen. They can be moved out there if we walk the edges at end and
output diagnostic, but I do not see anything really wrong with their current
location.

Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Xinliang David Li
yes, what you said makes sense.

thanks,

David

On Thu, Jun 13, 2013 at 12:45 PM, Jan Hubicka hubi...@ucw.cz wrote:
 If you want to flag errors for all possible wrongly used always_inline
 attribute, should this change be done in can_inline_edge_p? Or keep
 your current change, but also add a warning (something like 'always
 inline function is ignored etc') in inline_always_inline_functions
 when inline transformation can not meaningfully give a useful error.

 We have the CIF error codes to be able to give useful diagnostic at
 transformation time.  I think it is better to have all the diagnostic output 
 at
 one place unless we have really good reasons to fork it.  We are not losing 
 any
 precision here, right?

 Sure, other option would be to move all alwaysinline diagnostic into
 inline_always_inline_functions and remove the code path in tree-inline.  The
 warnings however are quite meaningfully places in tree-inline, because we want
 to warn only after all the inlining algorithms has finished and inlining 
 really
 did not happen. They can be moved out there if we walk the edges at end and
 output diagnostic, but I do not see anything really wrong with their current
 location.

 Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-13 Thread Sriraman Tallam
On Thu, Jun 13, 2013 at 12:40 PM, Jan Hubicka hubi...@ucw.cz wrote:
   * tree-inline.c (expand_call_inline): Allow the error to be flagged
   in early inline pass.
   * ipa-inline.c (inline_always_inline_functions): Pretend always_inline
   functions are inlined during failures to flag an error.
   * gcc.target/i386/inline_error.c: New test.

 This patch is OK if it passes testing.

Two tests gcc.c-torture/compile/pr43791.c and pr44043.c are failing
because of always_inline functions being present that cannot be
inlined and the compiler is now generating error messages. I will fix
them and resend the patch.

Thanks
Sri

 Thanks for your patience!




 Honza


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-12 Thread Sriraman Tallam
Hi Honza,

   I have isolated the ipa-inline.c part into a separate patch with a
test and attached it here. The patch  is simple. Could you please take
a look?

* ipa-inline.c (can_early_inline_edge_p): Flag an error when
the function that cannot be inlined is target specific.
* gcc.target/i386/inline_error.c: New test.



Thanks
Sri

On Mon, Jun 10, 2013 at 4:10 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Tue, Jun 4, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Thu, May 23, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping, for review of ipa-inline.c change.

 Sri

 On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote:
 --- ipa-inline.c  (revision 198950)
 +++ ipa-inline.c  (working copy)
 @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e)
return false;
  }
if (!can_inline_edge_p (e, true))
 -return false;
 +{
 +  enum availability avail;
 +  struct cgraph_node *callee
 += cgraph_function_or_thunk_node (e-callee, avail);
 +  /* Flag an error when the inlining cannot happen because of 
 target option
 +  mismatch but the callee is marked as always_inline.  In -O0 mode
 +  this will go undetected because the error flagged in
 +  expand_call_inline in tree-inline.c might not execute and the
 +  inlining will not happen.  Then, the linker could complain about a
 +  missing body for the callee if it turned out that the callee was
 +  also marked gnu_inline with extern inline keyword as bodies of 
 such
 +  functions are not generated.  */
 +  if ((!optimize
 +|| flag_no_inline)

 This should be if ((!optimize || flag_no_inline) on one line.

 I'd prefer also the testcase for the ICEs, something like:

 /* Test case to check if AVX intrinsics and function specific target
optimizations work together.  Check by including x86intrin.h  */

 /* { dg-do compile } */
 /* { dg-options -O2 -mno-sse -mno-avx } */

 #include x86intrin.h

 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }

 and another testcase that does:

 /* { dg-do compile } */
 #pragma GCC target (mavx) /* { dg-error whatever } */

 Otherwise it looks good to me, but I'd prefer the i?86 maintainers to 
 review
 it too (and Honza for ipa-inline.c?).

 Honza, could you please take a look at the ipa-inline.c fix? I will
 split the patches and submit after Honza's review. I will also make
 the changes mentioned.

 Thanks
 Sri



 Jakub
* ipa-inline.c (can_early_inline_edge_p): Flag an error when
the function that cannot be inlined is target specific.
* gcc.target/i386/inline_error.c: New test.

Index: testsuite/gcc.target/i386/inline_error.c
===
--- testsuite/gcc.target/i386/inline_error.c(revision 0)
+++ testsuite/gcc.target/i386/inline_error.c(revision 0)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options -O0 -mno-popcnt } */
+
+inline int __attribute__ ((__gnu_inline__, __always_inline__, 
target(popcnt)))
+foo () /* { dg-error inlining failed in call to extern gnu_inline .* target 
specific option mismatch } */
+{
+  return 0;
+}
+
+int bar()
+{
+  return foo ();
+} /* { dg-error called from here } */
Index: ipa-inline.c
===
--- ipa-inline.c(revision 200034)
+++ ipa-inline.c(working copy)
@@ -374,7 +374,31 @@ can_early_inline_edge_p (struct cgraph_edge *e)
   return false;
 }
   if (!can_inline_edge_p (e, true))
-return false;
+{
+  enum availability avail;
+  struct cgraph_node *callee
+= cgraph_function_or_thunk_node (e-callee, avail);
+  /* Flag an error here when the inlining cannot happen because of target
+option mismatch but the callee is marked as always_inline.
+Otherwise, in -O0 mode this could go unreported because the error
+flagged in expand_call_inline in tree-inline.c might not execute.
+Then, the linker could complain about a missing body for the callee
+if it turned out that the callee was also marked gnu_inline with
+extern inline keyword as bodies of such functions are not
+generated.  */ 
+  if (!optimize
+  e-inline_failed == CIF_TARGET_OPTION_MISMATCH
+  lookup_attribute (always_inline, DECL_ATTRIBUTES 
(callee-symbol.decl))
+  lookup_attribute (gnu_inline, DECL_ATTRIBUTES 
(callee-symbol.decl))
+  DECL_DECLARED_INLINE_P (callee-symbol.decl))
+   {
+ error (inlining failed in call to extern gnu_inline %q+F: %s,
+callee-symbol.decl,
+cgraph_inline_failed_string 

Re: GCC does not support *mmintrin.h with function specific opts

2013-06-10 Thread Sriraman Tallam
Ping.

On Tue, Jun 4, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Thu, May 23, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping, for review of ipa-inline.c change.

 Sri

 On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote:
 --- ipa-inline.c  (revision 198950)
 +++ ipa-inline.c  (working copy)
 @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e)
return false;
  }
if (!can_inline_edge_p (e, true))
 -return false;
 +{
 +  enum availability avail;
 +  struct cgraph_node *callee
 += cgraph_function_or_thunk_node (e-callee, avail);
 +  /* Flag an error when the inlining cannot happen because of target 
 option
 +  mismatch but the callee is marked as always_inline.  In -O0 mode
 +  this will go undetected because the error flagged in
 +  expand_call_inline in tree-inline.c might not execute and the
 +  inlining will not happen.  Then, the linker could complain about a
 +  missing body for the callee if it turned out that the callee was
 +  also marked gnu_inline with extern inline keyword as bodies of 
 such
 +  functions are not generated.  */
 +  if ((!optimize
 +|| flag_no_inline)

 This should be if ((!optimize || flag_no_inline) on one line.

 I'd prefer also the testcase for the ICEs, something like:

 /* Test case to check if AVX intrinsics and function specific target
optimizations work together.  Check by including x86intrin.h  */

 /* { dg-do compile } */
 /* { dg-options -O2 -mno-sse -mno-avx } */

 #include x86intrin.h

 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }

 and another testcase that does:

 /* { dg-do compile } */
 #pragma GCC target (mavx) /* { dg-error whatever } */

 Otherwise it looks good to me, but I'd prefer the i?86 maintainers to 
 review
 it too (and Honza for ipa-inline.c?).

 Honza, could you please take a look at the ipa-inline.c fix? I will
 split the patches and submit after Honza's review. I will also make
 the changes mentioned.

 Thanks
 Sri



 Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-06-04 Thread Sriraman Tallam
Ping.

On Thu, May 23, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping, for review of ipa-inline.c change.

 Sri

 On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote:
 --- ipa-inline.c  (revision 198950)
 +++ ipa-inline.c  (working copy)
 @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e)
return false;
  }
if (!can_inline_edge_p (e, true))
 -return false;
 +{
 +  enum availability avail;
 +  struct cgraph_node *callee
 += cgraph_function_or_thunk_node (e-callee, avail);
 +  /* Flag an error when the inlining cannot happen because of target 
 option
 +  mismatch but the callee is marked as always_inline.  In -O0 mode
 +  this will go undetected because the error flagged in
 +  expand_call_inline in tree-inline.c might not execute and the
 +  inlining will not happen.  Then, the linker could complain about a
 +  missing body for the callee if it turned out that the callee was
 +  also marked gnu_inline with extern inline keyword as bodies of 
 such
 +  functions are not generated.  */
 +  if ((!optimize
 +|| flag_no_inline)

 This should be if ((!optimize || flag_no_inline) on one line.

 I'd prefer also the testcase for the ICEs, something like:

 /* Test case to check if AVX intrinsics and function specific target
optimizations work together.  Check by including x86intrin.h  */

 /* { dg-do compile } */
 /* { dg-options -O2 -mno-sse -mno-avx } */

 #include x86intrin.h

 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }

 and another testcase that does:

 /* { dg-do compile } */
 #pragma GCC target (mavx) /* { dg-error whatever } */

 Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review
 it too (and Honza for ipa-inline.c?).

 Honza, could you please take a look at the ipa-inline.c fix? I will
 split the patches and submit after Honza's review. I will also make
 the changes mentioned.

 Thanks
 Sri



 Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-23 Thread Sriraman Tallam
Ping, for review of ipa-inline.c change.

Sri

On Mon, May 20, 2013 at 11:04 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote:
 --- ipa-inline.c  (revision 198950)
 +++ ipa-inline.c  (working copy)
 @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e)
return false;
  }
if (!can_inline_edge_p (e, true))
 -return false;
 +{
 +  enum availability avail;
 +  struct cgraph_node *callee
 += cgraph_function_or_thunk_node (e-callee, avail);
 +  /* Flag an error when the inlining cannot happen because of target 
 option
 +  mismatch but the callee is marked as always_inline.  In -O0 mode
 +  this will go undetected because the error flagged in
 +  expand_call_inline in tree-inline.c might not execute and the
 +  inlining will not happen.  Then, the linker could complain about a
 +  missing body for the callee if it turned out that the callee was
 +  also marked gnu_inline with extern inline keyword as bodies of such
 +  functions are not generated.  */
 +  if ((!optimize
 +|| flag_no_inline)

 This should be if ((!optimize || flag_no_inline) on one line.

 I'd prefer also the testcase for the ICEs, something like:

 /* Test case to check if AVX intrinsics and function specific target
optimizations work together.  Check by including x86intrin.h  */

 /* { dg-do compile } */
 /* { dg-options -O2 -mno-sse -mno-avx } */

 #include x86intrin.h

 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }

 and another testcase that does:

 /* { dg-do compile } */
 #pragma GCC target (mavx) /* { dg-error whatever } */

 Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review
 it too (and Honza for ipa-inline.c?).

 Honza, could you please take a look at the ipa-inline.c fix? I will
 split the patches and submit after Honza's review. I will also make
 the changes mentioned.

 Thanks
 Sri



 Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-20 Thread Sriraman Tallam
On Fri, May 17, 2013 at 11:21 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote:
 --- ipa-inline.c  (revision 198950)
 +++ ipa-inline.c  (working copy)
 @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e)
return false;
  }
if (!can_inline_edge_p (e, true))
 -return false;
 +{
 +  enum availability avail;
 +  struct cgraph_node *callee
 += cgraph_function_or_thunk_node (e-callee, avail);
 +  /* Flag an error when the inlining cannot happen because of target 
 option
 +  mismatch but the callee is marked as always_inline.  In -O0 mode
 +  this will go undetected because the error flagged in
 +  expand_call_inline in tree-inline.c might not execute and the
 +  inlining will not happen.  Then, the linker could complain about a
 +  missing body for the callee if it turned out that the callee was
 +  also marked gnu_inline with extern inline keyword as bodies of such
 +  functions are not generated.  */
 +  if ((!optimize
 +|| flag_no_inline)

 This should be if ((!optimize || flag_no_inline) on one line.

 I'd prefer also the testcase for the ICEs, something like:

 /* Test case to check if AVX intrinsics and function specific target
optimizations work together.  Check by including x86intrin.h  */

 /* { dg-do compile } */
 /* { dg-options -O2 -mno-sse -mno-avx } */

 #include x86intrin.h

 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }

 and another testcase that does:

 /* { dg-do compile } */
 #pragma GCC target (mavx) /* { dg-error whatever } */

 Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review
 it too (and Honza for ipa-inline.c?).

Honza, could you please take a look at the ipa-inline.c fix? I will
split the patches and submit after Honza's review. I will also make
the changes mentioned.

Thanks
Sri



 Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-19 Thread Uros Bizjak
On Sat, May 18, 2013 at 6:00 AM, Sriraman Tallam tmsri...@google.com wrote:

  I don't really understand why you made the change to x86intrin.h instead 
  of
  making it inside each *mmintrin.h header. The code would be the same size,
  it would let us include smmintrin.h directly if we wanted to, and
  x86intrin.h would also automatically work.

 Right, I should have done that instead!

 Yeah, definitely.  For the standalone headers, which have currently
 __FEATURE__ guards inside of it, please replace it by the larger snippets
 involving #pragma, and in the x86intrin.h/immintrin.h headers include those
 unconditionally, instead of just if __FEATURE__ is defined.
 For the non-standalone headers (newer ones like avxintrin.h), replace
 the #ifdef __FEATURE__ in immintrin.h/x86intrin.h with larger snippets.


 * I did mostly as suggested except that even for avx and avx2 headers
 I did not see the harm in doing it in the header itself.  AVX header
 did not have the #ifndef _AVXINTRIN_H_INCLUDED which I added before
 doing this. I have added test cases to show it is doing the right
 thing  for avx.
 * I also found that when the caller to these intrinsics do not have
 the right target attribute, an error is raised in -O2 mode but not in
 -O0 mode.  I have fixed this with a patch to ipa-inline.c, please see
 if this is alright. Test case intrinsics_5.c checks if an error is
 raised.
 * LZCNT needed to be handled which is done now.

* common/config/i386/i386-common.c: Handle LZCNT.

The above part is OK for mainline and release patches. Please commit
LZCNT part as a separate patch to mainline.

Also, please get middle-end part reviewed and committed before we
proceed with target-dependent part.

Thanks,
Uros.


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-18 Thread Jakub Jelinek
On Fri, May 17, 2013 at 09:00:21PM -0700, Sriraman Tallam wrote:
 --- ipa-inline.c  (revision 198950)
 +++ ipa-inline.c  (working copy)
 @@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e)
return false;
  }
if (!can_inline_edge_p (e, true))
 -return false;
 +{
 +  enum availability avail;
 +  struct cgraph_node *callee
 += cgraph_function_or_thunk_node (e-callee, avail);
 +  /* Flag an error when the inlining cannot happen because of target 
 option
 +  mismatch but the callee is marked as always_inline.  In -O0 mode
 +  this will go undetected because the error flagged in
 +  expand_call_inline in tree-inline.c might not execute and the
 +  inlining will not happen.  Then, the linker could complain about a
 +  missing body for the callee if it turned out that the callee was
 +  also marked gnu_inline with extern inline keyword as bodies of such
 +  functions are not generated.  */ 
 +  if ((!optimize
 +|| flag_no_inline)

This should be if ((!optimize || flag_no_inline) on one line.

I'd prefer also the testcase for the ICEs, something like:

/* Test case to check if AVX intrinsics and function specific target
   optimizations work together.  Check by including x86intrin.h  */

/* { dg-do compile } */
/* { dg-options -O2 -mno-sse -mno-avx } */

#include x86intrin.h

__m256 a, b, c;
void __attribute__((target (avx)))
foo (void)
{
  a = _mm256_and_ps (b, c);
}

and another testcase that does:

/* { dg-do compile } */
#pragma GCC target (mavx) /* { dg-error whatever } */

Otherwise it looks good to me, but I'd prefer the i?86 maintainers to review
it too (and Honza for ipa-inline.c?).

Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-16 Thread Sriraman Tallam
Hi Jakub,

   I have taken your proposed changes and made patch for this.  Please
let me know what you think. I have changed only the headers mmintrin.h
and x86intrin.h as that includes all the other headers.  The builtins
get enabled automatically when the pragma target is specified so need
to do any thing to def_builtin.  I have included 4 test case, where
intrinsics_4.c uses your example with __mm256_and_ps. I had to fix a
bug with lzcnt builtins in i386-common.c as that was not handled
there.

Thanks
Sri

On Wed, May 15, 2013 at 7:25 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Tue, May 14, 2013 at 3:04 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote:
 When trying with -O2 -mno-avx:
 #ifndef __AVX__
 #pragma GCC push_options
 #pragma GCC target(avx)
 #define __DISABLE_AVX__
 #endif
 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
 typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
 extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
 __artificial__))
 _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) 
 __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
 #ifdef __DISABLE_AVX__
 #pragma GCC pop_options
 #undef __DISABLE_AVX__
 #endif
 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }
 we get bogus errors and ICE:
 tty2.c: In function '_mm256_and_ps':
 tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has 
 changed in GCC 4.6
 tty2.c: In function 'foo':
 tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32
 tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486
 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*)
   ../../gcc/expr.c:3485
 (I have added 1 || instead of your generate_builtins into i386.c
 (def_builtin)), that just shows that target attribute/pragma support still
 has very severe issues that need to be fixed, instead of papered around.

 Note, we ICE on:
 #pragma GCC target (mavx)
 That should be fixed too.

 Ok, I had a brief look at the above two issues.

 The first testcase has the problem that the ix86_previous_fndecl cache
 gets out of date.  When set_cfun is called on _mm256_and_ps (with the
 implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps,
 TARGET_AVX is set to true, target reinited.  Then set_cfun is called
 with NULL, we don't do anything.  Later on #pragma GCC pop_options appears,
 sets !TARGET_AVX (as that is the new target_option_current_node).
 Next foo is being parsed, avx attribute is noticed, the same target node
 is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's
 target node is the same as foo's and so we don't do cl_target_restore_option
 at all, so !TARGET_AVX remains, while it should be set.  That is the reason
 for the bogus inform etc.  Fixed by resetting the ix86_previous_fndecl cache
 on any #pragma GCC target below.  The #pragma GCC target (mavx) is also
 fixed below.  The patch also includes the 1 || to enable building all
 builtins.  We still ICE with:
 #0  fancy_abort (file=0x11d8fad ../../gcc/expr.c, line=316, 
 function=0x11dada3 convert_move) at ../../gcc/diagnostic.c:1180
 #1  0x00771c39 in convert_move (to=0x71b2df00, 
 from=0x71b314e0, unsignedp=0) at ../../gcc/expr.c:316
 #2  0x0078009f in store_expr (exp=0x719ab390, 
 target=0x71b2df00, call_param_p=0, nontemporal=false) at 
 ../../gcc/expr.c:5300
 #3  0x0077eba1 in expand_assignment (to=0x71b35090, 
 from=0x719ab390, nontemporal=false) at ../../gcc/expr.c:5025
 on the first testcase.  We don't ICE say on:
 #ifndef __AVX__
 #pragma GCC push_options
 #pragma GCC target(avx)
 #define __DISABLE_AVX__
 #endif
 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
 typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
 extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
 __artificial__))
 _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) 
 __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
 #ifdef __DISABLE_AVX__
 #pragma GCC pop_options
 #undef __DISABLE_AVX__
 #endif
 __m256 a[10], b[10], c[10];
 void __attribute__((target (avx)))
 foo (void)
 {
   a[0] = _mm256_and_ps (b[0], c[0]);
 }
 The problem is that in the first testcase, the VAR_DECL c (guess also b and
 a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector
 types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode
 (it has been laid out while -mno-avx has been the current) and also
 DECL_RTL which is a mem:BLK.  Guess expr.c would need to special case
 TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have
 DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM
 to the desired mode?

 --- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.0 +0100
 +++ gcc/config/i386/i386-c.c2013-05-14 

Re: GCC does not support *mmintrin.h with function specific opts

2013-05-16 Thread Marc Glisse

On Thu, 16 May 2013, Sriraman Tallam wrote:


Hi Jakub,

  I have taken your proposed changes and made patch for this.  Please
let me know what you think. I have changed only the headers mmintrin.h
and x86intrin.h as that includes all the other headers.


I don't really understand why you made the change to x86intrin.h instead 
of making it inside each *mmintrin.h header. The code would be the same 
size, it would let us include smmintrin.h directly if we wanted to, and 
x86intrin.h would also automatically work.


--
Marc Glisse


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-16 Thread Sriraman Tallam
On Thu, May 16, 2013 at 3:55 PM, Marc Glisse marc.gli...@inria.fr wrote:
 On Thu, 16 May 2013, Sriraman Tallam wrote:

 Hi Jakub,

   I have taken your proposed changes and made patch for this.  Please
 let me know what you think. I have changed only the headers mmintrin.h
 and x86intrin.h as that includes all the other headers.


 I don't really understand why you made the change to x86intrin.h instead of
 making it inside each *mmintrin.h header. The code would be the same size,
 it would let us include smmintrin.h directly if we wanted to, and
 x86intrin.h would also automatically work.

Right, I should have done that instead!

Sri


 --
 Marc Glisse


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-16 Thread Jakub Jelinek
On Thu, May 16, 2013 at 04:00:53PM -0700, Sriraman Tallam wrote:
 On Thu, May 16, 2013 at 3:55 PM, Marc Glisse marc.gli...@inria.fr wrote:
  I don't really understand why you made the change to x86intrin.h instead of
  making it inside each *mmintrin.h header. The code would be the same size,
  it would let us include smmintrin.h directly if we wanted to, and
  x86intrin.h would also automatically work.
 
 Right, I should have done that instead!

Yeah, definitely.  For the standalone headers, which have currently
__FEATURE__ guards inside of it, please replace it by the larger snippets
involving #pragma, and in the x86intrin.h/immintrin.h headers include those
unconditionally, instead of just if __FEATURE__ is defined.
For the non-standalone headers (newer ones like avxintrin.h), replace
the #ifdef __FEATURE__ in immintrin.h/x86intrin.h with larger snippets.

Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-15 Thread Sriraman Tallam
On Tue, May 14, 2013 at 3:04 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote:
 When trying with -O2 -mno-avx:
 #ifndef __AVX__
 #pragma GCC push_options
 #pragma GCC target(avx)
 #define __DISABLE_AVX__
 #endif
 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
 typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
 extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
 __artificial__))
 _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) 
 __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
 #ifdef __DISABLE_AVX__
 #pragma GCC pop_options
 #undef __DISABLE_AVX__
 #endif
 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }
 we get bogus errors and ICE:
 tty2.c: In function '_mm256_and_ps':
 tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has 
 changed in GCC 4.6
 tty2.c: In function 'foo':
 tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32
 tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486
 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*)
   ../../gcc/expr.c:3485
 (I have added 1 || instead of your generate_builtins into i386.c
 (def_builtin)), that just shows that target attribute/pragma support still
 has very severe issues that need to be fixed, instead of papered around.

 Note, we ICE on:
 #pragma GCC target (mavx)
 That should be fixed too.

 Ok, I had a brief look at the above two issues.

 The first testcase has the problem that the ix86_previous_fndecl cache
 gets out of date.  When set_cfun is called on _mm256_and_ps (with the
 implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps,
 TARGET_AVX is set to true, target reinited.  Then set_cfun is called
 with NULL, we don't do anything.  Later on #pragma GCC pop_options appears,
 sets !TARGET_AVX (as that is the new target_option_current_node).
 Next foo is being parsed, avx attribute is noticed, the same target node
 is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's
 target node is the same as foo's and so we don't do cl_target_restore_option
 at all, so !TARGET_AVX remains, while it should be set.  That is the reason
 for the bogus inform etc.  Fixed by resetting the ix86_previous_fndecl cache
 on any #pragma GCC target below.  The #pragma GCC target (mavx) is also
 fixed below.  The patch also includes the 1 || to enable building all
 builtins.  We still ICE with:
 #0  fancy_abort (file=0x11d8fad ../../gcc/expr.c, line=316, 
 function=0x11dada3 convert_move) at ../../gcc/diagnostic.c:1180
 #1  0x00771c39 in convert_move (to=0x71b2df00, 
 from=0x71b314e0, unsignedp=0) at ../../gcc/expr.c:316
 #2  0x0078009f in store_expr (exp=0x719ab390, 
 target=0x71b2df00, call_param_p=0, nontemporal=false) at 
 ../../gcc/expr.c:5300
 #3  0x0077eba1 in expand_assignment (to=0x71b35090, 
 from=0x719ab390, nontemporal=false) at ../../gcc/expr.c:5025
 on the first testcase.  We don't ICE say on:
 #ifndef __AVX__
 #pragma GCC push_options
 #pragma GCC target(avx)
 #define __DISABLE_AVX__
 #endif
 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
 typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
 extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
 __artificial__))
 _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) 
 __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
 #ifdef __DISABLE_AVX__
 #pragma GCC pop_options
 #undef __DISABLE_AVX__
 #endif
 __m256 a[10], b[10], c[10];
 void __attribute__((target (avx)))
 foo (void)
 {
   a[0] = _mm256_and_ps (b[0], c[0]);
 }
 The problem is that in the first testcase, the VAR_DECL c (guess also b and
 a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector
 types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode
 (it has been laid out while -mno-avx has been the current) and also
 DECL_RTL which is a mem:BLK.  Guess expr.c would need to special case
 TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have
 DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM
 to the desired mode?

 --- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.0 +0100
 +++ gcc/config/i386/i386-c.c2013-05-14 11:46:50.773806894 +0200
 @@ -369,20 +369,23 @@ ix86_pragma_target_parse (tree args, tre

if (! args)
  {
 -  cur_tree = ((pop_target)
 - ? pop_target
 - : target_option_default_node);
 +  cur_tree = (pop_target ? pop_target : target_option_default_node);
cl_target_option_restore (global_options,
 TREE_TARGET_OPTION (cur_tree));
  }
else
  {
cur_tree = ix86_valid_target_attribute_tree (args);
 -  if (!cur_tree)
 -   return false;
 +  if (!cur_tree || cur_tree == 

Re: GCC does not support *mmintrin.h with function specific opts

2013-05-14 Thread Uros Bizjak
On Mon, May 13, 2013 at 11:46 PM, H.J. Lu hjl.to...@gmail.com wrote:

 On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Tue, 16 Apr 2013, Sriraman Tallam wrote:

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

 Any new option needs documenting in invoke.texi.

 Added and new patch attached.

 Thanks
 Sri


 --
 Joseph S. Myers
 jos...@codesourcery.com

 It looks good to me.  But I can't approve it.

* config/i386/i386.c (construct_container): Do not issue SSE
return error for extern gnu_inline functions.
(def_builtin): Do not generate builtins when -mno-generate-builtins
is used.
* doc/invoke.texi: Document option -mgenerate-builtins.
* config/i386/i386.opt (mgenerate-builtins): New target option.
* config/i386/i386-c.c (ix86_target_macros_internal): Define macro
__ALL_ISA__ when generate_target_builtins is true.
* testsuite/gcc.target/i386/intrinsics_1.c: New test.
* testsuite/gcc.target/i386/intrinsics_2.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_3.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_4.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_5.c: Ditto.
* config/i386/lzcntintrin.h: Expose header when __ALL_ISA__ is defined.
* config/i386/lwpintrin.h: Ditto.
* config/i386/xopintrin.h: Ditto.
* config/i386/fmaintrin.h: Ditto.
* config/i386/bmiintrin.h: Ditto.
* config/i386/fma4intrin.h: Ditto.
* config/i386/nmmintrin.h: Ditto.
* config/i386/tbmintrin.h: Ditto.
* config/i386/smmintrin.h: Ditto.
* config/i386/wmmintrin.h: Ditto.
* config/i386/popcntintrin.h: Ditto.
* config/i386/f16cintrin.h: Ditto.
* config/i386/pmmintrin.h: Ditto.
* config/i386/bmi2intrin.h: Ditto.
* config/i386/tmmintrin.h: Ditto.
* config/i386/xmmintrin.h: Ditto.
* config/i386/mmintrin.h: Ditto.
* config/i386/ammintrin.h: Ditto.
* config/i386/emmintrin.h: Ditto.

I think that the option should be named -mtarget-builtins.

Since HJ is OK with this user-visible change, the patch is OK for
mainline (with eventually renamed option). We also have an option to
switch this new functionality off, and we are early enough in the
development cycle to find out if anything is fundamentaly broken with
this approach.

BTW, does this patch address the request in PR57202?

Thanks,
Uros.


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-14 Thread Jakub Jelinek
On Tue, May 14, 2013 at 08:58:55AM +0200, Uros Bizjak wrote:
 I think that the option should be named -mtarget-builtins.

There shouldn't be an option for it at all.  If constructing the builtins is
slow (it is), we should just create them lazily, the
*builtin_decl_{explicit,implicit}* APIs were a first step for that, plus
we should build some gperf table of all the generic and all the target
builtins and record prefixes used to find them (__builtin_, __sync_,
__atomic_, what else?), then just teach FE that if they are looking up
a symbol prefixed with one of these, they should also look it up
in the perfect hash table and if found there, call some function to
construct the builtin.  Of course, this isn't a prerequisite of the
changes you are looking for, but introducing an option that hopefully will
be completely useless in a few months is just a bad idea.

 Since HJ is OK with this user-visible change, the patch is OK for
 mainline (with eventually renamed option). We also have an option to
 switch this new functionality off, and we are early enough in the
 development cycle to find out if anything is fundamentaly broken with
 this approach.
 
 BTW, does this patch address the request in PR57202?

I'm strongly against the patch in its current form, it is a hack rather
than a solution.  I don't see how it could be properly tested, when say
immintrin.h and x86intrin.h is still full of code like:
#ifdef __AVX__
#include avxintrin.h
#endif
so when you just
#include x86intrin.h
rather than the few headers that were tested, you are out of luck.

The right solution is really IMNSHO something along the lines of:
#ifndef __AVX__
#pragma GCC push_options
#pragma GCC target(avx)
#define __AVXINTRIN_H_POP_OPTIONS__
#endif
...
#ifdef __AVXINTRIN_H_POP_OPTIONS__
#pragma GCC pop_options
#undef __AVXINTRIN_H_POP_OPTIONS__
#endif

around the headers.  As can be seen on say -O2 -mno-avx:
#ifndef __AVX__
#pragma GCC push_options
#pragma GCC target(avx)
#define __DISABLE_AVX__
#endif
typedef float __v8sf __attribute__((vector_size (32)));
extern __v8sf a, b;
extern __inline int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__)) bar (int x) { a = b + 1.0f; return x + 1; }
#ifdef __DISABLE_AVX__
#pragma GCC pop_options
#undef __DISABLE_AVX__
#endif
int __attribute__((target (avx2))) avx2 (int x) { return bar (x) + bar (x + 
1); }
int __attribute__((target (avx))) avx (int x) { return bar (x) + bar (x + 1); 
}
int __attribute__((target (xop))) xop (int x) { return bar (x) + bar (x + 1); 
}
int __attribute__((target (sse2))) sse2 (int x) { return bar (x) + bar (x + 
1); }
int nothing (int x) { return bar (x) + bar (x + 1); }
the inliner happily inlines the avx target function into avx/avx2/xop
targetted functions (correct), inlines it even into sse2 (something that
should be fixed not to), and doesn't inline into nothing (IMHO correct, we
want an error in that case, one shouldn't use avx intrinsics in say sse2
only targetted function (unless -mavx is used on command line, i.e. the
check should always be if the caller's target set is equal or superset of
callee's target set).

When trying with -O2 -mno-avx:
#ifndef __AVX__
#pragma GCC push_options
#pragma GCC target(avx)
#define __DISABLE_AVX__
#endif
typedef float __v8sf __attribute__ ((__vector_size__ (32)));
typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
_mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) 
__builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
#ifdef __DISABLE_AVX__
#pragma GCC pop_options
#undef __DISABLE_AVX__
#endif
__m256 a, b, c;
void __attribute__((target (avx)))
foo (void)
{
  a = _mm256_and_ps (b, c);
}
we get bogus errors and ICE:
tty2.c: In function '_mm256_and_ps':
tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has 
changed in GCC 4.6
tty2.c: In function 'foo':
tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32
tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486
0x77a3d2 emit_move_insn(rtx_def*, rtx_def*)
../../gcc/expr.c:3485
(I have added 1 || instead of your generate_builtins into i386.c
(def_builtin)), that just shows that target attribute/pragma support still
has very severe issues that need to be fixed, instead of papered around.

Note, we ICE on:
#pragma GCC target (mavx)
That should be fixed too.

Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-14 Thread Uros Bizjak
On Tue, May 14, 2013 at 10:39 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, May 14, 2013 at 08:58:55AM +0200, Uros Bizjak wrote:
 I think that the option should be named -mtarget-builtins.

 There shouldn't be an option for it at all.  If constructing the builtins is
 slow (it is), we should just create them lazily, the
 *builtin_decl_{explicit,implicit}* APIs were a first step for that, plus
 we should build some gperf table of all the generic and all the target
 builtins and record prefixes used to find them (__builtin_, __sync_,
 __atomic_, what else?), then just teach FE that if they are looking up
 a symbol prefixed with one of these, they should also look it up
 in the perfect hash table and if found there, call some function to
 construct the builtin.  Of course, this isn't a prerequisite of the
 changes you are looking for, but introducing an option that hopefully will
 be completely useless in a few months is just a bad idea.

 Since HJ is OK with this user-visible change, the patch is OK for
 mainline (with eventually renamed option). We also have an option to
 switch this new functionality off, and we are early enough in the
 development cycle to find out if anything is fundamentaly broken with
 this approach.

 BTW, does this patch address the request in PR57202?

 I'm strongly against the patch in its current form, it is a hack rather
 than a solution.  I don't see how it could be properly tested, when say
 immintrin.h and x86intrin.h is still full of code like:
 #ifdef __AVX__
 #include avxintrin.h
 #endif
 so when you just
 #include x86intrin.h
 rather than the few headers that were tested, you are out of luck.

Jakub, thanks for your thorough analysis.

It looks that the approach in the patch is fundamentally flawed and
that infrastructure is not developed/fixed enough for alternative
solution.

Based on expressed concerns, I retract my previous approval.

Thanks,
Uros.


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-14 Thread Jakub Jelinek
On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote:
 When trying with -O2 -mno-avx:
 #ifndef __AVX__
 #pragma GCC push_options
 #pragma GCC target(avx)
 #define __DISABLE_AVX__
 #endif
 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
 typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
 extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
 __artificial__))
 _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) 
 __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
 #ifdef __DISABLE_AVX__
 #pragma GCC pop_options
 #undef __DISABLE_AVX__
 #endif
 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }
 we get bogus errors and ICE:
 tty2.c: In function '_mm256_and_ps':
 tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has 
 changed in GCC 4.6
 tty2.c: In function 'foo':
 tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32
 tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486
 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*)
   ../../gcc/expr.c:3485
 (I have added 1 || instead of your generate_builtins into i386.c
 (def_builtin)), that just shows that target attribute/pragma support still
 has very severe issues that need to be fixed, instead of papered around.
 
 Note, we ICE on:
 #pragma GCC target (mavx)
 That should be fixed too.

Ok, I had a brief look at the above two issues.

The first testcase has the problem that the ix86_previous_fndecl cache
gets out of date.  When set_cfun is called on _mm256_and_ps (with the
implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps,
TARGET_AVX is set to true, target reinited.  Then set_cfun is called
with NULL, we don't do anything.  Later on #pragma GCC pop_options appears,
sets !TARGET_AVX (as that is the new target_option_current_node).
Next foo is being parsed, avx attribute is noticed, the same target node
is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's
target node is the same as foo's and so we don't do cl_target_restore_option
at all, so !TARGET_AVX remains, while it should be set.  That is the reason
for the bogus inform etc.  Fixed by resetting the ix86_previous_fndecl cache
on any #pragma GCC target below.  The #pragma GCC target (mavx) is also
fixed below.  The patch also includes the 1 || to enable building all
builtins.  We still ICE with:
#0  fancy_abort (file=0x11d8fad ../../gcc/expr.c, line=316, 
function=0x11dada3 convert_move) at ../../gcc/diagnostic.c:1180
#1  0x00771c39 in convert_move (to=0x71b2df00, from=0x71b314e0, 
unsignedp=0) at ../../gcc/expr.c:316
#2  0x0078009f in store_expr (exp=0x719ab390, 
target=0x71b2df00, call_param_p=0, nontemporal=false) at 
../../gcc/expr.c:5300
#3  0x0077eba1 in expand_assignment (to=0x71b35090, 
from=0x719ab390, nontemporal=false) at ../../gcc/expr.c:5025
on the first testcase.  We don't ICE say on:
#ifndef __AVX__
#pragma GCC push_options
#pragma GCC target(avx)
#define __DISABLE_AVX__
#endif
typedef float __v8sf __attribute__ ((__vector_size__ (32)));
typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
_mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) 
__builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
#ifdef __DISABLE_AVX__
#pragma GCC pop_options
#undef __DISABLE_AVX__
#endif
__m256 a[10], b[10], c[10];
void __attribute__((target (avx)))
foo (void)
{
  a[0] = _mm256_and_ps (b[0], c[0]);
}
The problem is that in the first testcase, the VAR_DECL c (guess also b and
a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector
types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode
(it has been laid out while -mno-avx has been the current) and also
DECL_RTL which is a mem:BLK.  Guess expr.c would need to special case
TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have
DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM
to the desired mode?

--- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.0 +0100
+++ gcc/config/i386/i386-c.c2013-05-14 11:46:50.773806894 +0200
@@ -369,20 +369,23 @@ ix86_pragma_target_parse (tree args, tre
 
   if (! args)
 {
-  cur_tree = ((pop_target)
- ? pop_target
- : target_option_default_node);
+  cur_tree = (pop_target ? pop_target : target_option_default_node);
   cl_target_option_restore (global_options,
TREE_TARGET_OPTION (cur_tree));
 }
   else
 {
   cur_tree = ix86_valid_target_attribute_tree (args);
-  if (!cur_tree)
-   return false;
+  if (!cur_tree || cur_tree == error_mark_node)
+   {
+ cl_target_option_restore (global_options,
+   TREE_TARGET_OPTION (prev_tree));
+

Re: GCC does not support *mmintrin.h with function specific opts

2013-05-14 Thread Richard Biener
On Tue, May 14, 2013 at 12:04 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, May 14, 2013 at 10:39:13AM +0200, Jakub Jelinek wrote:
 When trying with -O2 -mno-avx:
 #ifndef __AVX__
 #pragma GCC push_options
 #pragma GCC target(avx)
 #define __DISABLE_AVX__
 #endif
 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
 typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
 extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
 __artificial__))
 _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) 
 __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
 #ifdef __DISABLE_AVX__
 #pragma GCC pop_options
 #undef __DISABLE_AVX__
 #endif
 __m256 a, b, c;
 void __attribute__((target (avx)))
 foo (void)
 {
   a = _mm256_and_ps (b, c);
 }
 we get bogus errors and ICE:
 tty2.c: In function '_mm256_and_ps':
 tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has 
 changed in GCC 4.6
 tty2.c: In function 'foo':
 tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32
 tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486
 0x77a3d2 emit_move_insn(rtx_def*, rtx_def*)
   ../../gcc/expr.c:3485
 (I have added 1 || instead of your generate_builtins into i386.c
 (def_builtin)), that just shows that target attribute/pragma support still
 has very severe issues that need to be fixed, instead of papered around.

 Note, we ICE on:
 #pragma GCC target (mavx)
 That should be fixed too.

 Ok, I had a brief look at the above two issues.

 The first testcase has the problem that the ix86_previous_fndecl cache
 gets out of date.  When set_cfun is called on _mm256_and_ps (with the
 implicit avx attribute), then ix86_previous_fndecl is set to _mm256_and_ps,
 TARGET_AVX is set to true, target reinited.  Then set_cfun is called
 with NULL, we don't do anything.  Later on #pragma GCC pop_options appears,
 sets !TARGET_AVX (as that is the new target_option_current_node).
 Next foo is being parsed, avx attribute is noticed, the same target node
 is used for it, but when set_cfun is called for foo, ix86_previous_fndecl's
 target node is the same as foo's and so we don't do cl_target_restore_option
 at all, so !TARGET_AVX remains, while it should be set.  That is the reason
 for the bogus inform etc.  Fixed by resetting the ix86_previous_fndecl cache
 on any #pragma GCC target below.  The #pragma GCC target (mavx) is also
 fixed below.  The patch also includes the 1 || to enable building all
 builtins.  We still ICE with:
 #0  fancy_abort (file=0x11d8fad ../../gcc/expr.c, line=316, 
 function=0x11dada3 convert_move) at ../../gcc/diagnostic.c:1180
 #1  0x00771c39 in convert_move (to=0x71b2df00, 
 from=0x71b314e0, unsignedp=0) at ../../gcc/expr.c:316
 #2  0x0078009f in store_expr (exp=0x719ab390, 
 target=0x71b2df00, call_param_p=0, nontemporal=false) at 
 ../../gcc/expr.c:5300
 #3  0x0077eba1 in expand_assignment (to=0x71b35090, 
 from=0x719ab390, nontemporal=false) at ../../gcc/expr.c:5025
 on the first testcase.  We don't ICE say on:
 #ifndef __AVX__
 #pragma GCC push_options
 #pragma GCC target(avx)
 #define __DISABLE_AVX__
 #endif
 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
 typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
 extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
 __artificial__))
 _mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) 
 __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
 #ifdef __DISABLE_AVX__
 #pragma GCC pop_options
 #undef __DISABLE_AVX__
 #endif
 __m256 a[10], b[10], c[10];
 void __attribute__((target (avx)))
 foo (void)
 {
   a[0] = _mm256_and_ps (b[0], c[0]);
 }
 The problem is that in the first testcase, the VAR_DECL c (guess also b and
 a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector
 types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode
 (it has been laid out while -mno-avx has been the current) and also
 DECL_RTL which is a mem:BLK.  Guess expr.c would need to special case
 TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have
 DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM
 to the desired mode?

I think any entity with static storage (maybe even automatic storage) should
have BLKmode (or rather its mode should not matter) and what matters
is the mode we use for the access - that is, the mode of the MEM_REF we
expand, for example.

That TYPE_MODE is dynamic for vector types is a bad thing.  It also means
that type layout may be variable (consider PPC where double has different
alignment in structs, so what layout would a struct with a vector_size(16)
double vector get with -mvsx vs. -mno-vsx?)

Richard.

 --- gcc/config/i386/i386-c.c.jj 2013-01-15 17:20:37.0 +0100
 +++ gcc/config/i386/i386-c.c2013-05-14 11:46:50.773806894 +0200
 @@ -369,20 +369,23 @@ ix86_pragma_target_parse (tree 

Re: GCC does not support *mmintrin.h with function specific opts

2013-05-14 Thread Jakub Jelinek
On Tue, May 14, 2013 at 12:22:05PM +0200, Richard Biener wrote:
  The problem is that in the first testcase, the VAR_DECL c (guess also b and
  a) have TYPE_MODE (TREE_TYPE (c)) == V8SFmode (this is dynamic, for vector
  types TYPE_MODE is a function call), but DECL_MODE (c) is BLKmode
  (it has been laid out while -mno-avx has been the current) and also
  DECL_RTL which is a mem:BLK.  Guess expr.c would need to special case
  TREE_STATIC or DECL_EXTERNAL VAR_DECLs with vector type, if they have
  DECL_MODE BLKmode, but TYPE_MODE some vector type, just adjust the MEM
  to the desired mode?

--- gcc/expr.c.jj   2013-05-14 08:25:40.0 +0200
+++ gcc/expr.c  2013-05-14 12:55:46.331983406 +0200
@@ -9310,6 +9310,8 @@ expand_expr_real_1 (tree exp, rtx target
  set_curr_insn_location (saved_loc);
  if (REG_P (r)  !REG_EXPR (r))
set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (exp), r);
+ if (MEM_P (r)  GET_MODE (r) == BLKmode  mode != BLKmode)
+   r = adjust_address (r, mode, 0);
  return r;
}

fixes the ICE on that testcase.

 I think any entity with static storage (maybe even automatic storage) should
 have BLKmode (or rather its mode should not matter) and what matters
 is the mode we use for the access - that is, the mode of the MEM_REF we
 expand, for example.

There wasn't any MEM_REF in that case, just SSA_NAME with SSA_NAME_DEF_STMT
of a load from VAR_DECL.

 That TYPE_MODE is dynamic for vector types is a bad thing.  It also means
 that type layout may be variable (consider PPC where double has different
 alignment in structs, so what layout would a struct with a vector_size(16)
 double vector get with -mvsx vs. -mno-vsx?)

I haven't introduced the dynamic TYPE_MODE, but getting rid of it would be
terribly hard I'm afraid.  Anyway, if struct layout depends on modes and
handles the vector modes differently from BLKmode, then it is a target bug,
it really should look at the types instead.  Otherwise if you have
such struct in a header, then -mvsx code will be ABI incompatible with
-mno-vsx code using the same header.

Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-13 Thread Sriraman Tallam
Ping.

On Thu, May 9, 2013 at 2:20 PM, Sriraman Tallam tmsri...@google.com wrote:
 cc:Diego

 On Tue, May 7, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Thu, May 2, 2013 at 3:51 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Tue, 16 Apr 2013, Sriraman Tallam wrote:

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

 Any new option needs documenting in invoke.texi.

 Added and new patch attached.

 Thanks
 Sri


 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-13 Thread H.J. Lu
On Mon, May 13, 2013 at 2:21 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Thu, May 9, 2013 at 2:20 PM, Sriraman Tallam tmsri...@google.com wrote:
 cc:Diego

 On Tue, May 7, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Thu, May 2, 2013 at 3:51 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Tue, 16 Apr 2013, Sriraman Tallam wrote:

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

 Any new option needs documenting in invoke.texi.

 Added and new patch attached.

 Thanks
 Sri


 --
 Joseph S. Myers
 jos...@codesourcery.com

It looks good to me.  But I can't approve it.

Add Uros,

--
H.J.


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-09 Thread Sriraman Tallam
cc:Diego

On Tue, May 7, 2013 at 2:41 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Thu, May 2, 2013 at 3:51 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Tue, 16 Apr 2013, Sriraman Tallam wrote:

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

 Any new option needs documenting in invoke.texi.

 Added and new patch attached.

 Thanks
 Sri


 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-07 Thread Sriraman Tallam
Ping.

On Thu, May 2, 2013 at 3:51 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Tue, 16 Apr 2013, Sriraman Tallam wrote:

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

 Any new option needs documenting in invoke.texi.

 Added and new patch attached.

 Thanks
 Sri


 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: GCC does not support *mmintrin.h with function specific opts

2013-05-02 Thread Sriraman Tallam
Ping.

On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
 jos...@codesourcery.com wrote:
 On Tue, 16 Apr 2013, Sriraman Tallam wrote:

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

 Any new option needs documenting in invoke.texi.

 Added and new patch attached.

 Thanks
 Sri


 --
 Joseph S. Myers
 jos...@codesourcery.com


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-29 Thread Sriraman Tallam
On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
jos...@codesourcery.com wrote:
 On Tue, 16 Apr 2013, Sriraman Tallam wrote:

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

 Any new option needs documenting in invoke.texi.

Added and new patch attached.

Thanks
Sri


 --
 Joseph S. Myers
 jos...@codesourcery.com
* config/i386/i386.c (construct_container): Do not issue SSE
return error for extern gnu_inline functions.
(def_builtin): Do not generate builtins when -mno-generate-builtins
is used.
* doc/invoke.texi: Document option -mgenerate-builtins.
* config/i386/i386.opt (mgenerate-builtins): New target option.
* config/i386/i386-c.c (ix86_target_macros_internal): Define macro
__ALL_ISA__ when generate_target_builtins is true.
* testsuite/gcc.target/i386/intrinsics_1.c: New test.
* testsuite/gcc.target/i386/intrinsics_2.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_3.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_4.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_5.c: Ditto.
* config/i386/lzcntintrin.h: Expose header when __ALL_ISA__ is defined.
* config/i386/lwpintrin.h: Ditto.
* config/i386/xopintrin.h: Ditto.
* config/i386/fmaintrin.h: Ditto.
* config/i386/bmiintrin.h: Ditto.
* config/i386/fma4intrin.h: Ditto.
* config/i386/nmmintrin.h: Ditto.
* config/i386/tbmintrin.h: Ditto.
* config/i386/smmintrin.h: Ditto.
* config/i386/wmmintrin.h: Ditto.
* config/i386/popcntintrin.h: Ditto.
* config/i386/f16cintrin.h: Ditto.
* config/i386/pmmintrin.h: Ditto.
* config/i386/bmi2intrin.h: Ditto.
* config/i386/tmmintrin.h: Ditto.
* config/i386/xmmintrin.h: Ditto.
* config/i386/mmintrin.h: Ditto.
* config/i386/ammintrin.h: Ditto.
* config/i386/emmintrin.h: Ditto.

Index: config/i386/smmintrin.h
===
--- config/i386/smmintrin.h (revision 198212)
+++ config/i386/smmintrin.h (working copy)
@@ -27,7 +27,7 @@
 #ifndef _SMMINTRIN_H_INCLUDED
 #define _SMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_1__
+#if !defined (__SSE4_1__)  !defined (__ALL_ISA__)
 # error SSE4.1 instruction set not enabled
 #else
 
Index: config/i386/f16cintrin.h
===
--- config/i386/f16cintrin.h(revision 198212)
+++ config/i386/f16cintrin.h(working copy)
@@ -25,7 +25,7 @@
 # error Never use f16intrin.h directly; include x86intrin.h or 
immintrin.h instead.
 #endif
 
-#ifndef __F16C__
+#if !defined (__F16C__)  !defined (__ALL_ISA__)
 # error F16C instruction set not enabled
 #else
 
Index: config/i386/wmmintrin.h
===
--- config/i386/wmmintrin.h (revision 198212)
+++ config/i386/wmmintrin.h (working copy)
@@ -30,7 +30,7 @@
 /* We need definitions from the SSE2 header file.  */
 #include emmintrin.h
 
-#if !defined (__AES__)  !defined (__PCLMUL__)
+#if !defined (__AES__)  !defined (__PCLMUL__)  !defined (__ALL_ISA__)
 # error AES/PCLMUL instructions not enabled
 #else
 
Index: config/i386/bmi2intrin.h
===
--- config/i386/bmi2intrin.h(revision 198212)
+++ config/i386/bmi2intrin.h(working copy)
@@ -25,7 +25,7 @@
 # error Never use bmi2intrin.h directly; include x86intrin.h instead.
 #endif
 
-#ifndef __BMI2__
+#if !defined (__BMI2__)  !defined (__ALL_ISA__)
 # error BMI2 instruction set not enabled
 #endif /* __BMI2__ */
 
Index: config/i386/pmmintrin.h
===
--- config/i386/pmmintrin.h (revision 198212)
+++ config/i386/pmmintrin.h (working copy)
@@ -27,7 +27,7 @@
 #ifndef _PMMINTRIN_H_INCLUDED
 #define _PMMINTRIN_H_INCLUDED
 
-#ifndef __SSE3__
+#if !defined (__SSE3__)  !defined (__ALL_ISA__)
 # error SSE3 instruction set not enabled
 #else
 
Index: config/i386/lzcntintrin.h
===
--- config/i386/lzcntintrin.h   (revision 198212)
+++ config/i386/lzcntintrin.h   (working copy)
@@ -25,7 +25,7 @@
 # error Never use lzcntintrin.h directly; include x86intrin.h instead.
 #endif
 
-#ifndef __LZCNT__
+#if !defined (__LZCNT__)  !defined (__ALL_ISA__)
 # error LZCNT instruction is not enabled
 #endif /* __LZCNT__ */
 
Index: config/i386/tmmintrin.h
===
--- config/i386/tmmintrin.h (revision 198212)
+++ config/i386/tmmintrin.h (working copy)
@@ -27,7 +27,7 @@
 #ifndef _TMMINTRIN_H_INCLUDED
 #define _TMMINTRIN_H_INCLUDED
 
-#ifndef __SSSE3__
+#if !defined (__SSSE3__)  !defined (__ALL_ISA__)
 # error SSSE3 instruction set not enabled
 #else
 
Index: config/i386/xmmintrin.h

Re: GCC does not support *mmintrin.h with function specific opts

2013-04-25 Thread Joseph S. Myers
On Tue, 16 Apr 2013, Sriraman Tallam wrote:

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

Any new option needs documenting in invoke.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-20 Thread Sriraman Tallam
Ping.

On Wed, Apr 17, 2013 at 7:13 PM, Sriraman Tallam tmsri...@google.com wrote:
 +HJ

 On Tue, Apr 16, 2013 at 1:54 PM, Sriraman Tallam tmsri...@google.com wrote:
 Hi,

 I have attached an updated patch that  addresses all the comments raised.

 On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote:
 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 1) this shouldn't be an option, either it can be made to work reliably,
then it should be done always, or it can't, then it shouldn't be done

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

 2) have you verified that if you always generate all builtins, that the
builtins not supported by the ISA selected from the command line are
created with the right vector modes?

 This issue does not arise.  When the target builtin is expanded, it is
 checked if the ISA support is there, either via function specific
 target opts or global target opts. If not, an error is raised. Test
 case added for this, please see intrinsic_4.c in patch.

 3) the *intrin.h headers in the case where the guarding macro isn't defined
should be surrounded by something like
#ifndef __FMA4__
#pragma GCC push options
#pragma GCC target(fma4)
#endif
...
#ifndef __FMA4__
#pragma GCC pop options
#endif
so that everything that is in the headers is compiled with the ISA
in question

 I do not think this should be done because it will break the inlining
 ability of the header function and cause issues if the caller does not
 specify the required ISA. The fact that the header functions are
 marked extern __inline, with gnu_inline guarantees that a body will
 not be generated and they will be inlined.  If the caller does not
 have the required ISA, appropriate errors will be raised. Test cases
 added, see intrinsics_1.c, intrinsics_2.c

 4) what happens if you use the various vector types typedefed in the
*intrin.h headers in code that doesn't support those ISAs?  As TYPE_MODE
for VECTOR_TYPE is a function call, perhaps it will just be handled as
generic BLKmode vectors, which is desirable I think

 I checked some tests here.  With -mno-sse for instance, vector types
 are not permitted in function arguments and return values and gcc
 raises a warning/error in each case.  With return values, gcc always
 gives an error if a SSE register is required in a return value.  I
 even fixed this message to not do it for functions marked as extern
 inline, with gnu_inline keyword as a body for them will not be
 generated.


 5) what happens if you use a target builtin in a function not supporting
the corresponding ISA, do you get proper error explaining what you are
doing wrong?

 Yes, please sse intrinsic_4.c test in patch.

 6) what happens if you use some intrinsics in a function not supporting
the corresponding ISA?  Dunno if the inliner chooses not to inline it
and error out because it is always_inline, or what exactly will happen
then

 Same deal here. The intrinsic function will, guaranteed, to be inlined
 into the caller which will be a corresponding builtin call. That
 builtin call will trigger an error if the ISA is not supported.

 Thanks
 Sri


 For all this you certainly need testcases.

 Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-17 Thread Sriraman Tallam
+HJ

On Tue, Apr 16, 2013 at 1:54 PM, Sriraman Tallam tmsri...@google.com wrote:
 Hi,

 I have attached an updated patch that  addresses all the comments raised.

 On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote:
 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 1) this shouldn't be an option, either it can be made to work reliably,
then it should be done always, or it can't, then it shouldn't be done

 Ok, it is on by default now.  There is a way to turn it off, with
 -mno-generate-builtins.

 2) have you verified that if you always generate all builtins, that the
builtins not supported by the ISA selected from the command line are
created with the right vector modes?

 This issue does not arise.  When the target builtin is expanded, it is
 checked if the ISA support is there, either via function specific
 target opts or global target opts. If not, an error is raised. Test
 case added for this, please see intrinsic_4.c in patch.

 3) the *intrin.h headers in the case where the guarding macro isn't defined
should be surrounded by something like
#ifndef __FMA4__
#pragma GCC push options
#pragma GCC target(fma4)
#endif
...
#ifndef __FMA4__
#pragma GCC pop options
#endif
so that everything that is in the headers is compiled with the ISA
in question

 I do not think this should be done because it will break the inlining
 ability of the header function and cause issues if the caller does not
 specify the required ISA. The fact that the header functions are
 marked extern __inline, with gnu_inline guarantees that a body will
 not be generated and they will be inlined.  If the caller does not
 have the required ISA, appropriate errors will be raised. Test cases
 added, see intrinsics_1.c, intrinsics_2.c

 4) what happens if you use the various vector types typedefed in the
*intrin.h headers in code that doesn't support those ISAs?  As TYPE_MODE
for VECTOR_TYPE is a function call, perhaps it will just be handled as
generic BLKmode vectors, which is desirable I think

 I checked some tests here.  With -mno-sse for instance, vector types
 are not permitted in function arguments and return values and gcc
 raises a warning/error in each case.  With return values, gcc always
 gives an error if a SSE register is required in a return value.  I
 even fixed this message to not do it for functions marked as extern
 inline, with gnu_inline keyword as a body for them will not be
 generated.


 5) what happens if you use a target builtin in a function not supporting
the corresponding ISA, do you get proper error explaining what you are
doing wrong?

 Yes, please sse intrinsic_4.c test in patch.

 6) what happens if you use some intrinsics in a function not supporting
the corresponding ISA?  Dunno if the inliner chooses not to inline it
and error out because it is always_inline, or what exactly will happen
then

 Same deal here. The intrinsic function will, guaranteed, to be inlined
 into the caller which will be a corresponding builtin call. That
 builtin call will trigger an error if the ISA is not supported.

 Thanks
 Sri


 For all this you certainly need testcases.

 Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-16 Thread Sriraman Tallam
Hi,

I have attached an updated patch that  addresses all the comments raised.

On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote:
 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 1) this shouldn't be an option, either it can be made to work reliably,
then it should be done always, or it can't, then it shouldn't be done

Ok, it is on by default now.  There is a way to turn it off, with
-mno-generate-builtins.

 2) have you verified that if you always generate all builtins, that the
builtins not supported by the ISA selected from the command line are
created with the right vector modes?

This issue does not arise.  When the target builtin is expanded, it is
checked if the ISA support is there, either via function specific
target opts or global target opts. If not, an error is raised. Test
case added for this, please see intrinsic_4.c in patch.

 3) the *intrin.h headers in the case where the guarding macro isn't defined
should be surrounded by something like
#ifndef __FMA4__
#pragma GCC push options
#pragma GCC target(fma4)
#endif
...
#ifndef __FMA4__
#pragma GCC pop options
#endif
so that everything that is in the headers is compiled with the ISA
in question

I do not think this should be done because it will break the inlining
ability of the header function and cause issues if the caller does not
specify the required ISA. The fact that the header functions are
marked extern __inline, with gnu_inline guarantees that a body will
not be generated and they will be inlined.  If the caller does not
have the required ISA, appropriate errors will be raised. Test cases
added, see intrinsics_1.c, intrinsics_2.c

 4) what happens if you use the various vector types typedefed in the
*intrin.h headers in code that doesn't support those ISAs?  As TYPE_MODE
for VECTOR_TYPE is a function call, perhaps it will just be handled as
generic BLKmode vectors, which is desirable I think

I checked some tests here.  With -mno-sse for instance, vector types
are not permitted in function arguments and return values and gcc
raises a warning/error in each case.  With return values, gcc always
gives an error if a SSE register is required in a return value.  I
even fixed this message to not do it for functions marked as extern
inline, with gnu_inline keyword as a body for them will not be
generated.


 5) what happens if you use a target builtin in a function not supporting
the corresponding ISA, do you get proper error explaining what you are
doing wrong?

Yes, please sse intrinsic_4.c test in patch.

 6) what happens if you use some intrinsics in a function not supporting
the corresponding ISA?  Dunno if the inliner chooses not to inline it
and error out because it is always_inline, or what exactly will happen
then

Same deal here. The intrinsic function will, guaranteed, to be inlined
into the caller which will be a corresponding builtin call. That
builtin call will trigger an error if the ISA is not supported.

Thanks
Sri


 For all this you certainly need testcases.

 Jakub
* config/i386/i386.c (construct_container): Do not issue SSE
return error for extern gnu_inline functions.
(def_builtin): Do not generate builtins when -mno-generate-builtins
is used.
* config/i386/i386.opt (mgenerate-builtins): New target option.
* config/i386/i386-c.c (ix86_target_macros_internal): Define macro
__ALL_ISA__ when generate_target_builtins is true.
* testsuite/gcc.target/i386/intrinsics_1.c: New test.
* testsuite/gcc.target/i386/intrinsics_2.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_3.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_4.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_5.c: Ditto.
* config/i386/lzcntintrin.h: Expose header when __ALL_ISA__ is defined.
* config/i386/lwpintrin.h: Ditto.
* config/i386/xopintrin.h: Ditto.
* config/i386/fmaintrin.h: Ditto.
* config/i386/bmiintrin.h: Ditto.
* config/i386/fma4intrin.h: Ditto.
* config/i386/nmmintrin.h: Ditto.
* config/i386/tbmintrin.h: Ditto.
* config/i386/smmintrin.h: Ditto.
* config/i386/wmmintrin.h: Ditto.
* config/i386/popcntintrin.h: Ditto.
* config/i386/f16cintrin.h: Ditto.
* config/i386/pmmintrin.h: Ditto.
* config/i386/bmi2intrin.h: Ditto.
* config/i386/tmmintrin.h: Ditto.
* config/i386/xmmintrin.h: Ditto.
* config/i386/mmintrin.h: Ditto.
* config/i386/ammintrin.h: Ditto.
* config/i386/emmintrin.h: 

Re: GCC does not support *mmintrin.h with function specific opts

2013-04-12 Thread Richard Biener
On Thu, Apr 11, 2013 at 10:18 PM, Xinliang David Li davi...@google.com wrote:
 Great that it is already covered.

I expect that with more convoluted code you'll ICE eventually.  There is also
a bugreport about exactly this issue which should be referenced when a fix
is committed.

Oh, and target maintainers disagree about if there is anything to fix and
what it would take to fix it ...

Note that ICC happily expands the intrinsics to SSEX code even if
SSEX is not enabled.

Richard.

 David

 On Thu, Apr 11, 2013 at 1:09 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com 
 wrote:
 What is the compile time impact for turning it on? Code not including
 the intrinsic headers should not be affected too much.  If the impact
 is small, why not turning on this option by default -- which seems to
 be the behavior of ICC.

 I will get back with data on this.


 With this option, all functions without the appropriate target options
 will be allowed to reference not supported builtins, without warnings
 or errors. Is it possible to delay the check till the builtin
 expansion time?

 Right now, an error is generated if a function accesses an unsupported
 builtin. Since the intrinsic functions are marked inline and call some
 target builtin, this will always be caught.

 To be clear, same example without the target attribute and with
 -mgenerate-builtins results in an error:

 #include smmintrin.h

 __m128i foo(__m128i *V)
 {
 return _mm_stream_load_si128(V);
 }

 $ g++ test.cc -mgenerate-builtins

 smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option
 -m32 -msse4.1
return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);



 Sri


 thanks,

 David

 On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com 
 wrote:
 Hi,

 *mmintrin headers does not work with function specific opts.

 Example 1:


 #include smmintrin.h

 __attribute__((target(sse4.1)))
 __m128i foo(__m128i *V)
 {
 return _mm_stream_load_si128(V);
 }


 $ g++ test.cc
 smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled
  # error SSE4.1 instruction set not enabled

 This error happens even though foo is marked sse4.1

 There are multiple issues at play here. One, the headers are guarded
 by macros that are switched on only when the target specific options,
 like -msse4.1 in this case, are present in the command line. Also, the
 target specific builtins, like __builtin_ia32_movntdqa called by
 _mm_stream_load_si128, are exposed only in the presence of the
 appropriate target ISA option.


 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 This feature will greatly benefit the function multiversioning usability 
 too.

 Comments?

 Thanks
 Sri


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-12 Thread Jakub Jelinek
On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote:
 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

1) this shouldn't be an option, either it can be made to work reliably,
   then it should be done always, or it can't, then it shouldn't be done
2) have you verified that if you always generate all builtins, that the
   builtins not supported by the ISA selected from the command line are
   created with the right vector modes?
3) the *intrin.h headers in the case where the guarding macro isn't defined
   should be surrounded by something like
   #ifndef __FMA4__
   #pragma GCC push options
   #pragma GCC target(fma4)
   #endif
   ...
   #ifndef __FMA4__
   #pragma GCC pop options
   #endif
   so that everything that is in the headers is compiled with the ISA
   in question
4) what happens if you use the various vector types typedefed in the
   *intrin.h headers in code that doesn't support those ISAs?  As TYPE_MODE
   for VECTOR_TYPE is a function call, perhaps it will just be handled as
   generic BLKmode vectors, which is desirable I think
5) what happens if you use a target builtin in a function not supporting
   the corresponding ISA, do you get proper error explaining what you are
   doing wrong?
6) what happens if you use some intrinsics in a function not supporting
   the corresponding ISA?  Dunno if the inliner chooses not to inline it
   and error out because it is always_inline, or what exactly will happen
   then

For all this you certainly need testcases.

Jakub


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-12 Thread Xinliang David Li
On Fri, Apr 12, 2013 at 1:22 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Thu, Apr 11, 2013 at 10:18 PM, Xinliang David Li davi...@google.com 
 wrote:
 Great that it is already covered.

 I expect that with more convoluted code you'll ICE eventually.

Yes, those are bugs that need to be fixed.

 There is also
 a bugreport about exactly this issue which should be referenced when a fix
 is committed.

there is a related thread from 2 years ago :
http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01417.html

The patch was adding individual ISA macros eagerly which is not a good
idea (as they are also used in user code). Sri's patch is certainly
cleaner.


 Oh, and target maintainers disagree about if there is anything to fix and
 what it would take to fix it ...


There are concerns about buggy function level target option handling,
vector type lowering etc. It would be great if there are actual cases
to demonstrate it.

 Note that ICC happily expands the intrinsics to SSEX code even if
 SSEX is not enabled.

This seems handy, but may hide user errors.

David



 Richard.

 David

 On Thu, Apr 11, 2013 at 1:09 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com 
 wrote:
 What is the compile time impact for turning it on? Code not including
 the intrinsic headers should not be affected too much.  If the impact
 is small, why not turning on this option by default -- which seems to
 be the behavior of ICC.

 I will get back with data on this.


 With this option, all functions without the appropriate target options
 will be allowed to reference not supported builtins, without warnings
 or errors. Is it possible to delay the check till the builtin
 expansion time?

 Right now, an error is generated if a function accesses an unsupported
 builtin. Since the intrinsic functions are marked inline and call some
 target builtin, this will always be caught.

 To be clear, same example without the target attribute and with
 -mgenerate-builtins results in an error:

 #include smmintrin.h

 __m128i foo(__m128i *V)
 {
 return _mm_stream_load_si128(V);
 }

 $ g++ test.cc -mgenerate-builtins

 smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option
 -m32 -msse4.1
return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);



 Sri


 thanks,

 David

 On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com 
 wrote:
 Hi,

 *mmintrin headers does not work with function specific opts.

 Example 1:


 #include smmintrin.h

 __attribute__((target(sse4.1)))
 __m128i foo(__m128i *V)
 {
 return _mm_stream_load_si128(V);
 }


 $ g++ test.cc
 smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled
  # error SSE4.1 instruction set not enabled

 This error happens even though foo is marked sse4.1

 There are multiple issues at play here. One, the headers are guarded
 by macros that are switched on only when the target specific options,
 like -msse4.1 in this case, are present in the command line. Also, the
 target specific builtins, like __builtin_ia32_movntdqa called by
 _mm_stream_load_si128, are exposed only in the presence of the
 appropriate target ISA option.


 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 This feature will greatly benefit the function multiversioning usability 
 too.

 Comments?

 Thanks
 Sri


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-12 Thread Xinliang David Li
On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek ja...@redhat.com wrote:
 On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote:
 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 1) this shouldn't be an option, either it can be made to work reliably,
then it should be done always, or it can't, then it shouldn't be done

Except that if there is compile time/memory consumption concerns,
users can use the option to turn it off.

 2) have you verified that if you always generate all builtins, that the
builtins not supported by the ISA selected from the command line are
created with the right vector modes?
 3) the *intrin.h headers in the case where the guarding macro isn't defined
should be surrounded by something like
#ifndef __FMA4__
#pragma GCC push options
#pragma GCC target(fma4)
#endif
...
#ifndef __FMA4__
#pragma GCC pop options
#endif
so that everything that is in the headers is compiled with the ISA
in question

For the inline functions? (The caller functions should have the target
option), or FE needs the target option properly set?

 4) what happens if you use the various vector types typedefed in the
*intrin.h headers in code that doesn't support those ISAs?  As TYPE_MODE
for VECTOR_TYPE is a function call, perhaps it will just be handled as
generic BLKmode vectors, which is desirable I think

Will the veclower pass deal with it (lowered into scalar operations)?


 5) what happens if you use a target builtin in a function not supporting
the corresponding ISA, do you get proper error explaining what you are
doing wrong?

Yes, in ix86_expand_builtin there is a check -- that is what Sri
mentioned in a previous email.

 6) what happens if you use some intrinsics in a function not supporting
the corresponding ISA?  Dunno if the inliner chooses not to inline it
and error out because it is always_inline, or what exactly will happen
then

I think it may end up with errors about unsupported builtins as above

thanks,

David


 For all this you certainly need testcases.

 Jakub


GCC does not support *mmintrin.h with function specific opts

2013-04-11 Thread Sriraman Tallam
Hi,

*mmintrin headers does not work with function specific opts.

Example 1:


#include smmintrin.h

__attribute__((target(sse4.1)))
__m128i foo(__m128i *V)
{
return _mm_stream_load_si128(V);
}


$ g++ test.cc
smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled
 # error SSE4.1 instruction set not enabled

This error happens even though foo is marked sse4.1

There are multiple issues at play here. One, the headers are guarded
by macros that are switched on only when the target specific options,
like -msse4.1 in this case, are present in the command line. Also, the
target specific builtins, like __builtin_ia32_movntdqa called by
_mm_stream_load_si128, are exposed only in the presence of the
appropriate target ISA option.


I have attached a patch that fixes this. I have added an option
-mgenerate-builtins that will do two things.  It will define a macro
__ALL_ISA__ which will expose the *intrin.h functions. It will also
expose all the target specific builtins.  -mgenerate-builtins will not
affect code generation.

This feature will greatly benefit the function multiversioning usability too.

Comments?

Thanks
Sri
Index: emmintrin.h
===
--- emmintrin.h (revision 197691)
+++ emmintrin.h (working copy)
@@ -27,7 +27,7 @@
 #ifndef _EMMINTRIN_H_INCLUDED
 #define _EMMINTRIN_H_INCLUDED
 
-#ifndef __SSE2__
+#if !defined (__SSE2__)  !defined (__ALL_ISA__)
 # error SSE2 instruction set not enabled
 #else
 
Index: fma4intrin.h
===
--- fma4intrin.h(revision 197691)
+++ fma4intrin.h(working copy)
@@ -28,7 +28,7 @@
 #ifndef _FMA4INTRIN_H_INCLUDED
 #define _FMA4INTRIN_H_INCLUDED
 
-#ifndef __FMA4__
+#if !defined (__FMA4__)  !defined (__ALL_ISA__)
 # error FMA4 instruction set not enabled
 #else
 
Index: lwpintrin.h
===
--- lwpintrin.h (revision 197691)
+++ lwpintrin.h (working copy)
@@ -28,7 +28,7 @@
 #ifndef _LWPINTRIN_H_INCLUDED
 #define _LWPINTRIN_H_INCLUDED
 
-#ifndef __LWP__
+#if !defined (__LWP__)  !defined (__ALL_ISA__)
 # error LWP instruction set not enabled
 #else
 
Index: xopintrin.h
===
--- xopintrin.h (revision 197691)
+++ xopintrin.h (working copy)
@@ -28,7 +28,7 @@
 #ifndef _XOPMMINTRIN_H_INCLUDED
 #define _XOPMMINTRIN_H_INCLUDED
 
-#ifndef __XOP__
+#if !defined (__XOP__)  !defined (__ALL_ISA__)
 # error XOP instruction set not enabled
 #else
 
Index: fmaintrin.h
===
--- fmaintrin.h (revision 197691)
+++ fmaintrin.h (working copy)
@@ -28,7 +28,7 @@
 #ifndef _FMAINTRIN_H_INCLUDED
 #define _FMAINTRIN_H_INCLUDED
 
-#ifndef __FMA__
+#if !defined (__FMA__)  !defined (__ALL_ISA__)
 # error FMA instruction set not enabled
 #else
 
Index: bmiintrin.h
===
--- bmiintrin.h (revision 197691)
+++ bmiintrin.h (working copy)
@@ -25,7 +25,7 @@
 # error Never use bmiintrin.h directly; include x86intrin.h instead.
 #endif
 
-#ifndef __BMI__
+#if !defined (__BMI__)  !defined (__ALL_ISA__)
 # error BMI instruction set not enabled
 #endif /* __BMI__ */
 
Index: mmintrin.h
===
--- mmintrin.h  (revision 197691)
+++ mmintrin.h  (working copy)
@@ -27,7 +27,7 @@
 #ifndef _MMINTRIN_H_INCLUDED
 #define _MMINTRIN_H_INCLUDED
 
-#ifndef __MMX__
+#if !defined (__MMX__)  !defined (__ALL_ISA__)
 # error MMX instruction set not enabled
 #else
 /* The Intel API is flexible enough that we must allow aliasing with other
Index: nmmintrin.h
===
--- nmmintrin.h (revision 197691)
+++ nmmintrin.h (working copy)
@@ -27,7 +27,7 @@
 #ifndef _NMMINTRIN_H_INCLUDED
 #define _NMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_2__
+#if !defined (__SSE4_2__)  !defined (__ALL_ISA__)
 # error SSE4.2 instruction set not enabled
 #else
 /* We just include SSE4.1 header file.  */
Index: tbmintrin.h
===
--- tbmintrin.h (revision 197691)
+++ tbmintrin.h (working copy)
@@ -25,7 +25,7 @@
 # error Never use tbmintrin.h directly; include x86intrin.h instead.
 #endif
 
-#ifndef __TBM__
+#if !defined (__TBM__)  !defined (__ALL_ISA__)
 # error TBM instruction set not enabled
 #endif /* __TBM__ */
 
Index: f16cintrin.h
===
--- f16cintrin.h(revision 197691)
+++ f16cintrin.h(working copy)
@@ -25,7 +25,7 @@
 # error Never use f16intrin.h directly; include x86intrin.h or 
immintrin.h instead.
 #endif
 
-#ifndef __F16C__
+#if !defined (__F16C__)  !defined (__ALL_ISA__)
 # error F16C instruction set not enabled
 #else
 
Index: i386.opt
===
--- 

Re: GCC does not support *mmintrin.h with function specific opts

2013-04-11 Thread Xinliang David Li
What is the compile time impact for turning it on? Code not including
the intrinsic headers should not be affected too much.  If the impact
is small, why not turning on this option by default -- which seems to
be the behavior of ICC.

With this option, all functions without the appropriate target options
will be allowed to reference not supported builtins, without warnings
or errors. Is it possible to delay the check till the builtin
expansion time?

thanks,

David

On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com wrote:
 Hi,

 *mmintrin headers does not work with function specific opts.

 Example 1:


 #include smmintrin.h

 __attribute__((target(sse4.1)))
 __m128i foo(__m128i *V)
 {
 return _mm_stream_load_si128(V);
 }


 $ g++ test.cc
 smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled
  # error SSE4.1 instruction set not enabled

 This error happens even though foo is marked sse4.1

 There are multiple issues at play here. One, the headers are guarded
 by macros that are switched on only when the target specific options,
 like -msse4.1 in this case, are present in the command line. Also, the
 target specific builtins, like __builtin_ia32_movntdqa called by
 _mm_stream_load_si128, are exposed only in the presence of the
 appropriate target ISA option.


 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 This feature will greatly benefit the function multiversioning usability too.

 Comments?

 Thanks
 Sri


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-11 Thread Sriraman Tallam
On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com wrote:
 What is the compile time impact for turning it on? Code not including
 the intrinsic headers should not be affected too much.  If the impact
 is small, why not turning on this option by default -- which seems to
 be the behavior of ICC.

I will get back with data on this.


 With this option, all functions without the appropriate target options
 will be allowed to reference not supported builtins, without warnings
 or errors. Is it possible to delay the check till the builtin
 expansion time?

Right now, an error is generated if a function accesses an unsupported
builtin. Since the intrinsic functions are marked inline and call some
target builtin, this will always be caught.

Sri


 thanks,

 David

 On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com wrote:
 Hi,

 *mmintrin headers does not work with function specific opts.

 Example 1:


 #include smmintrin.h

 __attribute__((target(sse4.1)))
 __m128i foo(__m128i *V)
 {
 return _mm_stream_load_si128(V);
 }


 $ g++ test.cc
 smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled
  # error SSE4.1 instruction set not enabled

 This error happens even though foo is marked sse4.1

 There are multiple issues at play here. One, the headers are guarded
 by macros that are switched on only when the target specific options,
 like -msse4.1 in this case, are present in the command line. Also, the
 target specific builtins, like __builtin_ia32_movntdqa called by
 _mm_stream_load_si128, are exposed only in the presence of the
 appropriate target ISA option.


 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 This feature will greatly benefit the function multiversioning usability too.

 Comments?

 Thanks
 Sri


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-11 Thread Sriraman Tallam
On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com 
 wrote:
 What is the compile time impact for turning it on? Code not including
 the intrinsic headers should not be affected too much.  If the impact
 is small, why not turning on this option by default -- which seems to
 be the behavior of ICC.

 I will get back with data on this.


 With this option, all functions without the appropriate target options
 will be allowed to reference not supported builtins, without warnings
 or errors. Is it possible to delay the check till the builtin
 expansion time?

 Right now, an error is generated if a function accesses an unsupported
 builtin. Since the intrinsic functions are marked inline and call some
 target builtin, this will always be caught.

To be clear, same example without the target attribute and with
-mgenerate-builtins results in an error:

#include smmintrin.h

__m128i foo(__m128i *V)
{
return _mm_stream_load_si128(V);
}

$ g++ test.cc -mgenerate-builtins

smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option
-m32 -msse4.1
   return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);



 Sri


 thanks,

 David

 On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com 
 wrote:
 Hi,

 *mmintrin headers does not work with function specific opts.

 Example 1:


 #include smmintrin.h

 __attribute__((target(sse4.1)))
 __m128i foo(__m128i *V)
 {
 return _mm_stream_load_si128(V);
 }


 $ g++ test.cc
 smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled
  # error SSE4.1 instruction set not enabled

 This error happens even though foo is marked sse4.1

 There are multiple issues at play here. One, the headers are guarded
 by macros that are switched on only when the target specific options,
 like -msse4.1 in this case, are present in the command line. Also, the
 target specific builtins, like __builtin_ia32_movntdqa called by
 _mm_stream_load_si128, are exposed only in the presence of the
 appropriate target ISA option.


 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 This feature will greatly benefit the function multiversioning usability 
 too.

 Comments?

 Thanks
 Sri


Re: GCC does not support *mmintrin.h with function specific opts

2013-04-11 Thread Xinliang David Li
Great that it is already covered.

David

On Thu, Apr 11, 2013 at 1:09 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li davi...@google.com 
 wrote:
 What is the compile time impact for turning it on? Code not including
 the intrinsic headers should not be affected too much.  If the impact
 is small, why not turning on this option by default -- which seems to
 be the behavior of ICC.

 I will get back with data on this.


 With this option, all functions without the appropriate target options
 will be allowed to reference not supported builtins, without warnings
 or errors. Is it possible to delay the check till the builtin
 expansion time?

 Right now, an error is generated if a function accesses an unsupported
 builtin. Since the intrinsic functions are marked inline and call some
 target builtin, this will always be caught.

 To be clear, same example without the target attribute and with
 -mgenerate-builtins results in an error:

 #include smmintrin.h

 __m128i foo(__m128i *V)
 {
 return _mm_stream_load_si128(V);
 }

 $ g++ test.cc -mgenerate-builtins

 smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option
 -m32 -msse4.1
return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);



 Sri


 thanks,

 David

 On Thu, Apr 11, 2013 at 12:05 PM, Sriraman Tallam tmsri...@google.com 
 wrote:
 Hi,

 *mmintrin headers does not work with function specific opts.

 Example 1:


 #include smmintrin.h

 __attribute__((target(sse4.1)))
 __m128i foo(__m128i *V)
 {
 return _mm_stream_load_si128(V);
 }


 $ g++ test.cc
 smmintrin.h:31:3: error: #error SSE4.1 instruction set not enabled
  # error SSE4.1 instruction set not enabled

 This error happens even though foo is marked sse4.1

 There are multiple issues at play here. One, the headers are guarded
 by macros that are switched on only when the target specific options,
 like -msse4.1 in this case, are present in the command line. Also, the
 target specific builtins, like __builtin_ia32_movntdqa called by
 _mm_stream_load_si128, are exposed only in the presence of the
 appropriate target ISA option.


 I have attached a patch that fixes this. I have added an option
 -mgenerate-builtins that will do two things.  It will define a macro
 __ALL_ISA__ which will expose the *intrin.h functions. It will also
 expose all the target specific builtins.  -mgenerate-builtins will not
 affect code generation.

 This feature will greatly benefit the function multiversioning usability 
 too.

 Comments?

 Thanks
 Sri