Re: [PATCH][AArch64][14/14] Reuse target_option_current_node when passing pragma string to target attribute

2015-07-21 Thread James Greenhalgh
On Thu, Jul 16, 2015 at 04:21:22PM +0100, Kyrill Tkachov wrote:
 Hi all,
 
 This patch improves compilation times for code using the arm_neon.h
 intrinsics. The problem is that since we now wrap all the intrinsics in
 arm_neon.h inside a pragma, the midend applies the pragma string to every
 single intrinsic as an attribute, calling the target attribute parsing
 code thousands of times on the same string. I've seen this cause a
 slowdown of around 3-5% on large intrinsics programs.
 
 This patch checks whether the ARGS we're supposed to process is the same
 as the pragma already processed by the pragma processing code in
 aarch64-c.c. If it is, then we know that the correct target node is
 already set in target_option_current_node, so we can just reuse that,
 saving us the trouble of parsing the string.
 
 This gets compilation times for large intrinsics programs back to their
 previous levels. We still get a compile-time hit on small programs, I
 presume because the pragma use makes grokdeclarator in the frontend
 appear high in the profile. But for large programs we should be good.
 The compilation time will be dominated by the other parts of the
 compiler. For small programs, garbage collection is at the top of the
 profile in either case.
 
 Bootstrapped and tested on aarch64.
 
 Ok for trunk?

OK.

Thanks,
James

 2015-07-16  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
  * config/aarch64/aarch64.c (aarch64_option_valid_attribute_p):
  Exit early and use target_option_current_node if processing current
  pragma.




[PATCH][AArch64][14/14] Reuse target_option_current_node when passing pragma string to target attribute

2015-07-16 Thread Kyrill Tkachov

Hi all,

This patch improves compilation times for code using the arm_neon.h
intrinsics. The problem is that since we now wrap all the intrinsics in
arm_neon.h inside a pragma, the midend applies the pragma string to every
single intrinsic as an attribute, calling the target attribute parsing
code thousands of times on the same string. I've seen this cause a
slowdown of around 3-5% on large intrinsics programs.

This patch checks whether the ARGS we're supposed to process is the same
as the pragma already processed by the pragma processing code in
aarch64-c.c. If it is, then we know that the correct target node is
already set in target_option_current_node, so we can just reuse that,
saving us the trouble of parsing the string.
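
To illustrate the idea outside of GCC, here is a minimal, self-contained C
sketch of the fast path. It is not GCC code: option_node, function_decl,
parse_target_string, pragma_target_parse, option_valid_attribute_p and
parse_count are all hypothetical stand-ins, and the two static globals only
model the roles of current_target_pragma and target_option_current_node.
As in the patch, the check is a pointer comparison, so it only fires when
the very same string object that the pragma handler saw is passed in again.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a parsed target option node.  */
struct option_node
{
  int simd_enabled;
};

/* Hypothetical stand-in for a function declaration.  TARGET is NULL until
   target options have been attached; OWN is storage for the slow path.  */
struct function_decl
{
  struct option_node *target;
  struct option_node own;
};

/* Model the roles of current_target_pragma and
   target_option_current_node (assumption: plain pointers, not trees).  */
static const char *current_pragma_args;
static struct option_node *current_pragma_node;

/* Counts how many times the slow parsing path runs.  */
static int parse_count;

/* Slow path: parse ARGS into NODE.  */
static void
parse_target_string (const char *args, struct option_node *node)
{
  parse_count++;
  node->simd_enabled = strstr (args, "simd") != NULL;
}

/* Called once when the pragma is seen: parse ARGS and remember both the
   string and the resulting node.  */
static void
pragma_target_parse (const char *args, struct option_node *node)
{
  parse_target_string (args, node);
  current_pragma_args = args;
  current_pragma_node = node;
}

/* Called once per function that the pragma applies to.  */
static bool
option_valid_attribute_p (struct function_decl *fn, const char *args)
{
  /* Fast path: ARGS is the very string the pragma handler already
     parsed, so reuse its node instead of parsing again.  */
  if (fn->target == NULL && args == current_pragma_args)
    {
      fn->target = current_pragma_node;
      return true;
    }

  /* Slow path: parse the string for this function.  */
  parse_target_string (args, &fn->own);
  fn->target = &fn->own;
  return true;
}
```

In this model, calling pragma_target_parse once and then
option_valid_attribute_p for a thousand declarations with the same string
parses that string only once. A different string still takes the slow path.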

This gets compilation times for large intrinsics programs back to their
previous levels. We still get a compile-time hit on small programs, I
presume because the pragma use makes grokdeclarator in the frontend
appear high in the profile. But for large programs we should be good.
The compilation time will be dominated by the other parts of the
compiler. For small programs, garbage collection is at the top of the
profile in either case.

Bootstrapped and tested on aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-07-16  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/aarch64.c (aarch64_option_valid_attribute_p):
Exit early and use target_option_current_node if processing current
pragma.

commit 0bbab2ef7fb4be18780b5c87d338d2f9d9116fe4
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Thu May 28 15:33:49 2015 +0100

[AArch64][14/N] Reuse target_option_current_node when passing pragma string to target attribute

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f0f3cdc..f8c5aa4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8431,6 +8431,18 @@ aarch64_option_valid_attribute_p (tree fndecl, tree, tree args, int)
   tree old_optimize;
   tree new_target, new_optimize;
   tree existing_target = DECL_FUNCTION_SPECIFIC_TARGET (fndecl);
+
+  /* If what we're processing is the current pragma string then the
+     target option node is already stored in target_option_current_node
+     by aarch64_pragma_target_parse in aarch64-c.c.  Use that to avoid
+     having to re-parse the string.  This is especially useful to keep
+     arm_neon.h compile times down since that header contains a lot
+     of intrinsics enclosed in pragmas.  */
+  if (!existing_target && args == current_target_pragma)
+    {
+      DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = target_option_current_node;
+      return true;
+    }
   tree func_optimize = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl);
 
   old_optimize = build_optimization_node (global_options);