Replacing malloc with alloca.

2015-09-13 Thread Ajit Kumar Agarwal
All: malloc can be replaced with alloca based on the following analysis: if the lifetime of an object does not extend beyond the immediate scope, the malloc can be replaced with alloca. This can improve performance considerably, since the allocation moves from the heap to the stack. Inlining helps to a great extent th
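A minimal sketch of the idea in the message above, not taken from the thread: the buffer in `sum_squares_malloc` never escapes the function, so the same storage can come from the stack. The function names are hypothetical, and `<alloca.h>` is assumed to be available (alloca is not part of ISO C; the header shown is the glibc one).

```c
#include <alloca.h>  /* assumption: glibc-style header; alloca is non-standard */
#include <stddef.h>
#include <stdlib.h>

/* Heap version: buf is allocated and freed in the same scope and never escapes. */
long sum_squares_malloc(size_t n)
{
    if (n == 0)
        return 0;
    long *buf = malloc(n * sizeof *buf);
    if (buf == NULL)
        return -1;
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = (long)i * (long)i;
        sum += buf[i];
    }
    free(buf);
    return sum;
}

/* Stack version: the storage is released automatically when the function
 * returns. Only reasonable when n is known to be small, because alloca
 * cannot report failure and a large n can overflow the stack. */
long sum_squares_alloca(size_t n)
{
    if (n == 0)
        return 0;
    long *buf = alloca(n * sizeof *buf);
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = (long)i * (long)i;
        sum += buf[i];
    }
    return sum;
}
```

Both functions compute the same result; the second one trades the malloc/free pair for a stack pointer adjustment, which is only valid because the object's lifetime ends with the enclosing scope.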

RE: Live range Analysis based on tree representations

2015-09-12 Thread Ajit Kumar Agarwal
-Original Message- From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com] Sent: Friday, September 04, 2015 11:51 PM To: Ajit Kumar Agarwal Cc: Jeff Law; vmaka...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala

Selective criteria and Heuristics for Loop Unrolling.

2015-09-12 Thread Ajit Kumar Agarwal
All: The choice of unrolling factor is an important criterion for the loop unrolling optimization. Choosing the unrolling factor based on the criteria below improves the performance of unrolled loops. 1. Number of operations. 2. Number of operands. 3. Numb

Inlining Decision Priority Function.

2015-09-12 Thread Ajit Kumar Agarwal
All: Inlining avoids setting up the callee's stack frame and places the callee in the caller's context, which increases performance. The priority function for inlining decisions can be calculated by considering the following. 1. Nest level of the callee. 2. code size

Cost and Benefit Allocation for moving the expression above the conditional.

2015-09-06 Thread Ajit Kumar Agarwal
All: The cost and benefit associated with moving a given expression above a conditional are important factors for performance. Considering this, the cost and benefit calculation can be derived as below. For a given conditional entry point 'n', the benefit path 'p' are t

RE: Live range Analysis based on tree representations

2015-09-03 Thread Ajit Kumar Agarwal
-Original Message- From: Aaron Sawdey [mailto:acsaw...@linux.vnet.ibm.com] Sent: Wednesday, September 02, 2015 8:23 PM To: Ajit Kumar Agarwal Cc: Jeff Law; vmaka...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala

RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

2015-09-02 Thread Ajit Kumar Agarwal
-Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal Sent: Wednesday, August 19, 2015 2:53 PM To: Richard Biener Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: RE

Live range Analysis based on tree representations

2015-09-01 Thread Ajit Kumar Agarwal
All: Live range information on the tree SSA representation is an important step towards SSA-based code motion optimizations, as code motion on the SSA representation affects register pressure and can be a cause of performance bottlenecks. I am proposing the Live range Analysis ba

Commoning the control and Data Dependence

2015-09-01 Thread Ajit Kumar Agarwal
All: The data dependency graph augmented with control dependence can be commoned out based on dominator info. Suppose instruction I1 dominates all its uses, say instructions I2 and I3. Then I2 and I3 depend on I1. Thus the graph can be formed from the dominator tree of all the instructions and the

Awareness of register pressure on strength reduction of induction variables.

2015-09-01 Thread Ajit Kumar Agarwal
All: Global code motion is an important optimization that has an impact on register spills and fetches. Thus global code motion takes into account the increase or decrease of register pressure. Strength reduction is an important optimization that has an impact on register pressure. T

RE: [RFC]: Vectorization cost benefit changes.

2015-08-21 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Friday, August 21, 2015 2:03 PM To: Ajit Kumar Agarwal Cc: Jeff Law; GCC Patches; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [RFC

[RFC]: Vectorization cost benefit changes.

2015-08-20 Thread Ajit Kumar Agarwal
All: I have made the vectorization cost changes given below. I have considered only the inside cost instead of the outside cost. The inside scalar and vector costs are considered because the inside costs matter more than the outside cost. min_profitable_i

RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

2015-08-19 Thread Ajit Kumar Agarwal
-Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal Sent: Monday, August 17, 2015 4:03 PM To: Richard Biener Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: RE

RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

2015-08-17 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Friday, August 14, 2015 9:59 PM To: Ajit Kumar Agarwal Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: RE: vectorization cost macro

More of a Loop fusion

2015-08-16 Thread Ajit Kumar Agarwal
All: Loop fusion is an important optimization that fuses a set of loops if the following conditions hold. 1. Loops are conformant (i.e. they have the same iteration count). 2. Loops are control equivalent. The control equivalence of the loops can be identified with the dominator and post dom
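A small illustration of the two conditions listed above, written as a sketch rather than as anything from the thread. The function names are hypothetical, and the arrays are assumed not to overlap, so fusing introduces no fusion-preventing dependence.

```c
#include <stddef.h>

/* Before fusion: two adjacent loops with the same iteration count
 * (conformant) that always execute together (control equivalent). */
void separate(const int *a, int *b, int *c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        b[i] = a[i] + 1;
    for (size_t i = 0; i < n; i++)
        c[i] = b[i] * 2;
}

/* After fusion: one loop, one pass over the data. The value of b[i]
 * is still produced before it is consumed within the same iteration. */
void fused(const int *a, int *b, int *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        b[i] = a[i] + 1;
        c[i] = b[i] * 2;
    }
}
```

The fused form reuses `b[i]` while it is still in a register or cache line, which is the usual motivation for the transformation.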

RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

2015-08-14 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Monday, August 03, 2015 2:59 PM To: Ajit Kumar Agarwal Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: vectorization cost macro

RE: More of a Loop distribution.

2015-08-13 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Friday, August 14, 2015 11:30 AM To: Ajit Kumar Agarwal Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: RE: More of a Loop distribution

RE: More of a Loop distribution.

2015-08-13 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Thursday, August 13, 2015 3:23 PM To: Ajit Kumar Agarwal Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: More of a Loop distribution

More of a Loop distribution.

2015-08-13 Thread Ajit Kumar Agarwal
All: Loop distribution considers the DDG to decide on distributing loops. Loops with control statements like IF-THEN-ELSE can also be distributed. Instead of the Data Dependency Graph, the Control Dependence Graph should be considered in order to distribute the loops in the presence of control Stat

Loop distribution for nested Loops.

2015-08-04 Thread Ajit Kumar Agarwal
All: For the loop given in Fig(1), there is no possibility of loop distribution because of the dependency of S1 and S2 on the outer loop index k. Due to the dependency the loop cannot be distributed. The loop can be distributed with the transformation given in Fig(2) where the loop given in Fi

RE: vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

2015-08-04 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Monday, August 03, 2015 2:59 PM To: Ajit Kumar Agarwal Cc: Jeff Law; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: vectorization cost macro

vectorization cost macro TARGET_VECTORIZE_ADD_STMT_COST

2015-08-02 Thread Ajit Kumar Agarwal
All: The following macro determines the statement cost that is added to the vectorization cost: #define TARGET_VECTORIZE_ADD_STMT_COST. In the implementation of the above macro, the following is done for many architectures that support vectorization, like i386 and ARM. if (where == vect

RETURN_ADDRESS_POINTER_REGNUM Macro

2015-07-23 Thread Ajit Kumar Agarwal
All: From the description of the definition of the macro RETURN_ADDRESS_POINTER_REGNUM, it is derived that this macro is used to define a register that helps in getting the return address from the stack or frame pointer. I could see many of the architectures supported by

RE: Traces on Data Dependency graph.

2015-07-14 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Tuesday, July 14, 2015 6:35 PM To: Ajit Kumar Agarwal Cc: Jeff Law; Jan Hubicka; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Traces on Data

Traces on Data Dependency graph.

2015-07-14 Thread Ajit Kumar Agarwal
All: I am wondering how useful it would be to form traces on the Data Dependency Graph. On top of the traces in the control flow graph, I was thinking of forming traces on the Data Dependency Graph (DDG). Would this help in finding further vectorization and parallelization candidates? Thoughts? Thanks & Regar

Partition and subpartition Analysis that helps in further vectorization and parallelization

2015-07-14 Thread Ajit Kumar Agarwal
All: I am trying to place the following analysis in the vectorizer of GCC; it helps improve the vectorizer for unit stride, zero stride and non stride accesses of memory. For the Data Dependency Graph, the topological sort is performed. The

RE: [RFC] Design and Implementation for Path Splitting for Loop with Conditional IF-THEN-ELSE

2015-07-10 Thread Ajit Kumar Agarwal
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Friday, July 10, 2015 4:04 AM To: Ajit Kumar Agarwal; Richard Biener; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [RFC] Design and Implementation for Path

RE: [RFC] Design and Implementation for Path Splitting for Loop with Conditional IF-THEN-ELSE

2015-07-10 Thread Ajit Kumar Agarwal
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Friday, July 10, 2015 4:04 AM To: Ajit Kumar Agarwal; Richard Biener; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [RFC] Design and Implementation for Path

CFG transformation of loops with continue statement inside the loops.

2015-07-08 Thread Ajit Kumar Agarwal
All: While/For (condition1) { Some code here. If (condition2) continue; Some code here. } Fig(1) For the above loop in Fig(1) there will be two backedges and multiple latches. The above code can be transformed to the below in order to have a single backedge. While/For (condition

RE: Live on Exit renaming.

2015-07-05 Thread Ajit Kumar Agarwal
-Original Message- From: Bin.Cheng [mailto:amker.ch...@gmail.com] Sent: Monday, July 06, 2015 10:26 AM To: Ajit Kumar Agarwal Cc: Steven Bosscher; l...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re

RE: Live on Exit renaming.

2015-07-05 Thread Ajit Kumar Agarwal
-Original Message- From: Bin.Cheng [mailto:amker.ch...@gmail.com] Sent: Monday, July 06, 2015 7:04 AM To: Steven Bosscher Cc: Ajit Kumar Agarwal; l...@redhat.com; Richard Biener; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Live

Reduction Pattern ( Vectorization or Parallelization)

2015-07-05 Thread Ajit Kumar Agarwal
All: The scalar and array reduction patterns can be identified if the result of commutative updates is applied to the same scalar or array variables on the LHS with +, *, Min or Max. Thus the reduction patterns identified with the commutative updates help in vectorization or parallelization. Fo
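The patterns described above can be sketched in C as follows. These are illustrative loops with hypothetical names, not code from the thread: the first two are scalar reductions (+ and Max), and the third is an array reduction in which the same array appears on the LHS of a commutative update.

```c
#include <stddef.h>

/* Scalar reduction with +: s appears on both sides of the update. */
int sum_reduce(const int *a, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s = s + a[i];
    return s;
}

/* Scalar reduction with Max; assumes n >= 1. */
int max_reduce(const int *a, size_t n)
{
    int m = a[0];
    for (size_t i = 1; i < n; i++)
        m = (a[i] > m) ? a[i] : m;
    return m;
}

/* Array reduction: hist[...] is updated with + on the LHS.
 * hist is assumed to have 256 zero-initialised entries. */
void histogram(const unsigned char *keys, size_t n, int *hist)
{
    for (size_t i = 0; i < n; i++)
        hist[keys[i]] = hist[keys[i]] + 1;
}
```

Because the update operators are commutative and associative on these types, the iterations can be regrouped into partial results per vector lane or per thread and combined at the end.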

Allocation of hotness of data structure with respect to the top of stack.

2015-07-05 Thread Ajit Kumar Agarwal
All: I am wondering whether allocating hot data structures closer to the top of the stack increases the performance of the application. Data structures are identified as hot or cold, and all the data structures are sorted in decreasing order of hotness and the hot data structure

RE: Live on Exit renaming.

2015-07-04 Thread Ajit Kumar Agarwal
Sorry for the typo. Below is the corrected Fig (1). while (a[i] != key) i = i+1; return i; Fig (1). Thanks & Regards Ajit -Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal Sent: Saturday, July 04, 20

Live on Exit renaming.

2015-07-04 Thread Ajit Kumar Agarwal
All: Design and Analysis of Profile-Based Optimization in Compaq's Compilation Tools for Alpha; Journal of Instruction-Level Parallelism 3 (2000) 1-25. Based on the above paper, the existing tracer pass (This pass performs the tail duplication needed for superblock formation.)

RE: Consideration of Cost associated with SEME regions.

2015-07-02 Thread Ajit Kumar Agarwal
Ajit -Original Message- From: Ajit Kumar Agarwal Sent: Thursday, July 02, 2015 3:33 PM To: vmaka...@redhat.com; l...@redhat.com; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Consideration of Cost associated with SEME regions. All: The

Consideration of Cost associated with SEME regions.

2015-07-02 Thread Ajit Kumar Agarwal
All: The cost calculation for a spill candidate in the Integrated Register Allocator (IRA) considers only SESE regions. The cost calculation in the IRA should also take SEME regions into consideration for spilling decisions. The cost associated with the path that has premature exits shou

Transformation from SEME(Single Entry Multiple Exit) to SESE(Single Entry Single Exit)

2015-07-02 Thread Ajit Kumar Agarwal
All: Single Entry and Multiple Exits disables traditional loop optimization. The presence of short circuit evaluation also makes the CFG Single Entry and Multiple Exits. The transformation from SEME (Single Entry and Multiple Exits) to SESE (Single Entry and Single Exit) enables many loop optimizations.
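As a source-level illustration only, and not the transformation proposed in the thread, the sketch below shows a loop with two exits and an equivalent loop with one. The names are hypothetical. The single-exit version uses the non-short-circuit `&` deliberately, since the message notes that short circuit evaluation itself introduces extra exits.

```c
#include <stddef.h>

/* SEME: the loop can exit through the i < n test or through the break. */
size_t find_seme(const int *a, size_t n, int key)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (a[i] == key)
            break;
    }
    return i;  /* n means "not found" */
}

/* SESE: the early exit is folded into a flag, so the loop condition
 * is the only exit. Both operands of & are 0 or 1. */
size_t find_sese(const int *a, size_t n, int key)
{
    size_t i = 0;
    int found = 0;
    while ((i < n) & !found) {
        if (a[i] == key)
            found = 1;
        else
            i++;
    }
    return i;  /* n means "not found" */
}
```

Both functions return the index of the first match, or `n` when the key is absent.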

Multi-version IF-THEN-ELSE conditional

2015-06-27 Thread Ajit Kumar Agarwal
All: The presence of aliases disables many optimizations like CCP (conditional constant propagation), PRE (Partial Redundancy Elimination), and scalar replacement for conditional IF-THEN-ELSE. The presence of aliasing also disables IF-conversion. I am proposing the Multi-version IF-THEN-ELSE w

RE: set_src_cost lying comment

2015-06-24 Thread Ajit Kumar Agarwal
-Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Richard Kenner Sent: Wednesday, June 24, 2015 9:28 PM To: l...@redhat.com Cc: gcc@gcc.gnu.org Subject: Re: set_src_cost lying comment > These are good examples of things the costing model simply w

RE: set_src_cost lying comment

2015-06-24 Thread Ajit Kumar Agarwal
-Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Jeff Law Sent: Wednesday, June 24, 2015 10:36 AM To: gcc@gcc.gnu.org Subject: Re: set_src_cost lying comment On 06/21/2015 11:57 PM, Alan Modra wrote: > set_src_cost says it is supposed to > /* Ret

Proposal of new Unrolling degree before/after the allocated Register Allocation is done in GCC.

2015-06-13 Thread Ajit Kumar Agarwal
All: Given a Data Dependency Graph (DDG), the unrolling degree proposed by Monica Lam et al. is calculated as follows. Unrolling degree = Length of longest live range / Number of cycles in the kernel (Initiation Interval). The unrolling degree based on the Above leads to more re
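The ratio quoted above can be sketched as a small helper. The function name and the zero-interval convention are hypothetical, and rounding the quotient up is an assumption made here (a loop cannot be unrolled a fractional number of times); the message itself states only the plain ratio.

```c
/* Unrolling degree = longest live range / initiation interval,
 * rounded up (assumption). Both arguments are in cycles.
 * Returns 0 for an invalid interval of 0 (hypothetical convention). */
unsigned unroll_degree(unsigned longest_live_range, unsigned initiation_interval)
{
    if (initiation_interval == 0)
        return 0;
    return (longest_live_range + initiation_interval - 1) / initiation_interval;
}
```

For example, a longest live range of 7 cycles with an initiation interval of 3 cycles gives a degree of 3, so that no value is overwritten by a later iteration before its last use.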

RE: Question about find modifiable mems

2015-06-03 Thread Ajit Kumar Agarwal
-Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of shmeel gutl Sent: Wednesday, June 03, 2015 12:10 PM To: GCC Development Subject: Question about find modifiable mems >>find_modifiable_mems was introduced to gcc 4.8 in september 2012. Is there >

RE: [RFC] Design and Implementation for Path Splitting for Loop with Conditional IF-THEN-ELSE

2015-06-02 Thread Ajit Kumar Agarwal
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Tuesday, June 02, 2015 9:19 PM To: Ajit Kumar Agarwal; Richard Biener; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [RFC] Design and Implementation for Path

RE: [RFC] Design and Implementation for Path Splitting for Loop with Conditional IF-THEN-ELSE

2015-06-01 Thread Ajit Kumar Agarwal
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Friday, May 29, 2015 9:24 PM To: Ajit Kumar Agarwal; Richard Biener; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: [RFC] Design and Implementation for Path

[RFC] Design and Implementation for Path Splitting for Loop with Conditional IF-THEN-ELSE

2015-05-16 Thread Ajit Kumar Agarwal
I have designed and implemented path splitting for loops with conditional IF-THEN-ELSE, with the following design. The implementation has gone through bootstrap for the MicroBlaze target along with the DejaGnu regression tests and runs of the MIBench/EEMBC benchmarks. There is no regression s

RE: dom1 prevents vectorization via partial loop peeling?

2015-04-28 Thread Ajit Kumar Agarwal
-Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Richard Biener Sent: Tuesday, April 28, 2015 4:12 PM To: Jeff Law Cc: Alan Lawrence; gcc@gcc.gnu.org Subject: Re: dom1 prevents vectorization via partial loop peeling? On Mon, Apr 27, 2015 at 7:06

More methods of reducing the register pressure

2015-04-19 Thread Ajit Kumar Agarwal
Hello All: To reduce register pressure, I am proposing the following methods of reducing the number of registers. 1. Assigning or sharing the same register for logical registers having the same value. Determining which logical registers have the same value is the real challenge. Is t

RE: Proposal for another approach for Loop transformation with conditional in Loops.

2015-03-16 Thread Ajit Kumar Agarwal
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Monday, March 16, 2015 11:45 PM To: Ajit Kumar Agarwal; Richard Biener; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Proposal for another approach for Loop

RE: Short Circuit compiler transformations!!

2015-03-15 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Sunday, March 15, 2015 9:30 PM To: Ajit Kumar Agarwal; Jeff Law; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: RE: Short Circuit compiler

Proposal for the coarse grain unrolling heuristics and renaming for the enablement of better fine grain Loop transformation.

2015-03-15 Thread Ajit Kumar Agarwal
Hello All: The examples below are the transformations for the given loop in Fig(1). Fig(2) does unroll and jam, and Fig(3) does the code motion to bring the two IFs adjacent to each other and the two while loops adjacent to each other. Fig(4) does the IF-merging and the loop fusion on the transformed Loo

RE: Function outlining and partial Inlining

2015-03-15 Thread Ajit Kumar Agarwal
-Original Message- From: Jan Hubicka [mailto:hubi...@ucw.cz] Sent: Thursday, February 12, 2015 10:34 PM To: Ajit Kumar Agarwal Cc: hubi...@ucw.cz; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Function outlining and partial

RE: Short Circuit compiler transformations!!

2015-03-15 Thread Ajit Kumar Agarwal
-Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal Sent: Sunday, March 15, 2015 3:35 PM To: Richard Biener; Jeff Law; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: RE

RE: Short Circuit compiler transformations!!

2015-03-15 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Sunday, March 15, 2015 3:05 PM To: Ajit Kumar Agarwal; Jeff Law; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Short Circuit compiler

Short Circuit compiler transformations!!

2015-03-15 Thread Ajit Kumar Agarwal
Hello All: Short circuit compiler transformation for conditional branches. For conditional branches, depending on the conditional expression, one of the paths is always executed, thus short circuiting the path. Certain values of the conditional expressions make the conditional expressions always true

RE: Proposal for another approach for Loop transformation with conditional in Loops.

2015-03-14 Thread Ajit Kumar Agarwal
-Original Message- From: Aditya K [mailto:hiradi...@msn.com] Sent: Sunday, March 15, 2015 11:37 AM To: Ajit Kumar Agarwal; Jeff Law; Richard Biener; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: RE: Proposal for another approach

Proposal for another approach for Loop transformation with conditional in Loops.

2015-03-14 Thread Ajit Kumar Agarwal
Hello All: I am proposing a new approach to loop transformation, as given below in the example, for loops with a conditional expression inside the loop. The loop body should be a reducible control flow graph. The iteration space is partitioned into different spaces for which either the cond_exp

RE: Proposal for path splitting for reduction in register pressure for Loops.

2015-03-09 Thread Ajit Kumar Agarwal
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Monday, March 09, 2015 11:01 PM To: Richard Biener Cc: Ajit Kumar Agarwal; vmaka...@redhat.com; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Proposal for path

RE: Proposal for path splitting for reduction in register pressure for Loops.

2015-03-08 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Sunday, March 08, 2015 9:05 PM To: Ajit Kumar Agarwal; vmaka...@redhat.com; Jeff Law; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Proposal for

Proposal for path splitting for reduction in register pressure for Loops.

2015-03-08 Thread Ajit Kumar Agarwal
Hello All: Path splitting replicates code to make better data flow analysis available. One of the properties of path splitting is that it removes the joining nodes for forked paths like IF-THEN-ELSE and loops. The removal of joining nodes splits the path into two independent path

Proposal for inter-procedural loops fusion.

2015-03-07 Thread Ajit Kumar Agarwal
Hello All: I am proposing inter-procedural loop fusion. Generally, fusion of loops that are adjacent to each other and are conformable candidates is done for intra-procedural loops. Whole program analysis needs to be done with array section analysis across the procedure ca

RE: Proposal on Unrolling factor based on Data reuse.

2015-03-07 Thread Ajit Kumar Agarwal
om: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Ajit Kumar Agarwal Sent: Saturday, March 07, 2015 3:31 PM To: Richard Biener; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Proposal on Unrolling factor based on Data reuse.

Proposal on Unrolling factor based on Data reuse.

2015-03-07 Thread Ajit Kumar Agarwal
Hello All: I would like to propose an unrolling factor based on data reuse between different iterations. This combines the data reuse of different iterations into a single iteration. A MaxFactor is used in calculating the unroll factor based on data reuse. The MaxFactor

RE: Tree SSA If-combine optimization pass in GCC

2015-02-17 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Tuesday, February 17, 2015 5:49 PM To: Ajit Kumar Agarwal Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Tree SSA If-combine optimization pass

RE: Tree SSA If-combine optimization pass in GCC

2015-02-17 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Tuesday, February 17, 2015 3:42 PM To: Ajit Kumar Agarwal Cc: gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Tree SSA If-combine optimization pass

Cost Calculation on Loop Invariant on Arithmetic operations on RTL

2015-02-17 Thread Ajit Kumar Agarwal
Hello All: I can see that the loop invariant pass in GCC on RTL considers the register pressure and the cost calculation with respect to the SET destination node in RTL. The loop invariant pass takes care of only address arithmetic candidates of loop invariance. In the function get_inv_cost, I can se

Tree SSA If-combine optimization pass in GCC

2015-02-17 Thread Ajit Kumar Agarwal
Hello All: I can see the IF-combining (If-merging) optimization pass on the tree-ssa form of the intermediate representation. IF-combine or merging takes care of merging the IF-THEN-ELSE if the condition expressions are found to be congruent or similar. The IF-combine happens if the two IF-THEN-ELSE are contiguo

RE: Function outlining and partial Inlining

2015-02-16 Thread Ajit Kumar Agarwal
-Original Message- From: Jan Hubicka [mailto:hubi...@ucw.cz] Sent: Thursday, February 12, 2015 10:34 PM To: Ajit Kumar Agarwal Cc: hubi...@ucw.cz; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Function outlining and partial

unaligned memory access for vectorization

2015-02-12 Thread Ajit Kumar Agarwal
Hello All: Unaligned array accesses are a blocking factor in vectorization. This is because unaligned loads and stores with SIMD instructions are costly operations. To enable vectorization for unaligned array accesses, loop peeling is done to make the multiversioning of

Function outlining and partial Inlining

2015-02-12 Thread Ajit Kumar Agarwal
Hello All: Large functions are an important part of high performance applications. They contribute to performance bottlenecks in many respects. Some large hot functions are frequently executed, but many regions inside the functions are cold. The large Function blocks the functi

Unrolling factor heuristics for Loop Unrolling

2015-02-12 Thread Ajit Kumar Agarwal
Hello All: Loop unrolling without good unrolling factor heuristics becomes a performance bottleneck. Unrolling factor heuristics based on the minimum initiation interval are quite useful for better ILP. The minimum Initiation interval based on recurrence and resource calculati

RE: Rematerialization and Live Range Splitting on Region Frequency

2015-01-28 Thread Ajit Kumar Agarwal
Thanks Vladimir for the inputs. It is quite helpful. Thanks & Regards Ajit -Original Message- From: Vladimir Makarov [mailto:vmaka...@redhat.com] Sent: Tuesday, January 27, 2015 1:10 AM To: Ajit Kumar Agarwal; l...@redhat.com; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya G

Rematerialization and Live Range Splitting on Region Frequency

2015-01-25 Thread Ajit Kumar Agarwal
Hello All: It looks like live range splitting and rematerialization are connected to each other. If the boundary of live range splitting is in a high frequency region, then the moves connecting the split live ranges are inside the high frequency region, which is the performance bottleneck f

Optimal Coalescing with respect to move instruction for Live range splitting

2015-01-17 Thread Ajit Kumar Agarwal
Register allocation with a two phase approach does optimal coalescing after spilling. Sometimes live range splitting makes the coalescing non-optimal. The split live ranges are connected by move instructions. Thus the Live range splitting and more specifically aggressive Live range splitting

RE: Allocating some Loop allocno in memory

2015-01-12 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Monday, January 12, 2015 2:33 PM To: Ajit Kumar Agarwal Cc: vmaka...@redhat.com; l...@redhat.com; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re

RE: Allocating some Loop allocno in memory

2015-01-11 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Sunday, January 11, 2015 8:05 PM To: Ajit Kumar Agarwal; vmaka...@redhat.com; l...@redhat.com; gcc@gcc.gnu.org Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re

Allocating some Loop allocno in memory

2015-01-10 Thread Ajit Kumar Agarwal
I was thinking of some opportunities for reducing spills inside the loop. If a live range (allocno) spans the loop and is live out at the exit of the loop, and it is not referenced inside the loop, assign the allocno to memory. This incre

RE: Support for architectures without hardware interlocks

2015-01-08 Thread Ajit Kumar Agarwal
-Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Joel Sherrill Sent: Thursday, January 08, 2015 8:59 PM To: Eric Botcazou; Claudiu Zissulescu Cc: gcc@gcc.gnu.org; David Kang Subject: Re: Support for architectures without hardware interlocks On

IRA : Changes in the cost of putting allocno into memory.

2015-01-08 Thread Ajit Kumar Agarwal
s are given below. From 758ee2227e9dde946ac35b772bee99279b1bf996 Mon Sep 17 00:00:00 2001 From: Ajit Kumar Agarwal Date: Tue, 6 Jan 2015 19:42:16 +0530 Subject: [PATCH] IRA : Changes in the cost of putting allocno into memory. Changes are made to not consider the back edge frequency for

Vectorization opportunities for conditional branches.

2015-01-04 Thread Ajit Kumar Agarwal
The following fig (1) shows an implementation of the SSQ kernel from the BLAS Library in ATLAS. Fig(2) shows the conversions of the IF-THEN-ELSE in Fig(1) to vectorized code. Normally in the automatic vectorization the IF-THEN-ELSE is vectorized only after the IF-CONVERSION that converts con

Register Allocation with Instruction Scheduling.

2014-12-21 Thread Ajit Kumar Agarwal
Hello All: I was going through the following article, "Register Allocation with instruction scheduling: a new approach" by Pinter et al. The phase ordering of register allocation and instruction scheduling is an important topic. The scheduling before register allocator increases the register pr

RE: Instruction scheduler with respect to prefetch instructions.

2014-12-19 Thread Ajit Kumar Agarwal
-Original Message- From: paul_kon...@dell.com [mailto:paul_kon...@dell.com] Sent: Saturday, December 13, 2014 9:46 PM To: Ajit Kumar Agarwal Cc: vmaka...@redhat.com; l...@redhat.com; richard.guent...@gmail.com; gcc@gcc.gnu.org; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida

Instruction scheduler with respect to prefetch instructions.

2014-12-13 Thread Ajit Kumar Agarwal
Hello All: Since prefetch instructions have no direct consumers in the code stream, they provide considerable freedom to the instruction scheduler. They are typically assigned lower priorities than most of the instructions in the code stream. This tends to cause all the prefetch instruction

RE: A Question About LRA/reload

2014-12-10 Thread Ajit Kumar Agarwal
-Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Jeff Law Sent: Tuesday, December 09, 2014 11:26 PM To: Vladimir Makarov; lin zuojian; gcc@gcc.gnu.org Subject: Re: A Question About LRA/reload On 12/09/14 10:10, Vladimir Makarov wrote: > generate

RE: Optimized Allocation of Argument registers

2014-11-24 Thread Ajit Kumar Agarwal
From: Ajit Kumar Agarwal Sent: Tuesday, November 18, 2014 7:01 PM To: 'Vladimir Makarov'; gcc Mailing List Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: RE: Optimized Allocation of Argument registers -Original Message- From: Vl

RE: Optimized Allocation of Argument registers

2014-11-18 Thread Ajit Kumar Agarwal
-Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: Monday, November 17, 2014 9:27 PM To: Ajit Kumar Agarwal; Vladimir Makarov; gcc Mailing List Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Optimized Allocation of Argument

RE: Optimized Allocation of Argument registers

2014-11-18 Thread Ajit Kumar Agarwal
-Original Message- From: Vladimir Makarov [mailto:vmaka...@redhat.com] Sent: Tuesday, November 18, 2014 1:57 AM To: Ajit Kumar Agarwal; gcc Mailing List Cc: Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Optimized Allocation of Argument registers

RE: Optimized Allocation of Argument registers

2014-11-17 Thread Ajit Kumar Agarwal
Hello All: I was looking at the optimized usage and allocation to argument registers. There are two aspects to it as follows. 1. We need to specify the argument registers as followed by ABI in the target specific code. Based on the function argument registers defined in the target dependent co

Expansion of memset and memcpy calls.

2014-10-21 Thread Ajit Kumar Agarwal
Hello All: Memset and memcpy calls are used extensively in many benchmarks. Inlining or expanding the memcpy and memset calls improves the performance of many benchmarks. I have implemented the expansion of strcmp to an optimized sequence of instructions in the Open64 compiler for AMD

Global Value Numbering on SSA representation based on Redundancy Class

2014-09-19 Thread Ajit Kumar Agarwal
Hello All: Please find the different Global Value numbering techniques on SSA representation and proposing in GCC Global Value Numbering on SSA representation based on Redundancy Class. Can this be proposed. SSA representation with control graph can be formulated with Global Value Numbering A

RE: Possible LRA issue?

2014-08-28 Thread Ajit Kumar Agarwal
-Original Message- From: Daniel Gutson [mailto:daniel.gut...@tallertechnologies.com] Sent: Wednesday, August 27, 2014 8:53 PM To: Ajit Kumar Agarwal Cc: gcc Mailing List Subject: Re: Possible LRA issue? On Wed, Aug 27, 2014 at 12:16 PM, Ajit Kumar Agarwal wrote: > The cause

RE: Possible LRA issue?

2014-08-27 Thread Ajit Kumar Agarwal
The xmalloc calls seen at times in the register allocator, shown below, are not caused only by the structure and by changing the S passed as a template argument. It depends on how the structure below is referenced or used. From the stack trace I can see that live range creation is based on how

RE: Register Pressure guided Unroll and Jam in GCC !!

2014-06-16 Thread Ajit Kumar Agarwal
-Original Message- From: Richard Biener [mailto:richard.guent...@gmail.com] Sent: Monday, June 16, 2014 7:55 PM To: Ajit Kumar Agarwal Cc: gcc@gcc.gnu.org; Vladimir Makarov; Michael Eager; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala Subject: Re: Register

Register Pressure guided Unroll and Jam in GCC !!

2014-06-16 Thread Ajit Kumar Agarwal
Hello All: I have worked on the Open64 compiler where the Register Pressure Guided Unroll and Jam gave a good amount of performance improvement for the C and C++ Spec Benchmark and also Fortran benchmarks. The Unroll and Jam increases the register pressure in the Unrolled Loop leading to inc

vector load Rematerialization!!

2014-06-16 Thread Ajit Kumar Agarwal
Hello All: There has been work done on load rematerialization. Instead of storing and loading variables, they are kept in registers for the live range. So far we do rematerialization of scalar loads. Is it feasible to have rematerialization for vector loads? This will be helpful

Reducing Register Pressure based on Instruction Scheduling and Register Allocator!!

2014-06-06 Thread Ajit Kumar Agarwal
Hello All: I was looking further into reducing register pressure through register allocation and instruction scheduling, and made the following observations based on the existing papers on scheduling-based approaches to reducing register pressure. Does

RE: Reducing Register Pressure through Live range Shrinking through Loops!!

2014-05-25 Thread Ajit Kumar Agarwal
On Friday, May 23, 2014 1:46 AM Vladimir Makarov wrote: On 05/21/2014 12:25 AM, Ajit Kumar Agarwal wrote: > Hello All: > > Simpson does the Live range shrinking and reduction of register > pressure by using the computation that are not load and store but the > arithmetic c

Reducing Register Pressure through Live range Shrinking through Loops!!

2014-05-20 Thread Ajit Kumar Agarwal
Hello All: Simpson does the Live range shrinking and reduction of register pressure by using the computation that are not load and store but the arithmetic computation. The computation where the operands and registers are live at the entry and exit of the basic block but not touched inside the

RE: negative latencies

2014-05-19 Thread Ajit Kumar Agarwal
Is it the case of code speculation where the negative latencies are used? Thanks & Regards Ajit -Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of shmeel gutl Sent: Monday, May 19, 2014 12:23 PM To: Andrew Pinski Cc: gcc@gcc.gnu.org; Vladimir Makar

RE: Live Range Splitting in Integrated Register Allocator

2014-05-15 Thread Ajit Kumar Agarwal
Thanks Vladimir for the clarification. Thanks & Regards Ajit -Original Message- From: Vladimir Makarov [mailto:vmaka...@redhat.com] Sent: Thursday, May 15, 2014 8:39 PM To: Ajit Kumar Agarwal; gcc@gcc.gnu.org Cc: Michael Eager; Vinod Kathail; Vidhumouli Hunsigida; Nagaraju Me
