DU-chains Vs UD-chains

2008-04-15 Thread Fran Baena
Hi all,

i'm currently studing alias analysis, and i have some questions, for
instance,  when are the du/ud-chains calculated? Before translating to
SSA form?
If i'm not wrong the def-use chain connects a definition of a variable
to all the uses it may flow to, and the use-def chain connects a use
to all the definitions that may flow to it. But, what chain is better
for data-flow analysis?
Maybe the answer is both?

Thanks in advance

Fran


Mapping back to original variables after SSA optimizations

2008-04-10 Thread Fran Baena
Hi all,

i have a doubt about unSSA: is it allways possible to map back the
versioned variables to the original variable? If it could be possible,
is there an algorithm that describe this translation back?
I have read the paper Efficiently computing static single assignment
form and the control dependence graph (cytron91) and no way to
translate back from SSA is explained, it only points out that after
SSA optimizations dead code elminitation and allocation by
colloring are recommended to be performed.

Thanks

Fran


Re: SSA Vs unSSA

2008-04-09 Thread Fran Baena
  You'd want to avoid translating from tuples back to nested trees.  Instead 
 when
  expanding from SSA form (ok, let's make that semi-SSA form that just keeps
  the UD chains but gets rid of PHI nodes (and maybe overlapping life ranges))
  the expander can do expression combining by following the UD chain instead
  of relying on complex expressions re-constructed by TER.

Sorry, i didn't understand the whole explanation. I think that you are
talking about generation directly from SSA form, without translating
back to GIMPLE, aren't you?

Fran


Re: SSA Vs unSSA

2008-04-09 Thread Fran Baena
  Depends on what SSA form you want.  A rewriting form is problematic because
 backends are allowed to modify the IL behind the back of the optimizers and
 can emit instructions that are difficult/impossible to represent in SSA form
 (parallel).

I don't exactly know what rewriting SSA form means. I think that is a
SSA form that could be updated along the several optimizations passes.
From this previous explanation I deduce that SSA form goes on once the
Middle-End layer has finished, and no translation to GIMPLE is
performed, because target-dependent (or RTL) optimizations are applied
to this SSA form.

  A non-rewriting form, however, is not hard to implement.  We've discussed
 this a few weeks ago.  At some point we will build a FUD-chain
 representation from the currently dense UD links in the DF code.

What is the difference between rewriting and non-rewriting SSA form?
From this I deduce that FUD-chain are used to translate from SSA to
GIMPLE, isn't it?

It's clear that translating from SSA form to GIMPLE is not trivial,
and not always can be done. During the optimizations it is possible
that new variables appear or that some statement are removed from
Intermediate Representation, son unSSA wouldn't be performed.

Fran


Re: RTL definition

2008-03-11 Thread Fran Baena
Hi,

  By the way, RTL is not really machine-independent.  The data
  structures are machine independent.  But the contents are not.  You
  can not, even in principle, take the RTL generated for one processor
  and compile it on another processor.

I thought that RTL represented something close to the target machine,
but not machine-dependent. I firstly thought that the output of the
middle-end was an RTL machine-independent representation, to which is
applied a few low-optimization machine-independent passes, and after
that is translated to a RTL machine-dependent to be applied other
optimization passes.

I read the rtl.def and rtl.h files, are very interesting, and i better
understand the whole process. But reading the output files by debuggin
options (-fdump-rtl-all) i have seen instructions like this:

(insn 8 6 10 1 (set (mem/c/i:SI (plus:DI (reg/f:DI 54 virtual-stack-vars)
(const_int -8 [0xfff8])) [0 a+0 S4 A64])
(const_int 1 [0x1])) -1 (nil)
(nil))

Among the multiple questions that appears i have a particular one,
what does 8 6 10 1 represents? Is it the print format defined in
rtl.def?

Thanks,

Fran


RTL definition

2008-03-10 Thread Fran Baena
Hi all,

RTL represents a low-level language, machine-independent. But I didn't
find any especification of such language represented. This is, I found
no document where the language represented were described or defined
in a grammar way. So, I 'd thank you to show me where the RTL-language
 is defined.  Is it an assembler subset?
I have read the documentation and i didn't found where it is
described, maybe I searched in wrong place.

Thanks all you


Re: RTL definition

2008-03-10 Thread Fran Baena
Hi Ramana,

   I have read the documentation and i didn't found where it is
   described, maybe I searched in wrong place.


 RTL language definition is in rtl.def and gives the different
  operators and operands. info gccint on a standard linux distribution
  should help you figure out details about RTL .

  You could look at the wiki for getting started

  http://gcc.gnu.org/wiki/GettingStarted

  There are links to a number of tutorials that talk about the RTL IR in
  the wiki that you could take a look at.  I am not clear about what you
  want to do with RTL so can't help you further.

I have read http://gcc.gnu.org/onlinedocs/gccint/RTL.html document and
I have got compiler information applying -fdump-rtl-all option. My
main doubt is how RTL is defined, this is, what it represents.
I suppose is something very close to assembler, because It has to
represent the operations between registers. But I don't know how RTL
is defined from IR. At the end, RTL represents a low-level
intermediate representation (language) that its internally represented
by a tree, but I find no place where is defined what can representate
a RTL tree node, etc.

Thanks

Fran


Re: RTL definition

2008-03-10 Thread Fran Baena
2008/3/10, Jim Wilson [EMAIL PROTECTED]:
 Fran Baena wrote:
   RTL represents a low-level language, machine-independent. But I didn't
   find any especification of such language represented. This is, I found
   no document where the language represented were described or defined
   in a grammar way.


 RTL isn't a programming language, and hence has no grammar.  It is
  merely one of the internal representations that gcc uses for optimizing
  and generating code.  We do have gcc internals documentation that you
  have already been pointed at.

i agree with you, RTL is not a programming language, is a language
that is represented internally. But it could match a grammar
definition. And this is my question, what is the process to define a
RTL.

Thank you

Fran


SSA alias representation

2008-02-27 Thread Fran Baena
  Symbols with their address taken are only renamed when they appear as
   virtual operands.  So, if you have:
 
   p_3 = (i_5  10) ? a : b
   a = 4
 
   notice that 'a' is never renamed in the LHS of the assignment.  It's
   renamed as a virtual operand:
 
   p_3 = (i_5  10) ? a : b
 
   # a_9 = VDEF a_8
   a = 4

 I have seen -fdump-tree-salias-all-vops, and i have realized that a
 is never renamed. That means that a previous scan of the tree is
 needed to know which symbols do not need to be versioned and which do
 need. I suppose that each definition and use of a has associated its
 virtual operator (as shown in next portion of code), to maintain the
 FUD chain.
 After that previous scan the renaming pass is applied.

  # a_8 = VDEF a_7
  a = 1;
  


  p_3 = (i_5  10) ? a : b

  # a_9 = VDEF a_8
  a = 4


 Thanks

 Fran


Re: SSA alias representation

2008-02-25 Thread Fran Baena
If a name tag is associated to a ssa name, ¿when does it make sense to
version a name tag? (If it does)

2008/2/21, Diego Novillo [EMAIL PROTECTED]:
 On 2/21/08 1:13 PM, Fran Baena wrote:
   2008/2/21, Diego Novillo [EMAIL PROTECTED]:
   On 2/19/08 2:27 PM, Fran Baena wrote:
 Hi everybody,

 i am studing how gcc carries out Alias Representation and some 
 questions appear.

 For instance, given this code portion:

  if ( ... )
p  = a;
  else
if ( ... )
  p = b;
else
  p = c;

  a = 5;
  b = 3;
  d = *p4;

 My questions are:

 - both p like *p need a Name Memory Tag structure? It is enough with 
 only one?
  
  
   Pointer dereferences do not receive NMTs, only the pointers.  So a name
tag will be associated with the SSA name resulting from the PHI node
created at the end of that if() tree.
  
   And this name tag is versioned too, like this:
if ( ... )
   p  = a;   (NMTv1 associated to p)
else
if ( ... )
   p = b;(NMTv2 associated to p)
else
   p = c;(NMTv3 associated to p)
  

 No.  None of these SSA names are dereferenced so no name tag is created
  for them.  The only SSA name with a name tag will be the one produced by
  the PHI.

  See the output of -fdump-tree-salias-vops-blocks.  That will show you
  the results of the first alias analysis run.


   To organize temporarily the passes. Firstly alias analysis is
   computed, it is followed by ssa form, and after that the memory ssa
   web, aren't they?


 No.  First SSA then alias analysis then the memory SSA


   Where could i find all the passes applied and their order?


 passes.c:init_optimization_passes.


   What is register SSA form? Is it the standard SSA form explained in
   Efficiently computing static single assignment form and the control
   dependence graph (cytron91)?


 Yes.  It's the traditional rewriting form.  The memory SSA web is the
  factored use-def chains by Wolfe.


  See the tutorials in http://gcc.gnu.org/wiki/GettingStarted, all this is
  described there.



  Diego.



Re: SSA alias representation

2008-02-21 Thread Fran Baena
2008/2/21, Diego Novillo [EMAIL PROTECTED]:
 On 2/19/08 2:27 PM, Fran Baena wrote:
   Hi everybody,
  
   i am studing how gcc carries out Alias Representation and some questions 
 appear.
  
   For instance, given this code portion:
  
if ( ... )
  p  = a;
else
  if ( ... )
p = b;
  else
p = c;
  
a = 5;
b = 3;
d = *p4;
  
   My questions are:
  
   - both p like *p need a Name Memory Tag structure? It is enough with only 
 one?


 Pointer dereferences do not receive NMTs, only the pointers.  So a name
  tag will be associated with the SSA name resulting from the PHI node
  created at the end of that if() tree.

And this name tag is versioned too, like this:
 if ( ... )
p  = a;   (NMTv1 associated to p)
 else
 if ( ... )
p = b;(NMTv2 associated to p)
 else
p = c;(NMTv3 associated to p)

 # p = PHI  (NMTv4 associated to p)
 

This is, a name tag is constantly being versioned, isnt it?

   - About versioning : every name can be versioned, cannot it?


 Sorry, I don't understand the question.  SSA names have a version
  number, yes.  Is that what you're asking?

Yes, bad formed. Sorry, it is about my english

  - when are virtual operands inserted? Do they have to be versioned
   at same time that real operands? For instance, # a6 = VDEF a5; a7 =
   3;, this implies that alias computation is processed at same time that
   SSA Renaming?


 No, the memory SSA web is built after the first stage of alias analysis.
   First, the program is put in register SSA form, a few passes later the
  memory SSA UD web is built on top.


To organize temporarily the passes. Firstly alias analysis is
computed, it is followed by ssa form, and after that the memory ssa
web, aren't they?
Where could i find all the passes applied and their order?
What is register SSA form? Is it the standard SSA form explained in
Efficiently computing static single assignment form and the control
dependence graph (cytron91)?

  - Could the others names (TMT, NMT,  SFT, etc.) be versioned?
   (every name that is able to be part of a virtual operand could be
   versioned)


 Yes.  When a pointer does not have a specific set of symbols in its
  points-to set, its memory tag is used.


My bigger doubt is when are each structure computed: alias set, names
tags, UD chains.


Thanks very much!!

Fran


SSA alias representation

2008-02-19 Thread Fran Baena
Hi everybody,

i am studing how gcc carries out Alias Representation and some questions appear.

For instance, given this code portion:

 if ( ... )
   p  = a;
 else
   if ( ... )
 p = b;
   else
 p = c;

 a = 5;
 b = 3;
 d = *p4;

My questions are:

- both p like *p need a Name Memory Tag structure? It is enough with only one?

- About versioning : every name can be versioned, cannot it?
   - when are virtual operands inserted? Do they have to be versioned
at same time that real operands? For instance, # a6 = VDEF a5; a7 =
3;, this implies that alias computation is processed at same time that
SSA Renaming?

   - Could the others names (TMT, NMT,  SFT, etc.) be versioned?
(every name that is able to be part of a virtual operand could be
versioned)

Thank you in advance,

Fran


Re: alias and pointers analysis

2007-11-14 Thread Fran Baena
2007/11/13, Diego Novillo [EMAIL PROTECTED]:
 On Nov 13, 2007 1:38 PM, Fran Baena [EMAIL PROTECTED] wrote:

 1. Convert the function into GIMPLE form. Implemented in gimplify.c
  and c-simplify.c.
 2. Find variable references in the code. Implemented in tree-dfa.c.
 3. Build a control-flow graph (CFG). Implemented in tree-cfg.c.
  This implementation uses the same basic_block structure used by the
  RTL optimizers. This allows us to share most of the existing CFG code.
 4. Rewrite the tree in SSA form. Implemented in tree-ssa.c.
  []
 
  But i still doubt about the process, in some ways:
 
  *  Is the step #2 related to the alias analysis?

 Yes, though this documentation is fairly old.  Finding referenced
 variables is needed to determine what symbols are of interest.

  That is hold with the
  def-use chains, and SMT / NMT structures. And, make any sense doing
  these step before the SSA variable versioning? If positive answer,
  why?

 Sorry, I don't understand this question.

I mean that, if that structures (def-use chains, SMT, NMT, etc) are
constructed before the SSA step where the program variables are
versioned, after the variable versioning we will get that these
structures are not congruent with the new variables generated in the
mentioned step.

The issue i dont understand is why alias analysis is done before the
SSA pass, and, if all the structures created within these analysis are
used in next optimization passes (dead code elimination, constant
propagation.)

Thanks for all

Fran


Re: alias and pointers analysis

2007-11-13 Thread Fran Baena
Hi again,

i have been studing gcc docs to undestand SSA and steps to take to get
SSA form. In one GCC online document:
http://gcc.gnu.org/projects/tree-ssa/#ssa, the steps to translate to
SSA form are listed. Here, i copy and paste the mentioned text:

[]
Conversion to SSA form is a three step process driven from tree-optimize.c:

   1. Convert the function into GIMPLE form. Implemented in gimplify.c
and c-simplify.c.
   2. Find variable references in the code. Implemented in tree-dfa.c.
   3. Build a control-flow graph (CFG). Implemented in tree-cfg.c.
This implementation uses the same basic_block structure used by the
RTL optimizers. This allows us to share most of the existing CFG code.
   4. Rewrite the tree in SSA form. Implemented in tree-ssa.c.
[]

But i still doubt about the process, in some ways:

*  Is the step #2 related to the alias analysis? That is hold with the
def-use chains, and SMT / NMT structures. And, make any sense doing
these step before the SSA variable versioning? If positive answer,
why?

Thanks in advance

Fran.



2007/10/26, Diego Novillo [EMAIL PROTECTED]:
 On 10/26/07, J.C. Pizarro [EMAIL PROTECTED] wrote:

  What is the matter if the 'b' var. is unused and
  optimally removed by the SSA algorithm?

 In this case, it will not be removed.  If any of the p_i pointers is
 ever dereferenced in this code, that will be considered a use of
 variable 'b'.


  int a;
  int b;
 
  a = 2;
  p_4 = phi(a)

 Is this 'phi' as in a PHI function or a function in your code?  If the
 former, then it's wrong, you can never have such a phi function in
 this code snippet.

  // b doesn't used here
  if (...)
p_1 = a;
  else
p_2 = b;
  endif
  p_3 = phi (p_1, p_2)
 
  points-to (p_1) = { a }
  points-to (p_2) = { b }
  points-to (p_3) = { a b }
 
  In this case, should exist hidden p_5 = phi(b) although 'b' is not used
  but yes used his reference to phantom cell 'b'. It's weird for me.

 I recommend that you read about the SSA form.  PHI nodes are special
 constructs that exist only where multiple control flow paths reach a
 common join node.  The getting started section of the wiki has links
 to books and articles about it.  Morgan's book on compiler
 optimization is fairly good.

  I've not idea WHERE put hidden p_5 = phi(b)!

 No such thing exists.


  Too it's possible to ocurr *p_2 = c where 'b' will be hidden used through
  the pointer p_2. It's too weird for me.

 Yes, that is possible, an that is precisely what alias analysis tells
 the compiler.  We know from the analysis that reading/writing to
 '*p_2' is exactly the same as reading/writing to 'b'.



alias and pointers analysis

2007-10-25 Thread Fran Baena
Hi,

once i have been reading lots of documents about SSA computing. I have
a few questions, conceptual ones, that stop me understanding the whole
goal of the method. Here i go:

* Why alias analysis? I mean, the pointers could be versioned like the
others vars. (I'm not saying it's not usefull, i mean i don't
understand the alias analisys goals)

* What is the significance of Use-Def chains? How are they construct?
Are related to VDEF/DEF and VUSE/USE?

Note: i'm beginning i'd need something basic, conceptual docs.

thanks and c u all


indirect memory op in SSA

2007-10-18 Thread Fran Baena
Hi!

i am trying to implement the SSA form, inspirated in GCC way. I have
found problems in processing indirect memory operations like arrays,
structures and pointers.
My questions are: are nodes STRUCT_FIELD_TAG, NAME_MEMORY_TAG,
SYMBOL_MEMORY_TAG used for that purpose? and, where i can find
documentation about their use within the translation process to SSA?

Thanks.