DU-chains Vs UD-chains
Hi all, i'm currently studing alias analysis, and i have some questions, for instance, when are the du/ud-chains calculated? Before translating to SSA form? If i'm not wrong the def-use chain connects a definition of a variable to all the uses it may flow to, and the use-def chain connects a use to all the definitions that may flow to it. But, what chain is better for data-flow analysis? Maybe the answer is both? Thanks in advance Fran
Mapping back to original variables after SSA optimizations
Hi all, i have a doubt about unSSA: is it allways possible to map back the versioned variables to the original variable? If it could be possible, is there an algorithm that describe this translation back? I have read the paper Efficiently computing static single assignment form and the control dependence graph (cytron91) and no way to translate back from SSA is explained, it only points out that after SSA optimizations dead code elminitation and allocation by colloring are recommended to be performed. Thanks Fran
Re: SSA Vs unSSA
You'd want to avoid translating from tuples back to nested trees. Instead when expanding from SSA form (ok, let's make that semi-SSA form that just keeps the UD chains but gets rid of PHI nodes (and maybe overlapping life ranges)) the expander can do expression combining by following the UD chain instead of relying on complex expressions re-constructed by TER. Sorry, i didn't understand the whole explanation. I think that you are talking about generation directly from SSA form, without translating back to GIMPLE, aren't you? Fran
Re: SSA Vs unSSA
Depends on what SSA form you want. A rewriting form is problematic because backends are allowed to modify the IL behind the back of the optimizers and can emit instructions that are difficult/impossible to represent in SSA form (parallel). I don't exactly know what rewriting SSA form means. I think that is a SSA form that could be updated along the several optimizations passes. From this previous explanation I deduce that SSA form goes on once the Middle-End layer has finished, and no translation to GIMPLE is performed, because target-dependent (or RTL) optimizations are applied to this SSA form. A non-rewriting form, however, is not hard to implement. We've discussed this a few weeks ago. At some point we will build a FUD-chain representation from the currently dense UD links in the DF code. What is the difference between rewriting and non-rewriting SSA form? From this I deduce that FUD-chain are used to translate from SSA to GIMPLE, isn't it? It's clear that translating from SSA form to GIMPLE is not trivial, and not always can be done. During the optimizations it is possible that new variables appear or that some statement are removed from Intermediate Representation, son unSSA wouldn't be performed. Fran
Re: RTL definition
Hi, By the way, RTL is not really machine-independent. The data structures are machine independent. But the contents are not. You can not, even in principle, take the RTL generated for one processor and compile it on another processor. I thought that RTL represented something close to the target machine, but not machine-dependent. I firstly thought that the output of the middle-end was an RTL machine-independent representation, to which is applied a few low-optimization machine-independent passes, and after that is translated to a RTL machine-dependent to be applied other optimization passes. I read the rtl.def and rtl.h files, are very interesting, and i better understand the whole process. But reading the output files by debuggin options (-fdump-rtl-all) i have seen instructions like this: (insn 8 6 10 1 (set (mem/c/i:SI (plus:DI (reg/f:DI 54 virtual-stack-vars) (const_int -8 [0xfff8])) [0 a+0 S4 A64]) (const_int 1 [0x1])) -1 (nil) (nil)) Among the multiple questions that appears i have a particular one, what does 8 6 10 1 represents? Is it the print format defined in rtl.def? Thanks, Fran
RTL definition
Hi all, RTL represents a low-level language, machine-independent. But I didn't find any especification of such language represented. This is, I found no document where the language represented were described or defined in a grammar way. So, I 'd thank you to show me where the RTL-language is defined. Is it an assembler subset? I have read the documentation and i didn't found where it is described, maybe I searched in wrong place. Thanks all you
Re: RTL definition
Hi Ramana, I have read the documentation and i didn't found where it is described, maybe I searched in wrong place. RTL language definition is in rtl.def and gives the different operators and operands. info gccint on a standard linux distribution should help you figure out details about RTL . You could look at the wiki for getting started http://gcc.gnu.org/wiki/GettingStarted There are links to a number of tutorials that talk about the RTL IR in the wiki that you could take a look at. I am not clear about what you want to do with RTL so can't help you further. I have read http://gcc.gnu.org/onlinedocs/gccint/RTL.html document and I have got compiler information applying -fdump-rtl-all option. My main doubt is how RTL is defined, this is, what it represents. I suppose is something very close to assembler, because It has to represent the operations between registers. But I don't know how RTL is defined from IR. At the end, RTL represents a low-level intermediate representation (language) that its internally represented by a tree, but I find no place where is defined what can representate a RTL tree node, etc. Thanks Fran
Re: RTL definition
2008/3/10, Jim Wilson [EMAIL PROTECTED]: Fran Baena wrote: RTL represents a low-level language, machine-independent. But I didn't find any especification of such language represented. This is, I found no document where the language represented were described or defined in a grammar way. RTL isn't a programming language, and hence has no grammar. It is merely one of the internal representations that gcc uses for optimizing and generating code. We do have gcc internals documentation that you have already been pointed at. i agree with you, RTL is not a programming language, is a language that is represented internally. But it could match a grammar definition. And this is my question, what is the process to define a RTL. Thank you Fran
SSA alias representation
Symbols with their address taken are only renamed when they appear as virtual operands. So, if you have: p_3 = (i_5 10) ? a : b a = 4 notice that 'a' is never renamed in the LHS of the assignment. It's renamed as a virtual operand: p_3 = (i_5 10) ? a : b # a_9 = VDEF a_8 a = 4 I have seen -fdump-tree-salias-all-vops, and i have realized that a is never renamed. That means that a previous scan of the tree is needed to know which symbols do not need to be versioned and which do need. I suppose that each definition and use of a has associated its virtual operator (as shown in next portion of code), to maintain the FUD chain. After that previous scan the renaming pass is applied. # a_8 = VDEF a_7 a = 1; p_3 = (i_5 10) ? a : b # a_9 = VDEF a_8 a = 4 Thanks Fran
Re: SSA alias representation
If a name tag is associated to a ssa name, ¿when does it make sense to version a name tag? (If it does) 2008/2/21, Diego Novillo [EMAIL PROTECTED]: On 2/21/08 1:13 PM, Fran Baena wrote: 2008/2/21, Diego Novillo [EMAIL PROTECTED]: On 2/19/08 2:27 PM, Fran Baena wrote: Hi everybody, i am studing how gcc carries out Alias Representation and some questions appear. For instance, given this code portion: if ( ... ) p = a; else if ( ... ) p = b; else p = c; a = 5; b = 3; d = *p4; My questions are: - both p like *p need a Name Memory Tag structure? It is enough with only one? Pointer dereferences do not receive NMTs, only the pointers. So a name tag will be associated with the SSA name resulting from the PHI node created at the end of that if() tree. And this name tag is versioned too, like this: if ( ... ) p = a; (NMTv1 associated to p) else if ( ... ) p = b;(NMTv2 associated to p) else p = c;(NMTv3 associated to p) No. None of these SSA names are dereferenced so no name tag is created for them. The only SSA name with a name tag will be the one produced by the PHI. See the output of -fdump-tree-salias-vops-blocks. That will show you the results of the first alias analysis run. To organize temporarily the passes. Firstly alias analysis is computed, it is followed by ssa form, and after that the memory ssa web, aren't they? No. First SSA then alias analysis then the memory SSA Where could i find all the passes applied and their order? passes.c:init_optimization_passes. What is register SSA form? Is it the standard SSA form explained in Efficiently computing static single assignment form and the control dependence graph (cytron91)? Yes. It's the traditional rewriting form. The memory SSA web is the factored use-def chains by Wolfe. See the tutorials in http://gcc.gnu.org/wiki/GettingStarted, all this is described there. Diego.
Re: SSA alias representation
2008/2/21, Diego Novillo [EMAIL PROTECTED]: On 2/19/08 2:27 PM, Fran Baena wrote: Hi everybody, i am studing how gcc carries out Alias Representation and some questions appear. For instance, given this code portion: if ( ... ) p = a; else if ( ... ) p = b; else p = c; a = 5; b = 3; d = *p4; My questions are: - both p like *p need a Name Memory Tag structure? It is enough with only one? Pointer dereferences do not receive NMTs, only the pointers. So a name tag will be associated with the SSA name resulting from the PHI node created at the end of that if() tree. And this name tag is versioned too, like this: if ( ... ) p = a; (NMTv1 associated to p) else if ( ... ) p = b;(NMTv2 associated to p) else p = c;(NMTv3 associated to p) # p = PHI (NMTv4 associated to p) This is, a name tag is constantly being versioned, isnt it? - About versioning : every name can be versioned, cannot it? Sorry, I don't understand the question. SSA names have a version number, yes. Is that what you're asking? Yes, bad formed. Sorry, it is about my english - when are virtual operands inserted? Do they have to be versioned at same time that real operands? For instance, # a6 = VDEF a5; a7 = 3;, this implies that alias computation is processed at same time that SSA Renaming? No, the memory SSA web is built after the first stage of alias analysis. First, the program is put in register SSA form, a few passes later the memory SSA UD web is built on top. To organize temporarily the passes. Firstly alias analysis is computed, it is followed by ssa form, and after that the memory ssa web, aren't they? Where could i find all the passes applied and their order? What is register SSA form? Is it the standard SSA form explained in Efficiently computing static single assignment form and the control dependence graph (cytron91)? - Could the others names (TMT, NMT, SFT, etc.) be versioned? (every name that is able to be part of a virtual operand could be versioned) Yes. When a pointer does not have a specific set of symbols in its points-to set, its memory tag is used. My bigger doubt is when are each structure computed: alias set, names tags, UD chains. Thanks very much!! Fran
SSA alias representation
Hi everybody, i am studing how gcc carries out Alias Representation and some questions appear. For instance, given this code portion: if ( ... ) p = a; else if ( ... ) p = b; else p = c; a = 5; b = 3; d = *p4; My questions are: - both p like *p need a Name Memory Tag structure? It is enough with only one? - About versioning : every name can be versioned, cannot it? - when are virtual operands inserted? Do they have to be versioned at same time that real operands? For instance, # a6 = VDEF a5; a7 = 3;, this implies that alias computation is processed at same time that SSA Renaming? - Could the others names (TMT, NMT, SFT, etc.) be versioned? (every name that is able to be part of a virtual operand could be versioned) Thank you in advance, Fran
Re: alias and pointers analysis
2007/11/13, Diego Novillo [EMAIL PROTECTED]: On Nov 13, 2007 1:38 PM, Fran Baena [EMAIL PROTECTED] wrote: 1. Convert the function into GIMPLE form. Implemented in gimplify.c and c-simplify.c. 2. Find variable references in the code. Implemented in tree-dfa.c. 3. Build a control-flow graph (CFG). Implemented in tree-cfg.c. This implementation uses the same basic_block structure used by the RTL optimizers. This allows us to share most of the existing CFG code. 4. Rewrite the tree in SSA form. Implemented in tree-ssa.c. [] But i still doubt about the process, in some ways: * Is the step #2 related to the alias analysis? Yes, though this documentation is fairly old. Finding referenced variables is needed to determine what symbols are of interest. That is hold with the def-use chains, and SMT / NMT structures. And, make any sense doing these step before the SSA variable versioning? If positive answer, why? Sorry, I don't understand this question. I mean that, if that structures (def-use chains, SMT, NMT, etc) are constructed before the SSA step where the program variables are versioned, after the variable versioning we will get that these structures are not congruent with the new variables generated in the mentioned step. The issue i dont understand is why alias analysis is done before the SSA pass, and, if all the structures created within these analysis are used in next optimization passes (dead code elimination, constant propagation.) Thanks for all Fran
Re: alias and pointers analysis
Hi again, i have been studing gcc docs to undestand SSA and steps to take to get SSA form. In one GCC online document: http://gcc.gnu.org/projects/tree-ssa/#ssa, the steps to translate to SSA form are listed. Here, i copy and paste the mentioned text: [] Conversion to SSA form is a three step process driven from tree-optimize.c: 1. Convert the function into GIMPLE form. Implemented in gimplify.c and c-simplify.c. 2. Find variable references in the code. Implemented in tree-dfa.c. 3. Build a control-flow graph (CFG). Implemented in tree-cfg.c. This implementation uses the same basic_block structure used by the RTL optimizers. This allows us to share most of the existing CFG code. 4. Rewrite the tree in SSA form. Implemented in tree-ssa.c. [] But i still doubt about the process, in some ways: * Is the step #2 related to the alias analysis? That is hold with the def-use chains, and SMT / NMT structures. And, make any sense doing these step before the SSA variable versioning? If positive answer, why? Thanks in advance Fran. 2007/10/26, Diego Novillo [EMAIL PROTECTED]: On 10/26/07, J.C. Pizarro [EMAIL PROTECTED] wrote: What is the matter if the 'b' var. is unused and optimally removed by the SSA algorithm? In this case, it will not be removed. If any of the p_i pointers is ever dereferenced in this code, that will be considered a use of variable 'b'. int a; int b; a = 2; p_4 = phi(a) Is this 'phi' as in a PHI function or a function in your code? If the former, then it's wrong, you can never have such a phi function in this code snippet. // b doesn't used here if (...) p_1 = a; else p_2 = b; endif p_3 = phi (p_1, p_2) points-to (p_1) = { a } points-to (p_2) = { b } points-to (p_3) = { a b } In this case, should exist hidden p_5 = phi(b) although 'b' is not used but yes used his reference to phantom cell 'b'. It's weird for me. I recommend that you read about the SSA form. PHI nodes are special constructs that exist only where multiple control flow paths reach a common join node. The getting started section of the wiki has links to books and articles about it. Morgan's book on compiler optimization is fairly good. I've not idea WHERE put hidden p_5 = phi(b)! No such thing exists. Too it's possible to ocurr *p_2 = c where 'b' will be hidden used through the pointer p_2. It's too weird for me. Yes, that is possible, an that is precisely what alias analysis tells the compiler. We know from the analysis that reading/writing to '*p_2' is exactly the same as reading/writing to 'b'.
alias and pointers analysis
Hi, once i have been reading lots of documents about SSA computing. I have a few questions, conceptual ones, that stop me understanding the whole goal of the method. Here i go: * Why alias analysis? I mean, the pointers could be versioned like the others vars. (I'm not saying it's not usefull, i mean i don't understand the alias analisys goals) * What is the significance of Use-Def chains? How are they construct? Are related to VDEF/DEF and VUSE/USE? Note: i'm beginning i'd need something basic, conceptual docs. thanks and c u all
indirect memory op in SSA
Hi! i am trying to implement the SSA form, inspirated in GCC way. I have found problems in processing indirect memory operations like arrays, structures and pointers. My questions are: are nodes STRUCT_FIELD_TAG, NAME_MEMORY_TAG, SYMBOL_MEMORY_TAG used for that purpose? and, where i can find documentation about their use within the translation process to SSA? Thanks.