Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On Thu, Aug 31, 2006 at 04:00:25PM -0700, Mark Mitchell wrote: I think this is probably moot, since I believe that Kenny feels DWARF is not suitable for reasons other than the abbreviation table issue, but this is a clever technique. ... for GIMPLE; when I discussed with him, I got the impression he was still open to using it for other things, like types. I may have been mistaken. -- Daniel Jacobowitz CodeSourcery
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Daniel Jacobowitz wrote: On Thu, Aug 31, 2006 at 04:00:25PM -0700, Mark Mitchell wrote: I think this is probably moot, since I believe that Kenny feels DWARF is not suitable for reasons other than the abbreviation table issue, but this is a clever technique. ... for GIMPLE; when I discussed with him, I got the impression he was still open to using it for other things, like types. I may have been mistaken. Given that Mark, and for that matter no one else, did not really push back, I am pretty much committed not to use dwarf. Kenny
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Kenneth Zadeck wrote on 08/28/06 09:57: I have not done this because I do not rule the earth. That was not what I was assigned to do, and I agreed that DWARF3 sounded like a reasonable way to go. Now that I understand the details of DWARF3, I have changed my mind about the correct direction. Now is the time to make that change before there is a lot of infrastructure built that assumes the DWARF3 encoding. FWIW. I agree with your conclusion. I do not know DWARF3 very well, but it always felt a bit heavy handed to me. At first, I was under the impression that we were going to use DWARF3 to describe the types and symbol table, and embed our bytecode alongside. It sounds to me like we want to either invent our own bytecode or adapt an existing one to our needs. Inventing our own is probably the best long-term alternative. A few comments on the code: Index: lto-tree-flags.def [ ... ] +/* This file is designed to be inlined into both the writing and the + reading of the lto information. What it does depends on the glue + that put in front and at the end of this and how ADD_BASE_FLAG is + defined. + + For those interested in extra credit, this could also be used as + the checking code to see if each flag is used correctly. 10 extra + credit points will be given for the intrepid programmer who does + this and is able to removes the comment that this was generated + from in tree.h. */ Nothing in this comment helps in understanding what the file is supposed to do. What is this used for? + switch (code) +{ +case ABS_EXPR: + break; + +case ADDR_EXPR: + break; + + ADD_BASE_FLAG (static_flag) This code is unreachable. There are many instances of this here. +/* Return the master clone node of N if it is available and if + CHECK_OVERWRITE is true, not overwritable. */ + What does it mean to be overwritable? You never seem to call this function with check_overwrite == false. +struct char_ptr_base +{ + char *ptr; +}; + Hmm? Why? Planning to have something different in the future? +/* An incore byte stream to buffer the various parts of the +function. The entire structure should be zeroed when created. The +record consists of a set of blocks. The first sizeof (ptr) bytes are +used as a chain, and the rest store the bytes to be written. */ + +struct output_stream +{ + /* The pointer to the first block in the stream. */ + struct char_ptr_base * first_block; + /* The pointer to the last and current block in the stream. */ + struct char_ptr_base * current_block; + /* The pointer to where the next char should be written. */ + char * current_pointer; + /* The number of characters left in the current block. */ + unsigned int left_in_block; + /* The block size of the last block allocated. */ + unsigned int block_size; + /* The total number of characters written. */ + unsigned int total_size; +}; + Maybe there's code out there for paging/caching data streams? Though we would probably want to tweak it quite a bit, it may save some effort for the base functionality. Even if we end up not using DWARF3 as output, the streaming aspect of this code will remain, I guess? +#if STUPID_TYPE_SYSTEM STUPID_TYPE_SYSTEM? No need to be insulting. It's unpleasant. +/* Find, or generate, if there is not one already, the abbrev table + entries for the various stmt_table forms. The forms in the case + statement here must EXACTLY MATCH the forms in the case statement + in output_stmt_operands. */ + +static int +get_stmt_form (unsigned int tag) +{ + int index = tag - DW_TAG_gimple_stmt_table_first; + int entry = stmt_form_abbrev_table[index]; + + if (entry) +return entry; + + switch (tag) +{ +case DW_TAG_gimple_asm_expr: Hmm. Interesting. We can use this as a verification tool, as well. From here we could synthesize an up-to-date GIMPLE grammar and keep it self-consistent. Nice. As we find problems with the writer/reader, we should update either the grammar or GIMPLE (or both). @@ -168,12 +167,6 @@ struct eh_region GTY(()) int filter; } GTY ((tag (ERT_ALLOWED_EXCEPTIONS))) allowed; -/* The type given by a call to throw foo();, or discovered - for a throw. */ -struct eh_region_u_throw { - tree type; -} GTY ((tag (ERT_THROW))) throw; - /* Retain the cleanup expression even after expansion so that we can match up fixup regions. */ struct eh_region_u_cleanup { @@ -706,13 +699,6 @@ remove_unreachable_regions (rtx insns) bool kill_it = true; switch (r-type) { - case ERT_THROW: - /* Don't remove ERT_THROW regions if their outer region - is reachable. */ - if (r-outer reachable[r-outer-region_number]) - kill_it = false; - break; - case ERT_MUST_NOT_THROW:
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On Fri, Sep 01, 2006 at 09:45:34AM -0400, Kenneth Zadeck wrote: Given that Mark, and for that matter no one else, did not really push back, I am pretty much committed not to use dwarf. Then... what are you going to do about things like types? Invent a new serialization for those too? I think that confusing dwarf-for-types and dwarf-for-gimple would be a mistake. -- Daniel Jacobowitz CodeSourcery
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Daniel Jacobowitz wrote: On Fri, Sep 01, 2006 at 09:45:34AM -0400, Kenneth Zadeck wrote: Given that Mark, and for that matter no one else, did not really push back, I am pretty much committed not to use dwarf. Then... what are you going to do about things like types? Invent a new serialization for those too? I think that confusing dwarf-for-types and dwarf-for-gimple would be a mistake. My part is only the function bodies, we are still going to use dwarf for the types and the global variables. There are people at codesoucery who, even as we speak, are busily enhancing that part to get all of the pieces output, not just the parts used for the debugger. Kenny
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On Fri, Sep 01, 2006 at 10:19:07AM -0400, Kenneth Zadeck wrote: Daniel Jacobowitz wrote: On Fri, Sep 01, 2006 at 09:45:34AM -0400, Kenneth Zadeck wrote: Given that Mark, and for that matter no one else, did not really push back, I am pretty much committed not to use dwarf. Then... what are you going to do about things like types? Invent a new serialization for those too? I think that confusing dwarf-for-types and dwarf-for-gimple would be a mistake. My part is only the function bodies, we are still going to use dwarf for the types and the global variables. There are people at codesoucery who, even as we speak, are busily enhancing that part to get all of the pieces output, not just the parts used for the debugger. OK, violent agreement. Thanks for clarifying. -- Daniel Jacobowitz CodeSourcery
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Diego Novillo wrote: Kenneth Zadeck wrote on 08/28/06 09:57: I have not done this because I do not rule the earth. That was not what I was assigned to do, and I agreed that DWARF3 sounded like a reasonable way to go. Now that I understand the details of DWARF3, I have changed my mind about the correct direction. Now is the time to make that change before there is a lot of infrastructure built that assumes the DWARF3 encoding. FWIW. I agree with your conclusion. I do not know DWARF3 very well, but it always felt a bit heavy handed to me. At first, I was under the impression that we were going to use DWARF3 to describe the types and symbol table, and embed our bytecode alongside. It sounds to me like we want to either invent our own bytecode or adapt an existing one to our needs. Inventing our own is probably the best long-term alternative. A few comments on the code: Index: lto-tree-flags.def [ ... ] +/* This file is designed to be inlined into both the writing and the + reading of the lto information. What it does depends on the glue + that put in front and at the end of this and how ADD_BASE_FLAG is + defined. + + For those interested in extra credit, this could also be used as + the checking code to see if each flag is used correctly. 10 extra + credit points will be given for the intrepid programmer who does + this and is able to removes the comment that this was generated + from in tree.h. */ Nothing in this comment helps in understanding what the file is supposed to do. What is this used for? When I get the other side of the code finished and enhance the comments, you will see. + switch (code) +{ +case ABS_EXPR: + break; + +case ADDR_EXPR: + break; + + ADD_BASE_FLAG (static_flag) This code is unreachable. There are many instances of this here. I would have found that when I wrote the code that reads this back. +/* Return the master clone node of N if it is available and if + CHECK_OVERWRITE is true, not overwritable. */ + What does it mean to be overwritable? You never seem to call this function with check_overwrite == false. I have no idea what overwritable is used for. This is something that honza and I went around many times on when I was writing my ipa code. It appears that you can put attributes on some functions that allow the function to replaced late, like at link or even runtime. For instance, when I was doing my pure-const analysis, I have to mark these as not being pure or const even if they looked like they could be since they could be replaced by versions that were not pure or const. However, here, I need the master clone node, but I do not care if it is overwritable, I have to put it out anyway. I do call it that way now. Thanks for noticing. +struct char_ptr_base +{ + char *ptr; +}; + Hmm? Why? Planning to have something different in the future? +/* An incore byte stream to buffer the various parts of the +function. The entire structure should be zeroed when created. The +record consists of a set of blocks. The first sizeof (ptr) bytes are +used as a chain, and the rest store the bytes to be written. */ + +struct output_stream +{ + /* The pointer to the first block in the stream. */ + struct char_ptr_base * first_block; + /* The pointer to the last and current block in the stream. */ + struct char_ptr_base * current_block; + /* The pointer to where the next char should be written. */ + char * current_pointer; + /* The number of characters left in the current block. */ + unsigned int left_in_block; + /* The block size of the last block allocated. */ + unsigned int block_size; + /* The total number of characters written. */ + unsigned int total_size; +}; + Maybe there's code out there for paging/caching data streams? Though we would probably want to tweak it quite a bit, it may save some effort for the base functionality. Even if we end up not using DWARF3 as output, the streaming aspect of this code will remain, I guess? Yes, the streaming remains, I need to do it this way so that I can then pass some of the buffers into a compressor. I had thought of using obstacks, they were close but they were a little wierd. +#if STUPID_TYPE_SYSTEM STUPID_TYPE_SYSTEM? No need to be insulting. It's unpleasant. +/* Find, or generate, if there is not one already, the abbrev table + entries for the various stmt_table forms. The forms in the case + statement here must EXACTLY MATCH the forms in the case statement + in output_stmt_operands. */ + +static int +get_stmt_form (unsigned int tag) +{ + int index = tag - DW_TAG_gimple_stmt_table_first; + int entry = stmt_form_abbrev_table[index]; + + if (entry) +return entry; + + switch (tag) +{ +case DW_TAG_gimple_asm_expr: Hmm.
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On Fri, 2006-09-01 at 09:51 -0400, Diego Novillo wrote: +#if STUPID_TYPE_SYSTEM STUPID_TYPE_SYSTEM? No need to be insulting. It's unpleasant. Well it right now it is stupid, this is just a work around anyways until people fix the type mismatches really. Is it more insulting than having stupid.c which existed in GCC before 3.0.0? Also this is very temporary and will go away when the problem in the rest of the compiler is fixed. -- Pinski
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Andrew Pinski wrote: On Fri, 2006-09-01 at 09:51 -0400, Diego Novillo wrote: +#if STUPID_TYPE_SYSTEM STUPID_TYPE_SYSTEM? No need to be insulting. It's unpleasant. Well it right now it is stupid, this is just a work around anyways until people fix the type mismatches really. Is it more insulting than having stupid.c which existed in GCC before 3.0.0? Also this is very temporary and will go away when the problem in the rest of the compiler is fixed. -- Pinski I will tone it down, the point has been made. kenny
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
hi, On Wed, 2006-08-30 at 16:44 -0500, Mark Mitchell wrote: [snip] (Implied, but not stated, in your mail is the fact that the abbreviation table cannot be indexed directly. If it could be, then you wouldn't have to read the entire abbreviation table for each function; you would just read the referenced abbreviations. Because the abbreviation table records are of variable length, it is indeed true that you cannot make random accesses to the table. So, this paragraph is just fleshing out your argument.) I have spent a considerable amount of time looking at the abbrev tables output by gcc are not totally random: their entries are sorted by their abbrev code. That is, the abbrev code of entry i+1 is higher than that of entry i. I don't know how useful this would be for you but for me, it made a _huge_ difference because it means I did not have to parse the whole abbrev table when looking for an entry with a specific abbrev code and I did not have to create a cache of code-offset mappings for the full table. You can use a very small (I use 16 entries) cache of abbrev codes for each abbrev table which tells you where the entry which contains that abbrev code is located in the table. The key here is that you do not need to cache all codes because you can use a code smaller than the one you need to start parsing the table at that point. The attached file implements such a cache. The dwarf2_abbrev_cu_read_decl function searches for an entry in the abbrev table which matches a given abbrev code using the code cache. I cannot remember if this version actually works because I remember hacking it quite a bit and stopping in the middle of the rework but this code should illustrate what I was suggesting. I hope this helps, regards, Mathieu /* -*- Mode: C; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 8 -*- */ /* This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License version 2 as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. Copyright (C) 2004,2005 Mathieu Lacage Author: Mathieu Lacage [EMAIL PROTECTED] */ #ifndef DWARF2_ABBREV_H #define DWARF2_ABBREV_H #include stdint.h #include stdbool.h struct reader; struct dwarf2_abbrev { uint32_t start; /* start of .debug_abbrev from start of file */ uint32_t end; /* end of .debug_abbrev from start of file */ }; #define CACHE_SIZE (16) struct dwarf2_abbrev_cu { uint32_t start; /* offset from start of file */ uint32_t end; /* offset from start of file */ struct cache { uint8_t keys[CACHE_SIZE]; uint32_t values[CACHE_SIZE]; uint8_t last_used[CACHE_SIZE]; uint8_t time; } cache; }; struct dwarf2_abbrev_decl { uint32_t offset; /* offset from start of file to this decl. */ uint64_t abbr_code; /* abbrev code for this entry */ uint64_t tag;/* abbrev tag for this entry */ uint8_t children;/* children for this entry */ }; struct dwarf2_abbrev_attr { uint64_t form; uint64_t name; }; void dwarf2_abbrev_initialize (struct dwarf2_abbrev *abbrev, uint32_t abbrev_start /* offset from start of file */, uint32_t abbrev_end /* offset from start of file */); void dwarf2_abbrev_initialize_cu (struct dwarf2_abbrev *abbrev, struct dwarf2_abbrev_cu *abbrev_cu, uint32_t offset /* offset from start of file */); void dwarf2_abbrev_cu_read_decl (struct dwarf2_abbrev_cu *abbrev_cu, struct dwarf2_abbrev_decl *decl, uint64_t code, struct reader *reader); void dwarf2_abbrev_decl_read_attr_first (struct dwarf2_abbrev_decl *decl, struct dwarf2_abbrev_attr *attr, uint32_t *new_offset, struct reader *reader); void dwarf2_abbrev_read_attr (uint32_t cur_offset, struct dwarf2_abbrev_attr *attr, uint32_t *new_offset, struct reader *reader); bool dwarf2_abbrev_attr_is_last (struct dwarf2_abbrev_attr *attr); #endif /* DWARF2_ABBREV_H */ /* -*- Mode: C; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 8 -*- */ /* This program is free
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Mark Mitchell wrote: Kenneth Zadeck wrote: Even if we decide that we are going to process all of the functions in one file at one time, we still have to have access to the functions that are going to be inlined into the function being compiled. Getting at those functions that are going to be inlined is where the double the i/o arguement comes from. I understand -- but it's natural to expect that those functions will be clumped together. In a gigantic program, I expect there are going to be clumps of tightly connected object files, with relatively few connections between the clumps. So, you're likely to get good cache behavior for any per-object-file specific data that you need to access. I just do not know. I assume that you are right, that there is some clumping. But I am just no sure. I have never depended on the kindness of strangers or the virtues of virtual memory. I fear the size of the virtual memory when we go to compile really large programs. I don't think we're going to blow out a 64-bit address space any time soon. Disks are big, but they are nowhere near *that* big, so it's going to be pretty hard for anyone to hand us that many .o files. And, there's no point manually reading/writing stuff (as opposed to mapping it into memory), unless we actually run out of address space. I am not so concerned with running out of virtual address space than I am about being able to break this up so that it can be done in parallel, on a farm of machines. Otherwise, lto can never be part of anyone's compile/test loop. The notion of having 20 or 50 compile servers each mapping all of the files of a large system in seems like a bad design point. In fact, if you're going to design your own encoding formats, I would consider a format with self-relative pointers (or, offsets from some fixed base) that you could just map into memory. It wouldn't be as compact as using compression, so the total number of bytes written when generating the object files would be bigger. But, it will be very quick to load it into memory. If you look at my code, that is what I have, at least with respect to the function itself. There is one big difference here between lto and what a debugger needs. I could see designing a debugger (and I have no idea if any such debuggers exist) that simply maps in the debug information and just uses the incore representation as is. Dwarf seems to have been designed to support this. (but then again I could be dreaming). With an intermediate form of a compiler, the usage is quite different. All that we are going to do is load a function, convert it gimple and then throw away (the notion of throw away may not have meaning for memory mapped files) the on disk version. The prime goal is that the format be designed so that an enitity (generally a function) can be expanded into gimple in one pass. Then the question of the benefit of using a compressor comes down to processor speed vs io speed. With the parts that you are in charge of, namely the types and the globals, this is not true. I can very easily see an implementation of the types and decls that is like I describe for the debugger, you map it into mem and just use if from there. But since the intermediate code for a function body is going to get very modified, and our existing gimple is chocked full of pointers, it is harder to envision ever winning at that the mapping game. I guess my overriding concern is that we're focusing heavily on the data format here (DWARF? Something else? Memory-mappable? What compression scheme?) and we may not have enough data. I guess we just have to pick something and run with it. I think we should try to keep that code as as separate as possible so that we can recover easily if whatever we pick turns out to be (another) bad choice. :-) One of the comments that was made by a person on the dwarf committee is that the abbrev tables really can be used for compression. If you have information that is really common to a bunch of records, you can build an abbrev entry with the common info in it. Yes. I was a little bit surprised that you don't seem to have seen much commonality. If you recorded most of the tree flags, and treated them as DWARF attributes, I'd expect you would see relatively many expressions of a fixed form. Like, there must be a lot of PLUS_EXPRs with TREE_USED set on them. But, I gather that you're trying to avoid recording some of these flags, hoping either that (a) they won't be needed, or (b) you can recreate them when reading the file. I think both (a) and (b) hold in many cases, so I think it's reasonable to assume we're writing out very few attributes. I had thought about that with the flags, and decided it was too much work/per gain. I decided to just compress the flags, so that only the flags that are used for a given tree node were written as a uleb128 bit number and then let zlib do the rest. The flags
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On 8/31/06, Kenneth Zadeck [EMAIL PROTECTED] wrote: Mark Mitchell wrote: Kenneth Zadeck wrote: Even if we decide that we are going to process all of the functions in one file at one time, we still have to have access to the functions that are going to be inlined into the function being compiled. Getting at those functions that are going to be inlined is where the double the i/o arguement comes from. I understand -- but it's natural to expect that those functions will be clumped together. In a gigantic program, I expect there are going to be clumps of tightly connected object files, with relatively few connections between the clumps. So, you're likely to get good cache behavior for any per-object-file specific data that you need to access. I just do not know. I assume that you are right, that there is some clumping. But I am just no sure. I just want to point out that this argument (okay cache locality) was used as a reason the massive amount of open/seek/close behavior by Subversion's FSFS filesystem is a-ok. It turns out not to be in practice, for a few reasons: First, the syscall overhead ends up being pretty bad. Probably not as bad in LTO as in Subversion (for obvious reasons), but you shouldn't discount it. Second, and more importantly, on a network file system, the open/seek/read calls to get at the cached data are going to be RPC calls. --Dan
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Daniel Berlin wrote: On 8/31/06, Kenneth Zadeck [EMAIL PROTECTED] wrote: Mark Mitchell wrote: Kenneth Zadeck wrote: Even if we decide that we are going to process all of the functions in one file at one time, we still have to have access to the functions that are going to be inlined into the function being compiled. Getting at those functions that are going to be inlined is where the double the i/o arguement comes from. I understand -- but it's natural to expect that those functions will be clumped together. In a gigantic program, I expect there are going to be clumps of tightly connected object files, with relatively few connections between the clumps. So, you're likely to get good cache behavior for any per-object-file specific data that you need to access. I just do not know. I assume that you are right, that there is some clumping. But I am just no sure. I just want to point out that this argument (okay cache locality) was used as a reason the massive amount of open/seek/close behavior by Subversion's FSFS filesystem is a-ok. Here, we won't be making syscalls -- but we will be taking page faults if we go out of cache. I don't know what the consequences of page faults for files backed over NFS are, but if your object files are coming over NFS, your linker isn't going to go too fast anyhow. I would expect most users carefully use local disk for object files. Since we're descending into increasingly general arguments, let me say it more generally: we're optimizing before we've fully profiled. Kenny had a very interesting datapoint: that abbreviation tables tended to be about the size of a function. That's great information. All I'm suggesting is that this data doesn't necessarily imply that enabling random access to functions (as we all agree is necessary) implies a 2x I/O cost. It's only a 2x I/O cost if every time you need to go look at a function the abbreviation table has been paged out. I think we've gotten extremely academic here. As far as I can tell, Kenny has decided not to use DWARF, and nobody's trying to argue that he should, so we should probably just move on. My purpose in raising a few counterpoints is just to make sure that we're not overlooking anything obvious in favor of DWARF; since Kenny's already got that code written, it would be nice if we had a good reason not to start over. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On Thu, Aug 31, 2006 at 09:24:20AM -0700, Mark Mitchell wrote: Here, we won't be making syscalls Yes, you almost certainly will. If you've got a thousand object files, you probably don't want to keep them all opened all the time; there are these pesky things like open file descriptor limits, for instance, and you'll tend to degrade performance. -- Daniel Jacobowitz CodeSourcery
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Kenneth Zadeck wrote: I am not so concerned with running out of virtual address space than I am about being able to break this up so that it can be done in parallel, on a farm of machines. Otherwise, lto can never be part of anyone's compile/test loop. I think we just expanded the scope of work by an order of magnitude. :-) If you had just said that you wanted to support multi-threaded LTO, that would have been a big deal. But multiple machines with multiple address spaces trying to do LTO on one program is a really big deal. (Of course, there is a cheap hack way of doing what you want: run LTO on clumps of object files in parallel, and then just link the pre-optimized files together in the ordinary way.) I'd really like to see us inline a function before we even begin to have this conversation. :-) I have no idea how stable all the types and decls are over a compilation. I write my info pretty early, and I assume the types and decls are written pretty late in the compilation (otherwise you would not have address expressions for the debugger). If there has been any processing on these between when I write my stuff and when the types and decls get written, things may not match up. I don't think that this is an issue. The important information about types and declaration is stable. Things like is this declaration used? change over the course of the compilation, but that's not useful for DWARF anyhow -- and, in general, we don't write out information about types/declarations that are entirely unused. The key aspects (sizes/layouts/etc.) are fixed. -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Mark Mitchell wrote: Kenneth Zadeck wrote: I am not so concerned with running out of virtual address space than I am about being able to break this up so that it can be done in parallel, on a farm of machines. Otherwise, lto can never be part of anyone's compile/test loop. I think we just expanded the scope of work by an order of magnitude. :-) If you had just said that you wanted to support multi-threaded LTO, that would have been a big deal. But multiple machines with multiple address spaces trying to do LTO on one program is a really big deal. (Of course, there is a cheap hack way of doing what you want: run LTO on clumps of object files in parallel, and then just link the pre-optimized files together in the ordinary way.) I'd really like to see us inline a function before we even begin to have this conversation. :-) I have no intention in expanding the scope of the work. I just am not going to do anything that precludes doing this. As long as the function bodies are in a form that can be easily accessed without reading and processing the entire file, I am fine. I have no idea how stable all the types and decls are over a compilation. I write my info pretty early, and I assume the types and decls are written pretty late in the compilation (otherwise you would not have address expressions for the debugger). If there has been any processing on these between when I write my stuff and when the types and decls get written, things may not match up. I don't think that this is an issue. The important information about types and declaration is stable. Things like is this declaration used? change over the course of the compilation, but that's not useful for DWARF anyhow -- and, in general, we don't write out information about types/declarations that are entirely unused. The key aspects (sizes/layouts/etc.) are fixed. You are either right, or we will fix it. Those are the only two options. My only point was that there is a lot of compiler in between where someone could have done something silly.
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Daniel Jacobowitz wrote: On Thu, Aug 31, 2006 at 09:24:20AM -0700, Mark Mitchell wrote: Here, we won't be making syscalls Yes, you almost certainly will. OK, good point. In any case, my concern is that we're worrying a lot about on-disk encoding, but that there are lots of other hard problems we have to solve before we can make this work -- even independent of resource constraints. So, I suggest we choose an encoding that seems approximately reasonable, but not worry too much about exactly how optimal it is. I think that boils down to: Kenny, I think you should do what you think best, but without working too terribly hard. :-) -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
mathieu lacage wrote: I have spent a considerable amount of time looking at the abbrev tables output by gcc are not totally random: their entries are sorted by their abbrev code. That is, the abbrev code of entry i+1 is higher than that of entry i. That's an interesting observation. Essentially, you've shown that by storing lg(n) information, you can cut the cost to find an entry in an abbreviation table of size n to a constant. Since, for LTO, we certainly can depend on the .o file being produced by GCC, we could depend on this behavior, even though it's not mandated by the DWARF standard. I think this is probably moot, since I believe that Kenny feels DWARF is not suitable for reasons other than the abbreviation table issue, but this is a clever technique. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
KZ == Kenneth Zadeck [EMAIL PROTECTED] writes: KZ 2) To have a discussion about the use of DWARF3. I am now against the KZ use of DWARF3 for encoding the GIMPLE. FWIW your arguments convinced me. I think what matters most is that the resulting format be relatively well documented (say, better than GIMPLE), efficient, suitable, etc. Reusing DWARF3 seems cute but inessential. [...] KZ +case TRUTH_NOT_EXPR: KZ +case VIEW_CONVERT_EXPR: KZ +#if STUPID_TYPE_SYSTEM KZ + output_type_ref (ob, TREE_TYPE (expr)); KZ +#endif I think VIEW_CONVERT_EXPR needs to be treated like NOP_EXPR and CONVERT_EXPR in the STUPID_TYPE_SYSTEM case. VIEW_CONVERT_EXPR is a type-casting expression. KZ +/* When we get a strongly typed gimple, which may not be for another KZ + 15 years, this flag should be set to 0 so we do not waste so much KZ + space printing out largely redundant type codes. */ KZ +#define STUPID_TYPE_SYSTEM 1 You could write a more compact form by emitting explicit fake nop nodes where needed, and then strip those when reading. I think this would avoid tweaking the optimizer bugs, as the reloaded trees would be identical. This does bring up another point about the format though: there's got to be some versioning capability in there. Tom
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
[...] KZ +case TRUTH_NOT_EXPR: KZ +case VIEW_CONVERT_EXPR: KZ +#if STUPID_TYPE_SYSTEM KZ + output_type_ref (ob, TREE_TYPE (expr)); KZ +#endif I think VIEW_CONVERT_EXPR needs to be treated like NOP_EXPR and CONVERT_EXPR in the STUPID_TYPE_SYSTEM case. VIEW_CONVERT_EXPR is a type-casting expression. No it is not, it is more complex than a just a simple type-casting expression, it is a cast which does nothing to the bits at all. It acts more like a reference than a cast as it is also can be used on the left hand side. I am working on a patch to have GCC use VCE more. You could write a more compact form by emitting explicit fake nop nodes where needed, and then strip those when reading. I think this would avoid tweaking the optimizer bugs, as the reloaded trees would be identical. Actually it would be better just to fix the problem in the first place as mentioned before. -- Pinski
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Kenneth Zadeck wrote: This posting is a progress report of my task of encoding and decoding the GIMPLE stream into LTO. Included in this posting is a patch that encodes functions and dumps the result to files. [I'm sorry for not replying to this sooner. I've been on a plane or in a meeting virtually all of my waking hours since you wrote this...] Exciting! 2) To have a discussion about the use of DWARF3. I am now against the use of DWARF3 for encoding the GIMPLE. As the person who suggested this to you, I'm sorry it doesn't seem to meet your needs. I'll make a few more specific comments below, but, to echo what I said at the time: I think it was a good idea to try it, but if it's not the right tool for the job, then so be it. My opinion has always been that, for the bodies of functions, DWARF is just an encoding: if it's a decent encoding, then, it's nice that it's a standard; if it's a bad encoding, then, by all means, let's not use it! I do think DWARF is a good choice for the non-executable information, for the same reasons I did initially: (a) for debugging, we need to generate most of that information anyhow, so we're piggy-backing on existing code -- and not making object files bigger by encoding the same information twice in the case of -O2 -g, which is the default way that many GNU applications are built. (b) it's a standard, and we already have tools for reading DWARF, so it saves the trouble of coming up with a new encoding, (c) because bugs in the DWARF emission may not result in problems at LTO, we'll be validating our LTO information every time we use GDB, and, similarly, improving the GDB experience every time we fix an LTO bug in this area. I understand that some of these benefits do not apply to the executable code, and that, even to the extent they may apply, the tradeoffs are different. The comments I've made below about specific issues should therefore be considered as academic responses, not an attempt to argue the decision you have made. 3) To get some one to tell me what option we are going to add to the compiler to tell it to write this information. I think a reasonable spelling is probably -flto. It should not be a -m option since it is not machine-specific. I don't think it should be a -O option either, since writing out LTO information isn't really supposed to affect optimization per se. 2) The code is, by design, fragile. It takes nothing for granted. Every case statement has gcc_unreachable as it's default case. That's the same way I've approached the DWARF reading code, and for the same reason. I think that's exactly the right decision. 1) ABBREV TABLES ARE BAD FOR LTO. However, this mechanism is only self descriptive if you do not extend the set of tags. That is not an option for LTO. Definitely true. When we talked on the phone, we talked about creating a tag corresponding to each GIMPLE tree code. However, you could also create a numeric attribute giving the GIMPLE tree code. If you did that, you might find that the abbreviation table became extremely small -- because almost all interior nodes would be DW_TAG_GIMPLE_EXPR nodes. The downside, of course, is that the storage required to store the nodes would be greater, as it would now contain the expression code (e.g., PLUS_EXPR), rather than having a DW_TAG_GIMPLE_PLUS_EXPR. I strongly believe that for LTO to work, we are going to have to implement some mechanism where the function bodies are loaded into the compiler on demand (possibly kept in cache, but maybe not). Agreed. This will be more cumbersome if we have to keep reloading each object file's abbrev table just to tear apart a single function in that .o file. While the abbrev sections average slightly less than %2 of the of the size of the GIMPLE encoding for an entire file, each abbrev table averages about the same size as a single function. Interesting datapoint. (Implied, but not stated, in your mail is the fact that the abbreviation table cannot be indexed directly. If it could be, then you wouldn't have to read the entire abbreviation table for each function; you would just read the referenced abbreviations. Because the abbreviation table records are of variable length, it is indeed true that you cannot make random accesses to the table. So, this paragraph is just fleshing out your argument.) I think the conclusion that you reach (that the size of the tables is a problem) depends on how you expect the compiler to process functions at link-time. My expectation was that you would form a global control-flow graph for the entire program (by reading CFG data encoded in each .o file), eliminate unreachable functions, and then inline/optimize functions one-at-a-time. If you sort the function-reading so that you prefer to read functions from the same object file in order, then I would expect that you would considerably reduce the impact of reading
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Mark Mitchell wrote: Kenneth Zadeck wrote: This will be more cumbersome if we have to keep reloading each object file's abbrev table just to tear apart a single function in that .o file. While the abbrev sections average slightly less than %2 of the of the size of the GIMPLE encoding for an entire file, each abbrev table averages about the same size as a single function. Interesting datapoint. (Implied, but not stated, in your mail is the fact that the abbreviation table cannot be indexed directly. If it could be, then you wouldn't have to read the entire abbreviation table for each function; you would just read the referenced abbreviations. Because the abbreviation table records are of variable length, it is indeed true that you cannot make random accesses to the table. So, this paragraph is just fleshing out your argument.) I think the conclusion that you reach (that the size of the tables is a problem) depends on how you expect the compiler to process functions at link-time. My expectation was that you would form a global control-flow graph for the entire program (by reading CFG data encoded in each .o file), eliminate unreachable functions, and then inline/optimize functions one-at-a-time. If you sort the function-reading so that you prefer to read functions from the same object file in order, then I would expect that you would considerably reduce the impact of reading the abbreviation tables. I'm making the assumption that it f calls N functions, then they probably come from N object files. I have no data to back up that assumption. (There is nothing that says that you can only have one abbreviation table for all functions. You can equally well have one abbreviation table per function. In that mode, you trade space (more abbreviation tables, and the same abbreviation appearing in multiple tables) against the fact that you now only need to read the abbreviation tables you need. I'm not claiming this is a good idea.) I don't find this particular argument (that the abbreviation tables will double file I/O) very convincing. I don't think it's likely that the problem we're going to have with LTO is running out of *virtual* memory, especially as 64-bit hardware becomes nearly universal. The problem is going to be running out of physical memory, and thereby paging like crazy, running out of D-cache. So, I'd assume you'd just read the tables as-needed, and never both discarding them. As long as there is reasonable locality of reference to abbreviation tables (i.e., you can arrange to hit object files in groups), then the cost here doesn't seem like it would be very big. Even if we decide that we are going to process all of the functions in one file at one time, we still have to have access to the functions that are going to be inlined into the function being compiled. Getting at those functions that are going to be inlined is where the double the i/o arguement comes from. I have never depended on the kindness of strangers or the virtues of virtual memory. I fear the size of the virtual memory when we go to compile really large programs. 2) I PROMISED TO USE THE DWARF3 STACK MACHINE AND I DID NOT. I never imagined you doing this; as per above, I always expected that you would use DWARF tags for the expression nodes. I agree that the stack-machine is ill-suited. 3) THERE IS NO COMPRESSION IN DWARF3. In 1 file per mode, zlib -9 compression is almost 6:1. In 1 function per mode, zlib -9 compression averages about 3:1. In my opinion, if you considered DWARF + zlib to be satisfactory, then I think that would be fine. For LTO, we're allowed to do whatever we want. I feel the same about your confession that you invented a new record form; if DWARF + extensions is a suitable format, that's fine. In other words, in principle, using a somewhat non-standard variant of DWARF for LTO doesn't seem evil to me -- if that met our needs. One of the comments that was made by a person on the dwarf committee is that the abbrev tables really can be used for compression. If you have information that is really common to a bunch of records, you can build an abbrev entry with the common info in it. I have not seen a place where any use can be made of this for encoding gimple except for a couple of places where I have encoded a true or false. I therefor really do not see that they really add anything except for the code to read and write them. 2) LOCAL DECLARATIONS Mark was going to do all of the types and all of the declarations. His plan was to use the existing DWARF3 and enhance it where it was necessary eventually replacing the GCC type trees with direct references to the DWARF3 symbol table. The types and global variables are likely OK, or at least Mark should be able to add any missing info. I had a discussion on chat today with drow and he indicated that you were busily adding all of the missing stuff here. I
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Kenneth Zadeck wrote: Even if we decide that we are going to process all of the functions in one file at one time, we still have to have access to the functions that are going to be inlined into the function being compiled. Getting at those functions that are going to be inlined is where the double the i/o arguement comes from. I understand -- but it's natural to expect that those functions will be clumped together. In a gigantic program, I expect there are going to be clumps of tightly connected object files, with relatively few connections between the clumps. So, you're likely to get good cache behavior for any per-object-file specific data that you need to access. I have never depended on the kindness of strangers or the virtues of virtual memory. I fear the size of the virtual memory when we go to compile really large programs. I don't think we're going to blow out a 64-bit address space any time soon. Disks are big, but they are nowhere near *that* big, so it's going to be pretty hard for anyone to hand us that many .o files. And, there's no point manually reading/writing stuff (as opposed to mapping it into memory), unless we actually run out of address space. In fact, if you're going to design your own encoding formats, I would consider a format with self-relative pointers (or, offsets from some fixed base) that you could just map into memory. It wouldn't be as compact as using compression, so the total number of bytes written when generating the object files would be bigger. But, it will be very quick to load it into memory. I guess my overriding concern is that we're focusing heavily on the data format here (DWARF? Something else? Memory-mappable? What compression scheme?) and we may not have enough data. I guess we just have to pick something and run with it. I think we should try to keep that code as as separate as possible so that we can recover easily if whatever we pick turns out to be (another) bad choice. :-) One of the comments that was made by a person on the dwarf committee is that the abbrev tables really can be used for compression. If you have information that is really common to a bunch of records, you can build an abbrev entry with the common info in it. Yes. I was a little bit surprised that you don't seem to have seen much commonality. If you recorded most of the tree flags, and treated them as DWARF attributes, I'd expect you would see relatively many expressions of a fixed form. Like, there must be a lot of PLUS_EXPRs with TREE_USED set on them. But, I gather that you're trying to avoid recording some of these flags, hoping either that (a) they won't be needed, or (b) you can recreate them when reading the file. I think both (a) and (b) hold in many cases, so I think it's reasonable to assume we're writing out very few attributes. I had a discussion on chat today with drow and he indicated that you were busily adding all of the missing stuff here. All is an overstatement. :-) Sandra is busily adding missing stuff and I'll be working on the new APIs you need. I told him that I thought this was fine as long as there is not a temporal drift in information encoded for the types and decls between the time I write my stuff and when the types and decls are written. I'm not sure what this means. Thanks, -- Mark Mitchell CodeSourcery [EMAIL PROTECTED] (650) 331-3385 x713
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On 8/30/06, Mark Mitchell [EMAIL PROTECTED] wrote: ... I guess my overriding concern is that we're focusing heavily on the data format here (DWARF? Something else? Memory-mappable? What compression scheme?) and we may not have enough data. I guess we just have to pick something and run with it. I think we should try to keep that code as as separate as possible so that we can recover easily if whatever we pick turns out to be (another) bad choice. :-) At the risk of stating the obvious and also repeating myself, please allow me give my thought on this issue. I think we should take even a step further than try to keep the code as separate. We should try to come up with a set of procedural and datastructural interface for the input/output of the program structure, and try to *completely* separate the optimization/datastructure cleanup work from the encoding/decoding. Beside the basic requirement of being able to pass through enough information to produce valid program, I think there is a critical requirement to implement inter-module/inter-procedural optimization efficiently - that the I/O interface allows efficient handling of iterating through module/procedure-level information without reading each and every module/procedure bodies (as Ken mentioned). There are certain amount of information per object/procedure that are accessed during different optimization and with sufficiently different pattern - e.g. type tree is naturally an object-level information that we may want to go through for each and every object file, without read all function bodies, and other function level information such as caller/callee information would be useful without the function body. We'll need to identify such information (in other words, the requirement of the interprocedural optimization/analysis) so that the new interface would provide ways to walk through them without loading the entire function bodies - even with large address space, if the data is scattered everywhere, it becomes extremely inefficient on modern machines to go through them, so it's actually more important to identify what logical information that we want to access during various interprocedural optimizations and the I/O interface needs to handle them efficiently. This requirement should dictate how we encode/layout the data into the disk, before anything else. Also how the information is presented to the actual inter-module optimization/analysis. Also, part of defining the interface would involve restricting the existing structures (e.g. GIMPLE) in possibly more limited form than what's currently allowed. By virtue of having an interface that separates the encoding/decoding from the rest of the compilation, we can throw away and recompute certain information (e.g. often certain control flow graph can be recovered, hence does not need to be encoded) but those details can be worked out as the implementation of the IO interface gets more in shape. -- #pragma ident Seongbae Park, compiler, http://seongbae.blogspot.com;
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On Aug 28, 2006, at 6:57 AM, Kenneth Zadeck wrote: This posting is a progress report of my task of encoding and decoding the GIMPLE stream into LTO. Included in this posting is a patch that encodes functions and dumps the result to files. Interesting email. One question: how big are the files you're generating? For example, how much space does it take to store the body of a function with 100 add operations in it (before gzip'ing it)? What is the size proportional to? # instructions, # basic blocks, what else? -Chris
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Chris Lattner wrote: On Aug 28, 2006, at 6:57 AM, Kenneth Zadeck wrote: This posting is a progress report of my task of encoding and decoding the GIMPLE stream into LTO. Included in this posting is a patch that encodes functions and dumps the result to files. Interesting email. One question: how big are the files you're generating? For example, how much space does it take to store the body of a function with 100 add operations in it (before gzip'ing it)? What is the size proportional to? # instructions, # basic blocks, what else? -Chris The size will be large because the trees are still large. Encoding everything in a bootstrap for the three languages was on the order of 60mb. As you know, size is a well known problem in gcc. The plan is to track the changes in gimple and as that gets smaller, this will get smaller. I would like to get the types and perhaps the flags out of the internal nodes of the trees. when that gets done, these operations should be small. I actually do not think that it is productive to spend that much time measuring this until we first assure ourselves that we are getting all of the information output correctly. That 60mb number will change a lot (both up and down) as we get closer to getting everything tied together. Then we can spend more time looking at the pieces and seeing where we need to do better.
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
On Aug 28, 2006, at 10:36 AM, Kenneth Zadeck wrote: I actually do not think that it is productive to spend that much time measuring this until we first assure ourselves that we are getting all of the information output correctly. That 60mb number will change a lot (both up and down) as we get closer to getting everything tied together. Then we can spend more time looking at the pieces and seeing where we need to do better. Okay, thanks! -Chris
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Kenneth Zadeck wrote: This posting is a progress report of my task of encoding and decoding the GIMPLE stream into LTO. Included in this posting is a patch that encodes functions and dumps the result to files. I have only a limited understanding of GIMPLE and LTO, but here are my comments regarding DWARF. DWARF is a debugging format used to describe a program's source and how it is represented in an executable file. GIMPLE is an intermediate representation, not program source. It's always interesting to see one tool used in a different context, but I'm not clear that in this case you are using the right tool for the right task in the way that it was designed. Speaking on behalf of the DWARF Standards Workgroup, we welcome suggestions for extensions or improvements in the DWARF standard. I wasn't able to identify any specific extension in your email (although I didn't look at the patch in detail). If there are specific areas where DWARF can be improved, please let me know. 1) ABBREV TABLES ARE BAD FOR LTO. The abbrev table is a mechanism for making a DWARF3 file self descriptive. It contains a set of templates that are used to describe the records and attributes that are written to describe the object module. This self description mechanism is much weaker (an therefor much more compact) than an XML file's schemata because it depends on a common understanding of the meaning of the tags and attributes that describe the pieces of the data. The tags them selves can be thought of as characters, and an abbrev entry is a magic cookie that tells you how to break the characters into words. DWARF is not self-descriptive, except in the very limited sense that several sections contain version numbers. The DWARF abbreviation table is not intended to be self-descriptive or to be descriptive of the DWARF data. It is a simple compression scheme which compacts the Tag/Attribute/Value structure of a DWARF DIE into a much more condensed format. Certainly the DWARF Abbrev table makes no pretense of being anything similar to an XML schema. I wouldn't use an XML schema for debugging information (it would be huge and slow), and I wouldn't expect to use DWARF abbreviation tables to describe anything at all. However, this mechanism is only self descriptive if you do not extend the set of tags. That is not an option for LTO. While the DWARF3 designers seem to have added every conceivable tag to describe object code, it is not be surprising that this comes up short when describing GIMPLE. While some of GIMPLE could be shoe-horned into the existing tags, all of it will not fit, so anyone who wants to understand GIMPLE, will need to use the GCC headers. The DWARF standard provides for a consistent means to extend the debugging format. If you extend DWARF beyond the standard, as has often been done, you will need to provide descriptions (in some form) of what these extensions mean. You can provide this as written material, or you can have the implementer read the code. In either case, the Abbrev tables only provide compression for the Tag/Attr/Val structure of a DIE. They do no provide any semantic description. Using the abbrev tables also add a lot of complexity. It should be understood that while the abbrev table for two separate files encodes the same information, the way that the abbrev tables are generated almost guarantees that the abbrev table will be encoded differently for each file. The abbrev table is a sequential list of record descriptors. The way that the list is generated is first needed, first output, and you only output the records you need. Each of these records is numbered sequentially. So the numbering of the abbrev records will generally be different for the two created object files since they reference different GIMPLE nodes. There is no need for Abbrev tables to be generated differently for each object file. Several compilers generate a standard Abbrev table and then generate DWARF using these predefined abbreviations. The DWARF standard also allows the Abbrev table to be stored in a shared object and referenced by individual object files. I'm not aware of any actual implementation of this, but the structure is there. I strongly believe that for LTO to work, we are going to have to implement some mechanism where the function bodies are loaded into the compiler on demand (possibly kept in cache, but maybe not). This will be more cumbersome if we have to keep reloading each object file's abbrev table just to tear apart a single function in that .o file. While the abbrev sections average slightly less than %2 of the of the size of the GIMPLE encoding for an entire file, each abbrev table averages about the same size as a single function. Thus, we will either have to keep all the abbrev entries for an entire compilation, implement a second caching mechanism, or do about twice the file reading to load a single function. I'm not exactly sure what you are trying to
Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!
Michael Eager wrote: Kenneth Zadeck wrote: This posting is a progress report of my task of encoding and decoding the GIMPLE stream into LTO. Included in this posting is a patch that encodes functions and dumps the result to files. I have only a limited understanding of GIMPLE and LTO, but here are my comments regarding DWARF. DWARF is a debugging format used to describe a program's source and how it is represented in an executable file. GIMPLE is an intermediate representation, not program source. It's always interesting to see one tool used in a different context, but I'm not clear that in this case you are using the right tool for the right task in the way that it was designed. Speaking on behalf of the DWARF Standards Workgroup, we welcome suggestions for extensions or improvements in the DWARF standard. I wasn't able to identify any specific extension in your email (although I didn't look at the patch in detail). If there are specific areas where DWARF can be improved, please let me know. Michael, My original posting was primarily intended for the gcc developer community which is familiar with gimple and not so familiar with dwarf. I actually think that dwarf is probably pretty good for what it was designed for and it is also documented much better than most of the internals of gcc. The task I was given was to encode gimple into dwarf, something that looked like it might be a reasonable idea, but is, in fact, a bad idea. My posting was to tell the gcc community, been there, done that, now lets move on. I will respond to many of your points but you should not take this as criticism of dwarf, these are mostly just places where the interface between gimple and dwarf was particularly troublesome and unless you want to get in the transporting intermediate code business, which I strongly suggest you avoid, you can just ignore most of these problems. 1) ABBREV TABLES ARE BAD FOR LTO. The abbrev table is a mechanism for making a DWARF3 file self descriptive. It contains a set of templates that are used to describe the records and attributes that are written to describe the object module. This self description mechanism is much weaker (an therefor much more compact) than an XML file's schemata because it depends on a common understanding of the meaning of the tags and attributes that describe the pieces of the data. The tags them selves can be thought of as characters, and an abbrev entry is a magic cookie that tells you how to break the characters into words. DWARF is not self-descriptive, except in the very limited sense that several sections contain version numbers. The DWARF abbreviation table is not intended to be self-descriptive or to be descriptive of the DWARF data. It is a simple compression scheme which compacts the Tag/Attribute/Value structure of a DWARF DIE into a much more condensed format. Certainly the DWARF Abbrev table makes no pretense of being anything similar to an XML schema. I wouldn't use an XML schema for debugging information (it would be huge and slow), and I wouldn't expect to use DWARF abbreviation tables to describe anything at all. While I understand your point, from a language theoretic point of view, they really can be thought of a schemata. They are a description of the records types that have been written by the compiler (or some other tool) for the tags and attributes in this module. This may not be the way that they evolved in your community, but for someone coming from the outside with no knowledge of your history, this is what they are. However, this mechanism is only self descriptive if you do not extend the set of tags. That is not an option for LTO. While the DWARF3 designers seem to have added every conceivable tag to describe object code, it is not be surprising that this comes up short when describing GIMPLE. While some of GIMPLE could be shoe-horned into the existing tags, all of it will not fit, so anyone who wants to understand GIMPLE, will need to use the GCC headers. The DWARF standard provides for a consistent means to extend the debugging format. If you extend DWARF beyond the standard, as has often been done, you will need to provide descriptions (in some form) of what these extensions mean. You can provide this as written material, or you can have the implementer read the code. In either case, the Abbrev tables only provide compression for the Tag/Attr/Val structure of a DIE. They do no provide any semantic description. What I did was add 4 new DW_FORMS. This is one of the areas that, I assume, users are not supposed to extend. My assumption is based on the fact that the standard says how to extend tags and attributes, but is silent on how to extend forms. The four forms that I added are: 1) counted list of udata. 2) counted list of sdata. 3) flag with false value. 4) flag with true value. These are things that made my life simpler and provided for a more