Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-09-01 Thread Daniel Jacobowitz
On Thu, Aug 31, 2006 at 04:00:25PM -0700, Mark Mitchell wrote:
 I think this is probably moot, since I believe that Kenny feels DWARF is 
 not suitable for reasons other than the abbreviation table issue, but 
 this is a clever technique.

... for GIMPLE; when I discussed with him, I got the impression he was
still open to using it for other things, like types.  I may have been
mistaken.

-- 
Daniel Jacobowitz
CodeSourcery


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-09-01 Thread Kenneth Zadeck
Daniel Jacobowitz wrote:
 On Thu, Aug 31, 2006 at 04:00:25PM -0700, Mark Mitchell wrote:
   
 I think this is probably moot, since I believe that Kenny feels DWARF is 
 not suitable for reasons other than the abbreviation table issue, but 
 this is a clever technique.
 

 ... for GIMPLE; when I discussed with him, I got the impression he was
 still open to using it for other things, like types.  I may have been
 mistaken.

   
Given that Mark, and for that matter no one else,  did not really push
back, I am pretty much committed not to use dwarf.

Kenny


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-09-01 Thread Diego Novillo
Kenneth Zadeck wrote on 08/28/06 09:57:

 I have not done this because I do not rule the earth.  That was not
 what I was assigned to do, and I agreed that DWARF3 sounded like a
 reasonable way to go.  Now that I understand the details of DWARF3, I
 have changed my mind about the correct direction.  Now is the time to
 make that change before there is a lot of infrastructure built that
 assumes the DWARF3 encoding.
 
FWIW.  I agree with your conclusion.  I do not know DWARF3 very well,
but it always felt a bit heavy handed to me.

At first, I was under the impression that we were going to use DWARF3 to
describe the types and symbol table, and embed our bytecode alongside.
It sounds to me like we want to either invent our own bytecode or adapt
an existing one to our needs.  Inventing our own is probably the best
long-term alternative.

A few comments on the code:

 Index: lto-tree-flags.def
 [ ... ]
 +/* This file is designed to be inlined into both the writing and the
 +   reading of the lto information.  What it does depends on the glue
 +   that put in front and at the end of this and how ADD_BASE_FLAG is
 +   defined. 
 +
 +   For those interested in extra credit, this could also be used as
 +   the checking code to see if each flag is used correctly.  10 extra
 +   credit points will be given for the intrepid programmer who does
 +   this and is able to removes the comment that this was generated
 +   from in tree.h.  */

Nothing in this comment helps in understanding what the file is supposed
to do.  What is this used for?

 +  switch (code)
 +{
 +case ABS_EXPR:
 +  break;
 +  
 +case ADDR_EXPR:
 +  break;
 +  
 +  ADD_BASE_FLAG (static_flag)

This code is unreachable.  There are many instances of this here.


 +/* Return the master clone node of N if it is available and if
 +   CHECK_OVERWRITE is true, not overwritable.  */ 
 +
What does it mean to be overwritable?  You never seem to call this
function with check_overwrite == false.


 +struct char_ptr_base
 +{
 +  char *ptr;
 +};
 +
Hmm?  Why?  Planning to have something different in the future?


 +/* An incore byte stream to buffer the various parts of the
 +function. The entire structure should be zeroed when created.  The
 +record consists of a set of blocks.  The first sizeof (ptr) bytes are
 +used as a chain, and the rest store the bytes to be written.  */
 +
 +struct output_stream
 +{
 +  /* The pointer to the first block in the stream.  */
 +  struct char_ptr_base * first_block;
 +  /* The pointer to the last and current block in the stream.  */
 +  struct char_ptr_base * current_block;
 +  /* The pointer to where the next char should be written.  */
 +  char * current_pointer;
 +  /* The number of characters left in the current block.  */
 +  unsigned int left_in_block;
 +  /* The block size of the last block allocated.  */
 +  unsigned int block_size;
 +  /* The total number of characters written.  */
 +  unsigned int total_size;
 +};
 +
Maybe there's code out there for paging/caching data streams?  Though we
would probably want to tweak it quite a bit, it may save some effort for
the base functionality.

Even if we end up not using DWARF3 as output, the streaming aspect of
this code will remain, I guess?

 +#if STUPID_TYPE_SYSTEM

STUPID_TYPE_SYSTEM?  No need to be insulting.  It's unpleasant.

 +/* Find, or generate, if there is not one already, the abbrev table
 +   entries for the various stmt_table forms.  The forms in the case
 +   statement here must EXACTLY MATCH the forms in the case statement
 +   in output_stmt_operands.  */
 +
 +static int
 +get_stmt_form (unsigned int tag)
 +{
 +  int index = tag - DW_TAG_gimple_stmt_table_first;
 +  int entry = stmt_form_abbrev_table[index];
 +
 +  if (entry)
 +return entry;
 +
 +  switch (tag)
 +{
 +case DW_TAG_gimple_asm_expr:

Hmm.  Interesting.  We can use this as a verification tool, as well.
From here we could synthesize an up-to-date GIMPLE grammar and keep it
self-consistent.  Nice.

As we find problems with the writer/reader, we should update either the
grammar or GIMPLE (or both).

 @@ -168,12 +167,6 @@ struct eh_region GTY(())
int filter;
  } GTY ((tag (ERT_ALLOWED_EXCEPTIONS))) allowed;
  
 -/* The type given by a call to throw foo();, or discovered
 -   for a throw.  */
 -struct eh_region_u_throw {
 -  tree type;
 -} GTY ((tag (ERT_THROW))) throw;
 -
  /* Retain the cleanup expression even after expansion so that
 we can match up fixup regions.  */
  struct eh_region_u_cleanup {
 @@ -706,13 +699,6 @@ remove_unreachable_regions (rtx insns)
 bool kill_it = true;
 switch (r-type)
   {
 - case ERT_THROW:
 -   /* Don't remove ERT_THROW regions if their outer region
 -  is reachable.  */
 -   if (r-outer  reachable[r-outer-region_number])
 - kill_it = false;
 -   break;
 -
   case ERT_MUST_NOT_THROW:

Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-09-01 Thread Daniel Jacobowitz
On Fri, Sep 01, 2006 at 09:45:34AM -0400, Kenneth Zadeck wrote:
 Given that Mark, and for that matter no one else,  did not really push
 back, I am pretty much committed not to use dwarf.

Then... what are you going to do about things like types?  Invent a new
serialization for those too?  I think that confusing dwarf-for-types
and dwarf-for-gimple would be a mistake.

-- 
Daniel Jacobowitz
CodeSourcery


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-09-01 Thread Kenneth Zadeck
Daniel Jacobowitz wrote:
 On Fri, Sep 01, 2006 at 09:45:34AM -0400, Kenneth Zadeck wrote:
   
 Given that Mark, and for that matter no one else,  did not really push
 back, I am pretty much committed not to use dwarf.
 

 Then... what are you going to do about things like types?  Invent a new
 serialization for those too?  I think that confusing dwarf-for-types
 and dwarf-for-gimple would be a mistake.

   
My part is only the function bodies, we are still going to use dwarf for
the types and the global variables.  There are people at codesoucery
who, even as we speak, are busily enhancing that part to get all of the
pieces output, not just the parts used for the debugger.

Kenny


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-09-01 Thread Daniel Jacobowitz
On Fri, Sep 01, 2006 at 10:19:07AM -0400, Kenneth Zadeck wrote:
 Daniel Jacobowitz wrote:
  On Fri, Sep 01, 2006 at 09:45:34AM -0400, Kenneth Zadeck wrote:

  Given that Mark, and for that matter no one else,  did not really push
  back, I am pretty much committed not to use dwarf.
  
 
  Then... what are you going to do about things like types?  Invent a new
  serialization for those too?  I think that confusing dwarf-for-types
  and dwarf-for-gimple would be a mistake.
 

 My part is only the function bodies, we are still going to use dwarf for
 the types and the global variables.  There are people at codesoucery
 who, even as we speak, are busily enhancing that part to get all of the
 pieces output, not just the parts used for the debugger.

OK, violent agreement.  Thanks for clarifying.

-- 
Daniel Jacobowitz
CodeSourcery


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-09-01 Thread Kenneth Zadeck
Diego Novillo wrote:
 Kenneth Zadeck wrote on 08/28/06 09:57:

   
 I have not done this because I do not rule the earth.  That was not
 what I was assigned to do, and I agreed that DWARF3 sounded like a
 reasonable way to go.  Now that I understand the details of DWARF3, I
 have changed my mind about the correct direction.  Now is the time to
 make that change before there is a lot of infrastructure built that
 assumes the DWARF3 encoding.

 
 FWIW.  I agree with your conclusion.  I do not know DWARF3 very well,
 but it always felt a bit heavy handed to me.

 At first, I was under the impression that we were going to use DWARF3 to
 describe the types and symbol table, and embed our bytecode alongside.
 It sounds to me like we want to either invent our own bytecode or adapt
 an existing one to our needs.  Inventing our own is probably the best
 long-term alternative.

 A few comments on the code:

   
 Index: lto-tree-flags.def
 [ ... ]
 +/* This file is designed to be inlined into both the writing and the
 +   reading of the lto information.  What it does depends on the glue
 +   that put in front and at the end of this and how ADD_BASE_FLAG is
 +   defined. 
 +
 +   For those interested in extra credit, this could also be used as
 +   the checking code to see if each flag is used correctly.  10 extra
 +   credit points will be given for the intrepid programmer who does
 +   this and is able to removes the comment that this was generated
 +   from in tree.h.  */

 
 Nothing in this comment helps in understanding what the file is supposed
 to do.  What is this used for?
   

When I get the other side of the code finished and enhance the comments,
you will see. 
   
 +  switch (code)
 +{
 +case ABS_EXPR:
 +  break;
 +  
 +case ADDR_EXPR:
 +  break;
 +  
 +  ADD_BASE_FLAG (static_flag)

 
 This code is unreachable.  There are many instances of this here.


   
I would have found that when I wrote the code that reads this back.
 +/* Return the master clone node of N if it is available and if
 +   CHECK_OVERWRITE is true, not overwritable.  */ 
 +
 
 What does it mean to be overwritable?  You never seem to call this
 function with check_overwrite == false.

   
I have no idea what overwritable is used for.  This is something that
honza and I went around many times on when I was writing my ipa code. 
It appears that you can put attributes on some functions that allow the
function to replaced late, like at link or even runtime. 

For instance, when I was doing my pure-const analysis, I have to mark
these as not being pure or const even if they looked like they could be
since they could be replaced by versions that were not pure or const.

However, here, I need the master clone node, but I do not care if it is
overwritable, I have to put it out anyway.


I  do call it that way now. Thanks for noticing.
   
 +struct char_ptr_base
 +{
 +  char *ptr;
 +};
 +
 
 Hmm?  Why?  Planning to have something different in the future?


   
 +/* An incore byte stream to buffer the various parts of the
 +function. The entire structure should be zeroed when created.  The
 +record consists of a set of blocks.  The first sizeof (ptr) bytes are
 +used as a chain, and the rest store the bytes to be written.  */
 +
 +struct output_stream
 +{
 +  /* The pointer to the first block in the stream.  */
 +  struct char_ptr_base * first_block;
 +  /* The pointer to the last and current block in the stream.  */
 +  struct char_ptr_base * current_block;
 +  /* The pointer to where the next char should be written.  */
 +  char * current_pointer;
 +  /* The number of characters left in the current block.  */
 +  unsigned int left_in_block;
 +  /* The block size of the last block allocated.  */
 +  unsigned int block_size;
 +  /* The total number of characters written.  */
 +  unsigned int total_size;
 +};
 +
 
 Maybe there's code out there for paging/caching data streams?  Though we
 would probably want to tweak it quite a bit, it may save some effort for
 the base functionality.

 Even if we end up not using DWARF3 as output, the streaming aspect of
 this code will remain, I guess?

   
Yes, the streaming remains, I need to do it this way so that I can then
pass some of the buffers into a compressor.

I had thought of using obstacks, they were close but they were a little
wierd.
 +#if STUPID_TYPE_SYSTEM

 
 STUPID_TYPE_SYSTEM?  No need to be insulting.  It's unpleasant.

   
 +/* Find, or generate, if there is not one already, the abbrev table
 +   entries for the various stmt_table forms.  The forms in the case
 +   statement here must EXACTLY MATCH the forms in the case statement
 +   in output_stmt_operands.  */
 +
 +static int
 +get_stmt_form (unsigned int tag)
 +{
 +  int index = tag - DW_TAG_gimple_stmt_table_first;
 +  int entry = stmt_form_abbrev_table[index];
 +
 +  if (entry)
 +return entry;
 +
 +  switch (tag)
 +{
 +case DW_TAG_gimple_asm_expr:

 
 Hmm.  

Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-09-01 Thread Andrew Pinski
On Fri, 2006-09-01 at 09:51 -0400, Diego Novillo wrote:
 
  +#if STUPID_TYPE_SYSTEM
 
 STUPID_TYPE_SYSTEM?  No need to be insulting.  It's unpleasant.

Well it right now it is stupid, this is just a work around anyways until
people fix the type mismatches really.  Is it more insulting than having
stupid.c which existed in GCC before 3.0.0?  Also this is very temporary
and will go away when the problem in the rest of the compiler is fixed.

-- Pinski



Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-09-01 Thread Kenneth Zadeck
Andrew Pinski wrote:
 On Fri, 2006-09-01 at 09:51 -0400, Diego Novillo wrote:
   
 +#if STUPID_TYPE_SYSTEM

   
 STUPID_TYPE_SYSTEM?  No need to be insulting.  It's unpleasant.
 

 Well it right now it is stupid, this is just a work around anyways until
 people fix the type mismatches really.  Is it more insulting than having
 stupid.c which existed in GCC before 3.0.0?  Also this is very temporary
 and will go away when the problem in the rest of the compiler is fixed.

 -- Pinski

   
I will tone it down, the point has been made.

kenny


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-31 Thread mathieu lacage
hi,

On Wed, 2006-08-30 at 16:44 -0500, Mark Mitchell wrote:

[snip]

 (Implied, but not stated, in your mail is the fact that the abbreviation 
 table cannot be indexed directly.  If it could be, then you wouldn't 
 have to read the entire abbreviation table for each function; you would 
 just read the referenced abbreviations.  Because the abbreviation table 
 records are of variable length, it is indeed true that you cannot make 
 random accesses to the table.  So, this paragraph is just fleshing out 
 your argument.)

I have spent a considerable amount of time looking at the abbrev tables
output by gcc are not totally random: their entries are sorted by their
abbrev code. That is, the abbrev code of entry i+1 is higher than that
of entry i.

I don't know how useful this would be for you but for me, it made a
_huge_ difference because it means I did not have to parse the whole
abbrev table when looking for an entry with a specific abbrev code and I
did not have to create a cache of code-offset mappings for the full
table. You can use a very small (I use 16 entries) cache of abbrev codes
for each abbrev table which tells you where the entry which contains
that abbrev code is located in the table. The key here is that you do
not need to cache all codes because you can use a code smaller than the
one you need to start parsing the table at that point.

The attached file implements such a cache. The
dwarf2_abbrev_cu_read_decl function searches for an entry in the abbrev
table which matches a given abbrev code using the code cache. I cannot
remember if this version actually works because I remember hacking it
quite a bit and stopping in the middle of the rework but this code
should illustrate what I was suggesting.

I hope this helps,

regards,
Mathieu


/* -*- Mode: C; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 8 -*- */
/*
  This program is free software; you can redistribute it and/or
  modify it under the terms of the GNU General Public License
  version 2 as published by the Free Software Foundation.
  
  This program is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.
  
  You should have received a copy of the GNU General Public License
  along with this program; if not, write to the Free Software
  Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.

  Copyright (C) 2004,2005  Mathieu Lacage
  Author: Mathieu Lacage [EMAIL PROTECTED]
*/

#ifndef DWARF2_ABBREV_H
#define DWARF2_ABBREV_H

#include stdint.h
#include stdbool.h
struct reader;

struct dwarf2_abbrev {
uint32_t start; /* start of .debug_abbrev from start of file */
uint32_t end;   /* end of .debug_abbrev from start of file */
};

#define CACHE_SIZE (16)
struct dwarf2_abbrev_cu {
uint32_t start; /* offset from start of file */
uint32_t end;   /* offset from start of file */
struct cache {
uint8_t keys[CACHE_SIZE];
uint32_t values[CACHE_SIZE];
uint8_t last_used[CACHE_SIZE];
uint8_t time;
} cache;
};

struct dwarf2_abbrev_decl {
uint32_t offset; /* offset from start of file to this decl. */
uint64_t abbr_code;  /* abbrev code for this entry */
uint64_t tag;/* abbrev tag for this entry */
uint8_t children;/* children for this entry */
};

struct dwarf2_abbrev_attr {
uint64_t form;
uint64_t name;
};


void dwarf2_abbrev_initialize (struct dwarf2_abbrev *abbrev,
   uint32_t abbrev_start /* offset from start of file */,
   uint32_t abbrev_end /* offset from start of file */);

void dwarf2_abbrev_initialize_cu (struct dwarf2_abbrev *abbrev,
  struct dwarf2_abbrev_cu *abbrev_cu,
  uint32_t offset /* offset from start of file */);

void dwarf2_abbrev_cu_read_decl (struct dwarf2_abbrev_cu *abbrev_cu,
 struct dwarf2_abbrev_decl *decl,
 uint64_t code,
 struct reader *reader);

void dwarf2_abbrev_decl_read_attr_first (struct dwarf2_abbrev_decl *decl,
 struct dwarf2_abbrev_attr *attr,
 uint32_t *new_offset,
 struct reader *reader);
void dwarf2_abbrev_read_attr (uint32_t cur_offset,
  struct dwarf2_abbrev_attr *attr,
  uint32_t *new_offset,
  struct reader *reader);
bool dwarf2_abbrev_attr_is_last (struct dwarf2_abbrev_attr *attr);


#endif /* DWARF2_ABBREV_H */
/* -*- Mode: C; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 8 -*- */
/*
  This program is free 

Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-31 Thread Kenneth Zadeck
Mark Mitchell wrote:
 Kenneth Zadeck wrote:

 Even if we decide that we are going to process all of the functions in
 one file at one time, we still have to have access to the functions that
 are going to be inlined into the function being compiled.  Getting at
 those functions that are going to be inlined is where the double the i/o
 arguement comes from. 

 I understand -- but it's natural to expect that those functions will
 be clumped together.  In a gigantic program, I expect there are going
 to be clumps of tightly connected object files, with relatively few
 connections between the clumps.  So, you're likely to get good cache
 behavior for any per-object-file specific data that you need to access.

I just do not know.  I assume that you are right, that there is some
clumping.  But I am just no sure. 

 I have never depended on the kindness of strangers or the virtues of
 virtual memory.  I fear the size of the virtual memory when we go to
 compile really large programs. 

 I don't think we're going to blow out a 64-bit address space any time
 soon.  Disks are big, but they are nowhere near *that* big, so it's
 going to be pretty hard for anyone to hand us that many .o files. 
 And, there's no point manually reading/writing stuff (as opposed to
 mapping it into memory), unless we actually run out of address space.

I am not so concerned with running out of virtual address space than I
am about being able to break this up so that it can be done in parallel,
on a farm of machines.  Otherwise, lto can never be part of anyone's
compile/test loop. 

The notion of having 20 or 50 compile servers each mapping all of the
files of a large system in seems like a bad design point. 


 In fact, if you're going to design your own encoding formats, I would
 consider a format with self-relative pointers (or, offsets from some
 fixed base) that you could just map into memory.  It wouldn't be as
 compact as using compression, so the total number of bytes written
 when generating the object files would be bigger.  But, it will be
 very quick to load it into memory.

If you look at my code, that is what I have, at least with respect to
the function itself.
There is one big difference here between lto and what a debugger needs. 
I could see designing a debugger (and I have no idea if any such
debuggers exist) that simply maps in the debug information and just uses
the incore representation as is.  Dwarf seems to have been designed to
support this.  (but then again I could be dreaming).  With an
intermediate form of a compiler, the usage is quite different.  All that
we are going to do is load a function, convert it gimple and then throw
away (the notion of throw away may not have meaning for memory mapped
files) the on disk version.

The prime goal is that the format be designed so that an enitity
(generally a function) can be expanded into gimple in one pass.  Then
the question of the benefit of using a compressor comes down to
processor speed vs io speed.

With the parts that you are in charge of, namely the types and the
globals, this is not true.  I can very easily see an implementation of
the types and decls that is like I describe for the debugger, you map it
into mem and just use if from there.   But since the intermediate code
for a function body is going to get very modified, and our existing
gimple is chocked full of pointers, it is harder to envision ever
winning at that the mapping game.


 I guess my overriding concern is that we're focusing heavily on the
 data format here (DWARF?  Something else?  Memory-mappable?  What
 compression scheme?) and we may not have enough data.  I guess we just
 have to pick something and run with it.  I think we should try to keep
 that code as as separate as possible so that we can recover easily if
 whatever we pick turns out to be (another) bad choice. :-)

 One of the comments that was made by a person on the dwarf committee is
 that the abbrev tables really can be used for compression.  If you have
 information that is really common to a bunch of records, you can build
 an abbrev entry with the common info in it. 

 Yes.  I was a little bit surprised that you don't seem to have seen
 much commonality.  If you recorded most of the tree flags, and treated
 them as DWARF attributes, I'd expect you would see relatively many
 expressions of a fixed form.  Like, there must be a lot of PLUS_EXPRs
 with TREE_USED set on them.  But, I gather that you're trying to avoid
 recording some of these flags, hoping either that (a) they won't be
 needed, or (b) you can recreate them when reading the file.  I think
 both (a) and (b) hold in many cases, so I think it's reasonable to
 assume we're writing out very few attributes.

I had thought about that with the flags, and decided it was too much
work/per gain.  I decided to just compress the flags, so that only the
flags that are used for a given tree node were written as a uleb128 bit
number and then let zlib do the rest.  The flags 

Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-31 Thread Daniel Berlin

On 8/31/06, Kenneth Zadeck [EMAIL PROTECTED] wrote:

Mark Mitchell wrote:
 Kenneth Zadeck wrote:

 Even if we decide that we are going to process all of the functions in
 one file at one time, we still have to have access to the functions that
 are going to be inlined into the function being compiled.  Getting at
 those functions that are going to be inlined is where the double the i/o
 arguement comes from.

 I understand -- but it's natural to expect that those functions will
 be clumped together.  In a gigantic program, I expect there are going
 to be clumps of tightly connected object files, with relatively few
 connections between the clumps.  So, you're likely to get good cache
 behavior for any per-object-file specific data that you need to access.

I just do not know.  I assume that you are right, that there is some
clumping.  But I am just no sure.


I just want to point out that this argument (okay cache locality) was
used as a reason the massive amount of open/seek/close behavior by
Subversion's FSFS filesystem is a-ok.

It turns out not to be in practice, for a few reasons:
First, the syscall overhead ends up being pretty bad.  Probably not as
bad in LTO as in Subversion (for obvious reasons), but you shouldn't
discount it.
Second, and more importantly, on a network file system, the
open/seek/read calls to get at the cached data are going to be RPC
calls.

--Dan


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-31 Thread Mark Mitchell

Daniel Berlin wrote:

On 8/31/06, Kenneth Zadeck [EMAIL PROTECTED] wrote:

Mark Mitchell wrote:
 Kenneth Zadeck wrote:

 Even if we decide that we are going to process all of the functions in
 one file at one time, we still have to have access to the functions 
that

 are going to be inlined into the function being compiled.  Getting at
 those functions that are going to be inlined is where the double 
the i/o

 arguement comes from.

 I understand -- but it's natural to expect that those functions will
 be clumped together.  In a gigantic program, I expect there are going
 to be clumps of tightly connected object files, with relatively few
 connections between the clumps.  So, you're likely to get good cache
 behavior for any per-object-file specific data that you need to access.

I just do not know.  I assume that you are right, that there is some
clumping.  But I am just no sure.


I just want to point out that this argument (okay cache locality) was
used as a reason the massive amount of open/seek/close behavior by
Subversion's FSFS filesystem is a-ok.


Here, we won't be making syscalls -- but we will be taking page faults 
if we go out of cache.  I don't know what the consequences of page 
faults for files backed over NFS are, but if your object files are 
coming over NFS, your linker isn't going to go too fast anyhow.  I would 
expect most users carefully use local disk for object files.


Since we're descending into increasingly general arguments, let me say 
it more generally: we're optimizing before we've fully profiled.  Kenny 
had a very interesting datapoint: that abbreviation tables tended to be 
about the size of a function.  That's great information.  All I'm 
suggesting is that this data doesn't necessarily imply that enabling 
random access to functions (as we all agree is necessary) implies a 2x 
I/O cost.  It's only a 2x I/O cost if every time you need to go look at 
a function the abbreviation table has been paged out.


I think we've gotten extremely academic here.  As far as I can tell, 
Kenny has decided not to use DWARF, and nobody's trying to argue that he 
should, so we should probably just move on.  My purpose in raising a few 
counterpoints is just to make sure that we're not overlooking anything 
obvious in favor of DWARF; since Kenny's already got that code written, 
it would be nice if we had a good reason not to start over.


--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-31 Thread Daniel Jacobowitz
On Thu, Aug 31, 2006 at 09:24:20AM -0700, Mark Mitchell wrote:
 Here, we won't be making syscalls

Yes, you almost certainly will.  If you've got a thousand object files,
you probably don't want to keep them all opened all the time; there are
these pesky things like open file descriptor limits, for instance, and
you'll tend to degrade performance.

-- 
Daniel Jacobowitz
CodeSourcery


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-31 Thread Mark Mitchell

Kenneth Zadeck wrote:


I am not so concerned with running out of virtual address space than I
am about being able to break this up so that it can be done in parallel,
on a farm of machines.  Otherwise, lto can never be part of anyone's
compile/test loop. 


I think we just expanded the scope of work by an order of magnitude. :-)

If you had just said that you wanted to support multi-threaded LTO, that 
would have been a big deal.  But multiple machines with multiple address 
spaces trying to do LTO on one program is a really big deal.  (Of 
course, there is a cheap hack way of doing what you want: run LTO on 
clumps of object files in parallel, and then just link the pre-optimized 
files together in the ordinary way.)


I'd really like to see us inline a function before we even begin to have 
this conversation. :-)



I have no idea how stable all the types and decls are over a
compilation.  I write my info pretty early, and I assume the types and
decls are written pretty late in the compilation (otherwise you would
not have address expressions for the debugger).  If there has been any
processing on these between when I write my stuff and when the types
and decls get written, things may not match up. 


I don't think that this is an issue.  The important information about 
types and declaration is stable.  Things like is this declaration 
used? change over the course of the compilation, but that's not useful 
for DWARF anyhow -- and, in general, we don't write out information 
about types/declarations that are entirely unused.  The key aspects 
(sizes/layouts/etc.) are fixed.


--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-31 Thread Kenneth Zadeck
Mark Mitchell wrote:
 Kenneth Zadeck wrote:

 I am not so concerned with running out of virtual address space than I
 am about being able to break this up so that it can be done in parallel,
 on a farm of machines.  Otherwise, lto can never be part of anyone's
 compile/test loop. 

 I think we just expanded the scope of work by an order of magnitude. :-)

 If you had just said that you wanted to support multi-threaded LTO,
 that would have been a big deal.  But multiple machines with multiple
 address spaces trying to do LTO on one program is a really big deal. 
 (Of course, there is a cheap hack way of doing what you want: run
 LTO on clumps of object files in parallel, and then just link the
 pre-optimized files together in the ordinary way.)

 I'd really like to see us inline a function before we even begin to
 have this conversation. :-)
I have no intention in expanding the scope of the work.  I just am not
going to do anything that precludes doing this.  As long as the function
bodies are in a form that can be easily accessed without reading and
processing the entire file, I am fine. 



 I have no idea how stable all the types and decls are over a
 compilation.  I write my info pretty early, and I assume the types and
 decls are written pretty late in the compilation (otherwise you would
 not have address expressions for the debugger).  If there has been any
 processing on these between when I write my stuff and when the types
 and decls get written, things may not match up. 

 I don't think that this is an issue.  The important information about
 types and declaration is stable.  Things like is this declaration
 used? change over the course of the compilation, but that's not
 useful for DWARF anyhow -- and, in general, we don't write out
 information about types/declarations that are entirely unused.  The
 key aspects (sizes/layouts/etc.) are fixed.

You are either right, or we will fix it.  Those are the only two
options.  My only point was that there is a lot of compiler in between
where someone could have done something silly.




Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-31 Thread Mark Mitchell

Daniel Jacobowitz wrote:

On Thu, Aug 31, 2006 at 09:24:20AM -0700, Mark Mitchell wrote:

Here, we won't be making syscalls


Yes, you almost certainly will.


OK, good point.

In any case, my concern is that we're worrying a lot about on-disk 
encoding, but that there are lots of other hard problems we have to 
solve before we can make this work -- even independent of resource 
constraints.


So, I suggest we choose an encoding that seems approximately reasonable, 
but not worry too much about exactly how optimal it is.  I think that 
boils down to: Kenny, I think you should do what you think best, but 
without working too terribly hard. :-)


--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-31 Thread Mark Mitchell

mathieu lacage wrote:


I have spent a considerable amount of time looking at the abbrev tables
output by gcc are not totally random: their entries are sorted by their
abbrev code. That is, the abbrev code of entry i+1 is higher than that
of entry i.


That's an interesting observation.  Essentially, you've shown that by 
storing lg(n) information, you can cut the cost to find an entry in an 
abbreviation table of size n to a constant.  Since, for LTO, we 
certainly can depend on the .o file being produced by GCC, we could 
depend on this behavior, even though it's not mandated by the DWARF 
standard.


I think this is probably moot, since I believe that Kenny feels DWARF is 
not suitable for reasons other than the abbreviation table issue, but 
this is a clever technique.


Thanks,

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Tom Tromey
 KZ == Kenneth Zadeck [EMAIL PROTECTED] writes:

KZ 2) To have a discussion about the use of DWARF3.  I am now against the
KZ use of DWARF3 for encoding the GIMPLE.

FWIW your arguments convinced me.

I think what matters most is that the resulting format be relatively
well documented (say, better than GIMPLE), efficient, suitable, etc.
Reusing DWARF3 seems cute but inessential.

[...]
KZ +case TRUTH_NOT_EXPR:
KZ +case VIEW_CONVERT_EXPR:
KZ +#if STUPID_TYPE_SYSTEM
KZ +  output_type_ref (ob, TREE_TYPE (expr));
KZ +#endif

I think VIEW_CONVERT_EXPR needs to be treated like NOP_EXPR and
CONVERT_EXPR in the STUPID_TYPE_SYSTEM case.  VIEW_CONVERT_EXPR is a
type-casting expression.

KZ +/* When we get a strongly typed gimple, which may not be for another
KZ +   15 years, this flag should be set to 0 so we do not waste so much
KZ +   space printing out largely redundant type codes.  */
KZ +#define STUPID_TYPE_SYSTEM 1

You could write a more compact form by emitting explicit fake nop
nodes where needed, and then strip those when reading.  I think this
would avoid tweaking the optimizer bugs, as the reloaded trees would
be identical.

This does bring up another point about the format though: there's got
to be some versioning capability in there.

Tom


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Andrew Pinski
 [...]
 KZ +case TRUTH_NOT_EXPR:
 KZ +case VIEW_CONVERT_EXPR:
 KZ +#if STUPID_TYPE_SYSTEM
 KZ +  output_type_ref (ob, TREE_TYPE (expr));
 KZ +#endif
 
 I think VIEW_CONVERT_EXPR needs to be treated like NOP_EXPR and
 CONVERT_EXPR in the STUPID_TYPE_SYSTEM case.  VIEW_CONVERT_EXPR is a
 type-casting expression.

No it is not, it is more complex than a just a simple type-casting
expression, it is a cast which does nothing to the bits at all.  It acts
more like a reference than a cast as it is also can be used on the left
hand side.  I am working on a patch to have GCC use VCE more.

 You could write a more compact form by emitting explicit fake nop
 nodes where needed, and then strip those when reading.  I think this
 would avoid tweaking the optimizer bugs, as the reloaded trees would
 be identical.

Actually it would be better just to fix the problem in the first place
as mentioned before.

-- Pinski


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Mark Mitchell

Kenneth Zadeck wrote:


This posting is a progress report of my task of encoding and decoding
the GIMPLE stream into LTO.   Included in this posting is a patch that
encodes functions and dumps the result to files.  


[I'm sorry for not replying to this sooner.  I've been on a plane or in 
a meeting virtually all of my waking hours since you wrote this...]


Exciting!


2) To have a discussion about the use of DWARF3.  I am now against the
use of DWARF3 for encoding the GIMPLE. 


As the person who suggested this to you, I'm sorry it doesn't seem to 
meet your needs.  I'll make a few more specific comments below, but, to 
echo what I said at the time: I think it was a good idea to try it, but 
if it's not the right tool for the job, then so be it.  My opinion has 
always been that, for the bodies of functions, DWARF is just an 
encoding: if it's a decent encoding, then, it's nice that it's a 
standard; if it's a bad encoding, then, by all means, let's not use it!


I do think DWARF is a good choice for the non-executable information, 
for the same reasons I did initially:


(a) for debugging, we need to generate most of that information anyhow, 
so we're piggy-backing on existing code -- and not making object files 
bigger by encoding the same information twice in the case of -O2 -g, 
which is the default way that many GNU applications are built.


(b) it's a standard, and we already have tools for reading DWARF, so it 
saves the trouble of coming up with a new encoding,


(c) because bugs in the DWARF emission may not result in problems at 
LTO, we'll be validating our LTO information every time we use GDB, and, 
similarly, improving the GDB experience every time we fix an LTO bug in 
this area.


I understand that some of these benefits do not apply to the executable 
code, and that, even to the extent they may apply, the tradeoffs are 
different.  The comments I've made below about specific issues should 
therefore be considered as academic responses, not an attempt to argue 
the decision you have made.



3) To get some one to tell me what option we are going to add to the
compiler to tell it to write this information.  


I think a reasonable spelling is probably -flto.  It should not be a 
-m option since it is not machine-specific.  I don't think it should be 
a -O option either, since writing out LTO information isn't really 
supposed to affect optimization per se.



2) The code is, by design, fragile.  It takes nothing for granted.
Every case statement has gcc_unreachable as it's default case. 


That's the same way I've approached the DWARF reading code, and for the 
same reason.  I think that's exactly the right decision.



1) ABBREV TABLES ARE BAD FOR LTO.

However, this mechanism is only self descriptive if you do not extend
the set of tags.  That is not an option for LTO.  


Definitely true.  When we talked on the phone, we talked about creating 
a tag corresponding to each GIMPLE tree code.  However, you could also 
create a numeric attribute giving the GIMPLE tree code.  If you did 
that, you might find that the abbreviation table became extremely small 
-- because almost all interior nodes would be DW_TAG_GIMPLE_EXPR nodes. 
 The downside, of course, is that the storage required to store the 
nodes would be greater, as it would now contain the expression code 
(e.g., PLUS_EXPR), rather than having a DW_TAG_GIMPLE_PLUS_EXPR.



I strongly believe that for LTO to work, we are going to have to
implement some mechanism where the function bodies are loaded into the
compiler on demand (possibly kept in cache, but maybe not). 


Agreed.


This
will be more cumbersome if we have to keep reloading each object
file's abbrev table just to tear apart a single function in that .o
file.  While the abbrev sections average slightly less than %2 of the
of the size of the GIMPLE encoding for an entire file, each abbrev table
averages about the same size as a single function.


Interesting datapoint.

(Implied, but not stated, in your mail is the fact that the abbreviation 
table cannot be indexed directly.  If it could be, then you wouldn't 
have to read the entire abbreviation table for each function; you would 
just read the referenced abbreviations.  Because the abbreviation table 
records are of variable length, it is indeed true that you cannot make 
random accesses to the table.  So, this paragraph is just fleshing out 
your argument.)


I think the conclusion that you reach (that the size of the tables is a 
problem) depends on how you expect the compiler to process functions at 
link-time.  My expectation was that you would form a global control-flow 
graph for the entire program (by reading CFG data encoded in each .o 
file), eliminate unreachable functions, and then inline/optimize 
functions one-at-a-time.


If you sort the function-reading so that you prefer to read functions 
from the same object file in order, then I would expect that you would 
considerably reduce the impact of reading 

Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Kenneth Zadeck
Mark Mitchell wrote:
 Kenneth Zadeck wrote:



 This
 will be more cumbersome if we have to keep reloading each object
 file's abbrev table just to tear apart a single function in that .o
 file.  While the abbrev sections average slightly less than %2 of the
 of the size of the GIMPLE encoding for an entire file, each abbrev table
 averages about the same size as a single function.

 Interesting datapoint.

 (Implied, but not stated, in your mail is the fact that the
 abbreviation table cannot be indexed directly.  If it could be, then
 you wouldn't have to read the entire abbreviation table for each
 function; you would just read the referenced abbreviations.  Because
 the abbreviation table records are of variable length, it is indeed
 true that you cannot make random accesses to the table.  So, this
 paragraph is just fleshing out your argument.)

 I think the conclusion that you reach (that the size of the tables is
 a problem) depends on how you expect the compiler to process functions
 at link-time.  My expectation was that you would form a global
 control-flow graph for the entire program (by reading CFG data encoded
 in each .o file), eliminate unreachable functions, and then
 inline/optimize functions one-at-a-time.

 If you sort the function-reading so that you prefer to read functions
 from the same object file in order, then I would expect that you would
 considerably reduce the impact of reading the abbreviation tables. 
 I'm making the assumption that it f calls N functions, then they
 probably come from  N object files.  I have no data to back up that
 assumption.

 (There is nothing that says that you can only have one abbreviation
 table for all functions.  You can equally well have one abbreviation
 table per function.  In that mode, you trade space (more abbreviation
 tables, and the same abbreviation appearing in multiple tables)
 against the fact that you now only need to read the abbreviation
 tables you need.  I'm not claiming this is a good idea.)

 I don't find this particular argument (that the abbreviation tables
 will double file I/O) very convincing.  I don't think it's likely that
 the problem we're going to have with LTO is running out of *virtual*
 memory, especially as 64-bit hardware becomes nearly universal.  The
 problem is going to be running out of physical memory, and thereby
 paging like crazy, running out of D-cache.  So, I'd assume you'd just
 read the tables as-needed, and never both discarding them.  As long as
 there is reasonable locality of reference to abbreviation tables
 (i.e., you can arrange to hit object files in groups), then the cost
 here doesn't seem like it would be very big.

Even if we decide that we are going to process all of the functions in
one file at one time, we still have to have access to the functions that
are going to be inlined into the function being compiled.  Getting at
those functions that are going to be inlined is where the double the i/o
arguement comes from. 

I have never depended on the kindness of strangers or the virtues of
virtual memory.  I fear the size of the virtual memory when we go to
compile really large programs. 


 2) I PROMISED TO USE THE DWARF3 STACK MACHINE AND I DID NOT.

 I never imagined you doing this; as per above, I always expected that
 you would use DWARF tags for the expression nodes.  I agree that the
 stack-machine is ill-suited.

 3) THERE IS NO COMPRESSION IN DWARF3.

 In 1 file per mode, zlib -9 compression is almost 6:1.  In 1 function
 per mode, zlib -9 compression averages about 3:1.

 In my opinion, if you considered DWARF + zlib to be satisfactory, then
 I think that would be fine.  For LTO, we're allowed to do whatever we
 want.  I feel the same about your confession that you invented a new
 record form; if DWARF + extensions is a suitable format, that's fine.
 In other words, in principle, using a somewhat non-standard variant of
 DWARF for LTO doesn't seem evil to me -- if that met our needs.

One of the comments that was made by a person on the dwarf committee is
that the abbrev tables really can be used for compression.  If you have
information that is really common to a bunch of records, you can build
an abbrev entry with the common info in it. 

I have not seen a place where any use can be made of this for encoding
gimple except for a couple of places where I have encoded a true or
false.  I therefor really do not see that they really add anything
except for the code to read and write them. 


 2) LOCAL DECLARATIONS 
 Mark was going to do all of the types and all of the declarations.
 His plan was to use the existing DWARF3 and enhance it where it was
 necessary eventually replacing the GCC type trees with direct
 references to the DWARF3 symbol table.  

  The types and global variables are likely OK, or at least Mark
 should be able to add any missing info.

I had a discussion on chat today with drow and he indicated that you
were busily adding all of the missing stuff here.  I 

Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Mark Mitchell

Kenneth Zadeck wrote:


Even if we decide that we are going to process all of the functions in
one file at one time, we still have to have access to the functions that
are going to be inlined into the function being compiled.  Getting at
those functions that are going to be inlined is where the double the i/o
arguement comes from. 


I understand -- but it's natural to expect that those functions will be 
clumped together.  In a gigantic program, I expect there are going to be 
clumps of tightly connected object files, with relatively few 
connections between the clumps.  So, you're likely to get good cache 
behavior for any per-object-file specific data that you need to access.



I have never depended on the kindness of strangers or the virtues of
virtual memory.  I fear the size of the virtual memory when we go to
compile really large programs. 


I don't think we're going to blow out a 64-bit address space any time 
soon.  Disks are big, but they are nowhere near *that* big, so it's 
going to be pretty hard for anyone to hand us that many .o files.  And, 
there's no point manually reading/writing stuff (as opposed to mapping 
it into memory), unless we actually run out of address space.


In fact, if you're going to design your own encoding formats, I would 
consider a format with self-relative pointers (or, offsets from some 
fixed base) that you could just map into memory.  It wouldn't be as 
compact as using compression, so the total number of bytes written when 
generating the object files would be bigger.  But, it will be very quick 
to load it into memory.


I guess my overriding concern is that we're focusing heavily on the data 
format here (DWARF?  Something else?  Memory-mappable?  What compression 
scheme?) and we may not have enough data.  I guess we just have to pick 
something and run with it.  I think we should try to keep that code as 
as separate as possible so that we can recover easily if whatever we 
pick turns out to be (another) bad choice. :-)



One of the comments that was made by a person on the dwarf committee is
that the abbrev tables really can be used for compression.  If you have
information that is really common to a bunch of records, you can build
an abbrev entry with the common info in it. 


Yes.  I was a little bit surprised that you don't seem to have seen much 
commonality.  If you recorded most of the tree flags, and treated them 
as DWARF attributes, I'd expect you would see relatively many 
expressions of a fixed form.  Like, there must be a lot of PLUS_EXPRs 
with TREE_USED set on them.  But, I gather that you're trying to avoid 
recording some of these flags, hoping either that (a) they won't be 
needed, or (b) you can recreate them when reading the file.  I think 
both (a) and (b) hold in many cases, so I think it's reasonable to 
assume we're writing out very few attributes.



I had a discussion on chat today with drow and he indicated that you
were busily adding all of the missing stuff here.


All is an overstatement. :-) Sandra is busily adding missing stuff and 
I'll be working on the new APIs you need.



I told him that I
thought this was fine as long as there is not a temporal drift in
information encoded for the types and decls between the time I write my
stuff and when the types and decls are written.


I'm not sure what this means.

Thanks,

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713



Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-30 Thread Seongbae Park

On 8/30/06, Mark Mitchell [EMAIL PROTECTED] wrote:
...

I guess my overriding concern is that we're focusing heavily on the data
format here (DWARF?  Something else?  Memory-mappable?  What compression
scheme?) and we may not have enough data.  I guess we just have to pick
something and run with it.  I think we should try to keep that code as
as separate as possible so that we can recover easily if whatever we
pick turns out to be (another) bad choice. :-)


At the risk of stating the obvious and also repeating myself,
please allow me give my thought on this issue.

I think we should take even a step further than try to keep the code
as separate.
We should try to come up with a set of
procedural and datastructural interface for the input/output
of the program structure,
and try  to *completely* separate the optimization/datastructure cleanup work
from the encoding/decoding.

Beside the basic requirement of being able to pass through
enough information to produce valid program,
I think there is a critical requirement
to implement inter-module/inter-procedural optimization efficiently
- that the I/O interface allows efficient handling of
iterating through module/procedure-level information
without reading each and every module/procedure bodies
(as Ken mentioned).

There are certain amount of information per object/procedure that
are accessed during different optimization and with sufficiently
different pattern -
e.g. type tree is naturally an object-level information
that we may want to go through for each and every object file,
without read all function bodies,
and other function level information such as caller/callee information
would be useful without the function body.

We'll need to identify such information (in other words,
the requirement of the interprocedural optimization/analysis)
so that the new interface would provide ways to walk through them
without loading the entire function bodies - even with large address space,
if the data is scattered everywhere, it becomes extremely inefficient
on modern machines to go through them,
so it's actually more important to identify what logical information
that we want to access during various interprocedural optimizations
and the I/O interface needs to handle them efficiently.

This requirement should dictate how we encode/layout the data
into the disk, before anything else. Also how the information is
presented to the actual inter-module optimization/analysis.

Also, part of defining the interface would involve restricting
the existing structures (e.g. GIMPLE) in possibly more limited form
than what's currently allowed. By virtue of having an interface
that separates the encoding/decoding from the rest of the compilation,
we can throw away and recompute certain information
(e.g. often certain control flow graph can be recovered,
hence does not need to be encoded)
but those details can be worked out as the implementation of the IO interface
gets more in shape.
--
#pragma ident Seongbae Park, compiler, http://seongbae.blogspot.com;


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-28 Thread Chris Lattner

On Aug 28, 2006, at 6:57 AM, Kenneth Zadeck wrote:


This posting is a progress report of my task of encoding and decoding
the GIMPLE stream into LTO.   Included in this posting is a patch that
encodes functions and dumps the result to files.


Interesting email.  One question: how big are the files you're  
generating?  For example, how much space does it take to store the  
body of a function with 100 add operations in it (before gzip'ing  
it)?  What is the size proportional to?  # instructions, # basic  
blocks, what else?


-Chris


Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-28 Thread Kenneth Zadeck
Chris Lattner wrote:
 On Aug 28, 2006, at 6:57 AM, Kenneth Zadeck wrote:

 This posting is a progress report of my task of encoding and decoding
 the GIMPLE stream into LTO.   Included in this posting is a patch that
 encodes functions and dumps the result to files.

 Interesting email.  One question: how big are the files you're
 generating?  For example, how much space does it take to store the
 body of a function with 100 add operations in it (before gzip'ing
 it)?  What is the size proportional to?  # instructions, # basic
 blocks, what else?

 -Chris
The size will be large because the trees are still large.  Encoding
everything in a bootstrap for the three languages was on the order of
60mb.  As you know, size is a well known problem in gcc.  The plan is to
track the changes in gimple and as that gets smaller, this will get
smaller. 

I would like to get the types and perhaps the flags out of the internal
nodes of the trees.  when that gets done, these operations should be small.

I actually do not think that it is productive to spend that much time
measuring this until we first assure ourselves that we are getting all
of the information output correctly.  That 60mb number will change a lot
(both up and down) as we get closer to getting everything tied
together.  Then we can spend more time looking at the pieces and seeing
where we need to do better.



Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-28 Thread Chris Lattner

On Aug 28, 2006, at 10:36 AM, Kenneth Zadeck wrote:

I actually do not think that it is productive to spend that much time
measuring this until we first assure ourselves that we are getting all
of the information output correctly.  That 60mb number will change  
a lot

(both up and down) as we get closer to getting everything tied
together.  Then we can spend more time looking at the pieces and  
seeing

where we need to do better.


Okay, thanks!

-Chris



Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-28 Thread Michael Eager

Kenneth Zadeck wrote:

This posting is a progress report of my task of encoding and decoding
the GIMPLE stream into LTO.   Included in this posting is a patch that
encodes functions and dumps the result to files.  


I have only a limited understanding of GIMPLE and LTO, but here are my
comments regarding DWARF.

DWARF is a debugging format used to describe a program's source and how
it is represented in an executable file.  GIMPLE is an intermediate
representation, not program source.  It's always interesting to see one
tool used in a different context, but I'm not clear that in this case
you are using the right tool for the right task in the way that it was
designed.

Speaking on behalf of the DWARF Standards Workgroup, we welcome
suggestions for extensions or improvements in the DWARF standard.
I wasn't able to identify any specific extension in your email (although
I didn't look at the patch in detail).  If there are specific areas
where DWARF can be improved, please let me know.



1) ABBREV TABLES ARE BAD FOR LTO.

The abbrev table is a mechanism for making a DWARF3 file self
descriptive.  It contains a set of templates that are used to describe
the records and attributes that are written to describe the object
module.

This self description mechanism is much weaker (an therefor much more
compact) than an XML file's schemata because it depends on a common
understanding of the meaning of the tags and attributes that describe
the pieces of the data.  The tags them selves can be thought of as
characters, and an abbrev entry is a magic cookie that tells you how
to break the characters into words.


DWARF is not self-descriptive, except in the very limited sense that
several sections contain version numbers.  The DWARF abbreviation table
is not intended to be self-descriptive or to be descriptive of the
DWARF data.  It is a simple compression scheme which compacts the
Tag/Attribute/Value structure of a DWARF DIE into a much more condensed
format.

Certainly the DWARF Abbrev table makes no pretense of being anything
similar to an XML schema.  I wouldn't use an XML schema for debugging
information (it would be huge and slow), and I wouldn't expect to use
DWARF abbreviation tables to describe anything at all.


However, this mechanism is only self descriptive if you do not extend
the set of tags.  That is not an option for LTO.  While the DWARF3
designers seem to have added every conceivable tag to describe object
code, it is not be surprising that this comes up short when describing
GIMPLE.  While some of GIMPLE could be shoe-horned into the existing
tags, all of it will not fit, so anyone who wants to understand
GIMPLE, will need to use the GCC headers.


The DWARF standard provides for a consistent means to extend
the debugging format.  If you extend DWARF beyond the standard, as has
often been done, you will need to provide descriptions (in some form)
of what these extensions mean.  You can provide this as written
material, or you can have the implementer read the code.  In either
case, the Abbrev tables only provide compression for the Tag/Attr/Val
structure of a DIE.  They do no provide any semantic description.


Using the abbrev tables also add a lot of complexity. It should be
understood that while the abbrev table for two separate files encodes
the same information, the way that the abbrev tables are generated
almost guarantees that the abbrev table will be encoded differently for each
file.  The abbrev table is a sequential list of record
descriptors. The way that the list is generated is first needed, first
output, and you only output the records you need.  Each of these
records is numbered sequentially.  So the numbering of the abbrev
records will generally be different for the two created object files since
they reference different GIMPLE nodes.


There is no need for Abbrev tables to be generated differently
for each object file.  Several compilers generate a standard Abbrev
table and then generate DWARF using these predefined abbreviations.

The DWARF standard also allows the Abbrev table to be stored in a
shared object and referenced by individual object files.  I'm not
aware of any actual implementation of this, but the structure is
there.


I strongly believe that for LTO to work, we are going to have to
implement some mechanism where the function bodies are loaded into the
compiler on demand (possibly kept in cache, but maybe not).  This
will be more cumbersome if we have to keep reloading each object
file's abbrev table just to tear apart a single function in that .o
file.  While the abbrev sections average slightly less than %2 of the
of the size of the GIMPLE encoding for an entire file, each abbrev table
averages about the same size as a single function.  Thus, we will either
have to keep all the abbrev entries for an entire compilation, implement
a second caching mechanism, or do about twice the file reading to load
a single function.


I'm not exactly sure what you are trying to 

Re: First cut on outputing gimple for LTO using DWARF3. Discussion invited!!!!

2006-08-28 Thread Kenneth Zadeck
Michael Eager wrote:
 Kenneth Zadeck wrote:
 This posting is a progress report of my task of encoding and decoding
 the GIMPLE stream into LTO.   Included in this posting is a patch that
 encodes functions and dumps the result to files.  

 I have only a limited understanding of GIMPLE and LTO, but here are my
 comments regarding DWARF.

 DWARF is a debugging format used to describe a program's source and how
 it is represented in an executable file.  GIMPLE is an intermediate
 representation, not program source.  It's always interesting to see one
 tool used in a different context, but I'm not clear that in this case
 you are using the right tool for the right task in the way that it was
 designed.

 Speaking on behalf of the DWARF Standards Workgroup, we welcome
 suggestions for extensions or improvements in the DWARF standard.
 I wasn't able to identify any specific extension in your email (although
 I didn't look at the patch in detail).  If there are specific areas
 where DWARF can be improved, please let me know.

Michael,

My original posting was primarily intended for the gcc developer
community which is familiar with gimple and not so familiar with dwarf. 
I actually think that dwarf is probably pretty good for what it was
designed for and it is also documented much better than most of the
internals of gcc. 

The task I was given was to encode gimple into dwarf, something that
looked like it might be a reasonable idea, but is, in fact, a bad idea. 
My posting was to tell the gcc community, been there, done that, now
lets move on.

I will respond to many of your points but you should not take this as
criticism of dwarf, these are mostly just places where the interface
between gimple and dwarf was particularly troublesome and unless you
want to get in the transporting intermediate code business, which I
strongly suggest you avoid, you can just ignore most of these problems. 



 1) ABBREV TABLES ARE BAD FOR LTO.

 The abbrev table is a mechanism for making a DWARF3 file self
 descriptive.  It contains a set of templates that are used to describe
 the records and attributes that are written to describe the object
 module.

 This self description mechanism is much weaker (an therefor much more
 compact) than an XML file's schemata because it depends on a common
 understanding of the meaning of the tags and attributes that describe
 the pieces of the data.  The tags them selves can be thought of as
 characters, and an abbrev entry is a magic cookie that tells you how
 to break the characters into words.

 DWARF is not self-descriptive, except in the very limited sense that
 several sections contain version numbers.  The DWARF abbreviation table
 is not intended to be self-descriptive or to be descriptive of the
 DWARF data.  It is a simple compression scheme which compacts the
 Tag/Attribute/Value structure of a DWARF DIE into a much more condensed
 format.

 Certainly the DWARF Abbrev table makes no pretense of being anything
 similar to an XML schema.  I wouldn't use an XML schema for debugging
 information (it would be huge and slow), and I wouldn't expect to use
 DWARF abbreviation tables to describe anything at all.

While I understand your point, from a language theoretic point of view,
they really can be thought of a schemata.  They are a description of the
records types that have been written by the compiler (or some other
tool) for the tags and attributes in this module. 
This may not be the way that they evolved in your community, but for
someone coming from the outside with no knowledge of your history, this
is what they are.

 However, this mechanism is only self descriptive if you do not extend
 the set of tags.  That is not an option for LTO.  While the DWARF3
 designers seem to have added every conceivable tag to describe object
 code, it is not be surprising that this comes up short when describing
 GIMPLE.  While some of GIMPLE could be shoe-horned into the existing
 tags, all of it will not fit, so anyone who wants to understand
 GIMPLE, will need to use the GCC headers.

 The DWARF standard provides for a consistent means to extend
 the debugging format.  If you extend DWARF beyond the standard, as has
 often been done, you will need to provide descriptions (in some form)
 of what these extensions mean.  You can provide this as written
 material, or you can have the implementer read the code.  In either
 case, the Abbrev tables only provide compression for the Tag/Attr/Val
 structure of a DIE.  They do no provide any semantic description.

What I did was add 4 new DW_FORMS.  This is one of the areas that, I
assume, users are not supposed to extend.  My assumption is based on the
fact that the standard says how to extend tags and attributes, but is
silent on how to extend forms.

The four forms that I added are:

1) counted list of udata.
2) counted list of sdata.
3) flag with false value.
4) flag with true value.

These are things that made my life simpler and provided for a more