Daniel Berlin wrote:
On 8/31/06, Kenneth Zadeck <[EMAIL PROTECTED]> wrote:
Mark Mitchell wrote:
> Kenneth Zadeck wrote:
>
>> Even if we decide that we are going to process all of the functions in
>> one file at one time, we still have to have access to the functions that
>> are going to be inlined into the function being compiled.  Getting at
>> those functions that are going to be inlined is where the double-the-I/O
>> argument comes from.
>
> I understand -- but it's natural to expect that those functions will
> be clumped together.  In a gigantic program, I expect there are going
> to be clumps of tightly connected object files, with relatively few
> connections between the clumps.  So, you're likely to get good cache
> behavior for any per-object-file specific data that you need to access.
>
I just do not know.  I assume that you are right, that there is some
clumping.  But I am just not sure.

I just want to point out that this argument (that cache locality will be
okay) was used as a reason that the massive amount of open/seek/close
behavior in Subversion's FSFS filesystem was "a-ok".

Here, we won't be making syscalls -- but we will be taking page faults if we go out of cache. I don't know what the consequences of page faults for files backed over NFS are, but if your object files are coming over NFS, your linker isn't going to go too fast anyhow. I would expect that most users are careful to use local disk for object files.
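To make the page-fault point concrete, here is a minimal sketch (my own illustration, nothing from the proposed reader) of access through a memory-mapped object file; the file name "foo.o" is made up. A read of a non-resident page costs a fault serviced by the kernel, not a read()/lseek() pair issued by the reader:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int
main (void)
{
  /* Hypothetical object file; substitute any local .o.  */
  int fd = open ("foo.o", O_RDONLY);
  if (fd < 0)
    {
      perror ("open");
      return 1;
    }

  struct stat st;
  if (fstat (fd, &st) < 0)
    {
      perror ("fstat");
      return 1;
    }

  /* Map the whole file read-only.  Touching a page that is not
     resident costs a page fault (and, for an NFS-backed file, a
     network round trip), not a syscall issued by this code.  */
  unsigned char *base = mmap (NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  if (base == MAP_FAILED)
    {
      perror ("mmap");
      return 1;
    }

  /* Poke one byte in the middle of the file.  */
  size_t mid = st.st_size / 2;
  printf ("byte at offset %zu: 0x%02x\n", mid, base[mid]);

  munmap (base, st.st_size);
  close (fd);
  return 0;
}

Whether those faults are cheap in practice depends on the kernel's page cache and readahead, which again is something to measure rather than guess at.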

Since we're descending into increasingly general arguments, let me say it more generally: we're optimizing before we've fully profiled. Kenny had a very interesting datapoint: that abbreviation tables tended to be about the size of a function. That's great information. All I'm suggesting is that this data doesn't necessarily mean that enabling random access to functions (which we all agree is necessary) implies a 2x I/O cost. It's only a 2x I/O cost if, every time you need to go look at a function, the abbreviation table has been paged out.
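And to show why the 2x figure only applies on a miss, here is a rough, self-contained sketch of caching decoded abbreviation tables per compilation unit. All of the names here (get_abbrev_table, decode_abbrev_table, and so on) are made up for illustration; the point is just that a second look at a function in the same unit reuses the already-decoded table:

#include <stddef.h>
#include <stdlib.h>

/* A decoded abbreviation table; real contents omitted.  */
struct abbrev_table
{
  size_t cu_offset;
  /* Decoded entries would live here.  */
};

/* Stand-in for the real decoder: read and decode the abbreviation
   table for the compilation unit at CU_OFFSET.  */
static struct abbrev_table *
decode_abbrev_table (size_t cu_offset)
{
  struct abbrev_table *t = malloc (sizeof *t);
  if (t == NULL)
    abort ();
  t->cu_offset = cu_offset;
  return t;
}

#define ABBREV_CACHE_SIZE 64
static struct abbrev_table *abbrev_cache[ABBREV_CACHE_SIZE];

/* Return the abbreviation table for the CU at CU_OFFSET, decoding it
   only on a cache miss.  */
struct abbrev_table *
get_abbrev_table (size_t cu_offset)
{
  struct abbrev_table **slot = &abbrev_cache[cu_offset % ABBREV_CACHE_SIZE];

  if (*slot == NULL || (*slot)->cu_offset != cu_offset)
    {
      /* Miss: the only case where fetching a function pays for an
         extra table read on top of the function body itself.  */
      free (*slot);
      *slot = decode_abbrev_table (cu_offset);
    }

  return *slot;
}

With Kenny's datapoint that a table is about the size of a function, even a full miss costs roughly one extra function-sized read, so 2x is the worst case rather than the expected case.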

I think we've gotten extremely academic here. As far as I can tell, Kenny has decided not to use DWARF, and nobody's trying to argue that he should, so we should probably just move on. My purpose in raising a few counterpoints is just to make sure that we're not overlooking anything obvious in favor of DWARF; since Kenny's already got that code written, it would be nice if we had a good reason not to start over.

--
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713
