Michael Sperber wrote:
> > In future releases, an unoptimized record access will consist of a
> > procedure call, a double tag check, an indirect load, an eq? check,
> > and a load. Twobit's existing optimizations, or easy extensions of
> > them, will eliminate any or all of that code when it is safe to do
> > so.
>
> I am curious how that works. (And I don't mean that in a sarcastic
> way.) I assume the procedure call mentioned above is the call to the
> closed version of the record accessor.
Correct.

> Eliminating that procedure call and also inlining the index into the
> record object (so it can be used to generate an instruction using that
> index) was the first concern you voiced when the two of us talked
> about records in May 2005....
>
> This seems hard to achieve with the procedural layer, as it can
> presumably only use a finite number of lambdas to represent the
> accessors for a potentially infinite number of record indices.

You are taking the general case to mean arbitrarily large offsets. In
most instruction sets, only a finite number of integers can be encoded
as immediate operands. For sufficiently large integers, it is generally
faster to fetch a precomputed integer out of a closure than to generate
the integer at run time from a sequence of machine instructions. The
code generated for the general case, as you define it, is therefore
independent of the offset.

In Larceny, it is enough to use immediate operands only for the most
common offsets, and to accept an extra load instruction for unusually
large offsets. Contrary to what is claimed by the 5.97 draft, it is
easy to generate code for both of those options at compile time, with
the choice between them made independently at run time for each
accessor/mutator. Inlining and elimination of redundant code will
require a simple flow analysis. See below.

> It's unclear to me how to do it without a successful flow analysis if
> the syntactic record types are merely the run-time record types: If
> the record types of the syntactic layer are merely run-time
> identifiers referring to rtds, and as the indices in child types
> depend on parent types, it seems to me you need to defer computation
> of the index until run time, presumably requiring at least an extra
> instruction to load that index.

An extra load instruction wouldn't be so bad even if you always needed
it, but you seldom need it, even without any flow analysis, as I
explained above.
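To make the closure argument concrete, here is a minimal model of the
procedural layer, written in Python rather than Scheme only so that it
is self-contained and runnable. The names RTD, record_accessor, and
record_constructor are hypothetical stand-ins modeled loosely on R6RS's
make-record-type-descriptor, record-accessor, and record-constructor;
this is a sketch of the idea, not Larceny's implementation. The point
is that each accessor closes over an index that is computed exactly
once, when the accessor is created, even though a child type's indices
depend on its parent; at access time the worst case is one fetch of
that precomputed index from the closure, never a run-time index
computation.

```python
# Hypothetical model (not Larceny's implementation) of how a
# procedural-layer accessor can close over a precomputed field index.

class RTD:
    """Run-time record-type descriptor (modeled)."""
    def __init__(self, name, parent, fields):
        self.name = name
        self.parent = parent
        # A child type's own fields start after all inherited fields,
        # so its offsets depend on the parent -- but they are fixed as
        # soon as the descriptor is created.
        self.first = parent.size if parent else 0
        self.size = self.first + len(fields)

def _is_subtype(tag, rtd):
    # Models the rtd/eq? check: walk the parent chain.
    while tag is not None:
        if tag is rtd:
            return True
        tag = tag.parent
    return False

def record_constructor(rtd):
    def construct(*field_values):
        assert len(field_values) == rtd.size
        return (rtd, list(field_values))    # tagged record object
    return construct

def record_accessor(rtd, k):
    index = rtd.first + k   # computed ONCE, when the accessor is made
    def accessor(record):
        tag, fields = record            # models the tag checks
        assert _is_subtype(tag, rtd)    # models the eq?/subtype check
        return fields[index]            # models the final load
    return accessor

# A child type's indices depend on the parent, yet every accessor
# still carries a single fixed index:
point  = RTD('point',  None,  ['x', 'y'])
point3 = RTD('point3', point, ['z'])

point_y  = record_accessor(point, 1)    # fixed index 1
point3_z = record_accessor(point3, 0)   # fixed index 2 (= 2 + 0)

p = record_constructor(point3)(1, 2, 3)
```

A compiler that can see the descriptor definitions, as in the
flow-analysis scenario discussed below, knows `index` statically and
can inline it as an immediate operand; when it cannot, the accessor
still pays only the one load that fetches `index` from its closure.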
As for the flow analysis, I recommend the attitude taken in Twobit: a
simple flow analysis that achieves 95% of the potential speedup with 5%
of the effort is a good compromise.

In most cases, record types and their associated accessors etc. will be
defined at the top level of a library, and will be explicitly exported.
That means their definitions are immutable. For these top-level record
types, all parent types will be defined before any record types that
extend them. Furthermore, programmers have no motive for obfuscating
that code in ways that would make it hard for a compiler to perform its
flow analysis. The offsets associated with most procedural record
definitions can therefore be computed at compile time.

The bottom line is that compilers will usually be able to eliminate the
load instruction that concerns you. The rare circumstances in which
that load instruction will be necessary do not justify the restrictions
and weaknesses that have been built into the syntactic layer of the
5.97 draft.

Will

_______________________________________________
r6rs-discuss mailing list
[email protected]
http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss
