On Mar 25, 2011, at 2:51 AM, Toon Verwaest wrote:

> Ok, I will do so. (read the f-ing paper) I only read the blogpost until now.

Which paper?
Is there something more than the blog? I read the old VW5 paper, but Eliot told 
me that it is old and no longer accurate for Cog.

> 
> I just realized that I actually made a mistake in my mental model of your 
> model. See! It's complex!
> So I realized that getting to the remotes is exactly as fast as going to the 
> parent or outer context.
> 
> This makes it as fast as having a method context with maximally 2 nested 
> contexts (3 blocks nested), and faster than deeper nestings. How often does 
> it occur that you have deeper nesting in Pharo? Is it worthwhile to make the 
> remote arrays just for those cases?
> 
> Is the copying really worthwhile to make those cases faster?
> 
> My biggest problem until now is... why wouldn't you be able to do everything 
> you do with the remote arrays, directly with the context frames? Why limit it 
> to only the part that is being closed over? The naive implementation that 
> just extends squeak with proper closure-links will obviously be slow. I agree 
> that you need a stack. Now I'd just like to read why you chose to just take 
> a part of the frame (the remote array) rather than the whole frame. This 
> would avoid the copyTemps thing... 
> 
> But then. I guess I should go off and read the f-ing paper. I hope that 
> particular thing is described there, since it's basically the piece I'm 
> missing.
> 
> Also I don't exactly know what Peter Deutsch did, but if it was the 
> straightforward implementation then it seems obvious you get such a 
> speedup. Implementing it is less obvious, naturally ;)
> These responses are exactly why I posed the question here... I'd like to 
> understand why. No offense.
> 
> cheers,
> Toon
> 
> 
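
A minimal Python sketch of the distinction asked about above (a remote array versus the whole frame), assuming the scheme works like this: a temp that is assigned after a block captures it lives in a small heap-allocated remote array shared by the method and its blocks, while a temp that is never reassigned is simply copied into the block. `Closure`, `make_counter` and the field names are hypothetical and are not Cog or Pharo API.

```python
class Closure:
    """Toy block closure: copied values plus a body that receives them."""

    def __init__(self, copied, body):
        self.copied = tuple(copied)   # snapshot taken when the block is created
        self.body = body

    def value(self, *args):
        return self.body(self.copied, *args)


def make_counter(step):
    # `count` is written after capture, so it goes in a one-slot remote array.
    # `step` is never reassigned, so each block just gets a copy of its value.
    remote = [0]

    def increment(copied):
        vec, by = copied
        vec[0] += by                  # one hop: closure -> remote array -> slot
        return vec[0]

    def read(copied):
        (vec,) = copied
        return vec[0]

    return Closure([remote, step], increment), Closure([remote], read)


inc, get = make_counter(5)
inc.value()
inc.value()
print(get.value())   # 10: both blocks share the same remote array
```

Neither block holds a reference to the enclosing activation here, only to the remote array and the copied value, so in this model the frame itself never has to outlive the call.
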
> On 03/25/2011 02:22 AM, Eliot Miranda wrote:
>> Toon,
>> 
>>     what you describe is how Peter Deutsch designed closures for ObjectWorks 
>> 2.4 & ObjectWorks 2.5, whose virtual machine and bytecode set served all the 
>> way through VisualWorks 3.0.  If you read the context management paper 
>> you'll understand why this is a really slow design for a JIT.  When I 
>> replaced that scheme by one essentially isomorphic to the Squeak one the VM 
>> became substantially faster; for example factors of two and three in 
>> exception delivery performance.  The description of the problem and the 
>> performance numbers are all in the paper.  There are two main optimizations 
>> I performed on the VisualWorks VM: one is the closures scheme and the other 
>> is PICs.  Those together sped up what was the fastest commercial Smalltalk 
>> implementation by a factor of two on most platforms and a factor of three on 
>> Windows.
>> 
>> I'm sorry it's complex, but if one wants good performance it's a price 
>> well-worth paying.  After all I was able to implement the compiler and 
>> decompiler within a month, and Jorge proved at INRIA-Lille that I'm far from 
>> the only person on the planet who understands it.  Lispers have understood 
>> the scheme for a long time now.
>> 
>> best,
>> Eliot
>> 
>> On Thu, Mar 24, 2011 at 6:01 PM, Toon Verwaest <[email protected]> 
>> wrote:
>> 
>> I can't say that I clearly understood your concept. But if it will
>> simplify the implementation
>> without apparent speed loss, I am all ears :)
>> 
>> 
>> test
>>    |b|
>>    [ |a|
>>      a + b ]
>> 
>> Suppose you can't compile anything away, then you get
>> 
>> |==============
>> |MethodContext
>> |
>> |b := ...
>> |==============
>>     ^
>>     |
>> |==============
>> |BlockContext
>> |
>> |a := ...
>> |==============
>> 
>> And you just look up starting at the current context and go up, except if 
>> the var is from the home context; then you directly follow the home-context 
>> pointer.
>> Since all contexts link to the home context, it takes 1 pointer 
>> indirection to get to the method's context, and 1 for the parent context. So 
>> that makes only 2 indirections starting from the third nested block (so when 
>> you have [ ... [ ... [ ... ] ... ] ... ], where all of them are required for 
>> storing captured data). ifTrue:ifFalse: etc. blocks obviously don't count, and 
>> blocks without shared locals could be left out (although we might not do 
>> that, for debugging reasons).
>> 
>> Hope that helps.
>> 
>> cheers,
>> Toon
>> 
>> 
> 
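
The lookup rule quoted above (search the current context, take the direct home-context pointer for method temps, otherwise walk up the parents) can be modelled roughly as follows. This is a sketch only: `Context`, `lookup` and the variables `x` and `y` are hypothetical, names are assumed to be unique across scopes, and a real compiler would resolve the hop count statically instead of searching dictionaries at run time.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Context:
    """Toy stand-in for a method or block activation."""
    name: str
    temps: Dict[str, object] = field(default_factory=dict)
    parent: Optional["Context"] = None   # lexically enclosing context
    home: Optional["Context"] = None     # the method's context (None for the method)


def lookup(ctx: Context, var: str) -> Tuple[object, int]:
    """Return (value, pointer hops) for `var` as seen from `ctx`."""
    if var in ctx.temps:
        return ctx.temps[var], 0
    if ctx.home is not None and var in ctx.home.temps:
        return ctx.home.temps[var], 1    # direct home-context pointer
    hops, cur = 0, ctx.parent
    while cur is not None:               # otherwise walk the parent chain
        hops += 1
        if var in cur.temps:
            return cur.temps[var], hops
        cur = cur.parent
    raise KeyError(var)


method = Context("method", {"b": 2})
outer = Context("block1", {"x": 10}, parent=method, home=method)
middle = Context("block2", {"y": 20}, parent=outer, home=method)
inner = Context("block3", {"a": 1}, parent=middle, home=method)

for name in ("a", "b", "y", "x"):
    print(name, lookup(inner, name))
# a (1, 0)
# b (2, 1)
# y (20, 1)
# x (10, 2)
```

From the third nested block the worst case is 2 hops, which is the figure given in the quoted message; a fourth level of nesting would add a third hop for variables of the outermost block.
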

