On 25 March 2011 01:44, Toon Verwaest <[email protected]> wrote:
>
> No I can't.  Since I did it, I naturally think it's a good idea.  Perhaps,
> instead of denigrating it without substantiating your claims, you could
> propose (and then implement, and then get adopted) a better idea?
>
> Sure. My own VM will take a lot longer to get done! ;) I don't want to
> take away any of your credit for building a cool VM. I was rather just
> wondering why you decided to go for this particular implementation, which
> seems unobvious to me. Hence the question. I guess I should've formulated
> it slightly differently :) More info below.
>>
>> I can see why it would pay off for Lisp programmers to have closures that
>> run like the Pharo closures, since they have O(1) access performance.
>> However, this performance boost only starts paying off once you have more
>> than 4 levels of nested closures, something which, unlike in Lisp, almost
>> never happens in Pharo. Or at least shouldn't happen (if it does, it's
>> probably ok to punish people by giving them slower performance).
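>>
>> To make "levels" concrete, a hypothetical worst case: four nested
>> blocks, the innermost mutating a temp of the outermost scope:
>>
>>     | a |
>>     a := 1.
>>     [ [ [ [ a := a + 1 ] value ] value ] value ] value
>>
>> Because a is assigned inside a block it ends up in an indirection
>> vector, so the innermost access is a vector load plus an indexed store,
>> regardless of depth; with naively linked contexts it would cost one hop
>> per enclosing scope.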
>
> Slower performance than what?  BTW, I think you have things backwards.  I
> modelled the Pharo closure implementation on Lisp closures, not the other
> way around.
>
> This is exactly what I meant. The closures seem like a very good idea for
> languages with very deeply nested closures. Lisp is such a language, with
> all its macros ... I don't really see this being the case in Pharo.
>>
>> This implementation is pretty hard to understand, and it makes
>> decompilation semi-impossible unless you make very strong assumptions
>> about how the bytecodes are used. That in turn reduces the reusability of
>> the new bytecodes, and probably of the decompiler, once people start
>> actually using the pushNewArray: bytecodes.
>
> Um, the decompiler works, and in fact works better now than it did a couple
> of years ago.  So how does your claim stand up?
>
> For example, when I just use the InstructionClient I get a pushNewArray:
> and then later a popIntoTemp:. This combination is supposed to make clear
> that you are storing a remote temp vector. That is not what the bytecode
> itself says, however. And this bytecode can easily be reused for something
> else: what if I use it to make my own arrays? What if the array is created
> in a different way? I can think of a lot of ways the temp vector could
> come to be, using lots of variations of bytecodes, from which I would
> never (...) be able to figure out that it's actually building the temp
> vector. Somehow I just feel there's a bigger disconnect between the
> bytecodes and the Smalltalk code, and I'm not sure this isn't harmful.
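>
> To illustrate the pattern (a made-up method; bytecode names roughly as
> in InstructionClient, offsets elided):
>
>     counter
>         | n |
>         n := 0.
>         ^ [ n := n + 1 ]
>
> compiles to something of this shape:
>
>     pushNewArrayOfSize: 1                  "allocate the temp vector"
>     popIntoTemporaryVariable: 0            "the vector sits in temp 0"
>     pushConstant: 0
>     popIntoRemoteTemp: 0 inVectorAt: 0     "n := 0, through the vector"
>     pushTemporaryVariable: 0               "the vector, to be copied"
>     pushClosureCopyNumCopiedValues: 1 numArgs: 0 blockSize: ...
>
> Nothing in pushNewArrayOfSize: itself says "this array is a temp
> vector"; a decompiler has to infer that from the surrounding pattern.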
>
> But ok, I am working on the Opal decompiler, of course. Are you building
> an IR with your decompiler? If so I'd like to have a look, since I've
> already spent the whole day trying to get the Opal compiler to do what I
> want... one that works and builds a reusable IR would be useful. (I'm
> implementing your field-index updating through bytecode transformation,
> btw.)
>>
>> You might save a teeny tiny bit of memory by having stuff garbage
>> collected when it's no longer needed ... but I doubt the whole design is
>> based on that? Especially since it penalizes performance in almost all
>> possible ways for standard methods. And it even wastes memory in the
>> general case. I don't get it.
>
> What has garbage collection got to do with anything?  What precisely are
> you talking about?  Indirection vectors?  To understand the rationale for
> indirection vectors you have to understand the rationale for implementing
> closures on a conventional machine stack.  For Lisp that's clear: compile
> to a conventional stack because that's an easy model, in which case one
> has to store values that outlive LIFO discipline on the heap, hence
> indirection vectors.  Why you might want to do that in a Smalltalk
> implementation, when you could just access the outer context directly,
> has a lot to do with VM internals.  Basically it's the same argument.  If
> one can map Smalltalk execution onto a conventional stack organization,
> then the JIT can produce a more efficient execution engine.  Not doing
> this causes significant problems in context management.
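>
> In source terms the rewrite is roughly this (schematic, not literal
> compiler output): the counter example above is compiled as if it read
>
>     counter
>         | vec |
>         vec := Array new: 1.
>         vec at: 1 put: 0.
>         ^ [ vec at: 1 put: (vec at: 1) + 1 ]
>
> The mutable temp n moves into a heap array; the block captures vec by
> value, so the frame itself can live and die on a conventional stack.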
>
> By garbage collection I meant the fact that you can already collect parts
> of the stack frames while other parts (the remote temps) only get GCed
> later on, when possible.
>
> I do understand why you want to keep them on the stack as long as
> possible. The stack-frame marriage stuff for optimizations is very neat
> indeed. What worries me more is that stack frames aren't simply linked to
> each other, sharing memory that way. With linked frames you would have
> only 1 indirection to access the method frame (via the home context), and
> 1 for the outer context; your own frame you can access directly. So only
> the 4th-level context would need 2 indirections (which is what all
> contexts pay now for remote temps). From the 5th level on it gets
> worse... but I can't really see this happening in real-world situations.
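>
> In terms of the existing Context protocol, the access paths I mean look
> like this (illustration only):
>
>     thisContext tempAt: i.                           "own frame: direct"
>     thisContext outerContext tempAt: i.              "enclosing scope: 1 hop"
>     thisContext home tempAt: i.                      "method frame: 1 hop"
>     thisContext outerContext outerContext tempAt: i. "first 2-hop case"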
>
> Then you have the problem that, since you don't just link the frames and
> look up values via them, you have to copy over part of your frame on
> activation. This isn't necessarily -that- slow (although it is overhead),
> but it's slightly clumsy and uses more memory. And that's where my
> problem lies, I guess ... There's such a straightforward implementation
> possible: just link up the stack frames (well... they are already linked
> up anyway) and traverse them. You'd have to do some rewriting whenever
> you leave a context that's still needed, but you do that anyway for the
> remote temps, right?
>


> The explanation is all on my blog and in my "Context Management in
> VisualWorks 5i" paper.
> But does a bright guy like yourself find this /really/ hard to understand?
>  It's not that hard a transformation, and compared to what goes on in the
> JIT (e.g. in bytecode to machine-code pc mapping) it's pretty trivial.
>
> I guess I just like to really see what's going on by having a decent
> model around. When I look at the bytecodes, in the end I can reconstruct
> what they're doing ... as long as they are arranged the way the compiler
> currently generates them. But I can easily see how slight permutations
> would already throw me off completely.
>
>>
>> But probably I'm missing something?
>
> It's me who's missing something.  I did the simplest thing I knew could
> possibly work re getting an efficient JIT and a Squeak with closures
> (there's a huge similarity between the above scheme and the changes I
> made to VisualWorks that resulted in a VM 2 to 3 times faster than VW
> 1.0, depending on platform).  But you can see a far more efficient and
> simple scheme.  What is it?
>
> Basically, my scheme isn't necessarily far more efficient. It's just more
> understandable, I think. I can understand scopes that point to their
> outer scope, and I can follow those scopes to see how the lookup works.
> And the fact that it does less pointer dereferencing and data copying
> makes me think it wouldn't be less efficient than what you have now. My
> problem is not that your implementation is slow, rather that it's
> complex. And I don't really see why this complexity is needed.
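>
> The lookup I have in mind is no more than this (a hypothetical helper on
> Context, sketching the linked-scope model):
>
>     Context >> tempAt: index scopesOut: n
>         "Walk n outer links, then read the temp in that frame."
>         | ctx |
>         ctx := self.
>         n timesRepeat: [ ctx := ctx outerContext ].
>         ^ ctx tempAt: index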
>
> Obviously, playing on my ego by telling me I should be clever enough to
> understand it makes me say I do! But I feel it's not the easiest model,
> and probably fewer people understand it than the general model of just
> linking contexts together.
>

I can't say that I clearly understood your concept. But if it simplifies
the implementation without any apparent speed loss, I am all ears :)

> best,
> Eliot
>
> cheers,
> Toon
>



-- 
Best regards,
Igor Stasenko AKA sig.
