Re: Garbage collector vs variable lifetime

John Engelhart Mon, 09 Jun 2008 00:56:21 -0700


On Jun 8, 2008, at 11:48 PM, Chris Hanson wrote:

On Jun 8, 2008, at 5:39 PM, John Engelhart wrote:
On Jun 7, 2008, at 7:11 PM, Chris Hanson wrote:
This won't happen because each message expression -- just as with function-call expressions -- is a sequence point. The compiler can't know what side-effects [data self] might have, so it can't re-order the invocation to elsewhere.
This is not necessarily true. If the const and pure GCC __attribute(())s were extended to objc methods then the compiler would be free to perform common subexpression and loop invariant code movement optimizations.
They can't be, while preserving the existing semantics of the language. In the existing semantics, a message send always results in a dynamic dispatch.

This is true for C as well. The semantics of the C language are such that a function being called in the source always results in one function call during execution.


Then consider the case of:

int square(int x) { return(x*x); }

int something(int y) {
 int r = 0;
 for(int z = 0; z < 25; z++) { r += square(y); }
 return(r);
}

An inter-procedural optimizing compiler will eliminate the function call to square and replace it with (y*y). It will also determine that (y*y) is loop invariant and hoist it out of the loop body.

The semantics are preserved and identical results are calculated (the 'meaning' is the same). The semantics do not require square() to literally be called each time. In the same sense, there is no requirement that a message literally be sent each time it appears in the source code as long as it can be shown that the results would be identical. Identical results imply identical meaning, which in turn means semantics are preserved.
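As a minimal sketch of that transformation, the two functions below compute the same result. The name something_hoisted and the helper check_equivalence are hypothetical; the hoisted version is written by hand to illustrate what an optimizer might produce after inlining square() and moving the invariant product out of the loop. Whether a particular compiler actually does so depends on the compiler and the optimization level.

```c
#include <assert.h>

/* Original form: square() appears to be called 25 times. */
int square(int x) { return x * x; }

int something(int y) {
    int r = 0;
    for (int z = 0; z < 25; z++) { r += square(y); }
    return r;
}

/* Hand-written equivalent of what an optimizer could emit:
   square() is inlined and the loop-invariant y*y is computed once. */
int something_hoisted(int y) {
    const int yy = y * y;   /* hoisted out of the loop */
    int r = 0;
    for (int z = 0; z < 25; z++) { r += yy; }
    return r;
}

/* Both forms must agree for every input tried here. */
void check_equivalence(void) {
    for (int y = -10; y <= 10; y++) {
        assert(something(y) == something_hoisted(y));
    }
}
```

Calling check_equivalence() exercises both versions over a small range of inputs; the number of times square() is literally executed is not observable from the results.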


In the case of [data self], this essentially becomes the function:

id self(id self, SEL _cmd) { return(self); }

A sufficiently aware optimizing compiler (think every single line of source for everything as one multi-gigabyte translation unit) could hypothetically trace message dispatches such that it could eliminate the intermediate dynamic dispatch and be left with just the function, such as:

{ NSData *data = /* valid ptr */; /* do some work */ self(data,"self"); self(data, "self"); }

The compiler would be free to eliminate both function calls. Semantics would be wholly preserved: The self function causes no program visible state changes, therefore by definition its execution, or lack of execution, can not have an effect.

The run time dynamic dispatch nature of objc makes such 'inter-message dispatch optimizations' much, much harder, especially at compile time. Ultimately, though, they are fundamentally the same in terms of optimization.
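To make the difference concrete, here is a rough sketch in plain C of a run time lookup followed by an indirect call. Everything in it (Object, MethodEntry, send_message, impl_self, check_dispatch) is a hypothetical miniature invented for illustration and is not the Objective-C runtime API. The point is only that the optimizer has to prove which implementation the lookup resolves to before it can reason about the call the way it would about a direct function call.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical miniature of dynamic dispatch: a selector name is looked up
   in a per-class method table at run time. */
typedef struct Object Object;
typedef Object *(*Method)(Object *receiver, const char *selector);

typedef struct {
    const char *selector;
    Method      imp;
} MethodEntry;

struct Object {
    const MethodEntry *methods;      /* stands in for the object's class */
    int                method_count;
};

/* The implementation behind "self": returns the receiver, changes nothing. */
Object *impl_self(Object *receiver, const char *selector) {
    (void)selector;
    return receiver;
}

/* Run time lookup followed by an indirect call. */
Object *send_message(Object *receiver, const char *selector) {
    for (int i = 0; i < receiver->method_count; i++) {
        if (strcmp(receiver->methods[i].selector, selector) == 0) {
            return receiver->methods[i].imp(receiver, selector);
        }
    }
    return NULL;   /* unrecognized selector */
}

/* The dispatched call and the direct call give the same answer. */
void check_dispatch(void) {
    static const MethodEntry table[] = { { "self", impl_self } };
    Object obj = { table, 1 };
    assert(send_message(&obj, "self") == impl_self(&obj, "self"));
    assert(send_message(&obj, "self") == &obj);
}
```

Once the lookup is known to land on impl_self, the call is an ordinary function call with no visible effects, and the usual reasoning about eliminating it applies.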

The 'self' message would definitely fall under the domain of these attributes, thus the original argument is apropos.
For source compatibility, you almost certainly could *not* suddenly indicate that "[foo self]; [foo self];" results in only one invocation of -self by the compiler, at least for subclasses of NSObject or NSProxy.
After all, a subclass of NSObject may have overridden -self to do something else, and the compilation unit containing the above two invocations may have no idea what the class of "foo" is with which to make that judgment.

Nonsense. Look, these kinds of attributes are a lot like typecasting. You can typecast away const, volatile, or whatever and even stuff a short into a pointer and vice versa. That doesn't make it right. The typecast overrides the compiler's safeties; you essentially certify that your typecast result is true and correct despite what the rules say. If you end up shooting yourself in the foot, you only have yourself to blame.

Besides, it's not that hard to come up with some simple additions for when an attribute like 'const' or 'pure' can legitimately be applied by the compiler within the context of objc objects. An obvious one would be that the attribute only applies to the class it was declared for and nothing else, even subclasses.

Obvious candidates are immutable objects' 'length', 'count', etc., which would result in a pretty big win if these attributes were available.
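For reference, this is roughly how the two attributes look on plain C functions today. The ImmutableBuffer type and the function names are hypothetical and exist only for this sketch; the attributes themselves are the GCC ones, and applying them to objc methods is the extension being discussed, not something shown here.

```c
#include <stddef.h>

/* Hypothetical immutable buffer; the fields never change after creation. */
typedef struct {
    const char *bytes;
    size_t      length;
} ImmutableBuffer;

/* pure: no side effects; the result depends only on the arguments and on
   memory that the function reads. */
__attribute__((pure))
size_t buffer_length(const ImmutableBuffer *buf) {
    return buf->length;
}

/* const: stricter than pure; the result depends only on the argument
   values, and no memory is read through pointers. */
__attribute__((const))
int square(int x) {
    return x * x;
}

size_t twice_length(const ImmutableBuffer *buf) {
    /* Given the pure attribute, the compiler is allowed to evaluate
       buffer_length(buf) once and reuse the value. */
    return buffer_length(buf) + buffer_length(buf);
}
```

The attributes are a promise made by the programmer. If buffer_length() actually had side effects, the declaration would be wrong and the result of merging the calls would be the programmer's problem.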
If I write

- (void)doSomething:(NSArray *)array {
    NSUInteger count1 = [array count];
    NSUInteger count2 = [array count];
    NSLog(@"%lu", (unsigned long)count1);
    NSLog(@"%lu", (unsigned long)count2);
}
the compiler can't collapse those into a single invocation of -count. After all, it could be passed a subclass of NSArray for whom -count has side-effects. Think about a "thread-safe array" (as bad as the concept might be) for example.

Well, in the case of your example, you have a bug: You have statically typed the class to NSArray, not your subclass. If one applies the 'attribute only applies to the class it was specified for' rule:

By statically typing the class to NSArray, you have certified to the compiler that no other object type will be received as an argument. When you passed it an object of a different type, even a subclass, you broke your promise to the compiler.

If the declaration is switched to 'id', then by definition none of the methods will be tagged with the attribute, and thus things would work as expected (two calls to -count).

If the declaration is switched to 'MyThreadArray *' and one doesn't supply a new method prototype, the attribute is lost because it's a different class (result: two calls to -count). If one does supply a new 'count' prototype without any __attribute qualifications, then things will work as expected (result: two calls to -count). If there is a new 'count' prototype and it is defined with __attribute((const)) when it isn't, then you've only yourself to blame really.
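The reason the rule has to be this careful can be sketched in plain C. The Array struct, its function pointer and the two count implementations below are hypothetical stand-ins for a class, an overridable method and a "thread-safe" override; the lock_acquisitions counter stands in for whatever side effect the override has.

```c
#include <assert.h>

/* Hypothetical stand-in for an object with an overridable -count. */
typedef struct Array Array;
struct Array {
    unsigned (*count)(Array *self);   /* the "method", chosen per object */
    unsigned n;
    unsigned lock_acquisitions;       /* visible side effect of the override */
};

/* Base behaviour: no side effects, so merging two calls is unobservable. */
unsigned plain_count(Array *self) { return self->n; }

/* Override with a side effect, as a "thread-safe array" might have. */
unsigned locking_count(Array *self) {
    self->lock_acquisitions++;
    return self->n;
}

/* What the source says: two invocations. */
unsigned count_twice(Array *a) {
    unsigned c1 = a->count(a);
    unsigned c2 = a->count(a);
    return c1 + c2;
}

/* What a compiler would produce if it treated count as const/pure. */
unsigned count_once(Array *a) {
    unsigned c = a->count(a);
    return c + c;
}

/* Same return value, different visible state when the override is used. */
void check_collapse_is_observable(void) {
    Array a = { locking_count, 3, 0 };
    Array b = { locking_count, 3, 0 };
    assert(count_twice(&a) == count_once(&b));
    assert(a.lock_acquisitions == 2);
    assert(b.lock_acquisitions == 1);
}
```

With plain_count the two forms are indistinguishable; with locking_count they are not, which is why the attribute has to stay attached to the class it was declared for.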

Again, the const and pure attributes are nothing that a 'hyper aware' optimizing compiler wouldn't be able to figure out on its own. Even if one takes the conservative stance that all existing objects and methods won't be modified with const or pure attributes, I would still be free to apply them to the objects I create. If I had a custom class of an object, and altered its prototype for self so that it was __attribute((const)), then we've really only delayed the problem, not fixed it.

An alternative, if the whole __attribute__ thing gives you the willies, is to consider 'If the compiler had sufficient information at its disposal, would it ultimately reach the same conclusion?' Even run time dependencies are potentially fair game if one considers LLVM taken to its logical conclusion, which is to say deferring final compilation and optimization until run time.

The case of '[data self]' is kind of an oddball. It's pretty obvious that it's unlikely to do anything 'useful'. Using our all powerful 'brain optimizer', it's trivial for us to trace the appropriate code paths and come to the conclusion that in this particular case, nothing is accomplished by calling '[data self]'. If we can do it, then hypothetically a compiler could do it too.

I think what all of this is really trying to work around is a fundamental deficiency in treating __strong pointers as equivalents to generic void * pointers. What should really be done is change the visibility rules of __strong pointers such that 'they are guaranteed to be visible to the GC system from the point of declaration to the end of the enclosing block.' Machinations like sticking '[data self]' near the end so the pointer stays visible up to that point, and the possible effects of optimization on such visibility, become moot under such a definition.

Adding additional attributes to make any new API contracts stricter is an interesting idea, but it's likely to result in breakage of existing code either at runtime or during compilation.
-- Chris


_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
