On Thu, 12 Jan 2023 02:14:10 GMT, Archie L. Cobbs <d...@openjdk.org> wrote:

>>> 
>>> D'oh, you're right. But if you made `returnMe()` static or private then the 
>>> argument would still hold.
>>> 
>>> > And, if the method is static, same story - you are passing ref3 somewhere 
>>> > else, and ref3 potentially contains this.
>>> 
>>> Not true... static methods are safe to "invoke" because they can't be 
>>> overridden. When the analyzer "invokes" the method, it will see that all it 
>>> does with its parameter is return it.
>>> 
>> 
>> So, for static methods, it could go down two ways: either we don't even look 
>> at referenced method bodies, give up and just declare "sorry, escaping". Or, 
>> if we look into method bodies, and see that the relationship between inner 
>> and outer parameter is as simple, it's just like assignment again:
>> 
>> ref1 -> ref2 -> ref3 -> x -> ref4
>> 
>>> But I don't really have an analysis of all the trade-offs. Really, at a 
>>> certain point I stumbled on what I thought was a fairly straightforward way 
>>> to achieve a good false negative rate and a very low false positive rate, 
>>> without a lot of work. And frankly at that point I stopped caring about the 
>>> question you're asking, although I agree it's certainly interesting from an 
>>> academic point of view (I'm a practical person).
>> 
>> The reason I'm asking these questions is that I'm trying to understand which 
>> ingredients went into the analysis and why, in order to try and build a 
>> mental problem of what the problem that needs to be solved is, what are the 
>> constraints, etc. I'm also a practical person :-) and I often find it easier 
>> to understand a piece of code if I know some of the reasoning that went 
>> behind it, or what's the behavior supposed to be.
>> 
>> I do not know, off-hands, whether there is a simpler solution - I was mostly 
>> probing for war stories of the kind "I tried X and I got Y", and I apologize 
>> if that came off the wrong way.
>
>> So, for static methods, it could go down two ways: either we don't even look 
>> at referenced method bodies, give up and just declare "sorry, escaping". Or, 
>> if we look into method bodies, and see that the relationship between inner 
>> and outer parameter is as simple, it's just like assignment again:
> 
> Right - and just to reconfirm, it's only the non-overridable methods in the 
> same compilation unit that are "invoked" (i.e., analyzed & tracked). Any 
> method that (might) exist in another compilation unit is off limits, so we 
> have to assume the worst case and declare a leak if we are passing it any 
> 'this' references.
> 
>>  in order to try and build a mental problem of what the problem that needs 
>> to be solved is
> 
> Gotcha.
> 
> Let's be clear about what exactly we're worrying about. It's this situation: 
> You have a class `B extends A`, where `B` and `A` are in different 
> compilation units, and during the `super()` call that constructor `B()` makes 
> to its superclass constructor `A()`, some code in class `B` gets executed 
> (really, it's some field in `B` getting accessed that actually matters) prior 
> to `A()` returning.
> 
> So the basic goal is to analyze `A` constructors and watch 'this' references 
> until if/when they go to someplace where we can no longer watch them. At that 
> point we have to assume that some code in `B` might get invoked and so we 
> generate a warning.
> 
> OK, so where can a 'this' reference go?
> 
> Well, here are all of the possible places a Java reference can exist:
> 1. On the Java stack
>     (a) The current 'this' instance
>     (b) A method parameter
>     (c) A local variable
>     (d) A temporary value that is part of the current expression being 
> evaluated
>     (e) The return value from a method that just returned
>     (f) A caught exception
> 1. In a field of some object...
>     (a) A normal field
>     (b) An outer 'this' instance
>     (c) Other synthetic field (e.g., captured free variable)
> 1. As an element in a reference array
> 1. In native code as a native reference
> 
> Those are the only possibilities AFAIK.
> 
> So one way to locate yourself on the spectrum from "simple" to "complex" is 
> to answer this question: Which of those are you going to try to keep track 
> of, and for each one, how hard are you going to try?
> 
> The `this-escape` analysis being proposed tracks 1(a-e) and 2(b) pretty 
> closely (it tries hard), and it adds a very course tracking of 2(a-c), and 3 
> using the notion of indirect references (it doesn't try very hard). We do not 
> track 1(f) or 4.
> 
> So to think about the overall problem, imagine how you might or might not 
> address all of those cases.

> * On the Java stack
>   (a) The current 'this' instance
>   (b) A method parameter
>   (c) A local variable
>   (d) A temporary value that is part of the current expression being evaluated
>   (e) The return value from a method that just returned
>   (f) A caught exception
> 
> * In a field of some object...
>   (a) A normal field
>   (b) An outer 'this' instance
>   (c) Other synthetic field (e.g., captured free variable)
> 
> * As an element in a reference array
> 
> * In native code as a native reference

Thanks for the classification. This is helpful.

I'm not sure what you mean by (1f). You mean `this` can be embedded in an 
exception being thrown? Is that different from (2)?

Also, it seems to me that (3) is a special case of (2) - in the sense that you 
can imagine a reference array as a big object that has many fields as there are 
elements - so the same reasoning applies.

When looking at (2d) and (2e) the fact that Javac's AST is not in SSA form 
means that, yes, we need to track _expressions_ not just variables (e.g. for 
chains of methods calls, ternary operators and such).

-------------

PR: https://git.openjdk.org/jdk/pull/11874

Reply via email to